This repository has been archived by the owner on May 30, 2023. It is now read-only.
We use PhantomJS for large web crawls in the project ISP Data Pollution.
We observe intermittent but persistent Python crashes, and even system freezes, after on the order of ten thousand PhantomJS GET calls. Each GET runs in an independent PhantomJS process that is started only after the previous process has quit cleanly.
I've wrapped all calls to phantomjs methods with signal timeouts that should return control to the script if phantomjs hangs.
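For readers unfamiliar with the technique, here is a minimal sketch of a signal-based timeout wrapper of the kind described above. It is not the code from isp_data_pollution.py: the names `run_with_timeout` and `HangTimeout` are hypothetical, and it assumes a Unix system (SIGALRM is unavailable on Windows) with the call made from the main thread.

```python
import signal
import time


class HangTimeout(Exception):
    """Raised when a wrapped call exceeds its time budget (hypothetical name)."""


def _on_alarm(signum, frame):
    # The handler only raises; it performs no cleanup of its own.
    raise HangTimeout("call exceeded its timeout")


def run_with_timeout(func, seconds, *args, **kwargs):
    """Run func(*args, **kwargs), raising HangTimeout after `seconds`.

    Assumes Unix and the main thread, since signal.signal() can only
    be called from the main thread of the main interpreter.
    """
    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.setitimer(signal.ITIMER_REAL, seconds)  # arm the timer
    try:
        return func(*args, **kwargs)
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)    # disarm the timer
        signal.signal(signal.SIGALRM, previous)    # restore old handler


if __name__ == "__main__":
    try:
        run_with_timeout(time.sleep, 0.5, 10)      # stands in for a hung call
    except HangTimeout as exc:
        print("timed out:", exc)
```

The `finally` clause disarms the timer on every exit path, so a stale alarm cannot fire during a later, unrelated call.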
Here is the Python error message after a crash, caused by a call to the .quit method:
$ ~/bin/isp_data_pollution.py -mm 0
Downloading the blacklist… done.
Display formats:
Downloading: website.com; NNNNN links [in library], H(domain)= B bits [entropy]
Downloaded: website.com: +LLL/NNNNN links [added], H(domain)= B bits [entropy]
Fatal Python error: Cannot recover from stack overflow.
Current thread 0x00007fff993a33c0 (most recent call first):
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/enum.py", line 228 in __call__
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/socket.py", line 88 in _intenum_converter
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/socket.py", line 539 in getaddrinfo
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/socket.py", line 498 in create_connection
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/http/client.py", line 871 in connect
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/http/client.py", line 898 in send
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/http/client.py", line 963 in _send_output
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/http/client.py", line 1133 in endheaders
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/http/client.py", line 1182 in _send_request
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/http/client.py", line 1137 in request
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/urllib/request.py", line 1183 in do_open
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/urllib/request.py", line 1211 in http_open
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/urllib/request.py", line 442 in _call_chain
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/urllib/request.py", line 482 in _open
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/urllib/request.py", line 464 in open
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/selenium/webdriver/remote/remote_connection.py", line 489 in _request
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/selenium/webdriver/remote/remote_connection.py", line 415 in execute
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/selenium/webdriver/remote/webdriver.py", line 236 in execute
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/selenium/webdriver/remote/webdriver.py", line 522 in quit
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/selenium/webdriver/phantomjs/webdriver.py", line 71 in quit
File "/Users/username/bin/isp_data_pollution.py", line 255 in phantomjs_quit
File "/Users/username/bin/isp_data_pollution.py", line 747 in call_func
File "/Users/username/bin/isp_data_pollution.py", line 256 in quit_session
File "/Users/username/bin/isp_data_pollution.py", line 764 in phantomjs_hang_handler
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/socket.py", line 507 in create_connection
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/http/client.py", line 871 in connect
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/http/client.py", line 898 in send
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/http/client.py", line 963 in _send_output
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/http/client.py", line 1133 in endheaders
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/http/client.py", line 1182 in _send_request
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/http/client.py", line 1137 in request
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/urllib/request.py", line 1183 in do_open
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/urllib/request.py", line 1211 in http_open
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/urllib/request.py", line 442 in _call_chain
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/urllib/request.py", line 482 in _open
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/urllib/request.py", line 464 in open
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/selenium/webdriver/remote/remote_connection.py", line 489 in _request
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/selenium/webdriver/remote/remote_connection.py", line 415 in execute
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/selenium/webdriver/remote/webdriver.py", line 236 in execute
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/selenium/webdriver/remote/webdriver.py", line 522 in quit
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/selenium/webdriver/phantomjs/webdriver.py", line 71 in quit
File "/Users/username/bin/isp_data_pollution.py", line 255 in phantomjs_quit
File "/Users/username/bin/isp_data_pollution.py", line 747 in call_func
File "/Users/username/bin/isp_data_pollution.py", line 256 in quit_session
File "/Users/username/bin/isp_data_pollution.py", line 764 in phantomjs_hang_handler
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/subprocess.py", line 491 in _eintr_retry_call
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/subprocess.py", line 1517 in _try_wait
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/subprocess.py", line 1569 in wait
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/selenium/webdriver/common/service.py", line 163 in stop
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/selenium/webdriver/phantomjs/webdriver.py", line 76 in quit
File "/Users/username/bin/isp_data_pollution.py", line 255 in phantomjs_quit
File "/Users/username/bin/isp_data_pollution.py", line 747 in call_func
File "/Users/username/bin/isp_data_pollution.py", line 256 in quit_session
File "/Users/username/bin/isp_data_pollution.py", line 764 in phantomjs_hang_handler
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/socket.py", line 507 in create_connection
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/http/client.py", line 871 in connect
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/http/client.py", line 898 in send
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/http/client.py", line 963 in _send_output
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/http/client.py", line 1133 in endheaders
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/http/client.py", line 1182 in _send_request
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/http/client.py", line 1137 in request
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/urllib/request.py", line 1183 in do_open
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/urllib/request.py", line 1211 in http_open
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/urllib/request.py", line 442 in _call_chain
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/urllib/request.py", line 482 in _open
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/urllib/request.py", line 464 in open
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/selenium/webdriver/remote/remote_connection.py", line 489 in _request
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/selenium/webdriver/remote/remote_connection.py", line 415 in execute
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/selenium/webdriver/remote/webdriver.py", line 236 in execute
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/selenium/webdriver/remote/webdriver.py", line 522 in quit
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/selenium/webdriver/phantomjs/webdriver.py", line 71 in quit
File "/Users/username/bin/isp_data_pollution.py", line 255 in phantomjs_quit
File "/Users/username/bin/isp_data_pollution.py", line 747 in call_func
File "/Users/username/bin/isp_data_pollution.py", line 256 in quit_session
File "/Users/username/bin/isp_data_pollution.py", line 764 in phantomjs_hang_handler
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/socket.py", line 507 in create_connection
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/http/client.py", line 871 in connect
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/http/client.py", line 898 in send
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/http/client.py", line 963 in _send_output
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/http/client.py", line 1133 in endheaders
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/http/client.py", line 1182 in _send_request
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/http/client.py", line 1137 in request
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/urllib/request.py", line 1183 in do_open
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/urllib/request.py", line 1211 in http_open
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/urllib/request.py", line 442 in _call_chain
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/urllib/request.py", line 482 in _open
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/urllib/request.py", line 464 in open
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/selenium/webdriver/remote/remote_connection.py", line 489 in _request
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/selenium/webdriver/remote/remote_connection.py", line 415 in execute
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/selenium/webdriver/remote/webdriver.py", line 236 in execute
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/selenium/webdriver/remote/webdriver.py", line 522 in quit
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/selenium/webdriver/phantomjs/webdriver.py", line 71 in quit
File "/Users/username/bin/isp_data_pollution.py", line 255 in phantomjs_quit
File "/Users/username/bin/isp_data_pollution.py", line 747 in call_func
File "/Users/username/bin/isp_data_pollution.py", line 256 in quit_session
File "/Users/username/bin/isp_data_pollution.py", line 764 in phantomjs_hang_handler
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/socket.py", line 507 in create_connection
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/http/client.py", line 871 in connect
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/http/client.py", line 898 in send
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/http/client.py", line 963 in _send_output
...
Abort trap: 6
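In the trace above, phantomjs_hang_handler appears to call quit_session, which calls .quit again, and the handler then fires again on top of that call. If that reading is right, each timeout adds another layer of frames until the stack overflows. The following is a sketch of one possible way to avoid the re-entry, not a confirmed fix: the handler only raises, the graceful quit is attempted once, and a fallback runs outside the handler. The names `HangTimeout`, `bounded`, `quit_func`, and `kill_func` are hypothetical stand-ins (for example, `kill_func` could terminate the phantomjs process), and the code assumes Unix and the main thread.

```python
import signal
import time


class HangTimeout(Exception):
    """Raised by the alarm handler (hypothetical name)."""


def _raise_timeout(signum, frame):
    # Never call quit or any other cleanup from inside the handler:
    # a hanging cleanup would be interrupted by the next alarm and nest.
    raise HangTimeout()


def bounded(func, seconds):
    """Run func(); return True if it finished, False if it timed out."""
    previous = signal.signal(signal.SIGALRM, _raise_timeout)
    signal.setitimer(signal.ITIMER_REAL, seconds)
    try:
        func()
        return True
    except HangTimeout:
        return False
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)
        signal.signal(signal.SIGALRM, previous)


def quit_session(quit_func, kill_func, seconds=5.0):
    """Try a graceful quit once; on a hang, fall back to kill_func.

    There is no recursion: the fallback runs in ordinary code after
    the timer has been disarmed, not in the signal handler.
    """
    if bounded(quit_func, seconds):
        return "quit"
    kill_func()
    return "killed"


if __name__ == "__main__":
    # A quit that hangs is simulated with a long sleep.
    print(quit_session(lambda: time.sleep(10), lambda: None, seconds=0.5))
```

Because the stack depth stays constant no matter how many quits hang, a repeated hang should cost at most one fallback per session rather than a growing chain of nested handlers.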
Also see essandess/isp-data-pollution#26
phantomjs --version: 2.1.1
What steps will reproduce the problem?
Run isp_data_pollution.py -mm 0 -g for hours to days.
Which operating system are you using?
macOS 10.12.4
Python 3.4 (the version shown in the traceback paths above)
Binary downloaded from http://phantomjs.org.
See above.