Crawler dies and never recovers #10

Open
alexristich opened this issue Jul 28, 2017 · 0 comments
I ran an instance with 2 crawlers over the top 1000 sites. At some point during the test, Crawler 1 failed and never recovered. I had the timeout set at 60 seconds. Here's the readout from the console when the crawler died:

Crawler 1 timed out fetching http://www.patch.com/
Stopping Crawler 1
Starting Crawler 1
Process Crawler 1:
Traceback (most recent call last):
  File "/home/alex/chameleon-crawler/crawler/crawler_manager.py", line 43, in __init__
    timeout * ((num_timeouts + 1) ** 2)
  File "/usr/lib/python3.5/multiprocessing/queues.py", line 105, in get
    raise Empty
queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.5/multiprocessing/process.py", line 249, in _bootstrap
    self.run()
  File "/usr/lib/python3.5/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/home/alex/chameleon-crawler/crawler/crawler_process.py", line 35, in __init__
    self.crawl()
  File "/home/alex/chameleon-crawler/crawler/crawler_process.py", line 40, in crawl
    with self.selenium():
  File "/usr/lib/python3.5/contextlib.py", line 59, in __enter__
    return next(self.gen)
  File "/home/alex/chameleon-crawler/crawler/crawler_process.py", line 106, in selenium
    self.startup()
  File "/home/alex/chameleon-crawler/crawler/crawler_process.py", line 122, in startup
    self.driver = webdriver.Chrome(chrome_options=opts)
  File "/home/alex/.local/lib/python3.5/site-packages/selenium/webdriver/chrome/webdriver.py", line 65, in __init__
    keep_alive=True)
  File "/home/alex/.local/lib/python3.5/site-packages/selenium/webdriver/remote/webdriver.py", line 73, in __init__
    self.start_session(desired_capabilities, browser_profile)
  File "/home/alex/.local/lib/python3.5/site-packages/selenium/webdriver/remote/webdriver.py", line 121, in start_session
    'desiredCapabilities': desired_capabilities,
  File "/home/alex/.local/lib/python3.5/site-packages/selenium/webdriver/remote/webdriver.py", line 173, in execute
    self.error_handler.check_response(response)
  File "/home/alex/.local/lib/python3.5/site-packages/selenium/webdriver/remote/errorhandler.py", line 166, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: unknown error: failed to wait for extension background page to load: chrome-extension://mcgekeccgjgcmhnhbabplanchdogjcnh/_generated_background_page.html
from timeout
  (Driver info: chromedriver=2.29,platform=Linux 4.4.0-87-generic x86_64)
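The traceback shows the manager's restart attempt dying the moment `webdriver.Chrome(chrome_options=opts)` itself raises (here, a `WebDriverException` because the extension background page never loaded), so the crawler is never brought back. A minimal sketch of one possible fix is to retry driver startup with a backoff, reusing the quadratic schedule already visible in `crawler_manager.py` (`timeout * ((num_timeouts + 1) ** 2)`). The helper `start_with_retries` below is hypothetical and not part of the chameleon-crawler codebase; in practice `start_fn` would be something like `lambda: webdriver.Chrome(chrome_options=opts)`:

```python
import time


def start_with_retries(start_fn, attempts=3, base_delay=1.0):
    """Call start_fn until it succeeds, up to `attempts` times.

    Waits base_delay * ((n + 1) ** 2) seconds between tries, mirroring
    the quadratic backoff the crawler manager already uses for its
    queue timeout. Re-raises the last exception if every try fails.
    """
    for n in range(attempts):
        try:
            return start_fn()
        except Exception:
            if n == attempts - 1:
                raise  # out of retries; let the caller decide
            time.sleep(base_delay * ((n + 1) ** 2))
```

With something like this in place, a transient `WebDriverException` during "Starting Crawler 1" would be retried instead of killing the process outright.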
@ghostwords ghostwords added the bug label Mar 16, 2020