Importing reactor in spider's file can cause ReactorAlreadyInstalledError
#5525
Comments
I think importing the reactor from any code that is executed before Scrapy can handle reactor initialization has become a bad practice. I believe the best way forward is to avoid that import until Scrapy has set up the right reactor. Even if we made it so that your code works, it would still fail if in the future you decide to switch reactors, or Scrapy changes the default reactor. Now, you have not shared your code, but I imagine you can rewrite it so that the reactor is imported within a function or method that uses it, so that importing the spider file or class does not cause this issue.
That said, @wRAR, could you have a look into this? I wonder if this could be breaking pre-2.6 code, and if we could make it so that things do not break when the installed reactor is the one Scrapy would install. If we can fix it, maybe we could also log a deprecation warning (e.g. “Importing twisted.internet.reactor before Scrapy initializes it is deprecated. You should move such imports within scopes (e.g. functions, methods) that are only called once Scrapy has initialized the reactor.”), and we should discuss if this is worth adding to 2.6.2 or 2.6.3.
Without thinking too much about this:
I understand if this isn't going to be supported. I did consider importing the reactor where it's used, which is a |
Well, it's possible that it will be supported unless TWISTED_REACTOR is set to a non-default value; we just need to think about possible problems.
I have verified that

    from scrapy import Spider
    from twisted.internet import reactor

    class ToScrapeComSpider(Spider):
        name = 'toscrape_com'
        start_urls = ['https://toscrape.com']

        def parse(self, response):
            yield {'html': response.text}
Description
When running a spider with scrapy crawl ... <spider>, due to a change in the way the reactor is installed (possibly introduced in 60c8838), importing reactor from twisted at the top of that spider's Python file will now cause an error (in the case where TWISTED_REACTOR is not defined):
twisted.internet.error.ReactorAlreadyInstalledError: reactor already installed
I was able to work around the issue by explicitly setting TWISTED_REACTOR, which causes Scrapy's install_reactor() helper to be used (which explicitly suppresses that error). This doesn't allow me to use Twisted's default reactor per platform (without recreating the logic to do so myself).
Steps to Reproduce
from twisted.internet import reactor at the top of a spider's Python file (importing the reactor module installs the default reactor)
Versions