Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fix] wrong reactor installed error #4406

Merged
merged 5 commits into from Mar 14, 2020
Merged

Conversation

adityaa30
Copy link
Contributor

@adityaa30 adityaa30 commented Mar 6, 2020

  • Error occured when TWISTED_REACTOR in settings.py was set
  • remove all top-level imports for twisted.internet.reactor
  • add example, reason for top-level imports in TWISTED_REACTOR in docs/settings.rst

Fixes #4401

@codecov
Copy link

@codecov codecov bot commented Mar 6, 2020

Codecov Report

Merging #4406 into master will increase coverage by 0.13%.
The diff coverage is 83.33%.

@@            Coverage Diff             @@
##           master    #4406      +/-   ##
==========================================
+ Coverage   84.57%   84.71%   +0.13%     
==========================================
  Files         166      166              
  Lines        9872     9915      +43     
  Branches     1467     1474       +7     
==========================================
+ Hits         8349     8399      +50     
+ Misses       1268     1258      -10     
- Partials      255      258       +3     
Impacted Files Coverage Δ
scrapy/utils/benchserver.py 28.12% <0.00%> (-3.13%) ⬇️
scrapy/utils/testsite.py 83.87% <50.00%> (-2.80%) ⬇️
scrapy/mail.py 73.75% <66.66%> (-0.61%) ⬇️
scrapy/core/downloader/__init__.py 89.47% <100.00%> (-1.44%) ⬇️
scrapy/core/downloader/handlers/ftp.py 98.36% <100.00%> (ø)
scrapy/core/downloader/handlers/http10.py 100.00% <100.00%> (ø)
scrapy/core/downloader/handlers/http11.py 92.96% <100.00%> (+0.10%) ⬆️
scrapy/extensions/closespider.py 95.12% <100.00%> (ø)
scrapy/utils/testproc.py 78.94% <100.00%> (+0.56%) ⬆️
scrapy/utils/trackref.py 82.85% <0.00%> (-2.86%) ⬇️
... and 9 more

@wRAR
Copy link
Contributor

@wRAR wRAR commented Mar 11, 2020

I think you've missed several top-level imports, for example one in https://github.com/scrapy/scrapy/blob/master/scrapy/core/downloader/__init__.py

How did you search for them?

scrapy/mail.py Show resolved Hide resolved
:mod:`~twisted.internet.reactor` imports in project files & imported
3rd party libraries may also raise :exc:`Exception` as Twisted will install the
default reactor before scrapy installs reactor specified in
:setting:`TWISTED_REACTOR` leading to a mismatch in installed reactor.
Copy link
Contributor

@wRAR wRAR Mar 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those top-level imports won't raise an exception, Scrapy will, when it checks which reactor is installed, so this can be rephrased.

Copy link
Contributor Author

@adityaa30 adityaa30 Mar 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay I will change it to:
CrawlerRunner raises Exception if the installed reactor does not match the TWISTED_REACTOR setting; therefore, having top-level twisted.internet.reactor imports in project files & imported 3rd party libraries will make Scrapy raise Exception when it checks which reactor is installed.

docs/topics/settings.rst Show resolved Hide resolved
@adityaa30
Copy link
Contributor Author

@adityaa30 adityaa30 commented Mar 11, 2020

I think you've missed several top-level imports, for example one in https://github.com/scrapy/scrapy/blob/master/scrapy/core/downloader/__init__.py

How did you search for them?

@wRAR My bad. I will fix this now. I made a mistake in my search earlier. Now I am searching for from twisted.internet import in all the files using the VS Code's global search feature.

Also, I am a bit confused about some of the import such as those in

as these are used either at top level or inside a __main__ scope. What to do with these imports?

Should I move top-level imports in tests (test_downloadermiddleware_robotstxt.py, test_engine.py, test_pipeline_media.py, test_utils_defer.py, test_utils_signal.py, test_webclient.py) as well?

adityaa30 added 3 commits Mar 11, 2020
- Error occured when TWISTED_REACTOR in settings.py was set
- remove all top-level imports for twisted.internet.reactor
- update docs for TWISTED_REACTOR in settings.rst
@adityaa30
Copy link
Contributor Author

@adityaa30 adityaa30 commented Mar 12, 2020

@wRAR Also, @nyov suggested an idea to "replace the twisted.internet.reactor function with a custom lazy-loading reactor-loading function instead". I think this is feasible and mentioned a solution using importlib. This will reduce the twisted.internet.reactor imports and also make it declared at top-level in scrapy codebase. I wanted a review from you about this idea?

@wRAR
Copy link
Contributor

@wRAR wRAR commented Mar 12, 2020

Also, I am a bit confused about some of the import such as those in

benchserver.py is only used as a separate process. Not sure what does extras/qps-bench-server.py do as it's not used by the Scrapy code. OTOH scrapy.utils.testsite is imported by several tests so it makes sense to fix it as well, even if it's usable only in tests.

Should I move top-level imports in tests

The reactor in tests is currently installed early, via the pytest-twisted plugin, so I think there is currently no need to do it.

an idea to "replace the twisted.internet.reactor function with a custom lazy-loading reactor-loading function instead"

I don't currently see benefits of doing this, especially as it won't directly help with user or 3rd party code, but it's nice to have this as an option in case we need it.

@wRAR wRAR added this to the 2.0.1 milestone Mar 12, 2020
@adityaa30
Copy link
Contributor Author

@adityaa30 adityaa30 commented Mar 12, 2020

@wRAR Thanks for you reply. I have refactored benchserver.py and testsite.py. I feel all top-level imports are refactored now (except those in tests/* and extras/qps-bench-server.py).

but it's nice to have this as an option in case we need it

Should I add the LazyLoader class to the codebase? If yes, where will the best place to add it?

@Gallaecio
Copy link
Member

@Gallaecio Gallaecio commented Mar 12, 2020

@adityaa30 Since this is something we would like to fix rather soon, and to keep things simple, I think it’s best not to address the lazy loading in this issue.

We can address it later in a different issue if we see a use for it. A get_reactor() function or similar that imports and returns the reactor should be trivial to implement somewhere under scrapy.utils.* for anyone to use.

docs/topics/settings.rst Outdated Show resolved Hide resolved
docs/topics/settings.rst Outdated Show resolved Hide resolved
docs/topics/settings.rst Outdated Show resolved Hide resolved
@adityaa30
Copy link
Contributor Author

@adityaa30 adityaa30 commented Mar 13, 2020

@adityaa30 Since this is something we would like to fix rather soon, and to keep things simple, I think it’s best not to address the lazy loading in this issue.

Sure.
@Gallaecio Thanks for your review. I will update this pr with requested changes.

@adityaa30
Copy link
Contributor Author

@adityaa30 adityaa30 commented Mar 13, 2020

@wRAR @Gallaecio I am not really sure what happened here. TravisCI passed the tests at 8bbc445 (Link to result) but after 762a7e3 the tests have started to fail. What could be the possible reason?

@Gallaecio
Copy link
Member

@Gallaecio Gallaecio commented Mar 13, 2020

It is not your code, it looks like it is happening in master.

wRAR
wRAR approved these changes Mar 13, 2020
@Gallaecio Gallaecio merged commit f9bf4b8 into scrapy:master Mar 14, 2020
1 of 2 checks passed
@Gallaecio
Copy link
Member

@Gallaecio Gallaecio commented Mar 14, 2020

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants