Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure create_instance does not return None #4532

Merged
merged 1 commit into from May 11, 2020

Conversation

willbeaufoy
Copy link
Contributor

@willbeaufoy willbeaufoy commented May 2, 2020

Currently create_instance() can return None (or any value) if an extension is
incorrectly implemented, but the extension will still show up as
enabled in the logs. This can cause confusion, as in the linked bug.

This change prevents this occurring by throwing an error if
create_instance() will not return an instance of the correct class.

(edit) fixes #4528

@codecov
Copy link

codecov bot commented May 2, 2020

Codecov Report

Merging #4532 into master will decrease coverage by 0.01%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #4532      +/-   ##
==========================================
- Coverage   84.54%   84.52%   -0.02%     
==========================================
  Files         164      164              
  Lines        9897     9914      +17     
  Branches     1480     1476       -4     
==========================================
+ Hits         8367     8380      +13     
- Misses       1261     1267       +6     
+ Partials      269      267       -2     
Impacted Files Coverage Δ
scrapy/utils/misc.py 97.27% <100.00%> (+0.15%) ⬆️
scrapy/core/downloader/__init__.py 89.47% <0.00%> (-1.51%) ⬇️
scrapy/pipelines/files.py 59.93% <0.00%> (-1.00%) ⬇️
scrapy/utils/gz.py 96.87% <0.00%> (ø)
scrapy/utils/log.py 88.76% <0.00%> (ø)
scrapy/utils/ssl.py 53.65% <0.00%> (ø)
scrapy/core/engine.py 87.87% <0.00%> (ø)
scrapy/spiderloader.py 100.00% <0.00%> (ø)
scrapy/utils/request.py 100.00% <0.00%> (ø)
scrapy/pipelines/media.py 97.29% <0.00%> (ø)
... and 13 more

scrapy/utils/misc.py Outdated Show resolved Hide resolved
scrapy/utils/misc.py Outdated Show resolved Hide resolved
@willbeaufoy
Copy link
Contributor Author

willbeaufoy commented May 5, 2020

Should be ready now.

@willbeaufoy willbeaufoy changed the title Prevent create_instance() returning None (#4528) Ensure create_instance() returns the correct type (#4528) May 5, 2020
@Gallaecio
Copy link
Member

Gallaecio commented May 5, 2020

I wonder if there may be rare cases where isinstance may be a problem (e.g a factory class?).

Edit: On the other hand, if they are rare enough, and given https://docs.python.org/3/reference/datamodel.html#customizing-instance-and-subclass-checks exists (found out this week), the change may be worth it.

@willbeaufoy
Copy link
Contributor Author

willbeaufoy commented May 6, 2020

I wonder if there may be rare cases where isinstance may be a problem (e.g a factory class?).

Edit: On the other hand, if they are rare enough, and given https://docs.python.org/3/reference/datamodel.html#customizing-instance-and-subclass-checks exists (found out this week), the change may be worth it.

Yeah I thought a bit about this, but as this is my first contribution to this project I thought I didn't really know whether this rare case would actually come up. Let me know what you guys decide, if we want to allow for this rare case then I will just revert the last three commits so it just checks instance is not None again.

@elacuesta
Copy link
Member

elacuesta commented May 7, 2020

@willbeaufoy We discussed this internally and realized that introducing the isinstance check would be a backward-incompatible change, which would need more extensive docs and tests. The initial check for None should be enough for now, I'm sorry for the trouble.

@willbeaufoy willbeaufoy force-pushed the crawler-none-fix branch 2 times, most recently from 2efa70b to 03e3b33 Compare May 7, 2020
@willbeaufoy
Copy link
Contributor Author

willbeaufoy commented May 7, 2020

@willbeaufoy We discussed this internally and realized that introducing the isinstance check would be a backward-incompatible change, which would need more extensive docs and tests. The initial check for None should be enough for now, I'm sorry for the trouble.

No prob! Have reverted to my original commit so it should be ready now.

scrapy/utils/misc.py Outdated Show resolved Hide resolved
@elacuesta elacuesta changed the title Ensure create_instance() returns the correct type (#4528) Ensure create_instance does not return None May 8, 2020
Currently create_instance() can return None if an extension is
incorrectly implemented, but the extension will still show up as
enabled in the logs. This can cause confusion, as in the linked bug.

This change prevents this occurring by throwing an error if
create_instance() will return None.
Copy link
Member

@elacuesta elacuesta left a comment

Thanks!

@Gallaecio Gallaecio merged commit cf9be53 into scrapy:master May 11, 2020
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants