Document Reppy Python version support #5226
Comments
I would be fine with not running tests for it and documenting that it's "not really supported", but I don't think we do this? Otherwise I'm fine with deprecating and then removing it. For context, it was first released in 1.8, together with all the other "new" robots.txt parsers, as a GSoC 2019 contribution. While it was requested in the initial GSoC issue #3656, it was attempted much earlier: #949. So I think as long as we have other supported parsers it's fine to remove this one.
The parser comparison in our docs: https://docs.scrapy.org/en/latest/topics/downloader-middleware.html#module-scrapy.downloadermiddlewares.robotstxt
The performance comparison (linked from where): https://anubhavp28.github.io/gsoc-weekly-checkin-12/
So the main feature of Reppy is being much faster (which isn't really important for spiders that scrape 0 or 1 robots.txt file per run but may be important e.g. for broad crawls).
We’ve decided to postpone deprecation until 1 year before Python 3.8 reaches end of life. If the issue remains by that time, we will deprecate reppy support, so that by the time Scrapy drops Python 3.8 support it also drops reppy support. Right now we need to document the Python version requirement.
Can I work on this issue, @Gallaecio?
@umairnsr87 Yes, please! We need to update the documentation about Reppy to clearly indicate that it only works with Python 3.8 and earlier.
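To illustrate what the version requirement means for a project, here is a minimal sketch of a `settings.py` fragment that only selects the Reppy parser on Python 3.8 or earlier and otherwise falls back to Protego. The helper `choose_robotstxt_parser` is a hypothetical name made up for this example, not part of Scrapy; the two import paths are the ones documented for the `ROBOTSTXT_PARSER` setting, and the choice of Protego as the fallback is an assumption.

```python
import sys

# Import paths as documented for the ROBOTSTXT_PARSER setting.
REPPY_PARSER = "scrapy.robotstxt.ReppyRobotParser"
PROTEGO_PARSER = "scrapy.robotstxt.ProtegoRobotParser"


def choose_robotstxt_parser(version_info=sys.version_info):
    """Hypothetical helper: pick Reppy only where it can be installed.

    Reppy only works with Python 3.8 and earlier, so on newer
    interpreters this falls back to Protego (an assumed fallback).
    """
    if tuple(version_info[:2]) <= (3, 8):
        return REPPY_PARSER
    return PROTEGO_PARSER


# In a project's settings.py:
ROBOTSTXT_PARSER = choose_robotstxt_parser()
```

A project that needs Reppy's speed for broad crawls would still have to stay on Python 3.8 or earlier; the sketch only avoids an import failure on newer versions.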
The optional dependency on reppy for one of the built-in robots.txt parsers is preventing us from running the extra-dependencies CI job with Python 3.9+. https://github.com/seomoz/reppy has not had a commit for ~1.5 years.
So I think we should deprecate the component.
If we don’t, we should document this limitation, and schedule a deprecation for 1 year before Python 3.8 reaches end of life, i.e. in 9 months, because once we drop Python 3.8 support we will be forced to remove this component anyway, so giving a deprecation warning 1 year before is probably in the best interest of any user of the component.