Grab Framework Project
Status of Project
I have not used Grab myself for many years, and I am not sure anybody is using it at present. Nonetheless, I decided to refactor the project, just for fun. I have annotated the whole code base with type hints that pass mypy in strict mode. The whole code base also complies with pylint and flake8 requirements. There are a few exceptions: very large methods, and classes with too many local attributes and variables. I will refactor them eventually.
The current and only network backend is urllib3.
I have refactored a few components into external packages: proxylist, procstat, selection, unicodec, and user_agent.
Feel free to give feedback in the Telegram groups @grablab and @grablab_ru.
Things to be done next
- Refactor the source code to remove all pylint disable comments
- Reach 100% test coverage; it is about 95% now
- Release a new version to PyPI
- Refactor more components into external packages
- More abstract interfaces
- More data structures and types
- Decouple connections between internal components
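As a hypothetical illustration of the first item above (not taken from the Grab code base), here is a minimal sketch of how a pylint disable comment can be removed by refactoring. All class and attribute names are invented for the example. It assumes pylint's default limit of 7 instance attributes per class, and it only addresses the `too-many-instance-attributes` check.

```python
from dataclasses import dataclass


# Before: eight instance attributes exceed pylint's default limit of 7,
# so the class carries a suppression comment. All names are hypothetical.
class RequestBefore:  # pylint: disable=too-many-instance-attributes
    def __init__(self, url: str) -> None:
        self.url = url
        self.method = "GET"
        self.connect_timeout = 3.0
        self.read_timeout = 10.0
        self.proxy_host = ""
        self.proxy_port = 0
        self.proxy_user = ""
        self.proxy_password = ""


# After: related attributes are grouped into small data classes.
@dataclass
class Timeouts:
    connect: float = 3.0
    read: float = 10.0


@dataclass
class Proxy:
    host: str = ""
    port: int = 0
    user: str = ""
    password: str = ""


# Four instance attributes, so the suppression comment is no longer needed.
class RequestAfter:
    def __init__(self, url: str) -> None:
        self.url = url
        self.method = "GET"
        self.timeouts = Timeouts()
        self.proxy = Proxy()
```

Grouping related values this way also fits the "more data structures and types" item: each group becomes a named type that mypy can check on its own.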
This command installs the old Grab released in 2018:
pip install -U grab
The updated Grab available in the GitHub repository is not compatible at all with spiders and crawlers written for the Grab released in 2018.
Updated documentation is here: https://grab.readthedocs.io/en/latest/ Most of the updates remove content related to features I have removed from Grab since 2018.
Documentation for the old Grab version 0.6.41 (released in 2018) is here: https://grab.readthedocs.io/en/v0.6.41-doc/