GSoC 2014 Draft Ideas
Pablo Hoffman edited this page Feb 14, 2014 · 10 revisions
These ideas need to be polished and better specified before moving them to the final ideas page.
- TODO: This one should be better specified.
Brief explanation | Improve Scrapy API using generators |
Expected results | Scrapy should provide an easy way to build a single item from several pages, ... |
Required skills | Python, general understanding of async code, API design |
Mentor(s) | Mikhail Korobov, Rolando Espinoza |
There are areas where Scrapy usability and efficiency can be improved by using generators, for example:
- integrate something like Rolando's https://github.com/darkrho/scrapy-inline-requests;
- ensure generators are not exhausted needlessly in various places;
- provide an easier alternative to the spider_idle signal, something in line with https://github.com/scrapy/scrapy/issues/456;
- ...
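To make the "single item from several pages" goal concrete, here is a minimal sketch of the generator style that PEP 342 enables. It does not use Scrapy or scrapy-inline-requests; `fetch`, `parse_product`, `run` and the example.com URLs are hypothetical names invented for illustration, and the canned pages stand in for real downloads.

```python
def fetch(url):
    """Hypothetical stand-in for a download; returns canned page data."""
    pages = {
        "http://example.com/product": {
            "name": "Widget",
            "reviews_url": "http://example.com/reviews",
        },
        "http://example.com/reviews": {"rating": 4.5},
    }
    return pages[url]


def parse_product(url):
    """Callback written as a generator: each yielded URL asks the driver for a page."""
    product = yield url
    reviews = yield product["reviews_url"]
    # Both pages are available here, so the item is built in one place.
    yield {"name": product["name"], "rating": reviews["rating"]}


def run(callback_gen):
    """Drive the generator: answer yielded URLs with pages, collect yielded items."""
    items = []
    try:
        request = next(callback_gen)
        while True:
            if isinstance(request, str):
                # A URL was yielded: send the fetched page back into the generator.
                request = callback_gen.send(fetch(request))
            else:
                # Anything else is treated as a finished item.
                items.append(request)
                request = next(callback_gen)
    except StopIteration:
        pass
    return items


items = run(parse_product("http://example.com/product"))
print(items)
```

In a real integration the driver would hand requests to the asynchronous downloader instead of calling a blocking `fetch`, but the callback body would keep the same linear shape.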
Reading list:
- http://www.python.org/dev/peps/pep-0342/
- http://www.tornadoweb.org/en/stable/gen.html
- http://twistedmatrix.com/documents/13.0.0/core/howto/defer.html
- https://twistedmatrix.com/trac/wiki/DeferredGenerator
Brief explanation | Improve JavaScript integration by using Splash to render pages and execute JavaScript. |
Expected results | A Scrapy middleware to integrate with Splash |
Required skills | Scrapy |
Mentor(s) | Mikhail Korobov, Daniel Graña |
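One way such a middleware could work is by rewriting each outgoing request so that it goes through Splash's `render.html` HTTP endpoint. The sketch below shows only that URL rewriting step, in plain Python; the helper name `to_splash_url`, the `localhost:8050` address and the default `wait` value are assumptions for illustration, not part of any agreed design.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Assumption: a Splash instance is reachable at this address.
SPLASH_BASE = "http://localhost:8050"


def to_splash_url(target_url, wait=0.5, base=SPLASH_BASE):
    """Build a render.html URL that asks Splash to load and render target_url."""
    # urlencode percent-escapes the target URL so it survives as one parameter.
    query = urlencode({"url": target_url, "wait": wait})
    return "{}/render.html?{}".format(base, query)


rendered = to_splash_url("http://example.com/page?id=1")
print(rendered)

# The original URL can be recovered from the rewritten one.
original = parse_qs(urlparse(rendered).query)["url"][0]
print(original)
```

A downloader middleware built on this idea would apply the rewrite when a request is processed and would need to keep the original URL around so that spiders still see the page they asked for.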