Scrapy Inline Requests

A decorator for writing coroutine-like spider callbacks.

Free software: MIT license
Documentation: https://scrapy-inline-requests.readthedocs.org.
Python versions: 2.7, 3.4+

Quickstart

The spider below shows a simple use case of scraping a page and following a few links:

from inline_requests import inline_requests
from scrapy import Spider, Request

class MySpider(Spider):
    name = 'myspider'
    start_urls = ['http://httpbin.org/html']

    @inline_requests
    def parse(self, response):
        urls = [response.url]
        for i in range(10):
            next_url = response.urljoin('?page=%d' % i)
            try:
                next_resp = yield Request(next_url)
                urls.append(next_resp.url)
            except Exception:
                self.logger.info("Failed request %s", i, exc_info=True)

        yield {'urls': urls}

See the example/ directory for a more complex spider.

Warning

The generator resumes its execution only when the response to a yielded request is processed. This means the generator won't be resumed after yielding an item or a request with its own callback.
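The `next_resp = yield Request(...)` pattern in the quickstart relies on ordinary Python generator semantics: whoever drives the generator can send a value back into the paused `yield`, or raise an exception there. The following is a minimal standard-library sketch of that mechanism, not the decorator's actual implementation. The `callback` function, the tuples and the string "responses" are hypothetical stand-ins for a spider callback, requests and responses.

```python
def callback():
    """Hypothetical stand-in for a decorated spider callback."""
    seen = []
    for url in ("page-0", "page-1"):
        try:
            # Execution pauses here until the driver resumes the generator.
            response = yield ("request", url)
            seen.append(response)
        except IOError as exc:
            # A failure thrown in by the driver surfaces at the yield.
            seen.append("failed: %s" % exc)
    yield ("item", seen)


gen = callback()

# Start the generator: it runs up to the first yield and pauses.
first = next(gen)

# Resume with a "response": it becomes the value of the yield expression.
second = gen.send("response for page-0")

# Resume with an error: it is raised at the paused yield and caught there.
third = gen.throw(IOError("timeout"))
```

In this sketch, nothing reaches the paused `yield` unless the driver calls `send()` or `throw()`. This is why a callback written in this style only makes progress when a response (or failure) is delivered back to it.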

Known Issues

Middlewares can drop or ignore non-200 status responses, which prevents the callback from continuing its execution. This can be overcome by setting the handle_httpstatus_all flag. See the httperror middleware documentation.
High concurrency and large responses can cause higher memory usage.
This decorator assumes your method has the following signature: (self, response).
Wrapped requests may not be serializable by persistent backends.
Unless you know what you are doing, the decorated method must be a spider method and return a generator instance.
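For the first issue above, one way to set the flag is through the request's `meta` dictionary, where Scrapy's httperror middleware looks for the `handle_httpstatus_all` key. The helper below is a small sketch under that assumption. `allow_all_statuses` is a hypothetical name and is not part of inline_requests or Scrapy.

```python
def allow_all_statuses(meta=None):
    """Return a copy of `meta` with the handle_httpstatus_all flag set.

    Hypothetical helper for illustration; the input dict is not modified.
    """
    merged = dict(meta or {})
    merged["handle_httpstatus_all"] = True
    return merged


request_meta = allow_all_statuses({"page": 3})

# Intended use inside a decorated callback (requires Scrapy):
#     next_resp = yield Request(next_url, meta=request_meta)
#     if next_resp.status != 200:
#         ...  # handle the error response explicitly
```

With the flag set, non-200 responses reach the callback, so the callback should check `next_resp.status` itself rather than assume success.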
