

Scrapy Inline Requests


A decorator for writing coroutine-like spider callbacks.

Quickstart

The spider below shows a simple use case of scraping a page and following a few links:

from inline_requests import inline_requests
from scrapy import Spider, Request

class MySpider(Spider):
    name = 'myspider'
    start_urls = ['http://httpbin.org/html']

    @inline_requests
    def parse(self, response):
        urls = [response.url]
        for i in range(10):
            next_url = response.urljoin('?page=%d' % i)
            try:
                # Yield the request and wait for its response inline;
                # the decorator resumes the generator with the response.
                next_resp = yield Request(next_url)
                urls.append(next_resp.url)
            except Exception:
                self.logger.info("Failed request %s", i, exc_info=True)

        yield {'urls': urls}

See the example/ directory for a more complex spider.

Warning

The generator resumes its execution when a request's response is processed; this means the generator won't be resumed after yielding an item or a request with its own callback.
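
The sketch below restates this warning in code; the spider name and URLs are placeholders, not part of the library:

from inline_requests import inline_requests
from scrapy import Spider, Request

class WarningDemoSpider(Spider):
    name = 'warning_demo'
    start_urls = ['http://httpbin.org/html']

    @inline_requests
    def parse(self, response):
        # A plain Request (no callback) suspends the generator; execution
        # resumes here once its response has been processed.
        next_resp = yield Request(response.urljoin('?page=2'))

        # Per the warning above, the generator is not resumed after
        # yielding an item or a request with its own callback, so keep
        # such yields at the end of the method.
        yield {'urls': [response.url, next_resp.url]}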

Known Issues

  • Middlewares can drop or ignore non-200 status responses, which prevents the callback from continuing its execution. This can be overcome by using the handle_httpstatus_all flag; see the httperror middleware documentation and the sketch after this list.
  • High concurrency and large responses can cause higher memory usage.
  • This decorator assumes your method has the signature (self, response).
  • Wrapped requests may not be serializable by persistent backends.
  • Unless you know what you are doing, the decorated method must be a spider method and must return a generator instance.
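
As a sketch of working around the first point, the spider below sets the handle_httpstatus_all flag on an inline request so that Scrapy's HttpError middleware passes non-200 responses back to the decorated callback. The spider name and URLs are illustrative only:

from inline_requests import inline_requests
from scrapy import Spider, Request

class StatusDemoSpider(Spider):
    name = 'status_demo'
    start_urls = ['http://httpbin.org/html']

    @inline_requests
    def parse(self, response):
        # Without this meta flag, the HttpError middleware filters out
        # non-200 responses and the generator is never resumed.
        error_resp = yield Request(
            response.urljoin('/status/404'),
            meta={'handle_httpstatus_all': True},
        )
        yield {'url': error_resp.url, 'status': error_resp.status}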