Improve the details about request serialization requirements for JOBDIR #4139

Merged
merged 1 commit into from Nov 8, 2019
31 changes: 4 additions & 27 deletions docs/topics/jobs.rst
Expand Up @@ -71,34 +71,11 @@ on cookies.
Request serialization
---------------------

Requests must be serializable by the ``pickle`` module, in order for persistence
to work, so you should make sure that your requests are serializable.

The most common issue here is to use ``lambda`` functions on request callbacks that
can't be persisted.

So, for example, this won't work::

    def some_callback(self, response):
        somearg = 'test'
        return scrapy.Request('http://www.example.com',
                              callback=lambda r: self.other_callback(r, somearg))

    def other_callback(self, response, somearg):
        print("the argument passed is: %s" % somearg)

But this will::

    def some_callback(self, response):
        somearg = 'test'
        return scrapy.Request('http://www.example.com',
                              callback=self.other_callback, cb_kwargs={'somearg': somearg})

    def other_callback(self, response, somearg):
        print("the argument passed is: %s" % somearg)

For persistence to work, :class:`~scrapy.http.Request` objects must be
serializable with :mod:`pickle`, except for the ``callback`` and ``errback``
values passed to their ``__init__`` method, which must be methods of the
running :class:`~scrapy.spiders.Spider` class.
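
As a rough illustration of the pickling requirement, the sketch below uses only the standard :mod:`pickle` module and does not touch Scrapy's scheduler. The helper name ``is_picklable`` is hypothetical and not part of any Scrapy API. It shows that plain data of the kind usually passed through ``cb_kwargs`` or ``meta`` can be pickled, while a ``lambda`` cannot, which is why a ``lambda`` is a poor fit anywhere in a request that has to be persisted.

```python
import pickle


def is_picklable(value):
    """Return True if ``value`` survives a pickle round trip.

    Hypothetical helper for illustration only; not a Scrapy function.
    """
    try:
        pickle.loads(pickle.dumps(value))
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


# Plain data, such as values passed through cb_kwargs or meta, pickles fine.
cb_kwargs = {'somearg': 'test'}
kwargs_ok = is_picklable(cb_kwargs)

# A lambda has no importable name, so pickle cannot store it.
lambda_ok = is_picklable(lambda response: response)

print("cb_kwargs picklable:", kwargs_ok)
print("lambda picklable:", lambda_ok)
```

A check like this can be handy while debugging, but it is only an approximation: it tests individual values, not a complete request as the scheduler would store it.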

If you wish to log the requests that couldn't be serialized, you can set the
:setting:`SCHEDULER_DEBUG` setting to ``True`` in the project's settings page.
It is ``False`` by default.
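
A minimal sketch of the corresponding settings, assuming a standard project ``settings.py`` module. The ``JOBDIR`` path is a hypothetical example directory; only the setting names come from Scrapy.

```python
# settings.py (project settings module)

# Directory where the crawl state is persisted; the path is a placeholder.
JOBDIR = 'crawls/somespider-1'

# Log requests that could not be serialized (the default is False).
SCHEDULER_DEBUG = True
```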

.. _pickle: https://docs.python.org/library/pickle.html