Scrapy > 2.3 prevents passing callback in cb_kwargs #5237

Closed
jpmckinney opened this issue Aug 31, 2021 · 3 comments · Fixed by #5251

@jpmckinney (Contributor) commented Aug 31, 2021

To reproduce:

scrapy startproject tutorial
cd tutorial
scrapy genspider example example.com

In example.py:

import scrapy


class ExampleSpider(scrapy.Spider):
    name = 'example'

    def start_requests(self):
        yield scrapy.Request(
            'http://example.com',
            callback=self.parse_1,
            cb_kwargs={'callback': self.parse_2}
        )

    def parse_1(self, response, **kwargs):
        yield {'url': response.request.url}
        yield scrapy.Request('https://httpbin.org/get', **kwargs)

    def parse_2(self, response):
        yield {'url': response.request.url}

This works in Scrapy 2.3. In Scrapy > 2.3, it errors:

2021-08-31 10:24:38 [scrapy.core.engine] DEBUG: Crawled (200) <GET http://example.com> (referer: None)
2021-08-31 10:24:38 [scrapy.core.scraper] ERROR: Spider error processing <GET http://example.com> (referer: None)
Traceback (most recent call last):
  File "lib/python3.6/site-packages/twisted/internet/defer.py", line 1418, in _inlineCallbacks
    result = g.send(result)
StopIteration: <200 http://example.com>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "lib/python3.6/site-packages/scrapy/utils/defer.py", line 55, in mustbe_deferred
    result = f(*args, **kw)
  File "lib/python3.6/site-packages/scrapy/core/spidermw.py", line 52, in _process_spider_input
    return scrape_func(response, request, spider)
  File "lib/python3.6/site-packages/scrapy/core/scraper.py", line 151, in call_spider
    dfd.addCallback(callback, **result.request.cb_kwargs)
TypeError: addCallback() got multiple values for argument 'callback'

I don't know whether this Scrapy commit is relevant to the change in behavior: 2aa4f3c

This is a minimal example. Of course, in my application, I'm jumping through these hoops (controlling the callback of a subsequent request) to be able to reuse code in a complex project.

@GeorgeA92 (Contributor)

I've reproduced this issue. It happens because this:

dfd.addCallbacks(callback=callback,
                 errback=request.errback,
                 callbackKeywords=request.cb_kwargs)

was replaced by:

dfd.addCallback(callback, **result.request.cb_kwargs)

as a result of the mentioned commit 2aa4f3c.

This dfd.addCallback refers to twisted.internet.defer.Deferred.addCallback:

    def addCallback(self, callback, *args, **kw):
        """
        Convenience method for adding just a callback.

        See L{addCallbacks}.
        """
        return self.addCallbacks(callback, callbackArgs=args,
                                 callbackKeywords=kw)

Here **kw is **result.request.cb_kwargs, so the 'callback' key from cb_kwargs collides with the positional callback argument, causing TypeError: addCallback() got multiple values for argument 'callback'.
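The collision can be shown without Twisted at all; a minimal sketch, using a function with the same signature shape as Deferred.addCallback (the name add_callback is a stand-in, not Twisted's API):

```python
# Minimal sketch of the collision, no Twisted required: same signature
# shape as Deferred.addCallback(self, callback, *args, **kw).
def add_callback(callback, *args, **kw):
    return callback, args, kw

cb_kwargs = {'callback': 'parse_2'}  # what the spider placed in cb_kwargs

try:
    # Equivalent to dfd.addCallback(callback, **result.request.cb_kwargs):
    # the 'callback' key in **kw collides with the positional parameter.
    add_callback('parse_1', **cb_kwargs)
except TypeError as exc:
    print(exc)  # ... got multiple values for argument 'callback'
```

Any keyword in cb_kwargs whose name matches a parameter consumed by addCallback itself triggers the same TypeError.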
Cc: @elacuesta

@jpmckinney In #1138 (the cb_kwargs feature, which came out of the discussion on that issue) I didn't find your case (or anything similar to it) mentioned as a potential use case of cb_kwargs.

@jpmckinney (Contributor, Author)

There are other ways to pass the callback (e.g. via request meta), so this doesn't necessarily need to be fixed. However, it might be worth mentioning that a 'callback' key in cb_kwargs is disallowed.
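The meta-based workaround mentioned above can be sketched without Scrapy installed; Request here is a hypothetical stand-in for scrapy.Request, and the meta key name next_callback is illustrative:

```python
# Pure-Python sketch of the meta workaround: stash the follow-up
# callback under request.meta instead of cb_kwargs, so it is never
# forwarded as a keyword named 'callback' to addCallback.
class Request:  # hypothetical stand-in for scrapy.Request
    def __init__(self, url, callback=None, meta=None):
        self.url = url
        self.callback = callback
        self.meta = meta or {}

def parse_2(response):
    return {'url': 'https://httpbin.org/get'}

first = Request('http://example.com', meta={'next_callback': parse_2})

# Inside parse_1, read the follow-up callback back out of meta:
follow_up = Request('https://httpbin.org/get',
                    callback=first.meta['next_callback'])
print(follow_up.callback(None))
```

Because 'next_callback' travels in meta rather than cb_kwargs, it is never splatted into addCallback's keyword arguments.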

@Gallaecio Gallaecio added the bug label Sep 20, 2021
@Gallaecio Gallaecio added this to the 2.6 milestone Sep 20, 2021
@Gallaecio (Member)

If it’s trivial to allow a callback parameter, it may be best to do that. Otherwise, we should raise a more helpful error, in addition to mentioning the restriction in the documentation.
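One shape the more helpful error could take is up-front validation of cb_kwargs; a hypothetical sketch, not Scrapy's actual fix (the names RESERVED_CB_KWARGS and validate_cb_kwargs are illustrative):

```python
# Hypothetical sketch of a clearer error: reject reserved keys in
# cb_kwargs early, instead of letting Twisted raise a confusing
# TypeError deep inside addCallback.
RESERVED_CB_KWARGS = {'callback'}  # collides with addCallback's parameter

def validate_cb_kwargs(cb_kwargs):
    clashes = RESERVED_CB_KWARGS & set(cb_kwargs)
    if clashes:
        raise ValueError(
            "cb_kwargs must not contain reserved keys: %s" % sorted(clashes))
    return cb_kwargs

validate_cb_kwargs({'page': 1})  # fine
try:
    validate_cb_kwargs({'callback': print})
except ValueError as exc:
    print(exc)
```

Raising at Request construction time would point users at the offending key instead of at Twisted internals.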
