Make it possible to debounce requests #305

Open
barraponto opened this Issue May 21, 2013 · 2 comments

@barraponto
Contributor

I just read Ben Alman on jQuery's Throttle and Debounce and wondered: can DebounceRequest be added to Scrapy?

The use case: recently I've been scraping a website where I wanted to gather Facebook Likes per URL. That comes cheap using the Facebook API, particularly the Facebook Query Language. The problem is that FB eventually stops answering my requests once I hit an undocumented rate limit. But what if I could define a way for those requests to be joined and sent only after a certain while, with all their parameters combined in a user-defined way? I'd expect the callback to be called just once, too (or maybe several times, but each time with the full response).

@nyov
Contributor
nyov commented Apr 5, 2015

I don't actually understand the Facebook-related parts here.
Were you asking for debouncing requests to the same URLs instead of dupe-filtering?
Or would you mind explaining the use-case again?

@barraponto
Contributor

Let's say my CustomSpider yields DebouncedRequests with a particular parameter, like http://domain.com/getdata.xml?query=1. Instead of firing each one immediately, I set rules for it to wait until at least 10 requests have accumulated (or maybe until at least 10 seconds have passed) and then join them into a single request to http://domain.com/getdata.xml?query=1,2,3,4,5,6,7,8,9,10.

I think this should already be possible in a middleware.
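
To make the batching rule concrete, here is a minimal sketch in plain Python, independent of Scrapy. The `QueryBatcher` class, its parameter names, and the comma-joined `query` format are assumptions for illustration only, not an existing Scrapy API. It collects values and hands back one merged URL once either the item count or the waiting time is reached. The time limit is only checked when a new value arrives, so a real middleware would probably also need a timer or an end-of-crawl hook that calls `flush()`.

```python
import time
from urllib.parse import urlencode


class QueryBatcher:
    """Hypothetical helper: merge queued query values into a single URL."""

    def __init__(self, base_url, param="query", max_items=10,
                 max_wait=10.0, clock=time.monotonic):
        self.base_url = base_url
        self.param = param
        self.max_items = max_items    # flush after this many queued values
        self.max_wait = max_wait      # or after this many seconds
        self.clock = clock            # injectable so the timing can be tested
        self._pending = []
        self._first_seen = None

    def add(self, value):
        """Queue a value; return a merged URL if a threshold was hit, else None."""
        if not self._pending:
            # Start the waiting period when the first value of a batch arrives.
            self._first_seen = self.clock()
        self._pending.append(str(value))
        if len(self._pending) >= self.max_items:
            return self.flush()
        if self.clock() - self._first_seen >= self.max_wait:
            return self.flush()
        return None

    def flush(self):
        """Return one URL covering all queued values, or None if the queue is empty."""
        if not self._pending:
            return None
        values = ",".join(self._pending)
        self._pending = []
        self._first_seen = None
        # safe="," keeps the commas readable instead of encoding them as %2C.
        return f"{self.base_url}?{urlencode({self.param: values}, safe=',')}"


# Usage: only the tenth call produces a URL; the first nine return None.
batcher = QueryBatcher("http://domain.com/getdata.xml")
for n in range(1, 11):
    merged = batcher.add(n)
    if merged is not None:
        print(merged)
```

The callback question from the original post is left open here: the sketch only decides when to send and what URL to build, not how the single response gets routed back to whoever asked for each value.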
