Schedule or callLater Request() #169

Closed
aristidesfl opened this Issue Sep 1, 2012 · 6 comments


Is there a way to schedule a Request for a specific url for later, instead of having the yield Request() be processed asap?

Owner

pablohoffman commented Sep 3, 2012

No, that kind of custom time scheduling is not possible. The closest thing is to use a download delay, but it can only be specified per domain, not for a single request. Could you explain your use case?
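
For reference, a minimal sketch of the download delay mentioned above, assuming the standard `DOWNLOAD_DELAY` setting in a project's `settings.py`. The value is purely illustrative and is not taken from this thread.

```python
# settings.py (illustrative value, not from the thread)
# Number of seconds Scrapy waits between consecutive requests to the same site.
# It applies to every request for that site, not to one chosen request.
DOWNLOAD_DELAY = 2.0
```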

My use case is a set of pages, each updated at a different frequency.
I want the scraper to adjust the frequency of its requests according to how often each page is updated.
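
One way to picture this use case is a simple back-off rule, sketched below in plain Python with no Scrapy involved. The function name `next_interval`, the halve/double policy, and the minimum and maximum bounds are all assumptions made for illustration; nothing like this is proposed in the thread.

```python
def next_interval(current, changed, minimum=60.0, maximum=86400.0):
    """Return the next polling interval in seconds for one page.

    Hypothetical policy: poll twice as often when the page changed since
    the last visit, half as often when it did not, within fixed bounds.
    """
    proposed = current / 2 if changed else current * 2
    return max(minimum, min(maximum, proposed))
```

A page polled every 600 seconds would move to 300 seconds after a change and to 1200 seconds after an unchanged visit.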


Owner

dangra commented Jan 8, 2013

You can return a twisted.internet.defer.Deferred instance from the request callback that fires after N seconds using callLater.
That will prevent Scrapy from shutting down your spider, because it is busy waiting on spider output.

dangra closed this Jan 8, 2013

osya commented Jan 2, 2015

I tried to yield a twisted.internet.defer.Deferred with Scrapy 0.24.4 and the following error occurs: "Spider must return Request, BaseItem or None, got 'instance' in ". Please advise.

@osya, this works fine for me in the latest version.

@pablohoffman, is it possible to make something like this part of Scrapy?

The use case is this: we POST a search through a form. The form generates a link, but it only redirects you to that link after 15 seconds, once the results have been generated. So the next URL is known in advance but must be requested only after 15 seconds. I was using a sleep for this, which made the scraper take around 4 hours; with the deferred approach it takes just 22 minutes. It would be nice to have a special meta key or a DelayedRequest class for this.

Member

curita commented May 8, 2015

@tarlabs: this issue is being discussed in #802, feel free to join in.
