
Cookbook

sente edited this page Oct 13, 2011 · 8 revisions

Automatic Sleep-Retries (used for throttling)

import time

import requests

def make_throttle_hook(timeout = 2):
    """Return a response hook that retries throttled requests.

       If the response has status code 403 and its body contains "too_many",
       the hook sleeps for <timeout> seconds and then re-requests the URL.
       The timeout triples on each re-request; once it reaches the threshold
       (1025 seconds), the hook simply returns the response object."""
 
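    def myhook(resp, *args, **kwargs):
        # Stop retrying once the delay reaches the threshold (or is 0/None).
        if timeout and timeout < 1025:
            # resp.text is the decoded body, so the substring test works on str.
            if resp.status_code == 403 and "too_many" in resp.text:
                time.sleep(timeout)
                # Current requests versions pass the send() keyword arguments,
                # including proxies, to response hooks.
                proxies = kwargs.get('proxies')
                hooks = {'response': make_throttle_hook(timeout = timeout * 3)}
                return requests.get(resp.url, proxies = proxies, hooks = hooks)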
        return resp
    return myhook

# url and proxies are assumed to be defined by the caller.
resp = requests.get(url, proxies = proxies, hooks = {'response': make_throttle_hook(5)})
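
To get a feel for how long the hook above can block in the worst case, the delays it would sleep through can be listed without making any requests. This is only a rough sketch: `backoff_schedule` is a hypothetical helper (not part of requests), and its `factor` and `limit` defaults simply mirror the `timeout * 3` and `1025` values used in the recipe.

```python
def backoff_schedule(timeout=2, factor=3, limit=1025):
    """List the delays (in seconds) the throttle hook would sleep through
    if every response kept coming back throttled."""
    delays = []
    # Same condition as in the hook: retry only while the delay is
    # non-zero and still below the threshold.
    while timeout and timeout < limit:
        delays.append(timeout)
        timeout *= factor
    return delays


schedule = backoff_schedule(5)
print(schedule)       # the individual sleeps, in order
print(sum(schedule))  # total time spent sleeping, in seconds
```

Starting from 5 seconds, as in the call above, this gives five retries (5, 15, 45, 135 and 405 seconds), a little over ten minutes of sleeping in total before the last response is returned as-is.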