Add a delay before search_all_tweets calls #1688

Closed
jdfoote opened this issue Oct 14, 2021 · 8 comments
Labels: API, Documentation

Comments

jdfoote commented Oct 14, 2021

search_all_tweets has a limit of 1 call per second. Currently, Tweepy quickly makes a few calls, receives a 429, and then waits for nearly 15 minutes. By simply adding a 1-second sleep per call, you can make 300 calls in that 15-minute window.

It seems like the right place for this may be pagination.py, since the rate limit only becomes a problem when making multiple calls. I'd be happy to make a one-line PR, but I'm not sure where the devs would want the sleep to go.

mshlis commented Oct 15, 2021

I want to +1 this. I get 429's with:

    for response in tweepy.Paginator(twitter_call, **search_params):
        data.extend(response.data)

but not with:

    import time

    next_token = 1  # any truthy sentinel so the loop runs at least once
    while next_token:
        response = twitter_call(**search_params)
        data.extend(response.data)

        next_token = response.meta.get('next_token')
        search_params['next_token'] = next_token
        time.sleep(1)  # stay under the 1 request/second limit

Harmon758 (Member) commented

I'm aware of this issue, but I haven't determined the best way to resolve it yet.
Ideally, I'd rather not have to hard-code the rate limit into the library, but that might end up being necessary, as this rate limit isn't in the response headers and seems to only be declared in the endpoint's documentation.
For now, a workaround is simply to sleep 1 second between each request, while iterating through the Paginator responses.
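
A minimal sketch of that workaround (the client setup and the "tweepy" query are illustrative; any search_all_tweets parameters work the same way):

    import time

    import tweepy

    client = tweepy.Client(bearer_token="...")  # requires access to the full-archive search endpoint

    tweets = []
    for response in tweepy.Paginator(client.search_all_tweets, "tweepy",
                                     max_results=500):
        if response.data:  # pages can come back empty
            tweets.extend(response.data)
        time.sleep(1)  # stay under the 1 request/second limit for this endpoint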

Harmon758 added the API and Bug labels on Oct 15, 2021
jdfoote (Author) commented Oct 15, 2021

That makes sense. Maybe just some documentation, then?

Harmon758 (Member) commented

Upon consideration, it might not be best to handle this within Tweepy.

There might be users who do processing within the loop that takes a significant amount of time, or even longer than a second, so a simple 1-second sleep wouldn't be ideal.

The alternative would be to save the timestamp of the last request and sleep until a second has passed, but that could end up being exactly a second, so some jitter would also need to be added (see the sketch below).
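
For illustration, a sketch of that timestamp-based approach; throttled_paginate is a hypothetical helper, not something in Tweepy:

    import random
    import time

    import tweepy

    def throttled_paginate(method, *args, min_interval=1.0, **kwargs):
        """Yield pages from tweepy.Paginator, keeping roughly min_interval
        seconds (plus a little jitter) between consecutive requests."""
        for response in tweepy.Paginator(method, *args, **kwargs):
            # Paginator has just made the request for this page
            last_request = time.monotonic()
            yield response
            # only sleep off whatever is left of the interval, so slow
            # processing in the caller's loop doesn't add a full extra second
            remaining = min_interval - (time.monotonic() - last_request)
            time.sleep(max(remaining, 0.0) + random.uniform(0, 0.25))

The caller's per-page processing time counts toward the interval, and the jitter keeps requests from landing exactly on the limit.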

I think you're right and the simplest way forward right now would be to document it and allow the user to handle it themselves.
If it becomes necessary at some point, handling of this rate limit can always be added later.

PS

@jdfoote I saw your video 👍

Some notes:

  • start_time and end_time can be datetime objects, if you don't want to have to remember the string format that Twitter's API requires
  • Tweepy's API v2 models have a data attribute (still needs to be documented, but it's not going anywhere) that provides the entire data dictionary; both points are shown in the sketch below
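
A short illustration of both notes (the bearer token, query, and dates are placeholders):

    import datetime

    import tweepy

    client = tweepy.Client(bearer_token="...")

    # timezone-aware datetime objects can be passed directly as start_time/end_time
    start = datetime.datetime(2021, 10, 1, tzinfo=datetime.timezone.utc)
    end = datetime.datetime(2021, 10, 8, tzinfo=datetime.timezone.utc)

    response = client.search_all_tweets("tweepy", start_time=start, end_time=end,
                                        tweet_fields=["created_at", "public_metrics"])
    for tweet in response.data or []:
        print(tweet.data)  # the full data dictionary for the Tweet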

Harmon758 added the Documentation label and removed the Bug label on Oct 20, 2021
Harmon758 (Member) commented

I've added a FAQ section on this for now.

xrisk commented Mar 12, 2022

@Harmon758 This rate limit should be handled in at least the flatten() call, I think; there is no way to insert 1-second sleeps there. On a related note, I feel this should be documented on the main docs page for pagination, not in the FAQ.
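
For context, a typical flatten() call looks like the sketch below (query and limit are illustrative); the loop body only ever sees individual Tweet objects, not page boundaries, so there is no obvious place for a caller to add a per-request sleep:

    import tweepy

    client = tweepy.Client(bearer_token="...")

    # flatten() yields Tweets across pages; the requests happen behind the scenes
    for tweet in tweepy.Paginator(client.search_all_tweets, "tweepy",
                                  max_results=500).flatten(limit=1000):
        print(tweet.id)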
