get_followers() more than 75,000 not possible, even with retryonratelimit = TRUE #732

Closed
TobiasKellner opened this issue Aug 31, 2022 · 3 comments

Comments

@TobiasKellner

Problem

I want to download all followers of a Twitter user with the function get_followers() from the rtweet package. With an older version of the rtweet package, this worked without any problems. Since I updated the package (current version is 1.0.2), I only get the first 75,000 followers, even though I set the retryonratelimit option to TRUE.
The function downloads the first 75,000 followers, waits 15 minutes, and then ends the download without any message.

Reproduce the problem

library(rtweet)

# I have authenticated myself with auth_setup_default()

df_follower <- get_followers("CDU", n = 800000, retryonratelimit = TRUE)

> df_follower 
# A tibble: 75,000 × 2
   from_id             to_id
   <chr>               <chr>

rtweet version

I am using rtweet version: 1.0.2

Session info

R version 4.1.2 (2021-11-01)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)

Matrix products: default

@llrs
Member

llrs commented Aug 31, 2022

Sorry, behavior around rate limits is not easy to test and I missed this in the release.
I see the following:

df_follower <- get_followers("CDU", n = 800000, retryonratelimit = TRUE)
# Loading multiple pages ====>.....................
# Rate limit exceeded for Twitter endpoint '/1.1/followers/ids'. Waiting for ...
dim(df_follower)
## [1] 75000     2

The first lines, starting with #, are flushed once R exits get_followers().

If you need this much data, I would recommend using the app (or bot) authentication method. However, there seems to be a problem with the pagination.

I'll look in more detail soon. Thanks for the report.

llrs added a commit that referenced this issue Sep 1, 2022
@llrs
Member

llrs commented Sep 1, 2022

After hitting the rate limit, the cursor-based pagination waited, then found that the last API call had returned an error, and silently stopped. I might have broken this inadvertently, as I think it worked well previously.

Currently, if the rate limit is hit and retryonratelimit is TRUE, it tries again and continues working (I pushed the fix to the devel branch). I also modified the other internal iterators (chunked, premium) besides the one using the cursor.

To be sure, I added an additional (manual) test to run before the next release.
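
For readers who want to see the shape of the bug, here is a minimal sketch of cursor pagination that retries the same cursor after a rate-limit error instead of stopping. It is not rtweet's internal code: make_fake_endpoint() and fetch_all() are hypothetical names, and the endpoint is simulated so the example runs without Twitter credentials.

```r
# Simulated paginated endpoint (hypothetical; not part of rtweet).
# Serves `total` ids in pages of `page_size` and signals a rate limit
# after every `calls_per_window` successful calls.
make_fake_endpoint <- function(total = 25, page_size = 5, calls_per_window = 2) {
  calls <- 0
  function(cursor) {
    if (calls >= calls_per_window) {
      calls <<- 0  # the next call falls into a fresh window
      return(list(error = "rate_limit"))
    }
    calls <<- calls + 1
    start <- cursor + 1
    end <- min(cursor + page_size, total)
    next_cursor <- if (end >= total) NA else end  # NA marks the last page
    list(ids = seq(start, end), next_cursor = next_cursor)
  }
}

# Collect all ids by following the cursor. `wait` stands in for sleeping
# until the rate-limit window resets (an assumption for this sketch).
fetch_all <- function(fetch_page, retry_on_rate_limit = TRUE,
                      wait = function() NULL) {
  ids <- integer(0)
  cursor <- 0
  while (!is.na(cursor)) {
    page <- fetch_page(cursor)
    if (!is.null(page$error)) {
      if (!retry_on_rate_limit) break  # give up with a partial result
      wait()
      next  # retry the same cursor rather than treating the error as the end
    }
    ids <- c(ids, page$ids)
    cursor <- page$next_cursor
  }
  ids
}

all_ids <- fetch_all(make_fake_endpoint())
partial_ids <- fetch_all(make_fake_endpoint(), retry_on_rate_limit = FALSE)
length(all_ids)      # 25
length(partial_ids)  # 10
```

With retrying disabled, the loop ends at the first rate-limit error and returns only the pages fetched so far, which mirrors the truncated result reported above.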

@llrs llrs closed this as completed Sep 1, 2022
@TobiasKellner
Author

Yes, it works now after re-installing the package with install_github("ropensci/rtweet", ref = "devel")
