Certain re-tweets are not successfully scraped #14

Open
morris-y opened this issue Oct 4, 2023 · 4 comments

Comments

@morris-y

morris-y commented Oct 4, 2023

It worked, but there are cases where certain retweets are not scraped.

This feels fixable, because the instance I chose (instance = 'https://nitter.salastil.com') did include these retweets when I checked it. I also tried different instances, and they all return the same number of tweets for a given time range.

@morris-y
Author

morris-y commented Oct 5, 2023

After testing, it seems that retweets older than a certain time frame are not scraped from nitter, while recent retweets are usually scraped successfully.

@bocchilorenzo
Owner

bocchilorenzo commented Oct 6, 2023

Hi,
the issue appears to be on nitter's side. I tried the advanced search on the instance you provided and on other instances as well, with different accounts, and it seems that nitter either does not show retweets at all or only shows the most recent ones. For example, on X's profile, if you filter the search results between 01 July 2023 and 31 August 2023, you will notice that there are no retweets. But if you check the same period through the main "tweets" page for the profile, there are 2 retweets. The same thing happens on nitter's official instance, nitter.net.

Looking through the issues on nitter's repo, there is an open issue that describes the retweet filter as broken, so they are aware of the bug. For now we'll have to wait until it's solved.
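
To make the comparison above concrete, here is a minimal sketch of counting retweets in a date window on the client side. It assumes the tweets were already collected from the profile's main "tweets" page (not from search, which omits them), that each tweet is a dict with an "is-retweet" flag, and that its "date" is an ISO 8601 string. The function name `retweets_in_range`, the date format, and the sample data are hypothetical and only for illustration; whether older tweets can actually be collected this way is not established here.

```python
from datetime import date, datetime


def retweets_in_range(tweets, start, end, parse_date=datetime.fromisoformat):
    """Return the retweets whose date falls within [start, end], inclusive.

    Assumption: each tweet is a dict with a boolean "is-retweet" key and a
    "date" string that `parse_date` can turn into a datetime.
    """
    selected = []
    for tweet in tweets:
        # Skip anything not flagged as a retweet.
        if not tweet.get("is-retweet", False):
            continue
        posted = parse_date(tweet["date"]).date()
        if start <= posted <= end:
            selected.append(tweet)
    return selected


# Invented sample data, standing in for tweets read from a profile timeline.
sample = [
    {"text": "original post", "date": "2023-07-10T09:00:00", "is-retweet": False},
    {"text": "retweet in July", "date": "2023-07-15T12:30:00", "is-retweet": True},
    {"text": "retweet in August", "date": "2023-08-20T18:45:00", "is-retweet": True},
    {"text": "recent retweet", "date": "2023-10-01T08:00:00", "is-retweet": True},
]

found = retweets_in_range(sample, date(2023, 7, 1), date(2023, 8, 31))
print([t["text"] for t in found])  # ['retweet in July', 'retweet in August']
```

If the scraped dates come in a different format, a custom `parse_date` callable can be passed in instead of `datetime.fromisoformat`.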

@TomatoGreen2

Can't you scrape them indirectly, through the "is-retweet" descriptor of tweets? Maybe I am misunderstanding something...

@TomatoGreen2

Just checked - not possible... sorry...

3 participants