Can't scrape tweets anymore - Twitter weird response error #276

Open
tsy1274 opened this issue Sep 19, 2020 · 15 comments

@tsy1274

tsy1274 commented Sep 19, 2020

I am trying to scrape tweets and I get the attached error.

Same error for any username I try.

@wuliwei9278

I think Twitter changed something, and this repo no longer works.

@tsy1274
Author

tsy1274 commented Sep 20, 2020

Okay, is there a way to bypass Twitter's block?

@SAVI150593

Same here. I am no longer able to use GetOldTweets to fetch data either; I get the error below:
An error occured during an HTTP request: HTTP Error 404: Not Found
Try to open in browser: https://twitter.com/search?q=europe%20refugees%20since%3A2015-05-01%20until%3A2015-09-30&src=typd
An exception has occurred, use %tb to see the full traceback.
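
To see exactly what the failing request was asking for, the URL from the error message can be decoded with the standard library. This is only a minimal sketch; `decode_search_url` is a hypothetical helper name and is not part of GetOldTweets.

```python
from urllib.parse import urlsplit, parse_qs

# URL copied from the error message above.
FAILING_URL = (
    "https://twitter.com/search?q=europe%20refugees%20since%3A2015-05-01"
    "%20until%3A2015-09-30&src=typd"
)


def decode_search_url(url):
    """Return the host, path and decoded query parameters of a search URL."""
    parts = urlsplit(url)
    # parse_qs returns a list per key; keep the first value of each.
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return parts.netloc, parts.path, params


host, path, params = decode_search_url(FAILING_URL)
print(host, path)
print(params["q"])
```

The decoded `q` parameter shows the search terms together with the `since:` and `until:` date operators, which helps confirm that the query itself is well formed and that the 404 comes from the endpoint being requested.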

@tsy1274
Author

tsy1274 commented Sep 20, 2020

@SAVI150593 - did it work for you before?

@SAVI150593

> @SAVI150593 - did it work for you before?

Yes, it was working two days ago. I tried the same code today and it no longer works. Are you also getting the same error?

@tsy1274
Author

tsy1274 commented Sep 20, 2020

I see.
Yes, it is a similar error: it directs me to open the Twitter URL in a browser.
Maybe the script's authors can figure out a way to bypass it.

@RAJEN-Git

It seems that the X-Requested-With HTTP header set in the code (TweetManager.py) is no longer supported by Twitter:
    headers = [
        ('Host', "twitter.com"),
        ('User-Agent', "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.181 Safari/537.36"),
        ('Accept', "application/json, text/javascript, */*; q=0.01"),
        ('Accept-Language', "de,en-US;q=0.7,en;q=0.3"),
        ('X-Requested-With', "XMLHttpRequest"),
        ('Referer', url),
        ('Connection', "keep-alive")
    ]

Any workaround? Help.
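
One way to test the header theory is to send the same request with and without X-Requested-With and compare the status codes. The sketch below uses only the standard library; `build_headers`, `make_request` and `status_of` are hypothetical helper names, not functions from TweetManager.py, and whether dropping the header changes Twitter's response is an assumption that has not been verified here.

```python
import urllib.error
import urllib.request

USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/66.0.3359.181 Safari/537.36"
)


def build_headers(url, include_xhr=True):
    """Return the header list quoted above, optionally without X-Requested-With."""
    headers = [
        ("Host", "twitter.com"),
        ("User-Agent", USER_AGENT),
        ("Accept", "application/json, text/javascript, */*; q=0.01"),
        ("Accept-Language", "de,en-US;q=0.7,en;q=0.3"),
        ("X-Requested-With", "XMLHttpRequest"),
        ("Referer", url),
        ("Connection", "keep-alive"),
    ]
    if not include_xhr:
        headers = [(k, v) for k, v in headers if k != "X-Requested-With"]
    return headers


def make_request(url, include_xhr=True):
    """Build (but do not send) a urllib request carrying those headers."""
    return urllib.request.Request(url, headers=dict(build_headers(url, include_xhr)))


def status_of(request, timeout=10):
    """Send the request and return the HTTP status code (performs a network call)."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx; the code is still the useful result here.
        return err.code
```

Comparing `status_of(make_request(url, include_xhr=True))` with `status_of(make_request(url, include_xhr=False))` for the same `url` would show whether the header alone makes a difference. If both return 404, the endpoint itself is the more likely cause.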

@lfwells

lfwells commented Oct 6, 2020

Guys, it looks like the legacy Twitter page that this repo (and others) scraped (twitter.com/i/search/[...]) no longer exists.
The solution looks to be scraping the proper search page, https://twitter.com/search?q=[...], whose HTML elements are a little trickier to scrape, but it is possible.
Part of that solution involves (for me at least) adding a Selenium-driven web browser to my Python code; otherwise the page thinks JavaScript is disabled, or at least doesn't properly wait for the React-based page to populate.
The biggest problem with the "new" search is that it can only filter by dates (not datetimes) and seems to completely ignore the max_position parameter in the code, which makes scraping everything AFTER a specific tweet pretty difficult...
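
A rough sketch of the Selenium approach described above, assuming the `selenium` package and a Firefox driver are installed. The helper names `search_url` and `fetch_rendered_html` are hypothetical, and the `article` CSS selector used to detect rendered tweets is an assumption about the page markup that may not hold.

```python
from urllib.parse import quote


def search_url(query):
    """Build a URL for the regular search page from a plain-text query."""
    return "https://twitter.com/search?q=" + quote(query)


def fetch_rendered_html(url, timeout=15):
    """Load the page in a real browser and return the HTML after rendering."""
    # Imported here so the URL helper works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Firefox()
    try:
        driver.get(url)
        # Wait until the JavaScript app has put at least one result on the page.
        # "article" is an assumed selector, not a documented one.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, "article"))
        )
        return driver.page_source
    finally:
        driver.quit()
```

Calling `fetch_rendered_html(search_url("europe refugees"))` would return the populated HTML, which can then be parsed for tweet elements. The explicit wait is the important part: reading `page_source` straight after `driver.get` may only return the empty application shell.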

@opeyemibami

What's the way forward? Any solution yet?

@itsayushisaxena

> What's the way forward? Any solution yet?

https://github.com/AyushiiSaxena/Get_Old_Tweets-Python
This will help you:)

@itsayushisaxena

https://github.com/AyushiiSaxena/Get_Old_Tweets-Python
Please check this

@DV777

DV777 commented Nov 12, 2020

Same problem here... hopefully someone will manage to fix it soon...

@DV777

DV777 commented Nov 12, 2020

> https://github.com/AyushiiSaxena/Get_Old_Tweets-Python
> Please check this

Unfortunately, it does not work when one tries to scrape the full timeline of a particular user...

@abushoeb

> Unfortunately, it does not work when one tries to scrape the full timeline of a particular user...

Can't you add before and until options for a particular user to get the job done? I remember I used that once with before and until and it worked for me.
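
As a sketch of that idea, a user's timeline can be split into one search query per day. This assumes the `from:`, `since:` and `until:` search operators (the last two appear in the failing URL earlier in this thread); `daily_queries` is a hypothetical helper and `example_user` is a placeholder username.

```python
from datetime import date, timedelta


def daily_queries(username, start, end):
    """Yield one search query per day covering the range [start, end)."""
    day = start
    while day < end:
        next_day = day + timedelta(days=1)
        yield f"from:{username} since:{day.isoformat()} until:{next_day.isoformat()}"
        day = next_day


queries = list(daily_queries("example_user", date(2020, 9, 18), date(2020, 9, 20)))
for query in queries:
    print(query)
```

Each query string can be passed to whichever scraper or search URL is in use. Since the search only filters by whole dates, one-day windows are the smallest slices this approach can produce.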

@BradKML

BradKML commented Apr 12, 2021

All hands on deck, let's discuss and fix this #280
