Some questions #5

DoctorDream · 2020-12-22T01:47:28Z

Thank you so much for this repository! I have spent nearly two weeks researching how to crawl tweets together with their replies, but none of the existing repositories, such as TWINT, worked.
Do you know TWINT? I'm a developer from China. Even with a proxy configured, TWINT keeps reporting errors:

WARNING:root:Error retrieving https://twitter.com/: ReadTimeout(ReadTimeoutError("HTTPSConnectionPool(host='twitter.com', port=443): Read timed out. (read timeout=10)",),), retrying

I saw you said that Twitter has blocked all crawlers during this period. Is TWINT failing for that reason, or is it because I am in China and have set up my proxy incorrectly?

Altimis · 2020-12-22T15:43:55Z

Hi @DoctorDream, thank you for your feedback. In fact, I'm not sure that TWINT works these days; at least it didn't work for me, and that's why I worked on Scweet. What I am sure about is that all API-based scrapers stopped working because Twitter changed the API to version 2. Did Scweet meet your requirements? What should I add to improve it?

DoctorDream · 2020-12-23T01:30:51Z

@Altimis
Thank you very much for your reply. Your program basically meets my needs, but I ran into a few problems while using it.
I use a Twitter crawler to collect conversations for academic research, and Twitter's timeline-based structure has caused me some difficulties.
When I crawl tweets, two consecutive tweets may be replies to different tweets, which makes it impossible to assemble them into a dialogue.
Is there a way to crawl replies starting from the main tweet, just as when browsing on the web?
Thank you very much for your enthusiasm!
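
To make the grouping problem concrete, here is a minimal sketch of how interleaved timeline tweets could be regrouped into dialogues once each record carries the ID of the tweet it replies to. The field names `tweet_id` and `in_reply_to_id` and both helper functions are hypothetical; they are not Scweet's output schema, and the crawler would first have to capture the reply target for this to work.

```python
from collections import defaultdict


def find_root(tweet_id, parent_of):
    """Follow reply links upward until reaching a tweet with no known parent."""
    seen = set()
    while parent_of.get(tweet_id) is not None and tweet_id not in seen:
        seen.add(tweet_id)  # guard against malformed data that loops
        tweet_id = parent_of[tweet_id]
    return tweet_id


def group_into_dialogues(tweets):
    """Map each root tweet ID to the tweets of its thread, in input order.

    Assumes every record is a dict with the hypothetical keys
    'tweet_id' and 'in_reply_to_id' (None for a main tweet).
    """
    parent_of = {t["tweet_id"]: t.get("in_reply_to_id") for t in tweets}
    dialogues = defaultdict(list)
    for t in tweets:
        dialogues[find_root(t["tweet_id"], parent_of)].append(t)
    return dict(dialogues)


if __name__ == "__main__":
    # Two consecutive replies (3 and 4) answer different main tweets.
    timeline = [
        {"tweet_id": 1, "in_reply_to_id": None},
        {"tweet_id": 2, "in_reply_to_id": None},
        {"tweet_id": 3, "in_reply_to_id": 1},
        {"tweet_id": 4, "in_reply_to_id": 2},
        {"tweet_id": 5, "in_reply_to_id": 3},
    ]
    for root, thread in group_into_dialogues(timeline).items():
        print(root, [t["tweet_id"] for t in thread])
```

If a reply points at a tweet that was never collected, it is grouped under that missing tweet's ID, so incomplete threads remain identifiable.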

Altimis · 2020-12-23T01:47:22Z

@DoctorDream If I understood correctly, you want to scrape the replies to every tweet, is that right? For example, for this tweet:

You want to click on the comments and gather all the replies (1.7k replies). If so, it may be a real challenge for Scweet. First, you may be required to sign in to view the replies to a given tweet; second, the process may take very long, since the script needs to open the replies (click) and scroll to scrape all of them.

DoctorDream · 2020-12-23T02:27:44Z

@Altimis
Yes, that's what I mean.
For a given tweet, I don't need to collect all the replies. I only need the most-liked ones, because those replies tend to be followed by more people.
I expect to spend weeks collecting data, so the time it takes won't matter much to me.
So, would it be feasible for you to implement this feature?
Thank you very much!
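
As an illustration of the selection described above, the following sketch keeps only the most-liked replies from a list that has already been scraped. The `likes` field and the `top_replies` helper are assumptions made for this example; they are not part of Scweet.

```python
def top_replies(replies, k=5, min_likes=0):
    """Return up to k replies with the most likes, highest first.

    Assumes each reply is a dict with a hypothetical integer 'likes' key;
    replies missing the key are treated as having 0 likes.
    """
    eligible = [r for r in replies if r.get("likes", 0) >= min_likes]
    return sorted(eligible, key=lambda r: r.get("likes", 0), reverse=True)[:k]


if __name__ == "__main__":
    scraped = [
        {"id": 1, "likes": 10},
        {"id": 2, "likes": 250},
        {"id": 3, "likes": 3},
        {"id": 4, "likes": 90},
    ]
    print([r["id"] for r in top_replies(scraped, k=2)])
```

Filtering after scraping keeps the crawler simple, at the cost of still loading replies that are later discarded.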

Altimis · 2020-12-23T15:33:10Z

@DoctorDream I think it is possible. I'll work on that.

Altimis · 2020-12-23T18:35:03Z

@DoctorDream I have a question for you. Will you have the tweet_id of a given tweet whose replies you want to scrape? Or do you want to crawl all tweets and get their replies?

DoctorDream · 2020-12-24T03:06:44Z

@Altimis
Thank you very much!
Actually, I just need to crawl tweets with their replies to form dialogues, so I don't need to crawl a tweet with a specific tweet_id.

Altimis added the TODO label Dec 30, 2020

PairplaneN mentioned this issue Mar 3, 2022

get_user_follow fails since new chrome version #113

Open
