Twint doesn't get all followers list #340
Comments
It's possible that Twitter stops returning new entities because you (like anyone else in that case) made too many requests. |
Thanks @pielco11, I was wondering whether there is a way around it, e.g., raising an error so that it can continue pulling the information from that point, or keeping the requests within some limit so that it doesn't happen. Thanks |
I'll look deeper (I can't commit to a deadline). For now I can say that the issue does not seem to have a unique pattern. A solution could be to retry the query when it fails; in any case, the code should only be changed after a deeper look at what is going on. |
Thank you @pielco11 ! |
Hi, first of all, thanks for building such an amazing tool and publishing the code; I am sure I will learn a lot from your work. I tried getting a long list, ~130k, and it stopped at a random number of followers in each query. Finally, if you have a PayPal account, I would like to buy you a coffee for posting the source code because, as I said, I would like to learn how you built the tool, which would be impossible without the source code. |
~130k followers is a lot, so Twitter might be blocking requests at a random time. For the second point, Twitter blocks an IP if it makes too many requests, which is why using a VPN solves the problem. What we could try is handling that "followers count" issue, asking the user to change the IP and then retrying the query, to see if this solves the issue. Unfortunately I do not have enough time to solve every issue, so the patch will be delayed. Any kind of help with development is welcome. |
@mmosleh Here is what's going on In the first case there is a I think that we found the origin of the issue and sadly we can't do anything, at least for now |
@pielco11 I made a quick and dirty patch to the previous version of Twint (the one with a single file): just a few retries on the last cursor ID when it receives the error message. I managed to download all 32M NASA followers this way. (I'm not familiar with the code base of the new version, though.) |
@mmosleh oh, nice... could you provide me the commit ID? |
Adding a timeout seems to solve the issue. Without timeouts I'm able to get up to 40 followers/following, adding |
Has this issue been resolved in the current iteration of get.py? I'm not seeing the time.sleep(3) line within the script. Thanks once again! |
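A minimal sketch of the "retry on the last cursor" idea described above. This is not Twint's code: `fetch_page` is a hypothetical function that takes a cursor and returns the page's items plus the next cursor (or `None` at the end), and `max_retries` and `delay` are illustrative parameters.

```python
import time
from typing import Callable, List, Optional, Tuple

# Hypothetical signature: cursor in, (items, next_cursor) out.
# next_cursor is None once the list is exhausted.
FetchPage = Callable[[Optional[str]], Tuple[List[str], Optional[str]]]


def fetch_all(fetch_page: FetchPage, max_retries: int = 5, delay: float = 1.0) -> List[str]:
    """Walk a cursor-paginated list, retrying the same cursor when a request fails."""
    items: List[str] = []
    cursor: Optional[str] = None
    while True:
        for attempt in range(max_retries + 1):
            try:
                page, next_cursor = fetch_page(cursor)
                break
            except Exception:
                if attempt == max_retries:
                    raise  # give up after the last allowed retry
                # Wait a little longer after each failed attempt.
                time.sleep(delay * (attempt + 1))
        items.extend(page)
        if next_cursor is None:
            return items
        cursor = next_cursor
```

Because the cursor only advances after a successful request, a failure never skips a page; it just repeats the same request.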
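For illustration, here is one way to space out requests instead of adding a bare time.sleep(3) after each one. It is only a sketch and not part of Twint: the `Throttle` class and its `interval` parameter are hypothetical names.

```python
import time
from typing import Optional


class Throttle:
    """Ensure successive calls to wait() are at least `interval` seconds apart."""

    def __init__(self, interval: float) -> None:
        self.interval = interval
        self._last: Optional[float] = None  # monotonic time of the previous call

    def wait(self) -> None:
        now = time.monotonic()
        if self._last is not None:
            remaining = self.interval - (now - self._last)
            if remaining > 0:
                time.sleep(remaining)
        self._last = time.monotonic()
```

Calling `throttle.wait()` right before each request, with `Throttle(3.0)`, gives roughly the same pacing as the sleep line mentioned above, but does not add extra delay when the request itself was already slow.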
@KristopherMakuch I did not apply that "patch" since I'm not sure that's a patch. More testing is needed, everyone is welcome to find a workaround |
How do I include a control file to know on which page it stopped? Example: username_page.txt = file with the ID of the last followers page. If processing is interrupted, it could be run again after a while and continue from the followers page whose ID is in the file. My original command: My error today: file with 1036 followers, but this profile has 3,800 followers. Thanks again for all the work. Sorry for my poor English. |
Hi @Matiusco, to add some timeouts you just have to add a line as described above. You could resume the scrape with something like When Twint stops (most probably because Twitter does not return more data), you just have to re-run the command to resume from where it stopped. |
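The control-file idea can be sketched roughly as follows. This is an assumption-laden illustration, not how Twint's resume option is implemented: `fetch_page` is a hypothetical function returning a page of names and the next cursor, and the file layout (one cursor ID in the resume file, one name per line in the output) is made up for the example.

```python
from pathlib import Path
from typing import Callable, List, Optional, Tuple

# Hypothetical signature: cursor in, (names, next_cursor) out.
FetchPage = Callable[[Optional[str]], Tuple[List[str], Optional[str]]]


def load_cursor(resume_path: Path) -> Optional[str]:
    """Return the saved cursor, or None if there is no checkpoint yet."""
    if resume_path.exists():
        return resume_path.read_text().strip() or None
    return None


def scrape_with_checkpoint(fetch_page: FetchPage, out_path: Path, resume_path: Path) -> None:
    """Append each page to out_path and record the next cursor in resume_path."""
    cursor = load_cursor(resume_path)
    while True:
        # If this raises, the checkpoint on disk still points at `cursor`.
        page, next_cursor = fetch_page(cursor)
        with out_path.open("a") as out:
            out.writelines(name + "\n" for name in page)
        if next_cursor is None:
            if resume_path.exists():
                resume_path.unlink()  # finished: no checkpoint needed
            return
        resume_path.write_text(next_cursor)
        cursor = next_cursor
```

Re-running the same call after an interruption continues from the cursor stored in the resume file, so pages already written are not fetched again.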
hi @pielco11 edited report |
I cannot get all followers. My command in the terminal:
twint -u zehdeabreu --followers -o user_followers.txt --resume zehdeabreu_followers_resume.txt -t 15 -l 50
-t 15 and -l 50 are not working... How could I control the time between requests from inside a Python file, to allow a much longer pause between requests?
=====
wc -l user_followers.txt
16470 user_followers.txt
tks for all help |
Your query should be something like I also tried the resume option and it works fine |
tks @pielco11 , I'll try |
I had a similar issue. The problem seems to be that sometimes the "more" button
doesn't appear on Twitter's end, and Twint assumes that it has reached the end
of the list of followers. However, if it tries again and sends another request,
the button might be available to the scraper. So one quick fix could
be to get the number of followers first, then keep retrying until the
number of followers matches.
|
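The "keep retrying until the number of followers matches" fix could look something like the sketch below. It assumes a hypothetical `fetch_batch()` callable that returns whatever names one scrape attempt manages to collect, and an `expected` count obtained beforehand; neither is a Twint function. The attempt cap is there because the reported count may never be reached exactly.

```python
from typing import Callable, Iterable, Set


def collect_until_complete(
    fetch_batch: Callable[[], Iterable[str]],
    expected: int,
    max_attempts: int = 10,
) -> Set[str]:
    """Repeat scrape attempts until `expected` unique names are seen or attempts run out."""
    seen: Set[str] = set()
    for _ in range(max_attempts):
        seen.update(fetch_batch())  # duplicates across attempts are ignored
        if len(seen) >= expected:
            break
    return seen
```

Comparing `len(result)` with `expected` afterwards tells you whether the list is complete or the attempts were exhausted first.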
OK @mmosleh, but I do not know how I could change this part of the code in get.py. I still cannot get all the followers. |
How can I get the IDs of the followers instead of the username, please? Thank you |
@nxhuy-github please write comments about the topic of the issue. Anyway you can do that using |
Hi, what is the current status for the code that retrieves all the followers of one person? I am still having the problem that only a subset of followers is downloaded. I am using the command |
@datduong Twitter works effectively to prevent Twint from getting all the followers; I highly suggest you use the API. |
Hi. I am facing the same issue now. My finding is this:
My proposal is this:
|
It seems Twint doesn't get the list of all followers for accounts with a large number of followers and stops abruptly at some random number. For example, I tried
twint -u nasa --followers
and each time the script stopped at some random number, with a few thousand screen_names.
pip3 install --upgrade -e git+https://github.com/twintproject/twint.git@origin/master#egg=twint