[Question] Is it possible to get the replies (and replies to the replies) to a tweet with twint? #513

pbabvey · 2019-09-06T23:00:25Z

No description provided.

pielco11 · 2019-09-07T07:41:23Z

Yes and no.

No, because there is currently no way to pass the ID of a tweet and get only the replies to it.

Yes, because if you know the ID of the tweet whose replies you want (let's call it TweetID), you just have to search for tweets sent to your target and then filter for conversation_id == TweetID.
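The filtering step described above could look roughly like the sketch below. It assumes the search results are already in a pandas DataFrame with a conversation_id column; the sample rows and the tweet_id value are made up for illustration, and IDs are kept as strings.

```python
import pandas as pd

# Made-up rows standing in for tweets returned by a search for tweets
# sent to the target account.
tweets = pd.DataFrame(
    {
        "id": ["101", "102", "103", "104"],
        "conversation_id": ["100", "100", "200", "100"],
        "tweet": ["first reply", "second reply", "unrelated thread", "third reply"],
    }
)

tweet_id = "100"  # hypothetical ID of the tweet whose replies we want

# Keep only the tweets that belong to the conversation started by tweet_id.
replies = tweets[tweets["conversation_id"] == tweet_id]
print(replies["id"].tolist())
```

Comparing IDs as strings sidesteps any numeric rounding of long tweet IDs.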

pbabvey · 2019-09-08T23:09:57Z

Actually, I am trying to collect all the reply threads to a primary tweet. Since reply_to_id does not exist among the tweet attributes, the above method mixes up the reply threads in some cases.

Assume user A replies to a tweet and then gets a reply from user B. Afterwards, user A responds to B's reply. Or assume user A replies to the main tweet several times and gets replies from a group of users each time. Then, using the "To" configuration, we cannot recover the correct tree structure of the replies. Is there any other method to solve the problem?

I wonder whether the only way to resolve some of the ambiguity is to use the 'reply_to' attribute, unless we can add some fields to the attributes.

pielco11 · 2019-09-09T18:12:08Z

Assume user A replies to a tweet, then it gets a reply from user B. After, user A responds to B's reply.

Technically speaking, in such a case a new discussion is involved.

In your case I would get all the replies to the "mother tweet", and then, for every "child tweet", get the corresponding replies.

In the from field you can add up to about 20 users. Also consider that when you reply to a tweet and (maybe) start a new discussion, all the users involved in the "inherited discussion(s)" get notified. So you can filter with the from field and the to field, and then organize the data using conversation_id as the key.

That's what I'd do, as a starting point at least.

Hope this helps
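A rough sketch of the "mother tweet, then child tweets" approach above. The fetch_replies_to callable is hypothetical: it stands in for whatever search returns the tweets sent to a given user (for example a twint search with the To field set). The users, IDs, and the max_users cap are made up for illustration, and the sample data assumes all replies share one conversation_id, which may not hold if a reply starts a new discussion.

```python
from collections import deque
from typing import Callable, Dict, List

Tweet = Dict[str, str]


def collect_thread(
    root_user: str,
    fetch_replies_to: Callable[[str], List[Tweet]],
    max_users: int = 50,
) -> Dict[str, List[Tweet]]:
    """Collect replies to root_user, then replies to each replier,
    grouping everything by conversation_id."""
    by_conversation: Dict[str, List[Tweet]] = {}
    seen_users = {root_user}
    seen_tweets = set()
    queue = deque([root_user])
    while queue:
        user = queue.popleft()
        for tweet in fetch_replies_to(user):
            if tweet["id"] in seen_tweets:
                continue  # the same tweet can show up in several searches
            seen_tweets.add(tweet["id"])
            by_conversation.setdefault(tweet["conversation_id"], []).append(tweet)
            author = tweet["username"]
            # Queue each new replier so replies sent to them are fetched too.
            if author not in seen_users and len(seen_users) < max_users:
                seen_users.add(author)
                queue.append(author)
    return by_conversation


# Fake search results keyed by the user the tweets were sent to.
FAKE_REPLIES = {
    "mother": [{"id": "2", "conversation_id": "1", "username": "alice"}],
    "alice": [{"id": "3", "conversation_id": "1", "username": "bob"}],
    "bob": [],
}

threads = collect_thread("mother", lambda user: FAKE_REPLIES.get(user, []))
print({cid: [t["id"] for t in tweets] for cid, tweets in threads.items()})
```

Grouping by conversation_id only buckets the tweets; it does not by itself recover which reply answers which.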

pushshift · 2019-09-10T01:16:22Z

I am fairly certain that searching "to:@user" picks up all replies regardless of where they are in the tree. This should get replies to the original user, replies to replies, etc.

If you notice on Twitter itself, all names involved in a reply are there. So if @user1 makes a tweet and @user2 replies to @user1 and @user3 replies to @user2, when @user3 makes a reply, you will see both @user2 and @user1 in that tweet.

In this case, simply doing a search for "to:@original_user" should eventually find all tweets in the tree.

(I'm ~80% sure this is the case.)

pbabvey · 2019-09-10T03:58:47Z

Thank you for your response.
Unfortunately, "to:@original_user" does not return replies to the replies. It only finds the tweets that involve original_user alone. For example, if we run the code below, "to:@original_user" gets just the direct replies.

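from collections import Counter

import twint

# Fetch the original ("mother") tweets posted by the account.
mothers = twint.Config()
mothers.Username = "@JonAcuff"
mothers.Since = "2019-09-04"
mothers.Until = "2019-09-08"
mothers.Lang = 'en'
mothers.Pandas = True
mothers.Store_csv = True
mothers.Hide_output = True
twint.run.Search(mothers)
mothers_df = twint.storage.panda.Tweets_df
# Map each conversation to the number of replies reported for it.
reported_replies = dict(zip(mothers_df['conversation_id'], mothers_df['nreplies']))

# Fetch the tweets sent to the account over a slightly longer window.
replies = twint.Config()
replies.Since = "2019-09-04"
replies.Until = "2019-09-10"
replies.Pandas = True
replies.To = "@JonAcuff"
replies.Hide_output = True
twint.run.Search(replies)
replies_df = twint.storage.panda.Tweets_df

# Count how many fetched replies fall into each conversation.
fetched_replies = Counter(replies_df['conversation_id'])
# Print: conversation ID, reported reply count, fetched reply count.
for conversation_id, reported in reported_replies.items():
    print(conversation_id, "\t{}\t{}\t".format(reported, fetched_replies[conversation_id]))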

If we could customize the "to:@original_user" query, that would make everything easier.

pushshift · 2019-09-10T08:50:06Z

My bad -- the mistake I made was using "to:@user1" instead of "@user1" -- if you do a search for "@user1" it picks up all tweets in the tree (replies, replies to replies, etc.)

I just tested this and was able to reconstruct the entire tree for a few sample cases.

So this works well -- you will also get all user mentions with "@user1" but you can throw everything out except the tweets with "in_reply_to_status_id" or "in_reply_to_screen_name".
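A minimal sketch of the filtering and tree reconstruction described above, assuming each collected tweet exposes an in_reply_to_status_id field (None for plain mentions). Whether a given scraper's output actually includes that field is not confirmed here; the records are made up for illustration.

```python
from collections import defaultdict

# Made-up tweets from a search for "@user1"; plain mentions carry no
# in_reply_to_status_id.
tweets = [
    {"id": "10", "in_reply_to_status_id": None, "text": "original tweet"},
    {"id": "11", "in_reply_to_status_id": "10", "text": "reply to original"},
    {"id": "12", "in_reply_to_status_id": "11", "text": "reply to the reply"},
    {"id": "13", "in_reply_to_status_id": None, "text": "just mentions @user1"},
    {"id": "14", "in_reply_to_status_id": "10", "text": "another direct reply"},
]

# Throw out everything that is not a reply and index replies by parent ID.
children = defaultdict(list)
for tweet in tweets:
    parent = tweet["in_reply_to_status_id"]
    if parent is not None:
        children[parent].append(tweet["id"])


def thread_ids(root_id):
    """Return the IDs of all replies below root_id, depth first."""
    ordered = []
    for child in children.get(root_id, []):
        ordered.append(child)
        ordered.extend(thread_ids(child))
    return ordered


print(thread_ids("10"))
```

With a parent ID on every reply, the tree follows directly; without it, only the conversation grouping is available.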

pbabvey · 2019-09-12T22:44:35Z

Sure. Thank you.

…

On Thu, Sep 12, 2019 at 3:35 PM Ed Summers wrote: Would it be ok to close this ticket now?

dotgodly · 2020-05-16T11:34:05Z

Is in_reply_to_status_id still in the response? Using the search query "@username" or "to:@username" does not give me that field.

I can track replies with reply_to and conversation_id but I can't put them in the correct order with just that information.

AmanKabra · 2020-11-04T06:53:57Z

To fetch replies to a tweet, one needs the tweet ID. The tweet ID (of 20 characters) scraped by twint is being rounded off after the 5th digit from the left, so it no longer identifies the tweet correctly. Can anyone help?

himanshudabas · 2020-11-04T07:30:07Z

@AmanKabra
Are you opening the CSV in MS Excel?
MS Excel cannot represent integers longer than 15 digits correctly.
Try importing the CSV into pandas as a DataFrame, which will give the correct tweet IDs, or perhaps open the CSV in a text editor like Notepad++, but I'd suggest using pandas instead.
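One way to load the CSV in pandas without risking numeric conversion is to read the ID columns as strings. This is a sketch under assumptions: the column names id and conversation_id and the sample IDs are made up for illustration, and io.StringIO stands in for the real file path.

```python
import io

import pandas as pd

# Stand-in for the CSV file on disk; the IDs are hypothetical.
csv_text = (
    "id,conversation_id,tweet\n"
    "1289375576856330241,1289375576856330001,hello\n"
)

# Reading ID columns as strings keeps every digit exactly as written.
df = pd.read_csv(
    io.StringIO(csv_text),
    dtype={"id": str, "conversation_id": str},
)
print(df["id"].iloc[0])
```

This only helps if the digits in the file are still intact; a file that was re-saved after the IDs were rounded cannot be repaired this way.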

AmanKabra · 2020-11-04T09:02:51Z

@himanshudabas
To give an example, here's what the first 10 rows of my dataset look like in pandas:

0 1.289350e+18
1 1.289350e+18
2 1.289340e+18
3 1.289330e+18
4 1.289330e+18
5 1.289330e+18
6 1.289310e+18
7 1.289310e+18
8 1.289310e+18
9 1.289310e+18

These are all unique tweets. 4 IDs are being shown as duplicates (using the drop_duplicates() function in pandas). This is only possible if the last few digits of conversation_id are being set to zero.

Could you suggest something else? Thanks in advance.

himanshudabas · 2020-11-04T14:32:22Z

@AmanKabra
Use something like this to extract all the IDs in the desired format.
Note that tweet_data_pd_df here is your CSV data and np is NumPy.

all_ids = tweet_data_pd_df["id"].fillna(0.0).astype(np.int64)
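Note that a cast like this changes how the values are displayed but cannot bring back digits that were already lost. The sketch below illustrates why, using a made-up 19-digit ID: a 64-bit float carries roughly 15 to 17 significant decimal digits, so a round trip through float changes the value.

```python
# Hypothetical 19-digit tweet ID, chosen only for illustration.
tweet_id = 1289375576856330241

as_float = float(tweet_id)   # what a float64 column would hold
round_trip = int(as_float)   # what casting back to an integer yields

print(tweet_id)
print(round_trip)
print(round_trip == tweet_id)
```

If the cast output ends in a long run of zeros, the rounding most likely happened before the data reached pandas.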

git175 · 2020-11-04T17:38:10Z

int64

AmanKabra · 2020-11-05T10:08:46Z

@himanshudabas

Response:

0 1289350000000000000
1 1289350000000000000
2 1289340000000000000
3 1289330000000000000
4 1289330000000000000
...
14750 1289375576856330240
14751 1289370524196298752
14752 1289354168709165056
14753 1289354145258856448
14754 1289354143362985984
Name: id, Length: 14755, dtype: int64

Should I scrape from scratch?

himanshudabas · 2020-11-05T10:13:48Z

@AmanKabra
For some reason your initial tweet IDs have been changed.
Perhaps you opened them somewhere and saved the truncated values.
Try doing a fresh scrape, then load the CSV in pandas and check.
That should give you the desired tweet IDs.

pielco11 added the question label Sep 7, 2019

pbabvey changed the title ~~[Question] Is it possible to get the replies to a tweet with twint?~~ [Question] Is it possible to get the replies (and replies to the replies) to a tweet with twint? Sep 12, 2019

pielco11 mentioned this issue Sep 18, 2019

Download tweet responses #520

Closed

git175 mentioned this issue Nov 4, 2020

Yes&No twintproject/twint-explorer#2

Open
