# 07_investigating_peculiarities_in_tweets
Following the analysis of the full set of tweets collected for this project (see `../analysis/02_tweet_analysis.ipynb`), JT has identified a number of tweets, and peculiarities within tweets, that deserve closer attention.

We will look at these in a bit more depth here.

NL, 04/01/23

## IMPORTS

In [1]:
import os  # needed for os.getenv() in the INIT section
from dotenv import load_dotenv
import tweepy
import pandas as pd
from time import sleep

## PATHS & CONSTANTS

In [2]:
TWEETS_PATH = '/home/nikloynes/projects/world_cup_misinfo_tracking/data/tweets_clean/'
EXPORT_PATH = '/home/nikloynes/projects/world_cup_misinfo_tracking/data/exports/'

In [3]:
USER_FIELDS = ['id', 'name', 'username', 'created_at', 'description', 'location', 'public_metrics']

## INIT

In [4]:
pd.set_option('display.max_rows', 100)
pd.set_option('display.max_columns', 50)

In [5]:
load_dotenv()

True

In [6]:
BEARER_TOKEN = os.getenv('TWITTER_BEARER_TOKEN')

In [7]:
client = tweepy.Client(bearer_token=BEARER_TOKEN)

## THE THING!

### 1. Tweet with id `1604520962959020032`

Text: `anya taylor-joy and timothee chalamet when they get cast in a movie together after this world cup`  
Attached: a video with a meme-y clip of Ben Stiller and the blonde guy.

We are surprised at the impact of this tweet, given the *modest* follower/following numbers of its author (412 following / 787 followers). To dig a bit deeper, let's do the following:

- get the user's followers
- get the interactions with this tweet, and see how many of the interacting accounts are actually followers
- the likelihood is that someone RTd the tweet and that led to the big interaction numbers, but we will find out

In [9]:
target_user_id = '1269288366954237952'
target_tweet_id = '1604520962959020032'

1. get followers

In [17]:
followers = []

for response in tweepy.Paginator(client.get_users_followers, 
                                 id=target_user_id,
                                 user_fields=USER_FIELDS,
                                 max_results=1000, 
                                 limit=10):
    for item in response[0]:
        followers.append(item.data)

In [18]:
print(f'We have retrieved info for {len(followers)} followers')

We have retrieved info for 790 followers


Let's have a look at the followers by 'clout' (follower count).

In [19]:
followers_df = pd.DataFrame(followers)
followers_df = pd.concat([followers_df.drop(['public_metrics'], axis=1), followers_df['public_metrics'].apply(pd.Series)], axis=1)
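
The `apply(pd.Series)` pattern above works, but it can get slow on larger frames. Below is a minimal alternative sketch, assuming every `public_metrics` value is a plain dict (as it is in the user payloads here). The helper name `flatten_metrics` and the sample rows are made up for illustration.

```python
import pandas as pd

def flatten_metrics(df, column='public_metrics'):
    """Expand a column of dicts into one column per key (hypothetical helper)."""
    # build the metrics frame in one go instead of row by row
    metrics = pd.DataFrame(df[column].tolist(), index=df.index)
    return pd.concat([df.drop(columns=[column]), metrics], axis=1)

# made-up sample rows, shaped like the user payloads in this notebook
sample = pd.DataFrame([
    {'id': '1', 'username': 'a',
     'public_metrics': {'followers_count': 10, 'following_count': 5}},
    {'id': '2', 'username': 'b',
     'public_metrics': {'followers_count': 3, 'following_count': 8}},
])
flat = flatten_metrics(sample)
```

The same helper could be reused for the retweeters frame further down.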

In [20]:
followers_df.sort_values('followers_count', ascending=False).head(25)

Unnamed: 0,description,created_at,username,id,location,name,followers_count,following_count,tweet_count,listed_count
442,friendly Black hottie ✰ films ✰ he/him ✰ swift...,2010-04-17T16:58:27.000Z,nothnghppens,134171362,New York,alex 💭 ⁷,45112,8054,362465,728
477,fan account | studying,2016-10-02T18:13:37.000Z,civiIswar,782644724322578432,,zach,33831,1984,30717,233
226,"@flolikethis & @sza stan | ratioed by many, ha...",2022-07-08T18:07:13.000Z,SUMMERTlMEFLO,1545468746411950080,fan account\n\n4/4 flo follow\n\nhe/him,ben. | fan account,19299,7930,16621,0
365,andrew garfield 🫶,2013-08-27T19:57:31.000Z,dickgrayscn,1705463874,she/her,han,18143,300,21976,82
298,that one girl who doesn’t play about imogen te...,2013-09-16T16:18:07.000Z,temuIts,1872174966,she/her lesbian 23,boob ross,17877,1330,72658,76
619,"🏳️‍🌈 🫧❤️‍🔥🫀🩸tv & film, horror enthusiast / kir...",2014-11-22T13:47:01.000Z,mcumagik,2887836579,he/him,joe ❄️,12062,6279,208869,64
630,fan account. @poisonivynews,2016-07-19T18:33:18.000Z,dcsivy,755470587334684672,she/her • 24,rose,11696,1368,3107,57
2,hot chocolate and nijiro murakami 🫶,2011-03-24T19:29:38.000Z,deppchapel,271565543,25 / she/her / sisterwives,laura 🏹,10623,5991,41743,65
358,riri williams over morals.,2022-03-04T05:51:54.000Z,GEMINISVERSE,1499623614877999109,hispanic she/them,TARARIRI HOMECOMING THIS YEAR.,10560,1917,7208,17
636,unfortunately in love with annoying old men | ...,2013-02-01T22:24:17.000Z,poeskys,1140870296,br ☾ she/her,bea bond,10244,9201,56813,43


Some takeaways:
- Some users have pretty decent clout: even the 100th most-followed of our target user's followers has over 1,700 followers.
- These users strike me as real people; they seem to be fairly young/woke/alternative.
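
A figure like the "100th most-followed" one can be read straight off the frame. Here is a small sketch, assuming a numeric `followers_count` column; `nth_largest` is a hypothetical helper and the toy values are for illustration only.

```python
import pandas as pd

def nth_largest(df, column, n):
    """Return the n-th largest value in `column` (1-based), or None if there are fewer than n rows."""
    if len(df) < n:
        return None
    return df[column].nlargest(n).iloc[-1]

# toy frame for illustration, not the real follower data
toy = pd.DataFrame({'followers_count': [50, 1700, 45112, 300, 33831]})
third = nth_largest(toy, 'followers_count', 3)
```

On the real data, `nth_largest(followers_df, 'followers_count', 100)` would give the follower count of the 100th most-followed follower.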

2. get retweeters of target tweet

In [10]:
retweeters = []

for response in tweepy.Paginator(client.get_retweeters, 
                                 id=target_tweet_id,
                                 user_fields=USER_FIELDS,
                                 max_results=100, 
                                 limit=100):
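    try:
        # response[0] is the page's data; it is most likely None for an
        # empty page, which makes the loop below raise a TypeError
        for item in response[0]:
            retweeters.append(item.data)
            sleep(0.5) # crude throttle against rate limit errors (pauses once per user)
    except TypeError:
        print('there was a typeError for some reason... ')
        continue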

there was a typeError for some reason... 
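
The `TypeError` most likely comes from a page whose data is `None`: iterating over `None` fails. Below is a sketch of a loop that skips such pages explicitly and counts them, rather than catching the exception. `collect_user_dicts` is a hypothetical helper, and the fake response/user objects only mimic the shape of tweepy's `Response` namedtuple so that the sketch runs offline.

```python
from collections import namedtuple

# Stand-ins for tweepy's Response and User objects (assumption: same field names)
FakeResponse = namedtuple('FakeResponse', ['data', 'includes', 'errors', 'meta'])
FakeUser = namedtuple('FakeUser', ['data'])

def collect_user_dicts(pages):
    """Gather the raw dict of every user across pages, skipping empty pages."""
    users, skipped = [], 0
    for response in pages:
        if response.data is None:  # empty page: nothing to iterate over
            skipped += 1
            continue
        users.extend(item.data for item in response.data)
    return users, skipped

pages = [
    FakeResponse([FakeUser({'id': '1'}), FakeUser({'id': '2'})], {}, [], {}),
    FakeResponse(None, {}, [], {}),
    FakeResponse([FakeUser({'id': '3'})], {}, [], {}),
]
users, skipped = collect_user_dicts(pages)
```

In the notebook, `pages` would be the `tweepy.Paginator(...)` object. For rate limits, creating the client with `wait_on_rate_limit=True` is an option worth trying instead of manual `sleep` calls.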


In [13]:
len(retweeters)

5175

We have a total of 5175 retweeter user objects. This is significantly fewer than shown on the web frontend. The likeliest reason is that a good share of the retweeters of this particular tweet have their **profile set to private**, meaning that we can't pull their user info. The page skipped by the `TypeError` above may also account for a small part of the gap.

One thing we're interested in now, however, is how many of these retweeters actually follow our target user.
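
To put a number on that gap, the retrieved count can be compared with the retweet count the API reports for the tweet (for instance from the tweet's `public_metrics`, which `client.get_tweet` can return when `tweet_fields=['public_metrics']` is requested). Note also that `limit=100` pages of `max_results=100` caps the pull at 10,000 users. The sketch below is only the arithmetic; `coverage` is a hypothetical helper and the reported count used here is a placeholder, not the real figure.

```python
def coverage(n_retrieved, n_reported):
    """Share of reported retweets for which a user object was retrieved."""
    if n_reported <= 0:
        raise ValueError('n_reported must be positive')
    return n_retrieved / n_reported

# 5175 is the number retrieved above; 20000 is a placeholder, NOT the real count
share = coverage(5175, 20000)
```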

In [24]:
follower_ids = {y['id'] for y in followers}  # build the lookup set once
len([x['id'] for x in retweeters if x['id'] in follower_ids])

28

Wow! Only **28** of the accounts that retweeted our target user's tweet also follow them. That is about 0.5% of the 5175 retweeters we retrieved, and about 3.5% of the 790 followers: a strikingly low ratio.

Obviously, there can be myriad reasons why non-followers would retweet something, but you'd expect a higher share of followers to retweet this tweet once it appears at the top of their feeds, given how successful it has been.

The obvious conjecture is bots, or some other kind of organised actors. Something worth discussing.
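
For the discussion, it may help to report the overlap as shares as well as a raw count. Here is a minimal sketch, assuming both inputs are lists of dicts with an `id` key (as `followers` and `retweeters` are here); `follower_overlap` and the toy lists are hypothetical.

```python
def follower_overlap(retweeters, followers):
    """Count accounts present in both lists and express that as shares."""
    follower_ids = {u['id'] for u in followers}
    retweeter_ids = {u['id'] for u in retweeters}
    shared = retweeter_ids & follower_ids
    return {
        'n_shared': len(shared),
        'share_of_retweeters': len(shared) / len(retweeter_ids) if retweeter_ids else 0.0,
        'share_of_followers': len(shared) / len(follower_ids) if follower_ids else 0.0,
    }

# toy data for illustration only
toy_followers = [{'id': 'a'}, {'id': 'b'}, {'id': 'c'}, {'id': 'd'}]
toy_retweeters = [{'id': 'a'}, {'id': 'x'}, {'id': 'y'}, {'id': 'z'}, {'id': 'w'}]
result = follower_overlap(toy_retweeters, toy_followers)
```

Unlike the list comprehension above, this deduplicates ids on both sides before counting, so the two can differ if a user object appears twice.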

#### EXPORTS

In [14]:
retweeters_df = pd.DataFrame(retweeters)
retweeters_df = pd.concat([retweeters_df.drop(['public_metrics'], axis=1), retweeters_df['public_metrics'].apply(pd.Series)], axis=1)

In [25]:
followers_df.to_csv(EXPORT_PATH+'mocnknights_followers.csv', index=False)
retweeters_df.to_csv(EXPORT_PATH+'mocnknights_tweet_1604520962959020032_retweeters.csv', index=False)