## This notebook uses [Botometer](https://github.com/IUNetSci/botometer-python) to investigate a list of Twitter accounts

### Installs and imports

In [31]:
!pip install botometer requests tweepy

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [32]:
import botometer
import pandas as pd

### Declare API keys




In [30]:
rapidapi_key = "XXYYYZ"

twitter_app_auth = {
    'consumer_key': 'XXYZZ',
    'consumer_secret': 'XXYYZZ',
    'access_token': 'XXYYZZ',
    'access_token_secret': 'XXYYZZ',
}
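
Hard-coding credentials in a shared notebook risks leaking them. A minimal sketch of one alternative, assuming the keys were exported as environment variables beforehand. The variable names (`RAPIDAPI_KEY`, `TWITTER_CONSUMER_KEY`, and so on) and the helper `load_credentials` are hypothetical and not part of Botometer:

```python
import os

# Hypothetical environment variable names; adjust them to your own setup.
TWITTER_ENV_VARS = {
    'consumer_key': 'TWITTER_CONSUMER_KEY',
    'consumer_secret': 'TWITTER_CONSUMER_SECRET',
    'access_token': 'TWITTER_ACCESS_TOKEN',
    'access_token_secret': 'TWITTER_ACCESS_TOKEN_SECRET',
}

def load_credentials(env=None):
    """Return (rapidapi_key, twitter_app_auth) read from environment variables."""
    env = os.environ if env is None else env
    names = ['RAPIDAPI_KEY'] + list(TWITTER_ENV_VARS.values())
    # Fail early with a readable message if anything is missing or empty.
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    twitter_app_auth = {key: env[name] for key, name in TWITTER_ENV_VARS.items()}
    return env['RAPIDAPI_KEY'], twitter_app_auth
```

With the variables set, `rapidapi_key, twitter_app_auth = load_credentials()` would yield the same two objects that the cell above defines by hand.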

### Create a `Botometer` client

In [33]:
bom = botometer.Botometer(wait_on_ratelimit=True,
                          rapidapi_key=rapidapi_key,
                          **twitter_app_auth)

### Define functions to extract and print different scores

Extract specific scores

In [34]:
def extract_scores(result):
  """Return [CAP English, CAP Universal, raw English, raw Universal, display English, display Universal]."""
  return [result['cap']['english'],
          result['cap']['universal'],
          result['raw_scores']['english']['overall'],
          result['raw_scores']['universal']['overall'],
          result['display_scores']['english']['overall'],
          result['display_scores']['universal']['overall']]
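
To see what `extract_scores` returns without calling the API, it can be applied to a hand-built dictionary. The sketch below repeats the function so that it runs on its own. `sample_result` is an illustrative dictionary abridged from the response for one of the accounts in this notebook, keeping only the `overall` entries:

```python
def extract_scores(result):
    """Return the two CAP values followed by the overall raw and display scores."""
    return [result['cap']['english'],
            result['cap']['universal'],
            result['raw_scores']['english']['overall'],
            result['raw_scores']['universal']['overall'],
            result['display_scores']['english']['overall'],
            result['display_scores']['universal']['overall']]

# Abridged response: the per-bot-type sub-scores and the 'user' entry are left out.
sample_result = {
    'cap': {'english': 0.3912159239072283, 'universal': 0.37678996390568303},
    'raw_scores': {'english': {'overall': 0.07}, 'universal': {'overall': 0.06}},
    'display_scores': {'english': {'overall': 0.4}, 'universal': {'overall': 0.3}},
}

labels = ['cap_english', 'cap_universal',
          'raw_english', 'raw_universal',
          'display_english', 'display_universal']
scores = extract_scores(sample_result)
for label, value in zip(labels, scores):
    print(label, value)
```

The position of each value in the returned list is what later cells rely on, so the order of the six entries should not be changed without updating them.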

Print scores

In [35]:
def print_scores (result):
  print ("--- Cap:  Conditional probability that accounts with a score equal to or greater than this are automated (based on inferred language)---")
  print ("Cap in English:", result['cap']['english'])
  print ("Cap in Universal:", result['cap']['universal'])

  print ("--- Raw scores: bot score in the [0,1] range, both using English (all features) and Universal (language-independent) features ---")
  print ("Overall raw score in English:", result['raw_scores']['english']['overall'])
  print ("Overall raw score in Universal:", result['raw_scores']['universal']['overall'])

  print ("--- Display scores: same as raw scores, but in the [0,5] range ---")
  print ("Overall display score in English:", result['display_scores']['english']['overall'])
  print ("Overall display score in Universal:", result['display_scores']['universal']['overall'])

### Test on an account

**Meanings of the elements in the response**

* `user`: Twitter user object plus the language inferred from the majority of the account's tweets

* `raw_scores`: bot scores in the [0,1] range, computed with both English (all features) and Universal (language-independent) features; each case includes the overall score and a sub-score for each bot type (see below for the type names and definitions)

* `display_scores`: same as the raw scores, but in the [0,5] range

* `cap`: complete automation probability, i.e. the conditional probability that accounts with a score equal to or greater than this are automated; based on inferred language

**Meanings of the bot type scores**

* `fake_follower`: bots purchased to increase follower counts

* `self_declared`: bots from botwiki.org

* `astroturf`: manually labeled political bots and accounts involved in follow trains that systematically delete content

* `spammer`: accounts labeled as spambots from several datasets

* `financial`: bots that post using cashtags

* `other`: miscellaneous other bots obtained from manual annotation, user feedback, etc.
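
Each language block of `raw_scores` and `display_scores` holds one sub-score per bot type alongside `overall`. A minimal sketch of finding the highest-scoring bot type. `top_bot_type` is a hypothetical helper, and `sample_raw_english` copies the English raw sub-scores from one response in this notebook:

```python
def top_bot_type(sub_scores):
    """Return (bot_type, score) for the highest sub-score, ignoring 'overall'."""
    by_type = {name: score for name, score in sub_scores.items() if name != 'overall'}
    # On ties, max() keeps the first bot type in dictionary order.
    best = max(by_type, key=by_type.get)
    return best, by_type[best]

sample_raw_english = {
    'astroturf': 0.26, 'fake_follower': 0.06, 'financial': 0.02,
    'other': 0.28, 'overall': 0.07, 'self_declared': 0.02, 'spammer': 0.0,
}

best_type, best_score = top_bot_type(sample_raw_english)
print(best_type, best_score)
```

For a real response, the same helper could be called as `top_bot_type(result['raw_scores']['english'])`.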

In [36]:
result = bom.check_account('narendramodi')
#print (result)
print (extract_scores (result))
print_scores(result)

[0.793087009461318, 0.8053787930995948, 0.38, 0.5, 1.9, 2.5]
--- Cap:  Conditional probability that accounts with a score equal to or greater than this are automated (based on inferred language)---
Cap in English: 0.793087009461318
Cap in Universal: 0.8053787930995948
--- Raw scores: bot score in the [0,1] range, both using English (all features) and Universal (language-independent) features ---
Overall raw score in English: 0.38
Overall raw score in Universal: 0.5
--- Display scores: same as raw scores, but in the [0,5] range ---
Overall display score in English: 1.9
Overall display score in Universal: 2.5


### Load the dataset of Tweet replies

Mount Google Drive

In [37]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


Load the dataframe

In [38]:
#df = pd.read_csv ('/content/drive/MyDrive/replies-alviinaalametsa-annotated.csv')
df = pd.read_csv ('/content/drive/MyDrive/replies-larrouturou-annotated.csv')

Inspect the columns of the pandas dataframe (uncomment `df.shape` to check its shape instead)

In [39]:
#df.shape
df.columns

Index(['id', 'conversation_id', 'created_at', 'date', 'timezone', 'place',
       'tweet', 'language', 'hashtags', 'cashtags', 'user_id', 'user_id_str',
       'username', 'name', 'day', 'hour', 'link', 'urls', 'photos', 'video',
       'thumbnail', 'retweet', 'nlikes', 'nreplies', 'nretweets', 'quote_url',
       'search', 'near', 'geo', 'source', 'user_rt_id', 'user_rt',
       'retweet_id', 'reply_to', 'retweet_date', 'translate', 'trans_src',
       'trans_dest', 'sentiment', 'annotator', 'annotation_id', 'updated_at',
       'lead_time'],
      dtype='object')

In [40]:
df.head(5)

Unnamed: 0,id,conversation_id,created_at,date,timezone,place,tweet,language,hashtags,cashtags,...,reply_to,retweet_date,translate,trans_src,trans_dest,sentiment,annotator,annotation_id,updated_at,lead_time
0,31706,1541498855061164034,2022-10-09T19:11:18.210103Z,2022-07-01 10:56:03,200,,@larrouturou @zoo_bear @AltNews @PMOIndia @Ind...,en,[],[],...,"[{'screen_name': 'larrouturou', 'name': 'Pierr...",,,,,Favour,2,281,2022-10-09T19:11:18.210132Z,25.844
1,31705,1541498855061164034,2022-10-09T19:10:49.199776Z,2022-06-27 21:10:48,200,,@larrouturou @zoo_bear @AltNews @PMOIndia @Ind...,qam,[],[],...,"[{'screen_name': 'larrouturou', 'name': 'Pierr...",,,,,unrecoganisable,2,280,2022-10-09T19:10:49.199805Z,8.488
2,31704,1541498855061164034,2022-10-09T19:10:39.108231Z,2022-06-27 21:42:24,200,,@larrouturou @zoo_bear @AltNews @PMOIndia @Ind...,qme,"['istandwithzubair', 'releasezubair']",[],...,"[{'screen_name': 'larrouturou', 'name': 'Pierr...",,,,,Favour,2,279,2022-10-09T19:10:39.108261Z,6.661
3,31703,1541498855061164034,2022-10-09T19:10:29.229150Z,2022-06-27 21:50:02,200,,@larrouturou @zoo_bear @AltNews @PMOIndia @Ind...,en,"['bjp', 'istandwithzubair', 'standwithteestase...",[],...,"[{'screen_name': 'larrouturou', 'name': 'Pierr...",,,,,Favour,2,278,2022-10-09T19:10:29.229179Z,12.636
4,31702,1541498855061164034,2022-10-09T19:10:14.808636Z,2022-06-27 21:50:41,200,,@larrouturou @zoo_bear @AltNews @PMOIndia @Ind...,en,[],[],...,"[{'screen_name': 'larrouturou', 'name': 'Pierr...",,,,,Questioning stature of the speaker (not widely...,2,277,2022-10-09T19:10:14.808675Z,9.15


### Extract all usernames

In [41]:
username_list = df ['username'].tolist()
print (len(username_list))

171


Print the list of usernames

In [23]:
print (username_list)

['Nmenon777', 'AteequeKLD', 'CoolHugs_', 'ICIMonline', 'mudde_rama', 'MuhammadXunaid0', 'shahruk92461711', 'abdullahsir36', 'Kaavi2021', 'CHILLPI25037592', 'ProudIndian_253', 'Hashimalhanfi', 'KMortha', 'ProudIndian_253', 'chandrats06', 'lemonchusleee', 'zubair7409', 'Md35556134', 'Sreedha24324771', 'Iamraje78412219', 'Shaikhm04230983', 'cricketspr', 'yaji63', 'mrImteyaz8', 'SkSuhael', 'ehram123', '_irrationality0', 'tradingfutures1', 'SankarsanBarik4', 'TarunKGupta2', 'canarayan1', 'KiranKumarGK1', 'Avijit3k2', 'ramanvk819', 'Pj_Keshriyagrp', 'JaiPoori', 'aamehta1980', 'Subash37006856', 'underpaidsevak', 'Manishis050', 'NikhilRaod', 'asifmohd63', 'ra54077891', 'rajarshi_sg', 'MaskMan17122213', 'Notsoyoung5', 'Sid95879', 'anbokshi', 'dryahyakazi', 'shanthala_kumar', 'Shweta_India', 'Srikarch11', 'dinakar24', 'Namami_Ganga', 'Lisali89880277', 'funnyface234', 'wind_duo', 'nayakSSekhar', 'tuubol', 'Sanju40382206', 'Udays73', 'Lakshmoji', 'ReddyR2D2', 'ranveer2252', 'nimminavya', 'rajesh10
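
The same username can appear more than once in the list (for example, `ProudIndian_253` above), since the list holds one entry per reply. A sketch, assuming one wants to avoid repeated API calls: query each distinct username once and map the results back onto every row. `unique_in_order` and `expand_to_rows` are hypothetical helper names, and the scores below are placeholders rather than Botometer output:

```python
def unique_in_order(usernames):
    """Drop duplicate usernames while keeping first-seen order."""
    return list(dict.fromkeys(usernames))

def expand_to_rows(usernames, score_by_username, default=None):
    """Return one score per original row by looking each username up."""
    return [score_by_username.get(name, default) for name in usernames]

usernames = ['KMortha', 'ProudIndian_253', 'Hashimalhanfi', 'ProudIndian_253']
to_query = unique_in_order(usernames)

# Placeholder scores standing in for real Botometer results.
fake_scores = {name: index / 10 for index, name in enumerate(to_query)}

per_row = expand_to_rows(usernames, fake_scores)
print(to_query)
print(per_row)
```

Because `per_row` has the same length and order as the original list, it can be assigned to a dataframe column in the same way as the score lists built later in this notebook.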

### Run Botometer over the list of accounts

Iterate over the list of accounts and print the raw result for each one

In [24]:
for account_name, result in bom.check_accounts_in (username_list):
  print ("=======================", account_name, "=======================")
  print (result)

{'cap': {'english': 0.3912159239072283, 'universal': 0.37678996390568303}, 'display_scores': {'english': {'astroturf': 1.3, 'fake_follower': 0.3, 'financial': 0.1, 'other': 1.4, 'overall': 0.4, 'self_declared': 0.1, 'spammer': 0.0}, 'universal': {'astroturf': 1.8, 'fake_follower': 0.4, 'financial': 0.1, 'other': 1.2, 'overall': 0.3, 'self_declared': 0.0, 'spammer': 0.0}}, 'raw_scores': {'english': {'astroturf': 0.26, 'fake_follower': 0.06, 'financial': 0.02, 'other': 0.28, 'overall': 0.07, 'self_declared': 0.02, 'spammer': 0.0}, 'universal': {'astroturf': 0.35, 'fake_follower': 0.08, 'financial': 0.02, 'other': 0.24, 'overall': 0.06, 'self_declared': 0.0, 'spammer': 0.01}}, 'user': {'majority_lang': 'en', 'user_data': {'id_str': '799998182209961984', 'screen_name': 'Nmenon777'}}}
{'cap': {'english': 0.8395863545645614, 'universal': 0.8054092120440788}, 'display_scores': {'english': {'astroturf': 0.4, 'fake_follower': 1.8, 'financial': 1.6, 'other': 4.3, 'overall': 4.3, 'self_declared':

### Extract specific Botometer scores for the list of usernames

Function to return specific output scores as lists

In [42]:
def forward (username_list):

  cap_english = []
  cap_universal = []

  overall_raw_score_eng = []
  overall_raw_score_universal = []

  overall_display_score_eng = []
  overall_display_score_universal = []

  for account_name, result in bom.check_accounts_in (username_list):

    # Accounts skipped for this dataset (hard-coded). Their scores are filled with 0 as a placeholder,
    # so a 0 in the output does not distinguish a skipped account from a genuine score of 0.
    #if (account_name=='spandakarika108' or account_name=='Prasenjit97m' or account_name=='bhattketan1468' or account_name=='Hilale_pakistan'):
    if account_name in ('mrImteyaz8', 'Notsoyoung5', 'dinakar24', 'Prasenjit97m', 'jainarahari108', 'maverick9762', 'TheWhiteWaIker', 'frankblunt2021'):
      cap_english.append (0)
      cap_universal.append (0)
      overall_raw_score_eng.append (0)
      overall_raw_score_universal.append (0)
      overall_display_score_eng.append (0)
      overall_display_score_universal.append (0)

    else:
    
      print ("=======================", account_name, "=======================")
    
      print_scores (result)
    
      scores = extract_scores (result)

      cap_english.append (scores[0])
      cap_universal.append (scores[1])

      overall_raw_score_eng.append (scores[2])
      overall_raw_score_universal.append (scores[3])

      overall_display_score_eng.append (scores[4])
      overall_display_score_universal.append (scores[5])

  return (cap_english, 
          cap_universal, 
          overall_raw_score_eng,
          overall_raw_score_universal,
          overall_display_score_eng,
          overall_display_score_universal)
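
A hard-coded list of skipped accounts has to be rebuilt for every dataset. A sketch of an alternative, under the assumption (worth verifying against the installed `botometer` version) that `check_accounts_in` yields a dictionary containing an `'error'` key for accounts it cannot score. `collect_scores` is a hypothetical helper; it records `None` instead of 0 so that missing scores stay distinguishable from real ones:

```python
def collect_scores(results):
    """Build per-score lists from an iterable of (account_name, result) pairs."""
    columns = {
        'cap_english': [], 'cap_universal': [],
        'raw_english': [], 'raw_universal': [],
        'display_english': [], 'display_universal': [],
    }
    failed = []
    for account_name, result in results:
        if 'error' in result:  # assumed error format, see the note above
            failed.append(account_name)
            values = [None] * len(columns)
        else:
            values = [result['cap']['english'],
                      result['cap']['universal'],
                      result['raw_scores']['english']['overall'],
                      result['raw_scores']['universal']['overall'],
                      result['display_scores']['english']['overall'],
                      result['display_scores']['universal']['overall']]
        # Dictionaries keep insertion order, so values line up with the columns.
        for column, value in zip(columns.values(), values):
            column.append(value)
    return columns, failed

# Stand-in results; a real call would pass bom.check_accounts_in(username_list).
demo_ok = {
    'cap': {'english': 0.8, 'universal': 0.7},
    'raw_scores': {'english': {'overall': 0.38}, 'universal': {'overall': 0.5}},
    'display_scores': {'english': {'overall': 1.9}, 'universal': {'overall': 2.5}},
}
demo_columns, demo_failed = collect_scores([('account_a', demo_ok),
                                            ('account_b', {'error': 'placeholder message'})])
print(demo_failed)
print(demo_columns['cap_english'])
```

Every list in `demo_columns` has one entry per input account, so the row alignment with the dataframe is preserved even when some accounts fail.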

Run the forward function

In [43]:
(cap_english_list, 
 cap_universal_list, 
 overall_raw_score_eng_list, 
 overall_raw_score_universal_list, 
 overall_display_score_eng_list, 
 overall_display_score_universal_list) = forward (username_list)

--- Cap:  Conditional probability that accounts with a score equal to or greater than this are automated (based on inferred language)---
Cap in English: 0.4479465575931997
Cap in Universal: 0.3469210182245441
--- Raw scores: bot score in the [0,1] range, both using English (all features) and Universal (language-independent) features ---
Overall raw score in English: 0.09
Overall raw score in Universal: 0.05
--- Display scores: same as raw scores, but in the [0,5] range ---
Overall display score in English: 0.4
Overall display score in Universal: 0.2
--- Cap:  Conditional probability that accounts with a score equal to or greater than this are automated (based on inferred language)---
Cap in English: 0.8395863545645614
Cap in Universal: 0.8054092120440788
--- Raw scores: bot score in the [0,1] range, both using English (all features) and Universal (language-independent) features ---
Overall raw score in English: 0.86
Overall raw score in Universal: 0.66
--- Display scores: same as raw s

### Append the Botometer scores and save the dataframe

Append the Botometer scores to the original pandas dataframe

In [44]:
df ['botometerCapEng.score'] = cap_english_list
df ['botometerCapUni.score'] = cap_universal_list
df ['botometerOverallRawEng.score'] = overall_raw_score_eng_list
df ['botometerOverallRawUni.score'] = overall_raw_score_universal_list
df ['botometerDisplayEng.score'] = overall_display_score_eng_list
df ['botometerDisplayUni.score'] = overall_display_score_universal_list

Save the dataframe with appended Botometer scores as CSV

In [45]:
#df.to_csv('/content/drive/MyDrive/replies-alviinaalametsa-annotated-botometer.csv', index=False)
df.to_csv('/content/drive/MyDrive/replies-larrouturou-annotated-botometer.csv', index=False)