## 26250 - ST115 Project Code

Introduction: This project aims to find out whether LSE departments should use Twitter as a means to communicate with LSE students. The way we can explore the statement above is to carry out an analysis similar to social media analytics. A couple of factors that we can look at are the follower counts, total likes, retweets, quotes, mention sentiments, and engagement ratios.

First, we will call the list of departments at the LSE so that we know which accounts to look for on Twitter.

In [272]:
from bs4 import BeautifulSoup
import requests
import json

In [273]:
url = "https://info.lse.ac.uk/staff/departments-and-institutes"
response = requests.get(url)
soup = BeautifulSoup(response.content, "lxml")

In [274]:
raw_list = soup.find_all('ul')

In [275]:
dept = []

for tag in raw_list:
    list_el = tag.text.strip()
    dept.append(list_el)

In [276]:
depts = dept[2:]
depts

['Department of Accounting\nDepartment of Anthropology\nData Science Institute\nDepartment of Economics\nDepartment of Economic History\nEuropean Institute\nDepartment of Finance\nFiroz Lalji Institute for Africa\nDepartment of Gender Studies\nDepartment of Geography and Environment\nDepartment of Government\nDepartment of Health Policy\nDepartment of International Development\nDepartment of International History\nInternational Inequalities Institute\nDepartment of International Relations\nLanguage Centre\nLSE Law School\nDepartment of Management\nMarshall Institute\nDepartment of Mathematics\nDepartment of Media and Communications\nDepartment of Methodology\nDepartment of Philosophy, Logic and Scientific Method\nDepartment of Psychological and Behavioural Science\nSchool of Public Policy (formerly Institute of Public Affairs)\nDepartment of Social Policy\nDepartment of Sociology\nDepartment of Statistics']

In [277]:
dept_list = depts[0].split("\n")
dept_list

['Department of Accounting',
 'Department of Anthropology',
 'Data Science Institute',
 'Department of Economics',
 'Department of Economic History',
 'European Institute',
 'Department of Finance',
 'Firoz Lalji Institute for Africa',
 'Department of Gender Studies',
 'Department of Geography and Environment',
 'Department of Government',
 'Department of Health Policy',
 'Department of International Development',
 'Department of International History',
 'International Inequalities Institute',
 'Department of International Relations',
 'Language Centre',
 'LSE Law School',
 'Department of Management',
 'Marshall Institute',
 'Department of Mathematics',
 'Department of Media and Communications',
 'Department of Methodology',
 'Department of Philosophy, Logic and Scientific Method',
 'Department of Psychological and Behavioural Science',
 'School of Public Policy (formerly Institute of Public Affairs)',
 'Department of Social Policy',
 'Department of Sociology',
 'Department of Statisti

In [278]:
#Remove all elements in the list with the word Institute as we are looking only at Departments
indices = [dept_list.index("Data Science Institute"), dept_list.index("European Institute"), 
           dept_list.index("Firoz Lalji Institute for Africa"), dept_list.index("International Inequalities Institute"),
           dept_list.index("Marshall Institute"), 
           dept_list.index("School of Public Policy (formerly Institute of Public Affairs)")]

In [279]:
dept_list = [i for j, i in enumerate(dept_list) if j not in indices]
dept_list

['Department of Accounting',
 'Department of Anthropology',
 'Department of Economics',
 'Department of Economic History',
 'Department of Finance',
 'Department of Gender Studies',
 'Department of Geography and Environment',
 'Department of Government',
 'Department of Health Policy',
 'Department of International Development',
 'Department of International History',
 'Department of International Relations',
 'Language Centre',
 'LSE Law School',
 'Department of Management',
 'Department of Mathematics',
 'Department of Media and Communications',
 'Department of Methodology',
 'Department of Philosophy, Logic and Scientific Method',
 'Department of Psychological and Behavioural Science',
 'Department of Social Policy',
 'Department of Sociology',
 'Department of Statistics']

Based on the department list above, I've collected the list of Twitter usernames for each department manually because there isn't a compiled list of the department's Twitter handles.

In [280]:
usernames = ['LSE_Accounting', 'LSEAnthropology', 'LSEEcon',
            'LSEEcHist', 'LSEfinance', 'LSEGenderTweet', 'LSEGeography',
            'LSEGovernment', 'LSEHealthPolicy', 'LSE_ID', 'lsehistory',
            'LSEIRDept', 'lselangcentre', 'LSELaw', 'LSEManagement', 'LSEMaths',
            'MediaLSE', 'MethodologyLSE', 'LSEPhilosophy', 'LSEBehavioural', 
            'LSESocialPolicy', 'LSEsociology', 'LSEStatistics']

In [316]:
import pandas as pd

dept_usernames = pd.DataFrame({
    'Department': dept_list,
    'Twitter username': usernames
})

dept_usernames['Department'] = dept_usernames['Department'].str.replace('Department of ', '')
dept_usernames

Unnamed: 0,Department,Twitter username
0,Accounting,LSE_Accounting
1,Anthropology,LSEAnthropology
2,Economics,LSEEcon
3,Economic History,LSEEcHist
4,Finance,LSEfinance
5,Gender Studies,LSEGenderTweet
6,Geography and Environment,LSEGeography
7,Government,LSEGovernment
8,Health Policy,LSEHealthPolicy
9,International Development,LSE_ID


Next, we start identifying what kind of tweets we would like to look at for each department. We want to find out which department has better engagement overall and see what factors contribute to its success. The first thing we will look at is each department's follower counts. The next step is to find out the overall engagement using the metrics of likes, retweets, replies, and quotes. In the next step, we will look at comments (which may be limited). All the data analysis will only go back to the last 100 tweets posted by each account because this is the limit that Twitter allows.

Because of the restrictions of the Twitter developer account, the engagement counts are public metrics. If there were no restrictions, I would add more metrics like impression count, total views, and profile views.

In [282]:
with open('keys.json') as f:
    keys = json.load(f)

bearer_token = keys['twitter']['bearer_token']
headers = {
    'Authorization': f"Bearer {bearer_token}"
}

In [283]:
r = requests.get('https://api.twitter.com/2/users/by?usernames=LSE_Accounting,LSEAnthropology,LSEEcon,LSEEcHist,LSEfinance,LSEGenderTweet,LSEGeography,LSEGovernment,LSEHealthPolicy,LSE_ID,lsehistory,LSEIRDept,lselangcentre,LSELaw,LSEManagement,LSEMaths,MediaLSE,MethodologyLSE,LSEPhilosophy,LSEBehavioural,LSESocialPolicy,LSEsociology,LSEStatistics&user.fields=public_metrics', headers=headers)
r.text

'{"data":[{"id":"4900666161","name":"LSE Accounting","public_metrics":{"followers_count":2513,"following_count":119,"tweet_count":494,"listed_count":25},"username":"LSE_Accounting"},{"id":"850888387","name":"LSE Anthropology","public_metrics":{"followers_count":6625,"following_count":93,"tweet_count":896,"listed_count":0},"username":"LSEAnthropology"},{"id":"1200727465","name":"LSE Department of Economics","public_metrics":{"followers_count":35620,"following_count":636,"tweet_count":10179,"listed_count":567},"username":"LSEEcon"},{"id":"224639696","name":"LSE Economic History","public_metrics":{"followers_count":3873,"following_count":304,"tweet_count":1742,"listed_count":99},"username":"LSEEcHist"},{"id":"972257048","name":"LSE Finance","public_metrics":{"followers_count":2477,"following_count":179,"tweet_count":605,"listed_count":51},"username":"LSEfinance"},{"id":"189090262","name":"LSE Gender","public_metrics":{"followers_count":19648,"following_count":2622,"tweet_count":7209,"list

In [284]:
ids = json.loads(r.text)['data']

dept_ids = []
dept_followers = []

for i in range(0, len(ids)):
    idno = ids[i]['id']
    followers = ids[i]['public_metrics']['followers_count']
    dept_ids.append(idno)
    dept_followers.append(followers)

dept_usernames['Twitter ID'] = dept_ids
dept_usernames['Follower count'] = dept_followers

In [285]:
dept_usernames

Unnamed: 0,Department,Twitter username,Twitter ID,Follower count
0,Accounting,LSE_Accounting,4900666161,2513
1,Anthropology,LSEAnthropology,850888387,6625
2,Economics,LSEEcon,1200727465,35620
3,Economic History,LSEEcHist,224639696,3873
4,Finance,LSEfinance,972257048,2477
5,Gender Studies,LSEGenderTweet,189090262,19648
6,Geography and Environment,LSEGeography,240262055,12550
7,Government,LSEGovernment,303823238,24724
8,Health Policy,LSEHealthPolicy,472009727,7631
9,International Development,LSE_ID,317018025,12126


To start, I will extract the number of engagements for the last 100 tweets of every LSE department on Twitter.

In [286]:
#Accounting department
r = requests.get('https://api.twitter.com/2/users/4900666161/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
acc = json.loads(r.text)

acc_dept_tweets = acc['data']
acc_tweetid = []; acc_retweet = []; acc_reply = []; acc_like = []; acc_quote = []; dept = []

for i in range(0, len(acc_dept_tweets)):
    dept.append('Accounting')
    acc_tweetid.append(acc_dept_tweets[i]['id'])
    acc_retweet.append(acc_dept_tweets[i]['public_metrics']['retweet_count'])
    acc_reply.append(acc_dept_tweets[i]['public_metrics']['reply_count'])
    acc_like.append(acc_dept_tweets[i]['public_metrics']['like_count'])
    acc_quote.append(acc_dept_tweets[i]['public_metrics']['quote_count'])
    
acc_twitter = pd.DataFrame({
    'Twitter Id': acc_tweetid,
    'Department': dept,
    'Retweet count': acc_retweet,
    'Reply count': acc_reply,
    'Like count': acc_like,
    'Quote count': acc_quote
})

In [287]:
#Anthropology department
r = requests.get('https://api.twitter.com/2/users/850888387/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
anth = json.loads(r.text)

anth_dept_tweets = anth['data']
anth_tweetid = []; anth_retweet = []; anth_reply = []; anth_like = []; anth_quote = []; dept = []

for i in range(0, len(anth_dept_tweets)):
    dept.append('Anthropology')
    anth_tweetid.append(anth_dept_tweets[i]['id'])
    anth_retweet.append(anth_dept_tweets[i]['public_metrics']['retweet_count'])
    anth_reply.append(anth_dept_tweets[i]['public_metrics']['reply_count'])
    anth_like.append(anth_dept_tweets[i]['public_metrics']['like_count'])
    anth_quote.append(anth_dept_tweets[i]['public_metrics']['quote_count'])
    
anth_twitter = pd.DataFrame({
    'Twitter Id': anth_tweetid,
    'Department': dept,
    'Retweet count': anth_retweet,
    'Reply count': anth_reply,
    'Like count': anth_like,
    'Quote count': anth_quote
})

In [288]:
#Economics department
r = requests.get('https://api.twitter.com/2/users/1200727465/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
econ = json.loads(r.text)

econ_dept_tweets = econ['data']
econ_tweetid = []; econ_retweet = []; econ_reply = []; econ_like = []; econ_quote = []; dept = []

for i in range(0, len(econ_dept_tweets)):
    dept.append('Economics')
    econ_tweetid.append(econ_dept_tweets[i]['id'])
    econ_retweet.append(econ_dept_tweets[i]['public_metrics']['retweet_count'])
    econ_reply.append(econ_dept_tweets[i]['public_metrics']['reply_count'])
    econ_like.append(econ_dept_tweets[i]['public_metrics']['like_count'])
    econ_quote.append(econ_dept_tweets[i]['public_metrics']['quote_count'])
    
econ_twitter = pd.DataFrame({
    'Twitter Id': econ_tweetid,
    'Department': dept,
    'Retweet count': econ_retweet,
    'Reply count': econ_reply,
    'Like count': econ_like,
    'Quote count': econ_quote
})

In [289]:
#Economics history department
r = requests.get('https://api.twitter.com/2/users/224639696/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
econhist = json.loads(r.text)

econhist_dept_tweets = econhist['data']
econhist_tweetid = []; econhist_retweet = []; econhist_reply = []; econhist_like = []; econhist_quote = []; dept = []

for i in range(0, len(econhist_dept_tweets)):
    dept.append('Economics History')
    econhist_tweetid.append(econhist_dept_tweets[i]['id'])
    econhist_retweet.append(econhist_dept_tweets[i]['public_metrics']['retweet_count'])
    econhist_reply.append(econhist_dept_tweets[i]['public_metrics']['reply_count'])
    econhist_like.append(econhist_dept_tweets[i]['public_metrics']['like_count'])
    econhist_quote.append(econhist_dept_tweets[i]['public_metrics']['quote_count'])
    
econhist_twitter = pd.DataFrame({
    'Twitter Id': econhist_tweetid,
    'Department': dept,
    'Retweet count': econhist_retweet,
    'Reply count': econhist_reply,
    'Like count': econhist_like,
    'Quote count': econhist_quote
})

In [290]:
#Finance department
r = requests.get('https://api.twitter.com/2/users/972257048/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
finance = json.loads(r.text)

finance_dept_tweets = finance['data']
finance_tweetid = []; finance_retweet = []; finance_reply = []; finance_like = []; finance_quote = []; dept = []

for i in range(0, len(finance_dept_tweets)):
    dept.append('Finance')
    finance_tweetid.append(finance_dept_tweets[i]['id'])
    finance_retweet.append(finance_dept_tweets[i]['public_metrics']['retweet_count'])
    finance_reply.append(finance_dept_tweets[i]['public_metrics']['reply_count'])
    finance_like.append(finance_dept_tweets[i]['public_metrics']['like_count'])
    finance_quote.append(finance_dept_tweets[i]['public_metrics']['quote_count'])
    
finance_twitter = pd.DataFrame({
    'Twitter Id': finance_tweetid,
    'Department': dept,
    'Retweet count': finance_retweet,
    'Reply count': finance_reply,
    'Like count': finance_like,
    'Quote count': finance_quote
})

In [291]:
#Gender studies department
r = requests.get('https://api.twitter.com/2/users/189090262/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
gender = json.loads(r.text)

gender_dept_tweets = gender['data']
gender_tweetid = []; gender_retweet = []; gender_reply = []; gender_like = []; gender_quote = []; dept = []

for i in range(0, len(gender_dept_tweets)):
    dept.append('Gender Studies')
    gender_tweetid.append(gender_dept_tweets[i]['id'])
    gender_retweet.append(gender_dept_tweets[i]['public_metrics']['retweet_count'])
    gender_reply.append(gender_dept_tweets[i]['public_metrics']['reply_count'])
    gender_like.append(gender_dept_tweets[i]['public_metrics']['like_count'])
    gender_quote.append(gender_dept_tweets[i]['public_metrics']['quote_count'])
    
gender_twitter = pd.DataFrame({
    'Twitter Id': gender_tweetid,
    'Department': dept,
    'Retweet count': gender_retweet,
    'Reply count': gender_reply,
    'Like count': gender_like,
    'Quote count': gender_quote
})

In [292]:
#Geography department
r = requests.get('https://api.twitter.com/2/users/240262055/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
geo = json.loads(r.text)

geo_dept_tweets = geo['data']
geo_tweetid = []; geo_retweet = []; geo_reply = []; geo_like = []; geo_quote = []; dept = []

for i in range(0, len(geo_dept_tweets)):
    dept.append('Geography and Environment')
    geo_tweetid.append(geo_dept_tweets[i]['id'])
    geo_retweet.append(geo_dept_tweets[i]['public_metrics']['retweet_count'])
    geo_reply.append(geo_dept_tweets[i]['public_metrics']['reply_count'])
    geo_like.append(geo_dept_tweets[i]['public_metrics']['like_count'])
    geo_quote.append(geo_dept_tweets[i]['public_metrics']['quote_count'])
    
geo_twitter = pd.DataFrame({
    'Twitter Id': geo_tweetid,
    'Department': dept,
    'Retweet count': geo_retweet,
    'Reply count': geo_reply,
    'Like count': geo_like,
    'Quote count': geo_quote
})

In [293]:
#Government department
r = requests.get('https://api.twitter.com/2/users/303823238/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
gov = json.loads(r.text)

gov_dept_tweets = gov['data']
gov_tweetid = []; gov_retweet = []; gov_reply = []; gov_like = []; gov_quote = []; dept = []

for i in range(0, len(gov_dept_tweets)):
    dept.append('Government')
    gov_tweetid.append(gov_dept_tweets[i]['id'])
    gov_retweet.append(gov_dept_tweets[i]['public_metrics']['retweet_count'])
    gov_reply.append(gov_dept_tweets[i]['public_metrics']['reply_count'])
    gov_like.append(gov_dept_tweets[i]['public_metrics']['like_count'])
    gov_quote.append(gov_dept_tweets[i]['public_metrics']['quote_count'])
    
gov_twitter = pd.DataFrame({
    'Twitter Id': gov_tweetid,
    'Department': dept,
    'Retweet count': gov_retweet,
    'Reply count': gov_reply,
    'Like count': gov_like,
    'Quote count': gov_quote
})

In [294]:
#Health Policy Department
r = requests.get('https://api.twitter.com/2/users/472009727/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
hp = json.loads(r.text)

hp_dept_tweets = hp['data']
hp_tweetid = []; hp_retweet = []; hp_reply = []; hp_like = []; hp_quote = []; dept = []

for i in range(0, len(hp_dept_tweets)):
    dept.append('Health Policy')
    hp_tweetid.append(hp_dept_tweets[i]['id'])
    hp_retweet.append(hp_dept_tweets[i]['public_metrics']['retweet_count'])
    hp_reply.append(hp_dept_tweets[i]['public_metrics']['reply_count'])
    hp_like.append(hp_dept_tweets[i]['public_metrics']['like_count'])
    hp_quote.append(hp_dept_tweets[i]['public_metrics']['quote_count'])
    
hp_twitter = pd.DataFrame({
    'Twitter Id': hp_tweetid,
    'Department': dept,
    'Retweet count': hp_retweet,
    'Reply count': hp_reply,
    'Like count': hp_like,
    'Quote count': hp_quote
})

In [295]:
#International Development Department
r = requests.get('https://api.twitter.com/2/users/317018025/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
id = json.loads(r.text)

id_dept_tweets = id['data']
id_tweetid = []; id_retweet = []; id_reply = []; id_like = []; id_quote = []; dept = []

for i in range(0, len(id_dept_tweets)):
    dept.append('International Development')
    id_tweetid.append(id_dept_tweets[i]['id'])
    id_retweet.append(id_dept_tweets[i]['public_metrics']['retweet_count'])
    id_reply.append(id_dept_tweets[i]['public_metrics']['reply_count'])
    id_like.append(id_dept_tweets[i]['public_metrics']['like_count'])
    id_quote.append(id_dept_tweets[i]['public_metrics']['quote_count'])
    
id_twitter = pd.DataFrame({
    'Twitter Id': id_tweetid,
    'Department': dept,
    'Retweet count': id_retweet,
    'Reply count': id_reply,
    'Like count': id_like,
    'Quote count': id_quote
})

In [296]:
#International History Department
r = requests.get('https://api.twitter.com/2/users/253471591/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
ih = json.loads(r.text)

ih_dept_tweets = ih['data']
ih_tweetid = []; ih_retweet = []; ih_reply = []; ih_like = []; ih_quote = []; dept = []

for i in range(0, len(ih_dept_tweets)):
    dept.append('International History')
    ih_tweetid.append(ih_dept_tweets[i]['id'])
    ih_retweet.append(ih_dept_tweets[i]['public_metrics']['retweet_count'])
    ih_reply.append(ih_dept_tweets[i]['public_metrics']['reply_count'])
    ih_like.append(ih_dept_tweets[i]['public_metrics']['like_count'])
    ih_quote.append(ih_dept_tweets[i]['public_metrics']['quote_count'])
    
ih_twitter = pd.DataFrame({
    'Twitter Id': ih_tweetid,
    'Department': dept,
    'Retweet count': ih_retweet,
    'Reply count': ih_reply,
    'Like count': ih_like,
    'Quote count': ih_quote
})

In [297]:
#International Relations Department
r = requests.get('https://api.twitter.com/2/users/237225532/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
ir = json.loads(r.text)

ir_dept_tweets = ir['data']
ir_tweetid = []; ir_retweet = []; ir_reply = []; ir_like = []; ir_quote = []; dept = []

for i in range(0, len(ir_dept_tweets)):
    dept.append('International Relations')
    ir_tweetid.append(ir_dept_tweets[i]['id'])
    ir_retweet.append(ir_dept_tweets[i]['public_metrics']['retweet_count'])
    ir_reply.append(ir_dept_tweets[i]['public_metrics']['reply_count'])
    ir_like.append(ir_dept_tweets[i]['public_metrics']['like_count'])
    ir_quote.append(ir_dept_tweets[i]['public_metrics']['quote_count'])
    
ir_twitter = pd.DataFrame({
    'Twitter Id': ir_tweetid,
    'Department': dept,
    'Retweet count': ir_retweet,
    'Reply count': ir_reply,
    'Like count': ir_like,
    'Quote count': ir_quote
})

In [298]:
#Language Centre Department
r = requests.get('https://api.twitter.com/2/users/179888345/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
lang = json.loads(r.text)

lang_dept_tweets = lang['data']
lang_tweetid = []; lang_retweet = []; lang_reply = []; lang_like = []; lang_quote = []; dept = []

for i in range(0, len(lang_dept_tweets)):
    dept.append('Language Centre')
    lang_tweetid.append(lang_dept_tweets[i]['id'])
    lang_retweet.append(lang_dept_tweets[i]['public_metrics']['retweet_count'])
    lang_reply.append(lang_dept_tweets[i]['public_metrics']['reply_count'])
    lang_like.append(lang_dept_tweets[i]['public_metrics']['like_count'])
    lang_quote.append(lang_dept_tweets[i]['public_metrics']['quote_count'])
    
lang_twitter = pd.DataFrame({
    'Twitter Id': lang_tweetid,
    'Department': dept,
    'Retweet count': lang_retweet,
    'Reply count': lang_reply,
    'Like count': lang_like,
    'Quote count': lang_quote
})

In [299]:
#Law Department
r = requests.get('https://api.twitter.com/2/users/532172035/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
law = json.loads(r.text)

law_dept_tweets = law['data']
law_tweetid = []; law_retweet = []; law_reply = []; law_like = []; law_quote = []; dept = []

for i in range(0, len(law_dept_tweets)):
    dept.append('LSE Law School')
    law_tweetid.append(law_dept_tweets[i]['id'])
    law_retweet.append(law_dept_tweets[i]['public_metrics']['retweet_count'])
    law_reply.append(law_dept_tweets[i]['public_metrics']['reply_count'])
    law_like.append(law_dept_tweets[i]['public_metrics']['like_count'])
    law_quote.append(law_dept_tweets[i]['public_metrics']['quote_count'])
    
law_twitter = pd.DataFrame({
    'Twitter Id': law_tweetid,
    'Department': dept,
    'Retweet count': law_retweet,
    'Reply count': law_reply,
    'Like count': law_like,
    'Quote count': law_quote
})

In [300]:
#Management Department
r = requests.get('https://api.twitter.com/2/users/26465977/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
man = json.loads(r.text)

man_dept_tweets = man['data']
man_tweetid = []; man_retweet = []; man_reply = []; man_like = []; man_quote = []; dept = []

for i in range(0, len(man_dept_tweets)):
    dept.append('Management')
    man_tweetid.append(man_dept_tweets[i]['id'])
    man_retweet.append(man_dept_tweets[i]['public_metrics']['retweet_count'])
    man_reply.append(man_dept_tweets[i]['public_metrics']['reply_count'])
    man_like.append(man_dept_tweets[i]['public_metrics']['like_count'])
    man_quote.append(man_dept_tweets[i]['public_metrics']['quote_count'])
    
man_twitter = pd.DataFrame({
    'Twitter Id': man_tweetid,
    'Department': dept,
    'Retweet count': man_retweet,
    'Reply count': man_reply,
    'Like count': man_like,
    'Quote count': man_quote
})

In [301]:
#Mathematics Department
r = requests.get('https://api.twitter.com/2/users/3044880371/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
math = json.loads(r.text)

math_dept_tweets = math['data']
math_tweetid = []; math_retweet = []; math_reply = []; math_like = []; math_quote = []; dept = []

for i in range(0, len(math_dept_tweets)):
    dept.append('Mathematics')
    math_tweetid.append(math_dept_tweets[i]['id'])
    math_retweet.append(math_dept_tweets[i]['public_metrics']['retweet_count'])
    math_reply.append(math_dept_tweets[i]['public_metrics']['reply_count'])
    math_like.append(math_dept_tweets[i]['public_metrics']['like_count'])
    math_quote.append(math_dept_tweets[i]['public_metrics']['quote_count'])
    
math_twitter = pd.DataFrame({
    'Twitter Id': math_tweetid,
    'Department': dept,
    'Retweet count': math_retweet,
    'Reply count': math_reply,
    'Like count': math_like,
    'Quote count': math_quote
})

In [302]:
#Media and Communications Department
r = requests.get('https://api.twitter.com/2/users/207534677/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
mc = json.loads(r.text)

mc_dept_tweets = mc['data']
mc_tweetid = []; mc_retweet = []; mc_reply = []; mc_like = []; mc_quote = []; dept = []

for i in range(0, len(mc_dept_tweets)):
    dept.append('Media and Communications')
    mc_tweetid.append(mc_dept_tweets[i]['id'])
    mc_retweet.append(mc_dept_tweets[i]['public_metrics']['retweet_count'])
    mc_reply.append(mc_dept_tweets[i]['public_metrics']['reply_count'])
    mc_like.append(mc_dept_tweets[i]['public_metrics']['like_count'])
    mc_quote.append(mc_dept_tweets[i]['public_metrics']['quote_count'])
    
mc_twitter = pd.DataFrame({
    'Twitter Id': mc_tweetid,
    'Department': dept,
    'Retweet count': mc_retweet,
    'Reply count': mc_reply,
    'Like count': mc_like,
    'Quote count': mc_quote
})

In [303]:
#Methodology Department
r = requests.get('https://api.twitter.com/2/users/86921024/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
met = json.loads(r.text)

met_dept_tweets = met['data']
met_tweetid = []; met_retweet = []; met_reply = []; met_like = []; met_quote = []; dept = []

for i in range(0, len(met_dept_tweets)):
    dept.append('Methodology')
    met_tweetid.append(met_dept_tweets[i]['id'])
    met_retweet.append(met_dept_tweets[i]['public_metrics']['retweet_count'])
    met_reply.append(met_dept_tweets[i]['public_metrics']['reply_count'])
    met_like.append(met_dept_tweets[i]['public_metrics']['like_count'])
    met_quote.append(met_dept_tweets[i]['public_metrics']['quote_count'])
    
met_twitter = pd.DataFrame({
    'Twitter Id': met_tweetid,
    'Department': dept,
    'Retweet count': met_retweet,
    'Reply count': met_reply,
    'Like count': met_like,
    'Quote count': met_quote
})

In [304]:
#Philosophy Department
r = requests.get('https://api.twitter.com/2/users/904251031/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
phil = json.loads(r.text)

phil_dept_tweets = phil['data']
phil_tweetid = []; phil_retweet = []; phil_reply = []; phil_like = []; phil_quote = []; dept = []

for i in range(0, len(phil_dept_tweets)):
    dept.append('Philosophy, Logic and Scientific Method')
    phil_tweetid.append(phil_dept_tweets[i]['id'])
    phil_retweet.append(phil_dept_tweets[i]['public_metrics']['retweet_count'])
    phil_reply.append(phil_dept_tweets[i]['public_metrics']['reply_count'])
    phil_like.append(phil_dept_tweets[i]['public_metrics']['like_count'])
    phil_quote.append(phil_dept_tweets[i]['public_metrics']['quote_count'])
    
phil_twitter = pd.DataFrame({
    'Twitter Id': phil_tweetid,
    'Department': dept,
    'Retweet count': phil_retweet,
    'Reply count': phil_reply,
    'Like count': phil_like,
    'Quote count': phil_quote
})

In [305]:
#Pscyhology Department
r = requests.get('https://api.twitter.com/2/users/1965000560/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
psych = json.loads(r.text)

psych_dept_tweets = psych['data']
psych_tweetid = []; psych_retweet = []; psych_reply = []; psych_like = []; psych_quote = []; dept = []

for i in range(0, len(psych_dept_tweets)):
    dept.append('Psychological and Behavioural Science')
    psych_tweetid.append(psych_dept_tweets[i]['id'])
    psych_retweet.append(psych_dept_tweets[i]['public_metrics']['retweet_count'])
    psych_reply.append(psych_dept_tweets[i]['public_metrics']['reply_count'])
    psych_like.append(psych_dept_tweets[i]['public_metrics']['like_count'])
    psych_quote.append(psych_dept_tweets[i]['public_metrics']['quote_count'])
    
psych_twitter = pd.DataFrame({
    'Twitter Id': psych_tweetid,
    'Department': dept,
    'Retweet count': psych_retweet,
    'Reply count': psych_reply,
    'Like count': psych_like,
    'Quote count': psych_quote
})

In [306]:
#Social Policy Department
r = requests.get('https://api.twitter.com/2/users/2472172578/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
sp = json.loads(r.text)

sp_dept_tweets = sp['data']
sp_tweetid = []; sp_retweet = []; sp_reply = []; sp_like = []; sp_quote = []; dept = []

for i in range(0, len(sp_dept_tweets)):
    dept.append('Social Policy')
    sp_tweetid.append(sp_dept_tweets[i]['id'])
    sp_retweet.append(sp_dept_tweets[i]['public_metrics']['retweet_count'])
    sp_reply.append(sp_dept_tweets[i]['public_metrics']['reply_count'])
    sp_like.append(sp_dept_tweets[i]['public_metrics']['like_count'])
    sp_quote.append(sp_dept_tweets[i]['public_metrics']['quote_count'])
    
sp_twitter = pd.DataFrame({
    'Twitter Id': sp_tweetid,
    'Department': dept,
    'Retweet count': sp_retweet,
    'Reply count': sp_reply,
    'Like count': sp_like,
    'Quote count': sp_quote
})

In [307]:
#Sociology Department
r = requests.get('https://api.twitter.com/2/users/1671486960/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
socio = json.loads(r.text)

socio_dept_tweets = socio['data']
socio_tweetid = []; socio_retweet = []; socio_reply = []; socio_like = []; socio_quote = []; dept = []

for i in range(0, len(socio_dept_tweets)):
    dept.append('Sociology')
    socio_tweetid.append(socio_dept_tweets[i]['id'])
    socio_retweet.append(socio_dept_tweets[i]['public_metrics']['retweet_count'])
    socio_reply.append(socio_dept_tweets[i]['public_metrics']['reply_count'])
    socio_like.append(socio_dept_tweets[i]['public_metrics']['like_count'])
    socio_quote.append(socio_dept_tweets[i]['public_metrics']['quote_count'])
    
socio_twitter = pd.DataFrame({
    'Twitter Id': socio_tweetid,
    'Department': dept,
    'Retweet count': socio_retweet,
    'Reply count': socio_reply,
    'Like count': socio_like,
    'Quote count': socio_quote
})

In [308]:
#Statistics Department
r = requests.get('https://api.twitter.com/2/users/420282103/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
stats = json.loads(r.text)

stats_dept_tweets = stats['data']
stats_tweetid = []; stats_retweet = []; stats_reply = []; stats_like = []; stats_quote = []; dept = []

for i in range(0, len(stats_dept_tweets)):
    dept.append('Statistics')
    stats_tweetid.append(stats_dept_tweets[i]['id'])
    stats_retweet.append(stats_dept_tweets[i]['public_metrics']['retweet_count'])
    stats_reply.append(stats_dept_tweets[i]['public_metrics']['reply_count'])
    stats_like.append(stats_dept_tweets[i]['public_metrics']['like_count'])
    stats_quote.append(stats_dept_tweets[i]['public_metrics']['quote_count'])
    
stats_twitter = pd.DataFrame({
    'Twitter Id': stats_tweetid,
    'Department': dept,
    'Retweet count': stats_retweet,
    'Reply count': stats_reply,
    'Like count': stats_like,
    'Quote count': stats_quote
})

In [309]:
tweets_stats = pd.concat([acc_twitter, anth_twitter, econ_twitter, econhist_twitter,
         finance_twitter, gender_twitter, geo_twitter, gov_twitter,
         hp_twitter, id_twitter, ih_twitter, ir_twitter, lang_twitter,
         law_twitter, man_twitter, math_twitter, mc_twitter, met_twitter,
         phil_twitter, psych_twitter, sp_twitter, socio_twitter, stats_twitter])

tweets_stats 

Unnamed: 0,Twitter Id,Department,Retweet count,Reply count,Like count,Quote count
0,1506004918016094208,Accounting,1,0,0,0
1,1502055596488540164,Accounting,1,0,0,0
2,1501594873509691393,Accounting,2,0,4,0
3,1498763109401612296,Accounting,3,0,0,0
4,1498762781457321986,Accounting,1,0,0,0
...,...,...,...,...,...,...
95,1485949504906051589,Statistics,0,0,2,0
96,1484103476263194624,Statistics,0,0,1,0
97,1483801755452387329,Statistics,0,0,0,0
98,1483756212420321283,Statistics,0,0,1,0


Earlier, we found the follower count of each department's social media. Now that we have both the followers and engagement totals, we can calculate the engagement ratio for each account. For engagement ratio, we can take the total engagements(retweets, replies, likes and quotes) and divide them by followers.

In [314]:
dept_eng = tweets_stats.groupby(['Department'], as_index=False).sum()
dept_eng['Follower count'] = dept_usernames['Follower count']
dept_eng['Total engagement'] = dept_eng_count['Retweet count']+dept_eng_count['Reply count']+dept_eng_count['Like count']+dept_eng_count['Quote count']

In [315]:
dept_eng['Engagement ratio'] = dept_eng['Total engagement']/dept_eng['Follower count']
dept_eng

Unnamed: 0,Department,Retweet count,Reply count,Like count,Quote count,Follower count,Total engagement,Engagement ratio
0,Accounting,288,1,94,3,2513,386,0.153601
1,Anthropology,776,4,472,19,6625,1271,0.191849
2,Economics,366,6,277,8,35620,657,0.018445
3,Economics History,152,63,344,13,3873,572,0.147689
4,Finance,390,1,38,3,2477,432,0.174405
5,Gender Studies,1592,10,307,19,19648,1928,0.098127
6,Geography and Environment,466,4,267,13,12550,750,0.059761
7,Government,377,7,130,7,24724,521,0.021073
8,Health Policy,308,22,393,12,7631,735,0.096318
9,International Development,308,4,201,15,12126,528,0.043543


The last thing we need to collect is information about the mentions of the department's accounts. After extracting this, I will also perform sentiment analyses for the mentions. Once the sentiment analysis is complete, we can count how many mentions are positive.

In [335]:
#Sentiment analysis code
from textblob import TextBlob
import tweepy
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

In [339]:
#Accounting department
r = requests.get('https://api.twitter.com/2/users/420282103/mentions?expansions=author_id&max_results=50', headers=headers)
acc_m = json.loads(r.text)['data']

acc_user_mentions = []; acc_sentiment = []
for i in range(0, len(acc_m)):
    user = acc_m[i]['author_id']
    mention = acc_m[i]['text']
    analysis = TextBlob(mention)
    score = SentimentIntensityAnalyzer().polarity_scores(mention)
    neg = score['neg']
    neu = score['neu']
    pos = score['pos']
    comp = score['compound']
    polarity += analysis.sentiment.polarity
    if neg > pos:
        acc_sent = 'neg'
    elif pos > neg:
        acc_sent = 'pos'
    elif pos == neg:
        acc_sent = 'neu'
    acc_user_mentions.append(user)
    acc_sentiment.append(acc_sent)

LookupError: 
**********************************************************************
  Resource [93mvader_lexicon[0m not found.
  Please use the NLTK Downloader to obtain the resource:

  [31m>>> import nltk
  >>> nltk.download('vader_lexicon')
  [0m
  For more information see: https://www.nltk.org/data.html

  Attempted to load [93msentiment/vader_lexicon.zip/vader_lexicon/vader_lexicon.txt[0m

  Searched in:
    - 'C:\\Users\\User/nltk_data'
    - 'C:\\Users\\User\\anaconda3\\nltk_data'
    - 'C:\\Users\\User\\anaconda3\\share\\nltk_data'
    - 'C:\\Users\\User\\anaconda3\\lib\\nltk_data'
    - 'C:\\Users\\User\\AppData\\Roaming\\nltk_data'
    - 'C:\\nltk_data'
    - 'D:\\nltk_data'
    - 'E:\\nltk_data'
    - ''
**********************************************************************


Based on the complete data frame above, these are some graphs that I could draw to decide whether Twitter is an effective social media platform:
* Departments and their retweet count 
* Departments and their like count 
* Retweet engagements vs normal tweet engagements 
* Overall following count vs follower to engagement ratios 
* Total comments and ratio of comment sentiments
* Network graph of the users who have mentioned the department's accounts