### Part 2: Data Collection

To answer the questions of this project, we have to collect variations of information for each department. For example, we could gather information like the number of followers, types, numbers of engagements, and the sentiments of replies and mentions for each account. Using different combinations of information, we can find out whether Twitter is an adequate social media platform. The relevant code for data collection can be found in the data collection Jupyter notebook.

We can collect all the information needed from Twitter itself. There are a few limitations to using Twitter API calls. For example, there are limits to the API calls we can make. I will explain more of the constraints when extracting more data. 

To start, we will call the list of departments at the LSE. We do this in order to search up the related accounts on Twitter.

In [1]:
from bs4 import BeautifulSoup
import requests
import json

In [2]:
url = "https://info.lse.ac.uk/staff/departments-and-institutes" #List of LSE departments and institutes
response = requests.get(url)
soup = BeautifulSoup(response.content, "lxml")

In [3]:
raw_list = soup.find_all('ul')

In [4]:
dept = []

for tag in raw_list:
    list_el = tag.text.strip()
    dept.append(list_el)

In [5]:
depts = dept[2:]
depts

['Department of Accounting\nDepartment of Anthropology\nData Science Institute\nDepartment of Economics\nDepartment of Economic History\nEuropean Institute\nDepartment of Finance\nFiroz Lalji Institute for Africa\nDepartment of Gender Studies\nDepartment of Geography and Environment\nDepartment of Government\nDepartment of Health Policy\nDepartment of International Development\nDepartment of International History\nInternational Inequalities Institute\nDepartment of International Relations\nLanguage Centre\nLSE Law School\nDepartment of Management\nMarshall Institute\nDepartment of Mathematics\nDepartment of Media and Communications\nDepartment of Methodology\nDepartment of Philosophy, Logic and Scientific Method\nDepartment of Psychological and Behavioural Science\nSchool of Public Policy (formerly Institute of Public Affairs)\nDepartment of Social Policy\nDepartment of Sociology\nDepartment of Statistics']

In [6]:
dept_list = depts[0].split("\n")
dept_list

['Department of Accounting',
 'Department of Anthropology',
 'Data Science Institute',
 'Department of Economics',
 'Department of Economic History',
 'European Institute',
 'Department of Finance',
 'Firoz Lalji Institute for Africa',
 'Department of Gender Studies',
 'Department of Geography and Environment',
 'Department of Government',
 'Department of Health Policy',
 'Department of International Development',
 'Department of International History',
 'International Inequalities Institute',
 'Department of International Relations',
 'Language Centre',
 'LSE Law School',
 'Department of Management',
 'Marshall Institute',
 'Department of Mathematics',
 'Department of Media and Communications',
 'Department of Methodology',
 'Department of Philosophy, Logic and Scientific Method',
 'Department of Psychological and Behavioural Science',
 'School of Public Policy (formerly Institute of Public Affairs)',
 'Department of Social Policy',
 'Department of Sociology',
 'Department of Statisti

In this project, I have only decided to look at LSE departments. Institutes tend to have massive followings and digital presences because they're more of a community than a department. It wouldn't be a fair comparison. Therefore, we won't be looking into Institutes.

In [7]:
#Remove all elements in the list with the word Institute as we are looking only at Departments
indices = [dept_list.index("Data Science Institute"), dept_list.index("European Institute"), 
           dept_list.index("Firoz Lalji Institute for Africa"), dept_list.index("International Inequalities Institute"),
           dept_list.index("Marshall Institute"), 
           dept_list.index("School of Public Policy (formerly Institute of Public Affairs)")]

In [8]:
dept_list = [i for j, i in enumerate(dept_list) if j not in indices]
dept_list #Final list of departments

['Department of Accounting',
 'Department of Anthropology',
 'Department of Economics',
 'Department of Economic History',
 'Department of Finance',
 'Department of Gender Studies',
 'Department of Geography and Environment',
 'Department of Government',
 'Department of Health Policy',
 'Department of International Development',
 'Department of International History',
 'Department of International Relations',
 'Language Centre',
 'LSE Law School',
 'Department of Management',
 'Department of Mathematics',
 'Department of Media and Communications',
 'Department of Methodology',
 'Department of Philosophy, Logic and Scientific Method',
 'Department of Psychological and Behavioural Science',
 'Department of Social Policy',
 'Department of Sociology',
 'Department of Statistics']

Based on the department list above, I've collected the list of Twitter usernames for each department manually because there isn't a compiled list of the department's Twitter handles.

In [9]:
usernames = ['LSE_Accounting', 'LSEAnthropology', 'LSEEcon',
            'LSEEcHist', 'LSEfinance', 'LSEGenderTweet', 'LSEGeography',
            'LSEGovernment', 'LSEHealthPolicy', 'LSE_ID', 'lsehistory',
            'LSEIRDept', 'lselangcentre', 'LSELaw', 'LSEManagement', 'LSEMaths',
            'MediaLSE', 'MethodologyLSE', 'LSEPhilosophy', 'LSEBehavioural', 
            'LSESocialPolicy', 'LSEsociology', 'LSEStatistics']

In [10]:
import pandas as pd

dept_usernames = pd.DataFrame({
    'Department': dept_list,
    'Twitter username': usernames
})

dept_usernames['Department'] = dept_usernames['Department'].str.replace('Department of ', '')
dept_usernames

Unnamed: 0,Department,Twitter username
0,Accounting,LSE_Accounting
1,Anthropology,LSEAnthropology
2,Economics,LSEEcon
3,Economic History,LSEEcHist
4,Finance,LSEfinance
5,Gender Studies,LSEGenderTweet
6,Geography and Environment,LSEGeography
7,Government,LSEGovernment
8,Health Policy,LSEHealthPolicy
9,International Development,LSE_ID


Next, we start identifying what kind of tweets we would like to look at for each department. We want to determine which department has better engagement overall and see what factors contribute to its success. The first thing we will look at is each department's follower counts. The next step is to find out the overall engagement using the metrics of likes, retweets, replies, and quotes. In the next step, we will look at comments (which may be limited). All the data analysis will only go back to the last 100 tweets posted by each account because this is the limit that Twitter allows.

Because of the restrictions of the Twitter developer account, the engagement counts are public metrics. If there were no restrictions, I would add more metrics like impression count, total views, and profile views.

In [11]:
#Twitter authentication for API calls
with open('keys.json') as f:
    keys = json.load(f)

bearer_token = keys['twitter']['bearer_token']
headers = {
    'Authorization': f"Bearer {bearer_token}"
}

In [12]:
#Getting information about each department's account
r = requests.get('https://api.twitter.com/2/users/by?usernames=LSE_Accounting,LSEAnthropology,LSEEcon,LSEEcHist,LSEfinance,LSEGenderTweet,LSEGeography,LSEGovernment,LSEHealthPolicy,LSE_ID,lsehistory,LSEIRDept,lselangcentre,LSELaw,LSEManagement,LSEMaths,MediaLSE,MethodologyLSE,LSEPhilosophy,LSEBehavioural,LSESocialPolicy,LSEsociology,LSEStatistics&user.fields=public_metrics', headers=headers)
r.text

'{"data":[{"username":"LSE_Accounting","public_metrics":{"followers_count":2520,"following_count":119,"tweet_count":495,"listed_count":25},"id":"4900666161","name":"LSE Accounting"},{"username":"LSEAnthropology","public_metrics":{"followers_count":6687,"following_count":99,"tweet_count":918,"listed_count":106},"id":"850888387","name":"LSE Anthropology"},{"username":"LSEEcon","public_metrics":{"followers_count":35783,"following_count":636,"tweet_count":10194,"listed_count":0},"id":"1200727465","name":"LSE Department of Economics"},{"username":"LSEEcHist","public_metrics":{"followers_count":3916,"following_count":309,"tweet_count":1765,"listed_count":101},"id":"224639696","name":"LSE Economic History"},{"username":"LSEfinance","public_metrics":{"followers_count":2482,"following_count":179,"tweet_count":605,"listed_count":51},"id":"972257048","name":"LSE Finance"},{"username":"LSEGenderTweet","public_metrics":{"followers_count":19727,"following_count":2626,"tweet_count":7243,"listed_count

In [13]:
#Collecting information about each department's Twitter accounts
ids = json.loads(r.text)['data']

dept_ids = []; dept_followers = []; tweet_count = []

for i in range(0, len(ids)):
    idno = ids[i]['id'] #Twitter ID for API calls later on
    no_tweets = ids[i]['public_metrics']['tweet_count'] #Tweet count for each account
    followers = ids[i]['public_metrics']['followers_count'] #Follow count for each count
    dept_ids.append(idno)
    dept_followers.append(followers)
    tweet_count.append(no_tweets)

dept_usernames['Tweet count'] = tweet_count
dept_usernames['Twitter ID'] = dept_ids
dept_usernames['Follower count'] = dept_followers

In [14]:
dept_usernames

Unnamed: 0,Department,Twitter username,Tweet count,Twitter ID,Follower count
0,Accounting,LSE_Accounting,495,4900666161,2520
1,Anthropology,LSEAnthropology,918,850888387,6687
2,Economics,LSEEcon,10194,1200727465,35783
3,Economic History,LSEEcHist,1765,224639696,3916
4,Finance,LSEfinance,605,972257048,2482
5,Gender Studies,LSEGenderTweet,7243,189090262,19727
6,Geography and Environment,LSEGeography,5339,240262055,12606
7,Government,LSEGovernment,8673,303823238,24781
8,Health Policy,LSEHealthPolicy,4559,472009727,7702
9,International Development,LSE_ID,5753,317018025,12198


Next, I will extract the number of engagements for the last 100 tweets of every LSE department on Twitter. I've chosen to use the most recent 100 tweets because of the limits. However, if the restriction on the call limit did not exist, I would've retrieved tweets for all departments from January. Taking tweets based after a precise date would be a fair comparison.

In [15]:
#Retrieving information about the past 100 tweets for each department

#Accounting department
r = requests.get('https://api.twitter.com/2/users/4900666161/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
acc = json.loads(r.text)

acc_dept_tweets = acc['data']
acc_retweet = []; acc_reply = []; acc_like = []; acc_quote = []; dept = []; tweet_no = []

for i in range(0, len(acc_dept_tweets)):
    tweet_no.append(i) #For the purpose of plotting tweets 1-100 in the visualisation section
    dept.append('Accounting') #Department name
    acc_retweet.append(acc_dept_tweets[i]['public_metrics']['retweet_count']) #How many times the tweet was retweeted
    acc_reply.append(acc_dept_tweets[i]['public_metrics']['reply_count']) #How many replies the tweet got
    acc_like.append(acc_dept_tweets[i]['public_metrics']['like_count']) #How many likes the tweet got
    acc_quote.append(acc_dept_tweets[i]['public_metrics']['quote_count']) #How many quotes the tweet got
    
acc_twitter = pd.DataFrame({
    'Tweet no': tweet_no,
    'Department': dept,
    'Retweet count': acc_retweet,
    'Reply count': acc_reply,
    'Like count': acc_like,
    'Quote count': acc_quote
})

In [16]:
#Anthropology department
r = requests.get('https://api.twitter.com/2/users/850888387/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
anth = json.loads(r.text)

anth_dept_tweets = anth['data']
anth_retweet = []; anth_reply = []; anth_like = []; anth_quote = []; dept = []; tweet_no = []

for i in range(0, len(anth_dept_tweets)):
    tweet_no.append(i)
    dept.append('Anthropology')
    anth_retweet.append(anth_dept_tweets[i]['public_metrics']['retweet_count'])
    anth_reply.append(anth_dept_tweets[i]['public_metrics']['reply_count'])
    anth_like.append(anth_dept_tweets[i]['public_metrics']['like_count'])
    anth_quote.append(anth_dept_tweets[i]['public_metrics']['quote_count'])
    
anth_twitter = pd.DataFrame({
    'Tweet no': tweet_no,
    'Department': dept,
    'Retweet count': anth_retweet,
    'Reply count': anth_reply,
    'Like count': anth_like,
    'Quote count': anth_quote
})

In [17]:
#Economics department
r = requests.get('https://api.twitter.com/2/users/1200727465/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
econ = json.loads(r.text)

econ_dept_tweets = econ['data']
econ_retweet = []; econ_reply = []; econ_like = []; econ_quote = []; dept = []; tweet_no = []

for i in range(0, len(econ_dept_tweets)):
    tweet_no.append(i)
    dept.append('Economics')
    econ_retweet.append(econ_dept_tweets[i]['public_metrics']['retweet_count'])
    econ_reply.append(econ_dept_tweets[i]['public_metrics']['reply_count'])
    econ_like.append(econ_dept_tweets[i]['public_metrics']['like_count'])
    econ_quote.append(econ_dept_tweets[i]['public_metrics']['quote_count'])
    
econ_twitter = pd.DataFrame({
    'Tweet no': tweet_no,
    'Department': dept,
    'Retweet count': econ_retweet,
    'Reply count': econ_reply,
    'Like count': econ_like,
    'Quote count': econ_quote
})

In [18]:
#Economics history department
r = requests.get('https://api.twitter.com/2/users/224639696/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
econhist = json.loads(r.text)

econhist_dept_tweets = econhist['data']
econhist_retweet = []; econhist_reply = []; econhist_like = []; econhist_quote = []; dept = []; tweet_no = []

for i in range(0, len(econhist_dept_tweets)):
    tweet_no.append(i)
    dept.append('Economics History')
    econhist_retweet.append(econhist_dept_tweets[i]['public_metrics']['retweet_count'])
    econhist_reply.append(econhist_dept_tweets[i]['public_metrics']['reply_count'])
    econhist_like.append(econhist_dept_tweets[i]['public_metrics']['like_count'])
    econhist_quote.append(econhist_dept_tweets[i]['public_metrics']['quote_count'])
    
econhist_twitter = pd.DataFrame({
    'Tweet no': tweet_no,
    'Department': dept,
    'Retweet count': econhist_retweet,
    'Reply count': econhist_reply,
    'Like count': econhist_like,
    'Quote count': econhist_quote
})

In [19]:
#Finance department
r = requests.get('https://api.twitter.com/2/users/972257048/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
finance = json.loads(r.text)

finance_dept_tweets = finance['data']
finance_retweet = []; finance_reply = []; finance_like = []; finance_quote = []; dept = []; tweet_no = []

for i in range(0, len(finance_dept_tweets)):
    tweet_no.append(i)
    dept.append('Finance')
    finance_retweet.append(finance_dept_tweets[i]['public_metrics']['retweet_count'])
    finance_reply.append(finance_dept_tweets[i]['public_metrics']['reply_count'])
    finance_like.append(finance_dept_tweets[i]['public_metrics']['like_count'])
    finance_quote.append(finance_dept_tweets[i]['public_metrics']['quote_count'])
    
finance_twitter = pd.DataFrame({
    'Tweet no': tweet_no,
    'Department': dept,
    'Retweet count': finance_retweet,
    'Reply count': finance_reply,
    'Like count': finance_like,
    'Quote count': finance_quote
})

In [20]:
#Gender studies department
r = requests.get('https://api.twitter.com/2/users/189090262/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
gender = json.loads(r.text)

gender_dept_tweets = gender['data']
gender_retweet = []; gender_reply = []; gender_like = []; gender_quote = []; dept = []; tweet_no = []

for i in range(0, len(gender_dept_tweets)):
    tweet_no.append(i)
    dept.append('Gender Studies')
    gender_retweet.append(gender_dept_tweets[i]['public_metrics']['retweet_count'])
    gender_reply.append(gender_dept_tweets[i]['public_metrics']['reply_count'])
    gender_like.append(gender_dept_tweets[i]['public_metrics']['like_count'])
    gender_quote.append(gender_dept_tweets[i]['public_metrics']['quote_count'])
    
gender_twitter = pd.DataFrame({
    'Tweet no': tweet_no,
    'Department': dept,
    'Retweet count': gender_retweet,
    'Reply count': gender_reply,
    'Like count': gender_like,
    'Quote count': gender_quote
})

In [21]:
#Geography department
r = requests.get('https://api.twitter.com/2/users/240262055/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
geo = json.loads(r.text)

geo_dept_tweets = geo['data']
geo_retweet = []; geo_reply = []; geo_like = []; geo_quote = []; dept = []; tweet_no = []

for i in range(0, len(geo_dept_tweets)):
    tweet_no.append(i)
    dept.append('Geography and Environment')
    geo_retweet.append(geo_dept_tweets[i]['public_metrics']['retweet_count'])
    geo_reply.append(geo_dept_tweets[i]['public_metrics']['reply_count'])
    geo_like.append(geo_dept_tweets[i]['public_metrics']['like_count'])
    geo_quote.append(geo_dept_tweets[i]['public_metrics']['quote_count'])
    
geo_twitter = pd.DataFrame({
    'Tweet no': tweet_no,
    'Department': dept,
    'Retweet count': geo_retweet,
    'Reply count': geo_reply,
    'Like count': geo_like,
    'Quote count': geo_quote
})

In [22]:
#Government department
r = requests.get('https://api.twitter.com/2/users/303823238/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
gov = json.loads(r.text)

gov_dept_tweets = gov['data']
gov_retweet = []; gov_reply = []; gov_like = []; gov_quote = []; dept = []; tweet_no = []

for i in range(0, len(gov_dept_tweets)):
    tweet_no.append(i)
    dept.append('Government')
    gov_retweet.append(gov_dept_tweets[i]['public_metrics']['retweet_count'])
    gov_reply.append(gov_dept_tweets[i]['public_metrics']['reply_count'])
    gov_like.append(gov_dept_tweets[i]['public_metrics']['like_count'])
    gov_quote.append(gov_dept_tweets[i]['public_metrics']['quote_count'])
    
gov_twitter = pd.DataFrame({
    'Tweet no': tweet_no,
    'Department': dept,
    'Retweet count': gov_retweet,
    'Reply count': gov_reply,
    'Like count': gov_like,
    'Quote count': gov_quote
})

In [23]:
#Health Policy Department
r = requests.get('https://api.twitter.com/2/users/472009727/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
hp = json.loads(r.text)

hp_dept_tweets = hp['data']
hp_retweet = []; hp_reply = []; hp_like = []; hp_quote = []; dept = []; tweet_no = []

for i in range(0, len(hp_dept_tweets)):
    tweet_no.append(i)
    dept.append('Health Policy')
    hp_retweet.append(hp_dept_tweets[i]['public_metrics']['retweet_count'])
    hp_reply.append(hp_dept_tweets[i]['public_metrics']['reply_count'])
    hp_like.append(hp_dept_tweets[i]['public_metrics']['like_count'])
    hp_quote.append(hp_dept_tweets[i]['public_metrics']['quote_count'])
    
hp_twitter = pd.DataFrame({
    'Tweet no': tweet_no,
    'Department': dept,
    'Retweet count': hp_retweet,
    'Reply count': hp_reply,
    'Like count': hp_like,
    'Quote count': hp_quote
})

In [24]:
#International Development Department
r = requests.get('https://api.twitter.com/2/users/317018025/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
id = json.loads(r.text)

id_dept_tweets = id['data']
id_retweet = []; id_reply = []; id_like = []; id_quote = []; dept = []; tweet_no = []

for i in range(0, len(id_dept_tweets)):
    tweet_no.append(i)
    dept.append('International Development')
    id_retweet.append(id_dept_tweets[i]['public_metrics']['retweet_count'])
    id_reply.append(id_dept_tweets[i]['public_metrics']['reply_count'])
    id_like.append(id_dept_tweets[i]['public_metrics']['like_count'])
    id_quote.append(id_dept_tweets[i]['public_metrics']['quote_count'])
    
id_twitter = pd.DataFrame({
    'Tweet no': tweet_no,
    'Department': dept,
    'Retweet count': id_retweet,
    'Reply count': id_reply,
    'Like count': id_like,
    'Quote count': id_quote
})

In [25]:
#International History Department
r = requests.get('https://api.twitter.com/2/users/253471591/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
ih = json.loads(r.text)

ih_dept_tweets = ih['data']
ih_retweet = []; ih_reply = []; ih_like = []; ih_quote = []; dept = []; tweet_no = []

for i in range(0, len(ih_dept_tweets)):
    tweet_no.append(i)
    dept.append('International History')
    ih_retweet.append(ih_dept_tweets[i]['public_metrics']['retweet_count'])
    ih_reply.append(ih_dept_tweets[i]['public_metrics']['reply_count'])
    ih_like.append(ih_dept_tweets[i]['public_metrics']['like_count'])
    ih_quote.append(ih_dept_tweets[i]['public_metrics']['quote_count'])
    
ih_twitter = pd.DataFrame({
    'Tweet no': tweet_no,
    'Department': dept,
    'Retweet count': ih_retweet,
    'Reply count': ih_reply,
    'Like count': ih_like,
    'Quote count': ih_quote
})

In [26]:
#International Relations Department
r = requests.get('https://api.twitter.com/2/users/237225532/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
ir = json.loads(r.text)

ir_dept_tweets = ir['data']
ir_retweet = []; ir_reply = []; ir_like = []; ir_quote = []; dept = []; tweet_no = []

for i in range(0, len(ir_dept_tweets)):
    tweet_no.append(i)
    dept.append('International Relations')
    ir_retweet.append(ir_dept_tweets[i]['public_metrics']['retweet_count'])
    ir_reply.append(ir_dept_tweets[i]['public_metrics']['reply_count'])
    ir_like.append(ir_dept_tweets[i]['public_metrics']['like_count'])
    ir_quote.append(ir_dept_tweets[i]['public_metrics']['quote_count'])
    
ir_twitter = pd.DataFrame({
    'Tweet no': tweet_no,
    'Department': dept,
    'Retweet count': ir_retweet,
    'Reply count': ir_reply,
    'Like count': ir_like,
    'Quote count': ir_quote
})

In [27]:
#Language Centre Department
r = requests.get('https://api.twitter.com/2/users/179888345/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
lang = json.loads(r.text)

lang_dept_tweets = lang['data']
lang_retweet = []; lang_reply = []; lang_like = []; lang_quote = []; dept = []; tweet_no = []

for i in range(0, len(lang_dept_tweets)):
    tweet_no.append(i)
    dept.append('Language Centre')
    lang_retweet.append(lang_dept_tweets[i]['public_metrics']['retweet_count'])
    lang_reply.append(lang_dept_tweets[i]['public_metrics']['reply_count'])
    lang_like.append(lang_dept_tweets[i]['public_metrics']['like_count'])
    lang_quote.append(lang_dept_tweets[i]['public_metrics']['quote_count'])
    
lang_twitter = pd.DataFrame({
    'Tweet no': tweet_no,
    'Department': dept,
    'Retweet count': lang_retweet,
    'Reply count': lang_reply,
    'Like count': lang_like,
    'Quote count': lang_quote
})

In [28]:
#Law Department
r = requests.get('https://api.twitter.com/2/users/532172035/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
law = json.loads(r.text)

law_dept_tweets = law['data']
law_retweet = []; law_reply = []; law_like = []; law_quote = []; dept = []; tweet_no = []

for i in range(0, len(law_dept_tweets)):
    tweet_no.append(i)
    dept.append('LSE Law School')
    law_retweet.append(law_dept_tweets[i]['public_metrics']['retweet_count'])
    law_reply.append(law_dept_tweets[i]['public_metrics']['reply_count'])
    law_like.append(law_dept_tweets[i]['public_metrics']['like_count'])
    law_quote.append(law_dept_tweets[i]['public_metrics']['quote_count'])
    
law_twitter = pd.DataFrame({
    'Tweet no': tweet_no,
    'Department': dept,
    'Retweet count': law_retweet,
    'Reply count': law_reply,
    'Like count': law_like,
    'Quote count': law_quote
})

In [29]:
#Management Department
r = requests.get('https://api.twitter.com/2/users/26465977/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
man = json.loads(r.text)

man_dept_tweets = man['data']
man_retweet = []; man_reply = []; man_like = []; man_quote = []; dept = []; tweet_no = []

for i in range(0, len(man_dept_tweets)):
    tweet_no.append(i)
    dept.append('Management')
    man_retweet.append(man_dept_tweets[i]['public_metrics']['retweet_count'])
    man_reply.append(man_dept_tweets[i]['public_metrics']['reply_count'])
    man_like.append(man_dept_tweets[i]['public_metrics']['like_count'])
    man_quote.append(man_dept_tweets[i]['public_metrics']['quote_count'])
    
man_twitter = pd.DataFrame({
    'Tweet no': tweet_no,
    'Department': dept,
    'Retweet count': man_retweet,
    'Reply count': man_reply,
    'Like count': man_like,
    'Quote count': man_quote
})

In [30]:
#Mathematics Department
r = requests.get('https://api.twitter.com/2/users/3044880371/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
math = json.loads(r.text)

math_dept_tweets = math['data']
math_retweet = []; math_reply = []; math_like = []; math_quote = []; dept = []; tweet_no = []

for i in range(0, len(math_dept_tweets)):
    tweet_no.append(i)
    dept.append('Mathematics')
    math_retweet.append(math_dept_tweets[i]['public_metrics']['retweet_count'])
    math_reply.append(math_dept_tweets[i]['public_metrics']['reply_count'])
    math_like.append(math_dept_tweets[i]['public_metrics']['like_count'])
    math_quote.append(math_dept_tweets[i]['public_metrics']['quote_count'])
    
math_twitter = pd.DataFrame({
    'Tweet no': tweet_no,
    'Department': dept,
    'Retweet count': math_retweet,
    'Reply count': math_reply,
    'Like count': math_like,
    'Quote count': math_quote
})

In [31]:
#Media and Communications Department
r = requests.get('https://api.twitter.com/2/users/207534677/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
mc = json.loads(r.text)

mc_dept_tweets = mc['data']
mc_retweet = []; mc_reply = []; mc_like = []; mc_quote = []; dept = []; tweet_no = []

for i in range(0, len(mc_dept_tweets)):
    tweet_no.append(i)
    dept.append('Media and Communications')
    mc_retweet.append(mc_dept_tweets[i]['public_metrics']['retweet_count'])
    mc_reply.append(mc_dept_tweets[i]['public_metrics']['reply_count'])
    mc_like.append(mc_dept_tweets[i]['public_metrics']['like_count'])
    mc_quote.append(mc_dept_tweets[i]['public_metrics']['quote_count'])
    
mc_twitter = pd.DataFrame({
    'Tweet no': tweet_no,
    'Department': dept,
    'Retweet count': mc_retweet,
    'Reply count': mc_reply,
    'Like count': mc_like,
    'Quote count': mc_quote
})

In [32]:
#Methodology Department
r = requests.get('https://api.twitter.com/2/users/86921024/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
met = json.loads(r.text)

met_dept_tweets = met['data']
met_retweet = []; met_reply = []; met_like = []; met_quote = []; dept = []; tweet_no = []

for i in range(0, len(met_dept_tweets)):
    tweet_no.append(i)
    dept.append('Methodology')
    met_retweet.append(met_dept_tweets[i]['public_metrics']['retweet_count'])
    met_reply.append(met_dept_tweets[i]['public_metrics']['reply_count'])
    met_like.append(met_dept_tweets[i]['public_metrics']['like_count'])
    met_quote.append(met_dept_tweets[i]['public_metrics']['quote_count'])
    
met_twitter = pd.DataFrame({
    'Tweet no': tweet_no,
    'Department': dept,
    'Retweet count': met_retweet,
    'Reply count': met_reply,
    'Like count': met_like,
    'Quote count': met_quote
})

In [33]:
#Philosophy Department
r = requests.get('https://api.twitter.com/2/users/904251031/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
phil = json.loads(r.text)

phil_dept_tweets = phil['data']
phil_retweet = []; phil_reply = []; phil_like = []; phil_quote = []; dept = []; tweet_no = []

for i in range(0, len(phil_dept_tweets)):
    tweet_no.append(i)
    dept.append('Philosophy, Logic and Scientific Method')
    phil_retweet.append(phil_dept_tweets[i]['public_metrics']['retweet_count'])
    phil_reply.append(phil_dept_tweets[i]['public_metrics']['reply_count'])
    phil_like.append(phil_dept_tweets[i]['public_metrics']['like_count'])
    phil_quote.append(phil_dept_tweets[i]['public_metrics']['quote_count'])
    
phil_twitter = pd.DataFrame({
    'Tweet no': tweet_no,
    'Department': dept,
    'Retweet count': phil_retweet,
    'Reply count': phil_reply,
    'Like count': phil_like,
    'Quote count': phil_quote
})

In [34]:
#Pscyhology Department
r = requests.get('https://api.twitter.com/2/users/1965000560/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
psych = json.loads(r.text)

psych_dept_tweets = psych['data']
psych_retweet = []; psych_reply = []; psych_like = []; psych_quote = []; dept = []; tweet_no = []

for i in range(0, len(psych_dept_tweets)):
    tweet_no.append(i)
    dept.append('Psychological and Behavioural Science')
    psych_retweet.append(psych_dept_tweets[i]['public_metrics']['retweet_count'])
    psych_reply.append(psych_dept_tweets[i]['public_metrics']['reply_count'])
    psych_like.append(psych_dept_tweets[i]['public_metrics']['like_count'])
    psych_quote.append(psych_dept_tweets[i]['public_metrics']['quote_count'])
    
psych_twitter = pd.DataFrame({
    'Tweet no': tweet_no,
    'Department': dept,
    'Retweet count': psych_retweet,
    'Reply count': psych_reply,
    'Like count': psych_like,
    'Quote count': psych_quote
})

In [35]:
#Social Policy Department
r = requests.get('https://api.twitter.com/2/users/2472172578/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
sp = json.loads(r.text)

sp_dept_tweets = sp['data']
sp_retweet = []; sp_reply = []; sp_like = []; sp_quote = []; dept = []; tweet_no = []

for i in range(0, len(sp_dept_tweets)):
    tweet_no.append(i)
    dept.append('Social Policy')
    sp_retweet.append(sp_dept_tweets[i]['public_metrics']['retweet_count'])
    sp_reply.append(sp_dept_tweets[i]['public_metrics']['reply_count'])
    sp_like.append(sp_dept_tweets[i]['public_metrics']['like_count'])
    sp_quote.append(sp_dept_tweets[i]['public_metrics']['quote_count'])
    
sp_twitter = pd.DataFrame({
    'Tweet no': tweet_no,
    'Department': dept,
    'Retweet count': sp_retweet,
    'Reply count': sp_reply,
    'Like count': sp_like,
    'Quote count': sp_quote
})

In [36]:
#Sociology Department
r = requests.get('https://api.twitter.com/2/users/1671486960/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
socio = json.loads(r.text)

socio_dept_tweets = socio['data']
socio_retweet = []; socio_reply = []; socio_like = []; socio_quote = []; dept = []; tweet_no = []

for i in range(0, len(socio_dept_tweets)):
    tweet_no.append(i)
    dept.append('Sociology')
    socio_retweet.append(socio_dept_tweets[i]['public_metrics']['retweet_count'])
    socio_reply.append(socio_dept_tweets[i]['public_metrics']['reply_count'])
    socio_like.append(socio_dept_tweets[i]['public_metrics']['like_count'])
    socio_quote.append(socio_dept_tweets[i]['public_metrics']['quote_count'])
    
socio_twitter = pd.DataFrame({
    'Tweet no': tweet_no,
    'Department': dept,
    'Retweet count': socio_retweet,
    'Reply count': socio_reply,
    'Like count': socio_like,
    'Quote count': socio_quote
})

In [37]:
#Statistics Department
r = requests.get('https://api.twitter.com/2/users/420282103/tweets?expansions=attachments.poll_ids,attachments.media_keys&tweet.fields=public_metrics&poll.fields=end_datetime&max_results=100', headers=headers)
stats = json.loads(r.text)

stats_dept_tweets = stats['data']
stats_retweet = []; stats_reply = []; stats_like = []; stats_quote = []; dept = []; tweet_no = []

for i in range(0, len(stats_dept_tweets)):
    tweet_no.append(i)
    dept.append('Statistics')
    stats_retweet.append(stats_dept_tweets[i]['public_metrics']['retweet_count'])
    stats_reply.append(stats_dept_tweets[i]['public_metrics']['reply_count'])
    stats_like.append(stats_dept_tweets[i]['public_metrics']['like_count'])
    stats_quote.append(stats_dept_tweets[i]['public_metrics']['quote_count'])
    
stats_twitter = pd.DataFrame({
    'Tweet no': tweet_no,
    'Department': dept,
    'Retweet count': stats_retweet,
    'Reply count': stats_reply,
    'Like count': stats_like,
    'Quote count': stats_quote
})

In [38]:
tweets_stats = pd.concat([acc_twitter, anth_twitter, econ_twitter, econhist_twitter,
         finance_twitter, gender_twitter, geo_twitter, gov_twitter,
         hp_twitter, id_twitter, ih_twitter, ir_twitter, lang_twitter,
         law_twitter, man_twitter, math_twitter, mc_twitter, met_twitter,
         phil_twitter, psych_twitter, sp_twitter, socio_twitter, stats_twitter])

tweets_stats #Dataframe including information about individual tweets

Unnamed: 0,Tweet no,Department,Retweet count,Reply count,Like count,Quote count
0,0,Accounting,0,0,4,0
1,1,Accounting,1,0,0,0
2,2,Accounting,1,0,0,0
3,3,Accounting,2,0,4,0
4,4,Accounting,3,0,0,0
...,...,...,...,...,...,...
95,95,Statistics,0,0,1,0
96,96,Statistics,0,0,0,0
97,97,Statistics,0,0,0,0
98,98,Statistics,0,0,0,0


We have collected all the information about engagements for every tweet for each department. We can sum up the result to visualise the information better. Additionally, we found the follower count of each department's social media. Now that we have both the followers and engagement totals, we can calculate the engagement ratio for each account. For engagement ratio, we can take the total engagements(retweets, replies, likes and quotes) and divide them by followers.

In [39]:
dept_eng = tweets_stats.groupby(['Department'], as_index=False).sum()
dept_eng['Follower count'] = dept_usernames['Follower count']
dept_eng['Tweet count'] = dept_usernames['Tweet count']
dept_eng['Total engagement'] = dept_eng['Retweet count']+dept_eng['Reply count']+dept_eng['Like count']+dept_eng['Quote count']

In [40]:
dept_eng.drop('Tweet no', axis = 1, inplace = True) 
dept_eng['Engagement ratio'] = dept_eng['Total engagement']/dept_eng['Follower count'] #Calculating engagement ratio
dept_eng #Dataframe with aggregate information

Unnamed: 0,Department,Retweet count,Reply count,Like count,Quote count,Follower count,Tweet count,Total engagement,Engagement ratio
0,Accounting,277,1,99,3,2520,495,380,0.150794
1,Anthropology,586,3,306,9,6687,918,904,0.135188
2,Economics,417,7,297,11,35783,10194,732,0.020457
3,Economics History,187,54,351,17,3916,1765,609,0.155516
4,Finance,390,1,40,3,2482,605,434,0.174859
5,Gender Studies,1128,15,398,19,19727,7243,1560,0.079079
6,Geography and Environment,514,12,374,19,12606,5339,919,0.072902
7,Government,335,5,116,4,24781,8673,460,0.018563
8,Health Policy,313,21,396,10,7702,4559,740,0.096079
9,International Development,286,4,163,8,12198,5753,461,0.037793


The last thing we need to collect is information about the mentions of the department's accounts. After extracting this, I will also perform sentiment classification. Once the sentiment classification is complete, we can count how many mentions have positive, neutral, and negative sentiments. 
Knowing the sentiment of an account's tweets are valuable. For instance, if a department has a high engagement rate, but most of the engagements are negative, we can't necessarily conclude that using Twitter is effective for a department if all its responses are negative.

In [41]:
#Retrieving the last 100 mentions for each department

#Accounting department
r = requests.get('https://api.twitter.com/2/users/4900666161/mentions?expansions=author_id&max_results=100', headers=headers)
acc_m = json.loads(r.text)['data']

acc_m_user = []; acc_m_content = []; dept = []
for i in range(0, len(acc_m)):
    dept.append('Accounting') #The department name
    user = acc_m[i]['author_id'] #The user's Twitter ID
    content = acc_m[i]['text'] #Content of the mention
    acc_m_user.append(user)
    acc_m_content.append(content)
    
acc_mentions = pd.DataFrame({
    'User ID': acc_m_user,
    'Department': dept,
    'Tweet content': acc_m_content
})

In [42]:
#Anthropology department
r = requests.get('https://api.twitter.com/2/users/850888387/mentions?expansions=author_id&max_results=100', headers=headers)
anth_m = json.loads(r.text)['data']

anth_m_user = []; anth_m_content = []; dept = []
for i in range(0, len(anth_m)):
    dept.append('Anthropology')
    user = anth_m[i]['author_id']
    content = anth_m[i]['text']
    anth_m_user.append(user)
    anth_m_content.append(content)
    
anth_mentions = pd.DataFrame({
    'User ID': anth_m_user,
    'Department': dept,
    'Tweet content': anth_m_content
})

In [43]:
#Economics department
r = requests.get('https://api.twitter.com/2/users/1200727465/mentions?expansions=author_id&max_results=100', headers=headers)
econ_m = json.loads(r.text)['data']

econ_m_user = []; econ_m_content = []; dept = []
for i in range(0, len(econ_m)):
    dept.append('Economics')
    user = econ_m[i]['author_id']
    content = econ_m[i]['text']
    econ_m_user.append(user)
    econ_m_content.append(content)
    
econ_mentions = pd.DataFrame({
    'User ID': econ_m_user,
    'Department': dept,
    'Tweet content': econ_m_content
})

In [44]:
#Economic History department
r = requests.get('https://api.twitter.com/2/users/224639696/mentions?expansions=author_id&max_results=100', headers=headers)
econhist_m = json.loads(r.text)['data']

econhist_m_user = []; econhist_m_content = []; dept = []
for i in range(0, len(econhist_m)):
    dept.append('Economic History')
    user = econhist_m[i]['author_id']
    content = econhist_m[i]['text']
    econhist_m_user.append(user)
    econhist_m_content.append(content)
    
econhist_mentions = pd.DataFrame({
    'User ID': econhist_m_user,
    'Department': dept,
    'Tweet content': econhist_m_content
})

In [45]:
#Finance Department
r = requests.get('https://api.twitter.com/2/users/972257048/mentions?expansions=author_id&max_results=100', headers=headers)
finance_m = json.loads(r.text)['data']

finance_m_user = []; finance_m_content = []; dept = []
for i in range(0, len(finance_m)):
    dept.append('Finance')
    user = finance_m[i]['author_id']
    content = finance_m[i]['text']
    finance_m_user.append(user)
    finance_m_content.append(content)
    
finance_mentions = pd.DataFrame({
    'User ID': finance_m_user,
    'Department': dept,
    'Tweet content': finance_m_content
})

In [46]:
#Gender Studies Department
r = requests.get('https://api.twitter.com/2/users/189090262/mentions?expansions=author_id&max_results=100', headers=headers)
gender_m = json.loads(r.text)['data']

gender_m_user = []; gender_m_content = []; dept = []
for i in range(0, len(gender_m)):
    dept.append('Gender Studies')
    user = gender_m[i]['author_id']
    content = gender_m[i]['text']
    gender_m_user.append(user)
    gender_m_content.append(content)
    
gender_mentions = pd.DataFrame({
    'User ID': gender_m_user,
    'Department': dept,
    'Tweet content': gender_m_content
})

In [47]:
#Geography department
r = requests.get('https://api.twitter.com/2/users/240262055/mentions?expansions=author_id&max_results=100', headers=headers)
geo_m = json.loads(r.text)['data']

geo_m_user = []; geo_m_content = []; dept = []
for i in range(0, len(geo_m)):
    dept.append('Geography and Evironment')
    user = geo_m[i]['author_id']
    content = geo_m[i]['text']
    geo_m_user.append(user)
    geo_m_content.append(content)
    
geo_mentions = pd.DataFrame({
    'User ID': geo_m_user,
    'Department': dept,
    'Tweet content': geo_m_content
})

In [48]:
#Government Department
r = requests.get('https://api.twitter.com/2/users/303823238/mentions?expansions=author_id&max_results=100', headers=headers)
gov_m = json.loads(r.text)['data']

gov_m_user = []; gov_m_content = []; dept = []
for i in range(0, len(gov_m)):
    dept.append('Government')
    user = gov_m[i]['author_id']
    content = gov_m[i]['text']
    gov_m_user.append(user)
    gov_m_content.append(content)
    
gov_mentions = pd.DataFrame({
    'User ID': gov_m_user,
    'Department': dept,
    'Tweet content': gov_m_content
})

In [49]:
#Health Policy Department
r = requests.get('https://api.twitter.com/2/users/472009727/mentions?expansions=author_id&max_results=100', headers=headers)
hp_m = json.loads(r.text)['data']

hp_m_user = []; hp_m_content = []; dept = []
for i in range(0, len(hp_m)):
    dept.append('Health Policy')
    user = hp_m[i]['author_id']
    content = hp_m[i]['text']
    hp_m_user.append(user)
    hp_m_content.append(content)
    
hp_mentions = pd.DataFrame({
    'User ID': hp_m_user,
    'Department': dept,
    'Tweet content': hp_m_content
})

In [50]:
#International Development Department
r = requests.get('https://api.twitter.com/2/users/317018025/mentions?expansions=author_id&max_results=100', headers=headers)
ID_m = json.loads(r.text)['data']

ID_m_user = []; ID_m_content = []; dept = []
for i in range(0, len(ID_m)):
    dept.append('International Development')
    user = ID_m[i]['author_id']
    content = ID_m[i]['text']
    ID_m_user.append(user)
    ID_m_content.append(content)
    
ID_mentions = pd.DataFrame({
    'User ID': ID_m_user,
    'Department': dept,
    'Tweet content': ID_m_content
})

In [51]:
#International History department
r = requests.get('https://api.twitter.com/2/users/253471591/mentions?expansions=author_id&max_results=100', headers=headers)
ih_m = json.loads(r.text)['data']

ih_m_user = []; ih_m_content = []; dept = []
for i in range(0, len(ih_m)):
    dept.append('International History')
    user = ih_m[i]['author_id']
    content = ih_m[i]['text']
    ih_m_user.append(user)
    ih_m_content.append(content)
    
ih_mentions = pd.DataFrame({
    'User ID': ih_m_user,
    'Department': dept,
    'Tweet content': ih_m_content
})

In [52]:
#International Relations department
r = requests.get('https://api.twitter.com/2/users/237225532/mentions?expansions=author_id&max_results=100', headers=headers)
ir_m = json.loads(r.text)['data']

ir_m_user = []; ir_m_content = []; dept = []
for i in range(0, len(ir_m)):
    dept.append('International Relations')
    user = ir_m[i]['author_id']
    content = ir_m[i]['text']
    ir_m_user.append(user)
    ir_m_content.append(content)
    
ir_mentions = pd.DataFrame({
    'User ID': ir_m_user,
    'Department': dept,
    'Tweet content': ir_m_content
})

In [53]:
#Language Centre Department
r = requests.get('https://api.twitter.com/2/users/179888345/mentions?expansions=author_id&max_results=100', headers=headers)
lang_m = json.loads(r.text)['data']

lang_m_user = []; lang_m_content = []; dept = []
for i in range(0, len(lang_m)):
    dept.append('Language Centre')
    user = lang_m[i]['author_id']
    content = lang_m[i]['text']
    lang_m_user.append(user)
    lang_m_content.append(content)
    
lang_mentions = pd.DataFrame({
    'User ID': lang_m_user,
    'Department': dept,
    'Tweet content': lang_m_content
})

In [54]:
#Law Department
r = requests.get('https://api.twitter.com/2/users/532172035/mentions?expansions=author_id&max_results=100', headers=headers)
law_m = json.loads(r.text)['data']

law_m_user = []; law_m_content = []; dept = []
for i in range(0, len(law_m)):
    dept.append('LSE Law School')
    user = law_m[i]['author_id']
    content = law_m[i]['text']
    law_m_user.append(user)
    law_m_content.append(content)
    
law_mentions = pd.DataFrame({
    'User ID': law_m_user,
    'Department': dept,
    'Tweet content': law_m_content
})

In [55]:
#Management Department
r = requests.get('https://api.twitter.com/2/users/26465977/mentions?expansions=author_id&max_results=100', headers=headers)
man_m = json.loads(r.text)['data']

man_m_user = []; man_m_content = []; dept = []
for i in range(0, len(man_m)):
    dept.append('Management')
    user = man_m[i]['author_id']
    content = man_m[i]['text']
    man_m_user.append(user)
    man_m_content.append(content)
    
man_mentions = pd.DataFrame({
    'User ID': man_m_user,
    'Department': dept,
    'Tweet content': man_m_content
})

In [56]:
#Mathematics Department
r = requests.get('https://api.twitter.com/2/users/3044880371/mentions?expansions=author_id&max_results=100', headers=headers)
math_m = json.loads(r.text)['data']

math_m_user = []; math_m_content = []; dept = []
for i in range(0, len(math_m)):
    dept.append('Mathematics')
    user = math_m[i]['author_id']
    content = math_m[i]['text']
    math_m_user.append(user)
    math_m_content.append(content)
    
math_mentions = pd.DataFrame({
    'User ID': math_m_user,
    'Department': dept,
    'Tweet content': math_m_content
})

In [57]:
#Media and Communications Department
r = requests.get('https://api.twitter.com/2/users/207534677/mentions?expansions=author_id&max_results=100', headers=headers)
mc_m = json.loads(r.text)['data']

mc_m_user = []; mc_m_content = []; dept = []
for i in range(0, len(mc_m)):
    dept.append('Media and Communications')
    user = mc_m[i]['author_id']
    content = mc_m[i]['text']
    mc_m_user.append(user)
    mc_m_content.append(content)
    
mc_mentions = pd.DataFrame({
    'User ID': mc_m_user,
    'Department': dept,
    'Tweet content': mc_m_content
})

In [58]:
#Methodology Department
r = requests.get('https://api.twitter.com/2/users/86921024/mentions?expansions=author_id&max_results=100', headers=headers)
met_m = json.loads(r.text)['data']

met_m_user = []; met_m_content = []; dept = []
for i in range(0, len(met_m)):
    dept.append('Methodology')
    user = met_m[i]['author_id']
    content = met_m[i]['text']
    met_m_user.append(user)
    met_m_content.append(content)
    
met_mentions = pd.DataFrame({
    'User ID': met_m_user,
    'Department': dept,
    'Tweet content': met_m_content
})

In [59]:
#Philosophy, Logic and Scientific Method Department
r = requests.get('https://api.twitter.com/2/users/904251031/mentions?expansions=author_id&max_results=100', headers=headers)
phil_m = json.loads(r.text)['data']

phil_m_user = []; phil_m_content = []; dept = []
for i in range(0, len(phil_m)):
    dept.append('Philosophy, Logic and Scientific Method')
    user = phil_m[i]['author_id']
    content = phil_m[i]['text']
    phil_m_user.append(user)
    phil_m_content.append(content)
    
phil_mentions = pd.DataFrame({
    'User ID': phil_m_user,
    'Department': dept,
    'Tweet content': phil_m_content
})

In [60]:
#Psychological and Behavioural Science Department
r = requests.get('https://api.twitter.com/2/users/1965000560/mentions?expansions=author_id&max_results=100', headers=headers)
psych_m = json.loads(r.text)['data']

psych_m_user = []; psych_m_content = []; dept = []
for i in range(0, len(psych_m)):
    dept.append('Psychological and Behavioural Science')
    user = psych_m[i]['author_id']
    content = psych_m[i]['text']
    psych_m_user.append(user)
    psych_m_content.append(content)
    
psych_mentions = pd.DataFrame({
    'User ID': psych_m_user,
    'Department': dept,
    'Tweet content': psych_m_content
})

In [61]:
#Social Policy Department
r = requests.get('https://api.twitter.com/2/users/2472172578/mentions?expansions=author_id&max_results=100', headers=headers)
sp_m = json.loads(r.text)['data']

sp_m_user = []; sp_m_content = []; dept = []
for i in range(0, len(sp_m)):
    dept.append('Social Policy')
    user = sp_m[i]['author_id']
    content = sp_m[i]['text']
    sp_m_user.append(user)
    sp_m_content.append(content)
    
sp_mentions = pd.DataFrame({
    'User ID': sp_m_user,
    'Department': dept,
    'Tweet content': sp_m_content
})

In [62]:
#Sociology Department
r = requests.get('https://api.twitter.com/2/users/1671486960/mentions?expansions=author_id&max_results=100', headers=headers)
socio_m = json.loads(r.text)['data']

socio_m_user = []; socio_m_content = []; dept = []
for i in range(0, len(socio_m)):
    dept.append('Sociology')
    user = socio_m[i]['author_id']
    content = socio_m[i]['text']
    socio_m_user.append(user)
    socio_m_content.append(content)
    
socio_mentions = pd.DataFrame({
    'User ID': socio_m_user,
    'Department': dept,
    'Tweet content': socio_m_content
})

In [63]:
#Statistics Department
r = requests.get('https://api.twitter.com/2/users/420282103/mentions?expansions=author_id&max_results=100', headers=headers)
stats_m = json.loads(r.text)['data']

stats_m_user = []; stats_m_content = []; dept = []
for i in range(0, len(stats_m)):
    dept.append('Statistics')
    user = stats_m[i]['author_id']
    content = stats_m[i]['text']
    stats_m_user.append(user)
    stats_m_content.append(content)
    
stats_mentions = pd.DataFrame({
    'User ID': stats_m_user,
    'Department': dept,
    'Tweet content': stats_m_content
})

In [64]:
mentions = pd.concat([acc_mentions, anth_mentions, econ_mentions, econhist_mentions,
         finance_mentions, gender_mentions, geo_mentions, gov_mentions,
         hp_mentions, ID_mentions, ih_mentions, ir_mentions, lang_mentions,
         law_mentions, man_mentions, math_mentions, mc_mentions, met_mentions,
         phil_mentions, psych_mentions, sp_mentions, socio_mentions, stats_mentions], ignore_index=True)

mentions #Dataframe with information about all the department's mentions

Unnamed: 0,User ID,Department,Tweet content
0,4900666161,Accounting,@LSE_Accounting in great company. https://t.co...
1,1898598396,Accounting,@LSE_Accounting in good company here!\n\nhttps...
2,1898598396,Accounting,A great opportunity for a range of #research p...
3,1456252200825597963,Accounting,Develop a financial and managerial accounting ...
4,21861323,Accounting,Join our MRes/PhD in Accounting information se...
...,...,...,...
2295,68416219,Statistics,@FlorianFoos @LSEGovernment @LSEDataScience @M...
2296,1174697132651102208,Statistics,@melissaleesands @LSEnews @LSEGovernment @Meth...
2297,68416219,Statistics,Know a secondary school / high school student ...
2298,740657209,Statistics,Very excited that @LSEGovernment is launching ...


To determine the sentiment of the mentions, I'll be using TextBlob. There are two handy tools in TextBlob: Subjectivity and polarity. Subjectivity tells us what kind of mention this tweet is. For example, the more subjective a tweet is, the more opinionated or emotion heavy it is. On the other hand, a more objective tweet contains more facts. The polarity score is more important at the moment. I'll be making use of it to classify the sentiments. Polarity operates on a scale of [-1, 1], where -1 represents a negative tweet, and 1 represents a positive tweet.

In [65]:
#Sentiment analysis code using TextBlob
from textblob import TextBlob

#Subjectivity function
def Subjectivity(tweet):
    return TextBlob(tweet).sentiment.subjectivity
    
#Polarity function
def Polarity(tweet):
    return TextBlob(tweet).sentiment.polarity

In [66]:
mentions['Subjectivity'] = mentions['Tweet content'].apply(Subjectivity) #Calculating the subjectivity score
mentions['Polarity'] = mentions['Tweet content'].apply(Polarity) #Calculating the polarity score

In [67]:
#Classifying the polarity scores 
def sentiment_analysis(score):
    if score < 0:
        return 'Negative'
    elif score == 0 :
        return 'Neutral'
    elif score > 0:
        return 'Positive'
    
mentions['Sentiment'] = mentions['Polarity'].apply(sentiment_analysis)

In [68]:
mentions

Unnamed: 0,User ID,Department,Tweet content,Subjectivity,Polarity,Sentiment
0,4900666161,Accounting,@LSE_Accounting in great company. https://t.co...,0.750000,0.800000,Positive
1,1898598396,Accounting,@LSE_Accounting in good company here!\n\nhttps...,0.600000,0.875000,Positive
2,1898598396,Accounting,A great opportunity for a range of #research p...,0.375000,0.400000,Positive
3,1456252200825597963,Accounting,Develop a financial and managerial accounting ...,0.000000,0.000000,Neutral
4,21861323,Accounting,Join our MRes/PhD in Accounting information se...,0.400000,0.000000,Neutral
...,...,...,...,...,...,...
2295,68416219,Statistics,@FlorianFoos @LSEGovernment @LSEDataScience @M...,0.000000,0.000000,Neutral
2296,1174697132651102208,Statistics,@melissaleesands @LSEnews @LSEGovernment @Meth...,0.000000,0.000000,Neutral
2297,68416219,Statistics,Know a secondary school / high school student ...,0.448636,0.061591,Positive
2298,740657209,Statistics,Very excited that @LSEGovernment is launching ...,0.640909,0.319773,Positive


In [69]:
#Creating dummy variables for positive, neutral, and negative so we can count these classes
def positive(text):
    if text == 'Positive':
        return 1
    else:
        return 0
    
def neutral(text):
    if text == 'Neutral':
        return 1
    else:
        return 0
    
def negative(text):
    if text == 'Negative':
        return 1
    else:
        return 0
    
mentions['Positive'] = mentions['Sentiment'].apply(positive)
mentions['Neutral'] = mentions['Sentiment'].apply(neutral)
mentions['Negative'] = mentions['Sentiment'].apply(negative)

In [70]:
mentions_sum = mentions.groupby(['Department'], as_index=False).sum()
mentions_sum #Dataframe with information about the total number of positive, neutral and negative mentions

Unnamed: 0,Department,Subjectivity,Polarity,Positive,Neutral,Negative
0,Accounting,25.345229,11.531351,35,57,8
1,Anthropology,37.999137,19.24928,56,37,7
2,Economic History,27.835433,16.831492,48,44,8
3,Economics,28.546223,11.574058,55,35,10
4,Finance,33.469503,19.475427,51,42,7
5,Gender Studies,35.360246,20.252803,51,42,7
6,Geography and Evironment,29.212998,17.007926,54,41,5
7,Government,30.786153,21.12461,59,40,1
8,Health Policy,22.824255,20.145429,38,61,1
9,International Development,30.577266,10.7278,41,43,16


In [71]:
#Adding the totals from above to the main dataset dept_eng
dept_eng['Positive mentions'] = mentions_sum['Positive']
dept_eng['Neutral mentions'] = mentions_sum['Neutral']
dept_eng['Negative mentions'] =  mentions_sum['Negative']
dept_eng

Unnamed: 0,Department,Retweet count,Reply count,Like count,Quote count,Follower count,Tweet count,Total engagement,Engagement ratio,Positive mentions,Neutral mentions,Negative mentions
0,Accounting,277,1,99,3,2520,495,380,0.150794,35,57,8
1,Anthropology,586,3,306,9,6687,918,904,0.135188,56,37,7
2,Economics,417,7,297,11,35783,10194,732,0.020457,48,44,8
3,Economics History,187,54,351,17,3916,1765,609,0.155516,55,35,10
4,Finance,390,1,40,3,2482,605,434,0.174859,51,42,7
5,Gender Studies,1128,15,398,19,19727,7243,1560,0.079079,51,42,7
6,Geography and Environment,514,12,374,19,12606,5339,919,0.072902,54,41,5
7,Government,335,5,116,4,24781,8673,460,0.018563,59,40,1
8,Health Policy,313,21,396,10,7702,4559,740,0.096079,38,61,1
9,International Development,286,4,163,8,12198,5753,461,0.037793,41,43,16


In [72]:
#Saving the datasets
dept_eng.to_csv('data/dept_eng.csv')
mentions.to_csv('data/mentions.csv')
tweets_stats.to_csv('data/tweets_stats.csv')

[This code was last run on 25th April 2022]