First, we will collect the tweets from this academic year for each LSE department, starting with the list of departments scraped from the LSE website.

In [1]:
from bs4 import BeautifulSoup
import requests
import json

In [2]:
url = "https://info.lse.ac.uk/staff/departments-and-institutes"
response = requests.get(url)
soup = BeautifulSoup(response.content, "lxml")

In [3]:
raw_list = soup.find_all('ul')

In [4]:
dept = []

for tag in raw_list:
    list_el = tag.text.strip()
    dept.append(list_el)

In [5]:
depts = dept[2:]
depts

['Department of Accounting\nDepartment of Anthropology\nData Science Institute\nDepartment of Economics\nDepartment of Economic History\nEuropean Institute\nDepartment of Finance\nFiroz Lalji Institute for Africa\nDepartment of Gender Studies\nDepartment of Geography and Environment\nDepartment of Government\nDepartment of Health Policy\nDepartment of International Development\nDepartment of International History\nInternational Inequalities Institute\nDepartment of International Relations\nLanguage Centre\nLSE Law School\nDepartment of Management\nMarshall Institute\nDepartment of Mathematics\nDepartment of Media and Communications\nDepartment of Methodology\nDepartment of Philosophy, Logic and Scientific Method\nDepartment of Psychological and Behavioural Science\nSchool of Public Policy (formerly Institute of Public Affairs)\nDepartment of Social Policy\nDepartment of Sociology\nDepartment of Statistics']

In [6]:
dept_list = depts[0].split("\n")
dept_list

['Department of Accounting',
 'Department of Anthropology',
 'Data Science Institute',
 'Department of Economics',
 'Department of Economic History',
 'European Institute',
 'Department of Finance',
 'Firoz Lalji Institute for Africa',
 'Department of Gender Studies',
 'Department of Geography and Environment',
 'Department of Government',
 'Department of Health Policy',
 'Department of International Development',
 'Department of International History',
 'International Inequalities Institute',
 'Department of International Relations',
 'Language Centre',
 'LSE Law School',
 'Department of Management',
 'Marshall Institute',
 'Department of Mathematics',
 'Department of Media and Communications',
 'Department of Methodology',
 'Department of Philosophy, Logic and Scientific Method',
 'Department of Psychological and Behavioural Science',
 'School of Public Policy (formerly Institute of Public Affairs)',
 'Department of Social Policy',
 'Department of Sociology',
 'Department of Statisti

In [7]:
#Remove all elements in the list with the word Institute as we are looking only at Departments
indices = [dept_list.index("Data Science Institute"), dept_list.index("European Institute"), 
           dept_list.index("Firoz Lalji Institute for Africa"), dept_list.index("International Inequalities Institute"),
           dept_list.index("Marshall Institute"), 
           dept_list.index("School of Public Policy (formerly Institute of Public Affairs)")]

In [8]:
dept_list = [i for j, i in enumerate(dept_list) if j not in indices]
dept_list

['Department of Accounting',
 'Department of Anthropology',
 'Department of Economics',
 'Department of Economic History',
 'Department of Finance',
 'Department of Gender Studies',
 'Department of Geography and Environment',
 'Department of Government',
 'Department of Health Policy',
 'Department of International Development',
 'Department of International History',
 'Department of International Relations',
 'Language Centre',
 'LSE Law School',
 'Department of Management',
 'Department of Mathematics',
 'Department of Media and Communications',
 'Department of Methodology',
 'Department of Philosophy, Logic and Scientific Method',
 'Department of Psychological and Behavioural Science',
 'Department of Social Policy',
 'Department of Sociology',
 'Department of Statistics']
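The two cells above drop the institutes by looking up each name's index. A minimal sketch of the same filter done by substring, assuming every entry to exclude contains the word "Institute" (which holds for the list printed above, including the School of Public Policy entry). The names `drop_institutes` and `sample` are illustrative and not part of the notebook:

```python
def drop_institutes(names):
    """Keep only entries whose name does not contain the word 'Institute'."""
    return [name for name in names if 'Institute' not in name]

# Shortened sample of the scraped list, for illustration only
sample = [
    'Department of Accounting',
    'Data Science Institute',
    'Language Centre',
    'School of Public Policy (formerly Institute of Public Affairs)',
    'Department of Statistics',
]

kept = drop_institutes(sample)
print(kept)
```

This avoids hard-coding each institute name, at the cost of also dropping any future department whose name happens to contain "Institute".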

In [9]:
#There is no compiled list so I've added this in manually
usernames = ['LSE_Accounting', 'LSEAnthropology', 'LSEEcon',
            'LSEEcHist', 'LSEfinance', 'LSEGenderTweet', 'LSEGeography',
            'LSEGovernment', 'LSEHealthPolicy', 'LSE_ID', 'lsehistory',
            'LSEIRDept', 'lselangcentre', 'LSELaw', 'LSEManagement', 'LSEMaths',
            'MediaLSE', 'MethodologyLSE', 'LSEPhilosophy', 'LSEBehavioural', 
            'LSESocialPolicy', 'LSEsociology', 'LSEStatistics']

In [10]:
import pandas as pd

dept_usernames = pd.DataFrame({
    'Department': dept_list,
    'Twitter username': usernames
})

dept_usernames['Department'] = dept_usernames['Department'].str.replace('Department of ', '')
dept_usernames

Unnamed: 0,Department,Twitter username
0,Accounting,LSE_Accounting
1,Anthropology,LSEAnthropology
2,Economics,LSEEcon
3,Economic History,LSEEcHist
4,Finance,LSEfinance
5,Gender Studies,LSEGenderTweet
6,Geography and Environment,LSEGeography
7,Government,LSEGovernment
8,Health Policy,LSEHealthPolicy
9,International Development,LSE_ID


Next, we identify which tweet metrics to compare across departments. We want to find out which department has the best overall engagement and which factors contribute to it. The first step is to measure overall engagement, in favorites and views, for regular posts and for retweets. We will then look at comments (which may be limited) and at follower counts. All of the analysis goes back only 90 days because this is the limit Twitter allows.
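One way to express the 90-day window is through the `start_time` query parameter of the user Tweet timeline endpoint, which takes an ISO 8601 UTC timestamp. The sketch below only builds the parameter. The helper name `window_start` and the fixed reference date are assumptions made for illustration:

```python
from datetime import datetime, timedelta, timezone

def window_start(days=90, now=None):
    """Return an ISO 8601 UTC timestamp `days` before `now` (default: current time)."""
    now = now or datetime.now(timezone.utc)
    return (now - timedelta(days=days)).strftime('%Y-%m-%dT%H:%M:%SZ')

# Fixed reference time so the example is reproducible
reference = datetime(2022, 3, 31, 12, 0, 0, tzinfo=timezone.utc)
params = {'start_time': window_start(90, now=reference), 'max_results': 100}
print(params)
```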

For the comments, we also have to classify the sentiment ourselves; we will use sentiment analysis for this.
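The notebook does not say which sentiment method will be used. As a rough illustration of the idea only, here is a toy lexicon-based classifier. The word lists and the example replies are made up, and a real analysis would more likely rely on an established sentiment lexicon or model:

```python
import re

# Tiny illustrative word lists, not a real sentiment lexicon
POSITIVE = {'congratulations', 'great', 'excellent', 'thanks', 'love'}
NEGATIVE = {'bad', 'disappointing', 'poor', 'terrible', 'hate'}

def classify_sentiment(text):
    """Label text 'positive', 'negative' or 'neutral' by counting lexicon hits."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return 'positive'
    if score < 0:
        return 'negative'
    return 'neutral'

# Made-up replies, not real data
replies = ['Congratulations, great news!',
           'This was a disappointing event.',
           'See you on 6 April']
labels = [classify_sentiment(reply) for reply in replies]
print(labels)
```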

In [11]:
with open('keys.json') as f:
    keys = json.load(f)

bearer_token = keys['twitter']['bearer_token']
headers = {
    'Authorization': f"Bearer {bearer_token}"
}

In [14]:
r = requests.get('https://api.twitter.com/2/users/by?usernames=LSE_Accounting,LSEAnthropology,LSEEcon,LSEEcHist,LSEfinance,LSEGenderTweet,LSEGeography,LSEGovernment,LSEHealthPolicy,LSE_ID,lsehistory,LSEIRDept,lselangcentre,LSELaw,LSEManagement,LSEMaths,MediaLSE,MethodologyLSE,LSEPhilosophy,LSEBehavioural,LSESocialPolicy,LSEsociology,LSEStatistics', headers=headers)
r.text

'{"data":[{"id":"4900666161","name":"LSE Accounting","username":"LSE_Accounting"},{"id":"850888387","name":"LSE Anthropology","username":"LSEAnthropology"},{"id":"1200727465","name":"LSE Department of Economics","username":"LSEEcon"},{"id":"224639696","name":"LSE Economic History","username":"LSEEcHist"},{"id":"972257048","name":"LSE Finance","username":"LSEfinance"},{"id":"189090262","name":"LSE Gender","username":"LSEGenderTweet"},{"id":"240262055","name":"LSE Geography & Environment","username":"LSEGeography"},{"id":"303823238","name":"LSE Government","username":"LSEGovernment"},{"id":"472009727","name":"LSE Health Policy","username":"LSEHealthPolicy"},{"id":"317018025","name":"LSE International Development","username":"LSE_ID"},{"id":"253471591","name":"LSE International History","username":"lsehistory"},{"id":"237225532","name":"LSE Intl Relations","username":"LSEIRDept"},{"id":"179888345","name":"LSE Language Centre","username":"lselangcentre"},{"id":"532172035","name":"LSE Law S
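The request above hard-codes every username in the URL. A small sketch of building the same URL from a list instead, assuming the list holds between 1 and 100 names (the users lookup endpoint accepts up to 100 usernames per request). `user_lookup_url` is a hypothetical helper, not part of the notebook:

```python
from urllib.parse import urlencode

USERS_BY_URL = 'https://api.twitter.com/2/users/by'

def user_lookup_url(usernames):
    """Build the lookup URL for a list of usernames (1 to 100 per request)."""
    if not 1 <= len(usernames) <= 100:
        raise ValueError('expected between 1 and 100 usernames')
    query = urlencode({'usernames': ','.join(usernames)}, safe=',')
    return USERS_BY_URL + '?' + query

url = user_lookup_url(['LSE_Accounting', 'LSEAnthropology', 'LSEEcon'])
print(url)
```

The resulting string can be passed to `requests.get(url, headers=headers)` in the same way as the hard-coded one.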

In [30]:
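ids = json.loads(r.text)
ids = ids['data']

# Match IDs to rows by username (case-insensitive) instead of relying on response order
id_by_username = {user['username'].lower(): user['id'] for user in ids}

dept_ids = dept_usernames['Twitter username'].str.lower().map(id_by_username).tolist()
dept_usernames['Twitter ID'] = dept_ids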

In [31]:
dept_usernames

Unnamed: 0,Department,Twitter username,Twitter ID
0,Accounting,LSE_Accounting,4900666161
1,Anthropology,LSEAnthropology,850888387
2,Economics,LSEEcon,1200727465
3,Economic History,LSEEcHist,224639696
4,Finance,LSEfinance,972257048
5,Gender Studies,LSEGenderTweet,189090262
6,Geography and Environment,LSEGeography,240262055
7,Government,LSEGovernment,303823238
8,Health Policy,LSEHealthPolicy,472009727
9,International Development,LSE_ID,317018025


In [106]:
r = requests.get('https://api.twitter.com/2/users/4900666161/tweets', headers=headers)

In [108]:
ad = json.loads(r.text)
ad = ad['data']
ad

[{'id': '1506004918016094208',
  'text': 'RT @OnlineLSE: Develop a financial and managerial accounting toolkit to inform business decision-making and enhance organisational performa…'},
 {'id': '1502055596488540164',
  'text': 'RT @StudyLSE: Our in-person LSE Open Day is taking place on 6 April 2022!\n\nBookings are now open for all our events with academic departmen…'},
 {'id': '1501594873509691393',
  'text': 'The second edition of the Erasmus Accounting Workshop @ErasmusAccount will be held 18 March with Dr Aneesh Raghunandan looking at gender pay gap misreporting. #accountingworkshop #corporatedisclosure #GenderPayGap #misreporting #CSR https://t.co/GvDKYY3bx3'},
 {'id': '1498763109401612296',
  'text': 'RT @StudyLSE: Our in-person LSE Open Day is taking place on 6 April 2022!\n\nBookings are now open for all our events with academic departmen…'},
 {'id': '1498762781457321986',
  'text': 'RT @StudyLSE: Join our MRes/PhD in Accounting information session for prospective students on 
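A single call to the timeline endpoint returns only one page of tweets. Further pages are reached by passing the `next_token` from the response's `meta` object back as `pagination_token`. The sketch below shows that loop with the HTTP call abstracted away. `fetch_page` is a hypothetical callable and the two fake pages stand in for real responses:

```python
def collect_timeline(fetch_page, max_pages=10):
    """Gather tweets across pages.

    `fetch_page(token)` performs one request (token is None for the first
    page) and returns the decoded JSON body as a dict.
    """
    tweets, token = [], None
    for _ in range(max_pages):
        body = fetch_page(token)
        tweets.extend(body.get('data', []))
        token = body.get('meta', {}).get('next_token')
        if token is None:
            break
    return tweets

# Stand-in for the API: two fake pages keyed by pagination token
fake_pages = {
    None: {'data': [{'id': '1', 'text': 'a'}], 'meta': {'next_token': 'abc'}},
    'abc': {'data': [{'id': '2', 'text': 'b'}], 'meta': {}},
}
all_tweets = collect_timeline(lambda token: fake_pages[token])
print([tweet['id'] for tweet in all_tweets])
```

With real requests, `fetch_page` could wrap `requests.get(url, headers=headers, params={'max_results': 100, 'pagination_token': token}).json()`, since `requests` leaves out parameters whose value is `None`.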

In [15]:
#Experiment 1 - Department of Accounting

In [83]:
# Pull the ID and text out of each tweet and label every row with the department
ad_tweet_ids = [tweet['id'] for tweet in ad]
ad_tweets = [tweet['text'] for tweet in ad]
departments = ['Accounting'] * len(ad)

In [85]:
acc_dept = pd.DataFrame({
    'Tweet ID': ad_tweet_ids,
    'Department': departments,
    'Tweets': ad_tweets
})

acc_dept

Unnamed: 0,Tweet ID,Department,Tweets
0,1506004918016094208,Accounting,RT @OnlineLSE: Develop a financial and manager...
1,1502055596488540164,Accounting,RT @StudyLSE: Our in-person LSE Open Day is ta...
2,1501594873509691393,Accounting,The second edition of the Erasmus Accounting W...
3,1498763109401612296,Accounting,RT @StudyLSE: Our in-person LSE Open Day is ta...
4,1498762781457321986,Accounting,RT @StudyLSE: Join our MRes/PhD in Accounting ...
5,1493552844854673416,Accounting,Congratulations to Prof. Alnoor Bhimani - awar...
6,1491135328870342656,Accounting,Today's INSIGHTS @LSE_Accounting welcomed Dr S...
7,1488887891732357126,Accounting,Congratulations to #LSEAccounting alum Musa Af...
8,1488833208221413378,Accounting,RT @LSESummerSchool: 🤔 Do you have questions a...
9,1488519026020171784,Accounting,As we enter the Year of the Tiger we would lik...


Possible graphs to draw:
* Departments and their retweet count (over the month)
* Departments and their like count (over the month)
* Departments and their views count (over the month)
* Retweet engagements vs normal tweet engagements (over the month)
* Overall following count vs follower to engagement ratios (over the month)
* Following growth vs comment sentiment (over the month)
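Several of the graphs listed above need engagement averaged per department and per tweet type first. A minimal sketch of that aggregation in plain Python follows. The row layout (`department`, `type`, `likes`) and the numbers are assumptions for illustration, not real engagement data:

```python
from collections import defaultdict

def mean_likes_by_type(rows):
    """Average like count per (department, tweet type) pair."""
    totals = defaultdict(lambda: [0, 0])  # key -> [sum of likes, number of tweets]
    for row in rows:
        key = (row['department'], row['type'])
        totals[key][0] += row['likes']
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}

# Made-up numbers for illustration only
rows = [
    {'department': 'Accounting', 'type': 'Post', 'likes': 4},
    {'department': 'Accounting', 'type': 'Post', 'likes': 8},
    {'department': 'Accounting', 'type': 'Retweet', 'likes': 0},
]
averages = mean_likes_by_type(rows)
print(averages)
```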

In [110]:
#Classifying posts into retweets and regular posts
# Retweets returned by the API start with 'RT @'; checking for the full prefix
# avoids mislabelling a regular post that merely begins with the letters 'RT'
is_retweet = acc_dept['Tweets'].str.strip().str.startswith('RT @')

acc_dept['Tweet type'] = is_retweet.map({True: 'Retweet', False: 'Post'})

Parameters needed: Impressions, retweets, likes, replies, user profile clicks

In [118]:
r = requests.get('https://api.twitter.com/2/tweets?ids=1204084171334832128&impression_count,retweet_count,like_count,reply_count,user_profile_clicks&media.fields=non_public_metrics', headers=headers)

In [119]:
tweet_test = json.loads(r.text)

In [120]:
tweet_test

{'errors': [{'parameters': {'impression_count,retweet_count,like_count,reply_count,user_profile_clicks': ['']},
   'message': 'The query parameter [impression_count,retweet_count,like_count,reply_count,user_profile_clicks] is not one of [ids,expansions,tweet.fields,media.fields,poll.fields,place.fields,user.fields]'}],
 'title': 'Invalid Request',
 'detail': 'One or more parameters to your request was invalid.',
 'type': 'https://api.twitter.com/2/problems/invalid-request'}
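The error message lists the accepted query parameters, which suggests the metric names cannot be passed as a parameter of their own and have to be requested through `tweet.fields` instead. The sketch below asks for `public_metrics` (retweet, reply, like and quote counts). As far as the API documentation describes, impressions and user profile clicks sit under `non_public_metrics`, which needs user-context authentication for the account that owns the tweet, so a bearer token alone may not return them. The helper names and the sample response are illustrative, and the counts are made up:

```python
def metrics_params(tweet_ids):
    """Query parameters asking for the public engagement counts of the given tweets."""
    return {'ids': ','.join(tweet_ids), 'tweet.fields': 'public_metrics'}

def extract_metrics(body):
    """Flatten a tweet lookup response into one dict per tweet."""
    rows = []
    for tweet in body.get('data', []):
        metrics = tweet.get('public_metrics', {})
        rows.append({
            'Tweet ID': tweet['id'],
            'Retweet count': metrics.get('retweet_count'),
            'Comments': metrics.get('reply_count'),
            'Favorites': metrics.get('like_count'),
        })
    return rows

params = metrics_params(['1204084171334832128'])

# Hand-written response in the documented shape; the counts are made up
sample_body = {'data': [{
    'id': '1204084171334832128',
    'text': '...',
    'public_metrics': {'retweet_count': 2, 'reply_count': 1,
                       'like_count': 5, 'quote_count': 0},
}]}
metric_rows = extract_metrics(sample_body)
print(params)
print(metric_rows)
```

The parameters can be sent with `requests.get('https://api.twitter.com/2/tweets', headers=headers, params=params)`.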

In [69]:
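# Target layout for the per-tweet table. The metric columns are placeholders
# (None) until the engagement counts have been retrieved from the API.
acc_dept = pd.DataFrame({
    'Tweet ID': ad_tweet_ids,
    'Tweets': ad_tweets,
    'Views': None,
    'Favorites': None,
    'Comments': None,
    'Retweet count': None
})

acc_dept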

In [None]:
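# Target layout for the per-department summary. The metric columns are
# placeholders (None) until the averages have been computed.
dept_df = pd.DataFrame({
    'Department': dept_list,
    'Twitter username': usernames,
    'Follow count': None,
    'Average views': None,
    'Average likes': None,
    'Average replies': None,
    'Average retweet': None
})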
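Once per-tweet counts are available, each row of the department summary can be filled by averaging them. A minimal sketch, assuming the per-tweet rows use the column names `Favorites`, `Comments` and `Retweet count` from the table layout above. `summarise` is a hypothetical helper and the counts are made up:

```python
from statistics import mean

def summarise(department, username, tweet_rows):
    """Average the per-tweet counts for one department into a single summary row."""
    return {
        'Department': department,
        'Twitter username': username,
        'Average likes': mean(row['Favorites'] for row in tweet_rows),
        'Average replies': mean(row['Comments'] for row in tweet_rows),
        'Average retweet': mean(row['Retweet count'] for row in tweet_rows),
    }

# Made-up counts for illustration only
tweet_rows = [
    {'Favorites': 5, 'Comments': 1, 'Retweet count': 2},
    {'Favorites': 3, 'Comments': 0, 'Retweet count': 2},
]
summary = summarise('Accounting', 'LSE_Accounting', tweet_rows)
print(summary)
```

A list of such rows, one per department, can be passed to `pd.DataFrame` to build the summary table. Note that `mean` raises an error for a department with no tweets, so empty cases would need handling separately.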