Given how most events moved online during COVID-19, a strong digital presence is a major advantage for organisations. The main question this project addresses is which departments at the LSE have the strongest digital presence, and whether Twitter is a social media platform worth investing in. This is an academic project for ST115.
For this project, the dataset is derived from the departments' Twitter accounts via API calls. The majority of the code extracts the tweets, cleans them up, and classifies their sentiment using TextBlob. After building the dataset, I analysed the data using packages such as NetworkX, Seaborn, and Matplotlib.
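As a rough sketch of the sentiment-classification step: TextBlob returns a polarity score in [-1, 1], which can be mapped to coarse labels. The threshold values below (±0.05) are my own illustrative assumption, not a TextBlob convention, and the tweet text is made up.

```python
# Sketch of the sentiment step. The +-0.05 thresholds are an
# illustrative assumption, not a TextBlob convention.
try:
    from textblob import TextBlob  # third-party: pip install textblob
except ImportError:
    TextBlob = None

def label_sentiment(polarity):
    """Map a polarity score in [-1, 1] to a coarse sentiment label."""
    if polarity > 0.05:
        return "positive"
    if polarity < -0.05:
        return "negative"
    return "neutral"

if TextBlob is not None:
    tweet = "Congratulations to our graduating students!"
    polarity = TextBlob(tweet).sentiment.polarity  # float in [-1, 1]
    print(label_sentiment(polarity))
```

In the project the label would be stored as a new column alongside each cleaned tweet.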
- Make sure you have the following Python packages installed: pandas, NumPy, Seaborn, Matplotlib, BeautifulSoup (beautifulsoup4), Requests, TextBlob and NetworkX (the json module ships with Python's standard library).
- Make sure you have a Twitter Developer account with Elevated access
- Update the keys.json file with your own access token
- Run the data collection code first
- Finally, run the main code
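The setup steps above rely on a keys.json file holding your credentials. A minimal sketch of reading it, assuming a layout like `{"bearer_token": "..."}` (the field name is my assumption, so adjust it to match your own file):

```python
import json

def load_bearer_token(path="keys.json"):
    """Read the bearer token from keys.json.

    Assumes a layout like {"bearer_token": "..."} -- the key name is
    an assumption; change it to match your own keys.json.
    """
    with open(path) as f:
        keys = json.load(f)
    return keys["bearer_token"]

def auth_headers(token):
    """Build the Authorization header Twitter's API expects."""
    return {"Authorization": f"Bearer {token}"}
```

The returned headers would then be passed to each Requests call against the Twitter API.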
- Analysing non-public metrics (unavailable due to my Twitter Developer account access level)
- Analysing tweets over the years instead of the last 100 tweets
- Comparing analysis with other social media platforms like Facebook and Instagram
- Comparing analysis with other university departments on Twitter
- Efficiency of API calls
I learned a lot about API calls on Twitter, but I would now look into making the code more efficient. I will try to find a way to loop through the API calls without a higher developer account access tier. However, if I had higher access, non-public metrics like impression counts and profile views would tremendously strengthen this analysis.
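The looping I have in mind follows the Twitter API v2 pagination pattern, where each response carries a next_token for the following page. A generic sketch of that loop, with a hypothetical fake_fetch standing in for the real HTTP call (which is not shown here):

```python
def collect_tweets(fetch_page, max_pages=10):
    """Collect tweets across pages by following pagination tokens.

    fetch_page(token) must return (tweets, next_token); next_token is
    None once there are no more pages, mirroring the meta.next_token
    field in Twitter API v2 responses.
    """
    tweets, token = [], None
    for _ in range(max_pages):  # cap pages to respect rate limits
        page, token = fetch_page(token)
        tweets.extend(page)
        if token is None:
            break
    return tweets

# Hypothetical stand-in for the real API call, for illustration only.
def fake_fetch(token):
    pages = {None: (["tweet 1", "tweet 2"], "page2"),
             "page2": (["tweet 3"], None)}
    return pages[token]

print(collect_tweets(fake_fetch))  # all three tweets, in order
```

Capping the number of pages per run is one simple way to stay within the rate limits of a lower access tier.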