Trudeau's Speeches: An Exercise in Sentiment Analysis and Topic Modelling

Article link: https://towardsdatascience.com/analyzing-justin-trudeaus-speeches-3ba2690ad57a

Canada will soon enter election season, with the election projected for October 21, 2019. In many ways, this election will be an interesting event. From the rise of populism across the world to refugee crises, Prime Minister Justin Trudeau has had an extremely difficult term. The election will be a chance for Canadian citizens to voice their concerns over Prime Minister Trudeau's policies.

Usually, citizens listen to debates and speeches by candidates on the campaign trail and occasionally dive into party platforms. But I propose a different way of judging candidates, especially incumbents: their official speeches, which more often than not are a general representation of the government's agenda. I was inspired to analyze Prime Minister Trudeau's speeches when I heard of individuals examining President Trump's tweets; I thought speeches would be a great way of looking at a politician's sentiment over time, especially in light of the upcoming election.

The general structure of the project was as follows:

1. Find a way to scrape speeches from Prime Minister Trudeau's website (https://pm.gc.ca/en/news/speeches)
2. Store the speeches in a database
3. Analyze the speeches' sentiment
4. Analyze and predict speech topics from speech transcripts
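
As a rough illustration of the first two steps, the sketch below pulls paragraph text out of an HTML fragment and shapes it into a dictionary that could be stored as a MongoDB document. It uses only the standard library. The sample HTML, the `ParagraphExtractor` class, the `to_document` helper and the field names are assumptions made for illustration; they are not taken from src/crawler_ajax.py.

```python
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collects the text found inside <p> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self._in_p = False
        self._buf = []
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_p = True
            self._buf = []

    def handle_endtag(self, tag):
        if tag == "p" and self._in_p:
            # Join the pieces and collapse runs of whitespace.
            text = " ".join("".join(self._buf).split())
            if text:
                self.paragraphs.append(text)
            self._in_p = False

    def handle_data(self, data):
        if self._in_p:
            self._buf.append(data)


def to_document(title, date, html):
    """Build a dict shaped like a database record (field names are assumed)."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    return {"title": title, "date": date, "text": "\n".join(parser.paragraphs)}


# Made-up fragment standing in for a scraped speech page.
sample_html = "<div><p>Thank you,  everyone.</p><p>We have <b>work</b> to do.</p></div>"
doc = to_document("Sample speech", "2019-01-01", sample_html)
print(doc["text"])
```

With a running MongoDB server and the pymongo package installed, a dictionary like `doc` could be passed to a collection's `insert_one` method to store it.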

I am proud that I was able to accomplish each of these steps and learn so many new techniques and technologies. If you would like to experiment with this on your own, please follow these instructions.

Running the code:

1. Navigate to your local directory and git clone this repo.
2. Navigate to the project repo using your CLI and run the following commands:
   1. source env/bin/activate: activates the virtual environment that hosts all modules
   2. mongod: starts the MongoDB server that stores the speeches
3. Run the scraping script by typing python src/crawler_ajax.py. Note: the script uses AJAX requests. Selenium was the initial choice, but it was hard to implement; the code for that initial attempt can be found in src/crawler_selenium.py.
4. Clean the speeches by running python src/speech_clean.py.
5. Process the speeches for natural language processing by running python src/speech_process.py.
6. Analyze speech sentiment by running python src/sentiment_analysis.py.
7. Find and predict speech topics by running python src/topic_modelling.py. Please follow the CLI instructions!
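
To give a feel for what the cleaning and sentiment steps do, here is a minimal, standard-library-only sketch. It is a toy stand-in and not the code in src/speech_clean.py or src/sentiment_analysis.py: the word lists, the `clean` and `sentiment_score` functions and the sample speeches are all assumptions, and a real analysis would rely on a full lexicon such as the ones NLTK provides.

```python
import re

# Tiny illustrative word lists (assumptions, far smaller than a real lexicon).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "is", "we", "our", "for", "in"}
POSITIVE = {"proud", "strong", "hope", "together", "better"}
NEGATIVE = {"crisis", "difficult", "fear", "loss", "worse"}


def clean(text):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def sentiment_score(tokens):
    """Return (positive - negative) / total tokens, in [-1, 1]; 0.0 if empty."""
    if not tokens:
        return 0.0
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / len(tokens)


# Made-up speeches keyed by date, to show scoring over time.
speeches = {
    "2019-01-01": "We are proud and strong together.",
    "2019-02-01": "The crisis is difficult, and we fear the loss.",
}
for date, text in sorted(speeches.items()):
    print(date, round(sentiment_score(clean(text)), 2))
```

Scoring each speech by date like this is what makes it possible to plot sentiment over time.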

Note: visualizations with accompanying analysis can be found at src/Visualizations.ipynb

Technologies used

Languages: Python
Techniques learned: natural language processing, topic modelling via latent Dirichlet allocation models, sentiment analysis, web scraping (AJAX requests, after an initial Selenium attempt), database storage
Frameworks: Selenium, MongoDB, NLTK
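
For readers new to latent Dirichlet allocation, the sketch below fits a tiny model with collapsed Gibbs sampling in pure Python. It is meant only to show the idea; it is not the implementation in src/topic_modelling.py, which may well use a library. The function names `lda_gibbs` and `top_words`, the hyperparameter defaults and the toy documents are assumptions.

```python
import random


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling.

    docs is a list of token lists. Returns (doc_topic, topic_word, vocab),
    where doc_topic[d][k] and topic_word[k][w] are raw assignment counts.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []

    # Give every token a random initial topic and record the counts.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][wid] -= 1
                topic_total[k] -= 1
                # Weight of topic t is proportional to
                # (n_dt + alpha) * (n_tw + beta) / (n_t + V * beta).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wid] += 1
                topic_total[k] += 1

    return doc_topic, topic_word, vocab


def top_words(topic_word, vocab, n=3):
    """Return the n highest-count words for each topic."""
    result = []
    for counts in topic_word:
        ranked = sorted(range(len(vocab)), key=lambda i: counts[i], reverse=True)
        result.append([vocab[i] for i in ranked[:n]])
    return result


# Made-up, already-cleaned documents.
docs = [
    ["trade", "jobs", "economy", "trade"],
    ["climate", "carbon", "energy", "climate"],
    ["jobs", "economy", "growth"],
    ["carbon", "energy", "emissions"],
]
doc_topic, topic_word, vocab = lda_gibbs(docs, num_topics=2)
for k, words in enumerate(top_words(topic_word, vocab)):
    print(k, words)
```

The topic numbering is arbitrary and depends on the random seed, so topics are best interpreted by reading their top words.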

Things to work on

Refactor the project so that all scripts run with one command
Scrape more speeches and potentially predict sentiment scores over time
