
reallyrehan/pollytics


Pollytics

Elections 2020: Biden Vs Trump

Sentiment Analysis - Can Social Media be an alternative for election polling?

As part of a course in our Data Science degree, we analyzed sentiment around Biden and Trump during 2020 using around 60,000 Reddit posts. We compared the results with Biden's tweets (2k) and Trump's tweets (10k), and with poll data obtained from FiveThirtyEight.

We built a website that visualizes all this data using Python's Dash framework; it fetches data from Firebase and is deployed on Heroku. We also used Tableau for some of the visualizations. Topic modeling was done with LDA (latent Dirichlet allocation), fitted separately for each month. LDA is a generative statistical model that explains sets of observations through unobserved groups (topics), which account for why some parts of the data are similar.
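To make the per-month topic modeling concrete, here is a minimal sketch of LDA fitted with collapsed Gibbs sampling, using only the standard library. It is not the project's actual code: the function names (`lda_gibbs`, `topics_by_month`, `top_words`), the `(month, tokens)` input shape, and the hyperparameters `alpha`, `beta`, and `iterations` are all assumptions made for illustration. In practice a library implementation would likely be used instead.

```python
import random
from collections import defaultdict


def lda_gibbs(docs, num_topics=2, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Fit LDA by collapsed Gibbs sampling (illustrative, not optimized).

    docs is a list of token lists. Returns (doc_topic, topic_word, vocab),
    where each row of doc_topic and topic_word is a probability distribution.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic_counts = [[0] * num_topics for _ in docs]
    topic_word_counts = [[0] * vocab_size for _ in range(num_topics)]
    topic_totals = [0] * num_topics
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic_counts[d][z] += 1
            topic_word_counts[z][word_id[w]] += 1
            topic_totals[z] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                z = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic_counts[d][z] -= 1
                topic_word_counts[z][wid] -= 1
                topic_totals[z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic_counts[d][k] + alpha)
                    * (topic_word_counts[k][wid] + beta)
                    / (topic_totals[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic_counts[d][z] += 1
                topic_word_counts[z][wid] += 1
                topic_totals[z] += 1

    doc_topic = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in counts]
        for doc, counts in zip(docs, doc_topic_counts)
    ]
    topic_word = [
        [(c + beta) / (topic_totals[k] + vocab_size * beta)
         for c in topic_word_counts[k]]
        for k in range(num_topics)
    ]
    return doc_topic, topic_word, vocab


def topics_by_month(posts, **lda_options):
    """Group (month, tokens) pairs by month and fit one LDA model per month."""
    by_month = defaultdict(list)
    for month, tokens in posts:
        by_month[month].append(tokens)
    return {m: lda_gibbs(docs, **lda_options) for m, docs in sorted(by_month.items())}


def top_words(topic_word, vocab, n=3):
    """Return the n most probable words for each topic."""
    return [
        [vocab[i] for i in sorted(range(len(vocab)), key=lambda i: -row[i])[:n]]
        for row in topic_word
    ]


if __name__ == "__main__":
    # Tiny made-up sample; real input would be tokenized Reddit posts.
    sample = [
        ("2020-04", ["covid", "lockdown", "covid", "cases"]),
        ("2020-04", ["economy", "jobs", "economy"]),
        ("2020-11", ["vote", "ballot", "count", "vote"]),
    ]
    for month, (_, topic_word, vocab) in topics_by_month(sample, iterations=50).items():
        print(month, top_words(topic_word, vocab))
```

Fitting one model per month keeps each month's topics independent, so topics are not directly comparable across months without extra alignment.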

Stack Used

  • Python - Data Scraping, Collection, Cleaning, Processing
  • PySpark - Data Processing for Poll Data
  • Firebase - for realtime Data Storage and access
  • Plotly Express - for interactive Data Visualizing and Graphing
  • Tableau - Data Visualizing
  • Python Dash (with Flask) - for creating a React Web App with bootstrap components
  • Heroku - deploying Web App

The Heroku web app is available at http://pollytics.herokuapp.com/

Data Sources

  • Twitter API - for scraping Donald Trump and Joe Biden Tweets
  • Trump Archive - Since the Twitter API limits retrieval to about 3k tweets and Trump had over 10k tweets in 2020, we used Trump Archive to get the tweets beyond that limit
  • PushShift Reddit API - for scraping Reddit posts and aggregated stats for 2020
  • FiveThirtyEight - for Polling Data
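Combining the Twitter API results with the Trump Archive export comes down to merging two tweet lists without double counting. The sketch below shows one way to do that. The record fields (`id`, `created_at`, `text`), the ISO 8601 timestamp format, and the function name `merge_tweets` are assumptions for illustration, not the actual schema of either source.

```python
def merge_tweets(api_tweets, archive_tweets, year=2020):
    """Merge two tweet lists into one, deduplicated and sorted by time.

    Each tweet is assumed to be a dict with "id", "created_at" (ISO 8601
    string) and "text" keys. When both sources contain the same tweet ID,
    the copy from the API is kept.
    """
    merged = {}
    # Archive first, API second, so API records overwrite duplicates.
    for tweet in list(archive_tweets) + list(api_tweets):
        merged[tweet["id"]] = tweet
    # ISO 8601 timestamps start with the year, so a prefix check is enough.
    in_year = [t for t in merged.values() if t["created_at"].startswith(str(year))]
    # ISO 8601 strings in the same timezone sort chronologically as text.
    return sorted(in_year, key=lambda t: t["created_at"])


if __name__ == "__main__":
    api = [{"id": 2, "created_at": "2020-02-01T09:00:00", "text": "from api"}]
    archive = [
        {"id": 1, "created_at": "2020-01-15T08:00:00", "text": "older tweet"},
        {"id": 2, "created_at": "2020-02-01T09:00:00", "text": "from archive"},
    ]
    for tweet in merge_tweets(api, archive):
        print(tweet["id"], tweet["text"])
```

Keying on the tweet ID rather than the text avoids dropping retweets or repeated messages that happen to share the same wording.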

Demo

Demo Video

Sentiment Analysis


Overall Stats


Topic Modeling


Some interesting insights

  • Sentiment around both Trump and Biden dips around April, the time when Covid started to make its impact in the US
  • We plotted a moving average of sentiment against poll results and were surprised to see dips in sentiment that were followed by similar dips in the polls a few weeks later, so the sentiment effectively anticipated the polls.
  • For every tweet Biden made, Trump tweeted 4 times
  • Biden had more negative tweets than Trump
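The moving-average comparison against the polls can be sketched as follows. This is a hypothetical illustration, not the analysis code behind the findings above: the window size, the use of Pearson correlation to measure how far sentiment leads the polls, and the names `moving_average`, `pearson`, and `best_lead` are all assumptions.

```python
from statistics import mean


def moving_average(values, window):
    """Trailing moving average; early points average whatever is available."""
    return [
        mean(values[max(0, i - window + 1): i + 1])
        for i in range(len(values))
    ]


def pearson(xs, ys):
    """Pearson correlation of two equal-length series (0.0 if either is flat)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def best_lead(sentiment, polls, max_lag):
    """Find the lag at which sentiment[t] correlates best with polls[t + lag].

    Both series must have the same length and the same time step.
    Returns (lag, correlation); a positive lag means sentiment moves first.
    """
    best = (0, pearson(sentiment, polls))
    for lag in range(1, max_lag + 1):
        r = pearson(sentiment[:-lag], polls[lag:])
        if r > best[1]:
            best = (lag, r)
    return best


if __name__ == "__main__":
    # Made-up weekly values: the poll series repeats the sentiment dip 2 steps later.
    sentiment = [5, 5, 5, 2, 5, 5, 5, 5, 5, 5]
    polls = [5, 5, 5, 5, 5, 2, 5, 5, 5, 5]
    print(moving_average(sentiment, 3))
    print(best_lead(sentiment, polls, max_lag=4))
```

A high correlation at a positive lag is only suggestive; with a single election year of data it does not establish that sentiment reliably predicts polls.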

Made By

  • Rehan Ahmed
  • Saurabh Jain
  • Danielle Sim
