Twitter_Sentiment_Analysis_US_Presedential_Elction2020

Objective: Use Python scripts to extract tweets and perform sentiment analysis on the presidential candidacies of Donald Trump and Joe Biden ahead of the US election in November 2020. Tweets were extracted with Twitter's API, and with the GetOldTweets library to work around the time-frame limitations of the Twitter API. After extraction, I preprocessed the datasets to clean them, carried out basic EDA, ran sentiment analysis to observe polarity towards each candidate, trained classification models on these sentiments, and created visualizations.

Data Preprocessing: Data preprocessing is a crucial part of this project. It is the process of transforming raw data into a useful, understandable format, which makes the data easier to interpret and use. It eliminates inconsistencies and duplicates, which can otherwise hurt a model's accuracy, and helps catch incorrect or missing values caused by human error or bugs. In short, preprocessing makes the dataset more complete and accurate. Steps to process the Twitter data:

Using Regular Expressions to remove Emojis from Tweets
Using Regular Expressions to remove any retweets (if they exist)
Using Regular Expressions to remove the usernames from the tweets as they do not provide any additional information
Using Regular Expressions to remove any URLs, website links, etc.
Using Regular Expressions to identify any hashtags in the tweet and, if they exist, remove the hash sign and keep the word. This is useful when modelling because it keeps words that might be a major factor in calculating the sentiment
Using Regular Expressions to remove any special characters, numbers, and punctuation
Converting everything to lower case
Lastly, using the Tweet-Preprocessor module to clean any leftover junk.
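The regular-expression steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the patterns and the function name `clean_tweet` are assumptions, "removing retweets" is interpreted here as stripping the leading `RT @user:` marker, and the final Tweet-Preprocessor pass is left out.

```python
import re

# Assumed patterns; the expressions used in the project are not shown in this README.
RETWEET_RE = re.compile(r"^RT\s+@\w+:?\s*")        # leading "RT @user:" marker
URL_RE = re.compile(r"(?:https?://|www\.)\S+")     # links
MENTION_RE = re.compile(r"@\w+")                   # @usernames
HASHTAG_RE = re.compile(r"#(\w+)")                 # "#Vote" -> keep "Vote"
NON_LETTER_RE = re.compile(r"[^A-Za-z\s]")         # emojis, digits, punctuation
WHITESPACE_RE = re.compile(r"\s+")


def clean_tweet(text: str) -> str:
    """Apply the cleaning steps in the order listed above (hypothetical helper)."""
    text = RETWEET_RE.sub("", text)
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = HASHTAG_RE.sub(r"\1", text)     # drop the hash sign, keep the word
    text = NON_LETTER_RE.sub(" ", text)    # also removes emojis and numbers
    text = text.lower()
    return WHITESPACE_RE.sub(" ", text).strip()


print(clean_tweet("RT @fan123: Go #Vote now!!! https://t.co/xyz 2020"))
```

URLs are removed before the non-letter pass so that fragments of a link are not left behind as stray words. To drop retweets entirely rather than strip the marker, the same `RETWEET_RE` pattern could be used as a filter.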

The steps above are the main preprocessing. To calculate sentiment, a few further steps are performed with the NLTK module:

NLTK Module used to tokenize all words
NLTK Module used for removing any existing stop words (e.g. or, from, them, does, etc.)
NLTK Module used to perform stemming
Words shorter than 2 characters were dropped
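A possible shape for these NLTK steps is sketched below. Several things here are assumptions: the README does not say which tokenizer or stemmer was used, so `wordpunct_tokenize` and `PorterStemmer` are stand-ins, and the small inline `STOP_WORDS` set replaces NLTK's English stop-word corpus (`nltk.corpus.stopwords`, which needs a one-time `nltk.download("stopwords")`) so that the sketch runs without downloads. The function name `tokens_for_sentiment` is hypothetical.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import wordpunct_tokenize

# Stand-in stop-word list (assumption); the project presumably used NLTK's full list.
STOP_WORDS = {"or", "from", "them", "does", "the", "is", "a", "an",
              "and", "to", "for", "of", "in"}

stemmer = PorterStemmer()


def tokens_for_sentiment(cleaned_text: str, min_length: int = 2) -> list[str]:
    """Tokenize, remove stop words, stem, then drop words shorter than min_length."""
    tokens = wordpunct_tokenize(cleaned_text)
    tokens = [t for t in tokens if t not in STOP_WORDS]
    stems = [stemmer.stem(t) for t in tokens]
    return [s for s in stems if len(s) >= min_length]


print(tokens_for_sentiment("the voters from ohio is running a campaign"))
```

The input is expected to be text that has already been cleaned and lower-cased, so the stop-word comparison does not need to handle capitalization.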

Sentiment Analysis :

Sentiment analysis applies natural language processing, machine learning, and other data analytics techniques to raw text to derive objective, quantitative results. It is used to detect positive or negative sentiment in text, and businesses often use it to gauge brand reputation among their customers. In this project, a lexicon file (NRC Emotion Lexicon.txt in this repository) is used to calculate the sentiment of each tweet.
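Lexicon-based scoring of this kind might look like the sketch below. It assumes the lexicon file uses the common tab-separated word-level layout (`word<TAB>emotion<TAB>0 or 1`); the actual layout of the file in this repository is not shown in the README. The sample entries, the function names, and the positive-minus-negative polarity rule are all illustrative assumptions.

```python
from collections import Counter

# Tiny invented sample in the assumed "word<TAB>emotion<TAB>flag" layout.
SAMPLE_LEXICON = """\
win\tpositive\t1
win\tjoy\t1
win\tnegative\t0
fraud\tnegative\t1
fraud\tanger\t1
hope\tpositive\t1
hope\tanticipation\t1
"""


def parse_lexicon(text: str) -> dict[str, set[str]]:
    """Map each word to the set of emotions flagged with 1."""
    lexicon: dict[str, set[str]] = {}
    for line in text.splitlines():
        parts = line.strip().split("\t")
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        word, emotion, flag = parts
        if flag == "1":
            lexicon.setdefault(word, set()).add(emotion)
    return lexicon


def score_tokens(tokens: list[str], lexicon: dict[str, set[str]]) -> Counter:
    """Count how often each emotion is triggered by the tweet's tokens."""
    counts: Counter = Counter()
    for token in tokens:
        counts.update(lexicon.get(token, ()))
    return counts


def polarity(counts: Counter) -> str:
    """Assumed rule: compare the positive and negative counts."""
    diff = counts["positive"] - counts["negative"]
    return "positive" if diff > 0 else "negative" if diff < 0 else "neutral"


lexicon = parse_lexicon(SAMPLE_LEXICON)
counts = score_tokens(["hope", "win", "fraud", "today"], lexicon)
print(counts, polarity(counts))
```

For the real file, `parse_lexicon` would be given the file's contents, for example via `open(path, encoding="utf-8").read()`.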

Donald Trump

The graph above shows the emotions expressed by users towards Donald Trump.

The bar plot above shows the number of tweets per day about Trump. The y-axis gives the number of tweets for each day from July 14 to July 21.

The plot above shows the number of tweets per hour on a particular day.
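The counts behind per-day and per-hour plots like these could be computed along the following lines. This is only a sketch: the function names are hypothetical, the timestamps are invented for illustration, and the project's notebooks may well have used a different approach (for example a dataframe group-by).

```python
from collections import Counter
from datetime import date, datetime


def tweets_per_day(timestamps: list[datetime]) -> Counter:
    """Number of tweets on each calendar day."""
    return Counter(ts.date() for ts in timestamps)


def tweets_per_hour(timestamps: list[datetime], day: date) -> Counter:
    """Number of tweets in each hour (0-23) of one particular day."""
    return Counter(ts.hour for ts in timestamps if ts.date() == day)


# Invented timestamps, for illustration only.
sample = [
    datetime(2020, 7, 14, 9, 5),
    datetime(2020, 7, 14, 9, 40),
    datetime(2020, 7, 14, 18, 0),
    datetime(2020, 7, 15, 12, 30),
]
per_day = tweets_per_day(sample)
per_hour = tweets_per_hour(sample, date(2020, 7, 14))
print(per_day)
print(per_hour)
```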

Shows the different locations of tweets

Source here refers to the medium a person used to tweet, such as the Twitter Web App, iPhone, Android, or iPad. Most users tweeted from the Web App, followed by iPhone and then iPad.

This graph shows the number of tweets per day.

This geographical map shows the sentiment of people around the country

Word Cloud Representation

Joe Biden

The bar plot above shows the number of tweets per day about Biden. The y-axis gives the number of tweets for each day from July 14 to July 21.

The plot above shows the number of tweets per hour on a particular day.

Shows the different locations of tweets

Source here refers to the medium a person used to tweet, such as the Twitter Web App, iPhone, Android, or iPad. Most users tweeted from the Web App, followed by iPhone and then iPad.

This graph shows the number of tweets per day.

This geographical map shows the sentiment of people around the country

Word Cloud Representation

Comparison of the accuracy of Decision Tree, Random Forest, and Naive Bayes classifiers for Joe Biden

Deep learning training and validation accuracy graph for Joe Biden
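A comparison of Decision Tree, Random Forest, and Naive Bayes accuracy like the one mentioned above could be set up roughly as follows with scikit-learn. This is a sketch under stated assumptions: the README does not describe the features or the train/test split, so bag-of-words counts and a stratified 75/25 split are guesses, the function name `compare_classifiers` is hypothetical, and the toy texts and labels are invented and say nothing about the project's real results.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.tree import DecisionTreeClassifier


def compare_classifiers(texts, labels, test_size=0.25, seed=0):
    """Train three classifiers on bag-of-words features and return test accuracy."""
    train_texts, test_texts, y_train, y_test = train_test_split(
        texts, labels, test_size=test_size, random_state=seed, stratify=labels
    )
    vectorizer = CountVectorizer()
    X_train = vectorizer.fit_transform(train_texts)  # fit on training data only
    X_test = vectorizer.transform(test_texts)

    models = {
        "Decision Tree": DecisionTreeClassifier(random_state=seed),
        "Random Forest": RandomForestClassifier(n_estimators=50, random_state=seed),
        "Naive Bayes": MultinomialNB(),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    return scores


# Invented toy data; labels stand in for lexicon-derived sentiment.
TOY_TEXTS = [
    "great hope win", "love strong support", "good win vote", "hope good plan",
    "fraud bad lie", "terrible fail lie", "bad corrupt fraud", "fail weak bad",
]
TOY_LABELS = ["positive"] * 4 + ["negative"] * 4

scores = compare_classifiers(TOY_TEXTS, TOY_LABELS)
for name, accuracy in scores.items():
    print(f"{name}: {accuracy:.2f}")
```

Fitting the vectorizer on the training split only keeps vocabulary from the test tweets out of training, which gives a fairer accuracy estimate.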

Arupsau/Twitter_Sentiment_Analysis_US_Presedential_Elction2020
