An NLP Data Science project to find out how people feel about 2021. Click here for the article
This project used Natural Language Processing (NLP) techniques to analyse users' sentiment towards 2021. After 2020 turned out to be a disaster, we've all been looking forward to 2021 with hope. I decided to perform a Twitter Sentiment Analysis to find out if the new year is treating us well! I scraped 37,621 tweets using the following search queries:
- "2021 is"
- "2021 will"
- "This year"
The tools used include Tweepy (for mining tweets), Pandas (for data cleaning and wrangling), Tweet Preprocessor (for rapid tweet cleaning), NLTK (for tokenization, stopword removal and POS tagging), and Plotly, Matplotlib and Word Cloud (for visualization).
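To give a feel for what the cleaning and tokenization steps do, here is a minimal sketch using only the Python standard library. It is not the notebook's code: the regular expressions stand in for Tweet Preprocessor, the tiny `STOPWORDS` set stands in for NLTK's stopword list, and the names `clean_tweet` and `tokenize` are hypothetical.

```python
import re

# Hypothetical, deliberately tiny stopword set; NLTK ships a much fuller list.
STOPWORDS = {"a", "an", "and", "be", "is", "it", "of", "the", "this", "to", "will"}


def clean_tweet(text: str) -> str:
    """Remove URLs and @mentions, drop the '#' sign, and lowercase the text."""
    text = re.sub(r"https?://\S+", " ", text)  # strip links
    text = re.sub(r"@\w+", " ", text)          # strip user mentions
    text = text.replace("#", " ")              # keep the hashtag word, drop '#'
    return text.lower()


def tokenize(text: str) -> list[str]:
    """Split a cleaned tweet into alphabetic tokens and filter out stopwords."""
    tokens = re.findall(r"[a-z']+", clean_tweet(text))
    return [t for t in tokens if t not in STOPWORDS]


tokens = tokenize("2021 will be a GREAT year! @friend https://example.com #hope")
print(tokens)
```

The real libraries handle many more cases (emojis, reserved words, contractions), so treat this only as an outline of the idea.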
With this project I wanted to get familiar with NLP techniques and answer the following questions:
- What are the most common words people use to describe 2021?
- What is the number of tweets with positive, negative and neutral sentiment?
- What are the most common words used in positive, neutral and negative tweets?
- What are the most liked and retweeted posts?
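The first three questions come down to counting. The sketch below shows one way to do that with `collections.Counter`, assuming each tweet has already been tokenized and given a sentiment label. The `labelled` sample data is made up for illustration and the variable names are hypothetical; the sentiment labelling step itself is not shown.

```python
from collections import Counter

# Hypothetical pre-labelled, pre-tokenized tweets: (sentiment label, tokens).
labelled = [
    ("positive", ["good", "year", "good", "vibes"]),
    ("negative", ["last", "year", "worse"]),
    ("neutral", ["new", "year", "new", "plans"]),
    ("positive", ["good", "start"]),
]

# Number of tweets per sentiment label.
sentiment_counts = Counter(label for label, _ in labelled)

# Word frequencies within each sentiment label.
words_by_sentiment: dict[str, Counter] = {}
for label, tokens in labelled:
    words_by_sentiment.setdefault(label, Counter()).update(tokens)

# Most common word among the positive tweets.
top_positive = words_by_sentiment["positive"].most_common(1)
print(sentiment_counts, top_positive)
```

With a Pandas DataFrame the same counts could be obtained with `value_counts()` on the relevant columns.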
In this repository you'll find:
- A notebook with the source code for the Twitter Sentiment Analysis
- A notebook with the source code for the Tweepy Twitter API scraper
- The "Neon.ttf" font file, which can be used to customise your Word Cloud visualisation
- The "twitter.png" file, which can be used as a mask to give the Word Cloud the shape of the Twitter logo
This project produced the following insights:
2) The majority of tweets had a positive sentiment (19,107), followed by neutral (9,484) and negative (8,436) sentiment
Click here for an interactive version of this graph!
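As a quick check on the sentiment counts above, the short sketch below converts them to percentage shares. The three counts sum to 37,027, which is lower than the 37,621 tweets scraped; this README does not explain the difference, so the shares are computed against the labelled total only.

```python
# Sentiment counts as reported above.
counts = {"positive": 19_107, "neutral": 9_484, "negative": 8_436}

# Total number of labelled tweets (not the number scraped).
total = sum(counts.values())

# Share of each sentiment, in percent, rounded to one decimal place.
shares = {label: round(100 * n / total, 1) for label, n in counts.items()}

for label, pct in shares.items():
    print(f"{label}: {pct}%")
```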
3) The most common word in positive tweets was "good", in negative tweets "last", and in neutral tweets "new"
Click here for an interactive version of this graph!

