# Sentiment Analysis

In this notebook, we use VADER sentiment analysis to determine whether the tweets in a dataset of F1 tweets carry a positive or negative connotation.
VADER is tuned for short social media texts, which makes it a good fit for tweets.

In [4]:
import pandas as pd

First, let's read the data file and filter on date to limit the number of tweets (otherwise we would have to wait too long later on).

In [5]:
df = pd.read_csv('F1_tweets.csv')
df = df[df['date'] > '2021-11-22']
df.head()

Unnamed: 0.1,Unnamed: 0,user_name,date,text
199943,199943,#TeamPsg,['F1'],Twitter for Android
213267,213267,Link ➡️https://t.co/WpePiMd8qs,"['MexicoGP', 'F1']",Twitter for Android
264679,264679,PlanetF1,2021-11-26 23:59:00,Helmut Marko confirms that Red Bull will not t...
264680,264680,GP2Joey,2021-11-26 23:57:41,13/7/1990\n@F1 Rd 8/17 BRITISH GP \n8.30am\nPR...
264681,264681,♦️Break Every Rule♦️,2021-11-26 23:54:36,Watching this @AyrtonSenna documentary. I have...
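
The first two rows of the output above appear to be malformed: their `date` field holds a hashtag list such as `['F1']` rather than a timestamp. They pass the filter because the comparison is done on strings, and `'['` sorts after `'2'`. A minimal sketch of a stricter filter, assuming the column is meant to hold timestamps, is to parse the dates first. The `sample` frame below is toy data made up for illustration and is not part of the dataset.

```python
import pandas as pd

# Toy rows mimicking the problem: one 'date' value is not a timestamp.
sample = pd.DataFrame({
    "date": ["2021-11-26 23:59:00", "['F1']", "2021-11-20 10:00:00"],
    "text": ["tweet a", "tweet b", "tweet c"],
})

# errors="coerce" turns unparseable values into NaT instead of raising.
sample["date"] = pd.to_datetime(sample["date"], errors="coerce")

# Comparisons against NaT are False, so malformed rows drop out of the filter.
recent = sample[sample["date"] > pd.Timestamp("2021-11-22")]
```

Applied to the real data, this would presumably remove the malformed rows, so the row positions used later in the notebook would shift.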


Start by installing vaderSentiment: pip install vaderSentiment

Next, we import it and create a function that determines the sentiment of a single text.

In [6]:
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

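# Create the analyzer once and reuse it; building it on every call reloads the lexicon.
sid_obj = SentimentIntensityAnalyzer()

def sentiment_scores(tweet):
    """Return VADER's compound score, from -1 (most negative) to +1 (most positive)."""
    sentiment_dict = sid_obj.polarity_scores(tweet)  # Keys: 'neg', 'neu', 'pos', 'compound'
    return sentiment_dict['compound']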

Try it out by applying it to different sentences.

In [7]:
tweet = df['text'].iloc[4]  # Fifth row; select the column by name rather than by position
tweet

'Watching this @AyrtonSenna documentary. I have vague memories of him from childhood but now as an adult I feel myself becoming a fan even though I know he passed away in 1994. 😞 #Formula1 #F1'

In [8]:
sentiment_scores(tweet)

-0.34
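
The compound score is a single number between -1 and +1. A commonly used convention, suggested in the vaderSentiment documentation, treats scores of 0.05 and above as positive, -0.05 and below as negative, and everything in between as neutral. The helper below is a small sketch of that convention; the name `label_from_compound` and the `threshold` parameter are hypothetical and not part of the library.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a compound score in [-1, 1] to a coarse label.

    Hypothetical helper: the 0.05 cutoff is a convention, not a fixed rule.
    """
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"

# The score obtained above for the Senna tweet.
label = label_from_compound(-0.34)
```

Under this convention, the tweet above would be labelled negative, which matches its wistful tone.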

Finally, add a new column 'sentiment' to the dataset that holds the sentiment score of every tweet.

In [9]:
df['sentiment'] = df['text'].apply(sentiment_scores)

df.head()
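
Once every tweet has a score, one possible next step is to summarise the scores over time. The sketch below averages the compound score per calendar day. It uses a toy frame with made-up values in place of the real `df`, and it assumes the `date` column can be parsed as timestamps.

```python
import pandas as pd

# Toy data standing in for the notebook's df (values are made up for illustration).
toy = pd.DataFrame({
    "date": ["2021-11-26 23:59:00", "2021-11-26 23:57:41", "2021-11-27 00:10:00"],
    "sentiment": [0.5, -0.3, 0.2],
})

# Parse the timestamps, then average the compound score per calendar day.
toy["date"] = pd.to_datetime(toy["date"])
daily_mean = toy.groupby(toy["date"].dt.date)["sentiment"].mean()
```

`daily_mean` is a Series indexed by day, which can be inspected directly or plotted.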