# Overview

## What do we want to answer?
* What are the most common emotions in the tweets?
* What are the most common sentiments in the tweets?
* What are the most common emotions for each sentiment?
* What are the most common times of day for each sentiment?
* What are the most common words used for each sentiment?
* What are the most common words used across all tweets?

## About the data
* 24,970 tweets mentioning the @Dell handle, posted between 2022-01-01 and 2022-09-30
* 8 emotions: anger, joy, anticipation, disgust, sadness, optimism, surprise, and fear
* 3 sentiments: positive, negative, and neutral
* 9 columns: Unnamed: 0 (a leftover row index), Datetime, Tweet Id, Text, Username, sentiment, sentiment_score, emotion, and emotion_score
* Data Source : __[kaggle.com/datasets/ankitkumar2635/sentiment-and-emotions-of-tweets](https://www.kaggle.com/datasets/ankitkumar2635/sentiment-and-emotions-of-tweets)__


In [61]:
import pandas as pd

df = pd.read_csv('sentiment-emotion-labelled_Dell_tweets.csv')
df.head()

| | Unnamed: 0 | Datetime | Tweet Id | Text | Username | sentiment | sentiment_score | emotion | emotion_score |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 2022-09-30 23:29:15+00:00 | 1575991191170342912 | @Logitech @apple @Google @Microsoft @Dell @Len... | ManjuSreedaran | neutral | 0.853283 | anticipation | 0.587121 |
| 1 | 1 | 2022-09-30 21:46:35+00:00 | 1575965354425131008 | @MK_habit_addict @official_stier @MortalKombat... | MiKeMcDnet | neutral | 0.51947 | joy | 0.886913 |
| 2 | 2 | 2022-09-30 21:18:02+00:00 | 1575958171423752203 | As @CRN celebrates its 40th anniversary, Bob F... | jfollett | positive | 0.763791 | joy | 0.960347 |
| 3 | 3 | 2022-09-30 20:05:24+00:00 | 1575939891485032450 | @dell your customer service is horrible especi... | daveccarr | negative | 0.954023 | anger | 0.983203 |
| 4 | 4 | 2022-09-30 20:03:17+00:00 | 1575939359160750080 | @zacokalo @Dell @DellCares @Dell give the man ... | heycamella | neutral | 0.52917 | anger | 0.776124 |


In [62]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 24970 entries, 0 to 24969
Data columns (total 9 columns):
 #   Column           Non-Null Count  Dtype  
---  ------           --------------  -----  
 0   Unnamed: 0       24970 non-null  int64  
 1   Datetime         24970 non-null  object 
 2   Tweet Id         24970 non-null  int64  
 3   Text             24970 non-null  object 
 4   Username         24970 non-null  object 
 5   sentiment        24970 non-null  object 
 6   sentiment_score  24970 non-null  float64
 7   emotion          24970 non-null  object 
 8   emotion_score    24970 non-null  float64
dtypes: float64(2), int64(2), object(5)
memory usage: 1.7+ MB


In [63]:
df.describe()

| | Unnamed: 0 | Tweet Id | sentiment_score | emotion_score |
|---|---|---|---|---|
| count | 24970.0 | 24970.0 | 24970.0 | 24970.0 |
| mean | 12484.5 | 1.526448e+18 | 0.782578 | 0.819114 |
| std | 7208.362447 | 2.765398e+16 | 0.150751 | 0.195378 |
| min | 0.0 | 1.477082e+18 | 0.337307 | 0.12548 |
| 25% | 6242.25 | 1.503808e+18 | 0.664537 | 0.722941 |
| 50% | 12484.5 | 1.524811e+18 | 0.81795 | 0.90661 |
| 75% | 18726.75 | 1.550124e+18 | 0.912815 | 0.97036 |
| max | 24969.0 | 1.575991e+18 | 0.991532 | 0.994312 |


In [64]:
df['sentiment'].value_counts()

sentiment
negative    10556
positive     7366
neutral      7048
Name: count, dtype: int64

In [65]:
df['emotion'].value_counts()

emotion
anger           7520
joy             6326
anticipation    5171
disgust         3000
sadness         1328
optimism        1225
fear             366
surprise          34
Name: count, dtype: int64
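
The raw counts are easier to compare as percentages. The sketch below rebuilds the two `value_counts()` results by hand from the numbers printed above and converts them to shares; on the real dataframe, `df['sentiment'].value_counts(normalize=True)` should give the same proportions directly. The variable names are illustrative only.

```python
import pandas as pd

# Counts copied from the value_counts() outputs above.
sentiment_counts = pd.Series({"negative": 10556, "positive": 7366, "neutral": 7048})
emotion_counts = pd.Series({
    "anger": 7520, "joy": 6326, "anticipation": 5171, "disgust": 3000,
    "sadness": 1328, "optimism": 1225, "fear": 366, "surprise": 34,
})

# Convert each count to a percentage of all tweets, rounded to one decimal.
sentiment_share = (sentiment_counts / sentiment_counts.sum() * 100).round(1)
emotion_share = (emotion_counts / emotion_counts.sum() * 100).round(1)

print(sentiment_share)
print(emotion_share)
```

Both series sum to the 24,970 rows reported by `df.info()`, which is a quick check that no label is missing from either column.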

In [66]:
df['Datetime'].min()

'2022-01-01 00:59:37+00:00'

In [67]:
df['Datetime'].max()

'2022-09-30 23:29:15+00:00'

In [68]:
df.isnull().sum()

Unnamed: 0         0
Datetime           0
Tweet Id           0
Text               0
Username           0
sentiment          0
sentiment_score    0
emotion            0
emotion_score      0
dtype: int64
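
The `Unnamed: 0` column looks like a row index that was saved into the CSV, and `Datetime` is loaded as plain strings (`object`). A possible way to handle both at load time is sketched below. It assumes the first CSV column has an empty header, which is what pandas normally labels `Unnamed: 0`; the two sample rows are made up for illustration and are not taken from the dataset.

```python
import io
import pandas as pd

# Two hypothetical rows in the same layout as the CSV
# (assumption: the first column is a saved index with an empty header).
sample_csv = io.StringIO(
    ",Datetime,Tweet Id,Text,Username,sentiment,sentiment_score,emotion,emotion_score\n"
    "0,2022-09-30 23:29:15+00:00,1,hello @Dell,user_a,neutral,0.85,anticipation,0.59\n"
    "1,2022-09-30 21:46:35+00:00,2,thanks @Dell,user_b,positive,0.76,joy,0.96\n"
)

# index_col=0 turns the saved index into the dataframe index instead of a column;
# parse_dates converts Datetime to a timezone-aware datetime column on load.
clean = pd.read_csv(sample_csv, index_col=0, parse_dates=["Datetime"])

print(clean.dtypes)
```

With the real file, the same two arguments would be passed to the existing `pd.read_csv('sentiment-emotion-labelled_Dell_tweets.csv')` call.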

# Visualize

## What are the most common sentiments in the tweets?

In [69]:
import plotly.express as px

# create pie chart to visualize the sentiment
fig = px.pie(df, names='sentiment', title='Sentiment of Tweets', hover_data=['sentiment'])
fig.show()

## What are the most common emotions in the tweets?

In [70]:
fig = px.pie(df, names='emotion', title='Emotion of Tweets', hover_data=['emotion'])
fig.show()

## What are the most common emotions for each sentiment?

In [71]:
fig = px.sunburst(df, path=['sentiment', 'emotion'], title='Emotion of Tweets for each Sentiment')
fig.show()
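
The sunburst chart is interactive, so the exact counts behind it are not visible in a static export. A cross-tabulation gives the same breakdown as a table. The sketch below uses a small made-up dataframe (`toy`) with the same two column names; on the real data the call would be `pd.crosstab(df['sentiment'], df['emotion'])`.

```python
import pandas as pd

# Hypothetical rows, only to show the shape of the result.
toy = pd.DataFrame({
    "sentiment": ["negative", "negative", "negative", "positive", "positive", "neutral"],
    "emotion":   ["anger",    "anger",    "disgust",  "joy",      "joy",      "anticipation"],
})

# Rows are sentiments, columns are emotions, cells are tweet counts.
table = pd.crosstab(toy["sentiment"], toy["emotion"])

# The most frequent emotion within each sentiment.
top_emotion = table.idxmax(axis=1)

print(table)
print(top_emotion)
```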

## What are the most common times of day for each sentiment?

In [72]:
# convert the datetime column to datetime format
df['Datetime'] = pd.to_datetime(df['Datetime'])

# create a new column for the hour of the day
df['Hour'] = df['Datetime'].dt.hour

# graph sentiments by hour of the day
fig = px.histogram(df, x='Hour', color='sentiment', title='Sentiment of Tweets by Hour of the Day')
fig.show()
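
A stacked histogram shows volume, but it is hard to read the sentiment mix per hour from it. One option is to normalize each hour so its sentiments sum to 1. The sketch below does that on a few made-up timestamps (`toy` is hypothetical); note that the hours are in UTC because the timestamps carry a `+00:00` offset.

```python
import pandas as pd

# Hypothetical tweets: three at 16:xx UTC and one at 09:xx UTC.
toy = pd.DataFrame({
    "Datetime": pd.to_datetime([
        "2022-09-30 16:05:00+00:00",
        "2022-09-30 16:40:00+00:00",
        "2022-09-30 16:55:00+00:00",
        "2022-09-30 09:10:00+00:00",
    ]),
    "sentiment": ["negative", "negative", "positive", "neutral"],
})
toy["Hour"] = toy["Datetime"].dt.hour

# Share of each sentiment within every hour (each row sums to 1).
hourly_share = pd.crosstab(toy["Hour"], toy["sentiment"], normalize="index")

# Hour with the most tweets overall.
busiest_hour = toy["Hour"].value_counts().idxmax()

print(hourly_share)
print(busiest_hour)
```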


## What are the most common words used for each sentiment?

In [73]:
from wordcloud import WordCloud, STOPWORDS

# create a word cloud for each sentiment
for sentiment in df['sentiment'].unique():
    # keep only the tweets labelled with this sentiment
    df_sentiment = df[df['sentiment'] == sentiment]
    # join all tweet texts into one string
    text = ' '.join(df_sentiment['Text'])
    # generate the word cloud (named `cloud` so it does not shadow the wordcloud module)
    cloud = WordCloud(stopwords=STOPWORDS, background_color='white', width=1200, height=400).generate(text)
    # plot the word cloud as an image
    fig = px.imshow(cloud.to_array(), title=sentiment)
    fig.show()

## What are the most common words used across all tweets?

In [74]:
from collections import Counter

# join all tweet texts into one string
words = ' '.join(df['Text'])

# drop the @Dell handle itself (both capitalizations appear in the data)
words = [word for word in words.split() if word not in ('@Dell', '@dell')]

# keep only words longer than 4 characters
words = [word for word in words if len(word) > 4]

# remove stopwords (STOPWORDS is lowercase, so this match is case-sensitive)
words = [word for word in words if word not in STOPWORDS]

# create a list of the 20 most common words in the tweets
common_words = Counter(words).most_common(20)

# create a dataframe of the 20 most common words in the tweets
df_common_words = pd.DataFrame(common_words, columns=['word', 'count'])

# create a bar chart of the 20 most common words in the tweets
fig = px.bar(df_common_words, x='word', y='count', title='20 Most Common Words in the Tweets')
fig.show()
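
The count above splits on whitespace and is case-sensitive, so "Laptop", "laptop", and "laptop," are counted as different words. A possible alternative, shown here only as a sketch, lowercases the text and strips punctuation before counting. The helper `count_words`, the tiny `TOY_STOPWORDS` set, and the sample tweets are all hypothetical; in the notebook the `STOPWORDS` set from `wordcloud` could be passed in instead.

```python
import re
from collections import Counter

# Hypothetical stand-in for wordcloud.STOPWORDS, to keep the sketch dependency-free.
TOY_STOPWORDS = {"the", "is", "my", "and", "with"}

def count_words(texts, stopwords=TOY_STOPWORDS, min_len=5, skip=("@dell",)):
    """Count lowercase tokens, keeping @handles and dropping punctuation."""
    counts = Counter()
    for text in texts:
        # A token is an optional leading "@" followed by letters, digits, or underscores.
        for token in re.findall(r"@?\w+", text.lower()):
            if len(token) >= min_len and token not in stopwords and token not in skip:
                counts[token] += 1
    return counts

tweets = [
    "@Dell my Laptop is broken!",
    "New laptop, thanks @dell",
    "@DellCares laptop support?",
]
counts = count_words(tweets)
print(counts.most_common(3))
```

`min_len=5` mirrors the notebook's `len(word) > 4` filter. Normalizing case would change the ranking reported in the conclusion, so it is a design choice rather than a drop-in replacement.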

# Conclusion

## What are the most common emotions in the tweets?
* The most common emotions in the tweets are anger, joy, anticipation, and disgust, which together account for about 88% of the tweets.
* The least common emotions in the tweets are optimism, fear, and surprise; surprise appears in only 34 tweets.

## What are the most common sentiments in the tweets?
* The most common sentiment in the tweets is negative (10,556 tweets, about 42.3%).
* The least common sentiment in the tweets is neutral (7,048 tweets, about 28.2%).
* Positive (7,366 tweets, about 29.5%) is only slightly ahead of neutral, and both trail negative by more than 3,000 tweets.

## What are the most common emotions for each sentiment?
* The most common emotions for the negative sentiment are anger, disgust, and sadness.
* The most common emotions for the positive sentiment are joy, anticipation, and optimism.
* The most common emotions for the neutral sentiment are anticipation, joy, and disgust.

## What are the most common times of day for each sentiment?
* The most common time of day for all tweets is 16:00 (4:00 PM) UTC. The timestamps carry a +00:00 offset, so hours are not in the tweeters' local time.
* Negative sentiment is consistently more common than positive sentiment throughout the day, but the relative share of each sentiment changes little from hour to hour.

## What are the most common words used for each sentiment?
* Most of the words used in tweets are not unique to a specific sentiment.
* Across all tweets, the most common words, excluding "@dell", "@Dell", stopwords, and words of 4 or fewer characters, are "Laptop", "@DellCares", "@MichaelDell", "Service", "Customer", "support", and "@elonmusk".
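
The claim that most words are shared across sentiments comes from reading the word clouds side by side. One way to check it numerically is to take the top words per sentiment and intersect the sets. The sketch below uses made-up labelled texts (`labelled` is hypothetical) and a top-3 cutoff chosen only for illustration.

```python
from collections import Counter

# Hypothetical (sentiment, text) pairs standing in for the tweets.
labelled = [
    ("negative", "laptop support broken laptop support"),
    ("positive", "laptop great support laptop"),
    ("neutral",  "laptop support update"),
]

# Build one word counter per sentiment.
counts_by_sentiment = {}
for sentiment, text in labelled:
    counts_by_sentiment.setdefault(sentiment, Counter()).update(text.split())

# Keep the top 3 words of each sentiment as a set.
top_sets = {
    sentiment: {word for word, _ in counter.most_common(3)}
    for sentiment, counter in counts_by_sentiment.items()
}

# Words that appear in the top list of every sentiment.
shared = set.intersection(*top_sets.values())
print(shared)
```

On the real data, a large intersection relative to the cutoff would support the first bullet above, while a small one would suggest the sentiments have more distinctive vocabularies than the word clouds imply.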
