# Emotions

<p align="center">
  <img src="Images/emosion.jpeg" width="200" alt="Image 1" style="margin-right: 50px;" />
</p>


Image source: [iStockphoto](https://www.istockphoto.com/photo/customer-choose-emoji-emoticons-happy-mood-on-emotions-satisfaction-meter-evaluation-gm1454928178-490425219?utm_campaign=srp_photos_top&utm_content=https%3A%2F%2Funsplash.com%2Fs%2Fphotos%2Femosion&utm_medium=affiliate&utm_source=unsplash&utm_term=emosion%3A%3A%3A).  

- Welcome to the "Emotions" dataset – a collection of English Twitter messages meticulously annotated with six fundamental emotions: anger, fear, joy, love, sadness, and surprise. 

- This dataset serves as a valuable resource for understanding and analyzing the diverse spectrum of emotions expressed in short-form text on social media.
- The dataset is available on Kaggle: [Emotions dataset source](https://www.kaggle.com/datasets/nelgiriyewithana/emotions/data)

<h2> Dataset </h2>

Each entry in this dataset consists of a text segment representing a Twitter message and a corresponding label indicating the predominant emotion conveyed. The emotions are classified into six categories: sadness (0), joy (1), love (2), anger (3), fear (4), and surprise (5). Whether you're interested in sentiment analysis, emotion classification, or text mining, this dataset provides a rich foundation for exploring the nuanced emotional landscape within the realm of social media.

Key Features:

- text: A string feature representing the content of the Twitter message.
- label: A classification label indicating the primary emotion, with values ranging from 0 to 5.

Potential Use Cases:

- Sentiment Analysis: Uncover the prevailing sentiments in English Twitter messages across various emotions.
- Emotion Classification: Develop models to accurately classify tweets into the six specified emotion categories.
- Textual Analysis: Explore linguistic patterns and expressions associated with different emotional states.

Sample Data:

Here's a glimpse of the dataset with a few examples:

| Text | Label |
|---|---|
| that was what i felt when i was finally accept... | 1 |
| i take every day as it comes i'm just focusing... | 4 |
| i give you plenty of attention even when i feel... | 0 |


### <font color='blue'>Custom Functions</font>

In [14]:
# Load the custom helper module, with autoreload so edits to it are picked up automatically

%load_ext autoreload
%autoreload 2

import custom_function as fn

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


### <font color='blue'>Import Libraries</font>

In [15]:
## imports

import matplotlib.pyplot as plt
import missingno
import matplotlib as mpl
import seaborn as sns
import numpy as np
import pandas as pd

pd.set_option("display.max_columns",50)
pd.set_option('display.max_colwidth', 250)

# Load imports for NLP
import nltk
from nltk.probability import FreqDist
from nltk.sentiment.vader import SentimentIntensityAnalyzer
from wordcloud import WordCloud, STOPWORDS
import joblib
import tensorflow as tf
from imblearn.under_sampling import RandomUnderSampler
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import LabelEncoder
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import classification_report
from tensorflow.keras.layers import TextVectorization
from tensorflow.keras.models import Sequential
from tensorflow.keras import layers, optimizers, regularizers

In [16]:
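# Set random seeds for reproducibility.
# Note: tf.keras.utils.set_random_seed already seeds Python, NumPy, and TensorFlow,
# so the two explicit seed calls below are redundant but harmless.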
tf.keras.utils.set_random_seed(42)
tf.random.set_seed(42)
np.random.seed(42)
tf.config.experimental.enable_op_determinism()

## Load the data 

In [4]:
emosion = pd.read_csv('Data/text.csv')
emosion.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 416809 entries, 0 to 416808
Data columns (total 3 columns):
 #   Column      Non-Null Count   Dtype 
---  ------      --------------   ----- 
 0   Unnamed: 0  416809 non-null  int64 
 1   text        416809 non-null  object
 2   label       416809 non-null  int64 
dtypes: int64(2), object(1)
memory usage: 9.5+ MB


In [5]:
emosion.head()

| | Unnamed: 0 | text | label |
|---|---|---|---|
| 0 | 0 | i just feel really helpless and heavy hearted | 4 |
| 1 | 1 | ive enjoyed being able to slouch about relax and unwind and frankly needed it after those last few weeks around the end of uni and the expo i have lately started to find myself feeling a bit listless which is never really a good thing | 0 |
| 2 | 2 | i gave up my internship with the dmrg and am feeling distraught | 4 |
| 3 | 3 | i dont know i feel so lost | 0 |
| 4 | 4 | i am a kindergarten teacher and i am thoroughly weary of my job after having taken the university entrance exam i suffered from anxiety for weeks as i did not want to carry on with my work studies were the only alternative | 4 |


It looks like we have an unwanted column, `Unnamed: 0`, which simply duplicates the row index. Let's remove it from the DataFrame.



In [7]:
emosion = emosion.drop(columns=['Unnamed: 0'])

In [8]:
emosion.head()

| | text | label |
|---|---|---|
| 0 | i just feel really helpless and heavy hearted | 4 |
| 1 | ive enjoyed being able to slouch about relax and unwind and frankly needed it after those last few weeks around the end of uni and the expo i have lately started to find myself feeling a bit listless which is never really a good thing | 0 |
| 2 | i gave up my internship with the dmrg and am feeling distraught | 4 |
| 3 | i dont know i feel so lost | 0 |
| 4 | i am a kindergarten teacher and i am thoroughly weary of my job after having taken the university entrance exam i suffered from anxiety for weeks as i did not want to carry on with my work studies were the only alternative | 4 |
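The `Unnamed: 0` column typically appears when a CSV was written with its index and then read back without declaring it. As an alternative to dropping the column afterwards, the sketch below reads the first column as the index via `index_col=0`. It uses a small in-memory CSV built from two rows of the output above (an illustrative stand-in, not the real `Data/text.csv`).

```python
import io

import pandas as pd

# In-memory stand-in for the CSV: the first column has an empty header,
# which is what pandas turns into "Unnamed: 0" on a plain read.
csv_text = (
    ",text,label\n"
    "0,i just feel really helpless and heavy hearted,4\n"
    "3,i dont know i feel so lost,0\n"
)

# Plain read: the saved index comes back as an extra column.
with_extra = pd.read_csv(io.StringIO(csv_text))
print(list(with_extra.columns))

# Declaring the first column as the index avoids the extra column entirely.
as_index = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(list(as_index.columns))
```

Either route leaves only `text` and `label` as columns; `index_col=0` just skips the separate `drop` step.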


## Light EDA

In [23]:
# Check null values
fn.check_null_count(emosion)


| | features | dtypes | null count | null % |
|---|---|---|---|---|
| 0 | text | object | 0 | 0.0 |
| 1 | label | int64 | 0 | 0.0 |
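The `custom_function` module is not shown in this notebook, so the actual implementation of `fn.check_null_count` is unknown. The following is a hypothetical sketch of a helper that would produce a report with the same columns as the output above (`features`, `dtypes`, `null count`, `null %`), demonstrated on made-up data.

```python
import pandas as pd


def check_null_count(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical stand-in for fn.check_null_count (the real helper is not shown).

    Returns one row per column with its dtype, number of nulls, and null percentage.
    """
    return pd.DataFrame(
        {
            "features": df.columns,
            "dtypes": df.dtypes.astype(str).values,
            "null count": df.isna().sum().values,
            "null %": (df.isna().mean() * 100).round(2).values,
        }
    )


# Made-up frame with one missing text value, purely for illustration.
demo = pd.DataFrame({"text": ["a", None, "c", "d"], "label": [1, 0, 4, 2]})
report = check_null_count(demo)
print(report)
```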


In [24]:
# Check for duplicates

emosion.duplicated().sum()

686

In [27]:
# remove the duplicates

emosion = emosion.drop_duplicates()

# sanity check for duplicates

emosion.duplicated().sum()

0
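`duplicated()` with no arguments only flags rows where every column matches, so the same text could still appear more than once with different labels. Whether that happens in this dataset is not checked above; the sketch below shows, on made-up rows, how such conflicting entries could be surfaced after exact duplicates are removed.

```python
import pandas as pd

# Made-up rows: one exact duplicate, and one text that carries two different labels.
demo = pd.DataFrame(
    {
        "text": ["i feel fine", "i feel fine", "i feel lost", "i feel lost"],
        "label": [1, 1, 0, 4],
    }
)

# Exact duplicates: both text and label match an earlier row.
exact_dupes = demo.duplicated().sum()
deduped = demo.drop_duplicates()

# Same text with differing labels: these survive drop_duplicates().
conflicting = deduped[deduped.duplicated(subset="text", keep=False)]

print(exact_dupes)
print(conflicting)
```

How to handle conflicting rows (keep one, drop all, or leave them) is a modeling decision that this notebook does not address.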

### <font color='blue'>Explore each column </font>

#### Inspect label

In [31]:
# Inspect the label distribution.
# The custom explore_numeric helper is left commented out; value_counts() is used instead.

#fn.explore_numeric(emosion, 'label')

emosion['label'].value_counts()

1    140779
0    120989
3     57235
4     47664
2     34497
5     14959
Name: label, dtype: int64
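The counts are easier to read with emotion names and relative shares attached. The sketch below rebuilds the counts reported above as a Series (rather than reloading the data), maps them through the label encoding given in the dataset description, and computes each class's percentage along with the ratio between the largest and smallest class. The names `LABEL_NAMES`, `summary`, and `imbalance_ratio` are introduced here for illustration.

```python
import pandas as pd

# Label encoding as stated in the dataset description.
LABEL_NAMES = {0: "sadness", 1: "joy", 2: "love", 3: "anger", 4: "fear", 5: "surprise"}

# Counts copied from the value_counts() output above.
counts = pd.Series(
    {1: 140779, 0: 120989, 3: 57235, 4: 47664, 2: 34497, 5: 14959},
    name="count",
)

# Replace numeric labels with emotion names and add the percentage share.
summary = counts.rename(index=LABEL_NAMES).to_frame()
summary["share_pct"] = (summary["count"] / summary["count"].sum() * 100).round(2)

# Ratio between the most and least frequent class.
imbalance_ratio = summary["count"].max() / summary["count"].min()

print(summary)
print(f"largest / smallest class: {imbalance_ratio:.1f}x")
```

The classes are clearly imbalanced, with joy and sadness dominating and surprise the rarest, which is presumably why `RandomUnderSampler` is among the imports.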

In [34]:
# Let's inspect each label with its associated text, starting with label 1 (joy)

label1 = emosion[emosion['label']==1]

In [35]:
label1

| | text | label |
|---|---|---|
| 7 | i fear that they won t ever feel that delicious excitement of christmas eve at least not in the same way i remember doing it | 1 |
| 10 | i try to be nice though so if you get a bitchy person on the phone or at the window feel free to have a little fit and throw your pen at her face | 1 |
| 12 | i have officially graduated im not feeling as ecstatic as i thought i would | 1 |
| 14 | i feel my portfolio demonstrates how eager i am to learn but some who know me better might call it annoyingly persistent | 1 |
| 15 | i may be more biased than the next because i have a dependent life to take care of and to keep safe but i feel we all need to take care of ourselves as well | 1 |
| ... | ... | ... |
| 416796 | im feeling inspired by their drama today | 1 |
| 416800 | i don t know why today i feel like it looks very cool you know somedays your just like yea bring that shit on to be honest the novelty has already worn off | 1 |
| 416801 | i don t even feel like i fully resolved it but it felt right to ask it | 1 |
| 416802 | i feel like i have been neglecting you my faithful reader s | 1 |
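Rather than creating one variable per label, the same inspection can be wrapped in a small helper and looped over all six classes. The sketch below is one possible way to do this; `peek_label` and `LABEL_NAMES` are hypothetical names, and the demo frame reuses a handful of rows shown earlier in place of the full `emosion` DataFrame.

```python
import pandas as pd

# Label encoding as stated in the dataset description.
LABEL_NAMES = {0: "sadness", 1: "joy", 2: "love", 3: "anger", 4: "fear", 5: "surprise"}


def peek_label(df: pd.DataFrame, label: int, n: int = 3) -> list:
    """Return up to n texts whose label equals the given id (hypothetical helper)."""
    return df.loc[df["label"] == label, "text"].head(n).tolist()


# A few rows copied from the outputs above, standing in for the full dataset.
demo = pd.DataFrame(
    {
        "text": [
            "i just feel really helpless and heavy hearted",
            "i dont know i feel so lost",
            "im feeling inspired by their drama today",
            "i have officially graduated im not feeling as ecstatic as i thought i would",
        ],
        "label": [4, 0, 1, 1],
    }
)

# Print a few example texts for every emotion; labels absent from demo yield [].
for label_id, name in LABEL_NAMES.items():
    print(f"{label_id} ({name}): {peek_label(demo, label_id)}")
```

With the real data, calling `peek_label(emosion, 1)` would return the first few joy-labelled tweets.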
