### Text Emotions Classification using Python

Text emotions classification is the problem of assigning an emotion to a text by understanding its context and the feeling behind it. One real-world example is the iPhone keyboard, which recommends the most relevant emoji based on the text you type. If you want to learn how to classify the emotions of a text, this article is for you. In this article, I will take you through the task of text emotions classification with Machine Learning using Python.



Text emotions classification is a natural language processing problem, and more specifically a text classification problem. We need to train a text classification model that predicts the emotion expressed in a text.

To solve this problem, we need labelled data of texts and their emotions. I found an ideal dataset to solve this problem on Kaggle.

In [1]:
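import pandas as pd
import numpy as np
import tensorflow
# Import Keras through TensorFlow everywhere so that a separately installed
# `keras` package cannot conflict with the TensorFlow version in use
from tensorflow import keras
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Flatten, Dense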

ModuleNotFoundError: No module named 'tensorflow.compat'

In [None]:
tensorflow.__version__

In [None]:
#!pip install --upgrade tensorflow==2.12.0


In [None]:
data = pd.read_csv('train.txt', sep=';')
data.columns = ['Text', 'Emotions']
print(data.head())

### As this is a problem of natural language processing, I’ll start by tokenizing the data:

In [None]:
texts = data["Text"].tolist()
labels = data["Emotions"].tolist()

# Tokenize the text data
tokenizer = Tokenizer()
tokenizer.fit_on_texts(texts)
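
To make the tokenization step more concrete, here is a minimal pure-Python sketch of the idea: build a word index ordered by frequency, then map each text to a list of integer IDs. The helper names `build_word_index` and `texts_to_ids` and the sample sentences are hypothetical; this only approximates what the Keras `Tokenizer` does (the real class also filters punctuation and offers more options).

```python
from collections import Counter

def build_word_index(texts):
    """Map each word to an integer ID, most frequent word first (IDs start at 1)."""
    counts = Counter(word for text in texts for word in text.lower().split())
    # most_common() sorts by descending count; ties keep first-seen order
    return {word: idx for idx, (word, _) in enumerate(counts.most_common(), start=1)}

def texts_to_ids(texts, word_index):
    """Replace each known word with its ID; words not in the index are skipped."""
    return [[word_index[w] for w in text.lower().split() if w in word_index]
            for text in texts]

sample_texts = ["i feel happy", "i feel sad today"]  # toy data, not from the dataset
word_index = build_word_index(sample_texts)
print(word_index)   # {'i': 1, 'feel': 2, 'happy': 3, 'sad': 4, 'today': 5}
print(texts_to_ids(["i feel great"], word_index))  # [[1, 2]] because "great" is unknown
```

ID 0 is left unused here so that it can serve as the padding value later. Note that a word the index has never seen simply disappears from the sequence, which is also what happens to unseen words at prediction time when no out-of-vocabulary token is configured.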

### Now we need to pad the sequences to the same length so that we can feed them into a neural network. Here’s how to convert the texts to sequences and pad them:

In [None]:
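# Convert each text to a list of integer word IDs
sequences = tokenizer.texts_to_sequences(texts)
# Pad every sequence with zeros up to the length of the longest one
max_length = max(len(seq) for seq in sequences)
padded_sequences = pad_sequences(sequences, maxlen=max_length)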
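
By default, `pad_sequences` adds zeros at the front of short sequences and cuts long ones from the front as well. The following pure-Python sketch mimics that default behaviour on toy data; `pad_pre` is a hypothetical helper written for illustration, not part of Keras.

```python
def pad_pre(sequences, maxlen, value=0):
    """Left-pad (or left-truncate) each sequence to exactly maxlen items."""
    padded = []
    for seq in sequences:
        # Keep only the last maxlen items when the sequence is too long
        kept = list(seq)[-maxlen:] if maxlen > 0 else []
        padded.append([value] * (maxlen - len(kept)) + kept)
    return padded

toy_sequences = [[1, 2, 3], [4, 5], [6]]  # toy word-ID sequences
toy_max_length = max(len(seq) for seq in toy_sequences)
print(pad_pre(toy_sequences, toy_max_length))  # [[1, 2, 3], [0, 4, 5], [0, 0, 6]]
```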

### Now I’ll use scikit-learn’s LabelEncoder to convert the classes from strings to a numerical representation:

In [None]:
# Encode the string labels to integers
label_encoder = LabelEncoder()
labels = label_encoder.fit_transform(labels)
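
As a rough illustration of what the label encoder does, the sketch below sorts the distinct class names and assigns each one its position in that sorted list, which mirrors how `LabelEncoder` orders its `classes_`. The emotion names are toy values chosen for the example, not necessarily the ones in the dataset.

```python
toy_labels = ["joy", "sadness", "anger", "joy"]  # toy labels for illustration

# Sorted unique class names; the position in this list is the integer code
classes = sorted(set(toy_labels))
class_to_id = {name: idx for idx, name in enumerate(classes)}

encoded = [class_to_id[name] for name in toy_labels]
decoded = [classes[idx] for idx in encoded]  # the inverse mapping

print(classes)  # ['anger', 'joy', 'sadness']
print(encoded)  # [1, 2, 0, 1]
print(decoded)  # ['joy', 'sadness', 'anger', 'joy']
```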

### We are now going to one-hot encode the labels. One-hot encoding transforms each categorical label into a binary vector that is all zeros except for a single 1 at the position of its class. This representation matches the softmax output layer and the categorical cross-entropy loss used later, which work with one value per class. So here is how we can one-hot encode the labels:

In [None]:
# One-hot encode the labels
one_hot_labels = keras.utils.to_categorical(labels)
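
The same transformation can be sketched in plain NumPy by indexing an identity matrix with the integer labels. The toy labels below are assumptions for illustration; the number of classes is taken as the largest label plus one.

```python
import numpy as np

toy_int_labels = np.array([1, 2, 0, 1])        # toy integer-encoded labels
num_classes = int(toy_int_labels.max()) + 1    # 3 classes: 0, 1, 2

# Row i of the identity matrix is the one-hot vector for class i
toy_one_hot = np.eye(num_classes, dtype="float32")[toy_int_labels]
print(toy_one_hot)
# [[0. 1. 0.]
#  [0. 0. 1.]
#  [1. 0. 0.]
#  [0. 1. 0.]]

# argmax along each row recovers the original integer labels
print(toy_one_hot.argmax(axis=1))  # [1 2 0 1]
```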

### Now we will split the data into training and test sets:

In [None]:
# Split the data into training and testing sets
xtrain, xtest, ytrain, ytest = train_test_split(padded_sequences, 
                                                one_hot_labels, 
                                                test_size=0.2)
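
Conceptually, the split shuffles the row indices and cuts them into two groups, keeping features and labels aligned. Here is a rough NumPy sketch of that idea with toy arrays and a fixed seed (both assumptions for illustration; scikit-learn's implementation differs in detail).

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # fixed seed so the toy split is repeatable

toy_x = np.arange(20).reshape(10, 2)  # 10 toy samples with 2 features each
toy_y = np.arange(10)                 # one toy label per sample

indices = rng.permutation(len(toy_x))          # shuffled row indices
n_test = int(np.ceil(0.2 * len(toy_x)))        # 20% of the rows go to the test set
test_idx, train_idx = indices[:n_test], indices[n_test:]

# Index features and labels with the same indices to keep them aligned
toy_x_train, toy_x_test = toy_x[train_idx], toy_x[test_idx]
toy_y_train, toy_y_test = toy_y[train_idx], toy_y[test_idx]

print(toy_x_train.shape, toy_x_test.shape)  # (8, 2) (2, 2)
```

In the real call, passing `random_state` to `train_test_split` makes the split reproducible between runs in the same way the fixed seed does here.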

### Now let’s define a neural network architecture for our classification problem and use it to train a model to classify emotions:

In [None]:
# Define the model
model = Sequential()
model.add(Embedding(input_dim=len(tokenizer.word_index) + 1, 
                    output_dim=128, input_length=max_length))
model.add(Flatten())
model.add(Dense(units=128, activation="relu"))
model.add(Dense(units=len(one_hot_labels[0]), activation="softmax"))

model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.fit(xtrain, ytrain, epochs=10, batch_size=32, validation_data=(xtest, ytest))
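
To see what the final softmax layer and the categorical cross-entropy loss compute, here is a small NumPy sketch for a single toy example. The scores and the target are made-up values, and the two helper functions are simplified stand-ins written for illustration, not the Keras implementations.

```python
import numpy as np

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1 along the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    """Negative log-probability assigned to the true class (one-hot targets)."""
    y_prob = np.clip(y_prob, eps, 1.0)  # avoid log(0)
    return -(y_true * np.log(y_prob)).sum(axis=-1)

toy_logits = np.array([[2.0, 1.0, 0.1]])   # toy raw scores for 3 classes
toy_target = np.array([[1.0, 0.0, 0.0]])   # true class is index 0

toy_probs = softmax(toy_logits)
toy_loss = categorical_crossentropy(toy_target, toy_probs)
print(toy_probs, toy_loss)
```

The loss shrinks toward zero as the probability of the true class approaches 1, which is what training pushes the network toward.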

### Now let’s take a sentence as an input text and see how the model performs:

In [None]:
input_text = "She didn't come today because she lost her dog yesterday!"

# Preprocess the input text
input_sequence = tokenizer.texts_to_sequences([input_text])
padded_input_sequence = pad_sequences(input_sequence, maxlen=max_length)
prediction = model.predict(padded_input_sequence)
predicted_label = label_encoder.inverse_transform([np.argmax(prediction[0])])
print(predicted_label)
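
The decoding step above picks the class with the highest predicted probability. The NumPy sketch below shows the same idea on a made-up probability vector and also ranks all classes, which can help when judging how confident a prediction is. The class names and probabilities are toy values, not output from the trained model.

```python
import numpy as np

toy_classes = np.array(["anger", "joy", "sadness"])  # toy class names
toy_prediction = np.array([[0.1, 0.2, 0.7]])         # toy softmax output, shape (1, 3)

# Index of the most probable class for the first (and only) input
best = int(np.argmax(toy_prediction[0]))
print(toy_classes[best])  # sadness

# Rank every class from most to least probable
ranking = np.argsort(toy_prediction[0])[::-1]
for idx in ranking:
    print(toy_classes[idx], float(toy_prediction[0, idx]))
```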

### So this is how you can use Machine Learning for the task of text emotion classification using the Python programming language.

### Summary
Text emotion classification is the problem of assigning an emotion to a text by understanding its context and the feeling behind it. One real-world example is the iPhone keyboard, which recommends the most relevant emoji based on the text you type. I hope you liked this article on Text Emotion Classification with Machine Learning using Python. Feel free to ask questions in the comments section below.