# DataLab Task 5: Model Iterations (LSTM)

First iteration of the model. It is invalid due to overfitting, as shown at the end.



### Summary of the Notebook

This notebook focuses on building and evaluating an LSTM-based model for emotion classification using NLP features. The steps include:

1. Importing necessary libraries and loading the dataset.
2. Preprocessing features such as TF-IDF, embeddings, and other numerical data.
3. Encoding target labels and combining all features into a single dataset.
4. Splitting the data into training and testing sets and normalizing the features.
5. Reshaping the data for LSTM input and converting labels to categorical format.
6. Building and compiling an LSTM model with dropout and batch normalization layers.
7. Training the model with callbacks for early stopping, model checkpointing, and F1-score evaluation.
8. Evaluating the final model's performance using the F1-score.

In [None]:
import pandas as pd
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout, BatchNormalization
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
from sklearn.metrics import f1_score
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.model_selection import train_test_split

In [15]:
import pandas as pd
from IPython.display import display

# Load the dataset with extracted features
features = "NLP_features.xlsx"
df = pd.read_excel(features)

# Display dataset structure in table format
display(df.head())

| Unnamed: 0 | Sentence | POS_Tags | TF_IDF | Sentiment_Score | Pretrained_Embeddings | Custom_Embeddings | Sentiment_Exclamations_Questions | Personal_Pronoun_Count |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | Vous êtes embrassés? | Vous_PRON êtes_AUX embrassés_VERB ?_PUNCT | [0. 0. 0. ... 0. 0. 0.] | 0.0 | [ 0.044216 -0.0278645 -0.032453 -0.030573... | [ 5.27364027e-04 5.90693962e-04 3.06792255e-... | 0.0,0,1 | 1 |
| 1 | Oui. | Oui_ADV ._PUNCT | [0. 0. 0. ... 0. 0. 0.] | 0.01 | [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. ... | [-3.2684386e-03 4.5674204e-04 -2.1957180e-03 ... | 0.0,0,0 | 0 |
| 2 | Mais non! | Mais_CCONJ non_ADV !_PUNCT | [0. 0. 0. ... 0. 0. 0.] | -0.0125 | [ 1.6874e-01 6.2667e-03 -7.5556e-02 -8.9906e-... | [ 1.61075848e-03 3.01836710e-03 2.69862730e-... | 0.0,1,0 | 0 |
| 3 | Vous êtes embrassés? | Vous_PRON êtes_AUX embrassés_VERB ?_PUNCT | [0. 0. 0. ... 0. 0. 0.] | 0.0 | [ 0.044216 -0.0278645 -0.032453 -0.030573... | [ 5.27364027e-04 5.90693962e-04 3.06792255e-... | 0.0,0,1 | 1 |
| 4 | Oui. | Oui_ADV ._PUNCT | [0. 0. 0. ... 0. 0. 0.] | 0.01 | [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. ... | [-3.2684386e-03 4.5674204e-04 -2.1957180e-03 ... | 0.0,0,0 | 0 |


In [16]:
# Convert TF-IDF features
df["TF_IDF"] = df["TF_IDF"].apply(lambda x: np.fromstring(x.strip("[]"), sep=" ") if isinstance(x, str) else x)
tfidf_features = np.array(df["TF_IDF"].tolist())
if len(tfidf_features.shape) == 1:
    tfidf_features = tfidf_features.reshape(-1, 1)
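
The preview of the dataset shows `TF_IDF` cells such as `[0. 0. 0. ... 0. 0. 0.]`. If the stored strings themselves contain that `...` (which happens when an array is written out using NumPy's abbreviated printing), the middle values are missing and parsing cannot recover them. The sketch below is one possible stricter parser. `parse_vector` is a hypothetical helper, not part of the original notebook; it raises an error instead of silently returning a shortened vector.

```python
import numpy as np

def parse_vector(cell):
    """Parse a whitespace-separated vector string such as "[0.1 0.2 0.3]".

    Hypothetical helper: non-string cells are converted as they are, and
    strings abbreviated with "..." are rejected because values are missing.
    """
    if not isinstance(cell, str):
        return np.asarray(cell, dtype=float)
    if "..." in cell:
        raise ValueError("vector string is abbreviated with '...'; values are missing")
    # split() handles any run of spaces or newlines between the numbers
    return np.array(cell.strip("[]").split(), dtype=float)

print(parse_vector("[0.1 0.2 0.3]"))
```

Under these assumptions it could be applied with `df["TF_IDF"].apply(parse_vector)`, which makes an abbreviated column fail loudly at load time.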


In [17]:
# Convert embeddings
df["Pretrained_Embeddings"] = df["Pretrained_Embeddings"].apply(lambda x: np.fromstring(x.strip("[]"), sep=" ") if isinstance(x, str) else x)
df["Custom_Embeddings"] = df["Custom_Embeddings"].apply(lambda x: np.fromstring(x.strip("[]"), sep=" ") if isinstance(x, str) else x)
pretrained_embeddings = np.array(df["Pretrained_Embeddings"].tolist())
custom_embeddings = np.array(df["Custom_Embeddings"].tolist())
if len(pretrained_embeddings.shape) == 1:
    pretrained_embeddings = pretrained_embeddings.reshape(-1, 1)
if len(custom_embeddings.shape) == 1:
    custom_embeddings = custom_embeddings.reshape(-1, 1)

# Convert other numerical features
df["Sentiment_Score"] = df["Sentiment_Score"].astype(float)
df["Personal_Pronoun_Count"] = df["Personal_Pronoun_Count"].astype(float)
other_features = df[["Sentiment_Score", "Personal_Pronoun_Count"]].values

In [18]:
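# Encode target labels
# NOTE: the target is built from the Sentiment_Exclamations_Questions column,
# a composite string such as "0.0,0,1"; each distinct string becomes one class.
label_encoder = LabelEncoder()
df["Emotion_Label"] = label_encoder.fit_transform(df["Sentiment_Exclamations_Questions"])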

# Combine all features
X = np.hstack((tfidf_features, pretrained_embeddings, custom_embeddings, other_features))
y = df["Emotion_Label"].astype(int).values
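
`LabelEncoder` assigns integer codes to the sorted distinct values of a column. The minimal pure-Python sketch below mirrors that mapping to show what the classes look like for the composite strings in the dataset preview. `encode_labels` is a hypothetical stand-in for illustration, not scikit-learn's implementation.

```python
def encode_labels(values):
    """Map each distinct value to an integer, following the sorted order of the values."""
    classes = sorted(set(values))
    index = {label: i for i, label in enumerate(classes)}
    return [index[v] for v in values], classes

# Composite strings copied from the five preview rows of the dataset
sample = ["0.0,0,1", "0.0,0,0", "0.0,1,0", "0.0,0,1", "0.0,0,0"]
codes, classes = encode_labels(sample)
print(classes)  # ['0.0,0,0', '0.0,0,1', '0.0,1,0']
print(codes)    # [1, 0, 2, 1, 0]
```

Each class here is a combination of sentiment, exclamation, and question values, so the number of classes grows with the number of distinct combinations in the data.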

In [19]:
# Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In [20]:
# Normalize numerical features
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
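
The scaler is fitted on the training split only, so the test split is transformed with training statistics and does not influence them. A minimal NumPy sketch of the same idea follows; the helper names and toy arrays are assumptions for illustration. Constant columns (for example a TF-IDF term that never occurs) have zero standard deviation, and the sketch divides those by 1, which is also how `StandardScaler` treats them.

```python
import numpy as np

def fit_standardizer(X_train):
    """Compute per-column mean and standard deviation from the training data only."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0)   # population standard deviation (ddof=0)
    std[std == 0.0] = 1.0       # avoid division by zero for constant columns
    return mean, std

def apply_standardizer(X, mean, std):
    """Scale any split with statistics that were computed on the training split."""
    return (X - mean) / std

# Toy data: the second column is constant, like an unused TF-IDF term
X_train_demo = np.array([[1.0, 0.0], [3.0, 0.0], [5.0, 0.0]])
X_test_demo = np.array([[7.0, 0.0]])

mean, std = fit_standardizer(X_train_demo)
train_scaled = apply_standardizer(X_train_demo, mean, std)
test_scaled = apply_standardizer(X_test_demo, mean, std)
print(train_scaled.mean(axis=0), test_scaled)
```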

In [21]:
# Reshape data for LSTM
X_train_reshaped = X_train_scaled.reshape((X_train_scaled.shape[0], 1, X_train_scaled.shape[1]))
X_test_reshaped = X_test_scaled.reshape((X_test_scaled.shape[0], 1, X_test_scaled.shape[1]))

# Convert labels to categorical
num_classes = len(np.unique(y))
y_train_categorical = to_categorical(y_train, num_classes=num_classes)
y_test_categorical = to_categorical(y_test, num_classes=num_classes)
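
The reshape above turns each sample into a sequence of length 1, and the labels become one-hot rows. The NumPy-only sketch below shows the resulting shapes on toy data. The arrays are hypothetical, and `np.eye(...)[y]` is used here as an equivalent of `to_categorical` for integer labels that run from 0 to `num_classes - 1`.

```python
import numpy as np

X_demo = np.arange(12, dtype=float).reshape(4, 3)  # 4 samples, 3 features
y_demo = np.array([0, 2, 1, 2])                    # integer class labels

# (samples, features) -> (samples, timesteps=1, features)
X_seq = X_demo.reshape((X_demo.shape[0], 1, X_demo.shape[1]))

# One-hot encode: row i of the identity matrix is the encoding of class i
num_classes_demo = len(np.unique(y_demo))
y_onehot = np.eye(num_classes_demo)[y_demo]

print(X_seq.shape)     # (4, 1, 3)
print(y_onehot.shape)  # (4, 3)
```

With a single time step per sample, the recurrent layers receive the whole feature vector at once, so no order across tokens is presented to the LSTM.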

In [22]:
# Build LSTM model
model = Sequential([
    LSTM(128, return_sequences=True, input_shape=(1, X_train_scaled.shape[1])),
    BatchNormalization(),
    Dropout(0.3),
    LSTM(64, return_sequences=False),
    BatchNormalization(),
    Dropout(0.3),
    Dense(32, activation='relu'),
    Dense(num_classes, activation='softmax')
])

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
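
A rough way to gauge overfitting risk is to compare the number of trainable weights with the number of training samples. The sketch below applies the standard parameter-count formulas to the architecture above. The input dimension and class count passed at the end are placeholder assumptions, since the notebook does not print the real values.

```python
def lstm_params(units, input_dim):
    # Four gates, each with input weights, recurrent weights, and a bias vector
    return 4 * (units * (input_dim + units) + units)

def dense_params(units, input_dim):
    # Weight matrix plus one bias per unit
    return input_dim * units + units

def batchnorm_params(features):
    # Only gamma and beta are trainable; the moving statistics are not
    return 2 * features

def trainable_params(input_dim, num_classes):
    """Trainable weights of the LSTM(128) -> LSTM(64) -> Dense(32) -> Dense stack."""
    return (lstm_params(128, input_dim) + batchnorm_params(128)
            + lstm_params(64, 128) + batchnorm_params(64)
            + dense_params(32, 64) + dense_params(num_classes, 32))

# Placeholder values: the real feature count and class count are not shown
print(trainable_params(input_dim=10, num_classes=5))
```

Calling `model.summary()` on the compiled model reports the actual figure for the real input dimension.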


In [23]:
# Define F1-score callback
class F1ScoreCallback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        y_pred = np.argmax(self.model.predict(X_test_reshaped), axis=1)
        f1 = f1_score(y_test, y_pred, average='weighted')
        print(f' - F1 Score: {f1:.4f}')

# Set callbacks
callbacks = [
    EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True),
    ModelCheckpoint('best_lstm_model.h5', save_best_only=True),
    F1ScoreCallback()
]

In [24]:
# Train the model
history = model.fit(
    X_train_reshaped, y_train_categorical,
    validation_data=(X_test_reshaped, y_test_categorical),
    epochs=50,
    batch_size=32,
    callbacks=callbacks
)

Epoch 1/50
6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 49ms/step
 - F1 Score: 0.8782
21/21 ━━━━━━━━━━━━━━━━━━━━ 4s 56ms/step - accuracy: 0.0862 - loss: 2.7119 - val_accuracy: 0.9080 - val_loss: 2.5634
Epoch 2/50
6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step
 - F1 Score: 0.8938
21/21 ━━━━━━━━━━━━━━━━━━━━ 0s 14ms/step - accuracy: 0.6919 - loss: 1.3809 - val_accuracy: 0.9202 - val_loss: 2.2337
Epoch 3/50
6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step
 - F1 Score: 0.8938
21/21 ━━━━━━━━━━━━━━━━━━━━ 0s 15ms/step - accuracy: 0.9144 - loss: 0.6653 - val_accuracy: 0.9202 - val_loss: 1.8667
Epoch 4/50
6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step
 - F1 Score: 0.9004
21/21 ━━━━━━━━━━━━━━━━━━━━ 0s 15ms/step - accuracy: 0.9470 - loss: 0.3448 - val_accuracy: 0.9264 - val_loss: 1.6118
Epoch 5/50
6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step
 - F1 Score: 0.9004
21/21 ━━━━━━━━━━━━━━━━━━━━ 0s 15ms/step - accuracy: 0.9597 - loss: 0.2317 - val_accuracy: 0.9264 - val_loss: 1.3930
Epoch 6/50
6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step
 - F1 Score: 0.9098
21/21 ━━━━━━━━━━━━━━━━━━━━ 0s 14ms/step - accuracy: 0.9598 - loss: 0.2124 - val_accuracy: 0.9325 - val_loss: 1.1844
Epoch 7/50
6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step
 - F1 Score: 0.9098
21/21 ━━━━━━━━━━━━━━━━━━━━ 0s 15ms/step - accuracy: 0.9644 - loss: 0.1747 - val_accuracy: 0.9325 - val_loss: 0.9547
Epoch 8/50
6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step
 - F1 Score: 0.9098
21/21 ━━━━━━━━━━━━━━━━━━━━ 0s 14ms/step - accuracy: 0.9652 - loss: 0.1664 - val_accuracy: 0.9325 - val_loss: 0.7811
Epoch 9/50
6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step
 - F1 Score: 0.9098
21/21 ━━━━━━━━━━━━━━━━━━━━ 0s 17ms/step - accuracy: 0.9850 - loss: 0.0920 - val_accuracy: 0.9325 - val_loss: 0.6582
Epoch 10/50
6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 9ms/step
 - F1 Score: 0.9098
21/21 ━━━━━━━━━━━━━━━━━━━━ 0s 20ms/step - accuracy: 0.9684 - loss: 0.1438 - val_accuracy: 0.9325 - val_loss: 0.5349
Epoch 11/50
6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step
 - F1 Score: 0.9098
21/21 ━━━━━━━━━━━━━━━━━━━━ 0s 16ms/step - accuracy: 0.9737 - loss: 0.1142 - val_accuracy: 0.9325 - val_loss: 0.4724
Epoch 12/50
6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step
 - F1 Score: 0.9038
21/21 ━━━━━━━━━━━━━━━━━━━━ 0s 15ms/step - accuracy: 0.9758 - loss: 0.0997 - val_accuracy: 0.9264 - val_loss: 0.4430
Epoch 13/50
6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step
 - F1 Score: 0.9098
21/21 ━━━━━━━━━━━━━━━━━━━━ 0s 14ms/step - accuracy: 0.9735 - loss: 0.0766 - val_accuracy: 0.9325 - val_loss: 0.4280
Epoch 14/50
6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step
 - F1 Score: 0.9098
21/21 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step - accuracy: 0.9835 - loss: 0.0699 - val_accuracy: 0.9325 - val_loss: 0.4341
Epoch 15/50
6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step
 - F1 Score: 0.9098
21/21 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step - accuracy: 0.9837 - loss: 0.0540 - val_accuracy: 0.9325 - val_loss: 0.4376
Epoch 16/50
6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step
 - F1 Score: 0.9038
21/21 ━━━━━━━━━━━━━━━━━━━━

In [25]:
# Evaluate final model
final_predictions = np.argmax(model.predict(X_test_reshaped), axis=1)
final_f1 = f1_score(y_test, final_predictions, average='weighted')
print(f'Final F1 Score: {final_f1:.4f}')

6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step
Final F1 Score: 0.9098
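
The weighted F1 score averages per-class F1 values by class frequency, so a dominant class can keep the score high even when rare classes are never predicted. The pure-Python sketch below compares weighted and macro averaging on a hypothetical imbalanced example. The function names and toy labels are assumptions for illustration, not the notebook's data.

```python
from collections import Counter

def per_class_f1(y_true, y_pred):
    """Return {label: F1} with F1 = 2*TP / (2*TP + FP + FN), or 0.0 if undefined."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = {}
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores

def averaged_f1(y_true, y_pred):
    """Return (weighted F1, macro F1) for integer or string labels."""
    scores = per_class_f1(y_true, y_pred)
    support = Counter(y_true)
    n = len(y_true)
    weighted = sum(scores[c] * support[c] / n for c in scores)
    macro = sum(scores.values()) / len(scores)
    return weighted, macro

# Toy case: 9 of 10 samples belong to class 0 and the model always predicts class 0
y_true_demo = [0] * 9 + [1]
y_pred_demo = [0] * 10
weighted_f1, macro_f1 = averaged_f1(y_true_demo, y_pred_demo)
print(f"weighted={weighted_f1:.4f} macro={macro_f1:.4f}")
```

In scikit-learn the same comparison is available through `f1_score(..., average='macro')`; reporting both values makes it easier to see whether minority classes are being learned.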


### Model Performance Improvements and Choices for Emotion Classification

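To improve model performance, we combined TF-IDF vectors, pretrained embeddings, and custom embeddings with two numerical features (sentiment score and personal pronoun count). The LSTM model includes dropout and batch normalization to reduce overfitting. Early stopping (patience of 5 epochs on validation loss, restoring the best weights) and model checkpointing keep the best-performing weights. Note that the validation data passed to training is the test set, so the reported F1 score is not measured on fully held-out data.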
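
Because early stopping and checkpointing select weights based on validation loss, one common alternative is to monitor a separate validation split and keep the test split untouched until the final evaluation. The sketch below shows such a three-way index split with NumPy. The function name and the split fractions are assumptions, not values from the notebook.

```python
import numpy as np

def three_way_split(n_samples, val_frac=0.1, test_frac=0.2, seed=42):
    """Return disjoint index arrays for training, validation, and test sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)          # shuffled sample indices
    n_test = int(round(n_samples * test_frac))
    n_val = int(round(n_samples * val_frac))
    test_idx = order[:n_test]
    val_idx = order[n_test:n_test + n_val]
    train_idx = order[n_test + n_val:]
    return train_idx, val_idx, test_idx

train_idx, val_idx, test_idx = three_way_split(100)
print(len(train_idx), len(val_idx), len(test_idx))  # 70 10 20
```

Under this setup, `validation_data` would receive the rows selected by `val_idx`, and the rows selected by `test_idx` would be used only once for the final F1 score.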

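These choices target emotion classification in spoken language: embeddings carry semantic information and the numerical features add context. LSTMs can capture sequential dependencies, but here each sentence is fed as a single time step, so the network sees one combined feature vector per sentence rather than a token sequence. Dropout and batch normalization are meant to limit overfitting; even so, by the last logged epochs the training loss (about 0.05) is well below the validation loss (about 0.43), which is why this iteration is considered overfitted.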
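
The gap between training and validation loss can be read directly from the training log. The sketch below uses the per-epoch loss values copied from that log (epochs 1 to 15, the last complete ones) to find the epoch with the lowest validation loss and the gap at the final logged epoch. The gap is only a rough indicator, since the training loss is computed with dropout active while the validation loss is not.

```python
# (epoch, training loss, validation loss) copied from the training log
history_log = [
    (1, 2.7119, 2.5634), (2, 1.3809, 2.2337), (3, 0.6653, 1.8667),
    (4, 0.3448, 1.6118), (5, 0.2317, 1.3930), (6, 0.2124, 1.1844),
    (7, 0.1747, 0.9547), (8, 0.1664, 0.7811), (9, 0.0920, 0.6582),
    (10, 0.1438, 0.5349), (11, 0.1142, 0.4724), (12, 0.0997, 0.4430),
    (13, 0.0766, 0.4280), (14, 0.0699, 0.4341), (15, 0.0540, 0.4376),
]

# Epoch with the lowest validation loss (what restore_best_weights would keep)
best_epoch, _, best_val_loss = min(history_log, key=lambda row: row[2])

# Difference between validation and training loss at the last complete epoch
last_epoch, last_loss, last_val_loss = history_log[-1]
gap = last_val_loss - last_loss

print(best_epoch, best_val_loss)   # 13 0.428
print(last_epoch, round(gap, 4))   # 15 0.3836
```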