Approach:

- Convert each image into a 1D array
- Save the arrays into a CSV file
- Use the CSV file to train the model
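
The three steps above rely on flattening being lossless as long as the original shape is known. A minimal sketch with a synthetic 48x48 array (the random image is a stand-in for a real face crop, not data from the dataset):

```python
import numpy as np

# Synthetic 48x48 grayscale "image" (stand-in for a real face crop)
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(48, 48), dtype=np.uint8)

# Step 1: flatten to a 1D array of 2304 pixel values
flat = img.flatten()

# Step 2: serialize as a space-separated string, the form stored in one CSV cell
pixel_str = ' '.join(map(str, flat))

# Step 3: parse the string back and restore the 2D shape before training
restored = np.array(pixel_str.split(), dtype='float32').astype(np.uint8).reshape(48, 48)

print(flat.shape)                     # (2304,)
print(np.array_equal(img, restored))  # True
```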

## Supervised Learning

Step 1: Problem Definition and Objective Setting

    The goal is to classify facial expressions into emotion categories (e.g., anger, happiness, sadness, etc.).
    We’ll use accuracy as the primary evaluation metric.
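
Accuracy is the fraction of predictions that match the true label. A minimal sketch with made-up labels (the arrays are illustrative, not model output) shows how overall accuracy can hide weak per-class performance, which matters when some emotions have fewer samples than others:

```python
import numpy as np

# Hypothetical integer labels for 3 classes (illustrative values only)
y_true = np.array([0, 0, 0, 0, 0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 0, 0, 0, 0, 0, 0, 1, 0, 2])

# Overall accuracy: share of samples predicted correctly
accuracy = float(np.mean(y_true == y_pred))

# Per-class recall: accuracy restricted to the samples of each class
recall = {int(c): float(np.mean(y_pred[y_true == c] == c)) for c in np.unique(y_true)}

print(accuracy)  # 0.8
print(recall)    # {0: 1.0, 1: 0.5, 2: 0.5}
```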

Import Modules

In [3]:

import pandas as pd
import os
import cv2  # OpenCV for image processing
import numpy as np
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from tensorflow.keras.optimizers import Adam
from sklearn.metrics import confusion_matrix,classification_report

Classes: one folder per emotion (anger, happy, sad, ...)

Directory layout:


In [None]:
"""

emotion_dataset
    anger
        1.jpg
        2.jpg
        ...
    happy
        1.jpg
    sad
        1.jpg
        
"""

Step 2: Data Collection and Preparation

    Download the FER-2013 dataset from Kaggle and unzip it to a directory, for example, fer2013/.

Approach used: load the images into a CSV file. The CSV file will then be used to train the model.

- Load Image Dataset into CSV

Here’s a complete example using Python, assuming you have a folder of images and you want to create a CSV file with pixel values and labels (if applicable).

Assumptions:

    - Images are in a directory named emotion_image_datasets/.
    - Labels are either embedded in the file names (e.g., happy_01.jpg, sad_02.jpg) or you have a separate list of labels.
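
When labels are embedded in file names, it helps to isolate the parsing in one place so that malformed names are caught early. A small sketch, assuming the `<label>_<index>.<ext>` pattern from the example names above (the helper name `label_from_filename` is hypothetical):

```python
import os

def label_from_filename(filename):
    """Return the label before the first underscore, e.g. 'happy_01.jpg' -> 'happy'."""
    stem, _ext = os.path.splitext(os.path.basename(filename))
    label, sep, _rest = stem.partition('_')
    if not sep or not label:
        # No underscore (or empty prefix): the naming assumption does not hold
        raise ValueError(f'Cannot extract label from {filename!r}')
    return label

print(label_from_filename('happy_01.jpg'))       # happy
print(label_from_filename('images/sad_02.png'))  # sad
```

Raising an error for names without an underscore avoids silently creating a bogus class such as `photo.jpg`.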

Use this code when the labels are embedded in the file names:

In [None]:
# emotion_dataset
#     anger_1.jpg
#     anger_2.jpg
#     happy_1.jpg
#     sad_1.jpg

# Directory containing images
image_dir = 'emotion_image_datasets/'
image_size = (48, 48)  # Desired image size (48x48 for FER-2013)

# Initialize lists to hold data
data = []
labels = []

# Loop through all images in the directory
for filename in os.listdir(image_dir):
    if filename.lower().endswith(('.jpg', '.png')):
        img_path = os.path.join(image_dir, filename)
        
        # Load the image using OpenCV
        img = cv2.imread(img_path, cv2.IMREAD_GRAYSCALE)  # Read image in grayscale
        if img is None:  # cv2.imread returns None for unreadable files
            continue
        img = cv2.resize(img, image_size)  # Resize image to the desired size

        # Flatten the image to a 1D array
        img_flat = img.flatten()
        
        # Extract label from filename (if applicable)
        # Example: 'happy_01.jpg' -> label 'happy'
        label = filename.split('_')[0]  # Assuming label is the first part before '_'
        
        # Append image data and label to lists
        data.append(img_flat)
        labels.append(label)

# Create a DataFrame from the data
df = pd.DataFrame(data)
df['emotion'] = labels  # Add the labels as a new column

# Save the DataFrame to a CSV file
csv_path = 'emotion_datasets.csv'
df.to_csv(csv_path, index=False)

print(f'Data saved to {csv_path}')


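OR, if each emotion has its own subfolder, use this variant. It writes the Pixels and Emotion columns that the loading code further down expects: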

In [1]:
# emotion_dataset
#     anger
#         1.jpg
#         2.jpg
#         ...
#     happy
#         1.jpg
#     sad
#         1.jpg

# Directory containing subfolders of images
dataset_dir = 'emotion_image_datasets/'
image_size = (48, 48)  # Desired image size (e.g., 48x48 for FER-2013)

# Initialize lists to hold data
data = []
labels = []

# Loop through each folder in the dataset directory
for label in os.listdir(dataset_dir):
    folder_path = os.path.join(dataset_dir, label)
    
    # Check if it is a directory
    if os.path.isdir(folder_path):
        # Loop through each image in the subfolder
        for filename in os.listdir(folder_path):
            if filename.lower().endswith(('.jpg', '.png')):
                img_path = os.path.join(folder_path, filename)
                
                # Load the image using OpenCV
                img = cv2.imread(img_path, cv2.IMREAD_GRAYSCALE)  # Read image in grayscale
                if img is None:  # cv2.imread returns None for unreadable files
                    continue
                img = cv2.resize(img, image_size)  # Resize image to the desired size

                # Flatten the image to a 1D array
                img_flat = img.flatten()
                
                # Convert the flattened image array to a space-separated string
                img_flat_str = ' '.join(map(str, img_flat))
                
                # Append image data and corresponding label to lists
                data.append(img_flat_str)  # Use string format for pixel values
                labels.append(label)  # Use folder name as label

# Create a DataFrame from the data
df = pd.DataFrame({
    'Pixels': data,      # Assign the pixel data
    'Emotion': labels    # Assign the labels
})

# Save the DataFrame to a CSV file
csv_path = 'emotion_datasets.csv'
df.to_csv(csv_path, index=False)

print(f'Data saved to {csv_path}')


Data saved to emotion_datasets.csv


After the images are loaded into the CSV file, use that file to prepare the data:

- load the csv
- split the dataset (csv data) into Training, Validation, Testing
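
One possible refinement of the split step is stratification, which keeps the class proportions roughly equal across the three subsets. The sketch below uses synthetic data and a 70/15/15 split; the `stratify` argument is an addition that the notebook's own split does not use:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 200 samples, 4 balanced classes
X = np.arange(200).reshape(200, 1)
y = np.repeat(np.arange(4), 50)

# First split off 30% of the data, preserving class proportions
X_train, X_temp, y_train, y_temp = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y)

# Then halve the held-out part into validation and test sets
X_val, X_test, y_val, y_test = train_test_split(
    X_temp, y_temp, test_size=0.5, random_state=42, stratify=y_temp)

print(len(X_train), len(X_val), len(X_test))  # 140 30 30
print(np.bincount(y_train))                   # [35 35 35 35]
```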

Load

In [4]:


# Load data (assuming the data is stored as a CSV with pixel values and labels)
data_path = 'emotion_datasets.csv'
data = pd.read_csv(data_path)

# Check the structure
print(data.head())
print(data.columns.tolist()) # Print the list of column names



                                              Pixels Emotion
0  46 34 26 23 12 32 35 26 26 38 71 99 123 134 14...   angry
1  55 33 54 32 26 21 50 29 38 45 79 73 43 22 51 5...   angry
2  123 123 126 131 124 69 109 149 159 174 169 178...   angry
3  255 255 255 255 255 255 255 255 255 251 254 16...   angry
4  127 121 124 137 123 118 120 111 111 109 118 14...   angry
['Pixels', 'Emotion']


Preprocess & Split

In [5]:
# Prepare data arrays for images and labels
X = []
y = []

for _, row in data.iterrows():
    # Use 'Pixels' with uppercase 'P'
    pixels = np.array(row['Pixels'].split(), dtype='float32').reshape(48, 48, 1)
    X.append(pixels)
    y.append(row['Emotion'])  # Use 'Emotion' with uppercase 'E'

# Convert lists to NumPy arrays
X = np.array(X)
y = np.array(y)

# Integer-encode the string labels (needed for sparse_categorical_crossentropy)
from sklearn.preprocessing import LabelEncoder
label_encoder = LabelEncoder()
y_encoded = label_encoder.fit_transform(y)

# Split data into training (70%), validation (15%), and test (15%) sets
X_train, X_temp, y_train, y_temp = train_test_split(X, y_encoded, test_size=0.3, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(X_temp, y_temp, test_size=0.5, random_state=42)
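
LabelEncoder assigns integer codes in sorted order of the class names, and that mapping is needed later to turn predicted indices back into emotion names. A short sketch on a toy label array (the label values are illustrative):

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Toy string labels (illustrative; the real ones come from the 'Emotion' column)
y = np.array(['sad', 'angry', 'happy', 'angry', 'sad'])

label_encoder = LabelEncoder()
y_encoded = label_encoder.fit_transform(y)

# classes_ holds the class names; the position of each name is its integer code
mapping = {str(name): code for code, name in enumerate(label_encoder.classes_)}

print(y_encoded.tolist())  # [2, 0, 1, 0, 2]
print(mapping)             # {'angry': 0, 'happy': 1, 'sad': 2}

# inverse_transform converts integer codes back to the original names
print(label_encoder.inverse_transform([1, 2]).tolist())  # ['happy', 'sad']
```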

Before training the model, check the data types of your training labels:

In [6]:
print("Data type of y_train:", y_train.dtype)  # Should be an integer type (e.g., int32)

Data type of y_train: int64


If it’s not, you can convert it:


In [None]:
# y_train = y_train.astype(np.int32)  # Ensure it's in the correct integer format
# y_val = y_val.astype(np.int32)

Step 3: Feature Engineering

For image data, feature engineering is often handled by convolutional layers in a CNN. However, we can normalize pixel values for better model convergence.

In [7]:
# Normalize pixel values
X_train = X_train / 255.0
X_val = X_val / 255.0
X_test = X_test / 255.0

Step 4: Model Selection

A Convolutional Neural Network (CNN) is a suitable choice for image classification tasks.

In [8]:


# Model definition
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(48, 48, 1)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Conv2D(128, (3, 3), activation='relu'),
    Flatten(),
    Dense(128, activation='relu'),
    Dropout(0.5),
    Dense(7, activation='softmax')  # 7 output neurons for 7 emotion classes
])

model.summary()
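
To see where the sizes in the model summary come from, the spatial dimensions can be traced by hand. With the Keras defaults (Conv2D uses 'valid' padding and stride 1; MaxPooling2D with a 2x2 window uses stride 2 and floors odd sizes), a rough sketch of the arithmetic is shown below. The helper names are hypothetical.

```python
def conv_out(size, kernel=3):
    """Output width of a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output width of non-overlapping max pooling (odd sizes are floored)."""
    return size // pool

def conv_params(kernel, in_ch, out_ch):
    """Kernel weights plus one bias per output channel."""
    return kernel * kernel * in_ch * out_ch + out_ch

size = 48
size = conv_out(size)   # 46 after Conv2D(32)
size = pool_out(size)   # 23 after MaxPooling2D
size = conv_out(size)   # 21 after Conv2D(64)
size = pool_out(size)   # 10 after MaxPooling2D
size = conv_out(size)   # 8 after Conv2D(128)

flat_features = size * size * 128   # input width of the first Dense layer

total_params = (
    conv_params(3, 1, 32)
    + conv_params(3, 32, 64)
    + conv_params(3, 64, 128)
    + (flat_features * 128 + 128)   # Dense(128)
    + (128 * 7 + 7)                 # Dense(7)
)

print(flat_features)  # 8192
print(total_params)   # 1142279
```

Most of the parameters sit in the first Dense layer (8192 x 128 weights), so the fully connected part is the natural place to look when trying to shrink the model.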




Step 5: Model Training

Compile and train the model on the training dataset.

In [9]:

# Compile model
model.compile(optimizer=Adam(), loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Train model
history = model.fit(X_train, y_train, epochs=30, validation_data=(X_val, y_val), batch_size=64)


Epoch 1/30
79/79 ━━━━━━━━━━━━━━━━━━━━ 16s 139ms/step - accuracy: 0.2083 - loss: 1.8739 - val_accuracy: 0.2479 - val_loss: 1.7998
Epoch 2/30
79/79 ━━━━━━━━━━━━━━━━━━━━ 10s 132ms/step - accuracy: 0.2376 - loss: 1.8253 - val_accuracy: 0.2683 - val_loss: 1.7610
Epoch 3/30
79/79 ━━━━━━━━━━━━━━━━━━━━ 10s 128ms/step - accuracy: 0.2977 - loss: 1.7494 - val_accuracy: 0.3315 - val_loss: 1.6741
Epoch 4/30
79/79 ━━━━━━━━━━━━━━━━━━━━ 10s 120ms/step - accuracy: 0.3402 - loss: 1.6822 - val_accuracy: 0.3835 - val_loss: 1.5949
Epoch 5/30
79/79 ━━━━━━━━━━━━━━━━━━━━ 10s 124ms/step - accuracy: 0.3976 - loss: 1.5856 - val_accuracy: 0.4076 - val_loss: 1.5519
Epoch 6/30
79/79 ━━━━━━━━━━━━━━━━━━━━ 10s 121ms/step - accuracy: 0.4229 - loss: 1.5261 - val_accuracy: 0.4271 - val_loss: 1.5203
Epoch 7/30
...

Step 6: Model Evaluation

Evaluate the model on the validation set to check its performance.

In [10]:
# Evaluate on validation data
val_loss, val_accuracy = model.evaluate(X_val, y_val)
print(f'Validation Accuracy: {val_accuracy:.2f}')

34/34 ━━━━━━━━━━━━━━━━━━━━ 1s 24ms/step - accuracy: 0.4828 - loss: 2.3694
Validation Accuracy: 0.47


Step 7: Hyperparameter Tuning

You could apply a grid search for tuning, though it’s computationally intensive. Here, we can try adjusting the learning rate, batch size, or network layers.
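
A grid search trains one model per combination of candidate values and keeps the best validation score. The sketch below shows only the search loop. `train_and_score` is a hypothetical placeholder that returns a made-up score; in practice it would build, compile, and fit the CNN with the given settings and return the validation accuracy. The candidate values are illustrative.

```python
from itertools import product

# Candidate values (illustrative, not tuned recommendations)
param_grid = {
    'learning_rate': [1e-2, 1e-3, 1e-4],
    'batch_size': [32, 64],
}

def train_and_score(learning_rate, batch_size):
    """Hypothetical placeholder: replace with real training that returns
    validation accuracy. The formula below only makes the sketch runnable."""
    return 1.0 - abs(learning_rate - 1e-3) - abs(batch_size - 64) / 1000

best_score, best_params = float('-inf'), None
names = list(param_grid)
for values in product(*(param_grid[n] for n in names)):
    params = dict(zip(names, values))
    score = train_and_score(**params)
    if score > best_score:
        best_score, best_params = score, params

print(best_params)  # {'learning_rate': 0.001, 'batch_size': 64}
```

With three learning rates and two batch sizes this already means six full training runs, which is why the cost grows quickly as more parameters are added.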


Step 8: Model Testing

After final tuning, evaluate the model on the test set.

In [11]:
# Evaluate on test data
test_loss, test_accuracy = model.evaluate(X_test, y_test)
print(f'Test Accuracy: {test_accuracy:.2f}')


34/34 ━━━━━━━━━━━━━━━━━━━━ 1s 22ms/step - accuracy: 0.4354 - loss: 2.6842
Test Accuracy: 0.45
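
Accuracy alone does not show which emotions are confused with each other. The `confusion_matrix` and `classification_report` functions imported at the top can provide that breakdown. A minimal sketch with made-up labels follows; with the real model, `y_pred` would come from something like `np.argmax(model.predict(X_test), axis=1)`.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, classification_report

# Made-up true and predicted class indices for three classes
class_names = ['angry', 'happy', 'sad']
y_true = np.array([0, 0, 1, 1, 1, 2, 2, 2])
y_pred = np.array([0, 2, 1, 1, 0, 2, 2, 0])

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_true, y_pred)
print(cm)
# [[1 0 1]
#  [1 2 0]
#  [1 0 2]]

# Precision, recall, and F1 score per class
print(classification_report(y_true, y_pred, target_names=class_names))
```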


Other Evaluations

In [12]:
history.history  # Returns a dictionary of per-epoch metrics

{'accuracy': [0.22730891406536102,
  0.2490047812461853,
  0.30254778265953064,
  0.3509156107902527,
  0.39530253410339355,
  0.41600319743156433,
  0.41938695311546326,
  0.45561304688453674,
  0.4852707087993622,
  0.506369411945343,
  0.5201035141944885,
  0.5519506335258484,
  0.5696656107902527,
  0.5915604829788208,
  0.6150477528572083,
  0.6453025341033936,
  0.6707802414894104,
  0.696457028388977,
  0.7066082954406738,
  0.7366639971733093,
  0.7559713125228882,
  0.777667224407196,
  0.7804538011550903,
  0.7991639971733093,
  0.8142914175987244,
  0.8172770738601685,
  0.8240445852279663,
  0.847332775592804,
  0.8471337556838989,
  0.8558917045593262],
 'loss': [1.8520427942276,
  1.803928256034851,
  1.7375102043151855,
  1.6598913669586182,
  1.5907431840896606,
  1.5333325862884521,
  1.4815630912780762,
  1.4241293668746948,
  1.3769943714141846,
  1.2992082834243774,
  1.2474925518035889,
  1.1906825304031372,
  1.1313891410827637,
  1.0763237476348877,
  1.002203345

In [13]:
print(history.history.keys())

dict_keys(['accuracy', 'loss', 'val_accuracy', 'val_loss'])


In [14]:
training_accuracy      = history.history['accuracy'][-1]
training_loss          = history.history['loss'][-1]
validation_accuracy    = history.history['val_accuracy'][-1]
validation_loss       = history.history['val_loss'][-1]
print("Training Accuracy   :", training_accuracy )
print("Training Loss       :", training_loss)
print("Validation Accuracy :", validation_accuracy)
print("Validation Loss     :", validation_loss)

Training Accuracy   : 0.8558917045593262
Training Loss       : 0.36520451307296753
Validation Accuracy : 0.470752090215683
Validation Loss     : 2.477069854736328
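
The gap between training accuracy (about 0.86) and validation accuracy (about 0.47) suggests that the model overfits the training data. One way to quantify this from a history dictionary is to locate the epoch with the lowest validation loss. The sketch uses a small made-up history, and the helper `best_epoch` is hypothetical.

```python
def best_epoch(history_dict):
    """Return (1-based epoch, val_loss) where validation loss is lowest."""
    val_loss = history_dict['val_loss']
    idx = min(range(len(val_loss)), key=val_loss.__getitem__)
    return idx + 1, val_loss[idx]

# Made-up history with the same keys Keras reports
toy_history = {
    'accuracy':     [0.30, 0.45, 0.60, 0.75, 0.85],
    'val_accuracy': [0.28, 0.40, 0.46, 0.47, 0.47],
    'loss':         [1.80, 1.50, 1.10, 0.70, 0.40],
    'val_loss':     [1.75, 1.55, 1.50, 1.90, 2.40],
}

epoch, loss = best_epoch(toy_history)
gap = toy_history['accuracy'][-1] - toy_history['val_accuracy'][-1]

print(epoch, loss)    # 3 1.5
print(round(gap, 2))  # 0.38
```

In Keras, the `EarlyStopping` callback (for example with `monitor='val_loss'` and `restore_best_weights=True`) can stop training near that epoch automatically.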


Step 9: Model Deployment

Save the model to deploy it into a production environment.

In [15]:
# Save the model
model.save('emotion_classification_model.h5')
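
The saved model file stores the network but not the names of the emotion classes. One option, sketched below, is to write the class order to a small JSON file next to the model so that deployment code can load it instead of hard-coding the list. The file name `emotion_classes.json` and the hard-coded list are assumptions for illustration; in the notebook the list could come from `label_encoder.classes_.tolist()`.

```python
import json
import os
import tempfile

# Assumed class order (in the notebook: label_encoder.classes_.tolist())
class_names = ['angry', 'disgust', 'fear', 'happy', 'neutral', 'sad', 'surprise']

# Write to a temporary directory here; in practice, save next to the model file
out_dir = tempfile.mkdtemp()
classes_path = os.path.join(out_dir, 'emotion_classes.json')
with open(classes_path, 'w') as f:
    json.dump(class_names, f)

# At deployment time, load the list and index it with the predicted class
with open(classes_path) as f:
    loaded = json.load(f)

print(loaded[3])  # happy
```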



Step 10: Monitoring and Maintenance

Monitoring can be done in production by setting up logging and tracking metrics like accuracy, latency, etc.
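
A lightweight way to start is to wrap the prediction call so that each request logs its latency and the predicted class. The sketch below uses only the standard library; `fake_predict` is a hypothetical stand-in for the real model call, and the logger name is arbitrary.

```python
import logging
import time

logging.basicConfig(level=logging.INFO,
                    format='%(asctime)s %(levelname)s %(message)s')
logger = logging.getLogger('emotion_service')

def fake_predict(image):
    """Hypothetical stand-in for model.predict; returns a fixed class index."""
    time.sleep(0.01)
    return 3

def monitored_predict(predict_fn, image):
    """Call predict_fn, log latency in milliseconds, return (result, latency_ms)."""
    start = time.perf_counter()
    result = predict_fn(image)
    latency_ms = (time.perf_counter() - start) * 1000
    logger.info('predicted_class=%s latency_ms=%.1f', result, latency_ms)
    return result, latency_ms

result, latency_ms = monitored_predict(fake_predict, image=None)
```

Accuracy in production additionally requires ground-truth labels, which usually arrive later, so latency and the distribution of predicted classes are often the first signals available.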

### Deploy the Model

In [None]:
import cv2
import numpy as np
from tensorflow.keras.models import load_model

# Load the trained model
model = load_model('emotion_classification_model.h5')

# List of emotions corresponding to class indices
emotions = ['angry','disgust','fear','happy','neutral','sad','surprise']  # Must match the LabelEncoder order from training (alphabetical)

# Function to process the image and make a prediction
def process_image(image_path):
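    # Load the image
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)  # Load in grayscale
    if img is None:  # cv2.imread returns None when the file cannot be read
        raise FileNotFoundError(f'Could not read image: {image_path}')
    img = cv2.resize(img, (48, 48))  # Resize to match model input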
    img = img.reshape(1, 48, 48, 1)  # Reshape for the model
    img = img.astype('float32') / 255  # Normalize the pixel values
    return img

# Specify the image path directly
my_image = 'sample.PNG'  # Update this path to your image

# Process the image
img = process_image(my_image)

# Make a prediction
prediction = model.predict(img)
predicted_class = np.argmax(prediction)

# Get the emotion from the list based on the predicted class index
emotion = emotions[predicted_class]

print(f'The predicted emotion is: {emotion}')



1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 256ms/step
The predicted emotion is: happy
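
The softmax output contains a probability for every class, so it can be useful to report the confidence alongside the top prediction. A sketch with a made-up probability vector (in the script above, `prediction[0]` would take its place):

```python
import numpy as np

emotions = ['angry', 'disgust', 'fear', 'happy', 'neutral', 'sad', 'surprise']

# Made-up softmax output for one image (values sum to 1)
probs = np.array([0.05, 0.01, 0.04, 0.60, 0.20, 0.07, 0.03])

# Indices of the three most probable classes, highest first
top3 = np.argsort(probs)[::-1][:3]

for i in top3:
    print(f'{emotions[i]}: {probs[i]:.2f}')
# happy: 0.60
# neutral: 0.20
# sad: 0.07
```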
