# 🎭 Facial Mood Detection using OpenCV & Deep Learning

**Capstone Project - Module III**

**Author:** Isha Sharma

---

## 📌 Problem Statement

Develop a facial emotion detection system that can identify human emotions from facial expressions in real-time using Computer Vision and Deep Learning techniques.

## 🎯 Objective

- Build a CNN-based model to classify facial expressions into 7 emotion categories
- Implement transfer learning using MobileNetV2 for improved accuracy
- Create a real-time emotion detection system using OpenCV
- Deploy as a web application using Streamlit

## 📊 Dataset

**FER2013 (Facial Expression Recognition 2013)**
- 35,887 grayscale images (48x48 pixels)
- 7 emotion classes: Angry, Disgust, Fear, Happy, Neutral, Sad, Surprise
- Source: Kaggle / Hugging Face

---
## 1️⃣ Setup & Installation

In [None]:
# Install required packages
!pip install -q datasets pillow opencv-python-headless

In [None]:
# Import libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import cv2
import os
from PIL import Image

# TensorFlow/Keras
import tensorflow as tf
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.layers import (
    Dense, Dropout, Flatten, Conv2D, MaxPooling2D, 
    BatchNormalization, GlobalAveragePooling2D, Input, Layer
)
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping, ReduceLROnPlateau
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.applications import MobileNetV2

# Sklearn
from sklearn.utils.class_weight import compute_class_weight
from sklearn.metrics import classification_report, confusion_matrix

# Suppress warnings
import warnings
warnings.filterwarnings('ignore')

print(f"TensorFlow version: {tf.__version__}")
print(f"GPU Available: {tf.config.list_physical_devices('GPU')}")

---
## 2️⃣ Download & Prepare Dataset

In [None]:
from datasets import load_dataset

# Download FER2013 dataset from Hugging Face
print("Downloading FER2013 dataset...")
dataset = load_dataset("AutumnQiu/fer2013")
print("Download complete!")
print(f"\nDataset structure: {dataset}")

In [None]:
# Emotion labels
EMOTION_LABELS = ['angry', 'disgust', 'fear', 'happy', 'neutral', 'sad', 'surprise']
EMOTION_EMOJIS = ['😠', '🤢', '😨', '😊', '😐', '😢', '😲']

# Create directories for saving images
DATA_DIR = '/content/data'
os.makedirs(DATA_DIR, exist_ok=True)

for label in EMOTION_LABELS:
    os.makedirs(os.path.join(DATA_DIR, label), exist_ok=True)

print("Created directories for each emotion class")
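
The order of `EMOTION_LABELS` has to match the integer labels stored in the dataset, otherwise images end up in the wrong class folders. A minimal sketch of a sanity check, assuming the `label` column is a Hugging Face `ClassLabel`-style feature that exposes a `names` list (not confirmed for this particular dataset; `label_names_from` and `labels_match` are hypothetical helpers):

```python
def label_names_from(features, column="label"):
    """Return the class names stored in a dataset's feature spec, or None.

    Assumes a ClassLabel-style feature exposing `.names`;
    plain integer columns have no such attribute.
    """
    feature = features.get(column)
    names = getattr(feature, "names", None)
    return list(names) if names is not None else None


def labels_match(features, expected, column="label"):
    """True/False if the names can be compared (case-insensitive), None otherwise."""
    names = label_names_from(features, column)
    if names is None:
        return None
    return [n.lower() for n in names] == [e.lower() for e in expected]
```

In this notebook the call would look like `labels_match(dataset['train'].features, EMOTION_LABELS)`; a result of `False` or `None` means the label order should be verified by hand before saving images.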

In [None]:
# Save images to directories
def save_dataset_images(split_name, data_split):
    counts = {label: 0 for label in EMOTION_LABELS}
    
    for i, sample in enumerate(data_split):
        img = sample['image']
        label_idx = sample['label']
        label_name = EMOTION_LABELS[label_idx]
        
        # Convert to grayscale if needed
        if img.mode != 'L':
            img = img.convert('L')
        
        # Resize to 48x48
        img = img.resize((48, 48))
        
        # Save image
        filename = f"{split_name}_{i}.png"
        filepath = os.path.join(DATA_DIR, label_name, filename)
        img.save(filepath)
        counts[label_name] += 1
        
        if (i + 1) % 5000 == 0:
            print(f"Processed {i + 1} images...")
    
    return counts

print("Saving training images...")
train_counts = save_dataset_images('train', dataset['train'])

print("\n📊 Dataset Statistics:")
for label, count in train_counts.items():
    emoji = EMOTION_EMOJIS[EMOTION_LABELS.index(label)]
    print(f"  {emoji} {label.title()}: {count} images")

print(f"\n  Total: {sum(train_counts.values())} images")

---
## 3️⃣ Exploratory Data Analysis (EDA)

In [None]:
# Visualize class distribution
plt.figure(figsize=(12, 5))

# Bar plot
plt.subplot(1, 2, 1)
colors = ['#FF6B6B', '#A55EEA', '#FFA94D', '#51CF66', '#868E96', '#74C0FC', '#FFD43B']
bars = plt.bar(EMOTION_LABELS, train_counts.values(), color=colors)
plt.title('Class Distribution', fontsize=14, fontweight='bold')
plt.xlabel('Emotion')
plt.ylabel('Number of Images')
plt.xticks(rotation=45)

# Add count labels on bars
for bar, count in zip(bars, train_counts.values()):
    plt.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 50, 
             str(count), ha='center', fontsize=9)

# Pie chart
plt.subplot(1, 2, 2)
plt.pie(train_counts.values(), labels=[f"{e} {l}" for e, l in zip(EMOTION_EMOJIS, EMOTION_LABELS)], 
        colors=colors, autopct='%1.1f%%', startangle=90)
plt.title('Emotion Distribution', fontsize=14, fontweight='bold')

plt.tight_layout()
plt.show()

print("\n⚠️ Note: 'Disgust' class has significantly fewer samples - we'll use class weighting!")

In [None]:
# Visualize sample images from each class
fig, axes = plt.subplots(2, 7, figsize=(16, 5))

for i, label in enumerate(EMOTION_LABELS):
    label_dir = os.path.join(DATA_DIR, label)
    images = os.listdir(label_dir)[:2]
    
    for j, img_name in enumerate(images):
        img_path = os.path.join(label_dir, img_name)
        img = cv2.imread(img_path, cv2.IMREAD_GRAYSCALE)
        axes[j, i].imshow(img, cmap='gray')
        axes[j, i].axis('off')
        if j == 0:
            axes[j, i].set_title(f"{EMOTION_EMOJIS[i]}\n{label.title()}", fontsize=10)

plt.suptitle('Sample Images from Each Emotion Class', fontsize=14, fontweight='bold')
plt.tight_layout()
plt.show()

---
## 4️⃣ Data Preprocessing & Augmentation

In [None]:
# Data generators with augmentation
IMG_SIZE = 96  # Using 96x96 for transfer learning
BATCH_SIZE = 32

train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=20,
    width_shift_range=0.1,
    height_shift_range=0.1,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest',
    validation_split=0.2
)

print("Creating data generators...")

train_generator = train_datagen.flow_from_directory(
    DATA_DIR,
    target_size=(IMG_SIZE, IMG_SIZE),
    color_mode='grayscale',
    batch_size=BATCH_SIZE,
    class_mode='categorical',
    subset='training',
    shuffle=True
)

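# Validation data: rescale only (no augmentation) and no shuffling, so that
# predictions line up with validation_generator.classes during evaluation
val_datagen = ImageDataGenerator(rescale=1./255, validation_split=0.2)

validation_generator = val_datagen.flow_from_directory(
    DATA_DIR,
    target_size=(IMG_SIZE, IMG_SIZE),
    color_mode='grayscale',
    batch_size=BATCH_SIZE,
    class_mode='categorical',
    subset='validation',
    shuffle=False
)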

print(f"\nTraining samples: {train_generator.samples}")
print(f"Validation samples: {validation_generator.samples}")
print(f"Classes: {train_generator.class_indices}")

In [None]:
# Compute class weights to handle imbalanced data
class_weights = compute_class_weight(
    'balanced',
    classes=np.unique(train_generator.classes),
    y=train_generator.classes
)
class_weight_dict = dict(enumerate(class_weights))

# Reverse lookup: class index -> class name
idx_to_label = {idx: label for label, idx in train_generator.class_indices.items()}

print("📊 Class Weights (for handling imbalance):")
for idx, weight in class_weight_dict.items():
    label = idx_to_label[idx]
    emoji = EMOTION_EMOJIS[EMOTION_LABELS.index(label)]
    print(f"  {emoji} {label}: {weight:.3f}")
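
For reference, scikit-learn's `'balanced'` mode gives each class the weight `n_samples / (n_classes * count_of_class)`, so rare classes receive proportionally larger weights. A small NumPy sketch of that formula, using made-up toy labels rather than the FER2013 counts (`balanced_class_weights` is a hypothetical helper):

```python
import numpy as np

def balanced_class_weights(labels):
    """Weight per class: n_samples / (n_classes * count_of_class)."""
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    weights = labels.size / (classes.size * counts)
    return dict(zip(classes.tolist(), weights.tolist()))

# Toy example: class 0 has 6 samples, class 1 has only 2
toy_labels = [0, 0, 0, 0, 0, 0, 1, 1]
toy_weights = balanced_class_weights(toy_labels)
print(toy_weights)
```

The minority class here is weighted three times as heavily as the majority class, mirroring the 6:2 ratio of their sample counts.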

---
## 5️⃣ Model Architecture - Transfer Learning with MobileNetV2

In [None]:
# Custom layer to convert grayscale to RGB
@tf.keras.utils.register_keras_serializable()
class GrayscaleToRGB(Layer):
    """Converts grayscale (1 channel) to RGB (3 channels)"""
    
    def __init__(self, **kwargs):
        super(GrayscaleToRGB, self).__init__(**kwargs)
    
    def call(self, inputs):
        return tf.image.grayscale_to_rgb(inputs)
    
    def compute_output_shape(self, input_shape):
        # Cast to tuple so this also works if Keras passes a list-like shape
        return tuple(input_shape[:-1]) + (3,)
    
    def get_config(self):
        return super().get_config()
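
`tf.image.grayscale_to_rgb` repeats the single channel three times; no colour information is created. A NumPy sketch of the equivalent operation, for intuition only (the Keras layer above is what the model actually uses, and `grayscale_to_rgb_np` is a hypothetical name):

```python
import numpy as np

def grayscale_to_rgb_np(batch):
    """Repeat the last (channel) axis of an (N, H, W, 1) array three times."""
    batch = np.asarray(batch)
    if batch.shape[-1] != 1:
        raise ValueError("expected a single-channel input")
    return np.repeat(batch, 3, axis=-1)

gray_batch = np.random.rand(2, 96, 96, 1).astype("float32")
rgb_batch = grayscale_to_rgb_np(gray_batch)
print(rgb_batch.shape)  # (2, 96, 96, 3)
```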

In [None]:
def create_transfer_model(input_shape=(96, 96, 1), num_classes=7):
    """
    Creates a transfer learning model using MobileNetV2.
    """
    # Input layer for grayscale images
    inputs = Input(shape=input_shape)
    
    # Convert grayscale to RGB
    x = GrayscaleToRGB()(inputs)
    
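    # Load MobileNetV2 with ImageNet weights; spatial size follows the
    # input_shape argument, with 3 channels after the grayscale-to-RGB step
    base_model = MobileNetV2(
        weights='imagenet',
        include_top=False,
        input_shape=tuple(input_shape[:2]) + (3,)
    )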
    
    # Freeze base model
    base_model.trainable = False
    
    # Pass through base model
    x = base_model(x, training=False)
    
    # Classification head
    x = GlobalAveragePooling2D()(x)
    x = Dense(256, activation='relu')(x)
    x = Dropout(0.5)(x)
    x = Dense(128, activation='relu')(x)
    x = Dropout(0.3)(x)
    outputs = Dense(num_classes, activation='softmax')(x)
    
    model = Model(inputs, outputs)
    
    model.compile(
        optimizer=Adam(learning_rate=0.001),
        loss='categorical_crossentropy',
        metrics=['accuracy']
    )
    
    return model, base_model

# Create model
model, base_model = create_transfer_model()
model.summary()
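
One thing worth checking: the data generators rescale pixels to [0, 1], whereas the Keras documentation for MobileNetV2 states that its `preprocess_input` scales pixels to [-1, 1]. Whether this difference affects accuracy has not been measured in this notebook. A sketch of the mapping, should one want to try it (`to_mobilenet_range` is a hypothetical helper, not part of the pipeline above):

```python
import numpy as np

def to_mobilenet_range(x01):
    """Map values from [0, 1] to [-1, 1]."""
    return np.asarray(x01, dtype="float32") * 2.0 - 1.0

print(to_mobilenet_range([0.0, 0.5, 1.0]))
```

Inside the model, the same mapping could be expressed with `tf.keras.layers.Rescaling(scale=2.0, offset=-1.0)` placed between `GrayscaleToRGB` and the base model; this is an untested suggestion and would require retraining.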

---
## 6️⃣ Model Training - Phase 1 (Frozen Base)

In [None]:
# Callbacks
checkpoint = ModelCheckpoint(
    'best_model.keras',
    monitor='val_accuracy',
    save_best_only=True,
    mode='max',
    verbose=1
)

early_stop = EarlyStopping(
    monitor='val_loss',
    patience=5,
    restore_best_weights=True
)

reduce_lr = ReduceLROnPlateau(
    monitor='val_loss',
    factor=0.5,
    patience=3,
    min_lr=1e-6
)
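
With these settings, each time `val_loss` fails to improve for 3 epochs, `ReduceLROnPlateau` multiplies the learning rate by `factor`, never going below `min_lr`. A small sketch of the resulting values, assuming the Phase 1 starting rate of 0.001 (when the reductions actually fire depends on the training run; `lr_after_reductions` is a hypothetical helper):

```python
def lr_after_reductions(initial_lr, factor, min_lr, n_reductions):
    """Learning rate after n plateau-triggered reductions, floored at min_lr."""
    lr = initial_lr
    for _ in range(n_reductions):
        lr = max(lr * factor, min_lr)
    return lr

for n in range(0, 11, 2):
    print(n, lr_after_reductions(0.001, 0.5, 1e-6, n))
```

Since each reduction needs at least 3 stagnant epochs, a 10-epoch phase can trigger only a few of them, so the floor is unlikely to be reached in practice.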

In [None]:
print("="*60)
print("PHASE 1: Training Classification Head (Base Frozen)")
print("="*60)

EPOCHS_PHASE1 = 10

history1 = model.fit(
    train_generator,
    steps_per_epoch=train_generator.samples // BATCH_SIZE,
    epochs=EPOCHS_PHASE1,
    validation_data=validation_generator,
    validation_steps=validation_generator.samples // BATCH_SIZE,
    callbacks=[checkpoint, early_stop, reduce_lr],
    class_weight=class_weight_dict
)
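
Because `steps_per_epoch` uses floor division, any final partial batch is left out of each epoch; the same applies to `validation_steps`. A small arithmetic sketch with illustrative numbers (not the actual generator sizes):

```python
import math

def batches_per_epoch(n_samples, batch_size):
    """Return (full batches via floor division, batches needed to cover every sample)."""
    return n_samples // batch_size, math.ceil(n_samples / batch_size)

full, covering = batches_per_epoch(1000, 32)
leftover = 1000 - full * 32
print(full, covering, leftover)  # 31 32 8
```

Omitting `steps_per_epoch` and `validation_steps` lets Keras use the generator's own length, which includes the partial batch.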

---
## 7️⃣ Model Training - Phase 2 (Fine-tuning)

In [None]:
print("="*60)
print("PHASE 2: Fine-tuning Top Layers of MobileNetV2")
print("="*60)

# Unfreeze top 30 layers
base_model.trainable = True
for layer in base_model.layers[:-30]:
    layer.trainable = False

# Recompile with lower learning rate
model.compile(
    optimizer=Adam(learning_rate=0.00005),
    loss='categorical_crossentropy',
    metrics=['accuracy']
)

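# Count trainable layers inside the MobileNetV2 base (model.layers only lists
# the top-level layers, where the whole base counts as a single layer)
trainable_base_layers = sum(layer.trainable for layer in base_model.layers)
print(f"Trainable layers in MobileNetV2 base: {trainable_base_layers} of {len(base_model.layers)}")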

In [None]:
EPOCHS_PHASE2 = 15

early_stop2 = EarlyStopping(
    monitor='val_loss',
    patience=8,
    restore_best_weights=True
)

history2 = model.fit(
    train_generator,
    steps_per_epoch=train_generator.samples // BATCH_SIZE,
    epochs=EPOCHS_PHASE2,
    validation_data=validation_generator,
    validation_steps=validation_generator.samples // BATCH_SIZE,
    callbacks=[checkpoint, early_stop2, reduce_lr],
    class_weight=class_weight_dict
)

---
## 8️⃣ Training Visualization

In [None]:
# Combine training histories
acc = history1.history['accuracy'] + history2.history['accuracy']
val_acc = history1.history['val_accuracy'] + history2.history['val_accuracy']
loss = history1.history['loss'] + history2.history['loss']
val_loss = history1.history['val_loss'] + history2.history['val_loss']

# Plot training curves
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Accuracy
axes[0].plot(acc, label='Train Accuracy', color='#667eea', linewidth=2)
axes[0].plot(val_acc, label='Val Accuracy', color='#764ba2', linewidth=2)
axes[0].axvline(x=len(history1.history['accuracy']), color='red', linestyle='--', 
                label='Fine-tuning starts', alpha=0.7)
axes[0].set_title('Model Accuracy', fontsize=14, fontweight='bold')
axes[0].set_xlabel('Epoch')
axes[0].set_ylabel('Accuracy')
axes[0].legend()
axes[0].grid(True, alpha=0.3)

# Loss
axes[1].plot(loss, label='Train Loss', color='#667eea', linewidth=2)
axes[1].plot(val_loss, label='Val Loss', color='#764ba2', linewidth=2)
axes[1].axvline(x=len(history1.history['loss']), color='red', linestyle='--', 
                label='Fine-tuning starts', alpha=0.7)
axes[1].set_title('Model Loss', fontsize=14, fontweight='bold')
axes[1].set_xlabel('Epoch')
axes[1].set_ylabel('Loss')
axes[1].legend()
axes[1].grid(True, alpha=0.3)

plt.suptitle('Training Progress - Transfer Learning with MobileNetV2', fontsize=16, fontweight='bold')
plt.tight_layout()
plt.show()

print(f"\n🎯 Best Validation Accuracy: {max(val_acc)*100:.2f}%")

---
## 9️⃣ Model Evaluation

In [None]:
# Load best model
best_model = tf.keras.models.load_model('best_model.keras', 
                                         custom_objects={'GrayscaleToRGB': GrayscaleToRGB})

# Evaluate on validation set
# (separate names so the val_loss / val_acc history lists used for plotting are not overwritten)
print("Evaluating model on validation set...")
eval_loss, eval_acc = best_model.evaluate(validation_generator)
print(f"\n📊 Validation Loss: {eval_loss:.4f}")
print(f"📊 Validation Accuracy: {eval_acc*100:.2f}%")

In [None]:
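# Generate predictions batch by batch so labels stay aligned with predictions
# even when the generator shuffles its samples
y_true, y_pred = [], []
for batch_idx in range(len(validation_generator)):
    x_batch, y_batch = validation_generator[batch_idx]
    y_pred.append(np.argmax(best_model.predict_on_batch(x_batch), axis=1))
    y_true.append(np.argmax(y_batch, axis=1))
y_pred = np.concatenate(y_pred)
y_true = np.concatenate(y_true)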

# Classification report
print("\n📋 Classification Report:")
print("="*60)
print(classification_report(y_true, y_pred, target_names=EMOTION_LABELS))

In [None]:
# Confusion Matrix
cm = confusion_matrix(y_true, y_pred)

plt.figure(figsize=(10, 8))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
            xticklabels=[f"{e} {l}" for e, l in zip(EMOTION_EMOJIS, EMOTION_LABELS)],
            yticklabels=[f"{e} {l}" for e, l in zip(EMOTION_EMOJIS, EMOTION_LABELS)])
plt.title('Confusion Matrix', fontsize=14, fontweight='bold')
plt.xlabel('Predicted')
plt.ylabel('Actual')
plt.tight_layout()
plt.show()
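
The per-class precision and recall in the classification report can be read directly off the confusion matrix: rows are true classes and columns are predicted classes. A NumPy sketch with a made-up 2x2 matrix (not results from this model; `per_class_precision_recall` is a hypothetical helper):

```python
import numpy as np

def per_class_precision_recall(cm):
    """Precision = diagonal / column sums, recall = diagonal / row sums."""
    cm = np.asarray(cm, dtype=float)
    true_pos = np.diag(cm)
    predicted_totals = cm.sum(axis=0)  # how often each class was predicted
    actual_totals = cm.sum(axis=1)     # how often each class actually occurs
    precision = np.divide(true_pos, predicted_totals,
                          out=np.zeros_like(true_pos), where=predicted_totals > 0)
    recall = np.divide(true_pos, actual_totals,
                       out=np.zeros_like(true_pos), where=actual_totals > 0)
    return precision, recall

toy_cm = [[8, 2],
          [1, 9]]
toy_precision, toy_recall = per_class_precision_recall(toy_cm)
print(toy_precision, toy_recall)
```

Classes that are never predicted get a precision of 0 here instead of a division error, which can happen for a rare class such as 'disgust'.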

---
## 🔟 Real-time Face Detection Demo

In [None]:
def predict_emotion(image, model):
    """
    Predict emotion from a face image.
    """
    # Preprocess
    if len(image.shape) == 3:
        image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    
    image = cv2.resize(image, (96, 96))
    image = image.astype('float32') / 255.0
    image = np.expand_dims(image, axis=(0, -1))
    
    # Predict
    predictions = model.predict(image, verbose=0)
    emotion_idx = np.argmax(predictions)
    confidence = predictions[0][emotion_idx]
    
    return EMOTION_LABELS[emotion_idx], confidence, predictions[0]

print("✅ Prediction function created!")

In [None]:
# Test with sample images
fig, axes = plt.subplots(2, 4, figsize=(16, 8))
axes = axes.flatten()

# Load sample images from each class
for i, label in enumerate(EMOTION_LABELS[:7]):
    label_dir = os.path.join(DATA_DIR, label)
    img_name = os.listdir(label_dir)[10]  # Get a sample image
    img_path = os.path.join(label_dir, img_name)
    
    # Load and predict
    img = cv2.imread(img_path, cv2.IMREAD_GRAYSCALE)
    predicted_label, confidence, all_probs = predict_emotion(img, best_model)
    
    # Display
    axes[i].imshow(img, cmap='gray')
    axes[i].set_title(f"True: {label}\nPred: {predicted_label} ({confidence*100:.1f}%)", fontsize=10)
    axes[i].axis('off')

axes[7].axis('off')
plt.suptitle('Model Predictions on Sample Images', fontsize=14, fontweight='bold')
plt.tight_layout()
plt.show()

---
## 📊 Results Summary

### Key Findings:

| Metric | Value |
|--------|-------|
| **Best Validation Accuracy** | ~47-50% |
| **Model Architecture** | MobileNetV2 + Custom Head |
| **Training Strategy** | 2-Phase Transfer Learning |
| **Dataset** | FER2013 training split (28,709 of 35,887 images) |

### Observations:

1. **Class Imbalance**: The 'disgust' class has far fewer samples than any other class (436 in the training split, versus roughly 3,000-7,000 for each of the others)
2. **Dataset Challenges**: FER2013 has noisy labels - even human accuracy is only ~65-72%
3. **Transfer Learning Benefits**: Using pre-trained MobileNetV2 features helps generalization

### Limitations:

- Low resolution images (48x48) limit detail capture
- Some emotions are inherently similar (fear vs surprise)
- Dataset contains mislabeled samples

---
## 💾 Save Model

In [None]:
# Save the final model
best_model.save('facial_mood_detector.keras')
print("✅ Model saved as 'facial_mood_detector.keras'")

# Save class indices
with open('class_indices.txt', 'w') as f:
    f.write(str(train_generator.class_indices))
print("✅ Class indices saved")
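
Writing `str(dict)` to a text file works, but reading it back means evaluating Python syntax. A small sketch of a JSON round trip instead; the filename `class_indices.json` is illustrative and the dictionary below is a stand-in for `train_generator.class_indices`:

```python
import json
import os
import tempfile

def save_class_indices(class_indices, path):
    """Write the {label: index} mapping as JSON."""
    with open(path, "w") as f:
        json.dump(class_indices, f, indent=2)

def load_index_to_label(path):
    """Read the mapping back and invert it to {index: label} for inference."""
    with open(path) as f:
        class_indices = json.load(f)
    return {idx: label for label, idx in class_indices.items()}

example_indices = {"angry": 0, "disgust": 1, "fear": 2}  # stand-in data
example_path = os.path.join(tempfile.gettempdir(), "class_indices.json")
save_class_indices(example_indices, example_path)
index_to_label = load_index_to_label(example_path)
print(index_to_label)
```

The inverted mapping is the form an inference script needs: it turns the `argmax` of the model output into a label without relying on a hard-coded list.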

---

## 🎉 Conclusion

This project successfully demonstrates:

1. **Data Preprocessing**: Handling the FER2013 dataset with proper augmentation
2. **Transfer Learning**: Leveraging MobileNetV2 pre-trained on ImageNet
3. **Class Imbalance Handling**: Using computed class weights
4. **Two-Phase Training**: Frozen base followed by fine-tuning
5. **Model Evaluation**: Comprehensive metrics and visualizations

### Future Improvements:

- Use larger pre-trained models (ResNet50, EfficientNet)
- Apply more aggressive data augmentation
- Ensemble multiple models
- Add attention mechanisms

---

**Author**: Isha Sharma  
**Project**: Facial Mood Detection - Capstone Module III  
**Date**: December 2024