# Face Emotion Recognition using EfficientNetB0

### Face Emotion Recognition (EfficientNetB0 & Fine-tuning Strategy)

Dataset: https://huggingface.co/datasets/tukey/human_face_emotions_roboflow/


## Goal
This notebook trains a deep learning model to classify facial emotions from images using transfer learning with EfficientNetB0.

## Dataset
The data comes from the `tukey/human_face_emotions_roboflow` dataset on the Hugging Face Hub. Each face image is labeled with one emotion (e.g., happy, sad, anger).

## Model
The model utilizes the EfficientNetB0 architecture, pre-trained on the ImageNet dataset. Transfer learning is applied by:
1.  Replacing the original classification head with a new one suitable for the number of emotion classes in this dataset.
2.  Initially training only the new head while keeping the EfficientNetB0 base frozen.
3.  Fine-tuning the model by unfreezing the top layers of the EfficientNetB0 base and training those layers together with the new head at a very low learning rate, while the lower layers stay frozen.

## Workflow
1.  **Load Data:** Fetches the Parquet dataset file directly from the Hugging Face Hub.
2.  **Clean Data:** Extracts emotion labels from the structured 'qa' column, converts labels to lowercase, and handles potential missing data.
3.  **Split Data:** Divides the dataset into training (60%), validation (20%), and test (20%) sets using stratified splitting to maintain class distribution.
4.  **Preprocess:**
    * Decodes image bytes into NumPy arrays (uint8 format).
    * Resizes images to the target size (224x224 for EfficientNetB0).
    * Handles potential image decoding errors and filters corresponding labels.
    * Encodes the string emotion labels into numerical format using `LabelEncoder`.
5.  **Model Building:**
    * Defines data augmentation layers (e.g., flips, rotations) to improve model generalization.
    * Loads the pre-trained EfficientNetB0 base model (weights from ImageNet) without its top classification layer.
    * Adds a global average pooling layer, a dropout layer, and a final dense layer (with softmax activation) for classification.
    * Applies EfficientNet-specific preprocessing within the model.
6.  **Training:**
    * **Initial Phase:** Trains only the newly added classification head with the base model frozen. Uses `Adam` optimizer and `sparse_categorical_crossentropy` loss. Includes `EarlyStopping` based on validation loss.
    * **Fine-tuning Phase:** Unfreezes the top ~30 layers of the EfficientNetB0 base model. Recompiles the model with a much lower learning rate and continues training. Also uses `EarlyStopping`.
7.  **Evaluation:**
    * Evaluates the final model's performance on the unseen test set (reporting loss and accuracy).
    * Plots training and validation accuracy/loss curves over epochs.
    * Generates a classification report (precision, recall, F1-score per class).
    * Displays a confusion matrix heatmap.
8.  **Save Model:** Saves the trained model to a `.keras` file.
9.  **Qualitative Analysis:** Shows a few sample images from the test set with their true and predicted labels.

## Requirements
-   `pandas`
-   `numpy`
-   `tensorflow` (2.x, recent enough to save models in the `.keras` format)
-   `scikit-learn`
-   `matplotlib`
-   `seaborn`
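-   `Pillow` (PIL), version 9.1 or newer for `Image.Resampling.LANCZOS`
-   `huggingface_hub` (or `datasets`) for loading data from the hub (`pip install huggingface_hub`)
-   `pyarrow` (or `fastparquet`) as the Parquet engine used by `pd.read_parquet`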

## Usage
1.  Ensure all required libraries are installed.
2.  Run the notebook cells sequentially.
3.  **GPU Recommended:** Training deep learning models, especially fine-tuning, is computationally intensive. Running this notebook on a machine with a GPU (e.g., Google Colab, Kaggle, or a local setup) is highly recommended for reasonable training times.
4.  Monitor the output, especially the training logs and evaluation plots, to understand model performance and convergence.

## Key Parameters (Adjustable)
-   `DATASET_PATH`: Path to the Parquet file on Hugging Face Hub.
-   `TARGET_SIZE`: Input image dimensions (default 224x224).
-   `BATCH_SIZE`: Number of samples per training step (default 32). Adjust based on GPU memory.
-   `INITIAL_EPOCHS`: Maximum epochs for initial head training (set to 50 in the training cell).
-   `FINE_TUNE_EPOCHS`: Maximum additional epochs for fine-tuning (set to 35 in the fine-tuning cell).
-   `fine_tune_layers_count`: Number of top layers to unfreeze in the base model during fine-tuning (default 30).
-   `initial_learning_rate`: Learning rate for head training (default 0.001).
-   `fine_tune_learning_rate`: Learning rate for fine-tuning (default 1e-5).
-   `EarlyStopping` `patience`: Number of epochs to wait for improvement before stopping (default 5).
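
One way to keep these settings in one place is a small dataclass. This is only a sketch: `TrainConfig` and its field names are hypothetical and do not appear in the notebook, and the default values mirror those assigned in the notebook's code cells.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container for the notebook's tunable settings."""
    dataset_path: str = (
        "hf://datasets/tukey/human_face_emotions_roboflow/"
        "data/train-00000-of-00001.parquet"
    )
    target_size: Tuple[int, int] = (224, 224)
    batch_size: int = 32
    initial_epochs: int = 50          # value assigned in the initial-training cell
    fine_tune_epochs: int = 35        # value assigned in the fine-tuning cell
    fine_tune_layers_count: int = 30
    initial_learning_rate: float = 1e-3
    fine_tune_learning_rate: float = 1e-5
    early_stopping_patience: int = 5

    @property
    def img_shape(self) -> Tuple[int, int, int]:
        # Height, width, and three RGB channels, as used for the model input.
        return self.target_size + (3,)


cfg = TrainConfig()
print(cfg.img_shape, cfg.batch_size)
```

Because the dataclass is frozen, a variant such as `TrainConfig(batch_size=16)` creates a new object instead of mutating shared state between cells.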

In [None]:
import pandas as pd
import io
from PIL import Image
import json
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import EfficientNetB2
from tensorflow.keras.applications.efficientnet import preprocess_input as efficientnet_preprocess_input
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import confusion_matrix, classification_report

# Optional: Configure GPU memory growth if needed
# gpus = tf.config.experimental.list_physical_devices('GPU')
# if gpus:
#   try:
#     for gpu in gpus:
#       tf.config.experimental.set_memory_growth(gpu, True)
#     logical_gpus = tf.config.experimental.list_logical_devices('GPU')
#     print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
#   except RuntimeError as e:
#     print(e)

print("TensorFlow Version:", tf.__version__)


## 1. Load Raw Data

In [None]:
# Load Data from Hugging Face Hub
DATASET_PATH = "hf://datasets/tukey/human_face_emotions_roboflow/data/train-00000-of-00001.parquet"
try:
    df = pd.read_parquet(DATASET_PATH)
    print(f"Data loaded successfully from {DATASET_PATH}.")
    print(f"Initial shape: {df.shape}")
except Exception as e:
    print(f"Error loading data: {e}")
    # Handle error, e.g., exit or try loading locally
    df = pd.DataFrame() # Assign empty dataframe to avoid further errors if loading fails

## 2. Data Overview & Cleaning

In [None]:
# Clean Data and Extract Labels
if not df.empty:
    # Standardize column names
    df.columns = [col.strip().lower().replace(' ', '_') for col in df.columns]

    # Function to extract emotion label
    def extract_emotion(qa_entry):
        try:
            if isinstance(qa_entry, str):
                qa_entry = qa_entry.strip()
                qa_data = json.loads(qa_entry)
            else:
                qa_data = qa_entry

            if isinstance(qa_data, np.ndarray):
                qa_data = qa_data.tolist()

            if isinstance(qa_data, (list, tuple)) and len(qa_data) > 0:
                answer = qa_data[0].get("answer")
                if isinstance(answer, str):
                    return answer.lower().strip() # Standardize labels
                else:
                    # print(f"Unexpected answer type: {answer}, type: {type(answer)} in entry: {qa_entry}")
                    return None
            else:
                # print(f"Unexpected qa_data structure: {qa_data}, type: {type(qa_data)}")
                return None
        except Exception as e:
            # print(f"Error parsing qa entry: {qa_entry}, \nError: {e}")
            return None

    # Apply the function
    df["emotion"] = df["qa"].apply(extract_emotion)

    # Check and handle missing values
    print("\nMissing values BEFORE handling:")
    print(df.isna().sum())
    initial_rows = len(df)
    df.dropna(subset=['image', 'emotion'], inplace=True) # Drop if image or extracted emotion is missing
    print(f"Dropped {initial_rows - len(df)} rows due to missing image or emotion labels.")
    print("\nMissing values AFTER handling:")
    print(df.isna().sum())


    # Drop the original 'qa' column
    if 'qa' in df.columns:
        df.drop(columns=["qa"], inplace=True)

    # Display info and head
    print("\nCleaned Dataframe Info:")
    df.info()
    print("\nCleaned Dataframe Head:")
    print(df.head())

    # Check unique values and distribution
    print("\nUnique emotion labels:")
    unique_labels = df['emotion'].unique()
    print(unique_labels)
    num_classes = len(unique_labels)
    print(f"\nNumber of unique labels: {num_classes}")

    print("\nDistribution of emotion labels:")
    print(df['emotion'].value_counts())
else:
    print("DataFrame is empty, skipping cleaning steps.")
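
The extraction logic can be tried on a few hand-written entries. The sketch below is a condensed, NumPy-free variant with the hypothetical name `extract_emotion_demo`. The sample records are invented, and the list-of-dicts shape with an `"answer"` key is an assumption inferred from what `extract_emotion` handles, not a documented schema.

```python
import json
from typing import Any, Optional


def extract_emotion_demo(qa_entry: Any) -> Optional[str]:
    """Condensed variant of the notebook's extract_emotion (illustrative only)."""
    try:
        qa_data = json.loads(qa_entry.strip()) if isinstance(qa_entry, str) else qa_entry
        if isinstance(qa_data, (list, tuple)) and qa_data:
            answer = qa_data[0].get("answer")
            if isinstance(answer, str):
                return answer.lower().strip()  # same label standardization as above
    except (ValueError, AttributeError):
        # Malformed JSON, or a first element that is not a dict.
        pass
    return None


# Invented sample entries (assumed shape: a list of question/answer dicts).
samples = [
    '[{"question": "What emotion is shown?", "answer": " Happy "}]',
    [{"question": "What emotion is shown?", "answer": "SAD"}],
    "not valid json",
    [],
]
print([extract_emotion_demo(s) for s in samples])
```

Entries that cannot be parsed come back as `None`, which is what lets the `dropna` call remove them afterwards.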

## 3. Data Splitting

In [None]:
if not df.empty:
    X = df['image']
    y = df['emotion']

    # Stratified split into Train (60%), Validation (20%), Test (20%)
    X_train, X_temp, y_train, y_temp = train_test_split(
        X, y,
        test_size=0.4,
        random_state=42,
        stratify=y
    )

    X_val, X_test, y_val, y_test = train_test_split(
        X_temp, y_temp,
        test_size=0.5,
        random_state=42,
        stratify=y_temp
    )

    print(f"\nData Split:")
    print(f"Training set size: {len(X_train)}")
    print(f"Validation set size: {len(X_val)}")
    print(f"Test set size: {len(X_test)}")
else:
    print("DataFrame is empty, skipping data splitting.")
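
The two `test_size` values combine as follows: holding out 40% and then halving that hold-out gives a 60/20/20 split. A small arithmetic sketch, assuming the held-out count is rounded up at each stage (which is my understanding of how scikit-learn treats a float `test_size`); `two_stage_split_sizes` is a hypothetical helper, not part of the notebook.

```python
import math
from typing import Tuple


def two_stage_split_sizes(
    n: int, holdout: float = 0.4, test_share: float = 0.5
) -> Tuple[int, int, int]:
    """Return (train, val, test) counts for the two-stage split."""
    n_temp = math.ceil(n * holdout)           # first split: hold out 40% of the rows
    n_test = math.ceil(n_temp * test_share)   # second split: half of the hold-out
    return n - n_temp, n_temp - n_test, n_test


print(two_stage_split_sizes(1000))
print(two_stage_split_sizes(101))
```

For sizes that do not divide evenly the three parts differ by a sample or so, which is why the printed set sizes may not be exact multiples of 20%.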

## 4. Image Preprocessing & Label Encoding

In [None]:
# Image Decoding Function
def decode_images(image_series, target_size=(224, 224)):
    """
    Decodes image bytes from a pandas Series into NumPy arrays.
    Handles potential errors during decoding.
    Returns decoded images (uint8) and indices of valid images.
    """
    decoded_list = []
    valid_indices = []
    original_indices = image_series.index

    print(f"Attempting to decode {len(image_series)} images...")
    processed_count = 0
    error_count = 0

    for i, item in enumerate(image_series):
        try:
            img_bytes = item['bytes']
            with Image.open(io.BytesIO(img_bytes)) as img:
                img = img.convert('RGB') # Ensure 3 channels
                img = img.resize(target_size, Image.Resampling.LANCZOS) # Use LANCZOS for quality
                # Keep as uint8 for EfficientNet preprocessing
                arr = np.array(img, dtype=np.uint8)
            decoded_list.append(arr)
            valid_indices.append(original_indices[i])
            processed_count += 1
        except Exception as e:
            # print(f"Error decoding image at index {original_indices[i]}: {e}. Skipping.")
            error_count += 1
            continue
        # Optional: Print progress periodically
        # if (i + 1) % 500 == 0:
        #     print(f"  Processed {i+1}/{len(image_series)} images...")

    print(f"Successfully decoded: {processed_count}, Errors: {error_count}")

    if not decoded_list:
        return np.array([]), []

    return np.stack(decoded_list, axis=0), valid_indices

In [None]:
# Apply Decoding and Filter Labels
if 'X_train' in locals(): # Check if splitting was done
    TARGET_SIZE = (224, 224) # Default for EfficientNetB0
    print("\nDecoding Training Images...")
    X_train_array, train_valid_indices = decode_images(X_train, target_size=TARGET_SIZE)
    print("\nDecoding Validation Images...")
    X_val_array, val_valid_indices = decode_images(X_val, target_size=TARGET_SIZE)
    print("\nDecoding Test Images...")
    X_test_array, test_valid_indices = decode_images(X_test, target_size=TARGET_SIZE)

    print(f"\nX_train_array shape: {X_train_array.shape if X_train_array.size > 0 else 'Empty'}")
    print(f"X_val_array shape: {X_val_array.shape if X_val_array.size > 0 else 'Empty'}")
    print(f"X_test_array shape: {X_test_array.shape if X_test_array.size > 0 else 'Empty'}")

    # Filter labels corresponding to successfully decoded images
    y_train = y_train.loc[train_valid_indices]
    y_val = y_val.loc[val_valid_indices]
    y_test = y_test.loc[test_valid_indices]

    print(f"\nFiltered label counts:")
    print(f"y_train: {len(y_train)}")
    print(f"y_val: {len(y_val)}")
    print(f"y_test: {len(y_test)}")

    # Check if any set became empty after filtering bad images
    if X_train_array.size == 0 or X_val_array.size == 0 or X_test_array.size == 0:
        raise ValueError("One or more data splits empty after image decoding. Check data source/errors.")
else:
    print("Skipping image decoding as data splits not found.")
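
Keeping the decoded images as `uint8` also matters for memory, since all three splits are held in RAM as dense arrays. The sketch below estimates the array size; `array_mebibytes` is a hypothetical helper and the image count is illustrative, not taken from the dataset.

```python
from typing import Tuple


def array_mebibytes(
    n_images: int,
    target_size: Tuple[int, int] = (224, 224),
    channels: int = 3,
    bytes_per_value: int = 1,
) -> float:
    """Size in MiB of an (n, H, W, C) array with the given bytes per value."""
    height, width = target_size
    return n_images * height * width * channels * bytes_per_value / (1024 ** 2)


n = 5000  # illustrative image count, not taken from the dataset
print(f"uint8:   {array_mebibytes(n):.0f} MiB")
print(f"float32: {array_mebibytes(n, bytes_per_value=4):.0f} MiB")
```

A `float32` copy of the same data is four times larger, so converting only inside the model keeps the host-side footprint down.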

In [None]:
# Encode Labels
if 'y_train' in locals():
    label_encoder = LabelEncoder()
    y_train_encoded = label_encoder.fit_transform(y_train)
    y_val_encoded   = label_encoder.transform(y_val)
    y_test_encoded  = label_encoder.transform(y_test)

    # Recalculate num_classes based on fitted encoder
    num_classes = len(label_encoder.classes_)
    print("\nLabel classes found:", label_encoder.classes_)
    print("Number of classes:", num_classes)
    print("Sample of encoded labels (train):", y_train_encoded[:10])
else:
    print("Skipping label encoding.")
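
For intuition, the mapping that `LabelEncoder` builds amounts to the sorted list of unique labels, with each label replaced by its position in that list. A dependency-free sketch of the same idea, using invented labels and hypothetical helper names:

```python
from typing import Dict, List, Sequence


def fit_label_mapping(labels: Sequence[str]) -> List[str]:
    """Sorted unique classes, in the order LabelEncoder.classes_ would hold them."""
    return sorted(set(labels))


def encode(labels: Sequence[str], classes: List[str]) -> List[int]:
    index: Dict[str, int] = {name: i for i, name in enumerate(classes)}
    # An unseen label raises KeyError here (LabelEncoder raises ValueError instead).
    return [index[label] for label in labels]


def decode(ids: Sequence[int], classes: List[str]) -> List[str]:
    return [classes[i] for i in ids]


train_labels = ["sad", "happy", "anger", "happy"]  # invented sample labels
classes = fit_label_mapping(train_labels)
ids = encode(train_labels, classes)
print(classes, ids, decode(ids, classes))
```

Fitting on the training labels only, as the cell above does, means a label that appears solely in the validation or test split raises an error rather than silently receiving a new id.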

## 5. Model Building (EfficientNetB0 with Transfer Learning)

In [None]:
# Define Data Augmentation and Load Base Model
IMG_SHAPE = TARGET_SIZE + (3,)

# Data Augmentation Layer
data_augmentation = tf.keras.Sequential([
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
    layers.RandomContrast(0.1),
], name="data_augmentation")

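# Load EfficientNetB0 base model
try:
    # Referenced through tf.keras.applications so the architecture matches
    # the EfficientNetB0 named throughout this notebook.
    base_model = tf.keras.applications.EfficientNetB0(
        input_shape=IMG_SHAPE,
        include_top=False,
        weights='imagenet'
    )
    # Freeze the base model initially
    base_model.trainable = False
    print("EfficientNetB0 base model loaded and frozen.")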
except Exception as e:
    print(f"Error loading base model: {e}")
    base_model = None # Set to None to prevent errors later

In [None]:
# Build Full Model
if base_model is not None and 'num_classes' in locals():
    inputs = tf.keras.Input(shape=IMG_SHAPE)
    x = data_augmentation(inputs)
    # EfficientNet preprocessing (the model expects raw pixel values in the 0-255 range)
    x = efficientnet_preprocess_input(x)
    x = base_model(x, training=False) # Keeps BatchNorm layers in inference mode, also during fine-tuning
    x = layers.GlobalAveragePooling2D(name="global_avg_pool")(x)
    x = layers.Dropout(0.2, name="top_dropout")(x) # Regularization dropout
    outputs = layers.Dense(num_classes, activation='softmax', name="output_dense")(x)

    model = tf.keras.Model(inputs, outputs)

    # Compile the model for initial training
    initial_learning_rate = 0.001
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=initial_learning_rate),
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy']
    )
    print("\nModel built successfully.")
    model.summary()
else:
    print("Skipping model building due to previous errors.")
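
What the new head computes on top of the base model's feature map can be written out in NumPy. This is a sketch under stated assumptions: the 7×7×1280 feature-map shape assumes EfficientNetB0 with a 224×224 input, the class count of 7 is illustrative, the weights are random rather than trained, and dropout is left out because it is inactive at inference time.

```python
import numpy as np

rng = np.random.default_rng(0)
num_classes = 7                                   # illustrative; the notebook derives it from the data
features = rng.normal(size=(2, 7, 7, 1280))       # stand-in for the base model's output

# GlobalAveragePooling2D: average over the two spatial axes -> (batch, channels)
pooled = features.mean(axis=(1, 2))

# Dense layer with randomly initialized (untrained) weights -> (batch, num_classes)
weights = rng.normal(scale=0.01, size=(1280, num_classes))
bias = np.zeros(num_classes)
logits = pooled @ weights + bias

# Softmax, shifted by the row maximum for numerical stability
exp = np.exp(logits - logits.max(axis=1, keepdims=True))
probs = exp / exp.sum(axis=1, keepdims=True)

# Sparse categorical cross-entropy: mean negative log-probability of the true class index
y_true = np.array([3, 0])
loss = -np.log(probs[np.arange(len(y_true)), y_true]).mean()
print(probs.shape, float(loss))
```

The loss takes integer class indices directly, which is why the labels are integer-encoded rather than one-hot encoded.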

## 6. Initial Training (Train the Head)

In [None]:
# Initial Model Training
if 'model' in locals() and X_train_array.size > 0:
    INITIAL_EPOCHS = 50 # Adjust as needed
    BATCH_SIZE = 32     # Adjust based on GPU memory

    # EarlyStopping Callback
    early_stopping = tf.keras.callbacks.EarlyStopping(
        monitor='val_loss',
        patience=5, # Stop if val_loss doesn't improve for 5 epochs
        restore_best_weights=True,
        verbose=1
    )

    print(f"\nStarting initial training for up to {INITIAL_EPOCHS} epochs...")
    history = model.fit(
        X_train_array, y_train_encoded,
        epochs=INITIAL_EPOCHS,
        validation_data=(X_val_array, y_val_encoded),
        batch_size=BATCH_SIZE,
        callbacks=[early_stopping]
    )

    # Evaluate after initial training
    loss0, accuracy0 = model.evaluate(X_val_array, y_val_encoded, verbose=0)
    print(f"\nInitial training complete (or stopped early).")
    print(f"Initial training - Validation Loss: {loss0:.4f}")
    print(f"Initial training - Validation Accuracy: {accuracy0:.4f}")
else:
    print("Skipping initial training due to previous errors or empty data.")
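
The effect of `patience` can be illustrated with a simplified simulation. This is a sketch of patience-based stopping on a metric to minimize, not the Keras implementation: it ignores options such as `min_delta`, and both the function name and the loss values are made up for illustration.

```python
from typing import List, Tuple


def simulate_early_stopping(val_losses: List[float], patience: int = 5) -> Tuple[int, int]:
    """Return (last_epoch_run, best_epoch), both zero-based."""
    best_loss = float("inf")
    best_epoch = 0
    wait = 0  # consecutive epochs without a new best validation loss
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch


# Invented validation losses: the best value is at epoch 2, then five epochs without improvement.
losses = [1.0, 0.9, 0.8, 0.85, 0.86, 0.87, 0.88, 0.89, 0.7]
print(simulate_early_stopping(losses, patience=5))
```

In this made-up run training stops before the final, lower loss is ever reached, which is the trade-off a small `patience` makes. With `restore_best_weights=True` the model then returns to the weights from the best epoch.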

## 7. Fine-tuning (Adjusted Strategy)

In [None]:
# Prepare for Fine-tuning (Unfreeze Layers)
if 'model' in locals() and 'history' in locals():
    # Unfreeze the base model first
    base_model.trainable = True

    # Determine how many layers are in the base model
    total_layers = len(base_model.layers)
    print(f"\nTotal layers in base model: {total_layers}")

    # Decide layers to fine-tune (e.g., unfreeze top 30)
    fine_tune_layers_count = 30
    fine_tune_from_layer_index = total_layers - fine_tune_layers_count
    print(f"Fine-tuning the top {fine_tune_layers_count} layers (from index {fine_tune_from_layer_index} onwards).")

    # Freeze all layers before the `fine_tune_from_layer_index`
    for layer in base_model.layers[:fine_tune_from_layer_index]:
        layer.trainable = False

    # Re-compile with a very low learning rate
    fine_tune_learning_rate = 1e-5
    model.compile(
        loss='sparse_categorical_crossentropy',
        optimizer=tf.keras.optimizers.Adam(learning_rate=fine_tune_learning_rate),
        metrics=['accuracy']
    )

    print("\nModel Summary after setting fine-tuning layers:")
    model.summary()
else:
    print("\nSkipping fine-tuning setup.")
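
The index arithmetic for choosing which layers to unfreeze can be checked in isolation. The sketch below uses plain booleans instead of Keras layers, and `trainable_flags` is a hypothetical helper. It also clamps the count, an addition the cell above does not make, so that asking for more layers than exist cannot produce a negative slice index.

```python
from typing import List


def trainable_flags(total_layers: int, fine_tune_layers_count: int) -> List[bool]:
    """Per-layer trainable flags: only the last `fine_tune_layers_count` are True."""
    # Clamp so that a count larger than the layer total unfreezes everything
    # instead of yielding a negative index.
    count = max(0, min(fine_tune_layers_count, total_layers))
    first_trainable = total_layers - count
    return [i >= first_trainable for i in range(total_layers)]


flags = trainable_flags(total_layers=10, fine_tune_layers_count=3)
print(flags.index(True), sum(flags))
```

With the notebook's setting of 30 and a base model of several hundred layers the clamp never triggers, so it only matters if the count is raised or a much smaller base model is swapped in.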

In [None]:
# Run Fine-tuning Training
if 'model' in locals() and 'history' in locals() and base_model.trainable: # Check if fine-tuning was set up
    FINE_TUNE_EPOCHS = 35 # Adjust as needed
    start_epoch_ft = history.epoch[-1] + 1 if history.epoch else 0
    total_epochs_target = start_epoch_ft + FINE_TUNE_EPOCHS

    print(f"\nStarting fine-tuning from epoch {start_epoch_ft} up to {total_epochs_target-1}...")

    # Use a new EarlyStopping instance
    early_stopping_ft = tf.keras.callbacks.EarlyStopping(
        monitor='val_loss',
        patience=5,
        restore_best_weights=True,
        verbose=1
    )

    # Check input data type again before fitting
    print(f"Fine-tuning input data type: {X_train_array.dtype}") # Should be uint8

    history_fine = model.fit(
        X_train_array, y_train_encoded,
        epochs=total_epochs_target,
        initial_epoch=start_epoch_ft,
        validation_data=(X_val_array, y_val_encoded),
        batch_size=BATCH_SIZE, # Consider reducing batch size if memory issues arise
        callbacks=[early_stopping_ft]
    )
    print("\nFine-tuning complete (or stopped early).")
else:
    print("\nSkipping fine-tuning training.")

## 8. Evaluation & Analysis

In [None]:
# Final Evaluation on Test Set
if 'model' in locals() and X_test_array.size > 0:
    print("\nEvaluating final model on Test Set...")
    test_loss, test_acc = model.evaluate(X_test_array, y_test_encoded)
    print(f"\nFinal Test Loss: {test_loss:.4f}")
    print(f"Final Test Accuracy: {test_acc:.4f}")
else:
    print("\nSkipping final evaluation.")

In [None]:
# Plot Training History
if 'history' in locals():
    # Combine histories on copies, so re-running this cell does not
    # append the fine-tuning values to history.history a second time
    acc = list(history.history['accuracy'])
    val_acc = list(history.history['val_accuracy'])
    loss = list(history.history['loss'])
    val_loss = list(history.history['val_loss'])
    epochs_initial = len(acc)

    fine_tune_epochs_run = 0
    if 'history_fine' in locals() and history_fine.history:
        acc += history_fine.history.get('accuracy', [])
        val_acc += history_fine.history.get('val_accuracy', [])
        loss += history_fine.history.get('loss', [])
        val_loss += history_fine.history.get('val_loss', [])
        fine_tune_epochs_run = len(history_fine.history.get('loss', []))

    epochs_range = range(epochs_initial + fine_tune_epochs_run)

    plt.figure(figsize=(12, 5))

    plt.subplot(1, 2, 1)
    plt.plot(epochs_range, acc, label='Training Accuracy')
    plt.plot(epochs_range, val_acc, label='Validation Accuracy')
    plt.axvline(x=epochs_initial -1 , color='r', linestyle='--', label='Start Fine-tuning') # Mark fine-tuning start
    plt.legend(loc='lower right')
    plt.title('Training and Validation Accuracy')
    plt.xlabel('Epoch')
    plt.ylabel('Accuracy')

    plt.subplot(1, 2, 2)
    plt.plot(epochs_range, loss, label='Training Loss')
    plt.plot(epochs_range, val_loss, label='Validation Loss')
    plt.axvline(x=epochs_initial -1, color='r', linestyle='--', label='Start Fine-tuning') # Mark fine-tuning start
    plt.legend(loc='upper right')
    plt.title('Training and Validation Loss')
    plt.xlabel('Epoch')
    plt.ylabel('Loss')
    plt.tight_layout()
    plt.show()
else:
    print("\nSkipping history plotting as training did not run.")

In [None]:
# Classification Report and Confusion Matrix
if 'model' in locals() and X_test_array.size > 0 and 'y_test_encoded' in locals():
    print("\nGenerating predictions on test set...")
    y_pred_probs = model.predict(X_test_array)
    y_pred = np.argmax(y_pred_probs, axis=1)

    print("\nClassification Report:")
    print(classification_report(y_test_encoded, y_pred, target_names=label_encoder.classes_))

    # Confusion Matrix
    cm = confusion_matrix(y_test_encoded, y_pred)
    plt.figure(figsize=(10, 8))
    sns.heatmap(cm, annot=True, fmt="d", cmap="Blues",
                xticklabels=label_encoder.classes_,
                yticklabels=label_encoder.classes_)
    plt.xlabel("Predicted Label")
    plt.ylabel("True Label")
    plt.title("Confusion Matrix (EfficientNetB0)")
    plt.xticks(rotation=45, ha='right')
    plt.yticks(rotation=0)
    plt.tight_layout()
    plt.show()
else:
    print("\nSkipping Classification Report and Confusion Matrix generation.")
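
The numbers in the classification report follow directly from the confusion matrix. A NumPy sketch with invented labels; the helper names are hypothetical, and the layout (rows are true classes, columns are predicted classes) matches the heatmap axes above.

```python
import numpy as np


def confusion_counts(y_true, y_pred, num_classes: int) -> np.ndarray:
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm


def per_class_metrics(cm: np.ndarray):
    """Return (precision, recall, f1) arrays; undefined ratios are reported as 0."""
    tp = np.diag(cm).astype(float)
    predicted = cm.sum(axis=0)   # column sums: how often each class was predicted
    actual = cm.sum(axis=1)      # row sums: how often each class truly occurs
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom, out=np.zeros_like(tp), where=denom > 0)
    return precision, recall, f1


y_true_demo = [0, 0, 1, 1, 2, 2]   # invented labels for illustration
y_pred_demo = [0, 1, 1, 1, 2, 0]
cm_demo = confusion_counts(y_true_demo, y_pred_demo, num_classes=3)
precision, recall, f1 = per_class_metrics(cm_demo)
print(cm_demo)
print(precision.round(2), recall.round(2), f1.round(2))
```

Reading along a row shows where a true class gets misclassified (recall), and reading down a column shows which classes are mistaken for the predicted one (precision).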

## 9. Save the Final Model

In [None]:
# Save the Trained Model
if 'model' in locals():
    model_save_path = "face_emotion_efficientnetb0_final.keras"
    try:
        model.save(model_save_path)
        print(f"Model saved successfully to {model_save_path}")
    except Exception as e:
        print(f"Error saving model: {e}")
else:
    print("\nSkipping model saving.")

## 10. Qualitative Analysis (Sample Predictions)

In [None]:
# Display Sample Predictions
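# y_pred is produced by the classification report cell, so check that it exists
if 'model' in locals() and X_test_array.size > 0 and 'y_test_encoded' in locals() and 'y_pred' in locals():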
    num_samples = 5
    if len(X_test_array) >= num_samples:
        indices = np.random.choice(np.arange(len(X_test_array)), num_samples, replace=False)

        plt.figure(figsize=(15, 5))
        for i, idx in enumerate(indices):
            plt.subplot(1, num_samples, i+1)
            # EfficientNet input was uint8 [0, 255]
            img_display = X_test_array[idx]
            plt.imshow(img_display)
            true_label = label_encoder.classes_[y_test_encoded[idx]]
            pred_label = label_encoder.classes_[y_pred[idx]]
            plt.title(f"True: {true_label}\nPred: {pred_label}")
            plt.axis("off")
        plt.tight_layout()
        plt.show()
    else:
        print("\nNot enough test samples to display.")
else:
    print("\nSkipping qualitative analysis.")

In [None]:
#model.summary()
#effnet_model = model.get_layer('efficientnetb2')
#effnet_model.summary()
#model.summary(expand_nested=True)