# Facial Emotion Recognition Model Training on Google Colab

This notebook trains a CNN model to detect facial emotions from images.

## Quick Start:
1. Enable GPU: `Runtime` → `Change runtime type` → `GPU` → `Save`
2. Upload your dataset (see Step 1 or Step 2)
3. Run all cells in order
4. Download the trained model when finished



## 📊 Understanding Training Progress

When training starts, you'll see:
- **Progress bars** filling up: `[=====>...]`
- **Epoch numbers** incrementing: Epoch 1/50 → 2/50 → 3/50...
- **Metrics updating**: loss decreases, accuracy increases
- **Time per step**: shows training speed

**Signs training is working:**
- ✅ Progress bars moving
- ✅ Loss decreasing (starts ~1.8, goes down)
- ✅ Accuracy increasing (starts ~0.25-0.30, goes up)
- ✅ Epoch number incrementing

Check the cell output below for real-time updates!
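
As a rough sanity check on the starting values quoted above: a classifier that gives every one of the 7 emotion classes the same probability has a cross-entropy loss of ln(7), so an initial loss in that neighborhood is what one would expect before the model has learned anything. The sketch below assumes exactly 7 classes and a perfectly uniform prediction; a real untrained network is only approximately uniform.

```python
import math

num_classes = 7  # assumption: the seven emotion classes used in this notebook

# Cross-entropy of a uniform prediction: -log(1 / num_classes)
baseline_loss = -math.log(1.0 / num_classes)
# Accuracy of guessing one class uniformly at random
baseline_accuracy = 1.0 / num_classes

print(f"Baseline loss: {baseline_loss:.3f}")
print(f"Baseline accuracy: {baseline_accuracy:.3f}")
```

Starting accuracy can sit above 1/7 when the classes are imbalanced, since always predicting the most common class already beats uniform guessing.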


In [None]:
# Quick check: Verify GPU is available (optional)
import tensorflow as tf

print("=" * 60)
print("SYSTEM CHECK")
print("=" * 60)
print(f"TensorFlow version: {tf.__version__}")
# Query the GPU list once and reuse it
gpus = tf.config.list_physical_devices('GPU')
print(f"GPU Available: {bool(gpus)}")
if gpus:
    print("✅ GPU is ready! Training will be fast.")
    print(f"GPU Device: {gpus[0]}")
else:
    print("⚠️  No GPU detected. Training will be slower (2-4 hours vs 20-60 min)")
print("=" * 60)


In [None]:
# Install required packages
!pip install opencv-python pillow -q
print("✅ Packages installed!")


## 📁 Quick Setup: Extract Dataset (If Already Uploaded via Files)

**If you already uploaded `archive.zip` using the Files section (left sidebar), run this cell to extract it:**


In [None]:
# Extract archive.zip that you uploaded via Files section
import zipfile
import os

zip_path = '/content/archive.zip'

# Check if zip exists
if os.path.exists(zip_path):
    print(f"✅ Found {zip_path}")
    print("📦 Extracting archive.zip...")
    
    # Extract to /content
    with zipfile.ZipFile(zip_path, 'r') as zip_ref:
        zip_ref.extractall('/content')
    
    print("✅ Extraction complete!")
    print("\n🔍 Looking for dataset folder...")
    
    # Check if train and test are directly in /content
    content_dir = '/content'
    content_items = os.listdir(content_dir)
    
    # Check if train and test folders exist directly in /content
    if 'train' in content_items and 'test' in content_items:
        # Dataset structure: /content/train/ and /content/test/
        dataset_path = '/content'
        print("✅ Found 'train' and 'test' folders directly in /content")
    else:
        # Look for a folder containing train and test
        dataset_path = None
        extracted_items = [item for item in content_items 
                          if os.path.isdir(os.path.join(content_dir, item)) 
                          and item not in ['drive', 'sample_data', '.config']]
        
        print(f"Checking folders: {extracted_items}")
        
        for item in extracted_items:
            item_path = os.path.join(content_dir, item)
            if os.path.isdir(item_path):
                subdirs = os.listdir(item_path)
                if 'train' in subdirs and 'test' in subdirs:
                    dataset_path = item_path
                    print(f"✅ Found dataset folder: {item_path}")
                    break
        
        # If still not found, check nested
        if dataset_path is None:
            for item in extracted_items:
                item_path = os.path.join(content_dir, item)
                for subitem in os.listdir(item_path):
                    subitem_path = os.path.join(item_path, subitem)
                    if os.path.isdir(subitem_path):
                        subdirs = os.listdir(subitem_path)
                        if 'train' in subdirs and 'test' in subdirs:
                            dataset_path = subitem_path
                            print(f"✅ Found nested dataset: {dataset_path}")
                            break
                if dataset_path:
                    break
    
    # Verify and set global variable
    if dataset_path and os.path.exists(dataset_path):
        train_path = os.path.join(dataset_path, 'train')
        test_path = os.path.join(dataset_path, 'test')
        
        if os.path.exists(train_path) and os.path.exists(test_path):
            print(f"\n✅ Dataset found at: {dataset_path}")
            train_contents = os.listdir(train_path)
            test_contents = os.listdir(test_path)
            print(f"   Train folders: {train_contents}")
            print(f"   Test folders: {test_contents}")
            print("✅ Dataset structure looks correct!")
            
            # Make it a global variable for use in later cells
            globals()['dataset_path'] = dataset_path
            print(f"\n✓ dataset_path set to: {dataset_path}")
        else:
            print(f"❌ train or test folders not found at {dataset_path}")
    else:
        print("\n❌ Could not find dataset with 'train' and 'test' folders.")
        print("Current /content structure:")
        print(f"   {content_items}")
else:
    print(f"❌ archive.zip not found at {zip_path}")
    print("Please make sure you uploaded archive.zip using the Files section (left sidebar)")
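
The folder search in the cell above checks `/content`, then one level down, then two levels down. The same idea can be sketched more compactly with `os.walk`, which visits every nested folder. This is an alternative sketch, not the notebook's own method: `find_dataset_root` and `skip_dirs` are hypothetical names, and unlike the cell above it searches to any depth.

```python
import os
import tempfile

def find_dataset_root(search_root, skip_dirs=('drive', 'sample_data', '.config')):
    """Return the first directory under search_root that contains both
    'train' and 'test' subfolders, or None if there is none."""
    for current_dir, subdirs, _files in os.walk(search_root):
        # Editing subdirs in place tells os.walk not to descend into them
        subdirs[:] = [d for d in subdirs if d not in skip_dirs]
        if 'train' in subdirs and 'test' in subdirs:
            return current_dir
    return None

# Small self-check on a throwaway directory tree
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, 'archive', 'data', 'train'))
    os.makedirs(os.path.join(tmp, 'archive', 'data', 'test'))
    found = find_dataset_root(tmp)
    rel = os.path.relpath(found, tmp)
    print(rel)
```

In Colab this would be called as `dataset_path = find_dataset_root('/content')`, followed by a check that the result is not `None`.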


## 🔍 Troubleshooting: Check What Was Extracted

**If extraction didn't work, run this cell to see what's in /content:**


In [None]:
# Diagnostic: Check what was extracted
import os

print("=" * 60)
print("DIAGNOSTIC: Checking /content directory")
print("=" * 60)

content_path = '/content'
if os.path.exists(content_path):
    items = os.listdir(content_path)
    print(f"\nItems in /content: {items}\n")
    
    for item in items:
        item_path = os.path.join(content_path, item)
        if os.path.isdir(item_path) and item not in ['drive', '.config']:
            print(f"📁 {item}/")
            try:
                subitems = os.listdir(item_path)
                print(f"   Contains: {subitems[:10]}..." if len(subitems) > 10 else f"   Contains: {subitems}")
                
                # Check if this folder has train/test
                if 'train' in subitems and 'test' in subitems:
                    print("   ✅ This folder has 'train' and 'test' - This is your dataset!")
                    print(f"   📍 Use: dataset_path = '/content/{item}'")
            except OSError as e:
                # Report unreadable folders instead of silently skipping them
                print(f"   ⚠️ Could not read folder: {e}")
        elif os.path.isfile(item_path):
            print(f"📄 {item}")
else:
    print("❌ /content directory doesn't exist")


## ✅ Quick Fix: Set Dataset Path (For Already Extracted Data)

**If your `train` and `test` folders are already directly in `/content`, run this cell to set the path:**


In [None]:
# Quick fix: Set dataset_path for your current structure
# Your train and test folders are directly in /content

dataset_path = '/content'  # Since train/ and test/ are here

# Verify
import os
train_path = os.path.join(dataset_path, 'train')
test_path = os.path.join(dataset_path, 'test')

if os.path.exists(train_path) and os.path.exists(test_path):
    print(f"✅ dataset_path set to: {dataset_path}")
    print(f"✅ Train folder exists: {train_path}")
    print(f"✅ Test folder exists: {test_path}")
    train_emotions = os.listdir(train_path)
    test_emotions = os.listdir(test_path)
    print(f"✅ Train emotions: {train_emotions}")
    print(f"✅ Test emotions: {test_emotions}")
    print("\n✅ Ready to proceed! Run the next cells to load data and train.")
else:
    print("❌ Error: train or test folders not found")


## 📁 Step 1: Upload Your Dataset

Choose ONE method below:
- **Method A**: Upload ZIP file directly (run the cell below)
- **Method B**: Use Google Drive (skip this, go to Step 2)


In [None]:
# METHOD A: Upload dataset as ZIP file
from google.colab import files
import zipfile
import os

# Upload zip file
print("Click 'Choose Files' and select your archive.zip file...")
uploaded = files.upload()

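# Find and extract the zip file
zip_files = [f for f in uploaded.keys() if f.endswith('.zip')]
if not zip_files:
    raise ValueError("No .zip file was uploaded. Re-run this cell and select archive.zip.")
zip_filename = zip_files[0]
print(f"\n📦 Extracting {zip_filename}...")

with zipfile.ZipFile(zip_filename, 'r') as zip_ref:
    zip_ref.extractall('/content')

print("✅ Dataset extracted!")

# Set the dataset path: use /content/archive if the ZIP contained that folder,
# otherwise fall back to /content when train/ sits at the top level of the ZIP
dataset_path = '/content/archive'
if not os.path.exists(dataset_path) and os.path.exists('/content/train'):
    dataset_path = '/content'

# Verify it exists
if os.path.exists(dataset_path):
    print(f"✅ Dataset found at: {dataset_path}")
    print(f"Contents: {os.listdir(dataset_path)}")
else:
    print("❌ Dataset not found. Please check the extraction.")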


## 📁 Step 2: OR Use Google Drive (Alternative to Step 1)

**Only use this if your dataset is already on Google Drive**


In [None]:
# METHOD B: Use Google Drive
from google.colab import drive
import os

# Mount Google Drive
drive.mount('/content/drive')

# Set path to your dataset on Drive
# UPDATE THIS PATH to where you uploaded your archive folder
dataset_path = '/content/drive/MyDrive/archive'  # ⬅️ CHANGE THIS IF NEEDED

# Verify it exists
if os.path.exists(dataset_path):
    print(f"✅ Dataset found at: {dataset_path}")
else:
    print(f"❌ Dataset not found at: {dataset_path}")
    print("Please update dataset_path above to point to your archive folder")


## 🔧 Step 3: Import Libraries and Define Functions


In [None]:
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import os
import cv2
from PIL import Image
import time

# Emotion labels
EMOTIONS = ['Angry', 'Disgust', 'Fear', 'Happy', 'Sad', 'Surprise', 'Neutral']
EMOTION_FOLDER_MAP = {
    'angry': 0,
    'disgust': 1,
    'fear': 2,
    'happy': 3,
    'sad': 4,
    'surprise': 5,
    'neutral': 6
}

def load_images_from_folder(folder_path, emotion_label):
    """Load images from a specific emotion folder."""
    images = []
    labels = []
    
    if not os.path.exists(folder_path):
        return images, labels
    
    image_files = [f for f in os.listdir(folder_path) 
                   if f.lower().endswith(('.png', '.jpg', '.jpeg', '.bmp'))]
    
    print(f"  Loading {len(image_files)} images from {os.path.basename(folder_path)}...")
    
    for img_file in image_files:
        img_path = os.path.join(folder_path, img_file)
        try:
            img = cv2.imread(img_path, cv2.IMREAD_GRAYSCALE)
            if img is None:
                pil_img = Image.open(img_path).convert('L')
                img = np.array(pil_img)
            img_resized = cv2.resize(img, (48, 48))
            img_normalized = img_resized.astype('float32') / 255.0
            images.append(img_normalized)
            labels.append(emotion_label)
        except Exception:
            # Skip files that cannot be read or decoded as images
            continue
    
    return images, labels

def load_fer2013_data(data_path):
    """Load FER2013 dataset from folder structure."""
    print("=" * 60)
    print("LOADING FER2013 DATASET")
    print("=" * 60)
    
    if not os.path.exists(data_path):
        print(f"ERROR: Dataset folder not found at {data_path}")
        return None, None, None, None
    
    print(f"Found dataset at: {data_path}")
    
    train_path = os.path.join(data_path, 'train')
    test_path = os.path.join(data_path, 'test')
    
    if not os.path.exists(train_path) or not os.path.exists(test_path):
        print(f"ERROR: 'train' or 'test' folders not found in {data_path}")
        return None, None, None, None
    
    # Load training images
    print("\n📂 Loading training images...")
    train_images = []
    train_labels = []
    
    start_time = time.time()
    for emotion_folder, label in EMOTION_FOLDER_MAP.items():
        emotion_path = os.path.join(train_path, emotion_folder)
        images, labels = load_images_from_folder(emotion_path, label)
        train_images.extend(images)
        train_labels.extend(labels)
    
    print(f"✓ Training images loaded in {time.time() - start_time:.2f} seconds")
    
    # Load test images
    print("\n📂 Loading test images...")
    test_images = []
    test_labels = []
    
    start_time = time.time()
    for emotion_folder, label in EMOTION_FOLDER_MAP.items():
        emotion_path = os.path.join(test_path, emotion_folder)
        images, labels = load_images_from_folder(emotion_path, label)
        test_images.extend(images)
        test_labels.extend(labels)
    
    print(f"✓ Test images loaded in {time.time() - start_time:.2f} seconds")
    
    # Convert to numpy arrays
    X_train = np.array(train_images, dtype='float32')
    y_train = np.array(train_labels, dtype='int32')
    X_test = np.array(test_images, dtype='float32')
    y_test = np.array(test_labels, dtype='int32')
    
    # Add channel dimension
    X_train = np.expand_dims(X_train, axis=-1)
    X_test = np.expand_dims(X_test, axis=-1)
    
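    # Stop early if nothing was loaded (for example, unexpected folder names)
    if len(X_train) == 0 or len(X_test) == 0:
        print("ERROR: No images loaded. Check that the emotion folder names match EMOTION_FOLDER_MAP.")
        return None, None, None, None
    
    print(f"\n{'='*60}")
    print(f"✓ Training samples: {len(X_train):,}")
    print(f"✓ Test samples: {len(X_test):,}")
    print(f"✓ Image shape: {X_train[0].shape}")
    print(f"{'='*60}\n")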
    
    return X_train, y_train, X_test, y_test

print("✅ Functions defined!")
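
To make the array handling in `load_fer2013_data` concrete, the sketch below repeats the same three steps (scale pixels to [0, 1], stack into one array, add a channel axis) on synthetic data. The random images are purely illustrative stand-ins for real 48x48 grayscale faces.

```python
import numpy as np

# Three fake 48x48 grayscale images with 8-bit pixel values (illustrative data)
rng = np.random.default_rng(0)
fake_images = [rng.integers(0, 256, size=(48, 48), dtype=np.uint8) for _ in range(3)]

# Scale to [0, 1], stack into one array, then add a trailing channel axis
normalized = [img.astype('float32') / 255.0 for img in fake_images]
batch = np.array(normalized, dtype='float32')
batch = np.expand_dims(batch, axis=-1)

print(batch.shape, batch.dtype)
```

The trailing axis of size 1 is what lets the array match the `(48, 48, 1)` input shape used when the model is built.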


## 📊 Step 4: Load the Dataset

**Make sure you've set `dataset_path` in Step 1 or Step 2 above!**


In [None]:
# Load the dataset
# Make sure dataset_path is set from Step 1 or Step 2 above!

X_train, y_train, X_test, y_test = load_fer2013_data(dataset_path)

if X_train is None:
    print("\n❌ Failed to load dataset!")
    print("Please check:")
    print("1. Did you run Step 1 (upload ZIP) OR Step 2 (mount Drive)?")
    print("2. Is dataset_path correctly set?")
    print("3. Does the dataset have 'train' and 'test' folders?")
else:
    print("✅ Dataset loaded successfully!")
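
Once the labels are loaded, it can help to check how many samples each emotion has, since a heavily imbalanced dataset changes how the accuracy numbers should be read. The sketch below is one way to do that with `np.bincount`; `class_counts` is a hypothetical helper and the toy labels stand in for `y_train`.

```python
import numpy as np

EMOTIONS = ['Angry', 'Disgust', 'Fear', 'Happy', 'Sad', 'Surprise', 'Neutral']

def class_counts(labels, num_classes=len(EMOTIONS)):
    """Count how many samples carry each integer label."""
    return np.bincount(np.asarray(labels, dtype=np.int64), minlength=num_classes)

# Toy labels standing in for y_train (hypothetical values for illustration)
toy_labels = np.array([3, 3, 0, 6, 3, 4, 6], dtype='int32')
counts = class_counts(toy_labels)
total = int(counts.sum())
for name, count in zip(EMOTIONS, counts.tolist()):
    print(f"{name:<9}{count:>4}  ({count / total:.1%})")
```

In the notebook the call would be `class_counts(y_train)` after the dataset has loaded.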


## 🏗️ Step 5: Build the Model


In [None]:
def build_model(input_shape=(48, 48, 1), num_classes=7):
    """Build CNN model for facial emotion recognition."""
    model = keras.Sequential([
        # First convolutional block
        layers.Conv2D(64, (3, 3), activation='relu', input_shape=input_shape),
        layers.BatchNormalization(),
        layers.MaxPooling2D((2, 2)),
        layers.Dropout(0.25),
        
        # Second convolutional block
        layers.Conv2D(128, (3, 3), activation='relu'),
        layers.BatchNormalization(),
        layers.MaxPooling2D((2, 2)),
        layers.Dropout(0.25),
        
        # Third convolutional block
        layers.Conv2D(256, (3, 3), activation='relu'),
        layers.BatchNormalization(),
        layers.MaxPooling2D((2, 2)),
        layers.Dropout(0.25),
        
        # Flatten and dense layers
        layers.Flatten(),
        layers.Dense(512, activation='relu'),
        layers.BatchNormalization(),
        layers.Dropout(0.5),
        layers.Dense(256, activation='relu'),
        layers.Dropout(0.5),
        
        # Output layer
        layers.Dense(num_classes, activation='softmax')
    ])
    
    return model

# Build model
print("Building CNN model...")
model = build_model()

# Compile model
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=0.001),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

# Display architecture
print("\nModel Architecture:")
model.summary()
print("\n✅ Model built and ready for training!")
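
The shapes reported by `model.summary()` can be worked out by hand. Each block in `build_model` applies a 3x3 convolution followed by 2x2 max pooling. Assuming the Keras defaults the code relies on (stride 1 and `'valid'` padding for the convolution, pool stride equal to the pool size), the spatial size shrinks as sketched below. The helper names are hypothetical.

```python
def conv_valid(size, kernel=3):
    """Output size of a stride-1 convolution with 'valid' (no) padding."""
    return size - kernel + 1

def max_pool(size, pool=2):
    """Output size of non-overlapping max pooling (any remainder is dropped)."""
    return size // pool

size = 48
filters_per_block = [64, 128, 256]
for filters in filters_per_block:
    size = max_pool(conv_valid(size))
    print(f"after block with {filters} filters: {size}x{size}x{filters}")

# Length of the vector produced by Flatten after the last block
flattened = size * size * filters_per_block[-1]
print(f"Flatten output length: {flattened}")
```

The sizes go 48 → 46 → 23 → 21 → 10 → 8 → 4, so the first dense layer receives a vector of 4 × 4 × 256 = 4096 values.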


## 🚀 Step 6: START TRAINING!

**This is the cell that actually trains your model. Run this and watch the progress bars!**


In [None]:
# Define callbacks
callbacks = [
    keras.callbacks.EarlyStopping(
        monitor='val_loss',
        patience=10,
        restore_best_weights=True,
        verbose=1
    ),
    keras.callbacks.ReduceLROnPlateau(
        monitor='val_loss',
        factor=0.5,
        patience=5,
        min_lr=0.00001,
        verbose=1
    ),
    keras.callbacks.ModelCheckpoint(
        'face_emotionModel.h5',
        monitor='val_accuracy',
        save_best_only=True,
        verbose=1
    )
]

print("=" * 60)
print("🚀 STARTING TRAINING")
print("=" * 60)
print("This may take 20-60 minutes with GPU, or 2-4 hours with CPU only.")
print("Watch for progress bars [=====>...] and metrics updating!")
print("=" * 60 + "\n")

start_time = time.time()

# THIS IS WHERE TRAINING HAPPENS!
history = model.fit(
    X_train, y_train,
    batch_size=64,
    epochs=50,
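    # Note: the test set doubles as validation data here, so the final test
    # accuracy is optimistic; a separate validation split would avoid this.
    validation_data=(X_test, y_test),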
    callbacks=callbacks,
    verbose=1
)

training_time = time.time() - start_time
print(f"\n✅ Training completed in {training_time/60:.2f} minutes")

# Load best weights if saved
if os.path.exists('face_emotionModel.h5'):
    model.load_weights('face_emotionModel.h5')
    print("✅ Loaded best model weights")

# Evaluate final model
print("\nEvaluating model on test set...")
test_loss, test_accuracy = model.evaluate(X_test, y_test, verbose=1)

print("\n" + "=" * 60)
print("🎉 TRAINING RESULTS")
print("=" * 60)
print(f"Test Accuracy: {test_accuracy:.4f} ({test_accuracy*100:.2f}%)")
print(f"Test Loss: {test_loss:.4f}")
print("Model saved as: face_emotionModel.h5")
print("=" * 60)
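
A single test accuracy can hide large differences between emotions, so a per-class breakdown is often a useful follow-up. The sketch below builds a confusion matrix and per-class recall with NumPy only. `confusion_matrix` and `per_class_recall` are hypothetical helpers, and the toy arrays stand in for real labels and predictions.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes=7):
    """Rows are true labels, columns are predicted labels."""
    matrix = np.zeros((num_classes, num_classes), dtype=np.int64)
    # np.add.at accumulates correctly even when (true, pred) pairs repeat
    np.add.at(matrix, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return matrix

def per_class_recall(matrix):
    """Fraction of each true class predicted correctly (NaN if class absent)."""
    totals = matrix.sum(axis=1)
    with np.errstate(divide='ignore', invalid='ignore'):
        return np.where(totals > 0, np.diag(matrix) / totals, np.nan)

# Toy labels and predictions (hypothetical values for illustration)
y_true_toy = np.array([0, 0, 3, 3, 3, 6])
y_pred_toy = np.array([0, 3, 3, 3, 6, 6])
cm = confusion_matrix(y_true_toy, y_pred_toy)
recall = per_class_recall(cm)
print(cm)
print(recall)
```

In the notebook, predictions could be obtained with `np.argmax(model.predict(X_test), axis=1)` and passed in together with `y_test`.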


In [None]:
# Download the trained model
from google.colab import files

files.download('face_emotionModel.h5')
print("✅ Model download initiated!")
print("📥 Check your Downloads folder for 'face_emotionModel.h5'")
print("📁 Then copy it to your FACE_DETECTION project folder!")
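
When the downloaded model is used elsewhere, its output is a vector of 7 probabilities in the same order as the `EMOTIONS` list from Step 3. The sketch below shows one way to turn such a vector into a label. `decode_prediction` is a hypothetical helper and the example probabilities are made up.

```python
import numpy as np

EMOTIONS = ['Angry', 'Disgust', 'Fear', 'Happy', 'Sad', 'Surprise', 'Neutral']

def decode_prediction(probabilities):
    """Map a length-7 probability vector to (label, confidence)."""
    probabilities = np.asarray(probabilities, dtype='float32')
    index = int(np.argmax(probabilities))
    return EMOTIONS[index], float(probabilities[index])

# Hypothetical softmax output for a single image
example = [0.05, 0.01, 0.04, 0.70, 0.10, 0.05, 0.05]
label, confidence = decode_prediction(example)
print(label, round(confidence, 2))
```

Input images need the same preprocessing as in training (grayscale, resized to 48x48, scaled to [0, 1], shape `(1, 48, 48, 1)`) before being passed to `model.predict`.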
