# Emotion Classification: Data Preparation

In this notebook, you'll:

- Upload up to 10 face images per emotion class.
- Organize them into folders by emotion.
- Split the dataset into training and testing sets.

**Emotions to capture:** `['happy', 'sad', 'angry', 'surprised', 'neutral']` (the run recorded below covers only `happy`, `sad`, and `neutral`)

## 1. Setup and Imports
Import the required modules and create one directory per emotion:

In [10]:
# Cell 1: Imports
import os
from google.colab import files
import cv2
import matplotlib.pyplot as plt
import numpy as np

emotions = ['happy', 'sad', 'neutral']
data_dir = 'images'

os.makedirs(data_dir, exist_ok=True)
for emo in emotions:
    os.makedirs(os.path.join(data_dir, emo), exist_ok=True)
print("Setup complete. Data directories ready.")

Setup complete. Data directories ready.


## 2. Upload Images per Emotion
For each emotion, upload up to 10 images. The upload dialog closes once you have selected the files; only the first 10 are moved into that emotion's folder.

In [14]:
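def upload_emotion_images(emotion, max_images=10):
    """Upload files through the Colab dialog and move up to max_images into data_dir/<emotion>/."""
    print(f"\n Uploading {emotion} images (Max {max_images}):")
    uploaded = files.upload()  # files are saved to the current working directory
    for i, filename in enumerate(uploaded.keys()):
        if i >= max_images:
            # Files past the limit are not moved; they remain in the working directory.
            print(f"Warning: limit of {max_images} images reached for {emotion}; extra files were not moved")
            break
        os.rename(filename, os.path.join(data_dir, emotion, filename))
    print(f" Saved {min(len(uploaded), max_images)} {emotion} images")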

!mkdir -p images/{happy,sad,neutral}
!ls images


happy  neutral	sad


In [16]:
upload_emotion_images("happy")


 Uploading happy images (Max 10):


Saving 1000_F_120417629_8fGen5FVRAP4rCeZkVMLt7ZB4MR33zo1.jpg to 1000_F_120417629_8fGen5FVRAP4rCeZkVMLt7ZB4MR33zo1.jpg
Saving 1000_F_162586190_6C6Ufxm8bFYv44OS14LjS4ygnfbLCQm8.jpg to 1000_F_162586190_6C6Ufxm8bFYv44OS14LjS4ygnfbLCQm8.jpg
Saving close-up-portrait-smiling-laughing-attractive-man-happy-face-human-emotion-expression-african-american-having-fun-joy-154266727.webp to close-up-portrait-smiling-laughing-attractive-man-happy-face-human-emotion-expression-african-american-having-fun-joy-154266727.webp
Saving depositphotos_378472000-stock-photo-happy-little-child-smiling-emotions.jpg to depositphotos_378472000-stock-photo-happy-little-child-smiling-emotions.jpg
Saving happy vs well blog header.webp to happy vs well blog header.webp
Saving happy-child.jpg to happy-child.jpg
Saving Positive-emotions.webp to Positive-emotions.webp
Saving shutterstock_1687578475.webp to shutterstock_1687578475.webp
Saving smiling-man-wearing-a-plaid-shirt-jpg.webp to smiling-man-wearing-a-plaid-shirt-j

In [17]:
upload_emotion_images("sad")


 Uploading sad images (Max 10):


Saving 76a644dc-cdff-4fe2-9cc4-784260667d14-GettyImages-107429862.webp to 76a644dc-cdff-4fe2-9cc4-784260667d14-GettyImages-107429862.webp
Saving 960x0.webp to 960x0.webp
Saving 1000_F_206825274_frM1i88fNtBhWJ8ria4R9pf5kviFOAXt.jpg to 1000_F_206825274_frM1i88fNtBhWJ8ria4R9pf5kviFOAXt.jpg
Saving 1000_F_278337861_MA6DYoepz0gcxqbjP8yOLoq5Oz8bZwQm.jpg to 1000_F_278337861_MA6DYoepz0gcxqbjP8yOLoq5Oz8bZwQm.jpg
Saving african-american-man-with-beard-crying-depressed-full-of-sadness-expressing-sad-emotion-isolated-over-white-background-MNKT5X.jpg to african-american-man-with-beard-crying-depressed-full-of-sadness-expressing-sad-emotion-isolated-over-white-background-MNKT5X.jpg
Saving beautiful-brunette-young-woman-sad-260nw-206254474.webp to beautiful-brunette-young-woman-sad-260nw-206254474.webp
Saving crying-baby-1-scaled.jpg to crying-baby-1-scaled.jpg
Saving sad-kid.jpg to sad-kid.jpg
Saving shutterstock_2439969785.webp to shutterstock_2439969785.webp
Saving why-does-lack-of-sleep-make-me-em

In [18]:
upload_emotion_images("neutral")


 Uploading neutral images (Max 10):


Saving 1000_F_270568567_irmImDf5e1rKCXaevtqwYHPmGJ6whICF.jpg to 1000_F_270568567_irmImDf5e1rKCXaevtqwYHPmGJ6whICF.jpg
Saving close-up-portrait-handsome-young-man-neutral-poker-facial-expressions-human-emotions-attractive-confident-african-137916421.webp to close-up-portrait-handsome-young-man-neutral-poker-facial-expressions-human-emotions-attractive-confident-african-137916421.webp
Saving girl-neutral-emotions-girls-neutral-emotion-2HDPKN4.jpg to girl-neutral-emotions-girls-neutral-emotion-2HDPKN4.jpg
Saving man-neutral-expression-candid-portrait-facial-53831400.webp to man-neutral-expression-candid-portrait-facial-53831400.webp
Saving natural-portrait-young-attractive-man-his-s-looking-posing-neutral-face-expression-close-up-headshot-mixed-race-154272090.webp to natural-portrait-young-attractive-man-his-s-looking-posing-neutral-face-expression-close-up-headshot-mixed-race-154272090.webp
Saving natural-portrait-young-attractive-teenager-woman-looking-posing-neutral-face-expression-clo

In [19]:
print("\n Upload Summary:")
for emotion in emotions:
    count = len(os.listdir(os.path.join(data_dir, emotion)))
    print(f"- {emotion}: {count}/10 images")


 Upload Summary:
- happy: 10/10 images
- sad: 10/10 images
- neutral: 10/10 images
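
`os.listdir` counts every entry in a folder, so a stray non-image file or a hidden subdirectory would inflate the numbers above. A minimal sketch of a stricter count follows. It assumes the extensions seen in the uploads (`.jpg`, `.webp`) plus two other common ones; `IMAGE_EXTS` and `count_images` are hypothetical names, not part of the original notebook.

```python
import os

# Assumed set of image extensions; adjust it to match your own uploads.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".webp"}

def count_images(data_dir, emotions):
    """Return {emotion: number of regular files with an image extension}."""
    counts = {}
    for emotion in emotions:
        folder = os.path.join(data_dir, emotion)
        counts[emotion] = sum(
            1
            for name in os.listdir(folder)
            # Skip subdirectories and anything without a known image extension.
            if os.path.isfile(os.path.join(folder, name))
            and os.path.splitext(name)[1].lower() in IMAGE_EXTS
        )
    return counts
```

In this notebook it could be called as `count_images('images', ['happy', 'sad', 'neutral'])`.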


## 3. Prepare Train/Test Split
Gather all image paths and labels, then make a stratified 80/20 split so that each emotion keeps the same share in the training and test sets:

In [22]:
from sklearn.model_selection import train_test_split

image_paths = []
labels = []

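for emotion in emotions:
    emotion_dir = os.path.join(data_dir, emotion)
    # Sort the listing: os.listdir order is arbitrary, and a stable input
    # order is needed for random_state to reproduce the same split.
    for img_file in sorted(os.listdir(emotion_dir)):
        image_paths.append(os.path.join(emotion_dir, img_file))
        labels.append(emotion)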

X = np.array(image_paths)
y = np.array(labels)

X_train, X_test, y_train, y_test = train_test_split(
    X, y,
    test_size = 0.2,
    random_state = 42,
    stratify = y
)

print(f"Total images: {len(X)}")
print(f"Train set: {len(X_train)} ({(len(X_train)/len(X))*100:.1f}%)")
print(f"Test set: {len(X_test)} ({(len(X_test)/len(X))*100:.1f}%)")
print("\nClass distribution in test set:")
for emotion in ['happy', 'sad', 'neutral']:
    print(f"- {emotion}: {sum(y_test == emotion)} samples")

Total images: 30
Train set: 24 (80.0%)
Test set: 6 (20.0%)

Class distribution in test set:
- happy: 2 samples
- sad: 2 samples
- neutral: 2 samples
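
These counts follow from the arithmetic: 20% of 30 images is 6, and stratifying over three equally sized classes gives 2 test images per class. The sketch below is one way to double-check a split using only the standard library. `check_split` is a hypothetical helper, and the toy paths are placeholders rather than the real uploads.

```python
from collections import Counter

def check_split(train_paths, test_paths, train_labels, test_labels):
    """Return per-class counts for each set; raise if any path is in both sets."""
    overlap = set(train_paths) & set(test_paths)
    if overlap:
        raise ValueError(f"Paths in both sets: {sorted(overlap)}")
    return Counter(train_labels), Counter(test_labels)

# Toy data mirroring the notebook's shape: 10 placeholder paths per class,
# with the first 8 of each class used for training and the last 2 for testing.
classes = ["happy", "sad", "neutral"]
paths = {c: [f"images/{c}/img_{i}.jpg" for i in range(10)] for c in classes}
train_paths = [p for c in classes for p in paths[c][:8]]
test_paths = [p for c in classes for p in paths[c][8:]]
train_labels = [c for c in classes for _ in range(8)]
test_labels = [c for c in classes for _ in range(2)]

train_counts, test_counts = check_split(train_paths, test_paths, train_labels, test_labels)
print(dict(train_counts))  # {'happy': 8, 'sad': 8, 'neutral': 8}
print(dict(test_counts))   # {'happy': 2, 'sad': 2, 'neutral': 2}
```

With the real split, the call would be `check_split(X_train, X_test, y_train, y_test)`.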


## 4. Save Split Lists
Export file lists and labels to CSV for future use:
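
The exported notebook contains no code cell for this step. One way it might look, sketched with the standard-library `csv` module, is shown below. The helpers `save_split` and `load_split`, the column names `path` and `label`, and the output filenames are all assumptions, not taken from the original.

```python
import csv

def save_split(paths, labels, csv_path):
    """Write one (path, label) row per image, preceded by a header row."""
    if len(paths) != len(labels):
        raise ValueError("paths and labels must have the same length")
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["path", "label"])  # assumed column names
        writer.writerows(zip(paths, labels))

def load_split(csv_path):
    """Read a CSV written by save_split back into (paths, labels) lists."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))
    return [r["path"] for r in rows], [r["label"] for r in rows]
```

With the variables from section 3, this could be called as `save_split(X_train, y_train, 'train.csv')` and `save_split(X_test, y_test, 'test.csv')`.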

Answer the reflection questions below in Markdown:

1. **Hidden Bias:**  
   Identify one scenario where your current images might lead the model to learn a spurious signal (e.g. background, lighting). How would you test for and eliminate it?

2. **Edge Cases:**  
   Describe a face or expression that your dataset likely fails to capture. What impact could that have on real-world performance, and how would you address it?

3. **Generalization Strategy:**  
   With only 10 images per class, what’s one concrete augmentation or data-collection strategy you’d use to improve robustness, and why that choice?
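
As an illustration of what question 3 means by a concrete augmentation (one possibility, not a prescribed answer), the sketch below applies a random horizontal flip and a small brightness shift with NumPy. The function name `augment`, the 50% flip probability, and the default shift range of ±30 are assumptions chosen for the example.

```python
import numpy as np

def augment(image, rng, max_brightness_shift=30):
    """Return a randomly mirrored, brightness-shifted copy of a uint8 image."""
    out = image.copy()
    if rng.random() < 0.5:
        out = out[:, ::-1]  # mirror left-right (axis 1 is the width)
    shift = rng.integers(-max_brightness_shift, max_brightness_shift + 1)
    # Widen to int16 before adding so pixel values cannot wrap around, then clip.
    out = np.clip(out.astype(np.int16) + shift, 0, 255).astype(np.uint8)
    return out

rng = np.random.default_rng(0)
image = np.zeros((4, 6, 3), dtype=np.uint8)  # placeholder for a loaded face image
augmented = augment(image, rng)
print(augmented.shape, augmented.dtype)  # (4, 6, 3) uint8
```

Flipping keeps the expression intact while varying pose, and the brightness shift varies lighting; whether either helps with this particular dataset would need to be checked on the held-out test images.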






