# Exploratory Data Analysis (EDA) for Facial Expressions Dataset

This notebook explores the facial expressions image dataset. We count the images per emotion category in the training and test sets, view sample images from each category, and analyze image dimensions.

In [None]:
import sys
sys.path.append('..')
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Import EDA utility functions
from src.eda_utils import (
    get_class_distribution,
    plot_class_distribution,
    show_sample_images,
    analyze_image_dimensions
)

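# Set plot style
# The bare 'seaborn' matplotlib style name was deprecated in matplotlib 3.6
# (renamed 'seaborn-v0_8'), so apply the theme through seaborn itself.
sns.set_theme(style='whitegrid')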

# Define paths
TRAIN_PATH = '../archive (2)/train'
TEST_PATH = '../archive (2)/test'

## 1. Dataset Overview

Let's first examine the structure of our dataset and count the number of images in each emotion category.

In [None]:
# Get distributions for train and test sets
train_dist = get_class_distribution(TRAIN_PATH)
test_dist = get_class_distribution(TEST_PATH)

# Create DataFrames for visualization
train_df = pd.DataFrame(list(train_dist.items()), columns=['Emotion', 'Count'])
test_df = pd.DataFrame(list(test_dist.items()), columns=['Emotion', 'Count'])

print("Training Set Distribution:")
print(train_df)
print("\nTest Set Distribution:")
print(test_df)
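
The implementation of `get_class_distribution` lives in `src/eda_utils.py` and is not shown here. As a rough sketch of the idea, assuming the dataset has one subdirectory per emotion, a counter might look like the following. The function name `count_images_per_class` and the extension list are hypothetical and only for illustration.

```python
from pathlib import Path

# Hypothetical stand-in for src.eda_utils.get_class_distribution;
# the real helper may behave differently.
IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png'}  # assumed file types


def count_images_per_class(root):
    """Return {class_name: image_count} for each subdirectory of root."""
    counts = {}
    for class_dir in sorted(Path(root).iterdir()):
        if not class_dir.is_dir():
            continue  # skip stray files at the top level
        counts[class_dir.name] = sum(
            1 for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
        )
    return counts
```

A dictionary of this shape can be passed to `pd.DataFrame(list(counts.items()), columns=['Emotion', 'Count'])` exactly as in the cell above.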

## 2. Visualizing Class Distribution

Let's visualize the distribution of images across different emotion categories.

In [None]:
plot_class_distribution(train_df, 'Training Set Class Distribution')
plot_class_distribution(test_df, 'Test Set Class Distribution')
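
Alongside the bar charts, it can help to put a number on class imbalance. The sketch below is not part of `src.eda_utils`. It assumes a dictionary mapping emotion to image count (the shape `train_dist` appears to have) and computes each class's share plus the ratio of the largest class to the smallest. The name `summarize_imbalance` and the example counts are made up.

```python
def summarize_imbalance(counts):
    """Return per-class shares and the largest-to-smallest class ratio."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("counts must contain at least one image")
    shares = {name: n / total for name, n in counts.items()}
    smallest = min(counts.values())
    # An empty class makes the ratio unbounded.
    ratio = max(counts.values()) / smallest if smallest else float('inf')
    return shares, ratio


# Made-up counts, for illustration only (not taken from this dataset).
example_counts = {'happy': 300, 'sad': 150, 'disgust': 50}
shares, ratio = summarize_imbalance(example_counts)
print(f"imbalance ratio: {ratio:.1f}")
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

In this notebook the call would be `summarize_imbalance(train_dist)`. A ratio well above 1 suggests considering class weights or resampling before training.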

## 3. Image Analysis

Let's display sample images from each emotion category in the training set.

In [None]:
show_sample_images(TRAIN_PATH)

## 4. Image Statistics

Let's analyze the dimensions of the images in the training set.

In [None]:
analyze_image_dimensions(TRAIN_PATH)
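
What `analyze_image_dimensions` reports depends on its implementation in `src/eda_utils.py`, which is not shown here. As a minimal sketch of the kind of summary that is useful at this step, the hypothetical function below takes `(width, height)` pairs and reports how many distinct sizes occur and whether all images share one size. The pairs could be collected with Pillow's `Image.open(path).size`, for example. The sizes used here are made up.

```python
from collections import Counter


def summarize_dimensions(sizes):
    """Summarize an iterable of (width, height) pairs."""
    sizes = list(sizes)
    if not sizes:
        raise ValueError("no image sizes given")
    widths = [w for w, _ in sizes]
    heights = [h for _, h in sizes]
    return {
        'n_images': len(sizes),
        # Distinct sizes, ordered from most to least frequent
        'size_counts': Counter(sizes).most_common(),
        'width_range': (min(widths), max(widths)),
        'height_range': (min(heights), max(heights)),
        'uniform': len(set(sizes)) == 1,
    }


# Made-up sizes, for illustration only (not measured from this dataset).
example_sizes = [(48, 48), (48, 48), (64, 48)]
summary = summarize_dimensions(example_sizes)
print(summary['size_counts'])
print("all images same size:", summary['uniform'])
```

If `uniform` is false, the images need resizing or padding to a common shape before they can be batched for a model.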