# Exploratory Data Analysis

This notebook performs exploratory data analysis (EDA) on the waste classification dataset. The goal is to understand how the dataset is organized, visualize the class distribution, and prepare for training the EfficientNet model.

In [1]:
import os
import matplotlib.pyplot as plt
import seaborn as sns
from PIL import Image
import numpy as np

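# Set the path to the dataset (expects one subdirectory per class)
data_dir = '../data/'
# Keep only directories, sorted for a stable order;
# os.listdir also returns stray files such as .DS_Store
categories = sorted(
    d for d in os.listdir(data_dir)
    if os.path.isdir(os.path.join(data_dir, d))
)

print('Categories:', categories)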

In [2]:
# Visualize the distribution of classes
class_counts = [len(os.listdir(os.path.join(data_dir, category))) for category in categories]

plt.figure(figsize=(10, 6))
sns.barplot(x=categories, y=class_counts)
plt.title('Distribution of Waste Classes')
plt.xlabel('Waste Class')
plt.ylabel('Number of Images')
plt.xticks(rotation=45)
plt.show()
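
The bar chart shows imbalance visually; a numeric summary can make it easier to decide whether rebalancing is needed before training. The following is a minimal sketch, not part of the original analysis: `summarize_class_balance` is a hypothetical helper, the inverse-frequency weighting `total / (n_classes * count)` is one common heuristic assumed here, and the class names and counts in the example are made up.

```python
def summarize_class_balance(categories, class_counts):
    """Return the max/min imbalance ratio and inverse-frequency weights per class.

    Hypothetical helper for illustration; the weighting scheme is an assumption.
    """
    if len(categories) != len(class_counts):
        raise ValueError("categories and class_counts must have the same length")
    if not class_counts or min(class_counts) <= 0:
        raise ValueError("every class needs at least one image")

    total = sum(class_counts)
    n_classes = len(class_counts)
    # Ratio of the largest class to the smallest; 1.0 means perfectly balanced
    imbalance_ratio = max(class_counts) / min(class_counts)
    # Rarer classes receive proportionally larger weights
    weights = {
        category: total / (n_classes * count)
        for category, count in zip(categories, class_counts)
    }
    return imbalance_ratio, weights


# Example with made-up counts (not taken from the real dataset)
ratio, weights = summarize_class_balance(["glass", "paper"], [100, 300])
print(f"Imbalance ratio: {ratio:.2f}")
for name, weight in weights.items():
    print(f"{name}: weight {weight:.2f}")
```

In the notebook, the same call could take the `categories` and `class_counts` variables computed above.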

In [3]:
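# Display one sample image from each class
n_cols = 3
# Ceiling division so every class gets a subplot, however many classes there are
n_rows = max(1, -(-len(categories) // n_cols))
plt.figure(figsize=(12, 4 * n_rows))
for i, category in enumerate(categories):
    class_dir = os.path.join(data_dir, category)
    # Take the first file in sorted order so the choice is reproducible
    img_path = os.path.join(class_dir, sorted(os.listdir(class_dir))[0])
    img = Image.open(img_path)
    plt.subplot(n_rows, n_cols, i + 1)
    plt.imshow(img)
    plt.title(category)
    plt.axis('off')
plt.tight_layout()
plt.show()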
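
Showing a single fixed file per class may not be representative of that class. As a rough sketch, the hypothetical helper below picks a reproducible random sample of image files from one class directory. The extension list in `IMAGE_EXTENSIONS` is an assumption and may need adjusting to match the dataset.

```python
import os
import random

# Assumed set of image extensions; adjust to match the actual dataset
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def sample_image_paths(class_dir, k=3, seed=0):
    """Return up to k image paths from class_dir, chosen reproducibly.

    Hypothetical helper for illustration, not part of the original notebook.
    """
    # Sort first so the sample does not depend on filesystem ordering
    files = sorted(
        f for f in os.listdir(class_dir)
        if f.lower().endswith(IMAGE_EXTENSIONS)
    )
    # Local generator, so the global random state is left untouched
    rng = random.Random(seed)
    chosen = rng.sample(files, min(k, len(files)))
    return [os.path.join(class_dir, f) for f in chosen]
```

Calling `sample_image_paths(os.path.join(data_dir, category), k=3)` returns a list of paths that can be passed to `Image.open` one at a time; the list is empty when the directory holds no matching files.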

## Conclusion

In this notebook, we explored the waste classification dataset by plotting the class distribution and displaying a sample image from each class. The class counts and sample images inform the training setup for the EfficientNet model, for example whether class imbalance needs to be addressed.