# Image Classification Project Overview

This project uses a convolutional neural network (CNN) to classify images into two categories based on their content:

- **O (Organic):** Images of biodegradable items, typically made of natural materials that decompose on their own. Examples include food scraps, paper, and wooden objects.

- **R (Recyclable):** This class encompasses images of items that can be processed and reused as raw material for new products. Common examples are plastics, metals, and glass.

The goal of the project is to assist in waste management by automatically sorting waste items into appropriate recycling or composting streams using image recognition technology.



# LIBRARIES


In [37]:
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, Dataset
from torchvision import transforms
from torchvision.datasets import ImageFolder
import matplotlib.pyplot as plt

# UNZIP DATASET

In [39]:
import zipfile
import os

# Path to the zip file
zip_path = './Waste Classification data.zip'  # Update this path if your ZIP file is located elsewhere

# Directory where you want to extract the contents
extract_to = './'  # This extracts the files into the current working directory

# Ensure the file path exists
if os.path.exists(zip_path):
    with zipfile.ZipFile(zip_path, 'r') as zip_ref:
        zip_ref.extractall(extract_to)
    print("Files have been extracted successfully!")
else:
    print(f"Error: The file {zip_path} does not exist.")

Files have been extracted successfully!


# MODEL

# EnhancedCNN4 Model Explanation

The `EnhancedCNN4` class defines a convolutional neural network (CNN) for image classification using PyTorch. This network is designed to process images with three color channels (e.g., RGB images) and classify them into two categories. The architecture includes multiple convolutional layers, batch normalization, pooling, and dropout for regularization. Here is a step-by-step breakdown of the model components:

## Components

### Convolutional Layers
- **Conv1:** The first convolutional layer takes an input with 3 channels and applies 32 filters of size 3x3 with a stride of 1 and padding of 1. This maintains the spatial dimensions of the input.
- **BatchNorm1:** Normalizes the output of Conv1 to stabilize learning and improve convergence rates.
- **Conv2:** Doubles the number of filters to 64, using the same kernel size, stride, and padding as Conv1. This layer further processes the data, extracting more complex features.
- **BatchNorm2:** Normalizes the output of Conv2.
- **Conv3:** Increases the filters to 128, following the same pattern to capture even more complex patterns in the data.
- **BatchNorm3:** Normalizes the output of Conv3.
- **Conv4:** Again doubles the filters to 256 for high-level feature extraction.
- **BatchNorm4:** Normalizes the output of Conv4.
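
As a quick check of the claims above, the following sketch applies the standard output-size formula `(n - k + 2p) // s + 1` and counts the trainable parameters of each convolutional layer. The helper names `conv_out` and `conv_params` are illustrative and not part of the notebook, and a 64x64 input is assumed. Pooling is ignored here, so every layer reports an unchanged 64x64 output.

```python
def conv_out(n, k=3, s=1, p=1):
    """Spatial size after a convolution with kernel k, stride s, and padding p."""
    return (n - k + 2 * p) // s + 1

def conv_params(c_in, c_out, k=3):
    """Trainable parameters of a 2D convolution: one k x k kernel per (in, out) pair plus one bias per filter."""
    return c_in * c_out * k * k + c_out

channels = [3, 32, 64, 128, 256]  # input channels followed by the Conv1..Conv4 filter counts
size = 64  # assumed input height and width
for i in range(4):
    size = conv_out(size)
    params = conv_params(channels[i], channels[i + 1])
    print(f"conv{i + 1}: {channels[i]} -> {channels[i + 1]} channels, "
          f"{size}x{size} output, {params} parameters")
```

With a 3x3 kernel, stride 1, and padding 1, the formula reduces to `n`, which is why these layers preserve the spatial dimensions.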

### Pooling Layers
- **Pool1 and Pool2:** These are max-pooling layers with a kernel size of 2 and stride of 2. They reduce the spatial dimensions of the feature maps, effectively summarizing the most prominent features in smaller dimensional representations.
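
To make the pooling step concrete, here is a minimal pure-Python sketch of 2x2 max pooling with stride 2 on a small hand-made feature map. The function `max_pool_2x2` and the sample values are hypothetical; the notebook itself relies on `nn.MaxPool2d`.

```python
def max_pool_2x2(grid):
    """Apply 2x2 max pooling with stride 2 to a 2D list with even height and width."""
    pooled = []
    for r in range(0, len(grid), 2):
        row = []
        for c in range(0, len(grid[0]), 2):
            # Collect the four values covered by the current window
            window = [grid[r][c], grid[r][c + 1], grid[r + 1][c], grid[r + 1][c + 1]]
            row.append(max(window))
        pooled.append(row)
    return pooled

feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [6, 2, 3, 3],
]
pooled = max_pool_2x2(feature_map)
print(pooled)  # [[4, 5], [6, 7]]
```

Each output value is the largest entry of one non-overlapping 2x2 window, so the height and width are both halved.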

### Dropout Layer
- **Dropout:** Randomly zeros some of the elements of the input tensor with probability 0.5 during training, which helps prevent overfitting.
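
The behaviour described above can be sketched without PyTorch. The function below is an illustrative re-implementation of inverted dropout, not the library code: during training each element is zeroed with probability `p` and the survivors are scaled by `1 / (1 - p)`, while in evaluation mode the input passes through unchanged. The name `dropout` and the fixed seed are assumptions made for the example.

```python
import random

def dropout(values, p=0.5, training=True, rng=random):
    """Inverted dropout: zero each element with probability p and rescale survivors by 1 / (1 - p)."""
    if not training or p == 0.0:
        return list(values)  # evaluation mode leaves activations untouched
    scale = 1.0 / (1.0 - p)
    return [0.0 if rng.random() < p else v * scale for v in values]

rng = random.Random(0)  # fixed seed so the illustration is repeatable
activations = [1.0] * 10
train_out = dropout(activations, p=0.5, training=True, rng=rng)
eval_out = dropout(activations, p=0.5, training=False)
print(train_out)  # each entry is either 0.0 or 2.0
print(eval_out)   # identical to the input
```

This is why the notebook calls `model.train()` before training and `model.eval()` before validation and testing: the same layer behaves differently in the two modes.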

### Fully Connected (Dense) Layers
- **FC:** A fully connected layer that takes the flattened output from the last pooling layer and maps it to 1024 neurons. It plays a role in learning non-linear combinations of the high-level features extracted by the convolutional layers.
- **FC2:** The final fully connected layer that maps the output of the previous layer to 2 neurons, corresponding to the two possible output classes.
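
The size of the flattened input to **FC** follows from the layer settings above. The sketch below works it out for a 64x64 input, which is the size the notebook resizes images to; the helper `linear_params` is a hypothetical name introduced only for this illustration.

```python
def linear_params(in_features, out_features):
    """Trainable parameters of a fully connected layer: one weight per (in, out) pair plus one bias per output."""
    return in_features * out_features + out_features

final_channels = 256        # filters produced by Conv4
final_size = 64 // 2 // 2   # 64x64 input halved by each of the two pooling layers
flat_features = final_channels * final_size * final_size

print(flat_features)                        # 65536
print(linear_params(flat_features, 1024))   # 67109888
print(linear_params(1024, 2))               # 2050
```

Almost all of the model's parameters therefore sit in the first fully connected layer, which is one reason dropout is applied right before it.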

## Forward Pass
The `forward` method defines the data flow through the network:
1. Input data passes through the first convolutional layer, then batch normalization, then ReLU activation.
2. The second convolutional block (convolution, batch normalization, ReLU) follows, and its output is downsampled by the first max-pooling layer.
3. The third and fourth convolutional blocks repeat this pattern, with the second max-pooling layer applied after the fourth block.
4. After the last pooling layer, the feature maps are flattened and passed through the dropout layer.
5. The flattened, regularized data is then processed by the first fully connected layer, followed by ReLU activation.
6. The final output is produced by the second fully connected layer without any activation, as this will typically be used with a loss function like cross-entropy that incorporates softmax.
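
To illustrate why the last layer can emit raw scores, the sketch below computes softmax and cross-entropy by hand for a single two-class prediction. It is a simplified stand-in for what `nn.CrossEntropyLoss` does internally (log-softmax followed by negative log-likelihood); the function names and the sample logits are hypothetical.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities, subtracting the max for numerical stability."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    """Negative log-probability assigned to the target class."""
    return -math.log(softmax(logits)[target])

logits = [2.0, 0.0]  # hypothetical raw outputs of FC2 for one image
probs = softmax(logits)
loss_if_class_0 = cross_entropy(logits, 0)
loss_if_class_1 = cross_entropy(logits, 1)
print(probs)
print(loss_if_class_0)  # small loss: the model favours class 0
print(loss_if_class_1)  # larger loss: the model is confidently wrong
```

Applying softmax inside the model as well would normalize the scores twice, so the network returns logits and leaves the normalization to the loss.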

This architecture is suitable for various image classification tasks where the input images are standardized to the same size and the output requires categorization into two distinct classes.


In [40]:
import numpy as np
import matplotlib.pyplot as plt
import torch
import torch.nn as nn
import torch.nn.functional as F
from sklearn.manifold import TSNE

class EnhancedCNN4(nn.Module):
    def __init__(self):
        super().__init__()

        # Hyperparameters shared by the convolutional and pooling layers
        in_channels = 3
        out_channels = 32
        k_size = 3
        stride_ = 1
        padding_ = 1
        pool_k_size = 2
        pool_stride = 2
        pool_padding = 0
        dropout_rate = 0.5

        self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size= k_size, stride=stride_, padding=padding_)
        self.bn1 = nn.BatchNorm2d(out_channels)


        in_channels = out_channels
        out_channels = out_channels*2

        self.conv2 = nn.Conv2d(in_channels, out_channels, kernel_size= k_size, stride= stride_, padding=padding_)
        self.bn2 = nn.BatchNorm2d(out_channels)

        self.pool1 = nn.MaxPool2d(kernel_size=pool_k_size, stride=pool_stride)

        in_channels = out_channels
        out_channels = out_channels*2

        self.conv3 = nn.Conv2d(in_channels, out_channels, kernel_size=k_size, stride=stride_, padding=padding_)
        self.bn3 = nn.BatchNorm2d(out_channels)

        in_channels = out_channels
        out_channels = out_channels*2

        self.conv4 = nn.Conv2d(in_channels, out_channels, kernel_size=k_size, stride=stride_, padding=padding_)
        self.bn4 = nn.BatchNorm2d(out_channels)

        self.pool2 = nn.MaxPool2d(kernel_size=pool_k_size, stride=pool_stride)
        



        # Calculate the size of the output from the last pooling layer
        def calc_output_dim(input_dim, kernel_size, stride, padding):
            return (input_dim - kernel_size + 2 * padding) // stride + 1
        
        #Initial dimension of the data is 64
        dim = 64
        # After conv1
        dim = calc_output_dim(dim, k_size, stride_, padding_)      
        # After conv2
        dim = calc_output_dim(dim, k_size, stride_, padding_)
        # After pool1
        dim = calc_output_dim(dim, pool_k_size, pool_stride, pool_padding)   
        # After conv3
        dim = calc_output_dim(dim, k_size, stride_, padding_)
        # After conv4
        dim = calc_output_dim(dim, k_size, stride_, padding_)

        # After pool2
        dim = calc_output_dim(dim, pool_k_size, pool_stride, pool_padding)          

        self.dropout = nn.Dropout(dropout_rate)
    
        self.fc = nn.Linear(in_features= out_channels*dim*dim, out_features=1024)
        
        # out_features is the number of classes to predict: O (Organic) and R (Recyclable), so 2 classes
        self.fc2 = nn.Linear(in_features=1024 , out_features=2)
        

    def forward(self, x):
        
        x = F.relu(self.bn1(self.conv1(x)))
        x = self.pool1(F.relu(self.bn2(self.conv2(x))))
        x = F.relu(self.bn3(self.conv3(x)))
        x = self.pool2(F.relu(self.bn4(self.conv4(x))))

        x = torch.flatten(x, 1)
        x = self.dropout(x)
        x = F.relu(self.fc(x))
        x = (self.fc2(x))
        return x



In [41]:
# Model initialization
model = EnhancedCNN4()

# The loss function
criterion = nn.CrossEntropyLoss()


# Optimizer

In [42]:
# Our optimizer
optimizer = optim.Adam(model.parameters(), lr=0.0009) # lr = learning rate

# Custom TRAIN DataSet

In [43]:
path_train_dataset = './Waste Classification data/TRAIN'

In [44]:
import cv2
from torchvision.transforms import functional as TF

# Define custom dataset class: a thin wrapper around ImageFolder
class CustomDataset(torch.utils.data.Dataset):
    def __init__(self, root_dir, transform=None):
        # Read images from the directory passed in rather than from a global path
        self.dataset = ImageFolder(root_dir, transform=transform)

    def __len__(self):
        return len(self.dataset)

    def __getitem__(self, idx):
        image, label = self.dataset[idx]
        return image, label

def color_skew(image):
    # Expects an RGB uint8 array; randomly rescales hue, saturation, and value
    image = cv2.cvtColor(image, cv2.COLOR_RGB2HSV)  # Convert RGB to HSV color space
    h, s, v = cv2.split(image)
    # OpenCV stores 8-bit hue in the range 0..179
    h = (h * np.random.uniform(low=0.5, high=1.5)).clip(0, 179).astype(h.dtype)
    s = (s * np.random.uniform(low=0.5, high=1.5)).clip(0, 255).astype(s.dtype)
    v = (v * np.random.uniform(low=0.5, high=1.5)).clip(0, 255).astype(v.dtype)
    image = cv2.merge((h, s, v))
    return cv2.cvtColor(image, cv2.COLOR_HSV2RGB)  # Convert back to RGB


# Define transformation to be applied to images
transform = transforms.Compose([
    transforms.Resize((64, 64)),  # Resize images to 64x64
    #transforms.RandomHorizontalFlip(),  # Randomly flip images horizontally
    #transforms.RandomVerticalFlip(),  # Randomly flip images vertically
    #transforms.RandomRotation(15),  # Randomly rotate images up to 15 degrees
    #transforms.Lambda(lambda x: color_skew(np.array(x))),  # Randomly adjust color skew
    transforms.ToTensor()          # Convert images to PyTorch tensors
])

# Create the dataset from the TRAIN folder; it is split into training and validation sets below
train_dataset = CustomDataset(root_dir=path_train_dataset, transform=transform)

# Split the dataset into training and validation sets
train_size = int(0.8 * len(train_dataset))
valid_size = len(train_dataset) - train_size
train_dataset, valid_dataset = torch.utils.data.random_split(train_dataset, [train_size, valid_size])

# Create DataLoader instances for training and validation
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=32, shuffle=True)
valid_loader = torch.utils.data.DataLoader(valid_dataset, batch_size=32, shuffle=False)


# TRAINING LOOP

In [45]:
model_output_name = 'enchanceCNN4_2classes_001.pth'

In [46]:
# Lists to store training and validation errors
train_errors = []
val_errors = []

# Training loop
min_loss = 0.2
num_epochs = 6
actual_epochs = 0  # To count the actual number of epochs before early stopping
for epoch in range(num_epochs):
    # Training
    model.train()  # Set the model to training mode
    total_train_loss = 0.0
    for images, labels in train_loader:
        # Forward pass
        outputs = model(images.view(images.size(0), 3,64,64))
        loss = criterion(outputs, labels)


        # Backward pass and optimization
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        total_train_loss += loss.item()

    # Calculate average training loss for the epoch
    avg_train_loss = total_train_loss / len(train_loader)
    train_errors.append(avg_train_loss)

    # Validation
    model.eval()  # Set the model to evaluation mode
    total_val_loss = 0.0
    with torch.no_grad():
        for images, labels in valid_loader:
            outputs = model(images.view(images.size(0), 3,64,64))
            loss = criterion(outputs, labels)
            total_val_loss += loss.item()

    # Calculate average validation loss for the epoch
    avg_val_loss = total_val_loss / len(valid_loader)
    val_errors.append(avg_val_loss)

    actual_epochs += 1  # Increment the actual number of epochs

    print(f'Epoch [{epoch+1}/{num_epochs}], Train Loss: {avg_train_loss:.4f}, Validation Loss: {avg_val_loss:.4f}')

    # Stop early and save a checkpoint once the validation loss drops below the threshold
    if avg_val_loss < min_loss:
        torch.save(model.state_dict(), model_output_name)
        break

# Adjust the range to the actual number of epochs completed
plt.plot(range(1, actual_epochs+1), train_errors, label='Training Error')
plt.plot(range(1, actual_epochs+1), val_errors, label='Validation Error')
plt.xlabel('Epochs')
plt.ylabel('Error')
plt.title('Training and Validation Errors')
plt.legend()
plt.show()


KeyboardInterrupt: 
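
The stopping rule in the loop above can be isolated as a small function. The sketch below replays a list of validation losses and reports when the `min_loss` threshold would end training; `run_until_threshold` is a hypothetical helper, and the loss values are made up for illustration rather than taken from this notebook.

```python
def run_until_threshold(val_losses, min_loss=0.2):
    """Return (epochs_run, stopped_early) for a threshold-based stopping rule."""
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < min_loss:
            return epoch, True  # the loop would save a checkpoint and break here
    return len(val_losses), False  # threshold never reached, all epochs ran

# Hypothetical validation losses, not results from this notebook
print(run_until_threshold([0.45, 0.31, 0.24, 0.19, 0.18, 0.17]))  # (4, True)
print(run_until_threshold([0.45, 0.38, 0.33, 0.31, 0.30, 0.30]))  # (6, False)
```

In the second case the loop never writes a checkpoint, which is why the weights are saved explicitly in a separate cell.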

# Save the trained model

In [30]:
model_output_name = 'enchanceCNN4_2classes_001.pth'

In [33]:
# Save the trained model
torch.save(model.state_dict(),model_output_name)

# Print the Model's Parameters

In [47]:
# Accessing the model's parameters
for name, param in model.named_parameters():
    print(f'Parameter name: {name}')
    print(f'Parameter shape: {param.shape}')
    print(f'Parameter values: {param}')

Parameter name: conv1.weight
Parameter shape: torch.Size([32, 3, 3, 3])
Parameter values: Parameter containing:
tensor([[[[-0.0358, -0.0566,  0.0474],
          [ 0.1448,  0.0009,  0.0214],
          [-0.0511,  0.1622,  0.0904]],

         [[ 0.1700, -0.1439, -0.0165],
          [ 0.1316, -0.1536,  0.0144],
          [ 0.0330, -0.0718, -0.0520]],

         [[-0.1832,  0.1060,  0.1355],
          [-0.1211,  0.1809,  0.0637],
          [ 0.0865,  0.0806,  0.1799]]],


        [[[-0.0157, -0.1204,  0.0255],
          [-0.1055,  0.0372, -0.1752],
          [-0.0444, -0.0309,  0.0518]],

         [[ 0.0434, -0.0023, -0.1629],
          [ 0.1131, -0.0744,  0.0997],
          [-0.0003, -0.0110, -0.0036]],

         [[-0.0885,  0.1025,  0.1508],
          [ 0.1844, -0.1803, -0.1589],
          [ 0.0789,  0.1904,  0.0409]]],


        [[[-0.0257,  0.1835, -0.1238],
          [ 0.0752, -0.1709, -0.0396],
          [ 0.1555, -0.1696, -0.0777]],

         [[-0.0808, -0.0291,  0.1648],
          [-

# TESTING 


In [50]:
#trial_model = model_output_name

#Model was trained for 6 epochs and the validation loss was 0.3
trial_model = 'trained_model.pth'
path_test_dataset = './Waste Classification data/TEST'

In [51]:
import torch
import torch.nn as nn
from torchvision import transforms
from torchvision.datasets import ImageFolder
import numpy as np
import cv2
from torch.utils.data import DataLoader, Dataset

model = EnhancedCNN4()
# map_location='cpu' lets the checkpoint load on machines without a GPU
model.load_state_dict(torch.load(trial_model, map_location='cpu'))
model.eval()  # Set the model to evaluation mode


# Use the same preprocessing as in training (resize to 64x64, convert to tensor)
test_transform = transforms.Compose([
    transforms.Resize((64, 64)),
    transforms.ToTensor()
])

# Load the TEST folder with ImageFolder, which infers labels from the subfolder names
test_dataset = ImageFolder(root=path_test_dataset, transform=test_transform)
test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False)

# Function to calculate accuracy
def calculate_accuracy(model, data_loader):
    correct = 0
    total = 0
    with torch.no_grad():
        for images, labels in data_loader:
            outputs = model(images)
            _, predicted = torch.max(outputs.data, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
    return 100 * correct / total

# Calculate and print the accuracy of the model
accuracy = calculate_accuracy(model, test_loader)
print(f'Accuracy of the model on the test images: {accuracy:.2f}%')


Accuracy of the model on the test images: 86.03%
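
The accuracy figure comes from comparing the index of the larger logit with the true label for every test image. The sketch below reproduces that computation in plain Python on a handful of made-up logits; the helper names and values are hypothetical. It assumes the class folders are named `O` and `R`, in which case `ImageFolder` assigns indices alphabetically, giving `O` index 0 and `R` index 1.

```python
def argmax(scores):
    """Index of the largest score, i.e. the predicted class."""
    return max(range(len(scores)), key=lambda i: scores[i])

def accuracy(batch_logits, labels):
    """Percentage of samples whose predicted class matches the label."""
    correct = sum(1 for scores, label in zip(batch_logits, labels) if argmax(scores) == label)
    return 100 * correct / len(labels)

# Hypothetical logits for four images, not outputs of the trained model
batch_logits = [[2.1, -0.3], [0.2, 1.5], [-1.0, 0.4], [0.9, 0.1]]
labels = [0, 1, 0, 0]  # 0 = O (Organic), 1 = R (Recyclable) under the stated assumption
result = accuracy(batch_logits, labels)
print(result)  # 75.0
```

Accuracy alone does not show which class is misclassified more often, so a per-class breakdown may be worth adding if the two classes are imbalanced.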
