# CNN Model

In [1]:
import numpy as np 
import pandas as pd 
import matplotlib.pyplot as plt
import os
import cv2
import tensorflow as tf

import seaborn as sns
import random
from sklearn.model_selection import train_test_split
from keras.utils import to_categorical

from keras import layers, models
from keras.optimizers import Adam
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score

## Basic Initial Implementation

In [5]:
model = models.Sequential([
    layers.InputLayer(input_shape=(30, 30, 3)), 
    layers.Conv2D(32, (5, 5), activation='relu', padding='same'),  # 5 x 5 filters in the initial model
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (5, 5), activation='relu', padding='same'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(256, activation='relu'),
    layers.Dense(106, activation='softmax')
])
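
As a sanity check on the architecture above, the following is a minimal sketch in plain Python that traces the tensor shape through the initial model. It assumes the Keras defaults of stride 1 for `Conv2D`, and a stride equal to the pool size with `'valid'` padding for `MaxPooling2D`. The helper names `conv_same` and `max_pool` are illustrative and not part of Keras.

```python
def conv_same(shape, filters):
    """Conv2D with padding='same' and stride 1: height and width are unchanged."""
    h, w, _ = shape
    return (h, w, filters)


def max_pool(shape, pool=2):
    """MaxPooling2D with stride == pool size and 'valid' padding: floor division."""
    h, w, c = shape
    return (h // pool, w // pool, c)


shape = (30, 30, 3)            # input image: 30 x 30 RGB
shape = conv_same(shape, 32)   # (30, 30, 32)
shape = max_pool(shape)        # (15, 15, 32)
shape = conv_same(shape, 64)   # (15, 15, 64)
shape = max_pool(shape)        # (7, 7, 64), since 15 // 2 == 7

# Number of features fed into the first Dense layer after Flatten
flat_features = shape[0] * shape[1] * shape[2]

print(shape, flat_features)
```

Because the 15 x 15 feature map has an odd side length, the second pooling step drops its last row and column, so the flattened vector has 7 x 7 x 64 = 3136 features.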

### Results & Limitations

Train Accuracy: 96.89%

Test Accuracy: 89.33%

The gap of 7.56 percentage points between training and test accuracy indicates that the initial model is overfitting to the training data. The test accuracy is also much lower than that of benchmark models for the individual datasets. Additionally, the 5 x 5 filter size is too large to detect smaller, more intricate features. Because our dataset is extremely large and diverse, a well-performing CNN model needs more capacity and extra non-linearity. This model is too simple to learn accurately on our dataset.

### Changes

To address these limitations, we made the following changes to our initial model.

1) Added convolutional layers to the model architecture. This increased the complexity of the model so that it could adapt to our more complex data.

2) Used a smaller filter size of 3 x 3. This lets the model detect intricate features while also reducing the number of parameters per filter, which improves efficiency.

3) Added a dropout layer. Dropout reduces overfitting and helps the model generalise better, because the model becomes less dependent on any particular set of features or parameters.

4) Added a third fully connected layer. This lets the model accommodate combinations of complex features and adds extra non-linearity.
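
The parameter saving from change 2 can be made concrete with a short calculation. The sketch below uses the standard count for a 2D convolution with a bias term, (kernel height x kernel width x input channels + 1) x filters. `conv2d_params` is a hypothetical helper written for this illustration, not a Keras function, and the channel counts are chosen to match the first block of the final model.

```python
def conv2d_params(kernel, in_channels, filters):
    """Trainable parameters of a Conv2D layer with a square kernel and a bias term."""
    return (kernel * kernel * in_channels + 1) * filters


# First convolution: 3 input channels (RGB), 64 filters
params_3x3 = conv2d_params(3, 3, 64)   # (3*3*3 + 1) * 64
params_5x5 = conv2d_params(5, 3, 64)   # (5*5*3 + 1) * 64

# Two stacked 3 x 3 layers see the same 5 x 5 receptive field as a single
# 5 x 5 layer, here with 64 input and 64 output channels
stacked_3x3 = 2 * conv2d_params(3, 64, 64)
single_5x5 = conv2d_params(5, 64, 64)

print(params_3x3, params_5x5)
print(stacked_3x3, single_5x5)
```

The 3 x 3 count for the first layer (1792) matches the `Param #` reported by `CNN_model.summary()` for the final implementation. Stacking two 3 x 3 layers also inserts an extra ReLU between them, which is one way the final model gains non-linearity while using fewer weights than a single 5 x 5 layer.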

## Final CNN Implementation

In [4]:
CNN_model = models.Sequential([
    layers.InputLayer(input_shape=(30, 30, 3)),
    
    layers.Conv2D(64, (3, 3), activation='relu', padding='same'),
    layers.Conv2D(64, (3, 3), activation='relu', padding='same'),
    layers.MaxPooling2D((2, 2)),
    
    layers.Conv2D(128, (3, 3), activation='relu', padding='same'),
    layers.Conv2D(128, (3, 3), activation='relu', padding='same'),
    layers.MaxPooling2D((2, 2)),
    
    layers.Flatten(),
    
    layers.Dense(512, activation='relu'),
    layers.Dropout(0.5),  # Adding dropout for regularization
    
    layers.Dense(256, activation='relu'),
    layers.Dense(106, activation='softmax')
])
CNN_model.summary()

Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d_8 (Conv2D)           (None, 30, 30, 64)        1792      
                                                                 
 conv2d_9 (Conv2D)           (None, 30, 30, 64)        36928     
                                                                 
 max_pooling2d_4 (MaxPoolin  (None, 15, 15, 64)        0         
 g2D)                                                            
                                                                 
 conv2d_10 (Conv2D)          (None, 15, 15, 128)       73856     
                                                                 
 conv2d_11 (Conv2D)          (None, 15, 15, 128)       147584    
                                                                 
 max_pooling2d_5 (MaxPoolin  (None, 7, 7, 128)         0         
 g2D)                                                 