# Exam on Convolutional Neural Networks (CNN)

Welcome to the Convolutional Neural Networks (CNN) practical exam. In this exam, you will work on an image classification task to predict the type of traffic sign. You are provided with a dataset of traffic sign images, and your task is to build, train, and evaluate a CNN model.

---

## Dataset Overview
### **Dataset:**
* Run the command under the `Load Data` section to download and unzip the data, or access it [here](https://drive.google.com/file/d/1HwMV-Lt_sWoxc5v6igmTxTwomS3DR6cQ/view?usp=sharing).
### **Dataset Name:** Traffic Signs

### **Description:**  
The dataset contains images of various German traffic signs labeled for classification purposes. Each image belongs to one of the 43 classes, representing different types of traffic signs.

### **Labels:**
```python
classes = {
    0:  'Speed limit (20km/h)',
    1:  'Speed limit (30km/h)',
    2:  'Speed limit (50km/h)',
    3:  'Speed limit (60km/h)',
    4:  'Speed limit (70km/h)',
    5:  'Speed limit (80km/h)',
    6:  'End of speed limit (80km/h)',
    7:  'Speed limit (100km/h)',
    8:  'Speed limit (120km/h)',
    9:  'No passing',
    10: 'No passing veh over 3.5 tons',
    11: 'Right-of-way at intersection',
    12: 'Priority road',
    13: 'Yield',
    14: 'Stop',
    15: 'No vehicles',
    16: 'Veh > 3.5 tons prohibited',
    17: 'No entry',
    18: 'General caution',
    19: 'Dangerous curve left',
    20: 'Dangerous curve right',
    21: 'Double curve',
    22: 'Bumpy road',
    23: 'Slippery road',
    24: 'Road narrows on the right',
    25: 'Road work',
    26: 'Traffic signals',
    27: 'Pedestrians',
    28: 'Children crossing',
    29: 'Bicycles crossing',
    30: 'Beware of ice/snow',
    31: 'Wild animals crossing',
    32: 'End speed + passing limits',
    33: 'Turn right ahead',
    34: 'Turn left ahead',
    35: 'Ahead only',
    36: 'Go straight or right',
    37: 'Go straight or left',
    38: 'Keep right',
    39: 'Keep left',
    40: 'Roundabout mandatory',
    41: 'End of no passing',
    42: 'End no passing veh > 3.5 tons'
}
```



## Load Data
Run the following command to get the data and unzip it. Alternatively, you can access the data [here](https://drive.google.com/file/d/1HwMV-Lt_sWoxc5v6igmTxTwomS3DR6cQ/view?usp=sharing).

In [None]:
!unzip /content/drive/MyDrive/Traffic_Signs.zip

Streaming output truncated to the last 5000 lines.
                 File exists
                 unable to process Traffic_Signs/Dataset/35/00035_00012_00003.png.
checkdir error:  cannot create Traffic_Signs
                 File exists
                 unable to process Traffic_Signs/Dataset/35/00035_00006_00026.png.
checkdir error:  cannot create Traffic_Signs
                 File exists
                 unable to process Traffic_Signs/Dataset/35/00035_00000_00023.png.
checkdir error:  cannot create Traffic_Signs
                 File exists
                 unable to process Traffic_Signs/Dataset/35/00035_00000_00013.png.
checkdir error:  cannot create Traffic_Signs
                 File exists
                 unable to process Traffic_Signs/Dataset/35/00035_00038_00016.png.
checkdir error:  cannot create Traffic_Signs
                 File exists
                 unable to process Traffic_Signs/Dataset/35/00035_00019_00002.png.
checkdir error:  cannot create Traffic

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


## Import Libraries

In [None]:
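import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers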
from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import StandardScaler
from sklearn.preprocessing import OneHotEncoder
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_curve, auc
from sklearn.preprocessing import label_binarize
from sklearn.multiclass import OneVsRestClassifier
from sklearn.metrics import f1_score
from sklearn.metrics import precision_score, recall_score
from sklearn.metrics import accuracy_score

## Data Preprocessing
In this section, preprocess the dataset by:
- Loading the images from the file paths.
- Resizing the images to a consistent size.
- Normalizing pixel values.

Add more if needed!
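
The unzip log in the Load Data section lists paths such as `Traffic_Signs/Dataset/35/00035_00012_00003.png`, which suggests one sub-folder per class. A minimal sketch, assuming the folder name is the class id (the helper name `label_from_path` is hypothetical):

```python
from pathlib import PurePosixPath

def label_from_path(path):
    """Return the class id encoded in the parent folder name of an image path."""
    return int(PurePosixPath(path).parent.name)

# Path copied from the unzip log; the folder "35" is assumed to be the label.
example = "Traffic_Signs/Dataset/35/00035_00012_00003.png"
print(label_from_path(example))  # 35
```

Reading the label from the folder rather than the file name avoids parsing the multi-part file names.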

In [None]:
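# Load the images from the file paths.

import os
import cv2
import numpy as np

def load_images_from_folder(folder):
    """Load images from a folder that holds one sub-folder per class (e.g. Dataset/35/*.png)."""
    images = []
    labels = []
    for class_name in sorted(os.listdir(folder)):
        class_dir = os.path.join(folder, class_name)
        if not os.path.isdir(class_dir) or not class_name.isdigit():
            continue  # skip stray files and folders that are not class ids
        for filename in os.listdir(class_dir):
            img = cv2.imread(os.path.join(class_dir, filename))
            if img is not None:  # cv2.imread returns None for unreadable files
                images.append(img)
                labels.append(int(class_name))  # the folder name is the class label
    return images, labels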

In [None]:
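# Resize the images to a consistent size.
IMG_WIDTH = 30
IMG_HEIGHT = 30

# Load the dataset; the path follows the folder layout shown in the unzip output above.
train_images, train_labels = load_images_from_folder('Traffic_Signs/Dataset')
test_images = []  # no separate test folder; the test set is created in the Data Splitting section

resized_train_images = [cv2.resize(img, (IMG_WIDTH, IMG_HEIGHT)) for img in train_images]
resized_test_images = [cv2.resize(img, (IMG_WIDTH, IMG_HEIGHT)) for img in test_images]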


In [None]:
import numpy as np
#Normalizing pixel values
normalized_train_images = np.array(resized_train_images) / 255.0
normalized_test_images = np.array(resized_test_images) / 255.0
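
Before moving on, it can help to confirm that preprocessing produced what the model expects. The sketch below uses randomly generated stand-in images (not the real dataset) and a hypothetical helper `normalize_images` to check the shape, dtype, and value range after scaling:

```python
import numpy as np

def normalize_images(images):
    """Stack equally sized uint8 images and scale them to float32 values in [0, 1]."""
    return np.stack(images).astype("float32") / 255.0

# Random stand-in images with the 30x30x3 shape used in this notebook.
rng = np.random.default_rng(0)
fake_images = [rng.integers(0, 256, size=(30, 30, 3), dtype=np.uint8) for _ in range(4)]

batch = normalize_images(fake_images)
print(batch.shape, batch.dtype)
print(batch.min() >= 0.0, batch.max() <= 1.0)
```

If the shape check fails with a stacking error, at least one image was not resized to the common size.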

## Data Splitting
In this section, we will split our dataset into three parts:

* Training set (70%).
* Validation set (15%).
* Test set (15%).

In [None]:
from sklearn.model_selection import train_test_split

if len(resized_train_images) > 0:

    train_images, temp_images, train_labels, temp_labels = train_test_split(resized_train_images, train_labels, test_size=0.3, random_state=42)

    if len(temp_images) > 0:

        val_images, test_images, val_labels, test_labels = train_test_split(temp_images, temp_labels, test_size=0.5, random_state=42)

        print("Training set size:", len(train_images))
        print("Validation set size:", len(val_images))
        print("Test set size:", len(test_images))
    else:
        print("Dataset too small to split into validation and test sets.")
else:
    print("Empty dataset. Cannot split.")


Empty dataset. Cannot split.
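
With 43 classes, a purely random split can leave rare classes under-represented in the validation or test set. A hedged sketch of the same 70/15/15 split with the `stratify` argument of `train_test_split`, run on synthetic labels rather than the real images (all array names here are placeholders):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in: 200 samples in 4 balanced classes (the real task has 43).
X = np.arange(200).reshape(200, 1)
y = np.repeat(np.arange(4), 50)

# 70% train, 30% held out; stratify keeps the class proportions equal in both parts.
X_train, X_temp, y_train, y_temp = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42)

# Split the held-out 30% in half: 15% validation, 15% test.
X_val, X_test, y_val, y_test = train_test_split(
    X_temp, y_temp, test_size=0.5, stratify=y_temp, random_state=42)

print(len(X_train), len(X_val), len(X_test))
print(np.bincount(y_train))  # samples per class in the training set
```

Stratification requires every class to have at least two samples in the array being split, so very rare classes may need to be handled separately.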


## Building the CNN Model
In this section, define the architecture of the CNN model. The architecture may consist of:
- Convolutional layers with max-pooling
- Dropout layers
- Flatten layer
- Dense layers
- Output layer

Add and remove any of these as needed!

In [None]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

IMG_WIDTH = 30
IMG_HEIGHT = 30
NUM_CATEGORIES = 43

model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(IMG_WIDTH, IMG_HEIGHT, 3)))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(NUM_CATEGORIES, activation='softmax'))

  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
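
To see how the layers above shrink a 30x30 input, the feature-map sizes and parameter counts can be worked out by hand. The sketch below assumes the Keras defaults for these layers (stride-1 convolutions with `'valid'` padding, pooling stride equal to the pool size); the helper names are made up for illustration:

```python
def conv_out(size, kernel):
    """Output size of a stride-1 convolution with 'valid' padding."""
    return size - kernel + 1

def pool_out(size, pool):
    """Output size of non-overlapping max pooling (floor division)."""
    return size // pool

size = 30                                # IMG_WIDTH == IMG_HEIGHT == 30
size = pool_out(conv_out(size, 3), 2)    # Conv2D(32, 3x3) -> 28, MaxPooling2D(2x2) -> 14
size = pool_out(conv_out(size, 3), 2)    # Conv2D(64, 3x3) -> 12, MaxPooling2D(2x2) -> 6
flat = size * size * 64                  # Flatten: 6 * 6 * 64 features

params = {
    "conv1": (3 * 3 * 3 + 1) * 32,       # kernel weights plus one bias per filter
    "conv2": (3 * 3 * 32 + 1) * 64,
    "dense": (flat + 1) * 128,
    "output": (128 + 1) * 43,
}
print(size, flat, sum(params.values()))  # 6 2304 319979
```

Most of the parameters sit in the first dense layer, which is one reason dropout is placed right after it.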


## Training the Model
Train the CNN model using the training data and validate it on the validation set.

In [None]:
from tensorflow import keras
from tensorflow.keras import layers

# Dense (non-convolutional) baseline to compare against the CNN.
model_ann = keras.Sequential([
    layers.Input(shape=(IMG_HEIGHT, IMG_WIDTH, 3)),
    layers.Rescaling(1./255),  # preprocessing layer: expects raw 0-255 pixel values
    layers.Flatten(),
    layers.Dense(units=64, activation='relu'),
    layers.Dense(units=32, activation='relu'),
    layers.Dense(units=16, activation='relu'),
    layers.Dense(units=NUM_CATEGORIES, activation='softmax'),  # one probability per class
])
model_ann.summary()

In [None]:
import numpy as np
import tensorflow as tf

# Reshape images to a consistent size
IMG_WIDTH = 30
IMG_HEIGHT = 30
X_train = np.array([cv2.resize(img, (IMG_WIDTH, IMG_HEIGHT)) for img in train_images])
X_test = np.array([cv2.resize(img, (IMG_WIDTH, IMG_HEIGHT)) for img in test_images])

# Normalize pixel values
X_train = X_train / 255.0
X_test = X_test / 255.0

# Convert labels to categorical (one-hot encoding)
y_train = np.array(train_labels)
y_test = np.array(test_labels)
y_train = tf.keras.utils.to_categorical(y_train, num_classes=NUM_CATEGORIES)
y_test = tf.keras.utils.to_categorical(y_test, num_classes=NUM_CATEGORIES)

print("Preprocessing complete.")


In [None]:
# Compile the baseline. There are 43 mutually exclusive classes and the baseline
# takes integer labels, so use sparse categorical cross-entropy.
model_ann.compile(
  optimizer='adam',
  loss='sparse_categorical_crossentropy',
  metrics=['accuracy'],
)

In [None]:
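import numpy as np
import tensorflow as tf

# Labels are one-hot encoded (see y_train above), so compile with categorical cross-entropy.
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Train on the training data, validate on the validation set, and keep the history for the plots.
X_val = np.array(val_images) / 255.0
y_val = tf.keras.utils.to_categorical(np.array(val_labels), num_classes=NUM_CATEGORIES)
history = model.fit(X_train, y_train, epochs=10, validation_data=(X_val, y_val))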

NameError: name 'val_images' is not defined
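
The choice of loss depends on how the labels are encoded: one-hot labels pair with `categorical_crossentropy`, integer labels with `sparse_categorical_crossentropy`. The NumPy sketch below illustrates that the two compute the same quantity; it is not Keras's implementation, and the probabilities are made up:

```python
import numpy as np

labels = np.array([2, 0, 1])      # integer class ids
one_hot = np.eye(3)[labels]       # one-hot encoding with 3 classes

# Made-up softmax outputs for three samples (each row sums to 1).
probs = np.array([[0.1, 0.2, 0.7],
                  [0.8, 0.1, 0.1],
                  [0.3, 0.4, 0.3]])

# Categorical cross-entropy computed from the one-hot labels.
cce = -np.sum(one_hot * np.log(probs), axis=1).mean()

# Sparse version: pick the probability of the true class directly.
sparse_cce = -np.log(probs[np.arange(len(labels)), labels]).mean()

print(np.isclose(cce, sparse_cce))
```

Mixing the encodings (for example, integer labels with `categorical_crossentropy`) typically fails with a shape mismatch, so the labels passed to `fit` and `evaluate` should use the same encoding as the compiled loss.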

## Evaluate the Model
Evaluate the performance of the model on the test set.

In [None]:
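# Evaluate the dense baseline on the validation split (it rescales pixels internally).
model_ann.evaluate(np.array(val_images), np.array(val_labels))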

In [None]:
# Evaluate the model on the test set
test_loss, test_acc = model.evaluate(X_test, y_test, verbose=0)
print('Test accuracy:', test_acc)
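
Overall accuracy can hide classes the model rarely gets right. A sketch of a per-class breakdown, using made-up predictions and a NumPy-only confusion matrix as a stand-in for `sklearn.metrics.confusion_matrix` (in the notebook, `y_true` and `y_pred` would come from `np.argmax` over `y_test` and over `model.predict(X_test)`):

```python
import numpy as np

def confusion(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)   # count each (true, predicted) pair
    return cm

# Made-up labels for illustration only.
y_true = np.array([0, 0, 1, 1, 2, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0, 2])

cm = confusion(y_true, y_pred, num_classes=3)
recall = cm.diagonal() / cm.sum(axis=1)  # fraction of each true class that was found
accuracy = cm.trace() / cm.sum()

print(cm)
print(recall, accuracy)
```

A class with low recall but many samples in the confusion matrix row points to a systematic confusion with another sign rather than a lack of data.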

## Make Predictions
Use the trained model to make predictions on new or unseen traffic sign images.

In [None]:
#Use the trained model to make predictions on new or unseen traffic sign images.

new_image_dir = 'new_images'

for filename in os.listdir(new_image_dir):
    if filename.endswith(".png"):
        image_path = os.path.join(new_image_dir, filename)
        image = cv2.imread(image_path)
        if image is None:
            continue  # skip files OpenCV cannot read
        resized_image = cv2.resize(image, (IMG_WIDTH, IMG_HEIGHT))  # Resize to match model input
        normalized_image = resized_image / 255.0  # Normalize pixel values

        # Add a batch dimension to match the model's input shape
        input_image = np.expand_dims(normalized_image, axis=0)

        prediction = model.predict(input_image)
        predicted_class = np.argmax(prediction)

        print(f"Image: {filename}, Predicted Class: {predicted_class}")
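
The printed class index is easier to read when mapped back to the label names listed under Labels. A minimal sketch, assuming a made-up probability vector and only a three-entry excerpt of `classes` (the helper `top_k` is hypothetical):

```python
import numpy as np

# Excerpt of the label table from the Dataset Overview section.
classes = {0: 'Speed limit (20km/h)', 1: 'Speed limit (30km/h)', 2: 'Speed limit (50km/h)'}

def top_k(probabilities, k=2):
    """Return the k most likely (class name, probability) pairs, best first."""
    order = np.argsort(probabilities)[::-1][:k]
    return [(classes[int(i)], float(probabilities[i])) for i in order]

# Made-up softmax output for a single image.
prediction = np.array([0.05, 0.80, 0.15])
for name, p in top_k(prediction):
    print(f"{name}: {p:.2f}")
```

Showing the runner-up class alongside the top prediction makes near-misses between similar signs, such as neighbouring speed limits, easier to spot.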

If you need new images, we have prepared some data for you [here](https://drive.google.com/file/d/1S_vpQntND9839x8kJpegaEgtSIA4JxHO/view?usp=sharing); alternatively, you can run the following command to get the data and unzip it.

<small>Note: the file contains metadata that tells you what each image contains. <b>THIS IS ONLY FOR YOU TO CHECK YOUR PREDICTIONS.</b></small>

## Model Performance Visualization
Visualize performance metrics such as accuracy and loss over the epochs.

In [None]:
# Plot training & validation accuracy values
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('Model accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()

In [None]:
# Plot training & validation loss values
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('Model loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()
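
One way to read these curves is to look for the epoch where validation loss stops improving while training loss keeps falling. A small sketch on an invented dictionary shaped like `history.history` (the numbers are illustrative, not results from this notebook):

```python
# Invented values with the same keys Keras stores in history.history.
history_dict = {
    "loss":     [2.10, 1.20, 0.70, 0.40, 0.25, 0.15],
    "val_loss": [1.90, 1.10, 0.80, 0.75, 0.82, 0.95],
}

val_loss = history_dict["val_loss"]
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)  # 0-based index

# Gap between validation and training loss at the end of training;
# a large, growing gap is a common sign of overfitting.
final_gap = val_loss[-1] - history_dict["loss"][-1]

print(f"Best epoch: {best_epoch + 1}, final gap: {final_gap:.2f}")
```

In this invented example the validation loss bottoms out before training ends, which would suggest stopping earlier or adding regularization.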

## Save the Model
Save the trained CNN model for submission.

In [None]:
model.save('modle2.keras')  # `model` is the CNN defined and trained above

## Project Questions:

1. **Data Preprocessing**: Explain why you chose your specific data preprocessing techniques (e.g., resizing images, normalization, data augmentation). How do these preprocessing steps improve the performance of your CNN model?
2. **Model Architecture**: Describe the architecture of your CNN model (e.g., number of convolutional layers, kernel sizes, pooling layers). Why did you choose this structure, and how do you expect each layer to contribute to feature extraction?
3. **Activation Functions**: Justify your choice of activation functions. How do they influence the training and output of your CNN?
4. **Training Process**: Discuss your choice of batch size, number of epochs, and optimizer. How did these decisions impact the training process and the convergence of the model?
5. **Loss Function and Metrics**: Explain why you chose the specific loss function and evaluation metrics for this classification task. How do they align with the goal of correctly classifying traffic signs?
6. **Regularization Techniques**: If you used regularization methods like dropout or batch normalization, explain why you implemented them and how they helped prevent overfitting in your model.
7. **Model Evaluation**: Justify the method you used to evaluate your model's performance on the test set. Why did you select these evaluation techniques, and what insights did they provide about your model's accuracy and generalization ability?
8. **Model Visualization**: Explain the significance of the performance visualizations (e.g., accuracy and loss curves). What do they tell you about your model's training process and its ability to generalize?
9. **Overfitting and Underfitting**: Analyze whether the model encountered any overfitting or underfitting during training. What strategies could you implement to mitigate these issues?

### Answer Here: