<a href="https://colab.research.google.com/github/rberbenkova/project-1-deep-learning-image-classification-with-cnn/blob/main/Copy_of_main.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Project I | Deep Learning: Image Classification with CNN**

# **Task Description:**
Students will build a Convolutional Neural Network (CNN) model to classify images from a given dataset into predefined categories/classes.


## Datasets (pick one!)

The **first dataset** is **CIFAR-10**, which consists of 60,000 32x32 color images in 10 classes, with 6,000 images per class. You can download it from https://www.cs.toronto.edu/~kriz/cifar.html.

The **second dataset** contains about 28,000 medium-quality animal images belonging to 10 categories: dog, cat, horse, spider, butterfly, chicken, sheep, cow, squirrel, and elephant. It is available at https://www.kaggle.com/datasets/alessiocorrado99/animals10/data.

# Assessment Components

**Data Preprocessing**
*   Load and preprocess the data (e.g., normalization, resizing, augmentation).
*   Visualize some of the images together with their labels.


**Model Architecture**
* Design a CNN architecture suitable for image classification
* Include convolutional layers, pooling layers, and fully connected layers.


**Model Training**
*   Train the CNN model using appropriate optimization techniques (e.g., stochastic gradient descent, Adam).
*   Utilize techniques such as early stopping to prevent overfitting.
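
The early-stopping idea mentioned above can be sketched without any framework. The helper below is a hypothetical illustration: the name `early_stopping_epoch` is made up for this example, and its `patience` and `min_delta` parameters are assumptions that loosely mirror the arguments of the same name on Keras's `EarlyStopping` callback. It scans a list of per-epoch validation losses and reports where training would stop.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (stop_epoch, best_epoch), both 0-indexed.

    Training stops once the validation loss has failed to improve by more
    than `min_delta` for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    # Early stopping never triggered: training ran for all epochs.
    return len(val_losses) - 1, best_epoch


# Made-up validation losses: improvement stalls after epoch 3.
losses = [1.90, 1.52, 1.31, 1.25, 1.27, 1.26, 1.30, 1.35]
stop_epoch, best_epoch = early_stopping_epoch(losses, patience=3)
print(stop_epoch, best_epoch)  # 6 3
```

In practice the weights from `best_epoch` are usually the ones worth keeping, since the epochs after it only made the validation loss worse.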


**Model Evaluation**
*   Evaluate the trained model on a separate validation set.
*   Compute and report metrics such as accuracy, precision, recall, and F1-score.
*   Visualize the confusion matrix to understand model performance across different classes.

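
As a rough sketch of how these metrics relate to the confusion matrix, the NumPy example below derives accuracy and per-class precision, recall, and F1 from integer class labels. The labels are made up, and the helper names `confusion_matrix` and `per_class_metrics` are defined here for illustration only; scikit-learn's `confusion_matrix` and `classification_report` offer comparable functionality.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)
    return cm


def per_class_metrics(cm):
    """Return accuracy plus per-class precision, recall and F1."""
    tp = np.diag(cm).astype(float)
    predicted = cm.sum(axis=0)  # column sums: how often each class was predicted
    actual = cm.sum(axis=1)     # row sums: true samples per class
    # Guard against division by zero for classes that never occur.
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom, out=np.zeros_like(tp), where=denom > 0)
    accuracy = tp.sum() / cm.sum()
    return accuracy, precision, recall, f1


# Toy labels for a 3-class problem (made up for illustration).
y_true = np.array([0, 0, 1, 1, 2, 2, 2, 0])
y_pred = np.array([0, 1, 1, 1, 2, 0, 2, 0])
cm = confusion_matrix(y_true, y_pred, num_classes=3)
accuracy, precision, recall, f1 = per_class_metrics(cm)
print(cm)
print("accuracy:", accuracy)
print("precision:", precision.round(3))
print("recall:", recall.round(3))
print("f1:", f1.round(3))
```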

**Transfer Learning**
*   Evaluate the accuracy obtained with a model pre-trained on ImageNet, such as VGG16 or Inception (pick one and justify your choice).
*   You may find this link helpful.
*   This is the PyTorch version.
*   Perform transfer learning with your chosen pre-trained models, i.e., you will probably try a few and choose the best one.


**Code Quality**
*   Well-structured and commented code.
*   Proper documentation of functions and processes.
*   Efficient use of libraries and resources.


**Report**

Write a concise report detailing the approach taken, including:
*   Description of the chosen CNN architecture.
*   Explanation of preprocessing steps.
*   Details of the training process (e.g., learning rate, batch size, number of epochs).
*   Results and analysis of the models' performance.
*   What is your best model, and why?
*   Insights gained from the experimentation process.

Include visualizations and diagrams where necessary.


**Model Deployment**
*   Pick the best model.
*   Build an app using Flask. Can you host it somewhere other than your laptop? +5 bonus points if you use TensorFlow Serving.
*   The user should be able to upload one or multiple images and get predictions, including the probabilities for each prediction.


# Evaluation Criteria

*   Accuracy of the trained models on the validation set: 30 points
*   Clarity and completeness of the report: 20 points
*   Quality of code implementation: 5 points
*   Proper handling of data preprocessing and model training: 30 points
*   Demonstration of understanding of key deep learning concepts: 5 points
*   Model deployment: 10 points

The passing score is 70 points.

# Submission Details

*   Deadline for submission: end of the week, or as communicated by your teaching team.
*   Submit the following:
    *   Python code files (`*.py`, `*.ipynb`) containing the model implementation and training process.
    *   A data folder with 5-10 images to test the deployed model/app, if it is hosted somewhere other than your laptop (strongly recommended, but not required).
    *   A PDF report documenting the approach, results, and analysis.
    *   Any additional files necessary for reproducing the results (e.g., `requirements.txt`, `README.md`).
    *   A PPT presentation.


# Additional Notes

*   Students are encouraged to experiment with different architectures, hyperparameters, and optimization techniques.
*   Provide guidance and resources for troubleshooting common issues during model training and evaluation.
*   Students will discuss their approaches and findings in class during assessment evaluation sessions.

*   Visualize the images in the CIFAR-10 dataset. Create a 10 x 10 plot showing 10 random samples from each class.
*   Convert the labels to one-hot encoded form.
*   Normalize the images.
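
The three steps above can be sketched in plain NumPy. To keep the example self-contained it uses synthetic arrays with the same layout as the output of `cifar10.load_data()` (an assumption: `uint8` images of shape `(N, 32, 32, 3)` and integer labels of shape `(N, 1)`); the variable names are illustrative. The one-hot step is the same encoding that `to_categorical` from the imports cell would produce.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins with the same layout as cifar10.load_data():
# uint8 images of shape (N, 32, 32, 3) and integer labels of shape (N, 1).
num_classes = 10
x = rng.integers(0, 256, size=(200, 32, 32, 3), dtype=np.uint8)
y = np.repeat(np.arange(num_classes), 20).reshape(-1, 1)

# Normalize: scale pixel values from [0, 255] to [0, 1].
x_norm = x.astype("float32") / 255.0

# One-hot encode: row i of the identity matrix is the code for class i.
y_onehot = np.eye(num_classes, dtype="float32")[y.ravel()]

# Pick 10 random sample indices per class for a 10 x 10 grid.
grid_indices = np.stack([
    rng.choice(np.flatnonzero(y.ravel() == c), size=10, replace=False)
    for c in range(num_classes)
])

print(x_norm.dtype, x_norm.min() >= 0.0, x_norm.max() <= 1.0)
print(y_onehot.shape, grid_indices.shape)
```

Row `c` of `grid_indices` holds the indices of ten images of class `c`, so each row can be drawn into one row of a `plt.subplots(10, 10)` grid with `imshow`.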




In [None]:
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.utils import to_categorical

# Load CIFAR-10 dataset
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

In [None]:
# Your code here :

## Define the following model (same as the one in the tutorial)

For the convolutional front-end, start with a single convolutional layer with a small filter size (3,3) and a modest number of filters (32) followed by a max pooling layer.

Use the input as (32,32,3).

The filter maps can then be flattened to provide features to the classifier.

Use a dense layer with 100 units before the classification layer (which is also a dense layer with softmax activation).
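
Before building the model, it can help to work out the shapes and parameter counts by hand. The sketch below does that arithmetic in plain Python. The text does not specify the padding mode or the pooling size, so the example assumes `valid` padding with stride 1 for the convolution, a 2 x 2 pooling window, and 10 output classes; the helper names are made up for this illustration.

```python
def conv2d(shape, filters, kernel=3, padding="valid"):
    """Output shape and parameter count of a stride-1 2D convolution."""
    h, w, c = shape
    if padding == "valid":
        h, w = h - kernel + 1, w - kernel + 1
    params = kernel * kernel * c * filters + filters  # weights + biases
    return (h, w, filters), params


def max_pool(shape, size=2):
    """Non-overlapping pooling: spatial dims are floor-divided by `size`."""
    h, w, c = shape
    return (h // size, w // size, c)


def dense(units_in, units_out):
    """Parameter count of a fully connected layer (weights + biases)."""
    return units_in * units_out + units_out


shape = (32, 32, 3)
shape, conv_params = conv2d(shape, filters=32)   # -> (30, 30, 32)
shape = max_pool(shape)                          # -> (15, 15, 32)
flat = shape[0] * shape[1] * shape[2]            # -> 7200 features
hidden_params = dense(flat, 100)
output_params = dense(100, 10)
total = conv_params + hidden_params + output_params

print(shape, flat)
print(conv_params, hidden_params, output_params, total)
```

If the Keras layers are created under the same assumptions, `model.summary()` should report matching shapes and counts. Almost all of the parameters sit in the first dense layer.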

In [None]:
# Reset Keras global state so earlier models do not linger in memory.
from tensorflow.keras.backend import clear_session
clear_session()




In [None]:
# Your code here :

*   Compile the model using the `categorical_crossentropy` loss and the SGD optimizer, with `'accuracy'` as the metric.
*   Train the model defined above on CIFAR-10 for 50 epochs with a batch size of 512.

In [None]:
# Your code here :

*   Plot the cross-entropy loss curve and the accuracy curve.

In [None]:
# Your code here :

## Defining Deeper Architectures: VGG Models

*   Define a deeper model architecture for the CIFAR-10 dataset and train the new model for 50 epochs with a batch size of 512. We will use a VGG-style architecture.

Stack two convolutional layers with 32 filters, each of 3 x 3.

Add a max pooling layer, then flatten its output and add a dense layer with 128 units before the classification layer.

For all the layers except the final classification layer (which uses softmax), use the ReLU activation function.

Use `same` padding for the convolutional layers to ensure that the height and width of each layer's output match those of its input.
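
The effect of the padding mode on the feature-map size can be checked with a few lines of arithmetic. The function below is an illustrative helper (not a library call) that applies the usual output-size formulas for `same` and `valid` padding along one spatial dimension.

```python
import math


def conv_output_size(size, kernel=3, stride=1, padding="same"):
    """Spatial output size of a convolution along one dimension."""
    if padding == "same":
        return math.ceil(size / stride)
    if padding == "valid":
        return (size - kernel) // stride + 1
    raise ValueError(f"unknown padding: {padding!r}")


# Two stacked 3x3 convolutions on a 32-pixel-wide input.
size_same = size_valid = 32
for _ in range(2):
    size_same = conv_output_size(size_same, padding="same")
    size_valid = conv_output_size(size_valid, padding="valid")

print(size_same, size_valid)  # 32 28
```

With `same` padding the two stacked convolutions keep the 32 x 32 resolution, whereas `valid` padding would shrink it by two pixels per layer.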


In [None]:
# Reset Keras global state so earlier models do not linger in memory.
from tensorflow.keras.backend import clear_session
clear_session()

In [None]:
# Your code here :

*   Compile the model using the `categorical_crossentropy` loss and the SGD optimizer, with `'accuracy'` as the metric.
*   Train the model defined above on CIFAR-10 for 50 epochs with a batch size of 512.

In [None]:
# Your code here :

*   Compare the performance of the two models by plotting the loss and accuracy curves from both training runs. Does the deeper model perform better? Comment on your observations.


In [None]:
# Your code here :

**Comment on the observation**

*(Double-click or enter to edit)*

...

*   Use the `predict` function to predict the outputs for the test split.
*   Plot the confusion matrix for the new model and comment on the class confusions.
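
One possible way to get from predicted probabilities to the class confusions is sketched below in NumPy. The `probs` and `y_onehot` arrays are random stand-ins for the output of `model.predict(x_test)` and the one-hot test labels, so the numbers themselves carry no meaning; only the steps matter.

```python
import numpy as np

rng = np.random.default_rng(0)
num_classes = 10

# Stand-ins for model.predict(x_test) and the one-hot test labels.
logits = rng.normal(size=(500, num_classes))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
y_true = rng.integers(0, num_classes, size=500)
y_onehot = np.eye(num_classes)[y_true]

# Softmax outputs and one-hot labels both reduce to class indices via argmax.
pred_idx = probs.argmax(axis=1)
true_idx = y_onehot.argmax(axis=1)

# Rows are true classes, columns are predicted classes.
cm = np.zeros((num_classes, num_classes), dtype=int)
np.add.at(cm, (true_idx, pred_idx), 1)

# Zero the diagonal to look only at mistakes, then find the largest entry.
errors = cm.copy()
np.fill_diagonal(errors, 0)
true_cls, pred_cls = np.unravel_index(errors.argmax(), errors.shape)
print(f"most frequent confusion: true {true_cls} -> predicted {pred_cls} "
      f"({errors[true_cls, pred_cls]} times)")
```

The matrix `cm` can then be drawn as a heatmap, for example with `plt.imshow(cm)`, and the largest off-diagonal entries are a natural starting point for the comment on class confusions.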


In [None]:
# Your code here :

**Comment here :**

*(Double-click or enter to edit)*

...

*    Print the test accuracy for the trained model.

In [None]:
# Your code here :

## Define the complete VGG architecture.

Stack two convolutional layers with 64 filters, each of 3 x 3, followed by a max pooling layer.

Stack two more convolutional layers with 128 filters, each of 3 x 3, followed by max pooling, followed by two more convolutional layers with 256 filters, each of 3 x 3, followed by max pooling.

Flatten the output of the previous layer and add a dense layer with 128 units before the classification layer.

For all the layers except the final classification layer (which uses softmax), use the ReLU activation function.

Use `same` padding for the convolutional layers to ensure that the height and width of each layer's output match those of its input.

*   Change the input size to 64 x 64 (the 32 x 32 CIFAR-10 images must be resized accordingly).
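
One simple way to obtain 64 x 64 inputs is nearest-neighbour upsampling, sketched below in NumPy. This is an assumption about how the resizing could be done, not the method prescribed by the exercise; `tf.image.resize` or a Keras `Resizing` layer are alternatives. The helper name `upsample_nearest` is made up for this example.

```python
import numpy as np


def upsample_nearest(images, factor=2):
    """Enlarge a batch of images (N, H, W, C) by repeating each pixel."""
    return images.repeat(factor, axis=1).repeat(factor, axis=2)


# Small synthetic batch with CIFAR-10's image shape.
batch = np.arange(4 * 32 * 32 * 3, dtype=np.float32).reshape(4, 32, 32, 3)
resized = upsample_nearest(batch, factor=2)
print(resized.shape)  # (4, 64, 64, 3)
```

Note that the resized images take four times the memory: the 50,000 training images at 64 x 64 x 3 in `float32` occupy roughly 2.5 GB, so resizing inside the model or per batch may be preferable to resizing the whole array up front.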

In [None]:
# Reset Keras global state so earlier models do not linger in memory.
from tensorflow.keras.backend import clear_session
clear_session()

In [None]:
# Your code here :

*   Compile the model using the `categorical_crossentropy` loss and the SGD optimizer, with `'accuracy'` as the metric.
*   Train the model defined above on CIFAR-10 for 10 epochs with a batch size of 512.
*   Predict the outputs for the test split, plot the confusion matrix for the new model, and comment on the class confusions.

In [None]:
# Your code here :

# Understanding deep networks

*   What is the purpose of activation functions in a network? Why are they needed?
*   We used the softmax activation function in this exercise, but other activation functions are available too. What is the difference between sigmoid activation and softmax activation?
*   What is the difference between categorical crossentropy and binary crossentropy loss?
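
A small numerical experiment may help when thinking about the last two questions. The NumPy sketch below applies sigmoid and softmax to the same made-up logits and evaluates both losses for one sample; the function definitions are textbook formulas written out for illustration, not the Keras implementations.

```python
import numpy as np


def sigmoid(z):
    """Squash each logit independently into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def softmax(z):
    """Turn a vector of logits into a distribution that sums to 1."""
    shifted = z - z.max()  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()


def categorical_crossentropy(y_onehot, probs):
    """Loss for one sample with exactly one correct class."""
    return -np.sum(y_onehot * np.log(probs))


def binary_crossentropy(y, probs):
    """Mean loss over independent yes/no outputs."""
    return -np.mean(y * np.log(probs) + (1 - y) * np.log(1 - probs))


logits = np.array([2.0, 1.0, -1.0])
target = np.array([1.0, 0.0, 0.0])

print("sigmoid:", sigmoid(logits).round(3), "sum =", sigmoid(logits).sum().round(3))
print("softmax:", softmax(logits).round(3), "sum =", softmax(logits).sum().round(3))
print("categorical CE:", categorical_crossentropy(target, softmax(logits)).round(3))
print("binary CE:", binary_crossentropy(target, sigmoid(logits)).round(3))
```

The sigmoid outputs do not sum to 1 because each one is computed independently, while the softmax outputs always do because the classes compete for a shared probability mass.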

**Write the answers below :**

1 - Use of activation functions:



_

2 - Key Differences between sigmoid and softmax:



_

3 - Key Differences between categorical crossentropy and binary crossentropy loss:


_
