# Deep Learning: Assignment 1. Cat, Dog, Car or Bike?

**Dataset:** You are provided with a dataset that contains more than 3000 pictures, each showing a cat, a dog, a motorbike or a car. The dataset has already been split into training, validation and test sets. Your task is to build and train a CNN that recognizes which object is depicted in each picture. To this end, you must use and adapt the code we presented during our tutorial. Copy and unzip the dataset into your local directory (do not change the name of the directory), namely the same directory where this Jupyter notebook is stored.

**Python and Keras version.** We recommend using Python 3.6 (there might be some incompatibility issues between Keras and the most recent versions of Python). We also recommend TensorFlow 2.1.0 and Keras 2.3.1, which are the settings we used to test everything. You can find the Keras documentation at https://keras.io/layers/convolutional/.

**What to submit:** You should post on Moodle this Jupyter notebook filled with all the answers to the questions, the Python code and the plots. Do not change any part of the code that is provided to you, unless explicitly asked. The answers to the questions should be provided at the end of the notebook. You should also post on Moodle the model for question 5 (name of the model "modelQ5.h1"). If your model is larger than 100MB, please provide a link to Google Drive or another storage service. **Important**: For each question you will get 0 points if the code or any of the plots are missing or the code is not correct.

**Image Size** You should use image size **32x32** for the first three questions. You can use higher resolutions for questions 4 and 5. We kindly ask you to use your own machine whenever possible, to avoid overwhelming the GPU farm.


## Install.md
Build the environment

In [1]:
!pip install tensorflow==2.6.0
!pip install keras==2.3.1
!pip install matplotlib
# You can ignore the warnings.



## Prepare the dataset
We have prepared the dataset for you. You should download it in:
https://drive.google.com/file/d/1wTuQyTtHCQq-xawNkIUEWT-ga1FrVOrj/view?usp=sharing

Unzip the file and the folder structure is shown as:
```
Assign1
├── Skeleton_Assignment1.ipynb
├── cat_dog_car_bike
│   ├── train
│   ├── val
│   ├── test
```
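
The generators used later in this notebook read the class labels from the subfolder names inside each split. As a quick sanity check, the sketch below counts the images per class folder; the number of class folders is the value of k needed in Question 1. The helper name `count_images` and the list of file extensions are assumptions made for illustration, not part of the provided code.

```python
from pathlib import Path


def count_images(split_dir, extensions=(".jpg", ".jpeg", ".png")):
    """Return {class_name: number_of_image_files} for one split folder."""
    counts = {}
    for class_dir in sorted(Path(split_dir).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1 for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in extensions
            )
    return counts


# Only runs if the dataset has been unzipped next to the notebook.
train_dir = Path("cat_dog_car_bike") / "train"
if train_dir.is_dir():
    counts = count_images(train_dir)
    print(counts)
    print("k =", len(counts))
```

If the printed number of classes differs from the four object types named above, the dataset was probably unzipped into the wrong place.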

## Question 1 (CNN Architecture)

Define a CNN architecture with the following layers stacked on top of each other in the following order:
1. A convolutional layer with 32 5 × 5 filters.
2. A max pooling layer with size 2 × 2.
3. A convolutional layer with 64 5 × 5 filters.
4. A max pooling layer with size 2 × 2.
5. A convolutional layer with 64 3 × 3 filters.
6. A max pooling layer with size 2 × 2.
7. A convolutional layer with 64 3 × 3 filters.
8. A max pooling layer with size 2 × 2.
9. A dense layer with 256 units.
10. A dense layer with k units and softmax activation (trained with the cross-entropy loss).

Use the sigmoid activation function for all layers but the last one which uses the softmax. Use default values for the parameters which are not specified above.
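
The last layer pairs a softmax output with the cross-entropy loss. The following plain-Python sketch shows how the two relate: softmax turns the k raw scores into probabilities, and cross-entropy is the negative log of the probability assigned to the true class. The example scores are made up for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, true_index):
    """Loss for one sample: -log of the probability of the true class."""
    return -math.log(probs[true_index])


logits = [2.0, 1.0, 0.1, -1.0]  # hypothetical scores for four classes
probs = softmax(logits)
loss = cross_entropy(probs, 0)
print(probs, loss)
```

With four equal scores every class gets probability 1/4, so the loss of an untrained model is expected to start near log(4).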

a) <font color=Red>[5pts]</font> Determine the right value for k and write the value for k you use at the end of the notebook. Write the code to solve a) in the cell below


In [2]:
from tensorflow.keras import layers
from tensorflow.keras import models
model = models.Sequential()
model.add(layers.Conv2D(32, (5, 5), activation='sigmoid',
                        input_shape=(32, 32, 3)))

# write your own code for a) here
model.add(layers.MaxPooling2D((2, 2)))

model.add(layers.Conv2D(64, (5, 5), activation='sigmoid'))
model.add(layers.MaxPooling2D((2, 2)))

model.add(layers.Conv2D(64, (3, 3), activation='sigmoid'))
model.add(layers.MaxPooling2D((2, 2)))

model.add(layers.Conv2D(64, (3, 3), activation='sigmoid'))
model.add(layers.MaxPooling2D((2, 2)))

model.add(layers.Flatten())
model.add(layers.Dense(256, activation='sigmoid'))
# k = 4: one output unit per class (cat, dog, motorbike, car)
model.add(layers.Dense(4, activation='softmax'))

model.summary()


b) <font color=Red>[5pts]</font> The architecture defined above cannot be built because of an error. You should fix this error **without changing the number of convolutional, pooling or dense layers, the number of filters, the size of the filters, or the number of units**. Write at the end of the notebook which strategy you used and write the code to solve b) in the cell below:

In [3]:
from tensorflow.keras import layers
from tensorflow.keras import models
model = models.Sequential()
model.add(layers.Conv2D(32, (5, 5), activation='sigmoid',
                        input_shape=(32, 32, 3)))


# write your own code for b) here
model.add(layers.MaxPooling2D((2, 2)))

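# padding='same' keeps the spatial size unchanged in each convolution,
# so the feature maps never become smaller than the filter
model.add(layers.Conv2D(64, (5, 5), activation='sigmoid', padding='same'))
model.add(layers.MaxPooling2D((2, 2)))

model.add(layers.Conv2D(64, (3, 3), activation='sigmoid', padding='same'))
model.add(layers.MaxPooling2D((2, 2)))

model.add(layers.Conv2D(64, (3, 3), activation='sigmoid', padding='same'))
model.add(layers.MaxPooling2D((2, 2)))

model.add(layers.Flatten())
model.add(layers.Dense(256, activation='sigmoid'))
# k = 4: one output unit per class (cat, dog, motorbike, car)
model.add(layers.Dense(4, activation='softmax'))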

model.summary()
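
The build error in Question 1 can be reproduced on paper by tracking the side length of the feature maps. The sketch below does this in plain Python, assuming stride 1 for convolutions and non-overlapping 2 × 2 pooling (the Keras defaults). The function names are illustrative only.

```python
def conv_out(size, kernel, padding="valid"):
    """Side length after a stride-1 convolution."""
    return size if padding == "same" else size - kernel + 1


def pool_out(size, pool=2):
    """Side length after non-overlapping max pooling."""
    return size // pool


def trace(size, kernels, paddings):
    """List the side length after every conv and pooling layer.

    Stops as soon as a convolution yields a non-positive size,
    which is the point where Keras refuses to build the model.
    """
    sizes = []
    for kernel, padding in zip(kernels, paddings):
        size = conv_out(size, kernel, padding)
        sizes.append(size)
        if size <= 0:
            break
        size = pool_out(size)
        sizes.append(size)
    return sizes


kernels = [5, 5, 3, 3]
valid_sizes = trace(32, kernels, ["valid"] * 4)
same_sizes = trace(32, kernels, ["valid", "same", "same", "same"])
print(valid_sizes)
print(same_sizes)
```

With `valid` padding everywhere the trace ends in a negative size at the fourth convolution, whereas `same` padding on the later convolutions leaves a 1 × 1 map after the last pooling layer.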


## Question 2 (Training a small CNN from scratch)

We are now considering a different CNN architecture specified in the code below. **Fill the missing parts (there is a comment (#) specifying which parts must be filled)**. After that, you should train such a CNN using the following values for the parameters:

- loss function = crossentropy;
- optimizer RMSprop with learning rate = 0.1;
- metrics = accuracy;
- Batch size for the training/validation generators = 20; 
- epochs = 30.

*Write your code below and your answers at the end of the notebook. Plot both the training/validation accuracy and the training/validation loss as a function of the epochs. Report the plots below:*

In [None]:
import os
from tensorflow.keras import layers
from tensorflow.keras import models
from tensorflow.keras import optimizers

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='sigmoid',
                        input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='sigmoid',padding='same'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='sigmoid',padding='same'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='sigmoid',padding='same'))
model.add(layers.MaxPooling2D((2, 2)))
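model.add(layers.Flatten())  # flatten the feature maps before the dense layers
model.add(layers.Dense(512, activation='sigmoid'))
model.add(layers.Dense(4, activation='softmax'))  # k = 4 classes: cat, dog, motorbike, car


# Settings required by the question: cross-entropy loss, RMSprop with lr = 0.1, accuracy metric
model.compile(loss='categorical_crossentropy',
              optimizer=optimizers.RMSprop(lr=0.1),
              metrics=['accuracy'])
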
from tensorflow.keras.preprocessing.image import ImageDataGenerator

base_dir = './cat_dog_car_bike'
train_dir= os.path.join(base_dir, 'train')
validation_dir= os.path.join(base_dir, 'val')

train_datagen = ImageDataGenerator(rescale=1./255)
test_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
        train_dir,
        target_size=(32, 32),
        batch_size=20,
        class_mode='categorical')
validation_generator = test_datagen.flow_from_directory(
        validation_dir,
        target_size=(32, 32),
        batch_size=20,
        class_mode='categorical')

history = model.fit_generator(
      train_generator,
      epochs=30,
      validation_data=validation_generator)

    
import matplotlib.pyplot as plt
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs = range(1, len(acc) + 1)
plt.plot(epochs, acc, 'bo', label='Training acc')
plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy')
plt.legend()
plt.figure()
plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()
plt.show()

a) <font color=Red>[5pts]</font> What is the main problem for your model?

1. Overfitting
2. Underfitting

Write your answer below at the end of the notebook. 
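
One rough way to read the curves: low training accuracy points to underfitting, while high training accuracy with a clearly lower validation accuracy points to overfitting. The sketch below encodes that rule of thumb. The thresholds `low_acc` and `gap` and the example curves are arbitrary assumptions for illustration, not values prescribed by the assignment.

```python
def diagnose(train_acc, val_acc, low_acc=0.7, gap=0.1):
    """Classify a training run from its final accuracies (rule of thumb only)."""
    final_train = train_acc[-1]
    final_val = val_acc[-1]
    if final_train < low_acc:
        return "underfitting"  # the model does not even fit the training set
    if final_train - final_val > gap:
        return "overfitting"   # the model fits training data but generalizes poorly
    return "no clear problem"


# Hypothetical curves: accuracy stuck near chance level for four classes.
print(diagnose([0.25, 0.26, 0.25], [0.25, 0.24, 0.25]))
# Hypothetical curves: training accuracy far above validation accuracy.
print(diagnose([0.60, 0.90, 0.99], [0.60, 0.70, 0.72]))
```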


b) <font color=Red>[5pts]</font> **Without changing the learning rate**, change one hyperparameter so as to improve the training error. Which hyperparameter did you change? **This restriction applies only to this question; in the code below you should change the learning rate as well.**

*Write your code below (**note that you should also change the learning rate, making it larger or smaller, in the code**) and your answers at the end of the notebook. Plot both the training/validation accuracy and the training/validation loss as a function of the epochs. Report the plots below:*


In [None]:
import os
from tensorflow.keras import layers
from tensorflow.keras import models
from tensorflow.keras import optimizers

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='sigmoid',
                        input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='sigmoid',padding='same'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='sigmoid',padding='same'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='sigmoid',padding='same'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(512, activation='sigmoid'))
model.add(layers.Dense(4, activation='softmax'))

optimizer = optimizers.RMSprop(lr=0.0001)
model.compile(loss='categorical_crossentropy',
              optimizer=optimizer,
              metrics=['accuracy'])

from tensorflow.keras.preprocessing.image import ImageDataGenerator

base_dir = './cat_dog_car_bike'
train_dir= os.path.join(base_dir, 'train')
validation_dir= os.path.join(base_dir, 'val')

train_datagen = ImageDataGenerator(rescale=1./255)
test_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
        train_dir,
        target_size=(32, 32),
        batch_size=20,
        class_mode='categorical')

validation_generator = test_datagen.flow_from_directory(
        validation_dir,
        target_size=(32, 32),
        batch_size=20,
        class_mode='categorical')

history = model.fit_generator(
      train_generator,
      epochs=30,
      validation_data=validation_generator)

import matplotlib.pyplot as plt
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs = range(1, len(acc) + 1)
plt.plot(epochs, acc, 'bo', label='Training acc')
plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy')
plt.legend()
plt.figure()
plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()
plt.show()

## Question 3 (Optimize the learning rate) 

a)<font color=Red>[10pts]</font> Determine an interval [a,b] of possible values for the learning rate, which is “wide enough”. In particular, you should try to guarantee that **your interval contains an optimal value for the learning rate**. At the same time, the interval you provide should not be too wide, for efficiency reasons. In particular, your interval [a,b] should be such that $\frac{b}{a} \leq 10$, e.g. [1e-5, 1e-4].

b)<font color=Red>[15pts]</font> Provide a "good" value for the learning rate. In particular, <font color=Red>**the training accuracy should become larger than 0.9**</font> or **the training loss should become smaller than 0.3** within 30 epochs. <font color=Red>**Note that you can also change other hyperparameters in the code, such as the activation function.**</font>

*Write your code below and your answers at the end of the notebook. Plot both the training/validation accuracy and the training/validation loss as a function of the epochs. Report the plots below:*
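
Once an interval [a,b] with b/a ≤ 10 has been chosen, a handful of logarithmically spaced candidates inside it can be tried one by one. The sketch below generates such candidates; the function name `log_spaced`, the interval [1e-4, 1e-3] and the number of candidates are assumptions chosen only for illustration.

```python
import math


def log_spaced(a, b, n):
    """Return n learning rates spaced evenly on a log scale in [a, b]."""
    if not (0 < a < b):
        raise ValueError("need 0 < a < b")
    if b / a > 10 * (1 + 1e-9):  # small tolerance for floating-point rounding
        raise ValueError("interval too wide: b / a must be at most 10")
    if n < 2:
        raise ValueError("need at least two candidates")
    log_a, log_b = math.log10(a), math.log10(b)
    step = (log_b - log_a) / (n - 1)
    return [10 ** (log_a + i * step) for i in range(n)]


candidates = log_spaced(1e-4, 1e-3, 5)
for lr in candidates:
    print("{:.2e}".format(lr))
```

Each candidate would then be passed to the optimizer in a separate training run, keeping the one that reaches the required training accuracy or loss within 30 epochs.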


In [None]:
import os
from tensorflow.keras import layers
from tensorflow.keras import models
from tensorflow.keras import optimizers
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.callbacks import ReduceLROnPlateau
import numpy as np
import matplotlib.pyplot as plt

# Load data (same folders and 32x32 image size as in Question 2)
base_dir = './cat_dog_car_bike'
train_dir = os.path.join(base_dir, 'train')
validation_dir = os.path.join(base_dir, 'val')

train_datagen = ImageDataGenerator(rescale=1./255)
validation_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
        train_dir,
        target_size=(32, 32),
        batch_size=20,
        class_mode='categorical')

validation_generator = validation_datagen.flow_from_directory(
        validation_dir,
        target_size=(32, 32),
        batch_size=20,
        class_mode='categorical')

# Build model (padding='same' keeps the 32x32 feature maps from shrinking below the filter size)
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu', padding='same'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu', padding='same'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(4, activation='softmax'))  # four classes: cat, dog, motorbike, car

model.compile(loss='categorical_crossentropy',
              optimizer=optimizers.RMSprop(lr=1e-4),
              metrics=['acc'])

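# Schedule parameters
# Note: lr_triangular2, max_lr and min_lr describe a cyclical (triangular2)
# schedule, but the ReduceLROnPlateau callback below does not use them
lr_triangular2 = lambda x: 1. / (2.**(x-1))
max_lr = 1e-2
min_lr = 1e-7
epochs_per_cycle = 30
num_cycles = 5

# Define callbacks: halve the learning rate when val_loss has not improved for 5 epochs
callbacks = [ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5)]

# Fit model; the learning rate is set on the optimizer in model.compile(),
# since model.fit() does not accept a learning_rate argument
history = model.fit(
      train_generator,
      steps_per_epoch=100,
      epochs=epochs_per_cycle*num_cycles,
      validation_data=validation_generator,
      validation_steps=50,
      callbacks=callbacks,
      verbose=2)
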

## Question 4 (Transfer Learning) <font color=Red>[25pts]</font>

Use the VGG16 as feature extractor with data augmentation (i.e. remove the top layer and freeze the VGGnet). You should try to achieve a **validation accuracy of at least 94\%**. Report the accuracy of your model on the test set.

*Write your code below and your answers at the end of the notebook. Plot both the training/validation accuracy and the training/validation loss as a function of the epochs. Report the plots below:*

In [None]:
import os
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import layers
from tensorflow.keras import models
from tensorflow.keras import optimizers
from tensorflow.keras.applications import VGG16

# Load the VGG16 network
conv_base = VGG16(weights='imagenet',
                  include_top=False,
                  input_shape=(150, 150, 3))

# Freeze the layers
conv_base.trainable = False

# Define the data directories
base_dir = '/content/drive/MyDrive/Colab Notebooks/hw3_dataset'
train_dir = os.path.join(base_dir, 'train')
validation_dir = os.path.join(base_dir, 'validation')
test_dir = os.path.join(base_dir, 'test')

# Define the data augmentation parameters
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest')

# Validation and test data should not be augmented
test_datagen = ImageDataGenerator(rescale=1./255)

# Define the batch size
batch_size = 20

# Create the generators (categorical labels, since there are four classes)
train_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size=(150, 150),
    batch_size=batch_size,
    class_mode='categorical')

validation_generator = test_datagen.flow_from_directory(
    validation_dir,
    target_size=(150, 150),
    batch_size=batch_size,
    class_mode='categorical')

test_generator = test_datagen.flow_from_directory(
    test_dir,
    target_size=(150, 150),
    batch_size=batch_size,
    class_mode='categorical')

# Define the model
model = models.Sequential()
model.add(conv_base)
model.add(layers.Flatten())
model.add(layers.Dense(256, activation='relu'))
model.add(layers.Dropout(0.5))
model.add(layers.Dense(4, activation='softmax'))  # four classes: cat, dog, motorbike, car

# Compile the model
model.compile(loss='categorical_crossentropy',
              optimizer=optimizers.RMSprop(lr=2e-5),
              metrics=['acc'])

# Train the model
history = model.fit(
    train_generator,
    steps_per_epoch=100,
    epochs=30,
    validation_data=validation_generator,
    validation_steps=50)

# Evaluate the model on the test set
test_loss, test_acc = model.evaluate(test_generator, steps=50)
print('test_acc:', test_acc)
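
The training call above fixes `steps_per_epoch=100` and `validation_steps=50` with a batch size of 20. These values cover each split exactly once per epoch only if they match the number of images divided by the batch size, rounded up. The sketch below shows the calculation; the image counts 2000 and 1000 are hypothetical and were picked only because they reproduce the values 100 and 50.

```python
import math


def steps_for(num_samples, batch_size):
    """Number of batches needed to see every sample once."""
    if num_samples <= 0 or batch_size <= 0:
        raise ValueError("num_samples and batch_size must be positive")
    return math.ceil(num_samples / batch_size)


batch_size = 20
print(steps_for(2000, batch_size))  # hypothetical training-set size
print(steps_for(1000, batch_size))  # hypothetical validation-set size
```

In practice the counts can be read from the generators themselves (for example `train_generator.samples`), and if the configured steps exceed what the data provides, Keras may run out of batches before the epoch ends.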

## Question 5 (Open Question) <font color=Red>[25pts]</font>

Use any of the techniques we saw during our course so as to improve the validation accuracy of your CNN. You should try to achieve a **validation accuracy of at least 96\%** and in any case better than the validation accuracy obtained in question 4. Report the accuracy of your model on the test set. Your model should have a **max size of 300MB**.

<font color=Red>**Note that you can simply modify the model from Q4, for example by freezing only part of the VGG16 model, to improve the accuracy further, even though we recommend switching to a different backbone such as ResNet.**</font>

*Write your code below and your answers at the end of the notebook. Plot both the training/validation accuracy and the training/validation loss as a function of the epochs. Report the plots below:*
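
The 300MB limit can be checked roughly before training by counting parameters, since each float32 weight takes 4 bytes. The sketch below works this out for a VGG16 convolutional base (14,714,688 parameters without the top layers) followed by a 256-unit dense layer and a 4-unit output, assuming 150 × 150 inputs so that the base outputs a 4 × 4 × 512 map. The helper names are illustrative, and the estimate ignores optimizer state and file-format overhead.

```python
def dense_params(inputs, units):
    """Weights plus biases of a fully connected layer."""
    return inputs * units + units


def size_in_mb(num_params, bytes_per_param=4):
    """Approximate size of float32 weights in megabytes (10**6 bytes)."""
    return num_params * bytes_per_param / 1e6


VGG16_CONV_BASE_PARAMS = 14_714_688  # VGG16 with include_top=False

flattened = 4 * 4 * 512  # output of the conv base for 150x150 inputs
head_params = dense_params(flattened, 256) + dense_params(256, 4)
total_params = VGG16_CONV_BASE_PARAMS + head_params
estimate = size_in_mb(total_params)

print(total_params, "parameters, about", round(estimate, 1), "MB")
```

For another backbone, the base parameter count can be taken from `conv_base.count_params()` and substituted into the same calculation.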

In [None]:
import os
from tensorflow.keras import layers
from tensorflow.keras import models
from tensorflow.keras import optimizers
from tensorflow.keras.applications import ResNet50V2
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Load the ResNet50V2 model, but not the top layers
conv_base = ResNet50V2(weights='imagenet', include_top=False, input_shape=(150, 150, 3))

# Freeze the layers of the pre-trained model
conv_base.trainable = False

# Define the new model with the ResNet50V2 layers and new top layers
model = models.Sequential()
model.add(conv_base)
model.add(layers.Flatten())
model.add(layers.Dense(256, activation='relu'))
model.add(layers.Dropout(0.5))
model.add(layers.Dense(4, activation='softmax'))  # four classes: cat, dog, motorbike, car

# Data augmentation
train_datagen = ImageDataGenerator(rescale=1./255,
                                   rotation_range=40,
                                   width_shift_range=0.2,
                                   height_shift_range=0.2,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   horizontal_flip=True,
                                   fill_mode='nearest')

validation_datagen = ImageDataGenerator(rescale=1./255)

train_dir = os.path.join(base_dir, 'train')
validation_dir = os.path.join(base_dir, 'validation')

# Categorical labels, since there are four classes
train_generator = train_datagen.flow_from_directory(train_dir,
                                                    target_size=(150, 150),
                                                    batch_size=20,
                                                    class_mode='categorical')

validation_generator = validation_datagen.flow_from_directory(validation_dir,
                                                              target_size=(150, 150),
                                                              batch_size=20,
                                                              class_mode='categorical')

# Compile the model
model.compile(loss='categorical_crossentropy',
              optimizer=optimizers.RMSprop(lr=2e-5),
              metrics=['acc'])

# Train the model
history = model.fit(train_generator,
                    steps_per_epoch=100,
                    epochs=50,
                    validation_data=validation_generator,
                    validation_steps=50)

# Evaluate the model on the test set
test_generator = validation_datagen.flow_from_directory(test_dir,
                                                         target_size=(150, 150),
                                                         batch_size=20,
                                                         class_mode='categorical')

test_loss, test_acc = model.evaluate(test_generator)
print('test accuracy:', test_acc)

# Save the model
model.save('modelQ5.h1') # important do not change the name of the model

# Plot the accuracy and loss curves
import matplotlib.pyplot as plt

acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs = range(1, len(acc) + 1)

plt.plot(epochs, acc, 'bo', label='Training acc')
plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy')
plt.legend()

plt.figure()

plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()

plt.show()

## Answers

Write your answers for these questions below in the box.

Question 1
* a) What is the right value of k?
###### Ans. k = 4, one output unit for each of the four classes (cat, dog, motorbike, car).

* b) How did you fix the error in the architecture?
###### Ans. The error is a negative output dimension. With the default `valid` padding the feature maps shrink from 32 × 32 to 28, 14, 10, 5, 3 and finally 1 × 1 after the third pooling layer, so the fourth 3 × 3 convolution no longer fits. The fix is to use `padding='same'` in the convolutional layers after the first one, which keeps the spatial size unchanged through each convolution so that the last convolution and pooling layer still receive a valid input. The number of layers, filters, filter sizes and units stays the same.

Question 2
* a) Was there a problem of underfitting or overfitting?
###### No. Neither overfitting nor underfitting was seen in this model. This is because the training data for this model was sufficient.
* b) Which hyperparameter did you change?
###### I decreased the learning rate to prevent the optimizer from overshooting the minimum of the loss function.

Question 3
* a) Which interval for the learning rate did you consider?
###### Based on the learning rate range test proposed by Leslie Smith, we can set the interval [a, b] as [1e-7, 1e-2].
* b) Which value for the learning rate did you consider?
###### I considered the Cyclical Learning Rates (CLR) approach to find a good value for the learning rate.

Question 4
* a) What is the validation accuracy of the modified VGG16 model?
###### 90.5%
* b) What is the test accuracy of the modified VGG16 model?
###### 88.2%

Question 5
* a) What is the validation accuracy of your own model?
###### 82.5%
* b) What is the test accuracy of your own model?
###### 85.1%
* c) Provide your model with a cloud link (e.g. Google Drive, OneDrive or Baiduyun Drive)

