# Assignment II - Deep Learning for Computer Vision

This assignment covers neural networks and computer vision, the topics of weeks III to VI.

## Description



You are given a dataset containing images of cats and dogs. Your goal is to classify them correctly. The data has been partly preprocessed for your convenience (this is a data science course, not a programming course!).

The data is split into four NumPy arrays: x_train, y_train, x_test, and y_test. These were preprocessed from .jpeg images. Note that the pixel values range from 0 to 255; consider whether any further preprocessing is needed.
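
One common preprocessing option is to rescale the pixels to [0, 1] before training. The sketch below assumes the arrays hold 8-bit pixel values in an (N, H, W, C) layout; the helper name `scale_pixels` and the tiny `demo` array are illustrative, not part of the assignment.

```python
import numpy as np


def scale_pixels(x):
    """Map pixel values from [0, 255] to float32 values in [0, 1]."""
    return np.asarray(x, dtype=np.float32) / 255.0


# Tiny stand-in for x_train: two 2x2 RGB "images" (assumed layout: N, H, W, C).
demo = np.array([[[[0, 128, 255]] * 2] * 2] * 2, dtype=np.uint8)
scaled = scale_pixels(demo)
print(scaled.shape, scaled.dtype, scaled.min(), scaled.max())
```

If you use a pre-trained network later, check which input range it expects, since some models assume a different scaling.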

You are required to train your model on (x_train, y_train) and finally report your accuracy score on (x_test, y_test). Of course, you are encouraged to further split (x_train, y_train) into train and validation datasets.
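
A minimal sketch of such a split, using only NumPy. The function name `train_val_split`, the 20% validation fraction, and the fixed seed are assumptions chosen for illustration; the test arrays are never touched.

```python
import numpy as np


def train_val_split(x, y, val_fraction=0.2, seed=0):
    """Shuffle the samples once and hold out `val_fraction` of them for validation."""
    x, y = np.asarray(x), np.asarray(y)
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(x))           # shuffled sample indices
    n_val = int(round(len(x) * val_fraction))
    val_idx, train_idx = order[:n_val], order[n_val:]
    return x[train_idx], y[train_idx], x[val_idx], y[val_idx]


# Stand-in data: 10 samples whose label is the parity of the single feature.
x_demo = np.arange(10).reshape(10, 1)
y_demo = np.arange(10) % 2
x_tr, y_tr, x_val, y_val = train_val_split(x_demo, y_demo)
print(x_tr.shape, x_val.shape)
```

If scikit-learn is available, `sklearn.model_selection.train_test_split` with `stratify=y` does the same job and also keeps the class balance equal in both parts.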

Conduct the following experiments:

1.   Begin with a simple model: four 2D convolutional layers followed by two fully connected layers (one hidden layer and one output layer). Between these layers, you can use dropout, max-pooling, batch normalization, or any other non-trainable layer we've seen for regularization and dimensionality reduction. You can choose whatever sizes/numbers of filters/neurons you want.
2.   Build your own CNN. There are no constraints on the architecture, so just enjoy yourself and be as creative as you want.
3. Transfer learning: load a pre-trained model from [Keras applications](https://keras.io/api/applications/). Since these models were trained on ImageNet, their weights might be useful. Add a fully connected layer on top of the pre-trained model and train it. Make sure you train only your FC layer and not the pre-trained model!
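
Experiments 1 and 3 could be sketched roughly as follows. Everything specific here is an assumption made for illustration: the 64x64x3 input shape, the filter and neuron counts, binary labels encoded as 0/1 with a single sigmoid output, raw [0, 255] inputs, and MobileNetV2 as the pre-trained model. Treat it as a starting point rather than the expected solution.

```python
import tensorflow as tf
from tensorflow.keras import layers

INPUT_SHAPE = (64, 64, 3)  # assumption: replace with x_train.shape[1:]


def build_basic_cnn(input_shape=INPUT_SHAPE):
    """Experiment 1: four Conv2D layers, then one hidden and one output Dense layer."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=input_shape),
        layers.Rescaling(1.0 / 255),  # assumes raw [0, 255] inputs; drop if already scaled
        layers.Conv2D(16, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(32, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, padding="same", activation="relu"),
        layers.Flatten(),
        layers.Dropout(0.3),
        layers.Dense(64, activation="relu"),    # hidden fully connected layer
        layers.Dense(1, activation="sigmoid"),  # output: probability of label 1
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model


def build_transfer_model(input_shape=INPUT_SHAPE, weights="imagenet"):
    """Experiment 3: frozen pre-trained base with one trainable Dense layer on top."""
    base = tf.keras.applications.MobileNetV2(
        include_top=False, weights=weights, input_shape=input_shape, pooling="avg"
    )
    base.trainable = False  # freeze every pre-trained weight

    inputs = tf.keras.Input(shape=input_shape)
    x = layers.Rescaling(1.0 / 127.5, offset=-1.0)(inputs)  # [0, 255] -> [-1, 1]
    x = base(x, training=False)  # keep batch-norm statistics fixed
    outputs = layers.Dense(1, activation="sigmoid")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model
```

Calling `build_transfer_model()` downloads the ImageNet weights on first use. After building it, `len(model.trainable_weights)` should be 2 (the kernel and bias of the new Dense layer), which is a quick way to confirm that the base really is frozen.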



## Submission guidelines







The guidelines are practically the same as in Assignment I.

**This is a group project (3-4 members)**. Submissions must include the names of all students in the first paragraph.

Once you are done with the coding part on Google Colab, click "Clear all outputs", followed by "Restart and run all". This verifies that your code runs smoothly. You do not need to retrain the model each time; you can save the weights and just load the trained model.
**Please** make sure that the results and outputs are visible, and that the code is well documented with comments describing your actions.
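
One way to avoid retraining on every full run is to load a saved model when it exists and train only otherwise. The helper below is a hypothetical sketch: `load_or_train`, `train_fn`, and `load_fn` are names introduced here, and it only assumes that the trained model has a `save(path)` method, as Keras models do.

```python
import os


def load_or_train(path, train_fn, load_fn):
    """Return the model saved at `path` if it exists; otherwise train, save, and return it."""
    if os.path.exists(path):
        return load_fn(path)
    model = train_fn()   # expensive step, skipped on later runs
    model.save(path)     # persist so that "run all" can reload it
    return model


# Hypothetical Keras usage (train_my_model is a placeholder you would write):
# import tensorflow as tf
# model = load_or_train("basic_cnn.keras", train_my_model, tf.keras.models.load_model)
```

The `.keras` file extension is an assumption that fits recent Keras versions; older versions may need `.h5` instead.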

The submission is done using GitHub. One of the team members has to open a GitHub account and create a repository for the assignment.
Then upload the notebook to the GitHub repo. In Moodle, all you need to do is submit a link to the GitHub repo.


You are required to submit two different files: 
1.   Your Google Colab notebook with the code. 
2.   A CSV file called '**exercise2.csv**' containing the following fields: 
      *   Your **test** set accuracy
      *   Your **train** set accuracy
      *   Number of trainable parameters
      *   Number of layers
      *   Regularization methods (e.g. dropout, batch normalization etc., just list them all).
      *   Number of epochs
      *   Choice of loss function
      *   Choice of optimizer
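
The parameter and layer counts requested above can be read off a built model. The sketch below assumes an object exposing `trainable_weights` and `layers`, as Keras models do; `summarize_model` is a name introduced here, and the stand-in object only keeps the example runnable without Keras.

```python
from types import SimpleNamespace

import numpy as np


def summarize_model(model):
    """Count trainable parameters and layers of a Keras-like model."""
    n_trainable = sum(int(np.prod(tuple(w.shape))) for w in model.trainable_weights)
    return {
        "Number of trainable parameters": n_trainable,
        "Number of layers": len(model.layers),
    }


# Stand-in for a real model: one dense layer with a 3x4 kernel and a bias of size 4.
fake_model = SimpleNamespace(
    trainable_weights=[np.zeros((3, 4)), np.zeros((4,))],
    layers=["dense"],
)
summary = summarize_model(fake_model)
print(summary)
```

With a frozen pre-trained base, only the weights of your own layers appear in `trainable_weights`, so the count stays small even when the full model is large.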

  



You may use the following code to generate the CSV file:

```python
import pandas as pd
# Keep keys the same, and replace values according to your results and the specified type

results = {'model': ['Basic CNN', 'My Model', 'Name of pre-trained model from keras applications'],
           'Test score (Accuracy)': ['string1', 'string2', 'string3'],
           'Train score (Accuracy)': ['string1', 'string2', 'string3'],
           'Number of trainable parameters': ['int1', 'int2', 'int3'],
           'Number of layers': ['int1', 'int2', 'int3'],
           'Regularization methods': ['string1', 'string2', 'string3'],
           'Number of epochs': ['int1', 'int2', 'int3'],
           'Loss function': ['string1', 'string2', 'string3'],
           'Optimizer': ['string1', 'string2', 'string3']
           }

df = pd.DataFrame(results)
df
```

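|   | model | Test score (Accuracy) | Train score (Accuracy) | Number of trainable parameters | Number of layers | Regularization methods | Number of epochs | Loss function | Optimizer |
|---|-------|-----------------------|------------------------|--------------------------------|------------------|------------------------|------------------|---------------|-----------|
| 0 | Basic CNN | string1 | string1 | int1 | int1 | string1 | int1 | string1 | string1 |
| 1 | My Model | string2 | string2 | int2 | int2 | string2 | int2 | string2 | string2 |
| 2 | Name of pre-trained model from keras applications | string3 | string3 | int3 | int3 | string3 | int3 | string3 | string3 |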


```python
import os
# Save under the file name required by the submission guidelines
df.to_csv(os.path.join(os.getcwd(), 'exercise2.csv'))
```

## Tips and tricks

*   When computing the accuracy, don't forget to turn probabilities into labels! For example, if you're using softmax in the output layer: use argmax(); if you're using sigmoid: you can use round().
*   You should reach a validation accuracy of at least 70% in all three cases. It is possible to achieve accuracy higher than 95% in the latter two cases.
*   Use GPUs if you can; just don't forget to turn them off when you are done.
*   Around 20 epochs should be enough.
*   When using model checkpointing or saving models, don't forget to back up your results in Google Drive. Colab sessions can sometimes crash or disconnect, and you wouldn't want to lose your results.
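
The first tip, turning probabilities into labels before computing accuracy, can be sketched with NumPy alone. The function names and the small demo arrays are illustrative assumptions. An explicit 0.5 threshold is used for the sigmoid case because `np.round` rounds exactly 0.5 down to 0, which an explicit comparison makes visible.

```python
import numpy as np


def probs_to_labels(probs, threshold=0.5):
    """Convert model outputs to integer class labels.

    A 2-D array with more than one column is treated as softmax output (argmax);
    anything else is treated as sigmoid output (thresholding).
    """
    probs = np.asarray(probs)
    if probs.ndim == 2 and probs.shape[1] > 1:
        return probs.argmax(axis=1)
    return (probs.reshape(-1) >= threshold).astype(int)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    y_true = np.asarray(y_true).reshape(-1)
    y_pred = np.asarray(y_pred).reshape(-1)
    return float(np.mean(y_true == y_pred))


y_true = np.array([0, 1, 0])
softmax_out = np.array([[0.9, 0.1], [0.2, 0.8], [0.4, 0.6]])
sigmoid_out = np.array([[0.1], [0.8], [0.6]])

print(accuracy(y_true, probs_to_labels(softmax_out)))
print(accuracy(y_true, probs_to_labels(sigmoid_out)))
```

Both calls describe the same predictions, so they should report the same accuracy.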

