# Homework

In [1]:
import numpy as np
import tensorflow as tf

SEED = 42
np.random.seed(SEED)
tf.random.set_seed(SEED)

### Model

For this homework we will use Convolutional Neural Network (CNN). Like in the lectures, we'll use Keras.

You need to develop the model with following structure:

* The shape for input should be `(200, 200, 3)`
* Next, create a convolutional layer ([`Conv2D`](https://keras.io/api/layers/convolution_layers/convolution2d/)):
    * Use 32 filters
    * Kernel size should be `(3, 3)` (that's the size of the filter)
    * Use `'relu'` as activation 
* Reduce the size of the feature map with max pooling ([`MaxPooling2D`](https://keras.io/api/layers/pooling_layers/max_pooling2d/))
    * Set the pooling size to `(2, 2)`
* Turn the multi-dimensional result into vectors using a [`Flatten`](https://keras.io/api/layers/reshaping_layers/flatten/) layer
* Next, add a `Dense` layer with 64 neurons and `'relu'` activation
* Finally, create the `Dense` layer with 1 neuron - this will be the output
    * The output layer should have an activation - use the appropriate activation for the binary classification case

As optimizer use [`SGD`](https://keras.io/api/optimizers/sgd/) with the following parameters:

* `SGD(lr=0.002, momentum=0.8)`

For clarification about kernel size and max pooling, check [Office Hours](https://www.youtube.com/watch?v=1WRgdBTUaAc).


In [2]:
from tensorflow import keras

In [3]:
model = keras.models.Sequential()

# Convolutional Layer (Conv2D)
model.add(keras.layers.Conv2D(filters=32, kernel_size=(3, 3), activation='relu', input_shape=(200, 200, 3)))

# Reduce the size of the feature map with max pooling (MaxPooling2D)
model.add(keras.layers.MaxPooling2D(pool_size=(2, 2)))

# Turn the multi-dimensional result into vectors using a Flatten Layer
model.add(keras.layers.Flatten())

# Add a Dense layer with 64 neurons and 'relu' activation
model.add(keras.layers.Dense(64, activation='relu'))

# Add a Dense layer with 1 neuron
model.add(keras.layers.Dense(1, activation='sigmoid'))

model.compile(loss=keras.losses.BinaryCrossentropy(), optimizer=keras.optimizers.SGD(learning_rate=0.002, momentum=0.8),metrics=['accuracy'])

  super().__init__(activity_regularizer=activity_regularizer, **kwargs)


### Question 1

Since we have a binary classification problem, what is the best loss function for us?

* `mean squared error`
* `binary crossentropy`
* `categorical crossentropy`
* `cosine similarity`

> **Note:** since we specify an activation for the output layer, we don't need to set `from_logits=True`

A/ `binary crossentropy`

### Question 2

What's the total number of parameters of the model? You can use the `summary` method for that. 

* 896 
* 11214912
* 15896912
* 20072512

In [4]:
model.summary()

A/ `20,073,473`

### Generators and Training

For the next two questions, use the following data generator for both train and test sets:

```python
ImageDataGenerator(rescale=1./255)
```

* We don't need to do any additional pre-processing for the images.
* When reading the data from train/test directories, check the `class_mode` parameter. Which value should it be for a binary classification problem?
* Use `batch_size=20`
* Use `shuffle=True` for both training and test sets. 

For training use `.fit()` with the following params:

```python
model.fit(
    train_generator,
    epochs=10,
    validation_data=test_generator
)
```

In [5]:
train_generator = keras.preprocessing.image.ImageDataGenerator(rescale=1./255)
train_ds = train_generator.flow_from_directory(
    './homework-data/train', 
    target_size=(200, 200), 
    batch_size=20, 
    shuffle=True, 
    class_mode='binary' # since we use binary_crossentropy loss, we need binary labels
)

Found 800 images belonging to 2 classes.


In [6]:
test_generator = keras.preprocessing.image.ImageDataGenerator(rescale=1./255)
test_ds = test_generator.flow_from_directory(
    './homework-data/test', 
    target_size=(200, 200), 
    batch_size=20, 
    shuffle=True, 
    class_mode='binary' # since we use binary_crossentropy loss, we need binary labels
)

Found 201 images belonging to 2 classes.


In [7]:
history = model.fit(
    train_ds,
    epochs=10,
    validation_data=test_ds
)

  self._warn_if_super_not_called()


Epoch 1/10
[1m40/40[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 123ms/step - accuracy: 0.5237 - loss: 0.7079 - val_accuracy: 0.5323 - val_loss: 0.6835
Epoch 2/10
[1m40/40[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 119ms/step - accuracy: 0.6189 - loss: 0.6622 - val_accuracy: 0.6219 - val_loss: 0.6515
Epoch 3/10
[1m40/40[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m4s[0m 108ms/step - accuracy: 0.7202 - loss: 0.5966 - val_accuracy: 0.6318 - val_loss: 0.6304
Epoch 4/10
[1m40/40[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m4s[0m 110ms/step - accuracy: 0.6620 - loss: 0.5999 - val_accuracy: 0.6219 - val_loss: 0.6233
Epoch 5/10
[1m40/40[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 116ms/step - accuracy: 0.6936 - loss: 0.5741 - val_accuracy: 0.6418 - val_loss: 0.6342
Epoch 6/10
[1m40/40[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m4s[0m 108ms/step - accuracy: 0.7519 - loss: 0.5284 - val_accuracy: 0.6517 - val_loss: 0.6115
Epoch 7/10
[1m40/40[0m [3

### Question 3

What is the median of training accuracy for all the epochs for this model?

* 0.10
* 0.32
* 0.50
* 0.72

In [8]:
round(np.median(history.history['accuracy']), 2)

0.7

A/ `0.7`

### Question 4

What is the standard deviation of training loss for all the epochs for this model?

* 0.028
* 0.068
* 0.128
* 0.168

In [15]:
round(np.std(history.history['loss']), 3)

0.072

A/ `0.072`

### Data Augmentation

For the next two questions, we'll generate more data using data augmentations. 

Add the following augmentations to your training data generator:

* `rotation_range=50,`
* `width_shift_range=0.1,`
* `height_shift_range=0.1,`
* `zoom_range=0.1,`
* `horizontal_flip=True,`
* `fill_mode='nearest'`


In [10]:
augmented_train_generator = keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255,
    rotation_range=50,
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=0.1,
    horizontal_flip=True,
    fill_mode='nearest'
)
augmented_train_ds = train_generator.flow_from_directory(
    './homework-data/train', 
    target_size=(200, 200), 
    batch_size=20, 
    shuffle=True, 
    class_mode='binary' # since we use binary_crossentropy loss, we need binary labels
)

Found 800 images belonging to 2 classes.


### Question 5 

Let's train our model for 10 more epochs using the same code as previously.
> **Note:** make sure you don't re-create the model - we want to continue training the model
we already started training.

What is the mean of test loss for all the epochs for the model trained with augmentations?

* 0.26
* 0.56
* 0.86
* 1.16


In [11]:
history = model.fit(
    augmented_train_ds,
    epochs=10,
    validation_data=test_ds
)

Epoch 1/10
[1m40/40[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m4s[0m 105ms/step - accuracy: 0.7858 - loss: 0.4609 - val_accuracy: 0.6766 - val_loss: 0.6094
Epoch 2/10
[1m40/40[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m4s[0m 105ms/step - accuracy: 0.7778 - loss: 0.4593 - val_accuracy: 0.6766 - val_loss: 0.5990
Epoch 3/10
[1m40/40[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m4s[0m 102ms/step - accuracy: 0.8382 - loss: 0.3894 - val_accuracy: 0.6617 - val_loss: 0.6367
Epoch 4/10
[1m40/40[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m4s[0m 106ms/step - accuracy: 0.8118 - loss: 0.4342 - val_accuracy: 0.6119 - val_loss: 0.7421
Epoch 5/10
[1m40/40[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m4s[0m 108ms/step - accuracy: 0.8422 - loss: 0.3644 - val_accuracy: 0.6617 - val_loss: 0.6552
Epoch 6/10
[1m40/40[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m4s[0m 107ms/step - accuracy: 0.8533 - loss: 0.3361 - val_accuracy: 0.7065 - val_loss: 0.5897
Epoch 7/10
[1m40/40[0m [3

In [16]:
round(np.mean(history.history['val_loss']), 2)

0.62

A/ `0.62`

### Question 6

What's the average of test accuracy for the last 5 epochs (from 6 to 10)
for the model trained with augmentations?

* 0.31
* 0.51
* 0.71
* 0.91

In [17]:
round(np.mean(history.history['val_accuracy']), 2)

0.69

A/ `0.69`