# Session 08 Homework

### Dataset

In this homework, we'll build a model for classifying various hair types. 
For this, we will use the Hair Type dataset that was obtained from 
[Kaggle](https://www.kaggle.com/datasets/kavyasreeb/hair-type-dataset) 
and slightly rebuilt. 

We can download the target dataset for this homework from 
[here](https://github.com/SVizor42/ML_Zoomcamp/releases/download/straight-curly-data/data.zip):


In [1]:
# Download and unzip the data
!wget https://github.com/SVizor42/ML_Zoomcamp/releases/download/straight-curly-data/data.zip
!unzip data.zip

--2024-12-06 21:59:09--  https://github.com/SVizor42/ML_Zoomcamp/releases/download/straight-curly-data/data.zip
Resolving github.com (github.com)... 140.82.113.4
Connecting to github.com (github.com)|140.82.113.4|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://objects.githubusercontent.com/github-production-release-asset-2e65be/405934815/e712cf72-f851-44e0-9c05-e711624af985?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=releaseassetproduction%2F20241206%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20241206T215909Z&X-Amz-Expires=300&X-Amz-Signature=4fe9ce60337b03b37081c2062b6a2a7c885f173970314bfacc3f2648bd740b4f&X-Amz-SignedHeaders=host&response-content-disposition=attachment%3B%20filename%3Ddata.zip&response-content-type=application%2Foctet-stream [following]
--2024-12-06 21:59:09--  https://objects.githubusercontent.com/github-production-release-asset-2e65be/405934815/e712cf72-f851-44e0-9c05-e711624af985?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Cr

In the lectures we saw how to use a pre-trained neural network. In the homework, we'll train a much smaller model from scratch. 

> **Note:** We will use the [Saturn Cloud](https://bit.ly/saturn-mlzoomcamp) environment for its GPU.


### Data Preparation

The dataset contains around 1000 images of hair, split into separate folders 
for the training and test sets. 

### Reproducibility

Reproducibility in deep learning is a multifaceted challenge that requires attention 
to both software and hardware details. In some cases, we can't guarantee exactly 
the same results during the same experiment runs. Therefore, in this homework we will:
* install TensorFlow version 2.17.1
* set the seeds of the random number generators as follows:

In [2]:
# import tensorflow and numpy
import numpy as np
import tensorflow as tf

# Set the seed
SEED = 42
np.random.seed(SEED)
tf.random.set_seed(SEED)

2024-12-06 21:59:11.676371: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-12-06 21:59:11.693878: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1733522351.713489   67399 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1733522351.719427   67399 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-12-06 21:59:11.739183: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instr

In [3]:
# Tensorflow version
tf.__version__

'2.18.0'

Note that using this version instead of `2.17.1` is also fine for this homework.

### Model

For this homework, we will use a Convolutional Neural Network (CNN). Like in the lectures, we'll use Keras.

We need to develop the model with the following structure:

* The shape for input should be `(200, 200, 3)`
* Next, we create a convolutional layer ([`Conv2D`](https://keras.io/api/layers/convolution_layers/convolution2d/)):
    * We use 32 filters
    * Kernel size should be `(3, 3)` (that's the size of the filter)
    * We use `'relu'` as activation 
* We reduce the size of the feature map with max pooling ([`MaxPooling2D`](https://keras.io/api/layers/pooling_layers/max_pooling2d/))
    * We set the pooling size to `(2, 2)`
* We turn the multi-dimensional result into vectors using a [`Flatten`](https://keras.io/api/layers/reshaping_layers/flatten/) layer
* Next, we add a `Dense` layer with 64 neurons and `'relu'` activation
* Finally, we create the `Dense` layer with 1 neuron - this will be the output
    * The output layer has a `sigmoid` activation, appropriate for the binary classification case

As optimizer use [`SGD`](https://keras.io/api/optimizers/sgd/) with the following parameters:

* `SGD(learning_rate = 0.002, momentum = 0.8)`
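To make the two optimizer parameters concrete, here is a minimal sketch of one SGD-with-momentum update in plain Python. It follows the update rule described in the Keras `SGD` documentation (velocity first, then the weight); the function name `sgd_momentum_step` and the toy objective `f(w) = w**2` are assumptions made for illustration only.

```python
def sgd_momentum_step(w, velocity, grad, learning_rate=0.002, momentum=0.8):
    """One SGD update with (non-Nesterov) momentum on a single scalar weight."""
    # The velocity keeps a decaying memory of previous gradients
    velocity = momentum * velocity - learning_rate * grad
    # The weight moves along the velocity, not the raw gradient
    w = w + velocity
    return w, velocity


# Toy example (hypothetical): minimize f(w) = w**2, whose gradient is 2 * w
w, v = 1.0, 0.0
for _ in range(3):
    w, v = sgd_momentum_step(w, v, grad=2.0 * w)
```

With `momentum = 0` this reduces to plain gradient descent; a value of `0.8` lets consecutive steps in the same direction build up speed.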

For clarification about kernel size and max pooling, check [Office Hours](https://www.youtube.com/watch?v=1WRgdBTUaAc).


In [4]:
# import keras
from tensorflow import keras
from tensorflow.keras.preprocessing.image import ImageDataGenerator

### Question 1

Since we have a binary classification problem, the best loss function for us is `binary crossentropy`.

> **Note:** since we specify an activation for the output layer, we don't need to set `from_logits=True`
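The note about `from_logits` can be illustrated with a small, framework-free sketch: computing the loss from sigmoid probabilities gives the same value as computing it directly from the raw score (the logit). The helper names below are hypothetical and are not Keras functions.

```python
import math


def sigmoid(z):
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def bce_from_probs(y_true, p):
    """Binary cross-entropy when the model already outputs a probability."""
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))


def bce_from_logits(y_true, z):
    """Numerically stable binary cross-entropy computed from a raw logit."""
    return max(z, 0.0) - z * y_true + math.log1p(math.exp(-abs(z)))


logit, label = 0.7, 1.0
loss_a = bce_from_probs(label, sigmoid(logit))  # sigmoid output + plain loss
loss_b = bce_from_logits(label, logit)          # no activation + logit-based loss
```

Both paths agree up to floating-point error, so the choice only has to be consistent: a `sigmoid` output layer pairs with the default `from_logits=False`.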


In [5]:
# Make a function to create a CNN
def make_model(input_size = 200, size_inner = 64, learning_rate = 0.002, momentum = 0.8):
    # Specify the inputs (part of the model that receives the images)
    inputs = keras.Input(shape = (input_size, input_size, 3))
    
    # Create a convolutional layer
    conv1 = keras.layers.Conv2D(filters = 32, kernel_size = (3, 3), activation = "relu")(inputs)
    # Maximum Pooling layer to reduce the dimensionality
    max_pool1 = keras.layers.MaxPooling2D(pool_size = (2, 2))(conv1)
    # Add a flatten layer
    vectors = keras.layers.Flatten()(max_pool1)
    # Add an inner layer
    inner = keras.layers.Dense(size_inner, activation = 'relu')(vectors)
    
    # Dense layer for the output
    outputs = keras.layers.Dense(1, activation = "sigmoid")(inner)
    # model
    model = keras.Model(inputs, outputs)
    
    # Set the optimizer
    optimizer = keras.optimizers.SGD(learning_rate = learning_rate, momentum = momentum)
    # Set the loss function
    loss = keras.losses.BinaryCrossentropy()
    
    # Compile everything in our model, setting optimizer, loss, and metric's evaluation
    model.compile(
        optimizer = optimizer,
        loss = loss,
        metrics = ['accuracy']
    )

    # return model
    return model

### Question 2

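The total number of parameters of the model is `20073473` (896 for the convolutional layer, 20072512 for the first `Dense` layer, and 65 for the output layer); the closest answer is `20072512`.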
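As a cross-check on `model.summary()`, the parameter count can be worked out by hand. The sketch below assumes the Keras defaults used in `make_model` (no padding and stride 1 for `Conv2D`, stride equal to the pool size for `MaxPooling2D`); the helper functions are hypothetical.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


def dense_params(n_inputs, units):
    """Weights plus one bias per unit."""
    return (n_inputs + 1) * units


input_size = 200
conv = conv2d_params(3, 3, 3, 32)        # 896
conv_out = input_size - 3 + 1            # 198: a 3x3 kernel without padding
pooled = conv_out // 2                   # 99: 2x2 max pooling halves each side
flat = pooled * pooled * 32              # 313632 values after Flatten
dense_inner = dense_params(flat, 64)     # 20072512
dense_out = dense_params(64, 1)          # 65
total = conv + dense_inner + dense_out   # 20073473
```

Almost all parameters sit in the first `Dense` layer, because it is connected to every value of the flattened feature map.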

In [6]:
# Set the input size for images
input_size = 200

In [7]:
# Build our model
model = make_model(input_size = input_size, size_inner = 64, learning_rate = 0.002, momentum = 0.8)

# number of parameters of the model
model.summary()

I0000 00:00:1733522354.359823   67399 gpu_device.cc:2022] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 689 MB memory:  -> device: 0, name: Tesla T4, pci bus id: 0000:00:1e.0, compute capability: 7.5


### Generators and Training

For the next two questions, we will use the following data generator for both train and test sets:

```python
ImageDataGenerator(rescale = 1./255)
```

* We don't need to do any additional pre-processing for the images.
* When reading the data from train/test directories, we check the `class_mode` parameter. For a binary classification problem, it should be `'binary'`.
* We use `batch_size = 20` and `shuffle = True` for both training and test sets. 

In [8]:
# Initialize training data generator
train_gen = ImageDataGenerator(rescale = 1./255)
# Extract training images
train_ds = train_gen.flow_from_directory(
    './data/train',
    class_mode = 'binary',
    target_size = (input_size, input_size),
    batch_size = 20,
    shuffle = True
)

# Initialize validation data generator
test_gen = ImageDataGenerator(rescale = 1./255)
# Extract validation images
test_ds = test_gen.flow_from_directory(
    './data/test',
    class_mode = 'binary',
    target_size = (input_size, input_size),
    batch_size = 20,
    shuffle = True
)

Found 800 images belonging to 2 classes.
Found 201 images belonging to 2 classes.
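The image counts printed above determine how many batches each epoch processes. A quick sketch of that arithmetic, using a hypothetical helper `steps_per_epoch`:

```python
import math


def steps_per_epoch(n_images, batch_size):
    """Number of batches needed to see every image once."""
    return math.ceil(n_images / batch_size)


train_steps = steps_per_epoch(800, 20)           # 40 full batches
test_steps = steps_per_epoch(201, 20)            # 11 batches
last_test_batch = 201 - 20 * (test_steps - 1)    # the final test batch holds 1 image
```

The `40` matches the `40/40` progress counter that Keras prints for each training epoch.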


For training we use `.fit()` with the following params:

```python
model.fit(
    train_ds,
    epochs = 10,
    validation_data = test_ds
)
```

In [9]:
# Model training
history = model.fit(train_ds, epochs = 10, validation_data = test_ds)

  self._warn_if_super_not_called()


Epoch 1/10


I0000 00:00:1733522356.172600   67456 service.cc:148] XLA service 0x7f8994003970 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
I0000 00:00:1733522356.172628   67456 service.cc:156]   StreamExecutor device (0): Tesla T4, Compute Capability 7.5
2024-12-06 21:59:16.191952: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:268] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
I0000 00:00:1733522356.273427   67456 cuda_dnn.cc:529] Loaded cuDNN version 90300
2024-12-06 21:59:16.560116: I external/local_xla/xla/service/gpu/autotuning/conv_algorithm_picker.cc:557] Omitted potentially buggy algorithm eng14{k25=0} for conv (f32[20,32,198,198]{3,2,1,0}, u8[0]{0}) custom-call(f32[20,3,200,200]{3,2,1,0}, f32[32,3,3,3]{3,2,1,0}, f32[32]{0}), window={size=3x3}, dim_labels=bf01_oi01->bf01, custom_call_target="__cudnn$convBiasActivationForward", backend_config={"cudnn_conv_backend_config":{"activation_mode":"

 3/40 ━━━━━━━━━━━━━━━━━━━━ 2s 74ms/step - accuracy: 0.6333 - loss: 0.6711

I0000 00:00:1733522357.692391   67456 device_compiler.h:188] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.


40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 105ms/step - accuracy: 0.6052 - loss: 0.6790

2024-12-06 21:59:22.376247: I external/local_xla/xla/service/gpu/autotuning/conv_algorithm_picker.cc:557] Omitted potentially buggy algorithm eng14{k25=0} for conv (f32[20,32,198,198]{3,2,1,0}, u8[0]{0}) custom-call(f32[20,3,200,200]{3,2,1,0}, f32[32,3,3,3]{3,2,1,0}, f32[32]{0}), window={size=3x3}, dim_labels=bf01_oi01->bf01, custom_call_target="__cudnn$convBiasActivationForward", backend_config={"cudnn_conv_backend_config":{"activation_mode":"kRelu","conv_result_scale":1,"leakyrelu_alpha":0,"side_input_scale":0},"force_earliest_schedule":false,"operation_queue_id":"0","wait_on_operation_queues":[]}
2024-12-06 21:59:22.401027: W external/local_xla/xla/tsl/framework/bfc_allocator.cc:306] Allocator (GPU_0_bfc) ran out of memory trying to allocate 338.50MiB with freed_by_count=0. The caller indicates that this is not a failure, but this may mean that there could be performance gains if more memory were available.


40/40 ━━━━━━━━━━━━━━━━━━━━ 9s 160ms/step - accuracy: 0.6052 - loss: 0.6787 - val_accuracy: 0.6219 - val_loss: 0.6538
Epoch 2/10
40/40 ━━━━━━━━━━━━━━━━━━━━ 5s 125ms/step - accuracy: 0.6691 - loss: 0.6062 - val_accuracy: 0.6517 - val_loss: 0.6239
Epoch 3/10
40/40 ━━━━━━━━━━━━━━━━━━━━ 5s 128ms/step - accuracy: 0.6543 - loss: 0.5794 - val_accuracy: 0.6368 - val_loss: 0.6199
Epoch 4/10
40/40 ━━━━━━━━━━━━━━━━━━━━ 5s 126ms/step - accuracy: 0.7158 - loss: 0.5402 - val_accuracy: 0.6617 - val_loss: 0.6064
Epoch 5/10
40/40 ━━━━━━━━━━━━━━━━━━━━ 5s 125ms/step - accuracy: 0.7622 - loss: 0.5017 - val_accuracy: 0.6468 - val_loss: 0.6160
Epoch 6/10
40/40 ━━━━━━━━━━━━━━━━━━━━ 5s 125ms/step - accuracy: 0.7704 - loss: 0.4794 - val_accuracy: 0.6368 - val_loss: 0.6854
Epoch 7/10
40/40 ━━━━━━━━━

### Question 3

The median of training accuracy across all epochs for this model is about `0.74` in this run; the closest answer is `0.72`.

In [10]:
# Median of accuracy scores
np.median(history.history['accuracy'])

np.float64(0.7437500059604645)

### Question 4

The standard deviation of training loss across all epochs for this model is about `0.070` in this run; the closest answer is `0.068`.

In [11]:
# Standard deviation of training loss
np.std(history.history['loss'])

np.float64(0.06971815611883433)

### Data Augmentation

For the next two questions, we'll generate more data using data augmentations. 

Let's add the following augmentations to our training data generator:

* `rotation_range = 50,`
* `width_shift_range = 0.1,`
* `height_shift_range = 0.1,`
* `zoom_range = 0.1,`
* `horizontal_flip = True,`
* `fill_mode = 'nearest'`
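To give a feel for what these settings mean for a `200 x 200` image, here is a small framework-free sketch. The ranges follow the `ImageDataGenerator` documentation (fractional shifts are relative to the image size, and a float `zoom_range` `z` means zoom factors in `[1 - z, 1 + z]`); the helper `flip_horizontal` and the tiny nested-list "image" are assumptions for illustration, not what Keras runs internally.

```python
def flip_horizontal(image):
    """Mirror an image (a list of pixel rows) left to right."""
    return [row[::-1] for row in image]


# A tiny 2x3 single-channel "image" stands in for a real photo
tiny = [[1, 2, 3],
        [4, 5, 6]]
flipped = flip_horizontal(tiny)           # [[3, 2, 1], [6, 5, 4]]

input_size = 200
max_shift_px = 0.1 * input_size           # shifts of up to about 20 pixels each way
rotation_bounds = (-50, 50)               # random rotation angle in degrees
zoom_bounds = (1 - 0.1, 1 + 0.1)          # zoom factor between about 0.9 and 1.1
```

Because every epoch samples new random transformations, the model rarely sees exactly the same training image twice.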

In [12]:
# Initialize training data generator
train_gen = ImageDataGenerator(
    rescale = 1./255,
    rotation_range = 50,
    width_shift_range = 0.1,
    height_shift_range = 0.1,
    zoom_range = 0.1,
    horizontal_flip = True,
    fill_mode = 'nearest'
)
# Extract training images
train_ds = train_gen.flow_from_directory(
    './data/train',
    class_mode = 'binary',
    target_size = (input_size, input_size),
    batch_size = 20,
    shuffle = True
)

Found 800 images belonging to 2 classes.


### Question 5 

Let's train our model for 10 more epochs using the same code as previously.
> **Note:** We will make sure not to re-create the model - we want to continue training the model
we already started training.

In [13]:
# Model training
history = model.fit(train_ds, epochs = 10, validation_data = test_ds)

Epoch 1/10
40/40 ━━━━━━━━━━━━━━━━━━━━ 11s 267ms/step - accuracy: 0.6869 - loss: 0.5877 - val_accuracy: 0.7114 - val_loss: 0.5779
Epoch 2/10
40/40 ━━━━━━━━━━━━━━━━━━━━ 11s 261ms/step - accuracy: 0.7054 - loss: 0.5624 - val_accuracy: 0.7164 - val_loss: 0.5609
Epoch 3/10
40/40 ━━━━━━━━━━━━━━━━━━━━ 10s 261ms/step - accuracy: 0.6801 - loss: 0.5855 - val_accuracy: 0.7264 - val_loss: 0.5455
Epoch 4/10
40/40 ━━━━━━━━━━━━━━━━━━━━ 10s 254ms/step - accuracy: 0.6957 - loss: 0.5776 - val_accuracy: 0.6965 - val_loss: 0.5914
Epoch 5/10
40/40 ━━━━━━━━━━━━━━━━━━━━ 10s 262ms/step - accuracy: 0.6787 - loss: 0.5961 - val_accuracy: 0.7463 - val_loss: 0.5282
Epoch 6/10
40/40 ━━━━━━━━━━━━━━━━━━━━ 10s 261ms/step - accuracy: 0.7334 - loss: 0.5585 - val_accuracy: 0.6468 - val_loss: 0.6937
Epoch 7/10
40/40

The mean of test loss across all epochs for the model trained with augmentations is about `0.58` in this run; the closest answer is `0.56`.

In [14]:
# Mean of the test loss
np.mean(history.history['val_loss'])

np.float64(0.5809308767318726)

### Question 6

The average of test accuracy for the last 5 epochs (from 6 to 10)
for the model trained with augmentations is about `0.69` in this run; the closest answer is `0.71`.

In [15]:
# Average of the test accuracy for the last 5 epochs
np.mean(history.history['val_accuracy'][5:])

np.float64(0.6935323476791382)

---