In [1]:
import os
os.environ['TF_ENABLE_ONEDNN_OPTS'] = '0'

import numpy as np 
import matplotlib.pyplot as plt 
%matplotlib inline
%config InlineBackend.figure_format='retina'
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.preprocessing.image import load_img

# pretrained model
from tensorflow.keras.applications.xception import Xception
from tensorflow.keras.applications.xception import preprocess_input
from tensorflow.keras.applications.xception import decode_predictions

from tensorflow.keras.preprocessing.image import ImageDataGenerator

In [2]:
from tensorflow.keras.layers import Conv2D, MaxPool2D, Flatten, Dense

### Data Preparation

The dataset contains around 2500 images of bees and around 2100 images of wasps. 

The dataset contains separate folders for training and test sets. 


### Model

For this homework we will use a Convolutional Neural Network (CNN). Like in the lectures, we'll use Keras.

You need to develop the model with the following structure:

* The shape for input should be `(150, 150, 3)`
* Next, create a convolutional layer ([`Conv2D`](https://keras.io/api/layers/convolution_layers/convolution2d/)):
    * Use 32 filters
    * Kernel size should be `(3, 3)` (that's the size of the filter)
    * Use `'relu'` as activation 
* Reduce the size of the feature map with max pooling ([`MaxPooling2D`](https://keras.io/api/layers/pooling_layers/max_pooling2d/))
    * Set the pooling size to `(2, 2)`
* Turn the multi-dimensional result into vectors using a [`Flatten`](https://keras.io/api/layers/reshaping_layers/flatten/) layer
* Next, add a `Dense` layer with 64 neurons and `'relu'` activation
* Finally, create the `Dense` layer with 1 neuron - this will be the output
    * The output layer should have an activation - use the appropriate activation for the binary classification case

As the optimizer, use [`SGD`](https://keras.io/api/optimizers/sgd/) with the following parameters:

* `SGD(learning_rate=0.002, momentum=0.8)` (older Keras versions spell the first argument `lr`)

For clarification about kernel size and max pooling, check [Office Hours](https://www.youtube.com/watch?v=1WRgdBTUaAc).
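The kernel and pooling sizes above determine the feature-map shapes. A rough sketch of the arithmetic, assuming the Keras defaults (`padding='valid'`, convolution stride 1, pooling stride equal to the pool size); the helper names are illustrative and not part of Keras:

```python
def conv_output_size(size, kernel, stride=1):
    """Spatial size after a 'valid' (no padding) convolution."""
    return (size - kernel) // stride + 1


def pool_output_size(size, pool):
    """Spatial size after max pooling with stride equal to the pool size."""
    return size // pool


side = conv_output_size(150, 3)      # 150 - 3 + 1 = 148
pooled = pool_output_size(side, 2)   # 148 // 2 = 74
flat = pooled * pooled * 32          # 74 * 74 * 32 = 175232 values after Flatten

print(side, pooled, flat)
```

These numbers agree with the output shapes that `cnn.summary()` reports for this model.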


In [4]:
# create a model
cnn = tf.keras.models.Sequential()

In [5]:
# add a convolutional layer
cnn.add(Conv2D(filters=32, kernel_size=(3, 3), activation="relu", input_shape=(150, 150, 3)))

In [6]:
# Reduce the size of the feature map with max pooling (MaxPooling2D)
# Set the pooling size to (2, 2)
cnn.add(MaxPool2D(pool_size=(2, 2)))

In [7]:
# Turn the multi-dimensional result into vectors using a Flatten layer
cnn.add(Flatten())

In [8]:
# Next, add a Dense layer with 64 neurons and 'relu' activation
cnn.add(Dense(64, activation="relu"))

In [9]:
# Finally, create the Dense layer with 1 neuron - this will be the output
# The output layer should have an activation - use the appropriate activation for the binary classification case
cnn.add(Dense(units=1, activation='sigmoid'))

In [10]:
# As the optimizer, use SGD with the following parameters:
# SGD(learning_rate=0.002, momentum=0.8)
optimizer = keras.optimizers.SGD(learning_rate=0.002, momentum=0.8)
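To make the `momentum` argument more concrete, here is a minimal sketch of the update rule that the Keras `SGD` documentation describes for the non-Nesterov case (`velocity = momentum * velocity - learning_rate * g`, then `w = w + velocity`). The function name and the toy objective `f(w) = w**2` are assumptions made for illustration only:

```python
def sgd_momentum_step(w, velocity, grad, learning_rate=0.002, momentum=0.8):
    """One SGD-with-momentum update for a single scalar weight."""
    velocity = momentum * velocity - learning_rate * grad
    w = w + velocity
    return w, velocity


# Toy problem (hypothetical): minimise f(w) = w**2, whose gradient is 2 * w.
w, v = 1.0, 0.0
for step in range(3):
    grad = 2 * w
    w, v = sgd_momentum_step(w, v, grad)
    print(step, w, v)
```

With `momentum=0.8`, each step reuses 80% of the previous step's velocity, so consecutive updates in the same direction grow larger than plain SGD steps would be.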

### Question 1

Since we have a binary classification problem, what is the best loss function for us?

* `mean squared error`
* __`binary crossentropy`__ -> correct
* `categorical crossentropy`
* `cosine similarity`

> **Note:** since the output layer already applies a sigmoid activation, the model outputs probabilities rather than logits, so we keep the default `from_logits=False`
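The relation between the two settings can be sketched in plain Python: applying a sigmoid and then computing binary crossentropy on the probability gives the same value as computing the loss directly from the logit. This is only an illustration of the math. The function names are made up here, and the small epsilon clipping that Keras applies to probabilities is not modelled:

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def bce_from_probability(y_true, p):
    """Binary crossentropy when the model outputs a probability."""
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))


def bce_from_logit(y_true, z):
    """Same loss computed from the raw logit, in a numerically stable form."""
    return max(z, 0.0) - z * y_true + math.log1p(math.exp(-abs(z)))


z = 0.7  # hypothetical raw output of the last layer before the sigmoid
for y in (0, 1):
    print(y, bce_from_probability(y, sigmoid(z)), bce_from_logit(y, z))
```

Both columns print the same loss, which is why only one of the two (sigmoid in the model, or `from_logits=True` in the loss) should be used.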

In [11]:
loss = keras.losses.BinaryCrossentropy()

### Question 2

What's the number of parameters in the convolutional layer of our model? You can use the `summary` method for that. 

* 1 
* 65
* __896__ -> correct
* 11214912

In [12]:
cnn.summary()

Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d (Conv2D)             (None, 148, 148, 32)      896       
                                                                 
 max_pooling2d (MaxPooling2D  (None, 74, 74, 32)       0         
 )                                                               
                                                                 
 flatten (Flatten)           (None, 175232)            0         
                                                                 
 dense (Dense)               (None, 64)                11214912  
                                                                 
 dense_1 (Dense)             (None, 1)                 65        
                                                                 
Total params: 11,215,873
Trainable params: 11,215,873
Non-trainable params: 0
__________________________________________
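The parameter counts in the summary can be checked by hand. A small sketch, assuming every layer has a bias term (the Keras default); the helper names are illustrative:

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Each filter has kernel_h * kernel_w * in_channels weights plus one bias."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


def dense_params(inputs, units):
    """Each unit has one weight per input plus one bias."""
    return (inputs + 1) * units


conv = conv2d_params(3, 3, 3, 32)         # (3*3*3 + 1) * 32 = 896
hidden = dense_params(74 * 74 * 32, 64)   # (175232 + 1) * 64 = 11214912
output = dense_params(64, 1)              # (64 + 1) * 1 = 65
total = conv + hidden + output            # 11215873

print(conv, hidden, output, total)
```

Almost all of the parameters sit in the first `Dense` layer, because it is connected to every value of the flattened feature map.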

In [13]:
# compile the model
cnn.compile(optimizer=optimizer,
            loss=loss,
            metrics=['accuracy'])

### Generators and Training

For the next two questions, use the following data generator for both train and test sets:

```python
ImageDataGenerator(rescale=1./255)
```

* We don't need to do any additional pre-processing for the images.
* When reading the data from train/test directories, check the `class_mode` parameter. Which value should it be for a binary classification problem?
* Use `batch_size=20`
* Use `shuffle=True` for both training and test sets. 

For training use `.fit()` with the following params:

```python
model.fit(
    train_generator,
    epochs=10,
    validation_data=test_generator
)
```

In [14]:
# prepare data for the model

# train data
train_gen = ImageDataGenerator(rescale=1./255)
train_ds = train_gen.flow_from_directory('./data/train/', 
                    class_mode="binary",
                    target_size=(150, 150), 
                    batch_size=20,
                    shuffle=True)
# test data
test_gen = ImageDataGenerator(rescale=1./255)
test_ds = test_gen.flow_from_directory('./data/test/', 
                    class_mode="binary",
                    target_size=(150, 150), 
                    batch_size=20,
                    shuffle=True) 

Found 3677 images belonging to 2 classes.
Found 918 images belonging to 2 classes.
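With `batch_size=20`, the image counts above translate into a number of batches per epoch. A short sketch of that arithmetic, assuming the generator yields a smaller final batch when the count is not divisible by the batch size; the helper name is hypothetical:

```python
import math


def steps_per_epoch(num_images, batch_size):
    """Number of batches needed to see every image once."""
    return math.ceil(num_images / batch_size)


train_steps = steps_per_epoch(3677, 20)   # 184 batches
test_steps = steps_per_epoch(918, 20)     # 46 batches

# Size of the last, partially filled batch in each set.
last_train_batch = 3677 - (train_steps - 1) * 20   # 17 images
last_test_batch = 918 - (test_steps - 1) * 20      # 18 images

print(train_steps, test_steps, last_train_batch, last_test_batch)
```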


In [15]:
# check classes
train_ds.class_indices

{'bee': 0, 'wasp': 1}

In [16]:
# check the shape of X input
next(train_ds)[0].shape

(20, 150, 150, 3)

In [17]:
# import random

RANDOM_SEED = 42
# np.random.seed(RANDOM_SEED)
# random.seed(RANDOM_SEED)
keras.utils.set_random_seed(RANDOM_SEED)

In [19]:
# fit the model

history = cnn.fit(
    train_ds,
    epochs=10,
    validation_data=test_ds
)

Epoch 1/10




Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [20]:
history.history

{'loss': [0.6823903918266296,
  0.6279281377792358,
  0.5773422718048096,
  0.5389232039451599,
  0.5116524696350098,
  0.49900102615356445,
  0.4663781523704529,
  0.43523722887039185,
  0.41161584854125977,
  0.38767746090888977],
 'accuracy': [0.5572477579116821,
  0.639651894569397,
  0.6997552514076233,
  0.738645613193512,
  0.7593146562576294,
  0.7696491479873657,
  0.7952134609222412,
  0.811531126499176,
  0.8224095702171326,
  0.8365515470504761],
 'val_loss': [0.6362471580505371,
  0.5790044069290161,
  0.5437665581703186,
  0.5401648879051208,
  0.5949493646621704,
  0.5085054039955139,
  0.5886616110801697,
  0.5128922462463379,
  0.48492664098739624,
  0.5107535123825073],
 'val_accuracy': [0.6209150552749634,
  0.7156862616539001,
  0.7233115434646606,
  0.7200435996055603,
  0.6928104758262634,
  0.7450980544090271,
  0.7080609798431396,
  0.7516340017318726,
  0.7766884565353394,
  0.7461873888969421]}

### Question 3

What is the median of training accuracy for all the epochs for this model?

* 0.20
* 0.40
* 0.60
* __0.80__ -> closest option (the computed median is about 0.76)

In [23]:
np.median(history.history['accuracy'])

0.7644819021224976

### Question 4

What is the standard deviation of training loss for all the epochs for this model?

* 0.031
* 0.061
* __0.091__ -> closest option (the computed value is about 0.090)
* 0.131

In [24]:
np.std(history.history["loss"])

0.09006098124296918
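As a cross-check, the same statistics can be reproduced with the standard library from the values printed in `history.history` above. Note that `np.std` defaults to the population standard deviation (`ddof=0`), which corresponds to `statistics.pstdev`, not `statistics.stdev`. A minimal sketch:

```python
import statistics

# Training metrics copied from the history.history output above.
train_accuracy = [
    0.5572477579116821, 0.639651894569397, 0.6997552514076233,
    0.738645613193512, 0.7593146562576294, 0.7696491479873657,
    0.7952134609222412, 0.811531126499176, 0.8224095702171326,
    0.8365515470504761,
]
train_loss = [
    0.6823903918266296, 0.6279281377792358, 0.5773422718048096,
    0.5389232039451599, 0.5116524696350098, 0.49900102615356445,
    0.4663781523704529, 0.43523722887039185, 0.41161584854125977,
    0.38767746090888977,
]

# With 10 values the median is the mean of the 5th and 6th sorted values.
median_accuracy = statistics.median(train_accuracy)

loss_std_population = statistics.pstdev(train_loss)  # divides by n, like np.std
loss_std_sample = statistics.stdev(train_loss)       # divides by n - 1

print(median_accuracy, loss_std_population, loss_std_sample)
```

The sample standard deviation is slightly larger than the population one, but both point to the same answer option here.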

### Data Augmentation

For the next two questions, we'll generate more data using data augmentations. 

Add the following augmentations to your training data generator:

* `rotation_range=50,`
* `width_shift_range=0.1,`
* `height_shift_range=0.1,`
* `zoom_range=0.1,`
* `horizontal_flip=True,`
* `fill_mode='nearest'`

In [25]:
# re-generate train data with Data Augmentation
train_gen = ImageDataGenerator(rescale=1./255,
                            rotation_range=50,
                            width_shift_range=0.1,
                            height_shift_range=0.1,
                            zoom_range=0.1,
                            horizontal_flip=True,
                            fill_mode='nearest'
                            )
train_ds = train_gen.flow_from_directory('./data/train/', 
                    class_mode="binary",
                    target_size=(150, 150), 
                    batch_size=20,
                    shuffle=True)

Found 3677 images belonging to 2 classes.


### Question 5 

Let's train our model for 10 more epochs using the same code as previously.

> **Note:** make sure you don't re-create the model. We want to continue training the model we already started training.

What is the mean of test loss for all the epochs for the model trained with augmentations?

* 0.18
* __0.48__ -> correct
* 0.78
* 0.108

In [27]:
history2 = cnn.fit(
    train_ds,
    epochs=10,
    validation_data=test_ds
)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [28]:
np.mean(history2.history["val_loss"])

0.4727381467819214

### Question 6

What's the average of test accuracy for the last 5 epochs (from 6 to 10)
for the model trained with augmentations?

* 0.38
* 0.58
* __0.78__ -> correct
* 0.98

In [33]:
np.mean(history2.history["val_accuracy"][-5:])

0.7784313797950745

Now let's build the same model a different way, using the Keras functional API, and repeat the experiments.

In [3]:
inputs = keras.Input(shape=(150, 150, 3))
# the input shape is already defined by keras.Input, so Conv2D does not need input_shape
conv = Conv2D(filters=32, kernel_size=(3, 3), activation="relu")(inputs)
pooling = MaxPool2D(pool_size=(2, 2))(conv)
flat = Flatten()(pooling)
inner = Dense(64, activation="relu")(flat)
outputs = Dense(units=1, activation='sigmoid')(inner)
model = keras.Model(inputs, outputs)

optimizer1 = keras.optimizers.SGD(learning_rate=0.002, momentum=0.8)
loss1 = keras.losses.BinaryCrossentropy()


In [4]:
model.compile(optimizer=optimizer1,
              loss=loss1,
              metrics=['accuracy'])

In [5]:
# train data
train_gen1 = ImageDataGenerator(rescale=1./255)
train_ds1 = train_gen1.flow_from_directory('./data/train/', 
                    class_mode="binary",
                    target_size=(150, 150), 
                    batch_size=20,
                    shuffle=True)
# test data
test_gen1 = ImageDataGenerator(rescale=1./255)
test_ds1 = test_gen1.flow_from_directory('./data/test/', 
                    class_mode="binary",
                    target_size=(150, 150), 
                    batch_size=20,
                    shuffle=True) 

Found 3677 images belonging to 2 classes.
Found 918 images belonging to 2 classes.


### Q.2
Number of parameters in the `Conv2D` layer: 896

In [6]:
model.summary()

Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 input_1 (InputLayer)        [(None, 150, 150, 3)]     0         
                                                                 
 conv2d (Conv2D)             (None, 148, 148, 32)      896       
                                                                 
 max_pooling2d (MaxPooling2D  (None, 74, 74, 32)       0         
 )                                                               
                                                                 
 flatten (Flatten)           (None, 175232)            0         
                                                                 
 dense (Dense)               (None, 64)                11214912  
                                                                 
 dense_1 (Dense)             (None, 1)                 65        
                                                             

In [7]:
# train the model
history3 = model.fit(
    train_ds1,
    epochs=10,
    validation_data=test_ds1
)

Epoch 1/10




Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


### Q.3
Median of training accuracy for all the epochs for this model

In [8]:
np.median(history3.history['accuracy'])  # closest option is still 0.80

0.7642099559307098

### Q.4
Standard deviation of training loss for all the epochs for this model

In [9]:
np.std(history3.history["loss"])  # closest option is still 0.091

0.09209891809690528

### Data Augmentation

In [10]:
# re-generate train data with Data Augmentation
train_gen2 = ImageDataGenerator(rescale=1./255,
                            rotation_range=50,
                            width_shift_range=0.1,
                            height_shift_range=0.1,
                            zoom_range=0.1,
                            horizontal_flip=True,
                            fill_mode='nearest'
                            )
train_ds2 = train_gen2.flow_from_directory('./data/train/', 
                    class_mode="binary",
                    target_size=(150, 150), 
                    batch_size=20,
                    shuffle=True)

Found 3677 images belonging to 2 classes.


In [11]:
history4 = model.fit(
    train_ds2,
    epochs=10,
    validation_data=test_ds1
)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


### Q.5
Mean of test loss for all the epochs for the model trained with augmentations

In [16]:
np.mean(history4.history["val_loss"])  # higher than in the first run, but the closest option is still 0.48

0.5271378576755523

### Q.6
Average of test accuracy for the last 5 epochs (from 6 to 10) for the model trained with augmentations

In [14]:
np.mean(history4.history["val_accuracy"][-5:])

0.7535947680473327