In [1]:
from tqdm import tqdm
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_style("darkgrid")

%matplotlib inline

In this homework, we'll build a model that predicts whether an image shows a bee or a wasp. We will use the "Bee or Wasp?" dataset, which was obtained from [Kaggle](https://www.kaggle.com/datasets/jerzydziewierz/bee-vs-wasp) and slightly rebuilt.

You can download the dataset for this homework from [here](https://github.com/SVizor42/ML_Zoomcamp/releases/download/bee-wasp-data/data.zip) or use the commands in this notebook.

In [2]:
# !wget -c https://github.com/SVizor42/ML_Zoomcamp/releases/download/bee-wasp-data/data.zip
# !unzip data.zip 

## Data Preparation

- The dataset contains around 2500 images of bees and around 2100 images of wasps.
- The dataset contains separate folders for training and test sets.



## Model

The architecture of the model required for this section is described in text form in the homework file.

In [3]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import Sequential


print("GPUs: ", len(tf.config.list_physical_devices('GPU')))


GPUs:  1


In [4]:
model = Sequential([    
    keras.Input(shape=(150, 150, 3)),
    keras.layers.Conv2D(
        filters=32,
        kernel_size=(3, 3), 
        activation="relu"), 
    keras.layers.MaxPooling2D(pool_size=(2, 2)),
    keras.layers.Flatten(),
    keras.layers.Dense(units=64, activation="relu"),
    keras.layers.Dense(units=1, activation="sigmoid")
])

optimizer = keras.optimizers.SGD(learning_rate=0.002, momentum=0.8)

loss = keras.losses.BinaryCrossentropy(from_logits=False)

model.compile(
    optimizer=optimizer,
    loss=loss,
    metrics=[
        "accuracy",
        keras.metrics.Precision(),
        keras.metrics.Recall(),
        keras.metrics.AUC()
    ]
)

print(model)
print(optimizer)


<keras.src.engine.sequential.Sequential object at 0x7f35ee314340>
<keras.src.optimizers.sgd.SGD object at 0x7f35f5651700>


### Question 1

Since we have a binary classification problem, what is the best loss function for us?

**Answer**: `binary crossentropy`
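
To make the choice concrete, here is a minimal NumPy sketch of what binary crossentropy computes for sigmoid outputs. It is not the Keras implementation: the function name `binary_crossentropy`, the clipping constant `eps`, and the sample labels and probabilities are assumptions made for illustration.

```python
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary crossentropy for predicted probabilities (not logits)."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip so that log(0) cannot occur for saturated predictions (eps is an assumption).
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1.0 - eps)
    losses = -(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred))
    return float(losses.mean())

# Illustrative labels and predicted probabilities (made up for this sketch)
y_true = [1, 0, 1, 0]
confident = [0.9, 0.1, 0.8, 0.2]
uncertain = [0.5, 0.5, 0.5, 0.5]

loss_confident = binary_crossentropy(y_true, confident)
loss_uncertain = binary_crossentropy(y_true, uncertain)
print(loss_confident, loss_uncertain)
```

Confident, correct predictions give a lower loss than always predicting 0.5, which costs `log(2)` per sample. This is why the loss pairs naturally with a single sigmoid output unit.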

### Question 2


What's the number of parameters in the convolutional layer of our model? You can use the `summary` method for that. 

In [5]:
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d (Conv2D)             (None, 148, 148, 32)      896       
                                                                 
 max_pooling2d (MaxPooling2  (None, 74, 74, 32)        0         
 D)                                                              
                                                                 
 flatten (Flatten)           (None, 175232)            0         
                                                                 
 dense (Dense)               (None, 64)                11214912  
                                                                 
 dense_1 (Dense)             (None, 1)                 65        
                                                                 
Total params: 11215873 (42.79 MB)
Trainable params: 11215873 (42.79 MB)
Non-trainable params: 0 (0.00 Byte)
______________

**Answer**: `896`
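
The numbers in the summary can be reproduced by hand. The sketch below assumes the Keras defaults used in the model above ("valid" padding and stride 1 for `Conv2D`, stride equal to the pool size for `MaxPooling2D`); the helper names `conv2d_params` and `dense_params` are hypothetical and exist only for this example.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    # One weight per kernel cell and input channel, plus one bias per filter.
    return (kernel_h * kernel_w * in_channels + 1) * filters

def dense_params(in_features, units):
    # One weight per input feature, plus one bias, for every unit.
    return (in_features + 1) * units

conv = conv2d_params(3, 3, 3, 32)   # 3x3 kernel on RGB input, 32 filters

conv_side = 150 - 3 + 1             # "valid" padding, stride 1 -> 148
pool_side = conv_side // 2          # 2x2 max pooling -> 74
flat = pool_side * pool_side * 32   # flattened feature vector length

dense = dense_params(flat, 64)
out = dense_params(64, 1)
total = conv + dense + out

print(conv, flat, dense, out, total)
```

Almost all of the parameters sit in the first dense layer, because the flattened feature map is large.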

## Generators and Training

In [6]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

In [7]:
batch_size = 20

train_val_gen = ImageDataGenerator(rescale=1.0/255.0, validation_split=0.25)  # 25% of the train folder (80% of all images) = 20% of the data

train_ds = train_val_gen.flow_from_directory(
    "./data/train/", 
    target_size=(150, 150), 
    batch_size=batch_size,
    shuffle=True,
    class_mode="binary",
    subset="training"
)

val_ds = train_val_gen.flow_from_directory(
    "./data/train/", 
    target_size=(150, 150), 
    batch_size=batch_size,
    shuffle=True,
    class_mode="binary",
    subset="validation"
)


test_gen = ImageDataGenerator(rescale=1.0/255.0)

test_ds = test_gen.flow_from_directory(
    "./data/test/",
    target_size=(150, 150),
    batch_size=batch_size,
    shuffle=False, # just for eval. -> no shuffling needed
    class_mode="binary"
)

Found 2758 images belonging to 2 classes.
Found 919 images belonging to 2 classes.
Found 918 images belonging to 2 classes.


In [8]:
n_train = train_ds.samples
n_val = val_ds.samples
n_test = test_ds.samples
n = n_train + n_val + n_test
print(f"Train: {100 * n_train/n:.0f}%", n_train)
print(f"Val: {100 * n_val/n:.0f}%", n_val)
print(f"Test: {100 * n_test/n:.0f}%", n_test)

Train: 60% 2758
Val: 20% 919
Test: 20% 918


In [9]:
n_epochs = 10

history = model.fit(
    train_ds,
    epochs=n_epochs,
    validation_data=val_ds
)

Epoch 1/10


2023-11-14 15:33:07.233305: I tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:442] Loaded cuDNN version 8700
2023-11-14 15:33:08.468134: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f355c26ab90 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2023-11-14 15:33:08.468155: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): NVIDIA GeForce RTX 2070 SUPER, Compute Capability 7.5
2023-11-14 15:33:08.510254: I ./tensorflow/compiler/jit/device_compiler.h:186] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.


Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


### Question 3

What is the median of training accuracy for all the epochs for this model?

In [10]:
np.median(np.array(history.history["accuracy"])).round(2)

0.76

**Answer**: `0.80` (nearest value)

### Question 4

What is the standard deviation of training loss for all the epochs for this model?

In [12]:
np.std(np.array(history.history["loss"])).round(3)

0.083

**Answer**: `0.091` (nearest value)
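
Picking the "nearest value" can be done mechanically. The sketch below is only an illustration: the helper `nearest_option` is hypothetical, and the two option lists are assumptions, since the answer choices themselves do not appear in this notebook. Only the computed values 0.76 and 0.083 come from the cells above.

```python
def nearest_option(value, options):
    """Return the option with the smallest absolute distance to value."""
    return min(options, key=lambda option: abs(option - value))

# Assumed answer choices (not taken from this notebook)
accuracy_options = [0.20, 0.40, 0.60, 0.80]
loss_std_options = [0.031, 0.061, 0.091, 0.131]

median_accuracy = 0.76   # output of the median cell above
loss_std = 0.083         # output of the standard deviation cell above

answer_q3 = nearest_option(median_accuracy, accuracy_options)
answer_q4 = nearest_option(loss_std, loss_std_options)
print(answer_q3, answer_q4)
```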