# Objective

This notebook contains the Week 8 (Deep Learning) homework of the Machine Learning Zoomcamp (2023 cohort). The assignment is available at: https://github.com/DataTalksClub/machine-learning-zoomcamp/blob/master/cohorts/2023/08-deep-learning/homework.md

The goal is to predict whether the insect in an image is a bee or a wasp, using a Convolutional Neural Network (CNN) built from scratch and trained and tested on a dataset of bee and wasp photos.

<img src="images/maxresdefault.jpg" style="display:block;float:none;margin-left:auto;margin-right:auto;width:100%">


# Data

The dataset can be downloaded from this link: https://github.com/SVizor42/ML_Zoomcamp/releases/download/bee-wasp-data/data.zip
It is a slightly rebuilt version of the "Bee or Wasp?" Kaggle dataset, as specified in the homework description.

To download and extract it, for example in a Saturn Cloud notebook, run these commands:
```bash
wget https://github.com/SVizor42/ML_Zoomcamp/releases/download/bee-wasp-data/data.zip
unzip data.zip
```

This dataset contains two folders: train and test. Each of them contains two subfolders, bee and wasp, which hold the photos (.jpg format).
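To check that the archive was extracted with the layout described above, the images in each folder can be counted. The following is a minimal sketch: the helper name `count_images` is hypothetical (it is not part of the homework), and it assumes the archive was unzipped into `./data`.

```python
from pathlib import Path


def count_images(root):
    """Return {split: {class_name: number_of_jpg_files}} for a dataset folder.

    Hypothetical helper for illustration; returns {} if the folder is missing.
    """
    counts = {}
    root = Path(root)
    if not root.is_dir():
        return counts
    for split_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        counts[split_dir.name] = {
            class_dir.name: sum(
                1 for f in class_dir.iterdir() if f.suffix.lower() == ".jpg"
            )
            for class_dir in sorted(p for p in split_dir.iterdir() if p.is_dir())
        }
    return counts


# Assumes the archive was unzipped into ./data; prints {} if the folder is absent
print(count_images("./data"))
```

The per-split totals should agree with the "Found ... images" messages that Keras prints when the generators are created later in the notebook.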

# Notebook

## Imports

In [1]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.preprocessing.image import load_img
from tensorflow.keras.preprocessing.image import ImageDataGenerator

import statistics
import numpy as np




In [19]:
# Check tensorflow version
tf.__version__

'2.15.0'

## Functions

## Data preparation

### Model

The homework specifies the initial model as follows:

For this homework we will use a Convolutional Neural Network (CNN). Like in the lectures, we'll use Keras.

You need to develop the model with the following structure:

- The shape for input should be (150, 150, 3)
- Next, create a convolutional layer (Conv2D):
    - Use 32 filters
    - Kernel size should be (3, 3) (that's the size of the filter)
    - Use 'relu' as activation
- Reduce the size of the feature map with max pooling (MaxPooling2D)
    - Set the pooling size to (2, 2)
- Turn the multi-dimensional result into vectors using a Flatten layer
- Next, add a Dense layer with 64 neurons and 'relu' activation
- Finally, create the Dense layer with 1 neuron - this will be the output
    - The output layer should have an activation - use the appropriate activation for the binary classification case
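- As the optimizer, use SGD with the following parameters:
    - SGD(lr=0.002, momentum=0.8) (the lr argument is deprecated in recent Keras versions, so the code in this notebook passes learning_rate instead)

For clarification about kernel size and max pooling, check Office Hours.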

In [2]:
# Setup constants
INPUT_SHAPE = (150, 150, 3)
ACTIVATION = "relu"
OUTPUT_ACTIVATION = "sigmoid"
NUMBER_FILTERS = 32
KERNEL_SIZE = (3, 3)
POOLING_SIZE = (2, 2)
DENSE_FIRST_NEURONS_NUMBER = 64
DENSE_OUTPUT_NEURONS_NUMBER = 1
SGD_LR = 0.002
SGD_MOMENTUM = 0.8

In [3]:
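# Build the model
model = keras.Sequential(
    [
        # Convolution: 32 filters of size 3x3 applied to the 150x150 RGB input
        layers.Conv2D(filters=NUMBER_FILTERS,
                      kernel_size=KERNEL_SIZE,
                      activation=ACTIVATION,
                      input_shape=INPUT_SHAPE),
        # Max pooling halves the height and width of the feature map
        layers.MaxPooling2D(pool_size=POOLING_SIZE),
        # Flatten the feature map into a vector for the dense layers
        layers.Flatten(),
        layers.Dense(units=DENSE_FIRST_NEURONS_NUMBER,
                     activation=ACTIVATION),
        # Single sigmoid output for binary classification
        layers.Dense(units=DENSE_OUTPUT_NEURONS_NUMBER,
                     activation=OUTPUT_ACTIVATION)
    ]
)

# Binary cross-entropy matches the single sigmoid output
loss = keras.losses.BinaryCrossentropy()
opt = keras.optimizers.SGD(learning_rate=SGD_LR,
                           momentum=SGD_MOMENTUM)
model.compile(optimizer=opt, loss=loss, metrics=['accuracy'])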





In [4]:
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d (Conv2D)             (None, 148, 148, 32)      896       
                                                                 
 max_pooling2d (MaxPooling2  (None, 74, 74, 32)        0         
 D)                                                              
                                                                 
 flatten (Flatten)           (None, 175232)            0         
                                                                 
 dense (Dense)               (None, 64)                11214912  
                                                                 
 dense_1 (Dense)             (None, 1)                 65        
                                                                 
Total params: 11215873 (42.79 MB)
Trainable params: 11215873 (42.79 MB)
Non-trainable params: 0 (0.00 Byte)
______________

The Conv2D layer has 896 parameters.
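The numbers in the summary can be reproduced by hand. The sketch below recomputes the output shapes and parameter counts from the model constants, assuming the Keras defaults for Conv2D (padding "valid", stride 1). The variable names are illustrative only.

```python
# Values taken from the model definition above
input_h, input_w, input_c = 150, 150, 3
filters = 32
k_h, k_w = 3, 3
pool = 2
dense_units = 64

# Each filter has k_h * k_w * input_c weights plus one bias
conv_params = (k_h * k_w * input_c + 1) * filters

# "valid" padding with stride 1 shrinks each side by (kernel size - 1)
conv_h, conv_w = input_h - k_h + 1, input_w - k_w + 1

# 2x2 max pooling halves height and width
pool_h, pool_w = conv_h // pool, conv_w // pool

# Flatten turns the pooled feature map into a single vector
flat = pool_h * pool_w * filters

# Dense layers: one weight per input plus one bias, per unit
dense_params = (flat + 1) * dense_units
output_params = dense_units + 1

total = conv_params + dense_params + output_params

print(conv_params)             # 896
print((conv_h, conv_w))        # (148, 148)
print((pool_h, pool_w), flat)  # (74, 74) 175232
print(dense_params, total)     # 11214912 11215873
```

Almost all of the parameters sit in the first Dense layer, because it is connected to every one of the 175232 flattened values.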

### Training and testing datasets

- We don't need to do any additional pre-processing for the images.
- When reading the data from train/test directories, check the class_mode parameter. Which value should it be for a binary classification problem?
- Use batch_size=20
- Use shuffle=True for both training and test sets.
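For a two-class problem with a single sigmoid output, class_mode should be 'binary', so that each image receives a single 0/1 label. Keras infers the classes from the subfolder names and assigns indices in alphanumeric order. The sketch below mimics that mapping in plain Python; it illustrates the rule and is not the Keras implementation itself.

```python
# Class folder names as described in the Data section
class_names = ["wasp", "bee"]

# Indices are assigned after sorting the folder names alphanumerically
class_indices = {name: idx for idx, name in enumerate(sorted(class_names))}

print(class_indices)  # {'bee': 0, 'wasp': 1}
```

With this mapping, a sigmoid output close to 1 corresponds to the wasp class and an output close to 0 to the bee class. On a real generator, the same dictionary is available as its class_indices attribute.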

In [5]:
# Data generator: only rescales pixel values from [0, 255] to [0, 1]
datagen = ImageDataGenerator(rescale=1./255)

In [6]:
# Set target size (according to input shape)
TARGET_SIZE=(150,150)

In [7]:
train_ds = datagen.flow_from_directory(
    './data/train/',
    batch_size=20,
    shuffle=True,
    class_mode='binary',
    target_size=TARGET_SIZE
)

Found 3677 images belonging to 2 classes.


In [8]:
test_ds = datagen.flow_from_directory(
    './data/test/',
    batch_size=20,
    shuffle=True,
    class_mode='binary',
    target_size=TARGET_SIZE
)

Found 918 images belonging to 2 classes.
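With batch_size=20, the number of batches per epoch follows from the image counts reported above. A small sketch of the arithmetic (the helper name `batches_per_epoch` is hypothetical):

```python
import math


def batches_per_epoch(n_images, batch_size):
    """Number of batches needed to see every image once (the last batch may be smaller)."""
    return math.ceil(n_images / batch_size)


train_steps = batches_per_epoch(3677, 20)
test_steps = batches_per_epoch(918, 20)

print(train_steps, test_steps)  # 184 46
```

These values are the step counts to expect in the training progress output for each epoch.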


## Training the model

In [9]:
result = model.fit(
    train_ds,
    epochs=10,
    validation_data=test_ds
)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [10]:
result.history

{'loss': [0.6512953042984009,
  0.5657526254653931,
  0.5348544716835022,
  0.5096314549446106,
  0.4719887673854828,
  0.46101152896881104,
  0.43318313360214233,
  0.4015069901943207,
  0.3677012026309967,
  0.329925537109375],
 'accuracy': [0.6032091379165649,
  0.7133532762527466,
  0.7394615411758423,
  0.7598586082458496,
  0.7892303466796875,
  0.7922219038009644,
  0.8131629228591919,
  0.833016037940979,
  0.8455262184143066,
  0.8773456811904907],
 'val_loss': [0.5797104239463806,
  0.5692185163497925,
  0.621886670589447,
  0.5350314974784851,
  0.5792214274406433,
  0.5414077639579773,
  0.520801842212677,
  0.49683448672294617,
  0.501284122467041,
  0.5782131552696228],
 'val_accuracy': [0.7167755961418152,
  0.6971677541732788,
  0.6492374539375305,
  0.7287581562995911,
  0.7080609798431396,
  0.7385621070861816,
  0.742919385433197,
  0.7636165618896484,
  0.7614378929138184,
  0.7352941036224365]}

In [11]:
# Median of training accuracy
statistics.median(result.history['accuracy'])

0.7907261252403259

In [12]:
# Standard deviation of training loss
# (statistics.stdev is the sample standard deviation, with an n - 1 denominator)
statistics.stdev(result.history['loss'])

0.0965853857147639
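The value above is the sample standard deviation. NumPy's np.std defaults to the population version (denominator n instead of n - 1), which gives a slightly smaller number, so it matters which one is compared against the homework options. A minimal sketch using only the standard library, with the loss values copied from the history output above:

```python
import statistics

# Training loss per epoch, copied from result.history['loss'] above
train_loss = [
    0.6512953042984009,
    0.5657526254653931,
    0.5348544716835022,
    0.5096314549446106,
    0.4719887673854828,
    0.46101152896881104,
    0.43318313360214233,
    0.4015069901943207,
    0.3677012026309967,
    0.329925537109375,
]

sample_std = statistics.stdev(train_loss)       # denominator n - 1
population_std = statistics.pstdev(train_loss)  # denominator n

print(f"sample:     {sample_std:.4f}")
print(f"population: {population_std:.4f}")
```

With 10 epochs the two differ by a factor of sqrt(9/10), roughly 5%.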

## Data Augmentation

Add the following augmentations to your training data generator:

- rotation_range=50,
- width_shift_range=0.1,
- height_shift_range=0.1,
- zoom_range=0.1,
- horizontal_flip=True,
- fill_mode='nearest'

In [13]:
# Data generator with augmentation (used for the training data only)
datagen_augmentation = ImageDataGenerator(
    rescale=1./255,
    rotation_range=50,
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=0.1,
    horizontal_flip=True,
    fill_mode='nearest'
)

In [14]:
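# Note: batch_size is 32 here, whereas the other generators use 20;
# shuffle is not set and defaults to True
train_augmentation = datagen_augmentation.flow_from_directory(
    './data/train/',
    target_size=TARGET_SIZE,
    batch_size=32,
    class_mode='binary'
)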

Found 3677 images belonging to 2 classes.


In [15]:
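# Continue training the same model (it is not re-created) for 10 more epochs,
# now on the augmented training data; the test generator is unchanged
result_augmentation = model.fit(
    train_augmentation,
    epochs=10,
    validation_data=test_ds
)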

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [16]:
result_augmentation.history

{'loss': [0.49508029222488403,
  0.4808209240436554,
  0.47093522548675537,
  0.4693944752216339,
  0.4588375389575958,
  0.46520674228668213,
  0.46692657470703125,
  0.45710718631744385,
  0.44762396812438965,
  0.4511086642742157],
 'accuracy': [0.7707369923591614,
  0.7799836993217468,
  0.7875986099243164,
  0.7856948375701904,
  0.7870546579360962,
  0.781887412071228,
  0.7810715436935425,
  0.7968452572822571,
  0.7957574129104614,
  0.7971172332763672],
 'val_loss': [0.5457229614257812,
  0.4659121036529541,
  0.47818687558174133,
  0.4620767831802368,
  0.46662330627441406,
  0.4705820381641388,
  0.4729148745536804,
  0.44878822565078735,
  0.45068982243537903,
  0.47412049770355225],
 'val_accuracy': [0.7625272274017334,
  0.7897603511810303,
  0.7734204530715942,
  0.7930282950401306,
  0.7777777910232544,
  0.7875816822052002,
  0.7788671255111694,
  0.7875816822052002,
  0.7886710166931152,
  0.7755991220474243]}

In [17]:
# Mean of test (validation) loss over the 10 epochs trained with augmentation
statistics.mean(result_augmentation.history['val_loss'])

0.47356174886226654

In [18]:
# Average of test accuracy for the last 5 epochs
statistics.mean(result_augmentation.history['val_accuracy'][-5:])

0.7836601257324218
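One way to read the two training runs is to compare the gap between training and test accuracy at the final epoch of each. The sketch below uses the last values of the two history outputs above; the helper name `generalization_gap` is hypothetical. Because the second run continues from the already trained model, this is only a rough indication and not a controlled comparison.

```python
# Final-epoch values copied from the two history outputs above
before = {"accuracy": 0.8773456811904907, "val_accuracy": 0.7352941036224365}
after = {"accuracy": 0.7971172332763672, "val_accuracy": 0.7755991220474243}


def generalization_gap(history_row):
    """Training accuracy minus test accuracy for one epoch."""
    return history_row["accuracy"] - history_row["val_accuracy"]


gap_before = generalization_gap(before)
gap_after = generalization_gap(after)

print(f"gap without augmentation: {gap_before:.3f}")  # 0.142
print(f"gap with augmentation:    {gap_after:.3f}")   # 0.022
```

The smaller gap after the augmented epochs is consistent with augmentation reducing overfitting, although training accuracy on augmented images is also lower because those images are harder to fit.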