# Convolutional Networks

+ Input image: 150x150 pixels
+ 32 filters of size 3x3 (each filter looks at a window of 3 pixels x 3 pixels)
+ The filter is moved across the image in strides; the default is stride = (1,1) pixels
+ A MaxPool2D pooling layer takes 2x2 windows of the feature map and keeps the maximum of each
+ Flatten layer: takes a tensor of whatever dimension and turns it into a 1D vector
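To make the three operations above concrete, here is a minimal NumPy sketch on a toy 6x6 single-channel image. The helper names `conv2d_valid` and `max_pool_2x2` are made up for this illustration and are not Keras functions; the real layers work on batches with many channels and learn the filter weights.

```python
import numpy as np

def conv2d_valid(image, kernel, stride=1):
    """Slide `kernel` over a 2D `image` without padding and return the feature map."""
    kh, kw = kernel.shape
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            window = image[i * stride:i * stride + kh, j * stride:j * stride + kw]
            out[i, j] = np.sum(window * kernel)
    return out

def max_pool_2x2(feature_map):
    """Keep the maximum of each non-overlapping 2x2 window."""
    h, w = feature_map.shape
    trimmed = feature_map[:h - h % 2, :w - w % 2]  # drop an odd last row/column
    return trimmed.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

image = np.arange(36, dtype=float).reshape(6, 6)  # toy 6x6 "image"
kernel = np.ones((3, 3)) / 9.0                    # toy 3x3 averaging filter

feature_map = conv2d_valid(image, kernel)  # shape (4, 4)
pooled = max_pool_2x2(feature_map)         # shape (2, 2)
flat = pooled.flatten()                    # shape (4,)

# The same size arithmetic applied to a 150x150 input: 150 -> 148 -> 74
conv_size = (150 - 3) // 1 + 1
pool_size = conv_size // 2
print(feature_map.shape, pooled.shape, flat.shape, conv_size, pool_size)
```

A 3x3 filter with stride 1 and no padding trims one pixel from each border, and 2x2 pooling halves each spatial dimension.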

## 1. Data Preparation

Download the [cats and dogs dataset](https://www.kaggle.com/c/dogs-vs-cats/data) from Kaggle or from here:

!wget https://github.com/alexeygrigorev/large-datasets/releases/download/dogs-cats/train.zip

In [1]:
import os
import shutil

import numpy as np

from tensorflow import keras

from tensorflow.keras.layers import Conv2D, Dense, MaxPool2D, Flatten, Dropout

from tensorflow.keras.preprocessing.image import ImageDataGenerator

#from tensorflow.keras.preprocessing.image import load_img


Create train and validation folders. In each of them, create `cat` and `dog` subfolders.

In [2]:
if False: # run this and the next 2 cells only once; change False to True to run them
    train_dir = './cats_dogs/train'
    val_dir = './cats_dogs/val'

    dog_train_dir = './cats_dogs/train/dog'
    dog_val_dir = './cats_dogs/val/dog'
    cat_train_dir = './cats_dogs/train/cat'
    cat_val_dir = './cats_dogs/val/cat'

    file_dirs = [train_dir, val_dir, dog_train_dir, dog_val_dir, cat_train_dir, cat_val_dir]

    for path in file_dirs:  # `path` avoids shadowing the built-in dir()
        os.makedirs(path, exist_ok=True)
    
    print(len(os.listdir(train_dir)))
    print(len(os.listdir(val_dir)))

Move the first 10,000 images of both cats and dogs (numbered 0 to 9999) into their respective folders inside the train folder. Move the remaining 2,500 images of each class (numbered 10000 to 12499) into the validation folder.

In [3]:
# Move the first 10,000 images (0 to 9999) of both cats and dogs into their respective train folders
if False:
    source_folder = './cats_dogs/train/'
    destination_folder = './cats_dogs/train/'

    # iterate files
    for spc in ['dog','cat']:
        for num in range(0,10000):
            # construct full file path
            source = source_folder + '{}.{}.jpg'.format(spc,num)
            destination = destination_folder + '{}/'.format(spc) + '{}.{}.jpg'.format(spc,num)
            # move file
            shutil.move(source, destination)
            # print('Moved:', '{}.{}.jpg'.format(spc,num)) 
    print(len(os.listdir(train_dir)))

In [4]:
# Move the remaining 2,500 images (10000 to 12499) of both cats and dogs into the val folder
if False:
    source_folder = './cats_dogs/train/'
    destination_folder = './cats_dogs/val/'

    # iterate files
    for spc in ['dog','cat']:
        for num in range(10000,12500):
            # construct full file path
            source = source_folder + '{}.{}.jpg'.format(spc,num)
            destination = destination_folder + '{}/'.format(spc) + '{}.{}.jpg'.format(spc,num)
            # move file
            shutil.move(source, destination)
            print('Moved:', '{}.{}.jpg'.format(spc,num))
    
    print(len(os.listdir(train_dir)))
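`os.listdir(train_dir)` counts the entries directly inside the folder, not the images in each class subfolder. A small helper can report the per-class counts instead. The name `count_images` is hypothetical, and the demo runs on a throwaway directory so that the sketch works without the dataset.

```python
import tempfile
from pathlib import Path

def count_images(root, pattern="*.jpg"):
    """Return {class_name: number of matching files} for each subfolder of root."""
    root = Path(root)
    return {
        sub.name: sum(1 for _ in sub.glob(pattern))
        for sub in sorted(root.iterdir())
        if sub.is_dir()
    }

# Demo on a temporary directory that mimics the cat/dog folder layout.
with tempfile.TemporaryDirectory() as tmp:
    for spc, n in [("cat", 3), ("dog", 2)]:
        class_dir = Path(tmp) / spc
        class_dir.mkdir()
        for num in range(n):
            (class_dir / f"{spc}.{num}.jpg").touch()
    demo_counts = count_images(tmp)

print(demo_counts)
```

After the split described above, `count_images('./cats_dogs/train')` would be expected to report 10,000 files per class and `count_images('./cats_dogs/val')` 2,500 per class.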

## 2. Model

We use a Convolutional Neural Network (CNN) in Keras. This is the model structure:


* The shape for input should be `(150, 150, 3)`
* Create a convolutional layer ([`Conv2D`](https://keras.io/api/layers/convolution_layers/convolution2d/)):
    * Use 32 filters
    * Kernel size should be `(3, 3)` (that's the size of the filter)
    * Use `'relu'` as activation 
* Reduce the size of the feature map with max pooling ([`MaxPooling2D`](https://keras.io/api/layers/pooling_layers/max_pooling2d/))
    * Set the pooling size to `(2, 2)`
* Turn the multi-dimensional result into vectors using a [`Flatten`](https://keras.io/api/layers/reshaping_layers/flatten/) layer
* Add a `Dense` layer with 64 neurons and `'relu'` activation
* Create the `Dense` layer with 1 neuron - this will be the output
    * The output layer should have an activation; sigmoid is the appropriate choice for binary classification ([Ref](https://ecwuuuuu.com/post/sigmoid-softmax-binary-class/))
            
The sigmoid function can be rewritten in softmax form: a sigmoid output is the special case of a two-node softmax in which one node's logit is fixed at zero, so only one node does any work.
Architecturally the two setups are clearly different, although there is no empirical result showing that one is better than the other. What is clear is that the softmax version has more parameters to learn. So I think that is why people usually use one output neuron with a sigmoid activation for binary classification.
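The relationship between sigmoid and a two-node softmax can be checked numerically. This is a small sketch in plain Python; the helper functions are written out for illustration and are not the Keras implementations.

```python
import math

def sigmoid(z):
    """Logistic function: maps a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Compare sigmoid(z) with the first output of softmax([z, 0]):
# the second node has its logit fixed at zero.
pairs = [(sigmoid(z), softmax([z, 0.0])[0]) for z in (-2.0, 0.0, 0.5, 3.0)]
for s, t in pairs:
    print(f"sigmoid={s:.6f}  softmax[0]={t:.6f}")
```

Both columns agree because exp(z) / (exp(z) + exp(0)) simplifies to 1 / (1 + exp(-z)).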


As optimizer use [`SGD`](https://keras.io/api/optimizers/sgd/) with the following parameters:

* `SGD(learning_rate=0.002, momentum=0.8)` (older Keras versions also accepted `lr` as an alias for `learning_rate`)

In [5]:
def make_model(input_size=150, learning_rate=0.002, size_inner=64):#,droprate=0.5):

    #########################################

    # Input: RGB images of size input_size x input_size
    inputs = keras.Input(shape=(input_size, input_size, 3))
    # 32 filters of size 3x3; the input shape is already fixed by keras.Input
    conv = keras.layers.Conv2D(32, (3,3), activation='relu')(inputs)
    
    tensors = keras.layers.MaxPooling2D(pool_size=(2, 2))(conv)
    vectors = keras.layers.Flatten()(tensors)
    
    inner = keras.layers.Dense(size_inner, activation='relu')(vectors)
    # drop = keras.layers.Dropout(droprate)(inner)
    
    # Single sigmoid unit: predicted probability of class 1
    outputs = keras.layers.Dense(1, activation='sigmoid')(inner) #(drop)
    
    model = keras.Model(inputs, outputs)
    
    #########################################

    # `learning_rate` replaces the deprecated `lr` argument; the default is 0.002
    optimizer = keras.optimizers.SGD(learning_rate=learning_rate, momentum=0.8)
    # The output layer already applies sigmoid, so the loss receives probabilities, not logits
    loss = keras.losses.BinaryCrossentropy(from_logits=False)

    model.compile(optimizer=optimizer, loss=loss, metrics=['accuracy'])
    
    return model

In [6]:
make_model().summary()


Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_1 (InputLayer)         [(None, 150, 150, 3)]     0         
_________________________________________________________________
conv2d (Conv2D)              (None, 148, 148, 32)      896       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 74, 74, 32)        0         
_________________________________________________________________
flatten (Flatten)            (None, 175232)            0         
_________________________________________________________________
dense (Dense)                (None, 64)                11214912  
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 65        
Total params: 11,215,873
Trainable params: 11,215,873
Non-trainable params: 0
_________________________________________________



Since this is a binary classification problem, we use `BinaryCrossentropy` as the loss function.
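As a rough illustration of what this loss measures, here is a plain-Python sketch of binary cross-entropy. The function name and the clipping constant `eps` are assumptions for this example; Keras's own implementation may differ in details.

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

labels = [0, 1, 1, 0]
confident = binary_crossentropy(labels, [0.1, 0.9, 0.8, 0.2])  # mostly right
undecided = binary_crossentropy(labels, [0.5, 0.5, 0.5, 0.5])  # always 0.5

print(round(confident, 3), round(undecided, 3))
```

A model that always predicts 0.5 scores ln 2 (about 0.693), which is a useful baseline when reading loss values for a two-class problem.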

The model has 11,215,873 parameters.
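The parameter counts in the summary can be reproduced by hand. The sketch below only uses the layer sizes stated above; the variable names are chosen for this illustration.

```python
input_size, channels = 150, 3
filters, kernel = 32, 3
size_inner = 64

# Conv2D: one (kernel x kernel x channels) weight block plus one bias per filter
conv_params = (kernel * kernel * channels + 1) * filters

# Spatial sizes: 3x3 convolution without padding, then 2x2 max pooling
conv_out = input_size - kernel + 1            # 148
pool_out = conv_out // 2                      # 74
flat_units = pool_out * pool_out * filters    # length of the flattened vector

# Dense layers: weights plus one bias per neuron
dense_params = flat_units * size_inner + size_inner
output_params = size_inner * 1 + 1

total_params = conv_params + dense_params + output_params
print(conv_params, flat_units, dense_params, output_params, total_params)
```

Almost all of the parameters sit in the first `Dense` layer, because it is connected to every element of the flattened feature map.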

### Generators and Training

We use the following data generator for both train and validation:

```python
ImageDataGenerator(rescale=1./255)
```
* We don't need any additional pre-processing for the images.
* We use `class_mode='binary'` and `shuffle=True` for both training and validation, with `batch_size=20` for training. Note that the validation cell below sets `batch_size=10`.

In [7]:
# TRAIN
train_dir = './cats_dogs/train' # in case the corresponding cell above is set False
train_gen = ImageDataGenerator(rescale=1./255)
train_ds = train_gen.flow_from_directory(
    train_dir,
    batch_size=20,
    class_mode='binary',
    target_size=(150, 150),
    shuffle=True
)
train_ds.class_indices # mapping from class name to integer label

Found 20000 images belonging to 2 classes.


{'cat': 0, 'dog': 1}

In [8]:
# VAL
val_dir = './cats_dogs/val' # in case the corresponding cell above is set False
val_gen = ImageDataGenerator(rescale=1./255)
val_ds = val_gen.flow_from_directory(
    val_dir,
    batch_size = 10,
    class_mode = 'binary',
    target_size=(150, 150),
    shuffle=True
)
val_ds.class_indices # mapping from class name to integer label

Found 5000 images belonging to 2 classes.


{'cat': 0, 'dog': 1}

We train with `.fit()` with the following params:

```python
model.fit(
    train_generator,
    steps_per_epoch=100,
    epochs=10,
    validation_data=validation_generator,
    validation_steps=50
)
```

Note `validation_steps=50` - this parameter says "run only 50 steps on the validation data for evaluating the results". 
This way we iterate a bit faster, but don't use the entire validation dataset.
That's why it's important to shuffle the validation dataset as well.
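To see how much data each epoch actually touches, multiply the number of steps by the batch size. The sketch below uses the image counts reported by `flow_from_directory` and the batch sizes set in the generator cells above; the variable names are chosen for this illustration.

```python
train_images, val_images = 20000, 5000   # counts reported by flow_from_directory
train_batch, val_batch = 20, 10          # batch sizes set in the generator cells
steps_per_epoch, validation_steps = 100, 50

train_seen = steps_per_epoch * train_batch   # training images used per epoch
val_seen = validation_steps * val_batch      # validation images used per evaluation

train_fraction = train_seen / train_images
val_fraction = val_seen / val_images
print(train_seen, val_seen, train_fraction, val_fraction)
```

With these settings each validation pass looks at only a fraction of the validation images, so an unshuffled generator could end up evaluating on a single class.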

In [9]:
# Fit data to model  (10 epochs took about 6min)
model = make_model()
history = model.fit(
            train_ds,
            steps_per_epoch=100,
            epochs=10,
            validation_data=val_ds,
            validation_steps=50
            )



Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


The median of training accuracy for this model is:

In [10]:
np.median(history.history['accuracy']).round(2)

0.56

The standard deviation of training loss for this model is:

In [11]:
np.std(history.history['loss']).round(2)

0.01

### Data Augmentation

We'll now generate more data using data augmentations. 

We add the following augmentations to the training data generator:

* `rotation_range=40,`
* `width_shift_range=0.2,`
* `height_shift_range=0.2,`
* `shear_range=0.2,`
* `zoom_range=0.2,`
* `horizontal_flip=True,`
* `fill_mode='nearest'`
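To get a feel for what these settings mean for 150x150 inputs, the ranges can be turned into concrete bounds. This sketch assumes the usual reading of the `ImageDataGenerator` arguments: rotation in degrees, shifts as a fraction of the image size, and zoom as a factor around 1. The variable names are chosen for this illustration.

```python
image_size = 150

rotation_range = 40
width_shift_range = 0.2
height_shift_range = 0.2
zoom_range = 0.2

# Random rotation angle is drawn from [-40, 40] degrees
rotation_bounds = (-rotation_range, rotation_range)

# Fractional shifts translate to a maximum offset in pixels
max_width_shift_px = width_shift_range * image_size
max_height_shift_px = height_shift_range * image_size

# A scalar zoom_range z means a zoom factor in [1 - z, 1 + z]
zoom_bounds = (1 - zoom_range, 1 + zoom_range)

print(rotation_bounds, max_width_shift_px, max_height_shift_px, zoom_bounds)
```

Pixels that fall outside the original image after such a transform are filled according to `fill_mode`, here by repeating the nearest edge pixel.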

In [13]:
# TRAIN
train_dir = './cats_dogs/train' # in case the corresponding cell above is set False
train_gen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40, # rotate each image by a random angle in the range [-40, 40] degrees
    width_shift_range=0.2, # shift horizontally by up to 20% of the image width
    height_shift_range=0.2, # shift vertically by up to 20% of the image height
    shear_range=0.2, # shear intensity (shear angle in degrees)
    zoom_range=0.2, # zoom by a random factor in the range [0.8, 1.2]
    horizontal_flip=True,
    fill_mode='nearest'
)

train_ds = train_gen.flow_from_directory(
    train_dir,
    batch_size=20,
    class_mode='binary',
    target_size=(150, 150),
    shuffle=True
)

Found 20000 images belonging to 2 classes.


Let's train our model for 10 more epochs using the same code as previously. We don't re-create the model, because we want to continue training the model we already started training. We don't need to recompile it either, but even if you compile again, it doesn't reset the weights you trained previously (calling `make_model()` again does create a new, untrained model, though). Note that the cell below does call `make_model()`, so as written it trains a fresh model from scratch; drop that line to continue training the earlier one.

In [14]:
# Fit data to model  (10 epochs took about 6min) 
model = make_model()
history = model.fit(
            train_ds,
            steps_per_epoch=100,
            epochs=10,
            validation_data=val_ds,
            validation_steps=50
            )

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


The mean of validation loss for the model trained with augmentations is:

In [15]:
np.mean(history.history['val_loss']).round(2)

0.68

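The average validation accuracy over the last 5 epochs (6 to 10) for the model trained with augmentations is given below. With zero-based indexing these epochs are the slice `[5:10]`; the slice `[6:10]` covers only epochs 7 to 10.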

In [18]:
# [6:10] selects only 4 values (epochs 7 to 10)
np.average(history.history['val_accuracy'][6:10]).round(3)

0.572

In [19]:
# [5:10] selects the last 5 epochs (6 to 10)
np.average(history.history['val_accuracy'][5:10]).round(3)

0.57