In [2]:
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator

In [3]:
tf.__version__

'2.4.1'

# Data Preprocessing

First we preprocess the training set, then the test set. The two are processed differently.

We MUST apply transformations to the training set, otherwise the model is likely to overfit massively. The random spatial variations stop it from memorising the exact training images.

We will do simple shears, zooms and horizontal flips. This is Image Augmentation.

`ImageDataGenerator` is great as it gives us batches of data in real time with the data aug we pass it. Awesome! The augmentation settings are hyperparameters that can and will need to be tuned from project to project.
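To make the idea concrete, here is a minimal NumPy sketch of one random transform, a horizontal flip applied per image. `random_horizontal_flip` is a hypothetical helper written for illustration; it is not what `ImageDataGenerator` runs internally.

```python
import numpy as np

def random_horizontal_flip(img, rng, p=0.5):
    """Flip a (height, width, channels) image left-to-right with probability p."""
    if rng.random() < p:
        return img[:, ::-1, :]  # reverse the width axis
    return img

rng = np.random.default_rng(0)
img = np.arange(2 * 3 * 1).reshape(2, 3, 1)  # tiny fake image, 2 rows x 3 columns

flipped = random_horizontal_flip(img, rng, p=1.0)  # p=1.0 forces the flip
print(img[:, :, 0])
print(flipped[:, :, 0])
```

Because the flip is drawn again every time an image is served, the network rarely sees exactly the same pixels twice across epochs.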

Lol this course instructor guy really does just focus on speed. Low quality but just enough that actually I am learning something useful.

In [4]:
# Random augmentations (plus rescaling) applied to our training images
train_datagen = ImageDataGenerator(
        rescale=1./255,      # multiply pixels by 1/255 so values land in [0, 1]
        shear_range=0.2,     # random shear (shear angle, counter-clockwise)
        zoom_range=0.2,      # random zoom in the range [0.8, 1.2]
        horizontal_flip=True # randomly flip horizontally
        )

Note that we *must* rescale our images. NNs generally train much better when the inputs are scaled to a small range such as [0, 1] before being fed in.
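As a quick illustration of what `rescale=1./255` amounts to, the sketch below applies the same multiplication to a made-up 8-bit array. The `pixels` values are invented for the example.

```python
import numpy as np

# Fake 8-bit image: pixel values are integers in [0, 255]
pixels = np.array([[0, 64], [128, 255]], dtype=np.uint8)

# Same arithmetic as rescale=1./255: multiply every pixel by 1/255
scaled = pixels.astype(np.float32) * (1.0 / 255)

print(scaled.min(), scaled.max())
```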

In [5]:
# Take path to a dir and generate batches of augmented data
train_generator = train_datagen.flow_from_directory(
        'dataset/training_set',
        target_size=(64, 64), # size of images when fed into CNN
        batch_size=32,        # number of images per batch, 32 is classic
        class_mode='binary')

Found 8000 images belonging to 2 classes.


Note that a smaller target size will speed up training. We want to get it as small as possible while still getting good results.
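A rough way to see why size matters is to count how many numbers the network has to read per image. This is only a back-of-the-envelope sketch; `values_per_image` is a hypothetical helper, and real training time depends on much more than input size.

```python
def values_per_image(height, width, channels=3):
    """Number of input values the network reads for one image."""
    return height * width * channels

for side in (32, 64, 128, 256):
    print(side, values_per_image(side, side))
```

Doubling the side length quadruples the number of input values, so going from 64x64 to 128x128 is a lot more work per image.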

In [6]:
# Only apply normalization, don't apply data aug to test set
test_datagen = ImageDataGenerator(rescale=1./255)

# Import test set images
test_generator = test_datagen.flow_from_directory(
        'dataset/test_set',
        target_size=(64, 64),
        batch_size=32,
        class_mode='binary')

Found 2000 images belonging to 2 classes.
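With the image counts reported above and `batch_size=32`, we can work out how many batches make up one pass over each set. A small sketch, assuming the final partial batch is still served (`batches_per_epoch` is a hypothetical helper):

```python
import math

def batches_per_epoch(num_images, batch_size):
    """Batches needed to cover every image once; the last batch may be smaller."""
    return math.ceil(num_images / batch_size)

train_steps = batches_per_epoch(8000, 32)  # 8000 / 32 divides evenly
test_steps = batches_per_epoch(2000, 32)   # 62 full batches plus one batch of 16
print(train_steps, test_steps)
```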


And now data processing is done! Time to build the CNN.

# Building the CNN

In [7]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (
    Conv2D, 
    MaxPool2D,
    Flatten,
    Dense
)

In [8]:
cnn = Sequential([
    Conv2D(filters=32, kernel_size=3, activation='relu', input_shape=(64, 64, 3)),
    MaxPool2D(pool_size=2, strides=2),
    Conv2D(filters=32, kernel_size=3, activation='relu'),
    MaxPool2D(pool_size=2, strides=2),
    Flatten(),
    Dense(128, activation='relu'),
    Dense(1, activation='sigmoid')
])

* `filters` is the number of feature detectors we want the layer to have. A classic choice is 32 in the first conv layer and 32 again in the second. Has been shown to work well.
* `kernel_size` is the size of the window, e.g. 3 means 3x3. You can pass a tuple such as `(5, 4)` to have non-square kernels. Same thing with `pool_size` in `MaxPool2D` layers.
* `strides` is the step size the window moves by between positions.
* `input_shape`: you should *always* specify the input shape of your images. I think this will be important for the TF exam, make sure you remember this. Since we resized the images to 64x64 in preprocessing and they are coloured (3 channels), we know the answer is `(64, 64, 3)`.

* Note on `padding` in `MaxPool2D`: Hadelin tried both the default and the alternative setting and said it didn't change the results at all. Huh.
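To check our understanding of `kernel_size`, `strides` and `input_shape`, we can trace the spatial size through the layers by hand. This sketch assumes the default `'valid'` (no padding) behaviour and a conv stride of 1; `conv_or_pool_out` is a hypothetical helper.

```python
def conv_or_pool_out(size, window, stride=1):
    """Output length along one axis with no padding ('valid')."""
    return (size - window) // stride + 1

size = 64
size = conv_or_pool_out(size, window=3)            # Conv2D 3x3       -> 62
size = conv_or_pool_out(size, window=2, stride=2)  # MaxPool2D 2x2, s2 -> 31
size = conv_or_pool_out(size, window=3)            # Conv2D 3x3       -> 29
size = conv_or_pool_out(size, window=2, stride=2)  # MaxPool2D 2x2, s2 -> 14
flat = size * size * 32                            # Flatten (32 filters)

# Trainable parameters: weights plus one bias per filter / unit
conv1 = (3 * 3 * 3 + 1) * 32
conv2 = (3 * 3 * 32 + 1) * 32
dense = (flat + 1) * 128
output = (128 + 1) * 1
total = conv1 + conv2 + dense + output
print(flat, total)
```

These numbers can be compared against `cnn.summary()`. Note that the second pooling layer sees an odd size (29), so with `padding='same'` it would produce 15 instead of 14. That is the only place in this network where the pooling padding changes a shape, which may be part of why it made so little difference.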

# Training the CNN

In [9]:
cnn.compile(optimizer='adam', 
            loss='binary_crossentropy',
            metrics=['accuracy'])
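For intuition about the `binary_crossentropy` loss, here is a minimal NumPy sketch of the formula. It is not Keras's implementation; the `eps` clipping is an assumption added to avoid `log(0)`.

```python
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.clip(np.asarray(y_pred, dtype=np.float64), eps, 1 - eps)
    losses = -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
    return float(np.mean(losses))

confident_right = binary_crossentropy([1, 0], [0.9, 0.1])
confident_wrong = binary_crossentropy([1, 0], [0.1, 0.9])
print(confident_right, confident_wrong)
```

Confident wrong answers are punished far more heavily than confident right answers are rewarded, which is what pushes the sigmoid output towards the correct side.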

In [10]:
cnn.fit(x=train_generator,
        validation_data=test_generator,
        epochs=25)

Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25
Epoch 14/25
Epoch 15/25
Epoch 16/25
Epoch 17/25
Epoch 18/25
Epoch 19/25
Epoch 20/25
Epoch 21/25
Epoch 22/25
Epoch 23/25
Epoch 24/25
Epoch 25/25


<tensorflow.python.keras.callbacks.History at 0x107ae43a0>

# Making a Single Prediction

It's nice to do this... I guess the point is that we need to know how to make single predictions in production?

In [11]:
import numpy as np
from tensorflow.keras.preprocessing import image

In [48]:
# Loads as PIL image format in size specified
# (alternative image: 'dataset/single_prediction/cat_or_dog_2.jpg')
test_image = image.load_img(
    'dataset/test_set/cats/cat.4607.jpg',
    target_size=(64, 64)
)

# Now convert to a NumPy array of shape (64, 64, 3); values are still in [0, 255]
test_image = image.img_to_array(test_image)

Remember that everything in training is a batch, thus everything in predict must also be a batch even if it's just a batch of 1. So add the extra dimension.

In [49]:
# Add batch dimension of 1
test_image = np.expand_dims(test_image, axis=0)
# Can also do test_image.reshape(1, 64, 64, 3) but the former means we can
# forget about the other dim sizes.

In [50]:
result = cnn.predict(test_image)
result

array([[0.]], dtype=float32)
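One thing to double-check here: both generators rescale pixels by 1/255, but the array passed to `predict` above is still in the 0 to 255 range that `img_to_array` returns. Unscaled inputs can saturate the sigmoid, which may be why the output is exactly 0. A minimal sketch of matching the training preprocessing, using a hypothetical `prepare_for_predict` helper and a fake image in place of the real one:

```python
import numpy as np

def prepare_for_predict(img_array):
    """Rescale to [0, 1] like the generators, then add a batch axis."""
    scaled = np.asarray(img_array, dtype=np.float32) / 255.0
    return np.expand_dims(scaled, axis=0)

# Stand-in for the (64, 64, 3) array that img_to_array returns
fake_image = np.full((64, 64, 3), 255.0, dtype=np.float32)

batch = prepare_for_predict(fake_image)
print(batch.shape, batch.max())
```

The result of a call like `prepare_for_predict(...)` on the un-batched image array is what would be handed to `cnn.predict`.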

In [51]:
train_generator.class_indices

{'cats': 0, 'dogs': 1}

In [52]:
# Sigmoid output is a float in [0, 1], so threshold at 0.5 rather than testing == 1
if result[0][0] > 0.5:
    prediction = 'dog'
else:
    prediction = 'cat'
prediction

'cat'
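Since the sigmoid output is a float between 0 and 1, a common approach is to threshold it at 0.5 and look the class name up from `class_indices` instead of hard-coding it. A small sketch; `label_from_probability` and the 0.5 threshold are assumptions, and 0.83 is a made-up value used only to exercise the other branch.

```python
def label_from_probability(prob, class_indices, threshold=0.5):
    """Map a sigmoid output to a class name using the generator's class_indices."""
    index_to_class = {index: name for name, index in class_indices.items()}
    predicted_index = 1 if prob > threshold else 0
    return index_to_class[predicted_index]

class_indices = {'cats': 0, 'dogs': 1}  # as printed by train_generator.class_indices

print(label_from_probability(0.0, class_indices))
print(label_from_probability(0.83, class_indices))
```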