### Understanding CNNs

Input image -> CNN -> Output label (image class)

The convolution step in a CNN is the discrete analogue of the convolution of two integrable functions.

Convolution:
- Given an input image and a feature detector (filter), we slide the filter over the image, multiply it element-wise with each patch it covers, and sum the products
- The result is a feature map
- In the simplest case, we have a "stride" or step-size of 1 pixel
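
The sliding multiply-and-sum described above can be sketched in plain Python. This is a minimal illustration, not how Keras implements it: the function name `convolve2d`, the toy image, and the vertical-edge kernel are all assumptions made for the example. Like most deep-learning libraries, it does not flip the kernel, so strictly speaking it computes a cross-correlation.

```python
def convolve2d(image, kernel, stride=1):
    """Slide `kernel` over `image` (no padding) and return the feature map."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = (len(image) - k_rows) // stride + 1
    out_cols = (len(image[0]) - k_cols) // stride + 1
    feature_map = []
    for i in range(out_rows):
        row = []
        for j in range(out_cols):
            # Multiply the kernel element-wise with the patch under it, then sum
            total = 0
            for a in range(k_rows):
                for b in range(k_cols):
                    total += image[i * stride + a][j * stride + b] * kernel[a][b]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Toy 4x4 image with a bright vertical band in the middle (illustrative data)
image = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]
# Hypothetical 3x3 vertical-edge detector
kernel = [
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1],
]
feature_map = convolve2d(image, kernel)
print(feature_map)  # [[-3, 3], [-3, 3]]
```

With a stride of 1 the 4x4 input and 3x3 filter give a 2x2 feature map; a larger stride skips positions and shrinks the map further.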

ReLU (rectified linear unit) layer:
- Rectifier activation function: max(x, 0)
- Adds non-linearity
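
As a small sketch of the rectifier, the function below applies max(x, 0) to every entry of a feature map. The name `relu` and the sample values are illustrative assumptions.

```python
def relu(feature_map):
    """Replace every negative value in a 2D feature map with 0."""
    return [[max(value, 0) for value in row] for row in feature_map]


print(relu([[-3, 3], [-3, 3]]))  # [[0, 3], [0, 3]]
```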

Max Pooling:
- When recognizing features in an image, the neural network should be flexible enough to tolerate distortions (such as small shifts or rotations) in those features
- Different types of pooling: max, min, sum etc.
- Pooling is a form of non-linear down-sampling or compression
- Max pooling partitions the input image into a set of non-overlapping rectangles and for each sub-region, outputs the maximum value
- Pooling layers progressively reduce the spatial size of the representation, which reduces the amount of computation and the number of parameters needed in later layers. This also helps control overfitting.
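
A minimal sketch of max pooling over non-overlapping 2x2 windows, assuming a pool size and stride of 2. The function name `max_pool2d` and the sample feature map are made up for illustration.

```python
def max_pool2d(feature_map, pool_size=2, stride=2):
    """Return the maximum of each pool_size x pool_size window."""
    out_rows = (len(feature_map) - pool_size) // stride + 1
    out_cols = (len(feature_map[0]) - pool_size) // stride + 1
    pooled = []
    for i in range(out_rows):
        row = []
        for j in range(out_cols):
            # Collect the values in the current window and keep the largest
            window = [
                feature_map[i * stride + a][j * stride + b]
                for a in range(pool_size)
                for b in range(pool_size)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


feature_map = [
    [1, 0, 2, 3],
    [4, 6, 6, 8],
    [3, 1, 1, 0],
    [1, 2, 2, 4],
]
pooled = max_pool2d(feature_map)
print(pooled)  # [[6, 8], [3, 4]]
```

The 4x4 map shrinks to 2x2, and shifting a strong activation by one pixel inside a window leaves the pooled output unchanged, which is the flexibility described above.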

Flattening:
- The process of converting all resultant 2D arrays into a single long continuous linear vector
- This becomes useful for classification
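
Flattening can be sketched as concatenating every pooled feature map into one list. This is a simplification: the helper `flatten` and the sample maps are assumptions, and Keras orders the values by row, column and channel rather than map by map.

```python
def flatten(feature_maps):
    """Concatenate a list of 2D feature maps into a single 1D vector."""
    return [value for fmap in feature_maps for row in fmap for value in row]


# Two hypothetical 2x2 pooled feature maps
pooled_maps = [
    [[6, 8], [3, 4]],
    [[1, 0], [2, 5]],
]
vector = flatten(pooled_maps)
print(vector)  # [6, 8, 3, 4, 1, 0, 2, 5]
```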

Full Connection:
- Following the flattening process, we now have inputs for an artificial neural network
- The goal is to classify the image (for example to identify what is being illustrated in the image)
- Each classification is made with an associated probability
- In CNNs, we have an associated loss function that measures how far off the prediction is from the truth
- We use the cross-entropy function as our loss function
- We then use backpropagation and optimization techniques to find the weights that minimize the loss function


SoftMax & Cross-Entropy:
- Normally, in the classification process, the raw outputs of the final layer can be any real values
- We apply the softmax function to these outputs to normalize them to the (0, 1) range so that they sum to 1 and can be read as class probabilities
- The cross-entropy function acts as our objective function that needs to be minimized
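
To make the two functions concrete, here is a small sketch using only the standard library. The function names and the sample raw outputs are illustrative assumptions; subtracting the maximum before exponentiating is a common numerical-stability trick and does not change the result.

```python
import math


def softmax(logits):
    """Map arbitrary real values to probabilities in (0, 1) that sum to 1."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]  # shift for stability
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probabilities, true_index):
    """Loss for one sample: negative log-probability of the true class."""
    return -math.log(probabilities[true_index])


logits = [2.0, 1.0, 0.1]          # hypothetical raw network outputs
probs = softmax(logits)
loss_if_class_0 = cross_entropy(probs, 0)
loss_if_class_2 = cross_entropy(probs, 2)
print(probs, loss_if_class_0, loss_if_class_2)
```

The loss is small when the true class received a high probability and grows as that probability approaches 0, which is why minimizing it pushes the network toward confident, correct predictions.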


### Importing libraries

In [4]:
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator

In [5]:
tf.__version__

'2.2.0'

### Pre-processing the training set

In [9]:
# The dataset contains 4000 pictures of cats and 4000 pictures of dogs;
# the goal is for our CNN to tell them apart.
# We pre-process the training set to limit overfitting.

# We apply random geometric transformations (here: shear, zoom and
# horizontal flips); this is called image augmentation.

# train_datagen is an instance of ImageDataGenerator
# rescale = 1./255 is feature scaling: pixel values go from [0, 255] to [0, 1]
train_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)

# Import the training set
# Re-size the images to reduce computational intensity
# class mode is binary because we only have cat/dog outcome
training_set = train_datagen.flow_from_directory('CNN dataset/training_set',
                                                 target_size = (64, 64),
                                                 batch_size = 32,
                                                 class_mode = 'binary')

Found 8000 images belonging to 2 classes.
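
To illustrate what two of these settings do to the pixel values, here is a toy sketch on a tiny single-channel "image". The helpers `rescale` and `horizontal_flip` are hypothetical stand-ins, not the ImageDataGenerator implementation, which also applies flips, shear and zoom at random.

```python
def rescale(image, factor=1.0 / 255):
    """Scale every pixel, e.g. from the [0, 255] range to [0, 1]."""
    return [[pixel * factor for pixel in row] for row in image]


def horizontal_flip(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


# Hypothetical 2x3 grayscale image
image = [
    [0, 51, 255],
    [102, 204, 153],
]
scaled = rescale(image)
flipped = horizontal_flip(image)
print(flipped)  # [[255, 51, 0], [153, 204, 102]]
```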


### Pre-processing the test set

In [10]:
test_datagen = ImageDataGenerator(rescale = 1./255)
test_set = test_datagen.flow_from_directory('CNN dataset/test_set',
                                            target_size = (64, 64),
                                            batch_size = 32,
                                            class_mode = 'binary')

Found 2000 images belonging to 2 classes.


### Initializing the CNN

In [11]:
# we will again create the CNN as a sequence of layers 
# instead of computational graph

cnn = tf.keras.models.Sequential()

### Convolution

In [12]:
# We add a 2D convolutional layer
# filters = the number of output filters in the convolution
# kernel_size = dimensions of the convolution window (3 means 3x3)
# input_shape = our images were resized to 64x64 and have 3 RGB channels
cnn.add(tf.keras.layers.Conv2D(filters=32, kernel_size=3, 
                               activation="relu",
                               input_shape=[64, 64, 3]))

### Pooling

In [13]:
cnn.add(tf.keras.layers.MaxPool2D(pool_size=2, strides=2))

### Adding a second convolution layer

In [14]:
cnn.add(tf.keras.layers.Conv2D(filters=32, kernel_size=3, activation='relu'))
cnn.add(tf.keras.layers.MaxPool2D(pool_size=2, strides=2))

### Flattening

In [15]:
cnn.add(tf.keras.layers.Flatten()) 
# To produce a single continuous vector to serve as input
# for our ANN
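
As a sanity check on the layers added so far, the length of the flattened vector can be worked out by hand. The sketch below assumes the Keras default of no padding (`padding='valid'`) for both the convolution and pooling layers; `output_size` is a helper written for this example.

```python
def output_size(size, window, stride=1):
    """Spatial size after sliding a window with no padding."""
    return (size - window) // stride + 1


size = 64                                   # input images are 64x64
size = output_size(size, window=3)          # first Conv2D, 3x3 kernel
after_conv1 = size
size = output_size(size, window=2, stride=2)  # first MaxPool2D
after_pool1 = size
size = output_size(size, window=3)          # second Conv2D
after_conv2 = size
size = output_size(size, window=2, stride=2)  # second MaxPool2D
after_pool2 = size

filters = 32
flattened_length = after_pool2 * after_pool2 * filters
print(after_conv1, after_pool1, after_conv2, after_pool2)  # 62 31 29 14
print(flattened_length)  # 6272
```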

### Full connection

In [16]:
cnn.add(tf.keras.layers.Dense(units=128, activation='relu'))

### Output layer

In [17]:
cnn.add(tf.keras.layers.Dense(units=1, activation='sigmoid'))
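
The number of trainable parameters in each layer can be estimated with simple arithmetic. This sketch assumes one bias per filter or unit and a flattened vector of 14 * 14 * 32 = 6272 values (no padding in the convolution and pooling layers); the helper names are made up for the example.

```python
def conv_params(kernel_size, in_channels, filters):
    """Weights of each kernel plus one bias per filter."""
    return (kernel_size * kernel_size * in_channels + 1) * filters


def dense_params(inputs, units):
    """One weight per input per unit, plus one bias per unit."""
    return (inputs + 1) * units


conv1 = conv_params(kernel_size=3, in_channels=3, filters=32)
conv2 = conv_params(kernel_size=3, in_channels=32, filters=32)
hidden = dense_params(inputs=6272, units=128)
output = dense_params(inputs=128, units=1)
total = conv1 + conv2 + hidden + output
print(conv1, conv2, hidden, output)  # 896 9248 802944 129
print(total)  # 813217
```

Almost all of the parameters sit in the first fully connected layer, which is one reason pooling before flattening matters.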

### Compiling the CNN

In [18]:
cnn.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
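
For a single sigmoid output, the binary cross-entropy loss can be sketched as follows. The function names are illustrative, and the clipping constant `eps` is an assumption added to avoid taking log(0); it is not claimed to match the Keras implementation exactly.

```python
import math


def sigmoid(z):
    """Squash a real value into the (0, 1) range."""
    return 1.0 / (1.0 + math.exp(-z))


def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Loss for one sample with label y_true in {0, 1}."""
    y_pred = min(max(y_pred, eps), 1.0 - eps)  # keep log() finite
    return -(y_true * math.log(y_pred) + (1 - y_true) * math.log(1.0 - y_pred))


confident_right = binary_cross_entropy(1, 0.9)
confident_wrong = binary_cross_entropy(1, 0.1)
print(sigmoid(0.0), confident_right, confident_wrong)
```

A confident wrong prediction is penalized far more heavily than a confident correct one, which is the signal the optimizer uses to adjust the weights.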

### Training and evaluating the CNN

In [19]:
cnn.fit(x = training_set, validation_data = test_set, epochs = 25)

Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25
Epoch 14/25
Epoch 15/25
Epoch 16/25
Epoch 17/25
Epoch 18/25
Epoch 19/25
Epoch 20/25
Epoch 21/25
Epoch 22/25
Epoch 23/25
Epoch 24/25
Epoch 25/25


<tensorflow.python.keras.callbacks.History at 0x7f93ce1db0d0>

### Making a single prediction

In [25]:
import numpy as np
from tensorflow.keras.preprocessing import image
test_image_dog = image.load_img('CNN dataset/single_prediction/cat_or_dog_1.jpg', target_size = (64, 64))
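test_image_dog = image.img_to_array(test_image_dog)
# Apply the same 1/255 rescaling that the generators applied during training
test_image_dog = test_image_dog / 255.0
# Add a batch dimension: (64, 64, 3) -> (1, 64, 64, 3)
test_image_dog = np.expand_dims(test_image_dog, axis = 0)
result = cnn.predict(test_image_dog)
training_set.class_indices  # shows which class is encoded as 1
# The sigmoid output is a probability, so threshold it instead of testing == 1
if result[0][0] > 0.5: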
    prediction = "dog"
else:
    prediction = "cat"

In [26]:
print(prediction)

dog


In [27]:
test_image_cat = image.load_img('CNN dataset/single_prediction/cat_or_dog_2.jpg', target_size = (64, 64))
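test_image_cat = image.img_to_array(test_image_cat)
# Apply the same 1/255 rescaling that the generators applied during training
test_image_cat = test_image_cat / 255.0
# Add a batch dimension: (64, 64, 3) -> (1, 64, 64, 3)
test_image_cat = np.expand_dims(test_image_cat, axis = 0)
result = cnn.predict(test_image_cat)
training_set.class_indices  # shows which class is encoded as 1
# The sigmoid output is a probability, so threshold it instead of testing == 1
if result[0][0] > 0.5: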
    prediction = "dog"
else:
    prediction = "cat"

In [28]:
print(prediction)

dog
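
The mapping from the network's output to a label can be written as a small helper instead of repeating the if/else in each cell. This is a sketch under assumptions: `decode_prediction` is a hypothetical function, and the `class_indices` dictionary below assumes the training folders are named `cats` and `dogs`, which the notebook does not show.

```python
def decode_prediction(probability, class_indices, threshold=0.5):
    """Turn a sigmoid output into the class name it most likely represents."""
    # Invert the {name: index} mapping so we can look up names by index
    index_to_label = {index: label for label, index in class_indices.items()}
    predicted_index = 1 if probability > threshold else 0
    return index_to_label[predicted_index]


# Assumed mapping; in the notebook it would come from training_set.class_indices
class_indices = {"cats": 0, "dogs": 1}
print(decode_prediction(0.92, class_indices))  # dogs
print(decode_prediction(0.08, class_indices))  # cats
```

In the notebook this would be called with `result[0][0]` as the probability.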
