# Convolutional Neural Network
Yann LeCun (Facebook AI Research) is often called the grandfather of CNNs.

### Importing the libraries

In [26]:
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
tf.__version__

'2.15.0'

## Part 1 - Data Preprocessing

### Preprocessing the Training set

In [27]:
# 1) random transformations (shear, zoom, horizontal flip) help to avoid overfitting:
#    modifying the images yields new variants of them ("image augmentation")
# 2) images are reduced from 150x150 to 64x64 (see `target_size`) to minimise computation time

train_datagen = ImageDataGenerator(
    rescale=1.0 / 255, shear_range=0.2, zoom_range=0.2, horizontal_flip=True
)

# `flow_from_directory` method identifies classes based on the directory structure:
# 2 classes because there were /dogs and /cats
training_set = train_datagen.flow_from_directory(
    "./filez_cats_dogs/training_set/",
    target_size=(64, 64),
    batch_size=32,
    class_mode="binary",
)

Found 8000 images belonging to 2 classes.
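
To make two of the generator settings concrete, here is a minimal pure-Python sketch of what `rescale=1.0 / 255` and `horizontal_flip=True` do to pixel values. The tiny 2x3 single-channel image and the helper names `rescale` and `horizontal_flip` are illustrative assumptions, not Keras internals; the real generator also applies flips, shear, and zoom at random.

```python
# Toy 2x3 single-channel "image" with 8-bit pixel values (made-up numbers).
image = [
    [0, 51, 255],
    [102, 204, 153],
]


def rescale(img, factor=1.0 / 255):
    """Multiply every pixel by `factor`, mapping 0..255 into 0..1."""
    return [[pixel * factor for pixel in row] for row in img]


def horizontal_flip(img):
    """Mirror every row left-to-right."""
    return [row[::-1] for row in img]


scaled = rescale(image)
flipped = horizontal_flip(image)

print(flipped)  # [[255, 51, 0], [153, 204, 102]]
print(scaled[0])  # first row after rescaling; every value lies between 0 and 1
```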


### Preprocessing the Test set

In [28]:
test_datagen = ImageDataGenerator(rescale=1.0 / 255)

test_set = test_datagen.flow_from_directory(
    "./filez_cats_dogs/test_set/",
    target_size=(64, 64),
    batch_size=32,
    class_mode="binary",
)

Found 2000 images belonging to 2 classes.
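
With `batch_size=32`, the number of batches per epoch follows from the image counts reported above. The short sketch below does that arithmetic; the helper name `batches_per_epoch` is hypothetical, and it assumes the last, partial batch is still used.

```python
import math


def batches_per_epoch(num_images: int, batch_size: int) -> int:
    """Batches needed to see every image once (the last batch may be partial)."""
    return math.ceil(num_images / batch_size)


train_batches = batches_per_epoch(8000, 32)
test_batches = batches_per_epoch(2000, 32)

print(train_batches, test_batches)  # 250 63
```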


## Part 2 - Building the CNN

### Initialising the CNN

In [29]:
cnn = tf.keras.models.Sequential()

### Step 1 - Convolution

Adding a convolutional layer to a CNN using TensorFlow's Keras API

**Params**:

- `cnn.add(...)`: This method is called on a Sequential model object, and it adds a layer to the neural network.

- `tf.keras.layers.Conv2D(...)`: This specifies that the layer being added is a 2D convolutional layer, which is suitable for processing images that have height and width dimensions.

- `filters=32`: The convolutional layer will have 32 filters (or kernels). Each filter is responsible for capturing some specific feature from the image, like edges, textures, or more complex patterns. Having 32 filters means the layer will output 32 different feature maps.

- `kernel_size=3`: This defines the size of the filter window that will scan over the image. 3 here means a 3x3 grid. A smaller kernel size can capture finer details, while larger kernels capture more global features.

- `activation="relu"`: The activation function used here is the Rectified Linear Unit (ReLU). It introduces non-linearity into the model, allowing it to learn more complex patterns. The ReLU function outputs the input directly if it's positive, otherwise, it will output zero.

- `input_shape=[64, 64, 3]`: This defines the shape of the input data that the layer will receive. Since the images are 64x64 in size and color (which implies they have 3 color channels: Red, Green, Blue), the input_shape parameter is set to [64, 64, 3]. The input shape is typically only specified in the first layer of the network so it knows the size of the incoming data.

In [30]:
cnn.add(
    tf.keras.layers.Conv2D(
        filters=32,
        kernel_size=3,  # each filter is a 3x3 window
        activation="relu",
        input_shape=[64, 64, 3],  # image size is 64x64 and is in colors (3 for RGB)
    )
)
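
As a rough illustration of what one filter does, the sketch below slides a single 3x3 kernel over a tiny single-channel image and then applies ReLU. It is a simplification under stated assumptions: the kernel is hand-picked (a trained layer learns its own weights), there is only one channel and no bias, and the helper names `conv2d_valid` and `relu` are hypothetical. The same size rule, `input - kernel + 1`, gives 64 - 3 + 1 = 62 for the layer above.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` with stride 1 and no padding."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Multiply the window by the kernel element-wise and sum the result
            total = sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(k)
                for dj in range(k)
            )
            row.append(total)
        output.append(row)
    return output


def relu(feature_map):
    """Keep positive values and replace negative ones with zero."""
    return [[max(0, value) for value in row] for row in feature_map]


# 4x4 toy image with a bright vertical stripe in the middle
image = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]
# Hand-picked kernel that responds to dark-to-bright transitions (left to right)
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

raw = conv2d_valid(image, kernel)
feature_map = relu(raw)

print(raw)  # [[3, -3], [3, -3]]
print(feature_map)  # [[3, 0], [3, 0]]
```

The 4x4 input shrinks to 2x2, and ReLU zeroes the negative response at the bright-to-dark edge.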

### Step 2 - Pooling

Adding a max pooling layer to the CNN. This layer slides a small window over each feature map and keeps only the largest value in each window, which preserves the strongest responses while making the feature maps smaller and cheaper to process.

**Params**:

- `cnn.add(...)`: This adds a new layer to the CNN model `cnn`.

- `tf.keras.layers.MaxPool2D(...)`: This specifies that the layer being added is a 2D max pooling layer. Max pooling is a form of down-sampling which reduces the spatial dimensions (height and width) of the input volume for the next convolution layer.

- `pool_size=2`: This parameter defines the size of the window over which to take the maximum. 2 means that the max pooling window will have a size of 2x2. This is going to look at each 2x2 square of the image and pick the largest value.

- `strides=2`: This parameter specifies the “step” size of the window as it slides over the image. A stride of 2 means that the pooling window moves 2 pixels at a time, so with a 2x2 window the windows do not overlap and the output height and width are half those of the input.

In [31]:
cnn.add(tf.keras.layers.MaxPool2D(pool_size=2, strides=2))
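
The effect of `pool_size=2, strides=2` can be sketched in plain Python on a small feature map. The helper name `max_pool2d` and the 4x4 values are made up for illustration; the real layer does the same thing independently for each of the 32 feature maps.

```python
def max_pool2d(feature_map, pool_size=2, stride=2):
    """Take the maximum of each pool_size x pool_size window."""
    height, width = len(feature_map), len(feature_map[0])
    pooled = []
    for i in range(0, height - pool_size + 1, stride):
        row = []
        for j in range(0, width - pool_size + 1, stride):
            window = [
                feature_map[i + di][j + dj]
                for di in range(pool_size)
                for dj in range(pool_size)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 3],
]

pooled = max_pool2d(feature_map)
print(pooled)  # [[4, 2], [2, 7]]
```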

### Adding a second convolutional layer

In [32]:
# same as before but without param `input_shape` in Conv2D
cnn.add(
    tf.keras.layers.Conv2D(
        filters=32,
        kernel_size=3,
        activation="relu",
    )
)

cnn.add(tf.keras.layers.MaxPool2D(pool_size=2, strides=2))

### Step 3 - Flattening

The Flatten layer is used to convert the multi-dimensional output of the previous layers (like the output from convolutional or pooling layers) into a one-dimensional array. This transformation is necessary because the following layers in a typical CNN, like fully connected (Dense) layers, require their input in a flat, one-dimensional format.

In [33]:
cnn.add(tf.keras.layers.Flatten())
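
To see how long the flattened vector is, the sketch below traces the spatial size through the two convolution and pooling blocks defined above. It assumes the Keras defaults used here (no padding, convolution stride 1); the helper names `conv_out` and `pool_out` are hypothetical.

```python
def conv_out(size, kernel=3):
    """Output size of an unpadded, stride-1 convolution."""
    return size - kernel + 1


def pool_out(size, pool=2, stride=2):
    """Output size of an unpadded pooling layer."""
    return (size - pool) // stride + 1


size = 64
size = conv_out(size)  # 62 after the first Conv2D
size = pool_out(size)  # 31 after the first MaxPool2D
size = conv_out(size)  # 29 after the second Conv2D
size = pool_out(size)  # 14 after the second MaxPool2D

filters = 32
flattened_length = size * size * filters
print(size, flattened_length)  # 14 6272
```

Under these assumptions, `Flatten` turns the 14x14x32 volume into a vector of 6272 values.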

### Step 4 - Full Connection

Adding a Dense layer with 128 neurons to the model `cnn`.
This layer takes the output of the previous Flatten layer (a one-dimensional array) and connects every value in it to each of the 128 neurons.
The ReLU activation function allows this layer to capture non-linear relationships in the data.
Dense layers like this combine the features extracted by the preceding convolutional and pooling layers into the representation used to make the final prediction.

In [34]:
cnn.add(tf.keras.layers.Dense(units=128, activation="relu"))
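
A dense layer computes a weighted sum plus a bias for each neuron and then applies the activation. The toy sketch below uses 3 inputs and 2 neurons with hand-picked weights (illustrative assumptions, not trained values), then counts the parameters of the real layer, assuming the flattened input has 14 * 14 * 32 = 6272 values.

```python
def dense_relu(inputs, weights, biases):
    """`weights[j]` holds the input weights of neuron j."""
    outputs = []
    for unit_weights, bias in zip(weights, biases):
        z = sum(x * w for x, w in zip(inputs, unit_weights)) + bias
        outputs.append(max(0.0, z))  # ReLU
    return outputs


inputs = [1.0, 2.0, 3.0]
weights = [
    [0.5, -1.0, 0.5],   # neuron 0
    [-1.0, -1.0, 0.5],  # neuron 1
]
biases = [1.0, 0.0]

outputs = dense_relu(inputs, weights, biases)
print(outputs)  # [1.0, 0.0]

# Parameter count of Dense(units=128) on a 6272-long input:
# one weight per (input, neuron) pair plus one bias per neuron
flattened_length = 14 * 14 * 32
dense_params = flattened_length * 128 + 128
print(dense_params)  # 802944
```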

### Step 5 - Output Layer

In [35]:
# 1 neuron: cat or dog
# sigmoid: binary classification
cnn.add(tf.keras.layers.Dense(units=1, activation="sigmoid"))
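
The sigmoid squashes the neuron's raw score into the range 0 to 1, which can be read as the probability of class 1. A minimal sketch, with made-up input scores:

```python
import math


def sigmoid(z: float) -> float:
    """Map any real number to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


print(sigmoid(0.0))  # 0.5
print(sigmoid(2.0) > 0.5)  # True: positive scores lean towards class 1
print(sigmoid(-2.0) < 0.5)  # True: negative scores lean towards class 0
```

Because the output is a probability rather than a hard label, it is usually compared against a threshold such as 0.5 to pick a class.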

## Part 3 - Training the CNN

### Compiling the CNN

This configures the CNN to use the Adam optimizer, to compute the loss with binary cross-entropy (appropriate for binary classification problems), and to track the accuracy of the model during training.

In [36]:
cnn.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
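
For intuition, the loss and the metric can be computed by hand on a few predictions. This is a sketch under assumptions: the labels and probabilities are made up, the clipping constant `eps` is there only to avoid `log(0)`, and the helper names are hypothetical rather than the Keras implementation.

```python
import math


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # keep p away from exactly 0 or 1
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def binary_accuracy(y_true, y_pred, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    correct = sum(int(p > threshold) == y for y, p in zip(y_true, y_pred))
    return correct / len(y_true)


y_true = [1, 0, 1, 0]          # hypothetical labels (1 = dog, 0 = cat)
y_pred = [0.9, 0.2, 0.4, 0.1]  # hypothetical sigmoid outputs

loss = binary_crossentropy(y_true, y_pred)
accuracy = binary_accuracy(y_true, y_pred)
print(round(loss, 3), accuracy)
```

The third sample is misclassified (0.4 for a true label of 1), so it contributes the largest share of the loss and lowers the accuracy to 0.75.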

### Training the CNN on the Training set and evaluating it on the Test set

In [37]:
# tune the number of epochs depending on the results, e.g. start with 10 and keep increasing
cnn.fit(x=training_set, validation_data=test_set, epochs=25)

Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25
Epoch 14/25
Epoch 15/25
Epoch 16/25
Epoch 17/25
Epoch 18/25
Epoch 19/25
Epoch 20/25
Epoch 21/25
Epoch 22/25
Epoch 23/25
Epoch 24/25
Epoch 25/25


<keras.src.callbacks.History at 0x16f7c4b10>

## Part 4 - Making a single prediction

In [45]:
training_set.class_indices

{'cats': 0, 'dogs': 1}

In [46]:
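import numpy as np
from tensorflow.keras.preprocessing import image


def predict(image_num: int):
    # load the image at the same 64x64 size the network was trained on
    test_image = image.load_img(
        f"./filez_cats_dogs/single_prediction/cat_or_dog_0{image_num}.jpg",
        target_size=(64, 64),
    )

    test_image = image.img_to_array(test_image)
    test_image = test_image / 255.0  # same rescaling as the training and test generators
    test_image = np.expand_dims(test_image, axis=0)  # add the batch dimension: (1, 64, 64, 3)
    result = cnn.predict(test_image)
    # the sigmoid output is the probability of class 1 ("dogs" in class_indices),
    # so threshold it at 0.5 instead of testing for exactly 1
    if result[0][0] > 0.5:
        prediction = "dog"
    else:
        prediction = "cat"

    print(f"image is a {prediction}")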


predict(1) # cat
predict(2) # dog

image is a cat
image is a dog
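
The mapping from the network's output to a class name can also be derived from `class_indices` instead of being hard-coded. The sketch below copies the dictionary shown earlier; the helper name `label_from_probability`, the 0.5 threshold, and the example probabilities are assumptions for illustration.

```python
# Copied from the `training_set.class_indices` output shown earlier
class_indices = {"cats": 0, "dogs": 1}

# Invert the mapping so a class index can be looked up by number
index_to_label = {index: name for name, index in class_indices.items()}


def label_from_probability(probability: float, threshold: float = 0.5) -> str:
    """Turn a sigmoid output into a class name."""
    return index_to_label[int(probability > threshold)]


print(label_from_probability(0.93))  # dogs
print(label_from_probability(0.08))  # cats
```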
