# Convolutional Neural Network

### Importing the libraries

- `ImageDataGenerator` in Keras turns image files into arrays of pixel values. Deep learning models take numerical input, so images are represented as matrices or, more generally, tensors.

- The class provides methods for loading and preprocessing image data from directories. It reads the image files, converts them into numerical arrays, and applies the preprocessing steps you specify.

Here's a brief overview of how the `ImageDataGenerator` works with image data:

1. Loading Images: Methods such as `flow_from_directory` read images from a specified directory. The method traverses the directory structure, treats each subdirectory as one class, and reads the image files batch by batch rather than loading the whole dataset into memory at once.

2. Image Preprocessing: The `ImageDataGenerator` can apply several preprocessing operations to the loaded images, including rescaling pixel values, normalization, resizing (through `target_size`), and data augmentation techniques such as rotation, shifting, and flipping.

3. Batching and Conversion: Images are delivered in batches. With the default channels-last layout, each batch is a 4D tensor of shape (batch size, height, width, channels).

The resulting tensors represent the images as numerical data suitable for input to a deep learning model. Each element is a pixel intensity, and the shape depends on the image size and the number of color channels (3 for RGB, 1 for grayscale).

In summary, `ImageDataGenerator` reads image files, converts them into tensors of pixel values, and applies the configured preprocessing, so that the network can learn patterns and features directly from pixel-level information.
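
As a rough illustration of what "converting an image into a matrix" means, the sketch below builds a synthetic 4x4 RGB image with NumPy instead of reading a file, rescales it, and stacks two copies into a batch. The array contents, the image size, and names such as `fake_image` are made up for illustration; this is not how `ImageDataGenerator` is implemented, only the kind of data it produces.

```python
import numpy as np

# Hypothetical stand-in for a decoded image: 4x4 pixels, 3 color channels (RGB),
# with integer intensities in [0, 255].
rng = np.random.default_rng(seed=0)
fake_image = rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8)

# Rescale to [0, 1] by dividing every pixel value by 255.
scaled = fake_image.astype("float32") / 255.0

# Stack two images into one batch: (batch, height, width, channels).
batch = np.stack([scaled, scaled], axis=0)

print(fake_image.shape)  # (4, 4, 3)
print(batch.shape)       # (2, 4, 4, 3)
print(scaled.min() >= 0.0 and scaled.max() <= 1.0)  # True
```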

In [1]:
import tensorflow as tf
from keras.preprocessing.image import ImageDataGenerator

In [2]:
tf.__version__

'2.13.0-rc2'

## Part 1 - Data Preprocessing

### Preprocessing the Training set

The training generator, `train_datagen`, is created with four parameters. Let's go through each one:

1. `rescale = 1./255`: This parameter rescales the pixel values of the images. Dividing each pixel value by 255 normalizes the values to be between 0 and 1. This rescaling is a common preprocessing step in deep learning to ensure that the input data falls within a desired range.

2. `shear_range = 0.2`: This parameter sets the range for random shear transformations. Shearing slants the image along one axis, creating a tilt effect. The Keras documentation describes this value as a shear angle in degrees, so `0.2` allows shears of up to 0.2 degrees, which is a very mild distortion rather than 20% of the image's width or height.

3. `zoom_range = 0.2`: This parameter sets the range for random zooming, which enlarges or shrinks the image content. A float value `z` is interpreted as the range `[1 - z, 1 + z]`, so `0.2` draws a zoom factor between 0.8 and 1.2 for each image.

4. `horizontal_flip = True`: This parameter enables random horizontal flipping of the images. Horizontal flipping mirrors the image horizontally, creating a new image with the same features but in the opposite direction. This augmentation technique can help increase the diversity of the training data and make the model more robust to horizontal spatial variations.

The `ImageDataGenerator` object `train_datagen` applies these settings as data augmentation during training. Augmentation introduces random variations to the input images, which effectively increases the diversity of the training data. This can improve the model's ability to generalize and reduce overfitting.

During training, the `train_datagen` object is often used in conjunction with the `flow_from_directory` method of the `ImageDataGenerator` class. This method generates augmented batches of images on-the-fly from a directory structure, allowing for efficient training with augmented data.
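
To make two of these settings concrete, here is a minimal NumPy sketch of what rescaling and a random horizontal flip do to a single image array. It is not how Keras implements augmentation; the helper `augment`, its `flip_probability` argument, and the tiny 2x3 single-channel image are hypothetical, and shear and zoom are left out because they require interpolation.

```python
import numpy as np

def augment(image, rng, flip_probability=0.5):
    """Rescale to [0, 1] and mirror left-to-right with the given probability."""
    out = image.astype("float32") / 255.0
    if rng.random() < flip_probability:
        out = out[:, ::-1, :]  # reverse the width axis
    return out

# Hypothetical 2x3 image with one channel; values chosen so the flip is visible.
image = np.array([[[0], [128], [255]],
                  [[10], [20], [30]]], dtype=np.uint8)

rng = np.random.default_rng(seed=42)
always_flipped = augment(image, rng, flip_probability=1.0)
never_flipped = augment(image, rng, flip_probability=0.0)

print(never_flipped[0, :, 0])   # first row, original order
print(always_flipped[0, :, 0])  # first row, mirrored
```

The flip changes where features appear but not which pixels are present, which is why the class label stays valid for the augmented image.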

The second half of the cell generates the training set by loading and augmenting images from a directory. Let's break it down:

1. `train_datagen.flow_from_directory('dataset/training_set'...`: This line specifies the directory path where the training images are located. The `flow_from_directory` method of the `ImageDataGenerator` class is used to load images from the specified directory and generate augmented batches of images.

2. `target_size = (64, 64)`: This parameter specifies the desired size to which the input images should be resized. In this case, the images will be resized to a size of 64x64 pixels. Resizing the images to a consistent size is often necessary to ensure uniformity and compatibility with the deep learning model.

3. `batch_size = 32`: This parameter defines the number of images in each batch. During training, the training set will be divided into batches, and each batch will contain 32 images. Batching allows for efficient processing and updating of model parameters during training.

4. `class_mode = 'binary'`: This parameter specifies the format of the labels. The training data here falls into two classes, so `'binary'` gives each image a single 0 or 1 label, which matches a one-unit sigmoid output layer. For more than two classes, use `'categorical'`, which produces one-hot encoded labels.


In summary, this cell builds the training set by loading and augmenting images from a directory with an `ImageDataGenerator`. The settings above determine the augmentation techniques, target size, batch size, and class mode.
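
The batch size also fixes how many steps one epoch takes. The arithmetic below uses the 8000 training images and 2000 test images reported in this notebook's cell outputs, and assumes that the generator yields a final partial batch when the image count is not a multiple of the batch size. The helper name `batches_per_epoch` is hypothetical.

```python
import math

def batches_per_epoch(num_images, batch_size):
    """Number of batches needed to see every image once."""
    return math.ceil(num_images / batch_size)

train_steps = batches_per_epoch(8000, 32)
test_steps = batches_per_epoch(2000, 32)

print(train_steps)  # 250
print(test_steps)   # 63
```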



In [3]:
train_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)
training_set = train_datagen.flow_from_directory('dataset/training_set',
                                                 target_size = (64, 64),
                                                 batch_size = 32,
                                                 class_mode = 'binary')

Found 8000 images belonging to 2 classes.


### Preprocessing the Test set

In [4]:
test_datagen = ImageDataGenerator(rescale = 1./255)
test_set = test_datagen.flow_from_directory('dataset/test_set',
                                            target_size = (64, 64),
                                            batch_size = 32,
                                            class_mode = 'binary')

Found 2000 images belonging to 2 classes.


## Part 2 - Building the CNN

### Initialising the CNN

In [5]:
cnn = tf.keras.models.Sequential()

### Step 1 - Convolution

In [6]:
cnn.add(tf.keras.layers.Conv2D(filters=32, kernel_size=3, activation='relu', input_shape=[64, 64, 3]))

### Step 2 - Pooling

In [7]:
cnn.add(tf.keras.layers.MaxPool2D(pool_size=2, strides=2))

### Adding a second convolutional layer

In [8]:
cnn.add(tf.keras.layers.Conv2D(filters=32, kernel_size=3, activation='relu'))
cnn.add(tf.keras.layers.MaxPool2D(pool_size=2, strides=2))

### Step 3 - Flattening

In [9]:
cnn.add(tf.keras.layers.Flatten())
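
The feature-map sizes produced by the layers so far can be worked out by hand. The sketch below assumes the Keras defaults that the cells above do not override: `padding='valid'` and a stride of 1 for `Conv2D`, and `padding='valid'` for `MaxPool2D`. The helper name `conv_output_size` is hypothetical.

```python
def conv_output_size(size, kernel_size, stride=1):
    """Output width/height of a 'valid' (unpadded) convolution or pooling window."""
    return (size - kernel_size) // stride + 1

size = 64                                   # input is 64x64x3
size = conv_output_size(size, 3)            # Conv2D 3x3      -> 62
size = conv_output_size(size, 2, stride=2)  # MaxPool2D 2x2/2 -> 31
size = conv_output_size(size, 3)            # Conv2D 3x3      -> 29
size = conv_output_size(size, 2, stride=2)  # MaxPool2D 2x2/2 -> 14

filters = 32
flattened_units = size * size * filters

print(size)             # 14
print(flattened_units)  # 6272
```

Under these assumptions, the `Dense` layer that follows receives a vector of 6272 values per image.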

### Step 4 - Full Connection

In [10]:
cnn.add(tf.keras.layers.Dense(units=128, activation='relu'))

### Step 5 - Output Layer

In [11]:
cnn.add(tf.keras.layers.Dense(units=1, activation='sigmoid'))

## Part 3 - Training the CNN

### Compiling the CNN

In [12]:
cnn.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
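
For intuition about the sigmoid output unit and the `binary_crossentropy` loss, the sketch below computes both for a single example in plain Python. The input value `2.0` and the function names are illustrative only; Keras additionally clips probabilities for numerical stability and averages the loss over the batch, details this sketch ignores.

```python
import math

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_cross_entropy(y_true, p):
    """Loss for one example with label y_true in {0, 1} and predicted probability p."""
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

p = sigmoid(2.0)  # hypothetical pre-activation value of the output unit
loss_if_label_is_1 = binary_cross_entropy(1, p)
loss_if_label_is_0 = binary_cross_entropy(0, p)

print(round(p, 3))
print(loss_if_label_is_1 < loss_if_label_is_0)  # True
```

A confident prediction is cheap when it matches the label and expensive when it does not, which is what pushes the output toward 0 or 1 during training.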

### Training the CNN on the Training set and evaluating it on the Test set

In [13]:
cnn.fit(x = training_set, validation_data = test_set, epochs = 25)

Epoch 1/25
...
Epoch 25/25


<keras.src.callbacks.History at 0x2b7fe6df0>

## Part 4 - Making a single prediction

In [14]:
import numpy as np
from keras.preprocessing import image

# Load and preprocess the test image
test_image = image.load_img('dataset/single_prediction/cat_or_dog_1.jpg', target_size=(64, 64))
test_image = image.img_to_array(test_image)
test_image = test_image / 255.0  # Rescale the pixel values to [0, 1], as done for the training set
# `predict` expects a batch of images, even when there is only one. `test_image` is a
# 3D array of shape (height, width, channels); adding an axis at position 0 turns it
# into a 4D array of shape (1, height, width, channels), i.e. a batch of size 1.
test_image = np.expand_dims(test_image, axis=0)

# Perform prediction on the test image using the trained CNN model
result = cnn.predict(test_image)
print(result)

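# Mapping from class name to label index, inferred from the training folder names
class_indices = training_set.class_indices
threshold = 0.5
# The sigmoid output is the predicted probability of the class with index 1.
# The labels below assume that 'dog' is class 1 and 'cat' is class 0; check class_indices to confirm.
if result[0][0] > threshold:
    prediction = 'dog'
else:
    prediction = 'cat'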

print(prediction)


[[0.99871]]
dog
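
The batch-dimension step and the thresholding logic can be checked without a trained model. In the sketch below, the zero-filled image, the probability 0.99871 copied from the output above, and the `class_indices` dictionary are stand-ins. In particular, the folder names `cats` and `dogs` are an assumption, since the notebook does not print `training_set.class_indices`.

```python
import numpy as np

# Stand-in for a loaded 64x64 RGB image.
single_image = np.zeros((64, 64, 3), dtype="float32")
batched = np.expand_dims(single_image, axis=0)
print(single_image.shape)  # (64, 64, 3)
print(batched.shape)       # (1, 64, 64, 3)

# Assumed mapping; the real one comes from training_set.class_indices.
class_indices = {"cats": 0, "dogs": 1}
index_to_class = {index: name for name, index in class_indices.items()}

# Stand-in for cnn.predict(...): one row per image, one sigmoid probability per row.
result = np.array([[0.99871]])
predicted_index = int(result[0][0] > 0.5)
print(index_to_class[predicted_index])  # dogs
```

Looking the label up through `class_indices` avoids hard-coding which class the value 1 stands for.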


Next steps

- Save the trained model and let the user supply an image to classify.