<a href="https://colab.research.google.com/github/farrelrassya/DeepLearning-A-Z-Python/blob/main/2_Chapter2ConvolutionalNeuralNetwork.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Convolutional Neural Network

### Importing the libraries

In [5]:
import tensorflow as tf
from keras.preprocessing.image import ImageDataGenerator

In [6]:
tf.__version__

'2.14.0'

In [7]:
from google.colab import drive
drive.mount('/content/drive')


Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


## Part 1 - Data Preprocessing

### Preprocessing the Training set

An image data generator (train_datagen) preprocesses and augments the training images. The rescaling factor of 1/255 normalizes pixel values from the range [0, 255] to [0, 1], while shear_range, zoom_range, and horizontal_flip apply random shears, zooms, and left-right flips so that the network sees a slightly different version of each image in every epoch, which helps reduce overfitting. The flow_from_directory function then creates a generator (training_set) that loads images from the My Drive/Dataset/training_set directory, resizes them to 64x64 pixels, groups them into batches of 32, and assigns binary labels (class_mode = 'binary') based on the two class subdirectories.

In [9]:
train_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)
training_set = train_datagen.flow_from_directory('/content/drive/My Drive/Dataset/training_set',
                                                 target_size = (64, 64),
                                                 batch_size = 32,
                                                 class_mode = 'binary')

Found 8030 images belonging to 2 classes.
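
As a rough illustration of two of the transformations configured above, the following NumPy sketch rescales a tiny image and flips it horizontally. The 2x3 image and its pixel values are made up for the example and do not come from the dataset, and the flip is applied unconditionally here, whereas the generator applies it at random.

```python
import numpy as np

# Hypothetical 2x3 RGB image with 8-bit pixel values (not from the dataset).
image = np.array([[[0, 128, 255], [10, 20, 30], [255, 255, 255]],
                  [[5, 5, 5], [100, 150, 200], [0, 0, 0]]], dtype=np.uint8)

# rescale = 1./255 maps every pixel from [0, 255] to [0.0, 1.0].
rescaled = image.astype(np.float32) / 255.0

# horizontal_flip mirrors the image left to right (reverse the width axis).
flipped = rescaled[:, ::-1, :]

print(rescaled.min(), rescaled.max())
# The first column of the flipped image is the last column of the original.
print(np.array_equal(flipped[:, 0, :], rescaled[:, -1, :]))
```

Only the training generator uses augmentation; rescaling, by contrast, has to be applied identically to every image the model later sees.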


### Preprocessing the Test set

A separate image data generator (test_datagen) handles the test images. It only rescales pixel values by a factor of 1/255; no augmentation is applied, so the model is evaluated on unmodified images. The flow_from_directory function then creates a test data generator (test_set) that loads images from the My Drive/Dataset/test_set directory, resizes them to 64x64 pixels, organizes them into batches of 32, and assigns binary class labels, using the same settings as the training set.

In [10]:
test_datagen = ImageDataGenerator(rescale = 1./255)
test_set = test_datagen.flow_from_directory('/content/drive/My Drive/Dataset/test_set',
                                            target_size = (64, 64),
                                            batch_size = 32,
                                            class_mode = 'binary')

Found 2000 images belonging to 2 classes.


## Part 2 - Building the CNN

### Initialising the CNN

Creates an empty Sequential model with tf.keras. A Sequential model is a linear stack of layers: each layer added with cnn.add() receives the output of the layer before it. The steps below add the convolution, pooling, flattening, and dense layers that make up the CNN.

In [11]:
cnn = tf.keras.models.Sequential()

### Step 1 - Convolution

$$\text{Output}(x, y) = \sum_{i=1}^{k} \sum_{j=1}^{k} \text{Input}(x + i, y + j) \cdot \text{Kernel}(i, j)$$

This equation describes the fundamental operation of feature extraction, where k is the kernel size. A feature map element at position (x, y) is computed by summing the weighted values of neighboring pixels from the input image, with the weights given by the convolutional kernel. The kernel slides across the entire image and the same computation is repeated at every position, so each filter responds to one local pattern wherever it occurs. In CNNs the kernel weights are learned during training; early layers tend to detect edges and textures, while deeper layers combine them into higher-level patterns used for tasks such as image recognition and object detection.
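
The sliding-window sum above can be sketched in a few lines of NumPy. This is a minimal single-channel illustration with stride 1 and no padding, not how Keras implements Conv2D; the function name conv2d_valid, the 4x4 input, and the vertical-edge kernel are chosen for the example, and indices are zero-based rather than starting at 1 as in the formula.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide a square kernel over a 2D image with stride 1 and no padding."""
    k = kernel.shape[0]
    out_h = image.shape[0] - k + 1
    out_w = image.shape[1] - k + 1
    output = np.zeros((out_h, out_w))
    for x in range(out_h):
        for y in range(out_w):
            # Weighted sum of the k x k neighborhood starting at (x, y).
            patch = image[x:x + k, y:y + k]
            output[x, y] = np.sum(patch * kernel)
    return output

# Hypothetical 4x4 image whose values increase by 1 from left to right.
image = np.arange(16, dtype=float).reshape(4, 4)
# A simple vertical-edge kernel: left column minus right column.
kernel = np.array([[1.0, 0.0, -1.0]] * 3)

feature_map = conv2d_valid(image, kernel)
print(feature_map.shape)  # (2, 2)
print(feature_map)        # every entry is -6.0
```

A 3x3 kernel shrinks each spatial dimension by 2, which is why a 4x4 input yields a 2x2 feature map.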

Adds the first layer of the CNN: a convolutional layer with 32 filters of size 3x3 and the Rectified Linear Unit (ReLU) activation function. The input_shape argument declares that the model expects 64x64 images with 3 color channels (RGB), matching the target_size set during preprocessing. Each of the 32 filters produces its own feature map, so this layer is the first building block for detecting features and patterns in the input images.

In [12]:
cnn.add(tf.keras.layers.Conv2D(filters=32, kernel_size=3, activation='relu', input_shape=[64, 64, 3]))

### Step 2 - Pooling

Adds a max pooling layer, which reduces the spatial dimensions of the feature maps produced by the preceding convolutional layer. A 2x2 window moves over each feature map in steps of 2 pixels, so the windows do not overlap, and keeps only the maximum value in each window. This halves the height and width of every feature map, retaining the strongest activations while reducing the computation required by later layers.

$$ \text{Output}(x, y) = \max \left(\text{Input}(x', y') \text{ for all } x', y' \text{ in the pooling region} \right) $$


In [13]:
cnn.add(tf.keras.layers.MaxPool2D(pool_size=2, strides=2))
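
The pooling rule above can be illustrated on a small array. This is a plain NumPy sketch for a single feature map, with a hypothetical helper max_pool2d and made-up input values; it only mirrors the pool_size=2, strides=2 setting used in the cell above.

```python
import numpy as np

def max_pool2d(feature_map, pool_size=2, stride=2):
    """Keep the maximum of each pool_size x pool_size window."""
    out_h = (feature_map.shape[0] - pool_size) // stride + 1
    out_w = (feature_map.shape[1] - pool_size) // stride + 1
    output = np.empty((out_h, out_w), dtype=feature_map.dtype)
    for x in range(out_h):
        for y in range(out_w):
            region = feature_map[x * stride:x * stride + pool_size,
                                 y * stride:y * stride + pool_size]
            output[x, y] = region.max()
    return output

# Hypothetical 4x4 feature map.
feature_map = np.array([[1, 3, 2, 0],
                        [4, 6, 5, 1],
                        [7, 2, 9, 8],
                        [0, 1, 3, 4]])

pooled = max_pool2d(feature_map)
print(pooled)  # maxima of the four 2x2 blocks: 6, 5, 7, 9
```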

### Adding a second convolutional layer

Adds a second convolution and pooling block. The first line adds another convolutional layer with 32 filters of size 3x3 and ReLU activation; input_shape is omitted because Keras infers it from the output of the previous layer. The second line adds another max pooling layer with a 2x2 window and a stride of 2, which again reduces the spatial dimensions of the feature maps. Stacking blocks like this lets the second convolution combine the simpler features found by the first.

In [14]:
cnn.add(tf.keras.layers.Conv2D(filters=32, kernel_size=3, activation='relu'))
cnn.add(tf.keras.layers.MaxPool2D(pool_size=2, strides=2))

### Step 3 - Flattening

Adds a Flatten layer to the CNN. This layer forms the transition from the convolutional and pooling layers to the fully connected layers: it reshapes the stack of 2D feature maps (height x width x channels) from the previous layer into a single 1D vector, which is the input format a Dense layer expects. Flattening has no trainable parameters and does not change any values, only their arrangement.

$$ \text{Flatten}(\text{Input}) = \left[\, \text{Input}(x, y, c) \text{ for all } x, y, c \,\right] $$


In [15]:
cnn.add(tf.keras.layers.Flatten())
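
To see how long the flattened vector is, the spatial size can be traced through the layers added so far. The sketch below assumes the Keras default padding ('valid', meaning no padding) for every Conv2D and MaxPool2D layer; the helper names are made up for this example.

```python
def conv_output_size(size, kernel_size):
    """Output width/height of a stride-1 convolution without padding."""
    return size - kernel_size + 1

def pool_output_size(size, pool_size, stride):
    """Output width/height of a pooling layer without padding."""
    return (size - pool_size) // stride + 1

size = 64                            # input images are 64x64
size = conv_output_size(size, 3)     # first Conv2D:    62
size = pool_output_size(size, 2, 2)  # first MaxPool2D: 31
size = conv_output_size(size, 3)     # second Conv2D:   29
size = pool_output_size(size, 2, 2)  # second MaxPool2D: 14

filters = 32
flattened_length = size * size * filters
print(size, flattened_length)  # 14 6272
```

Calling cnn.summary() on the built model is the direct way to check these shapes.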

### Step 4 - Full Connection

Adds a Dense (fully connected) layer to the CNN. The layer has 128 units (neurons), each connected to every value in the flattened vector, and uses the Rectified Linear Unit (ReLU) activation function. Dense layers combine the features extracted by the earlier convolutional and pooling layers into the representation used for the final decision, and the ReLU activation introduces the non-linearity needed to capture complex patterns. In this network the layer is followed directly by the output layer.

$$ \text{ReLU}(x) = \max(0, x) $$

In [16]:
cnn.add(tf.keras.layers.Dense(units=128, activation='relu'))
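
A Dense layer with ReLU computes a matrix product, adds a bias, and clips negative values to zero. The NumPy sketch below illustrates this with random weights; the input length of 6272 is an assumption (14 x 14 x 32, which follows from the layers above if the default 'valid' padding is used), and the weights are not the trained ones.

```python
import numpy as np

def relu(x):
    """ReLU(x) = max(0, x), applied element-wise."""
    return np.maximum(0.0, x)

def dense_forward(inputs, weights, bias):
    """Forward pass of a fully connected layer with ReLU activation."""
    return relu(inputs @ weights + bias)

rng = np.random.default_rng(0)
inputs = rng.normal(size=(1, 6272))                 # one flattened image (assumed length)
weights = rng.normal(scale=0.01, size=(6272, 128))  # random stand-in weights
bias = np.zeros(128)

activations = dense_forward(inputs, weights, bias)
print(activations.shape)         # (1, 128)
print(weights.size + bias.size)  # 802944 trainable parameters
```

Under that assumption, this single layer holds most of the network's parameters, far more than the two convolutional layers combined.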

### Step 5 - Output Layer

Adds the output layer: a Dense layer with a single unit and the sigmoid activation function. For binary classification, the sigmoid squashes the unit's output into a probability between 0 and 1, which is read as the likelihood that the input belongs to the class labeled 1; values near 0 indicate the class labeled 0. The mapping between labels and class names is given by training_set.class_indices. This layer makes the final prediction based on the features extracted by the earlier layers of the CNN.

$$ \text{Sigmoid}(x) = \frac{1}{1 + e^{-x}} $$


In [17]:
cnn.add(tf.keras.layers.Dense(units=1, activation='sigmoid'))
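
The following sketch shows how sigmoid outputs are typically turned into class labels. The input values are made up, and the 0.5 cutoff is the conventional choice for binary classification rather than something fixed by the layer itself.

```python
import numpy as np

def sigmoid(x):
    """Sigmoid(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical pre-activation values of the output unit for three images.
logits = np.array([-4.0, 0.0, 4.0])
probabilities = sigmoid(logits)

# Probabilities above 0.5 are assigned to class 1, the rest to class 0.
predicted_classes = (probabilities > 0.5).astype(int)

print(probabilities)
print(predicted_classes)  # [0 0 1]
```

The output is rarely exactly 0 or 1, so predictions should be compared against a threshold rather than tested for equality.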

## Part 3 - Training the CNN

### Compiling the CNN

Adam update rule:
$$
\theta_{t+1} = \theta_{t} - \frac{\alpha}{\sqrt{\hat{v}_t} + \epsilon} \cdot \hat{m}_t
$$

where

$$
m_t = \beta_1 m_{t-1} + (1 - \beta_1) g_t, \quad v_t = \beta_2 v_{t-1} + (1 - \beta_2) g_t^2, \quad \hat{m}_t = \frac{m_t}{1 - \beta_1^t}, \quad \hat{v}_t = \frac{v_t}{1 - \beta_2^t}
$$

Here $\theta_t$ denotes the model parameters at time step $t$ and $\theta_{t+1}$ the updated parameters. The learning rate $\alpha$ controls the size of each update, and $g_t$ is the gradient of the loss with respect to the parameters. The variables $m_t$ and $v_t$ are estimates of the first and second moments of the gradient, computed as exponential moving averages of the gradients and the squared gradients with decay rates $\beta_1$ and $\beta_2$. Because both averages start at zero, they are biased toward zero in early steps; $\hat{m}_t$ and $\hat{v}_t$ are the bias-corrected versions. The first moment smooths the update direction, while dividing by the square root of the second moment gives each parameter its own effective step size. Finally, $\epsilon$ is a small constant added to the denominator to prevent division by zero. Together these elements let Adam adapt the step size per parameter, which makes it an effective default for training neural networks.
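
A single Adam step can be written out in NumPy as follows. This is a sketch of the textbook update, including the bias-correction terms that the standard algorithm applies to m_t and v_t; it is not the Keras implementation. The default values mirror the Keras defaults for 'adam', and the toy objective f(theta) = theta^2 is chosen purely for illustration.

```python
import numpy as np

def adam_step(theta, grad, m, v, t,
              alpha=0.001, beta1=0.9, beta2=0.999, epsilon=1e-7):
    """One Adam update; t is the 1-based step counter."""
    m = beta1 * m + (1 - beta1) * grad       # first-moment moving average
    v = beta2 * v + (1 - beta2) * grad ** 2  # second-moment moving average
    m_hat = m / (1 - beta1 ** t)             # bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta - alpha * m_hat / (np.sqrt(v_hat) + epsilon)
    return theta, m, v

# Toy problem: minimize f(theta) = theta^2, whose gradient is 2 * theta.
theta = np.array([1.0])
m = np.zeros_like(theta)
v = np.zeros_like(theta)

history = []
for t in range(1, 4):
    grad = 2 * theta
    theta, m, v = adam_step(theta, grad, m, v, t)
    history.append(float(theta[0]))

print(history)  # each step moves theta by roughly alpha toward 0
```

On the first step the ratio of the corrected moments is close to the sign of the gradient, so the parameter moves by about alpha regardless of the gradient's magnitude.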


Configures the model for training on a binary classification task. The 'adam' optimizer updates the weights using the rule above, 'binary_crossentropy' is the standard loss for a single sigmoid output because it penalizes confident wrong predictions heavily, and 'accuracy' reports the fraction of images classified correctly. These choices are a common starting point, but the best configuration depends on the dataset and problem and may need further tuning.

In [18]:
cnn.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
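
To make the loss and the metric concrete, the sketch below computes binary cross-entropy and accuracy by hand for four made-up predictions. The clipping constant and the 0.5 threshold are assumptions chosen for the example; Keras computes the same quantities internally during training.

```python
import numpy as np

def binary_crossentropy(y_true, y_pred, epsilon=1e-7):
    """Mean of -[y * log(p) + (1 - y) * log(1 - p)] over all samples."""
    y_pred = np.clip(y_pred, epsilon, 1 - epsilon)  # avoid log(0)
    losses = -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
    return float(np.mean(losses))

def accuracy(y_true, y_pred, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    return float(np.mean((y_pred > threshold) == (y_true == 1)))

# Hypothetical labels and predicted probabilities for four images.
y_true = np.array([1, 0, 1, 0])
y_pred = np.array([0.9, 0.2, 0.4, 0.1])

print(binary_crossentropy(y_true, y_pred))  # about 0.3375
print(accuracy(y_true, y_pred))             # 0.75
```

The third sample (true label 1, predicted 0.4) is the only misclassification and contributes most of the loss.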

### Training the CNN on the Training set and evaluating it on the Test set

In [19]:
cnn.fit(x = training_set, validation_data = test_set, epochs = 25)

Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25
Epoch 14/25
Epoch 15/25
Epoch 16/25
Epoch 17/25
Epoch 18/25
Epoch 19/25
Epoch 20/25
Epoch 21/25
Epoch 22/25
Epoch 23/25
Epoch 24/25
Epoch 25/25


<keras.src.callbacks.History at 0x7b507a0f2020>

## Part 4 - Making a single prediction

In [29]:
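import numpy as np
from keras.preprocessing import image

# Load the image at the same 64x64 size the network was trained on
test_image = image.load_img('/content/drive/My Drive/Dataset/single_prediction/cat_or_dog_1.jpg', target_size = (64, 64))
test_image = image.img_to_array(test_image)
# Apply the same 1/255 rescaling used by the training and test generators
test_image = test_image / 255.0
# Add a batch dimension: (64, 64, 3) -> (1, 64, 64, 3)
test_image = np.expand_dims(test_image, axis = 0)
result = cnn.predict(test_image)
# class_indices maps class names to labels; the branch below assumes label 1 is 'dog'
training_set.class_indices
# The sigmoid output is a probability, so compare it with 0.5 instead of testing equality with 1
if result[0][0] > 0.5:
  prediction = 'dog'
else:
  prediction = 'cat'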



In [30]:
print(prediction)

dog
