convolutional neural network

In [3]:
import tensorflow as tf

In [4]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

preprocessing the training set 

In [6]:
training_datagen = ImageDataGenerator( 
    rescale= 1./255,
    shear_range = 2.5,
    zoom_range = 2.5,
    horizontal_flip = True)


The ImageDataGenerator class in Keras handles image preprocessing and augmentation. It applies random transformations to images in real time as batches are drawn, so the model sees varied versions of the training data and becomes more robust.

Why Use ImageDataGenerator?
Reduces overfitting by providing varied training samples.
Effectively enlarges the dataset by generating augmented versions of images on the fly, without writing new files to disk.
Improves generalization to unseen data.

1. rescale=1./255 = Multiplies every pixel value by 1/255.
Pixel values range from 0 to 255, so this normalizes them to a range of 0 to 1, which helps training converge faster and more stably.
2. shear_range=2.5 = Applies a random shear transformation, slanting the image by an angle of up to 2.5 degrees.
This helps the model generalize better to slight distortions.
3. zoom_range=2.5 = Sets the range for random zoom. Keras reads a single float as the interval [1 - zoom_range, 1 + zoom_range], so 2.5 gives [-1.5, 3.5], which includes negative zoom factors.
A small value such as 0.2 (zoom factors between 0.8 and 1.2) is the usual choice for training models to recognize objects at different scales.
4. horizontal_flip=True = Randomly flips images left to right.
Useful for cases like facial recognition or cats/dogs, where left-right orientation doesn't matter.
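
To make the zoom_range rule concrete, here is a minimal plain-Python sketch of how a single float turns into a range of zoom factors, following the interval rule in the Keras documentation. The helper names zoom_interval and is_sensible_zoom are made up for this illustration and are not part of Keras.

```python
def zoom_interval(zoom_range):
    """Return the (lower, upper) zoom factors derived from a float zoom_range."""
    return (1 - zoom_range, 1 + zoom_range)


def is_sensible_zoom(zoom_range):
    """Illustrative check: a zoom factor should stay positive to be a meaningful zoom."""
    lower, _ = zoom_interval(zoom_range)
    return lower > 0


for value in (0.2, 2.5):
    lower, upper = zoom_interval(value)
    print(f"zoom_range={value}: factors from {lower:.1f} to {upper:.1f}, "
          f"sensible={is_sensible_zoom(value)}")
```

With 0.2 the factors stay between 0.8 and 1.2, while 2.5 pushes the lower bound below zero.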

In [8]:
training_set = training_datagen.flow_from_directory(
    'AI/cats_dogs/train',
    target_size =(64,64),
    batch_size = 32,
    class_mode = 'binary'
)

Found 557 images belonging to 2 classes.


The flow_from_directory() method is used to load images from a directory and apply real-time data augmentation. It automatically labels images based on the folder structure and prepares them for training.

1. 'AI/cats_dogs/train'
Specifies the path where images are stored.
The directory must contain subfolders, each representing a class.

2. target_size=(64,64)
Resizes all images to 64x64 pixels (ensuring uniform input size).
CNN models require fixed-size inputs.

3. batch_size=32
Specifies how many images to process at a time.
A batch of 32 images is fed to the model in each step.
Larger batches → Fewer steps per epoch and often faster training (but more memory per step).

4. class_mode='binary'
For two classes (e.g., Cats & Dogs), use 'binary'.
Labels will be 0 or 1.
If more than two classes, use 'categorical' (one-hot encoding).
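
The labelling and batching described above can be sketched in plain Python. This is only an illustration: the folder names 'cats' and 'dogs' are assumed (the notebook does not show them), and the helpers class_indices and batches_per_epoch are hypothetical names, not Keras functions. Keras assigns label indices to class folders in alphanumeric order.

```python
import math


def class_indices(folder_names):
    """Map each class folder to an integer label, in sorted (alphanumeric) order."""
    return {name: index for index, name in enumerate(sorted(folder_names))}


def batches_per_epoch(num_images, batch_size):
    """Batches needed to see every image once; the last batch may be smaller."""
    return math.ceil(num_images / batch_size)


print(class_indices(["dogs", "cats"]))  # assumed folder names -> {'cats': 0, 'dogs': 1}
print(batches_per_epoch(557, 32))       # 557 training images -> 18
```

The 18 batches match the 18/18 step counter shown in the training output.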



Preprocessing the test set

In [11]:
test_datagen = ImageDataGenerator(rescale = 1./255) #Creates a data generator for the test set
test_set = test_datagen.flow_from_directory(        #Loads images from the directory and prepares them for the model
    'AI/cats_dogs/test',
    target_size =(64,64),
    batch_size = 32,
    class_mode = 'binary')

Found 141 images belonging to 2 classes.


This code preprocesses the test images with ImageDataGenerator and loads them from the AI/cats_dogs/test directory.

rescale=1./255: Normalizes pixel values from [0, 255] to [0, 1], matching the scaling used for the training set.
No data augmentation (since test data should not be altered).

'AI/cats_dogs/test' → Path to the test images, with one subfolder per class.
target_size=(64,64) → Resizes all images to 64x64 pixels, the same size as the training images.
batch_size=32 → Loads 32 images per batch for evaluation.
class_mode='binary' → Binary classification (e.g., Cat vs. Dog).


INITIALISING THE CNN

In [14]:
cnn = tf.keras.models.Sequential() #Creates an empty Sequential model; the layers added below turn it into a Convolutional Neural Network (CNN).

CONVOLUTION

In [16]:
# Define the input layer separately
cnn.add(tf.keras.layers.Input(shape=(64, 64, 3)))  

# Add convolutional layer (without input_shape)
cnn.add(tf.keras.layers.Conv2D(filters=32, kernel_size=3, activation='relu'))


Pooling

In [18]:
cnn.add(tf.keras.layers.MaxPool2D(pool_size = 2, strides = 2)) #This adds a Max Pooling layer to your Convolutional Neural Network (CNN).

1. MaxPool2D: Downsamples the feature maps → Reduces their height and width while preserving the strongest activations.
2. Improves computational efficiency → Smaller feature maps mean fewer parameters in the later Dense layers and faster training.
3. Helps reduce overfitting → Discards fine detail while keeping key patterns.
4. pool_size=2: Takes a 2×2 window and selects the maximum value in that region.
5. strides=2: Moves the pooling window 2 pixels at a time, halving the height and width of the feature map.
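
The pooling steps above can be sketched on a tiny feature map. This is a plain-Python illustration of 2×2 max pooling with stride 2, not the Keras implementation; max_pool_2x2 is a hypothetical helper.

```python
def max_pool_2x2(feature_map):
    """Apply 2x2 max pooling with stride 2 to a 2D list of numbers."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    # Step 2 rows/columns at a time; an odd leftover row or column is dropped.
    for r in range(0, rows - 1, 2):
        pooled_row = []
        for c in range(0, cols - 1, 2):
            window = (
                feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1],
            )
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


feature_map = [
    [1, 3, 2, 0],
    [4, 6, 1, 1],
    [0, 2, 9, 5],
    [1, 1, 3, 7],
]
print(max_pool_2x2(feature_map))  # [[6, 2], [2, 9]]
```

The 4×4 input becomes 2×2: each output value is the maximum of one non-overlapping 2×2 window.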


ADDING A SECOND CONVOLUTIONAL LAYER

In [26]:
cnn.add(tf.keras.layers.Conv2D(filters = 32, kernel_size = 3, activation = 'relu')) #do not add input shape for the second layer.
cnn.add(tf.keras.layers.MaxPool2D(pool_size = 2, strides = 2)) 

FLATTENING 

In [32]:
cnn.add(tf.keras.layers.Flatten()) #This adds a Flatten layer to your Convolutional Neural Network (CNN).

1. Converts a multi-dimensional feature map into a 1D vector
2. Prepares data for the Dense (fully connected) layers
3. Maintains important extracted features while making them suitable for classification
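
To see how long the flattened vector is for this network, the sketch below traces the spatial size through the layers added so far. It assumes the Keras defaults used in the cells above (stride-1 convolutions and no padding); the helper names are made up for illustration.

```python
def conv_output_size(size, kernel_size):
    """Spatial size after a stride-1 convolution with no padding."""
    return size - kernel_size + 1


def pool_output_size(size, pool_size, stride):
    """Spatial size after pooling with no padding; leftover border pixels are dropped."""
    return (size - pool_size) // stride + 1


size = 64                              # input images are 64x64
size = conv_output_size(size, 3)       # first Conv2D   -> 62
size = pool_output_size(size, 2, 2)    # first MaxPool  -> 31
size = conv_output_size(size, 3)       # second Conv2D  -> 29
size = pool_output_size(size, 2, 2)    # second MaxPool -> 14
flattened_length = size * size * 32    # 32 filters in the last Conv2D
print(size, flattened_length)          # 14 6272
```

Flatten therefore hands a vector of 6272 values to the first Dense layer.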

Full Connection

In [36]:
cnn.add(tf.keras.layers.Dense(units= 128, activation ='relu')) #This adds a fully connected (Dense) layer with 128 neurons and ReLU activation to your CNN.

What Does Dense() Do?
1. Creates a fully connected layer, where every neuron is connected to all previous layer neurons.
2. Processes extracted features from the convolutional layers for classification.
3. Uses activation functions (like ReLU, Softmax) to model complex patterns.
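
As a rough illustration of what a fully connected layer computes, the sketch below runs a tiny Dense layer with ReLU by hand and counts the parameters of the 128-unit layer in this notebook. The weights and inputs are made-up numbers, and the helper names are hypothetical; 6272 is the flattened length for this architecture (14 × 14 × 32).

```python
def relu(value):
    """ReLU keeps positive values and replaces negatives with 0."""
    return max(0.0, value)


def dense_forward(inputs, weights, biases):
    """Compute relu(w . x + b) for each neuron; weights[j] belongs to neuron j."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(relu(total))
    return outputs


def dense_parameter_count(num_inputs, units):
    """One weight per input per neuron, plus one bias per neuron."""
    return num_inputs * units + units


inputs = [1.0, 2.0, 3.0]
weights = [[0.5, 1.0, -0.5], [-1.0, 0.0, 0.0]]  # made-up weights for two neurons
biases = [0.0, 0.5]
print(dense_forward(inputs, weights, biases))   # [1.0, 0.0]
print(dense_parameter_count(6272, 128))         # 802944
```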

Output Layer

In [40]:
cnn.add(tf.keras.layers.Dense(units= 1, activation ='sigmoid')) #This adds an output layer with 1 neuron and Sigmoid activation for binary classification problems.
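
The sigmoid activation squashes the output neuron's raw value into a number between 0 and 1 that can be read as the probability of class 1. A minimal plain-Python sketch of the function (not the Keras implementation):

```python
import math


def sigmoid(z):
    """Map any real number to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


for z in (-4.0, 0.0, 4.0):
    print(f"z={z:+.1f} -> probability {sigmoid(z):.3f}")
```

A raw value of 0 maps to exactly 0.5; large negative values approach 0 and large positive values approach 1, without ever reaching either bound.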

TRAINING THE CNN

compiling the CNN

In [44]:
cnn.compile(optimizer ='adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
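
For intuition about the loss value, here is a small sketch of binary cross-entropy for a single example, written in plain Python rather than taken from Keras. The function name mirrors the loss string but is defined here only for illustration.

```python
import math


def binary_crossentropy(label, probability):
    """Loss for one example: -[y*log(p) + (1-y)*log(1-p)]."""
    return -(label * math.log(probability)
             + (1 - label) * math.log(1 - probability))


print(round(binary_crossentropy(1, 0.5), 4))  # 0.6931
print(round(binary_crossentropy(0, 0.5), 4))  # 0.6931
print(round(binary_crossentropy(1, 0.9), 4))  # 0.1054
```

A loss of about 0.693 (ln 2) is what a model gets when it outputs 0.5 for every image, so values near that level suggest predictions close to chance.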

training the CNN on the training set and evaluating it on the Test set

In [47]:
cnn.fit(x = training_set, validation_data = test_set, epochs = 25) #This trains the CNN model using the provided dataset. It runs for 25 epochs, validating performance on the test set.



Epoch 1/25
18/18 ━━━━━━━━━━━━━━━━━━━━ 3s 131ms/step - accuracy: 0.4633 - loss: 0.8461 - val_accuracy: 0.4965 - val_loss: 0.6975
Epoch 2/25
18/18 ━━━━━━━━━━━━━━━━━━━━ 2s 116ms/step - accuracy: 0.5291 - loss: 0.6887 - val_accuracy: 0.5745 - val_loss: 0.6911
Epoch 3/25
18/18 ━━━━━━━━━━━━━━━━━━━━ 2s 121ms/step - accuracy: 0.5957 - loss: 0.6916 - val_accuracy: 0.5248 - val_loss: 0.6908
Epoch 4/25
18/18 ━━━━━━━━━━━━━━━━━━━━ 2s 111ms/step - accuracy: 0.5166 - loss: 0.6964 - val_accuracy: 0.5248 - val_loss: 0.6917
Epoch 5/25
18/18 ━━━━━━━━━━━━━━━━━━━━ 2s 118ms/step - accuracy: 0.5208 - loss: 0.6935 - val_accuracy: 0.5248 - val_loss: 0.6890
Epoch 6/25
18/18 ━━━━━━━━━━━━━━━━━━━━ 2s 112ms/step - accuracy: 0.5545 - loss: 0.6888 - val_accuracy: 0.5035 - val_loss: 0.6942
Epoch 7/25
18/18 [output truncated]

<keras.src.callbacks.history.History at 0x16becb470>

MAKING A SINGLE PREDICTION

In [56]:
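import numpy as np
from keras.preprocessing import image

# Load the image at the same size the network was trained on
test_image = image.load_img('AI/cats_dogs/predict/pup2.jpg', target_size=(64, 64))
# Convert to an array and apply the same 1/255 rescaling used by the generators
test_image = image.img_to_array(test_image) / 255.0
# Add the batch dimension: (64, 64, 3) -> (1, 64, 64, 3)
test_image = np.expand_dims(test_image, axis=0)
result = cnn.predict(test_image)
# training_set.class_indices gives the folder-to-label mapping;
# this cell assumes the dog class is mapped to 1
# The sigmoid output is a probability, so threshold it instead of testing for exactly 1
if result[0][0] > 0.5:
    prediction = 'Dog'
else:
    prediction = 'Cat'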

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 27ms/step


In [58]:
print(prediction)

Dog
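
Rather than hard-coding which class is 1, the label can be looked up from the generator's class_indices mapping. The sketch below shows one way to do that in plain Python; the mapping {'cats': 0, 'dogs': 1} and the probabilities are assumed values for illustration, and label_from_probability is a hypothetical helper.

```python
def label_from_probability(probability, class_indices, threshold=0.5):
    """Return the class name for a sigmoid output, using a name -> index mapping."""
    index_to_name = {index: name for name, index in class_indices.items()}
    predicted_index = 1 if probability > threshold else 0
    return index_to_name[predicted_index]


class_indices = {"cats": 0, "dogs": 1}  # assumed; check training_set.class_indices
print(label_from_probability(0.83, class_indices))  # dogs
print(label_from_probability(0.12, class_indices))  # cats
```

In the notebook this would be called with result[0][0] and training_set.class_indices.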
