
# <span style = "color : green"> DOGS vs CATS </span>

### Data Augmentation

#### Import Libraries

In [1]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator, img_to_array, load_img, array_to_img

#### Testing that the image data generator works

In [2]:
datagen= ImageDataGenerator(
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest')

##### Explanation

rotation_range: The range (in degrees) for random rotations of the input images.

width_shift_range: The range (as a fraction of total width) for random horizontal shifts of the input images.

height_shift_range: The range (as a fraction of total height) for random vertical shifts of the input images.

shear_range: The range (in degrees) for random shearing transformations of the input images.

zoom_range: The range for random zooming of the input images.

horizontal_flip: A boolean value indicating whether to randomly flip the input images horizontally.

fill_mode: The method used for filling in any pixels that may be lost during the above transformations. In this case, 'nearest' is used, which means that the pixel value of the nearest neighboring pixel will be used to fill in any lost pixels.
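To make the 'nearest' fill behaviour concrete, here is a minimal NumPy sketch on a single row of pixels. It only illustrates the idea (repeat the closest surviving pixel); it is not Keras's internal implementation, and the pixel values and shift amount are made up for the example.

```python
import numpy as np

# One image row; pretend a horizontal shift moved its content 2 pixels right.
row = np.array([10, 20, 30, 40])
shift = 2

# mode="edge" repeats the nearest border value into the gap, which is the
# same idea as fill_mode='nearest'. Illustration only, not Keras internals.
shifted = np.pad(row, (shift, 0), mode="edge")[: row.size]

print(shifted)  # [10 10 10 20]
```

The two positions vacated by the shift are filled with 10, the value of the nearest remaining pixel.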

In [3]:
from PIL import Image

image_path = "C:/Users/Administrator/Desktop/DATA-SCIENCE/DL/DogsVsCats/DATASET/cats_or_dogs_1.jpg"

with Image.open(image_path) as image:
    print(f"Image size: {image.size}")
    print(f"Image format: {image.format}")


Image size: (500, 520)
Image format: JPEG


In [4]:
img = load_img('C:/Users/Administrator/Desktop/DATA-SCIENCE/DL/DogsVsCats/DATASET/cats_or_dogs_1.jpg')

In [5]:
img = img.resize((150, 150))

In [6]:
x=img_to_array(img)

In [7]:
print(img.size)

(150, 150)


In [8]:
x = x.reshape((1,) + x.shape)  # NumPy array with shape (1, 150, 150, 3)

##### Explanation

This line uses NumPy's reshape() to add a leading dimension of size 1 to the array x.

img_to_array() returns the image in Keras's default channels-last layout, so the original shape of x is (150, 150, 3): a 150x150 image with 3 color channels.

After reshape(), the shape of x is (1, 150, 150, 3). The new leading dimension is the batch dimension, and the remaining dimensions are unchanged. Both datagen.flow() and Keras convolutional models expect rank-4 input in the format (batch_size, height, width, channels), so a single image has to be wrapped in a batch of one.
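The effect of the reshape can be checked on a placeholder array. This sketch assumes Keras's default channels-last layout and uses an all-zero array as a stand-in for the real output of img_to_array(); np.expand_dims is shown as an equivalent alternative.

```python
import numpy as np

# Stand-in for img_to_array(img): a 150x150 RGB image, channels last.
x = np.zeros((150, 150, 3), dtype="float32")

batched = x.reshape((1,) + x.shape)       # same call as in the notebook
also_batched = np.expand_dims(x, axis=0)  # equivalent alternative

print(batched.shape)  # (1, 150, 150, 3)
```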

In [9]:
i=0
for batch in datagen.flow(x,save_to_dir='C:/Users/Administrator/Desktop/DATA-SCIENCE/DL/DogsVsCats/preview',save_prefix='cat',save_format='jpeg'):
    i+=1
    if i>20:
        break

##### Explanation

i = 0: Initializes a counter that tracks how many batches of augmented images have been generated.

for batch in datagen.flow(x, ...): Uses the flow method of the datagen object to generate batches of augmented images on the fly from the array x. Because x holds a single image, every batch contains exactly one augmented image, even though batch_size is left at its default of 32. The save_to_dir parameter specifies the directory where the augmented images are saved, save_prefix specifies the prefix added to each filename, and save_format specifies the file format (in this case, JPEG).

i += 1: Increments the counter by 1 after each batch is generated.

if i > 20: break: Exits the loop once the counter exceeds 20. Because the check runs after the increment, the loop stops on the 21st batch, so 21 augmented images are generated. The break is required: flow yields batches indefinitely, so without it the loop would never end.
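The counting behaviour of the loop can be reproduced without Keras. In the sketch below, endless_batches is a hypothetical stand-in for datagen.flow(...), which also yields batches forever; the second variant uses itertools.islice as one possible way to stop after exactly 20 batches.

```python
from itertools import count, islice


def endless_batches():
    """Hypothetical stand-in for datagen.flow(...): yields batches forever."""
    for n in count(1):
        yield n


def count_batches_original():
    """Same loop structure as the notebook cell; returns batches consumed."""
    i = 0
    consumed = 0
    for _batch in endless_batches():
        consumed += 1
        i += 1
        if i > 20:
            break
    return consumed


consumed_original = count_batches_original()
consumed_islice = len(list(islice(endless_batches(), 20)))

print(consumed_original)  # 21
print(consumed_islice)    # 20
```

Changing the condition to `if i >= 20` would have the same effect as the islice variant.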

### Actual model 

In [10]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D,MaxPooling2D
from tensorflow.keras.layers import Activation,Dropout,Flatten,Dense
from tensorflow.keras.layers import ZeroPadding2D

In [11]:
model=Sequential()
model.add(Conv2D(32,(3,3),input_shape=(150,150,3)))

A 3x3 kernel is a common default for convolutional layers: it is the smallest size that captures a pixel's neighborhood in every direction, and stacking several 3x3 layers covers larger regions with relatively few parameters. The input_shape of (150, 150, 3) tells the layer to expect 150x150 images with 3 color channels (RGB).

The number of filters in a convolutional layer is often chosen based on the complexity of the problem and the capacity of the model. In this case, 32 filters is a relatively small number and is a common choice for the first convolutional layer in a simple CNN architecture.
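As a rough check on what 32 filters of size 3x3 cost, the parameter count and output size of this first layer can be worked out by hand. The helper names below are hypothetical; the arithmetic assumes the Conv2D defaults used in the notebook (stride 1, no padding, one bias per filter).

```python
def conv2d_param_count(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter for a standard 2D convolution."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


def valid_conv_output_size(input_size, kernel_size):
    """Spatial size after a stride-1 convolution without padding."""
    return input_size - kernel_size + 1


first_layer_params = conv2d_param_count(3, 3, 3, 32)
first_layer_side = valid_conv_output_size(150, 3)

print(first_layer_params, first_layer_side)  # 896 148
```

So the first layer has 896 trainable parameters and turns each 150x150x3 image into a 148x148x32 feature map.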

In [12]:
model.add(Activation('relu'))

This line adds a rectified linear unit (ReLU) activation function to the layer. ReLU is commonly used as an activation function in deep learning models because it is simple, fast, and has been shown to work well in practice.
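ReLU itself is a one-line function: it passes positive values through and replaces negative values with zero. A plain-Python sketch, with made-up input values:

```python
def relu(value):
    """Rectified linear unit: max(0, value)."""
    return max(0.0, value)


activations = [relu(v) for v in [-2.0, -0.5, 0.0, 1.5]]
print(activations)  # [0.0, 0.0, 0.0, 1.5]
```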

In [13]:
model.add(MaxPooling2D(pool_size=(2, 2)))

In [14]:
#Hidden layer
model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

In [15]:
# Two fully connected layers
model.add(Flatten())


#1st
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5)) #avoids overfitting

#2nd
model.add(Dense(1))
model.add(Activation('sigmoid'))

model.add(Flatten()): This layer flattens the input array to a 1D array. This is necessary because the convolutional and pooling layers output a 3D array, and the next layer in the model is a fully connected layer which expects a 1D array.

model.add(Dense(64)): This adds a fully connected layer with 64 neurons.

model.add(Activation('relu')): This applies the rectified linear unit (ReLU) activation function to the output of the previous layer. ReLU is a common activation function used in neural networks to introduce nonlinearity.

model.add(Dropout(0.5)): This adds a dropout layer with a rate of 0.5, which randomly sets 50% of the previous layer's outputs to zero on each training step. Dropout is only active during training and is a regularization technique used to reduce overfitting.

model.add(Dense(1)): This adds a final fully connected layer with a single neuron, which is used for binary classification.

model.add(Activation('sigmoid')): This applies the sigmoid activation function to the output of the previous layer. The sigmoid function squashes the output to a range between 0 and 1, which is useful for binary classification.
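To see how large the flattened vector actually is, the spatial size can be traced through the three convolution and pooling blocks by hand. This is a sketch under the assumption that all layers use their Keras defaults (3x3 unpadded convolutions with stride 1, 2x2 pooling with stride 2); the helper names are hypothetical.

```python
def conv_valid(size, kernel=3):
    """Spatial size after a stride-1 convolution without padding."""
    return size - kernel + 1


def max_pool(size, pool=2):
    """Spatial size after non-overlapping pooling (rounds down)."""
    return size // pool


side = 150
for _ in range(3):  # three Conv2D + MaxPooling2D blocks
    side = max_pool(conv_valid(side))

flattened = side * side * 64        # the last conv block has 64 filters
dense_params = flattened * 64 + 64  # Dense(64): weights plus biases

print(side, flattened, dense_params)  # 17 18496 1183808
```

Under these assumptions, most of the model's parameters sit in the first Dense layer, which is one reason dropout is applied there. Calling model.summary() on the real model is the way to confirm these numbers.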






In [16]:
model.compile(loss='binary_crossentropy',
             optimizer='rmsprop',
             metrics=['accuracy'])

Let's prepare our data. We will use .flow_from_directory() to generate batches of image data (and their labels) directly from our jpgs in their respective folders.

In [17]:
batch_size=16

#training set
train_datagen=ImageDataGenerator(rescale=1./255,
                                shear_range=0.2,
                                zoom_range=0.2,
                                horizontal_flip=True)
test_datagen=ImageDataGenerator(rescale=1./255)


rescale=1./255: rescales the pixel values of the images to be between 0 and 1, which is a common preprocessing step for image data.

shear_range=0.2: randomly applies a shearing transformation to the images, which shifts the positions of pixels along a certain direction.

zoom_range=0.2: randomly applies a zooming transformation to the images, which either zooms in or out of the image by a certain factor.

horizontal_flip=True: randomly mirrors the images left to right, which helps the model recognize an animal whichever way it is facing.

Together, these transformations create a more diverse and robust training set, which helps to improve the model's performance.
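The effect of rescale=1./255 can be illustrated with NumPy on a few sample pixel values. This is only a sketch of the arithmetic, with made-up values, not the generator's own code path.

```python
import numpy as np

# Sample 8-bit pixel intensities: darkest, mid-gray, brightest.
pixels = np.array([0, 128, 255], dtype="uint8")

# Multiplying by 1/255 maps the 0..255 range onto 0..1.
scaled = pixels * (1.0 / 255)

print(scaled.min(), scaled.max())
```

The same scaling has to be applied to any image passed to the trained model later, since the model only ever sees inputs in the 0 to 1 range during training.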

In [18]:
train_generator = train_datagen.flow_from_directory(
        'C:/Users/Administrator/Desktop/DATA-SCIENCE/DL/DogsVsCats/train',  # this is the target directory
        target_size=(150, 150),  # all images will be resized to 150x150
        batch_size=batch_size,
        class_mode='binary')

Found 557 images belonging to 2 classes.


In [19]:
test_generator=test_datagen.flow_from_directory(
    'C:/Users/Administrator/Desktop/DATA-SCIENCE/DL/DogsVsCats/test',
    target_size=(150,150), batch_size=batch_size, class_mode='binary')

Found 140 images belonging to 2 classes.


The rescale parameter normalizes the pixel values of the images to the range [0, 1]. The shear_range, zoom_range, and horizontal_flip parameters apply random transformations to the training images to augment the dataset and help prevent overfitting. The test generator only rescales, so the model is evaluated on unmodified images.

The train_generator variable is created by calling the flow_from_directory method of the ImageDataGenerator instance. This method generates batches of training data by reading images from a directory on the local file system. In this case, the train_generator will read images from the directory specified by the path 'C:/Users/Administrator/Desktop/DATA-SCIENCE/DL/DogsVsCats/train'.

The target_size parameter specifies the size to which all images are resized before they are fed into the model. Here, all images are resized to (150, 150) pixels.

The batch_size parameter specifies the number of images to include in each batch of training data. This can be adjusted depending on the available memory on the machine running the code.

The class_mode parameter specifies the type of label encoding to use for the labels associated with the images. Here, since the problem is a binary classification task (dogs vs cats), the class_mode is set to 'binary'. This means that the labels will be encoded as either 0 or 1.
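Which class gets 0 and which gets 1 depends on the subfolder names: flow_from_directory assigns label indices in alphanumeric order of the class subdirectories. The sketch below mimics that ordering in plain Python; the folder names "cats" and "dogs" are an assumption, since the notebook does not show the actual directory listing.

```python
# Hypothetical subfolder names; the notebook does not show the real ones.
subfolders = ["dogs", "cats"]

# Sorted order determines the label index of each class.
class_indices = {name: index for index, name in enumerate(sorted(subfolders))}

print(class_indices)  # {'cats': 0, 'dogs': 1}
```

The real mapping can be read from train_generator.class_indices after the generator has been created.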

In summary, this code creates an instance of ImageDataGenerator that applies various data augmentation techniques to the images, and then generates batches of training data from a directory of images by calling the flow_from_directory method of the ImageDataGenerator instance.
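The training call that follows derives its step counts from the image counts using integer division. The arithmetic is worth spelling out, because rounding down means the final partial batch does not fit into the step count. The image counts are the ones reported by flow_from_directory above; the ceiling variant is shown only as a possible alternative.

```python
import math

batch_size = 16
train_images, test_images = 557, 140  # counts reported by flow_from_directory

train_steps = train_images // batch_size
test_steps = test_images // batch_size

print(train_steps, train_steps * batch_size)  # 34 544
print(test_steps, test_steps * batch_size)    # 8 128

# Rounding up instead would include the final partial batch.
train_steps_ceil = math.ceil(train_images / batch_size)
print(train_steps_ceil)  # 35
```

With floor division, each epoch draws 34 training batches (at most 544 images) and each validation pass uses 8 batches (at most 128 images).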

In [20]:
# fit_generator is deprecated in TensorFlow 2.x; model.fit accepts generators directly
model.fit(train_generator,
          steps_per_epoch=557//batch_size, epochs=50,
          validation_data=test_generator,
          validation_steps=140//batch_size)



Epoch 1/50
...
Epoch 50/50


<keras.callbacks.History at 0x1f44e0a0a30>

In [34]:
model.save('DogsVsCats.h5')

In [42]:
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing import image
import numpy as np
import tensorflow as tf

In [44]:
nmodel=load_model('DogsVsCats.h5')

In [81]:
img=image.load_img('cat1.jpg',target_size=(150,150))

In [82]:
# Scale to [0, 1] to match the rescale=1./255 used by the training generators
img_array = image.img_to_array(img) / 255.0

In [83]:
img_array=img_array.reshape((1,)+img_array.shape)

In [84]:
prediction=nmodel.predict(img_array)



In [85]:
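# prediction has shape (1, 1); the value is the sigmoid score for class index 1
score = prediction[0][0]
if score < 0.5:
    print('the image is a cat')
elif score > 0.5:
    print('the image is a dog')
else:
    print('the model is undecided (score is exactly 0.5)')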

the image is a cat
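The thresholding step can be wrapped in a small helper so the decision rule is stated in one place. label_from_score is a hypothetical function, not part of Keras, and it assumes the class mapping cats = 0 and dogs = 1; the scores below are made up for illustration.

```python
def label_from_score(score, threshold=0.5, negative="cat", positive="dog"):
    """Map a sigmoid score to a label; assumes class index 1 is 'dog'."""
    return positive if score >= threshold else negative


print(label_from_score(0.12))  # cat
print(label_from_score(0.93))  # dog
```

A binary classifier with a single sigmoid output always picks one of the two classes, so it cannot report that an image shows neither a cat nor a dog.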
