# Cats and Dogs binary classification using transfer learning

In this project, we tackle the classic binary image classification problem of distinguishing between cats and dogs using deep learning. Given the complexity of visual features in animal images, training a high-accuracy model from scratch requires a large dataset and significant computational resources.

To overcome these challenges, we use transfer learning: we start from a pre-trained deep learning model (InceptionV3) that has already learned rich feature representations from a massive dataset (ImageNet). By reusing these features and training only a small classifier on top, we can achieve strong performance even with a relatively small dataset of cat and dog images.

## Model: InceptionV3
We use InceptionV3, a widely used CNN architecture, as our base model. Key advantages:

- Depth and complexity: 48 layers deep, built from inception modules for multi-scale feature extraction.
- Pre-trained on ImageNet: the weights were learned on ImageNet, a large image dataset with about 1.2M training images across 1,000 classes.
- Efficient training: we freeze the pre-trained layers and train only a custom classifier head.

### Import modules

In [None]:
import urllib.request
import os
import zipfile
import random
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import layers
from tensorflow.keras import Model
from tensorflow.keras.applications.inception_v3 import InceptionV3
from tensorflow.keras.optimizers import RMSprop
from shutil import copyfile


## Data Preparation

We use the Cats vs. Dogs dataset, hosted by Microsoft Research and originally released as part of a Kaggle competition (2013). It contains about 12,500 images of each class.

In [None]:
data_url = "https://download.microsoft.com/download/3/E/1/3E1C3F21-ECDB-4869-8368-6DEBA77B919F/kagglecatsanddogs_5340.zip"
data_file_name = "catsdogs.zip"

download_dir = '/tmp/'
urllib.request.urlretrieve(data_url, data_file_name)

# open the zip file in read mode and extract everything into download_dir
with zipfile.ZipFile(data_file_name, 'r') as zip_ref:
    zip_ref.extractall(download_dir)



In [None]:
from google.colab import drive
drive.mount('/content/drive')

In [None]:

print("Number of cat images:",len(os.listdir('/tmp/PetImages/Cat/')))
print("Number of dog images:", len(os.listdir('/tmp/PetImages/Dog/')))


Create some folders that will store the training and test data.
- There will be a training folder and a testing folder.
- Each of these will have a subfolder for cats and another subfolder for dogs.

In [None]:
# makedirs creates the parent folders too, and exist_ok=True makes the cell safe to re-run
for split in ('training', 'testing'):
    for label in ('cats', 'dogs'):
        os.makedirs(os.path.join('/tmp/cats-v-dogs', split, label), exist_ok=True)

### Split data into training and test sets

- The following code first checks whether each image file is empty (zero length) and skips it if so.
- Of the files that are not empty, it puts 90% of the data into the training set, and 10% into the test set.
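
The two steps above can be sketched on plain lists before touching any files. This is only an illustration: the helper name `shuffle_split` and the file names and sizes are made up for the example, and it uses a local random generator rather than the global seed.

```python
import random

def shuffle_split(items, split_size, seed=42):
    """Shuffle a copy of items, then cut it into (training, testing) lists."""
    rng = random.Random(seed)                 # local RNG, global state is untouched
    shuffled = rng.sample(items, len(items))  # shuffled copy of the input
    cut = int(len(shuffled) * split_size)     # number of items in the training part
    return shuffled[:cut], shuffled[cut:]

# Hypothetical file names mapped to sizes in bytes; "2.jpg" is empty.
sizes = {f"{i}.jpg": 1000 for i in range(21)}
sizes["2.jpg"] = 0

# Step 1: keep only the non-empty files.
non_empty = [name for name, size in sizes.items() if size > 0]

# Step 2: put 90% in training and the rest in testing.
training, testing = shuffle_split(non_empty, 0.9)
print(len(training), len(testing))  # 18 2
```

Because the list is shuffled once and then sliced, no file can end up in both sets.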

In [None]:
import random
from shutil import copyfile
random.seed(42)

def split_data(SOURCE, TRAINING, TESTING, SPLIT_SIZE):
    files = []

    for filename in os.listdir(SOURCE):
        file = SOURCE + filename
        if os.path.getsize(file) > 0:
            files.append(filename)
        else:
            print(filename + " is zero length, so ignoring.")

    # number of files that go into the training set; the rest go to testing
    training_length = int(len(files) * SPLIT_SIZE)

    # shuffle before splitting to avoid any ordering bias
    shuffled_set = random.sample(files, len(files))

    training_set = shuffled_set[0:training_length]
    testing_set = shuffled_set[training_length:]

    #copies each file in training set to TRAINING dir
    for filename in training_set:
        this_file = SOURCE + filename
        destination = TRAINING + filename
        copyfile(this_file, destination)

    for filename in testing_set:
        this_file = SOURCE + filename
        destination = TESTING + filename
        copyfile(this_file, destination)



In [None]:
#define paths
CAT_SOURCE_DIR = "/tmp/PetImages/Cat/"
TRAINING_CATS_DIR = "/tmp/cats-v-dogs/training/cats/"
TESTING_CATS_DIR = "/tmp/cats-v-dogs/testing/cats/"
DOG_SOURCE_DIR = "/tmp/PetImages/Dog/"
TRAINING_DOGS_DIR = "/tmp/cats-v-dogs/training/dogs/"
TESTING_DOGS_DIR = "/tmp/cats-v-dogs/testing/dogs/"

split_size = .9 # 90% in training dataset

split_data(CAT_SOURCE_DIR, TRAINING_CATS_DIR, TESTING_CATS_DIR, split_size)
split_data(DOG_SOURCE_DIR, TRAINING_DOGS_DIR, TESTING_DOGS_DIR, split_size)


Let's check the sizes of the training and test sets.

In [None]:

print("Number of training cat images", len(os.listdir('/tmp/cats-v-dogs/training/cats/')))
print("Number of training dog images", len(os.listdir('/tmp/cats-v-dogs/training/dogs/')))
print("Number of testing cat images", len(os.listdir('/tmp/cats-v-dogs/testing/cats/')))
print("Number of testing dog images", len(os.listdir('/tmp/cats-v-dogs/testing/dogs/')))


### Data augmentation

Here, we are using the `ImageDataGenerator` to perform data augmentation.  
Transformations such as rotating and flipping the existing images generate more varied training data, which can help the model generalize better and improve test accuracy.
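
To make two of these operations concrete, here is a toy, pure-Python sketch of rescaling and horizontal flipping on a tiny grayscale "image" stored as nested lists. It is not how `ImageDataGenerator` is implemented (which applies its transformations randomly to each batch); the helper names and pixel values are invented for illustration.

```python
def rescale(image, factor=1.0 / 255):
    """Multiply every pixel by factor, mapping 0..255 to roughly 0..1."""
    return [[pixel * factor for pixel in row] for row in image]

def horizontal_flip(image):
    """Mirror the image left to right by reversing each row."""
    return [row[::-1] for row in image]

# A hypothetical 2x3 grayscale image with pixel values in 0..255.
image = [[0, 51, 255],
         [102, 153, 204]]

flipped = horizontal_flip(image)
scaled = rescale(image)

print(flipped)  # [[255, 51, 0], [204, 153, 102]]
```

A flipped cat is still a cat, so the label stays the same while the model sees a new variation of the image.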

`flow_from_directory()` automatically assigns labels based on subfolder names: for the training data, cat images are stored in `/tmp/cats-v-dogs/training/cats/` and dog images in `/tmp/cats-v-dogs/training/dogs/`. The testing data follows the same layout.
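
The labeling rule can be mimicked in a few lines: subfolder names are sorted alphanumerically and numbered from 0. The sketch below reproduces that mapping in a temporary directory; `class_indices_from_dirs` is a hypothetical helper, not a Keras function. On the real generator, the same mapping should be available as `train_generator.class_indices`.

```python
import os
import tempfile

def class_indices_from_dirs(root):
    """Map each subfolder name under root to an integer label, in sorted order."""
    classes = sorted(
        name for name in os.listdir(root)
        if os.path.isdir(os.path.join(root, name))
    )
    return {name: index for index, name in enumerate(classes)}

# Build a throwaway folder layout like the training directory.
with tempfile.TemporaryDirectory() as root:
    for name in ("dogs", "cats"):
        os.makedirs(os.path.join(root, name))
    indices = class_indices_from_dirs(root)

print(indices)  # {'cats': 0, 'dogs': 1}
```

With `class_mode='binary'`, this means cats are labeled 0 and dogs are labeled 1.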

In [None]:

TRAINING_DIR = "/tmp/cats-v-dogs/training/"

train_datagen = ImageDataGenerator(rescale=1./255,
      rotation_range=40,
      width_shift_range=0.2,
      height_shift_range=0.2,
      shear_range=0.2,
      zoom_range=0.2,
      horizontal_flip=True,
      fill_mode='nearest')
train_generator = train_datagen.flow_from_directory(TRAINING_DIR,
                                                    batch_size=100,
                                                    class_mode='binary',
                                                    target_size=(150, 150))

VALIDATION_DIR = "/tmp/cats-v-dogs/testing/"



validation_datagen = ImageDataGenerator(rescale=1./255)#we are just normalizing it.
validation_generator = validation_datagen.flow_from_directory(VALIDATION_DIR,
                                                              batch_size=100,
                                                              class_mode='binary',
                                                              target_size=(150, 150))



## Transfer Learning: Get and prepare the model

We will be using the `InceptionV3` model.  
- Since we're making use of transfer learning, we will load the pre-trained weights of the model.
- We'll also freeze the existing layers so that they aren't trained on the downstream task with the cats and dogs data.
- We'll also get a reference to the `mixed7` layer, an intermediate layer of InceptionV3 that we treat as the last layer of the base, because we'll add our own layers after it.

In [None]:
weights_url = "https://storage.googleapis.com/mledu-datasets/inception_v3_weights_tf_dim_ordering_tf_kernels_notop.h5"
weights_file = "inception_v3.h5"

# download and retrieve the weights
urllib.request.urlretrieve(weights_url, weights_file)


In [None]:

# Instantiate the model, remove the top layer
pre_trained_model = InceptionV3(input_shape=(150, 150, 3),
                                include_top=False,
                                weights=None)

# load pre-trained weights
pre_trained_model.load_weights(weights_file)

# freeze the layers, prevent these weights from being updated.
for layer in pre_trained_model.layers:
    layer.trainable = False

# pre_trained_model.summary()
last_layer = pre_trained_model.get_layer('mixed7')

print('last layer output shape: ', last_layer.output.shape)
last_output = last_layer.output



### Add layers: A custom classifier
We will add some layers that will be trained on the cats and dogs data.
- `Flatten`: This will take the output of the `last_layer` and flatten it to a vector.
- `Dense`: a dense layer with a relu activation.
- `Dense`: After that, a dense layer with a sigmoid activation.  The sigmoid will scale the output to range from 0 to 1, and allow you to interpret the output as a prediction between two categories (cats or dogs).
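
As a small illustration of the last point, the sketch below implements the sigmoid function and a 0.5 decision threshold in plain Python. It assumes dogs are class 1 and cats are class 0, which matches the alphabetical ordering of the subfolders; `predict_label` is a hypothetical helper, not part of Keras.

```python
import math

def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_label(z, threshold=0.5):
    """Turn the raw score z of the final unit into a class name."""
    # Assumption: output above the threshold means class 1 (dog).
    return "dog" if sigmoid(z) > threshold else "cat"

print(sigmoid(0.0))         # 0.5
print(predict_label(2.0))   # dog
print(predict_label(-2.0))  # cat
```

Large positive scores push the output toward 1 (dog) and large negative scores push it toward 0 (cat).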

Then instantiate the model.

In [None]:
# Flatten to 1 dimension
x = layers.Flatten()(last_output)
# fully connected layer with 1,024 hidden units and ReLU activation
x = layers.Dense(1024, activation='relu')(x)
# Add a final sigmoid layer for binary classification
x = layers.Dense(1, activation='sigmoid')(x)

model = Model(pre_trained_model.input, x)
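
As a rough check on how many weights the new head adds, the arithmetic can be worked through by hand. The sketch assumes that `mixed7` outputs a 7 x 7 x 768 tensor for 150 x 150 inputs, which is what the shape printout in the earlier cell should report; if your printout differs, adjust `flattened` accordingly.

```python
def dense_params(inputs, units):
    """Weights plus biases of a fully connected layer."""
    return inputs * units + units

# Assumed output shape of 'mixed7' for 150x150 inputs: (7, 7, 768).
flattened = 7 * 7 * 768                  # length of the vector produced by Flatten
hidden = dense_params(flattened, 1024)   # Dense(1024, activation='relu')
output = dense_params(1024, 1)           # Dense(1, activation='sigmoid')
total = hidden + output

print(flattened)  # 37632
print(hidden)     # 38536192
print(output)     # 1025
print(total)      # 38537217
```

Since the InceptionV3 layers are frozen, these are the only weights updated during training, and `model.summary()` should report them as the trainable parameters.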


## Model Training
Compile the model, and then train it on the training data, using the test data for validation.
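
The loss used for this task is binary cross-entropy, which for a true label y (0 or 1) and a predicted probability p is -(y log p + (1 - y) log(1 - p)). The following is a minimal pure-Python sketch of that formula, not the Keras implementation; the clipping constant `eps` is an assumption added to avoid taking log(0).

```python
import math

def binary_crossentropy(y_true, p, eps=1e-7):
    """Cross-entropy between a 0/1 label and a predicted probability p."""
    p = min(max(p, eps), 1.0 - eps)  # keep p away from exactly 0 and 1
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))

# For a dog image (label 1), a confident correct prediction is cheap
# and a confident wrong prediction is expensive.
confident_right = binary_crossentropy(1, 0.9)
unsure = binary_crossentropy(1, 0.5)
confident_wrong = binary_crossentropy(1, 0.1)

print(confident_right < unsure < confident_wrong)  # True
```

An unsure prediction of 0.5 costs log(2), about 0.693, whichever the true label is.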


In [None]:
# compile the model
model.compile(optimizer=RMSprop(learning_rate=0.0001),
              loss='binary_crossentropy',
              metrics=['acc'])

# train the model (adjust the number of epochs to trade training time against performance)
history = model.fit(
            train_generator,
            validation_data=validation_generator,
            epochs=10,
            verbose=1)

### Visualize the training and validation accuracy

In [None]:
%matplotlib inline

import matplotlib.image  as mpimg
import matplotlib.pyplot as plt

acc=history.history['acc']
val_acc=history.history['val_acc']
loss=history.history['loss']
val_loss=history.history['val_loss']

epochs=range(len(acc)) # Get number of epochs

# Plot training and validation accuracy per epoch
plt.plot(epochs, acc, 'r', label="Training Accuracy")
plt.plot(epochs, val_acc, 'b', label="Validation Accuracy")
plt.title('Training and validation accuracy')
plt.legend()
plt.show()



### Predict on a test image

You can upload any image and have the model predict whether it's a dog or a cat.
- Find an image of a dog or cat
- Run the following code cell.  It will ask you to upload an image.
- The model will print "is a dog" or "is a cat" depending on the model's prediction.

In [None]:
import numpy as np
from google.colab import files
from tensorflow.keras.utils import load_img, img_to_array

uploaded = files.upload()

for fn in uploaded.keys():

  # predicting images
  path = '/content/' + fn
  img = load_img(path, target_size=(150, 150))
  x = img_to_array(img)
  x /= 255
  x = np.expand_dims(x, axis=0)

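  # x already has a batch dimension of 1, so it can be passed to predict directly
  classes = model.predict(x)
  print(classes[0])
  # the sigmoid output is the probability of class 1 (dogs)
  if classes[0][0] > 0.5: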
    print(fn + " is a dog")
  else:
    print(fn + " is a cat")