# **Classifying Cats or Dogs using Transfer Learning**
In this notebook, we will classify images as cats or dogs using a pretrained MobileNet model.

Install the required packages.

In [1]:
!pip install -U tensorflow_datasets

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting tensorflow_datasets
  Downloading tensorflow_datasets-4.6.0-py3-none-any.whl (4.3 MB)
Collecting toml
  Downloading toml-0.10.2-py2.py3-none-any.whl (16 kB)
Collecting etils[epath]
  Downloading etils-0.6.0-py3-none-any.whl (98 kB)
Installing collected packages: etils, toml, tensorflow-datasets
  Attempting uninstall: tensorflow-datasets
    Found existing installation: tensorflow-datasets 4.0.1
    Uninstalling tensorflow-datasets-4.0.1:
      Successfully uninstalled tensorflow-datasets-4.0.1
Successfully installed etils-0.6.0 tensorflow-datasets-4.6.0 toml-0.10.2


Import the required packages.

In [2]:
import time
import numpy as np
import matplotlib.pyplot as plt
import os

import tensorflow as tf
import tensorflow_datasets as tfds
tfds.disable_progress_bar()

from tensorflow.keras import layers

# **Loading the dataset**

Load the data and split it into training and validation sets.

In [3]:
(train_examples, validation_examples), info = tfds.load(
    'cats_vs_dogs',
    split = ('train[:80%]', 'train[80%:]'),
    with_info = True,
    as_supervised = True
)

Downloading and preparing dataset 786.68 MiB (download: 786.68 MiB, generated: Unknown size, total: 786.68 MiB) to ~/tensorflow_datasets/cats_vs_dogs/4.0.0...
Dataset cats_vs_dogs downloaded and prepared to ~/tensorflow_datasets/cats_vs_dogs/4.0.0. Subsequent calls will reuse this data.
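The split strings `'train[:80%]'` and `'train[80%:]'` cut the single `train` split into a leading 80% and a trailing 20%. As a rough illustration of that idea only, the index ranges could be computed as below. The helper name `percent_slice` and the example count of 1,000 are assumptions made for this sketch, and the exact rounding TFDS applies at the boundary may differ.

```python
def percent_slice(num_examples, start_pct, end_pct):
    """Return the (start, stop) index range for a percentage slice.

    Hypothetical helper for illustration; it is not part of TFDS.
    """
    start = round(num_examples * start_pct / 100)
    stop = round(num_examples * end_pct / 100)
    return start, stop

num_examples = 1000  # made-up dataset size
train_range = percent_slice(num_examples, 0, 80)         # like 'train[:80%]'
validation_range = percent_slice(num_examples, 80, 100)  # like 'train[80%:]'

print(train_range, validation_range)
```

The two ranges do not overlap and together cover every example, which is what keeps the validation images unseen during training.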


# **Preprocessing the data**

In [4]:
BATCH_SIZE = 16
IMG_SIZE = (224, 224)

def format_image(image, label):
  # Resize to the model's input size and scale pixel values to [0, 1]
  image = tf.image.resize(image, IMG_SIZE) / 255.0
  return image, label

num_examples = info.splits['train'].num_examples

# Only the training data is cached and shuffled; both pipelines are batched and prefetched
train_batches = (train_examples.cache()
                 .shuffle(num_examples // 4)
                 .map(format_image)
                 .batch(BATCH_SIZE)
                 .prefetch(1))
validation_batches = validation_examples.map(format_image).batch(BATCH_SIZE).prefetch(1)
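To make the two preprocessing steps concrete, here is a small NumPy sketch of what scaling and batching do to the data. It does not use `tf.data`; the random image and the count of 100 images are made-up stand-ins chosen for illustration.

```python
import math
import numpy as np

BATCH_SIZE = 16
num_images = 100  # made-up count for illustration

# Stand-in for a resized image: random uint8 pixels, shape (height, width, channels)
rng = np.random.default_rng(0)
image_uint8 = rng.integers(0, 256, size=(224, 224, 3), dtype=np.uint8)

# Same scaling as format_image: pixel values move from [0, 255] to [0, 1]
image_float = image_uint8.astype(np.float32) / 255.0

# Batching groups consecutive examples; the final batch may be smaller
num_batches = math.ceil(num_images / BATCH_SIZE)
last_batch_size = num_images - (num_batches - 1) * BATCH_SIZE

print(image_float.dtype, image_float.min() >= 0.0, image_float.max() <= 1.0)
print(num_batches, last_batch_size)
```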

To work with pretrained models, we install the tensorflow-hub package. In this notebook we use Google's MobileNet as the pretrained base to get better results; the model itself is loaded from `tf.keras.applications` in the cells that follow.

In [5]:
!pip install tensorflow_hub

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


# **Get the MobileNet pretrained layer**

In [6]:
feature_extractor = tf.keras.applications.MobileNetV2(input_shape=(IMG_SIZE + (3,)), include_top=False)

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/mobilenet_v2/mobilenet_v2_weights_tf_dim_ordering_tf_kernels_1.0_224_no_top.h5


In [7]:
feature_extractor.trainable = False

Build the model on top of the frozen pretrained base; only the output layer is new. Cats vs. dogs is a binary classification problem, so the final Dense layer has two output units, one per class.

In [8]:
model = tf.keras.Sequential([
    feature_extractor,
    layers.GlobalMaxPooling2D(),
    layers.Dense(2, activation="softmax")
], name="cats_vs_dogs")
model.summary()

Model: "cats_vs_dogs"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 mobilenetv2_1.00_224 (Funct  (None, 7, 7, 1280)       2257984   
 ional)                                                          
                                                                 
 global_max_pooling2d (Globa  (None, 1280)             0         
 lMaxPooling2D)                                                  
                                                                 
 dense (Dense)               (None, 2)                 2562      
                                                                 
Total params: 2,260,546
Trainable params: 2,562
Non-trainable params: 2,257,984
_________________________________________________________________
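The numbers in the summary can be checked by hand. The sketch below is a NumPy stand-in, not the Keras implementation: it uses random values in place of the MobileNetV2 feature map and of the Dense weights, and only reproduces the shapes and the parameter count of the new head.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random stand-in for the feature extractor output of one image: (7, 7, 1280)
feature_map = rng.normal(size=(7, 7, 1280))

# Global max pooling keeps the largest value of each channel: (7, 7, 1280) -> (1280,)
pooled = feature_map.max(axis=(0, 1))

# Dense(2): one weight per (input, output) pair plus one bias per output
weights = rng.normal(size=(1280, 2)) * 0.01
bias = np.zeros(2)
dense_params = weights.size + bias.size  # 1280 * 2 + 2

# Softmax turns the two logits into probabilities that sum to 1
logits = pooled @ weights + bias
exp = np.exp(logits - logits.max())
probs = exp / exp.sum()

print(pooled.shape, dense_params, probs.sum())
```

The 2,562 Dense parameters are the only trainable ones because the base was frozen; adding the 2,257,984 frozen parameters gives the total of 2,260,546.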


# **Compile and train the model**


In [9]:
# Create a function to implement a ModelCheckpoint callback with a specific filename
def create_model_checkpoint(model_name, save_path="model_experiments"):
  return tf.keras.callbacks.ModelCheckpoint(filepath=os.path.join(save_path, model_name),
                                            monitor="val_loss",
                                            verbose=0,
                                            save_best_only=True)
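With `monitor="val_loss"` and `save_best_only=True`, the callback writes the model only when the monitored value improves. The loop below is a simplified illustration of that rule, not Keras's actual code; the function name `epochs_that_save` and the validation losses are made up for the example.

```python
def epochs_that_save(val_losses):
    """Return the 1-based epochs at which a 'save best only' rule would save."""
    best = float("inf")
    saved = []
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:  # lower validation loss counts as an improvement
            best = loss
            saved.append(epoch)
    return saved

# Hypothetical validation losses for five epochs
val_losses = [0.12, 0.10, 0.11, 0.13, 0.105]
print(epochs_that_save(val_losses))  # → [1, 2]
```

Under this rule the saved model is the one from the epoch with the lowest validation loss, which need not be the last epoch.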

In [10]:
model.compile(
    optimizer = 'adam',
    loss = tf.losses.SparseCategoricalCrossentropy(),
    metrics=['accuracy']
)
EPOCHS = 5
history = model.fit(train_batches,
                    epochs = EPOCHS,
                    validation_data = validation_batches,
                    callbacks=[create_model_checkpoint(model_name=model.name)])

Epoch 1/5
INFO:tensorflow:Assets written to: model_experiments/cats_vs_dogs/assets
Epoch 2/5
INFO:tensorflow:Assets written to: model_experiments/cats_vs_dogs/assets
Epoch 3/5
Epoch 4/5
Epoch 5/5


Training reaches a validation accuracy close to 99%; the best saved checkpoint, evaluated further down, scores 98.19% on the validation examples.

In [11]:
class_names = np.array(info.features['label'].names)
class_names

array(['cat', 'dog'], dtype='<U3')

Predict the classes for the next batch of images.
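A prediction step for one batch could look like the following sketch. The probabilities are made-up values standing in for the output of `model.predict(image_batch)`, and the variable names are illustrative assumptions; only the mapping from probabilities to class names is shown.

```python
import numpy as np

class_names = np.array(['cat', 'dog'])

# Made-up probabilities standing in for model.predict(image_batch)
predicted_batch = np.array([
    [0.97, 0.03],
    [0.10, 0.90],
    [0.55, 0.45],
])

# The index of the largest probability in each row selects the class name
predicted_ids = np.argmax(predicted_batch, axis=-1)
predicted_class_names = class_names[predicted_ids]

print(predicted_class_names)
```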

# Load the best saved model

Load the best model saved by the checkpoint callback and evaluate it on the validation set.

In [12]:
# Load in the best saved model
model_1 = tf.keras.models.load_model("/content/model_experiments/cats_vs_dogs")

In [13]:
model_1.evaluate(validation_batches)



[0.10142698138952255, 0.9819432497024536]
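The two values returned by `evaluate` are the loss and the accuracy, in the order given to `compile`. As a hedged illustration of how these two quantities are defined, the sketch below computes them with NumPy for four made-up predictions; it is not how Keras computes them internally.

```python
import numpy as np

# Made-up softmax outputs for four images; columns follow ['cat', 'dog']
probs = np.array([
    [0.9, 0.1],
    [0.2, 0.8],
    [0.6, 0.4],
    [0.3, 0.7],
])
labels = np.array([0, 1, 1, 1])  # integer labels: 0 = cat, 1 = dog

# Sparse categorical cross-entropy: mean of -log(probability of the true class)
true_class_probs = probs[np.arange(len(labels)), labels]
loss = -np.log(true_class_probs).mean()

# Accuracy: fraction of images whose most likely class equals the label
accuracy = (probs.argmax(axis=1) == labels).mean()

print(round(float(loss), 4), accuracy)
```

The third image is the only misclassified one, and it also contributes the largest term to the loss because the true class received a probability of just 0.4.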

# Saving the model in .h5 format
Save the reloaded model in .h5 format so it can be exported later.

In [14]:
model_1.save("cats_vs_dogs.h5")

## Unzip the testing data for benchmarking

In [15]:
import zipfile

# Unzip the downloaded file; the context manager closes the archive afterwards
with zipfile.ZipFile("/content/cats_vs_dogs.zip", "r") as zip_ref:
  zip_ref.extractall()

In [16]:
val_data = tf.keras.utils.image_dataset_from_directory(directory="/content/cats_vs_dogs",
                                                       image_size=IMG_SIZE,
                                                       batch_size=BATCH_SIZE,
                                                       label_mode="int")

Found 100 files belonging to 2 classes.


In [17]:
def normalize_img(image, label):
  """Normalizes images: `uint8` -> `float32`."""
  return tf.cast(image, tf.float32) / 255., label

In [18]:
val_data = val_data.map(normalize_img)
val_data

<MapDataset element_spec=(TensorSpec(shape=(None, 224, 224, 3), dtype=tf.float32, name=None), TensorSpec(shape=(None,), dtype=tf.int32, name=None))>

### Benchmark Accuracy

In [19]:
print(f"The Accuracy of the validation data {model_1.evaluate(val_data)[1] * 100:.2f}%")

The Accuracy of the validation data 97.00%


### Function to read in benchmarking images

In [20]:
# Function to time inference on the benchmark images
import os
import time

import numpy as np
from PIL import Image

def benchmark(val_dir, model, class_names=class_names, image_size=IMG_SIZE):
  """Times single-image inference for every .jpg file under `val_dir`.

  Returns the time of the first prediction, followed by the mean and
  standard deviation of the remaining prediction times (all in seconds).
  """
  init_timer = None
  infer_times = []
  for root, _, files in os.walk(val_dir):
    for name in files:
      if not name.endswith(".jpg"):
        continue
      filename = os.path.join(root, name)
      timer_start = time.time()
      # Load the image, force 3 channels, resize it and scale pixels to [0, 1]
      img = np.array(Image.open(filename).convert("RGB").resize(image_size)) / 255.
      pred = model.predict(np.expand_dims(img, axis=0))
      pred_class = class_names[int(np.argmax(pred[0]))]
      elapsed = time.time() - timer_start
      if init_timer is None:
        init_timer = elapsed  # first image, reported separately
      else:
        infer_times.append(elapsed)

  return init_timer, np.mean(infer_times), np.std(infer_times)
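The same timing pattern, reporting the first call separately from the rest, can be tried without a model. The sketch below is an assumption-laden stand-in: `time_calls` and `fake_predict` are hypothetical names, the workload is a dummy loop, and `time.perf_counter` is used because it is a monotonic clock intended for measuring short intervals.

```python
import statistics
import time

def time_calls(fn, inputs):
    """Time fn on each input; return (first_call_time, mean_rest, std_rest) in seconds."""
    durations = []
    for item in inputs:
        start = time.perf_counter()
        fn(item)
        durations.append(time.perf_counter() - start)
    first, rest = durations[0], durations[1:]
    return first, statistics.mean(rest), statistics.pstdev(rest)

def fake_predict(x):
    # Placeholder workload; a real benchmark would call model.predict here
    return sum(i * i for i in range(10_000))

first, mean_rest, std_rest = time_calls(fake_predict, range(20))
print(f"first: {first * 1000:.2f} ms, rest: {mean_rest * 1000:.2f} +/- {std_rest * 1000:.2f} ms")
```

`statistics.pstdev` is the population standard deviation, which matches the default behaviour of `np.std` used in the notebook's `benchmark` function.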

In [21]:
init_time, avg_time, std = benchmark(val_dir="/content/cats_vs_dogs",
                                     model=model_1)
print(f"The first image takes {init_time * 1000:.2f} ms")
print(f"The average time per image over the remaining 99 images is {avg_time * 1000:.2f} ms")
print(f"The standard deviation of those times is {std * 1000:.2f} ms")

The first image takes 972.27 ms
The average time per image over the remaining 99 images is 49.36 ms
The standard deviation of those times is 3.84 ms


In [22]:
init_time, avg_time, std = benchmark(val_dir="/content/cats_vs_dogs",
                                     model=model_1)
print(f"The first image takes {init_time * 1000:.2f} ms")
print(f"The average time per image over the remaining 99 images is {avg_time * 1000:.2f} ms")
print(f"The standard deviation of those times is {std * 1000:.2f} ms")

The first image takes 67.61 ms
The average time per image over the remaining 99 images is 50.54 ms
The standard deviation of those times is 4.83 ms
