# Cats vs Dogs!

<img src="https://raw.githubusercontent.com/benjum/UCLA-24W-DH150/main/Data/cat-dog-data/test/cat/cat.1500.jpg" width="250px"/> <img src="https://raw.githubusercontent.com/benjum/UCLA-24W-DH150/main/Data/cat-dog-data/test/dog/dog.1500.jpg" width="250px"/>

#### We're going to use TensorFlow and convolutional neural networks to decide whether a picture shows a dog or a cat.

Side-note:  I think the dataset is biased against cats, because most of the cats I've looked at in this dataset look crazy, a bit ugly, or like they totally have it out for the dogs!

This dataset is a subset of the data that can be obtained at [Kaggle's Dogs Vs Cats page](https://www.kaggle.com/c/dogs-vs-cats/data).

The code below was motivated by Chollet's Deep Learning with Python book, in which you will find many, many more interesting details and tidbits about doing deep learning.

In [None]:
import numpy as np 
import matplotlib.pyplot as plt

import tensorflow as tf
from tensorflow import keras 
from tensorflow.keras import layers
from tensorflow.keras.utils import image_dataset_from_directory

In [None]:
tf.__version__

In [None]:
tf.config.list_physical_devices()

In [None]:
# tf.get_logger().setLevel('ERROR')

## Getting our data

Our images are in folders that are already split into training, validation, and test sets.  Within each split, they are also already sorted into cat and dog folders.

The `image_dataset_from_directory` calls below infer each image's class label from whether it is retrieved from a "cat" or a "dog" folder.
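
As a rough illustration of how folder names become labels: with the default `labels="inferred"` setting, Keras sorts the class directory names alphanumerically and assigns integer indices in that order, so here `cat` would map to 0 and `dog` to 1. The sketch below mimics that mapping in plain Python; `subfolders` and `label_for` are hypothetical names used only for illustration and are not part of the Keras API.

```python
# Hypothetical stand-in for the subfolder names found under "train".
subfolders = ["dog", "cat"]

# Sort alphanumerically, then number the classes in that order.
class_names = sorted(subfolders)
label_for = {name: index for index, name in enumerate(class_names)}

print(class_names)  # ['cat', 'dog']
print(label_for)    # {'cat': 0, 'dog': 1}
```

On the real data, the same ordering should be visible in the `class_names` attribute of the dataset that `image_dataset_from_directory` returns.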

In [None]:
import os

In [None]:
os.path.join(os.getcwd(), "..", "..", "Data", "cat-dog-data", "train")

In [None]:
train_dataset = image_dataset_from_directory(
    os.path.join(os.getcwd(), "..", "..", "Data", "cat-dog-data", "train"),
    image_size=(180, 180),
    batch_size=32)
validation_dataset = image_dataset_from_directory(
    os.path.join(os.getcwd(), "..", "..", "Data", "cat-dog-data", "validation"),
    image_size=(180, 180),
    batch_size=32)
test_dataset = image_dataset_from_directory(
    os.path.join(os.getcwd(), "..", "..", "Data", "cat-dog-data", "test"),
    image_size=(180, 180),
    batch_size=32)

Brief aside:  these datasets yield batches of images and labels, so how can you work with the individual items in these Python data objects?  One way is to unbatch a dataset and convert it to a list.
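
To picture what batching means before touching the real dataset, here is a minimal NumPy sketch. It assumes small fake batches of (images, labels) pairs; the batch sizes and the names `fake_batches` and `items` are made up for illustration, and the real `tf.data` objects hold tensors rather than NumPy arrays.

```python
import numpy as np

# Two fake batches: 4 images and then 2 images, each 180x180 RGB.
fake_batches = [
    (np.zeros((4, 180, 180, 3)), np.array([0, 1, 1, 0])),
    (np.zeros((2, 180, 180, 3)), np.array([1, 0])),
]

# "Unbatching" walks every batch and yields one (image, label) pair at a time.
items = [
    (image, label)
    for images, labels in fake_batches
    for image, label in zip(images, labels)
]

print(len(items))         # 6
print(items[0][0].shape)  # (180, 180, 3)
```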

In [None]:
train_ds = train_dataset.unbatch()
a = list(train_ds)

In [None]:
len(a)

The image is at index 0:

In [None]:
a[2][0].shape

The label is at index 1:

In [None]:
a[2][1]

In [None]:
a[2][0][0][2]

In [None]:
a[2][1]

In [None]:
b = 1002
print("The label for image",b,"is",a[b][1].numpy())
print("The picture for image",b,"is")
plt.imshow(a[b][0].numpy().astype('int32'));

## Building our model

In [None]:
inputs = keras.Input(shape=(180, 180, 3))
x = layers.Rescaling(1./255)(inputs)
x = layers.Conv2D(filters=32, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=64, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Flatten()(x)
outputs = layers.Dense(1, activation="sigmoid")(x)

In [None]:
model = keras.Model(inputs=inputs, outputs=outputs)

In [None]:
model.summary()
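
The shapes and parameter counts reported by `model.summary()` can be checked by hand. The sketch below assumes the Keras defaults used in the model cell above (no padding and stride 1 for `Conv2D`, stride equal to the pool size for `MaxPooling2D`); the helper names `conv_out` and `pool_out` are hypothetical.

```python
def conv_out(size, kernel=3):
    # "valid" padding, stride 1: the kernel cannot hang over the edge.
    return size - kernel + 1

def pool_out(size, pool=2):
    # Non-overlapping pooling windows; a leftover row or column is dropped.
    return size // pool

size = 180
size = pool_out(conv_out(size))   # 180 -> 178 -> 89
size = pool_out(conv_out(size))   # 89 -> 87 -> 43
flat = size * size * 64           # Flatten over 64 feature maps

# Parameters per conv layer: (kernel_h * kernel_w * in_channels + 1 bias) * filters
conv1 = (3 * 3 * 3 + 1) * 32
conv2 = (3 * 3 * 32 + 1) * 64
dense = flat + 1                  # one weight per input plus one bias

print(size, flat)                 # 43 118336
print(conv1 + conv2 + dense)      # 137729
```

Most of the parameters sit in the final dense layer, because `Flatten` hands it 118,336 values.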

Note that since we are doing binary classification with a single sigmoid output, we now use `binary_crossentropy` rather than `categorical_crossentropy`.
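
For reference, binary cross-entropy for a true label y (0 or 1) and a predicted probability p is -[y log(p) + (1 - y) log(1 - p)], averaged over the samples. Below is a minimal NumPy sketch of that formula; the `binary_crossentropy` function here is a hypothetical helper written for illustration, not the Keras implementation, which may differ in details such as clipping.

```python
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    # Clip so that log(0) never occurs.
    y_pred = np.clip(y_pred, eps, 1 - eps)
    losses = -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
    return losses.mean()

y_true = np.array([1.0, 0.0, 1.0])

confident_and_right = binary_crossentropy(y_true, np.array([0.9, 0.1, 0.9]))
confident_and_wrong = binary_crossentropy(y_true, np.array([0.1, 0.9, 0.1]))

print(confident_and_right)  # about 0.105
print(confident_and_wrong)  # about 2.303
```

The loss is small when the predicted probability agrees with the label and grows quickly when the model is confidently wrong.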

In [None]:
model.compile(loss="binary_crossentropy",
              optimizer="rmsprop",
              metrics=["accuracy"])

During model fitting, we can save the model to a file after each epoch.  This allows us to retrieve the model later if needed.
* Key to our needs, when we use `save_best_only=True` and `monitor="val_loss"`, the model is written to the file (overwriting the previous copy) only if the current `val_loss` is lower than the best value seen so far.
* Our saved file thus holds the model at its best performance on the validation data.
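
The save-only-when-better rule can be sketched in a few lines of plain Python. This is only an illustration of the idea under made-up numbers, not how `ModelCheckpoint` is implemented; `val_losses` and `saved_at` are hypothetical names.

```python
# Hypothetical validation losses, one per epoch.
val_losses = [0.69, 0.61, 0.64, 0.55, 0.58]

best = float("inf")
saved_at = []  # epochs at which the file would be (over)written

for epoch, val_loss in enumerate(val_losses, start=1):
    if val_loss < best:
        best = val_loss
        saved_at.append(epoch)

print(saved_at)  # [1, 2, 4]
print(best)      # 0.55
```

In this example the file left on disk after training would hold the model from epoch 4, even though training continued to epoch 5.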

In [None]:
callbacks = [
    keras.callbacks.ModelCheckpoint(
        filepath="convnet_from_scratch.keras",
        save_best_only=True,
        monitor="val_loss",
        # workaround for GPU with TF 2.13: uncomment the next line
        # save_weights_only=True
    )
]

Do the fit!

Note that validation is now part of the fit.  Validation is useful for assessing whether the trained model generalizes well to unseen data.  We validate on data that is separate from the test data because we may want to alter our model hyperparameters before doing our final training, and the test data should be reserved for the final testing.

In [None]:
history = model.fit(
    train_dataset,
    epochs=20,
    validation_data=validation_dataset,
    callbacks=callbacks)

## How does the model perform?

* Does it overfit to the training data?
    * Do the accuracy and loss on the validation data match up reasonably with the accuracy and loss on the training data?
* What is the accuracy of classification?
    * Assess the model on the test data (potentially using only the number of epochs for which validation shows that overfitting is not yet occurring).
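
As a minimal sketch of how these questions can be read off the training history, the snippet below uses a hypothetical dictionary shaped like `history.history`. The numbers are invented for illustration and are not results from this notebook.

```python
import numpy as np

# Hypothetical values shaped like `history.history`.
fake_history = {
    "accuracy":     [0.60, 0.70, 0.80, 0.90, 0.97],
    "val_accuracy": [0.58, 0.66, 0.70, 0.71, 0.70],
    "val_loss":     [0.68, 0.62, 0.58, 0.60, 0.72],
}

# Epoch (1-based) with the lowest validation loss.
best_epoch = int(np.argmin(fake_history["val_loss"])) + 1

# Gap between training and validation accuracy, per epoch.
gap = np.array(fake_history["accuracy"]) - np.array(fake_history["val_accuracy"])

print(best_epoch)  # 3
print(gap)
```

A gap that keeps widening after the best epoch, while the validation loss rises, is the usual signature of overfitting.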

In [None]:
accuracy = history.history["accuracy"]
val_accuracy = history.history["val_accuracy"]
loss = history.history["loss"]
val_loss = history.history["val_loss"]

epochs = range(1, len(accuracy) + 1)

plt.plot(epochs, accuracy, "bo", label="Training accuracy")
plt.plot(epochs, val_accuracy, "b", label="Validation accuracy")
plt.title("Training and validation accuracy")
plt.legend()

plt.figure()

plt.plot(epochs, loss, "bo", label="Training loss")
plt.plot(epochs, val_loss, "b", label="Validation loss")
plt.title("Training and validation loss")
plt.legend()

plt.show()

To test the model accuracy on the test data, we take advantage of the fact that we saved the model (in "convnet_from_scratch.keras") at the point where `val_loss` was lowest.
* We can assess the model from a point before it started overfitting.
* We can do so without having to retrain the model completely using a smaller number of epochs.

In [None]:
test_model = keras.models.load_model("convnet_from_scratch.keras")
test_loss, test_acc = test_model.evaluate(test_dataset) 
print(f"Test accuracy: {test_acc:.3f}")

In [None]:
# # with gpu and tf 2.13

# model.load_weights("convnet_from_scratch.keras")
# test_loss, test_acc = model.evaluate(test_dataset) 
# print(f"Test accuracy: {test_acc:.3f}")

## Model #2

The overfitting occurs in part because we have such a small dataset.  To work around this without collecting a lot of new data, we're going to use a clever bit of data manipulation.

Rather than using new images, we're going to use the same images with a small amount of random zoom, rotation, and/or horizontal flipping.  This gives us images that have different pixel values and that look new to the algorithm, but that are essentially the same dog or cat image.  The model below also adds a `Dropout` layer before the final dense layer as a further guard against overfitting.
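
To make the idea concrete, here is a minimal NumPy sketch of one such transformation, a horizontal flip, applied to a tiny fake image. It only illustrates the principle; it is not how the Keras augmentation layers are implemented, and the array contents are made up.

```python
import numpy as np

# A tiny fake "image": 2 rows, 3 columns, 1 channel.
image = np.arange(6).reshape(2, 3, 1)

# Reverse the column (width) axis to mirror the image left-to-right.
flipped = image[:, ::-1, :]

print(image[..., 0])    # rows [0 1 2] and [3 4 5]
print(flipped[..., 0])  # rows [2 1 0] and [5 4 3]
```

The flipped array has the same shape and the same content, but different pixel values at each position, which is what makes it look like a new training example.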

In [None]:
data_augmentation = keras.Sequential(
    [
        layers.RandomFlip("horizontal"),
        layers.RandomRotation(0.1),
        layers.RandomZoom(0.2),
    ]
)

In [None]:
plt.figure(figsize=(10, 10))
for images, _ in train_dataset.take(1):
    for i in range(9):
        augmented_images = data_augmentation(images)
        ax = plt.subplot(3, 3, i + 1)
        plt.imshow(augmented_images[0].numpy().astype("uint8"))
        plt.axis("off")

In [None]:
inputs = keras.Input(shape=(180, 180, 3))

x = data_augmentation(inputs)

x = layers.Rescaling(1./255)(x)
x = layers.Conv2D(filters=32, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=64, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Flatten()(x)

x = layers.Dropout(0.5)(x)

outputs = layers.Dense(1, activation="sigmoid")(x)

In [None]:
model = keras.Model(inputs=inputs, outputs=outputs)

In [None]:
model.compile(loss="binary_crossentropy",
              optimizer="rmsprop",
              metrics=["accuracy"])

In [None]:
callbacks = [
    keras.callbacks.ModelCheckpoint(
        filepath="convnet_from_scratch_with_augmentation.keras",
        save_best_only=True,
        monitor="val_loss",
        # workaround for GPU with TF 2.13: uncomment the next line
        # save_weights_only=True
    ),
    keras.callbacks.TensorBoard(
        log_dir="./tf_logs",
    )
]

In [None]:
history = model.fit(
    train_dataset,
    #epochs=20,
    epochs=60,
    validation_data=validation_dataset,
    callbacks=callbacks)

In [None]:
accuracy = history.history["accuracy"]
val_accuracy = history.history["val_accuracy"]
loss = history.history["loss"]
val_loss = history.history["val_loss"]

epochs = range(1, len(accuracy) + 1)

plt.plot(epochs, accuracy, "bo", label="Training accuracy")
plt.plot(epochs, val_accuracy, "b", label="Validation accuracy")
plt.title("Training and validation accuracy")
plt.legend()

plt.figure()

plt.plot(epochs, loss, "bo", label="Training loss")
plt.plot(epochs, val_loss, "b", label="Validation loss")
plt.title("Training and validation loss")
plt.legend()

plt.show()

Much better behavior with respect to overfitting!

In [None]:
test_model = keras.models.load_model("convnet_from_scratch_with_augmentation.keras")
test_loss, test_acc = test_model.evaluate(test_dataset) 
print(f"Test accuracy: {test_acc:.3f}")

In [None]:
# # with gpu and tf 2.13

# model.load_weights("convnet_from_scratch_with_augmentation.keras")
# test_loss, test_acc = model.evaluate(test_dataset) 
# print(f"Test accuracy: {test_acc:.3f}")

The accuracy is getting better.

We can actually do much better with more convolutional and pooling layers, although the computational demand will be even higher.  These networks can quickly consume resources!

## Checking out a couple images and their cat/dog classification

The test (and train and validation) data is stored in batched TensorFlow dataset objects (built from the file hierarchy).  These aren't exactly easy to unwind, but the following couple of cells will allow us to retrieve example images and feed them into the model to make a classification prediction.
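
Two details in the cells below are worth spelling out: the model expects a batch axis in front of a single image, and the sigmoid output is turned into a class by comparing it with 0.5. The NumPy sketch below illustrates both with made-up values; `fake_probs` is hypothetical, and the label order (0 = cat, 1 = dog) is an assumption based on the alphabetical order of the folder names.

```python
import numpy as np

class_names = ["cat", "dog"]  # assumed label order: 0 = cat, 1 = dog

# A single image has shape (180, 180, 3); the model expects a batch axis first.
image = np.zeros((180, 180, 3))
batch = np.expand_dims(image, axis=0)
print(batch.shape)  # (1, 180, 180, 3)

# Hypothetical sigmoid outputs shaped like the result of `predict`.
fake_probs = np.array([[0.08], [0.93]])

# Above 0.5 counts as class 1, otherwise class 0.
predicted = [class_names[int(p > 0.5)] for p in fake_probs[:, 0]]
print(predicted)  # ['cat', 'dog']
```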

In [None]:
test_ds = test_dataset.unbatch()
a = list(test_ds)

In [None]:
a[2][0].shape

In [None]:
b = 5
print("The label for image",b,"is",a[b][1].numpy())
print("The picture for image",b,"is")
plt.imshow(a[b][0].numpy().astype('int32'));

In [None]:
test_model.predict(a[5][0].numpy().reshape(-1,180,180,3))

In [None]:
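b = 1051
print("The label for image",b,"is",a[b][1].numpy())
plt.imshow(a[b][0].numpy().astype('int32'))
# predict expects a batch, so reshape the single image to (1, 180, 180, 3)
prob = test_model.predict(a[b][0].numpy().reshape(-1,180,180,3))[0][0]
# a sigmoid output above 0.5 means class 1 (dog); otherwise class 0 (cat)
if prob > 0.5:
    print('Predict Dog')
else:
    print('Predict Cat')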

In [None]:
# # with gpu and tf 2.13

# b = 1051
# print("The label for image",b,"is",a[b][1].numpy())
# plt.imshow(a[b][0].numpy().astype('int32'))
# if int(model.predict(a[b][0].numpy().reshape(-1,180,180,3))[0][0]>0.5) == 0:
#     print('Predict Cat')
# else:
#     print('Predict Dog')