# Using a pre-trained convnet for image classification

We now learn how to use powerful deep learning models, specifically, *pre-trained convnets*, to classify more realistic image datasets (as compared to the overly simplified Fashion MNIST dataset). We will use the popular [dogs vs. cats](https://www.kaggle.com/c/dogs-vs-cats/overview) dataset.

This hands-on is based on the code in Chapter 5, Section 3 of [Deep Learning with Python](https://www.manning.com/books/deep-learning-with-python?), simplified by the professor to focus on the key ideas and implementation behind using a pre-trained convnet.



In [None]:
import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt

print(f'tensorflow version is {tf.__version__}')
print(f'keras version is {keras.__version__}')

## Importing a pre-trained convnet

As we discussed in class, the benefit of a pre-trained convnet is that its convolutional layers have already learned general-purpose visual features, so we can reuse it as a feature extractor. We will use a pre-trained convnet called VGG16. Other pre-trained convnets available in Keras include:

* Xception
* InceptionV3
* ResNet50
* VGG19
* MobileNet

Let's instantiate the VGG16 model:

In [None]:
from tensorflow.keras.applications import VGG16

conv_base = VGG16(include_top=False, input_shape=(150, 150, 3))

We passed two arguments to the constructor:

* `include_top`, which controls whether to include the densely-connected classifier on top of the network. By default, this 
densely-connected classifier would correspond to the 1000 classes from ImageNet. Since we intend to use our own densely-connected 
classifier (with only two classes, cat and dog), we don't need to include it.
* `input_shape`, the shape of the image tensors that we will feed to the network. This argument is purely optional: if we don't pass it, 
then the network will be able to process inputs of any size.

Note that this VGG16 model is already pre-trained on the ImageNet dataset, so all trained weight values come with the model. Usually we don't touch these weight values. The `weights` argument controls how they are initialized: the default `'imagenet'` loads the pre-trained weights, `None` gives a random initialization, and a file path loads the weights stored in that file.

Here are the details of the architecture of the VGG16 convolutional base. It is very similar to the simple convnets that you are already 
familiar with.

In [None]:
conv_base.summary()

The final feature map has shape `(4, 4, 512)`. That's the feature on top of which we will stick a densely-connected classifier.

**Note**: If your input images are *not* of size 150 × 150, the shape of the final feature map may change accordingly. So to be safe, always check what it will be.
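
To make the note above concrete, here is a minimal sketch of the arithmetic behind the final feature-map shape. It assumes the standard VGG16 layout: five blocks, each ending in a 2×2 max-pooling layer with stride 2, and 3×3 convolutions with "same" padding. The function name `vgg16_feature_map_shape` is made up for this illustration. `conv_base.summary()` remains the authoritative check.

```python
def vgg16_feature_map_shape(height, width):
    """Spatial shape of VGG16's last feature map when include_top=False."""
    # VGG16 has five blocks; each ends with a 2x2 max-pool (stride 2) that
    # halves height and width, rounding down. Its 3x3 convolutions use
    # "same" padding, so they leave the spatial size unchanged.
    for _ in range(5):
        height //= 2
        width //= 2
    # The last convolutional block has 512 filters.
    return (height, width, 512)

print(vgg16_feature_map_shape(150, 150))  # (4, 4, 512)
print(vgg16_feature_map_shape(224, 224))  # (7, 7, 512)
```

For 150 × 150 inputs the size goes 150 → 75 → 37 → 18 → 9 → 4, which matches the `(4, 4, 512)` shape reported by the summary.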

At this point, there are two ways we could proceed: 

* Running the convolutional base over our dataset, recording its output to a Numpy array on disk, then using this data as input to a 
standalone densely-connected classifier similar to those you have seen in the first chapters of Chollet's book. This solution is very fast and 
cheap to run, because it only requires running the convolutional base once for every input image, and the convolutional base is by far the 
most expensive part of the pipeline. However, for the exact same reason, this technique would not allow us to leverage data augmentation at 
all.
* Extending the model we have (`conv_base`) by adding `Dense` layers on top, and running the whole thing end-to-end on the input data. This 
allows us to use data augmentation, because every input image is going through the convolutional base every time it is seen by the model. 
However, for this same reason, this technique is far more expensive than the first one.

We will only try the first technique. 
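
For reference only, the second technique might be sketched roughly as follows. This is not code from this hands-on: the helper name `build_end_to_end_model` is hypothetical, and freezing the convolutional base (so that only the new `Dense` layers are trained) is an assumption based on common practice rather than something stated above.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_end_to_end_model(conv_base):
    """Stack a small binary classifier on top of a frozen convolutional base."""
    # Freeze the base so that training updates only the new Dense layers
    # (an assumption of this sketch, see the text above).
    conv_base.trainable = False
    model = keras.Sequential([
        keras.Input(shape=conv_base.input_shape[1:]),
        conv_base,
        layers.Flatten(),
        layers.Dense(256, activation="relu"),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer=keras.optimizers.RMSprop(learning_rate=2e-5),
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```

Calling `build_end_to_end_model(conv_base)` would return a model that takes raw 150 × 150 images as input, so every training step runs the images through VGG16. That is why this route is so much more expensive.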

## Prepare the dataset
We use a smaller subset of the dogs vs. cats images: 2,000 for training, 1,000 for validation, and 1,000 for testing, about 90 MB in total. I share this dataset with you in the form of a single zip file in Canvas. First, download and save this file onto your computer.

Next, we need to upload this zip file from your computer to Google Colab. Notes:
* Why upload this zip file, instead of the 4,000 individual image files? -- The reason is that Google Colab is notoriously slow at uploading many small files. This "zip then unzip" approach gets the data ready in a few minutes (as compared to 20-30 minutes if we don't zip before uploading).
* Why not just mount our Google Drive as we did before? -- Same reason as above, it is very slow to transfer many small files from Google Drive to Google Colab.

There are two methods to upload, as below:


### (Not used today) Method 1 for uploading data into Google Colab

The code below uploads this zip file into Google Colab's virtual machine.


In [None]:
from google.colab import files
# import io

uploaded = files.upload()

In [None]:
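for filename, content in uploaded.items():
    # Write each uploaded file to disk; `with` closes the file afterwards
    with open(filename, 'wb') as f:
        f.write(content)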

### Method 2 for uploading data into Google Colab

Instead of coding, use the "Upload to session storage" button on the left panel. We'll use this method today as it is much faster than Method 1.

### Unzip the file

In [None]:
# By default, uploaded files are saved under directory /content
# The unzip command below creates a folder /content/dogs_vs_cats_small
# that contains all the images.
!unzip -q /content/dogs_vs_cats_small.zip
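
After unzipping, it can be worth confirming that the images landed where we expect. The sketch below assumes the archive unpacks into a `split/class/file` layout (for example `test/cats/cat.1700.jpg`). The helper name `count_images` is made up for this illustration.

```python
from pathlib import Path

def count_images(base_dir):
    """Return {(split, class_name): number_of_files} for a split/class/file layout."""
    counts = {}
    for split_dir in sorted(Path(base_dir).iterdir()):
        if not split_dir.is_dir():
            continue
        for class_dir in sorted(split_dir.iterdir()):
            if class_dir.is_dir():
                counts[(split_dir.name, class_dir.name)] = sum(
                    1 for f in class_dir.iterdir() if f.is_file())
    return counts

# In Colab, after unzipping:
# print(count_images('/content/dogs_vs_cats_small'))
```

If the per-split totals do not add up to 2000, 1000, and 1000, the upload or the unzip step probably went wrong.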

## Feature extraction using the pre-trained convnet


In [None]:
import os
from tensorflow.keras.preprocessing.image import ImageDataGenerator

In [None]:
base_dir = '/content/dogs_vs_cats_small'

train_dir = os.path.join(base_dir, 'train')
validation_dir = os.path.join(base_dir, 'validation')
test_dir = os.path.join(base_dir, 'test')


In [None]:

datagen = ImageDataGenerator(rescale=1./255)
batch_size = 100

def extract_features(directory, sample_count):
    features = np.zeros(shape=(sample_count, 4, 4, 512))
    labels = np.zeros(shape=(sample_count))
    generator = datagen.flow_from_directory(
        directory,
        target_size=(150, 150),
        batch_size=batch_size,
        class_mode='binary')
    i = 0
    for inputs_batch, labels_batch in generator:
        features_batch = conv_base.predict(inputs_batch)
        features[i * batch_size : (i + 1) * batch_size] = features_batch
        labels[i * batch_size : (i + 1) * batch_size] = labels_batch
        i += 1
        if i * batch_size >= sample_count:
            # Note that since generators yield data indefinitely in a loop,
            # we must `break` after every image has been seen once.
            break
    return features, labels
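
The function above relies on `sample_count` being a multiple of `batch_size`, which holds for our 2000/1000/1000 split. As a hedged sketch of a more general variant, the loop below trims each batch to the space that is left, so a smaller final batch does not cause a shape mismatch. The names `collect_features`, `batches`, and `extract` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def collect_features(batches, sample_count, extract, feature_shape):
    """Run `extract` over batches until `sample_count` samples are stored."""
    features = np.zeros((sample_count,) + tuple(feature_shape))
    labels = np.zeros(sample_count)
    start = 0
    for inputs_batch, labels_batch in batches:
        # Keep only as many samples as still fit in the output arrays.
        n = min(len(inputs_batch), sample_count - start)
        features[start:start + n] = extract(inputs_batch[:n])
        labels[start:start + n] = labels_batch[:n]
        start += n
        if start >= sample_count:
            break  # the generator would otherwise keep yielding batches
    return features, labels
```

With the objects defined in this notebook, the call would look like `collect_features(generator, 2000, conv_base.predict, (4, 4, 512))`.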

In [None]:
train_features, train_labels = extract_features(train_dir, 2000)
validation_features, validation_labels = extract_features(validation_dir, 1000)
test_features, test_labels = extract_features(test_dir, 1000)
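
The first technique was described as recording the convolutional base's output "to a Numpy array on disk", while the cell above keeps everything in memory. One possible way to cache the arrays is sketched below. The helper names and the `.npz` file name are assumptions made for this example, and the demonstration uses small random arrays rather than real VGG16 output.

```python
import os
import tempfile
import numpy as np

def save_features(path, features, labels):
    """Store features and labels together in one compressed .npz file."""
    np.savez_compressed(path, features=features, labels=labels)

def load_features(path):
    """Load the arrays written by save_features."""
    with np.load(path) as data:
        return data["features"], data["labels"]

# Demonstration with small random stand-in arrays (not real VGG16 output).
demo_features = np.random.rand(10, 4, 4, 512)
demo_labels = np.random.randint(0, 2, size=10).astype(float)
demo_path = os.path.join(tempfile.mkdtemp(), "train_features.npz")
save_features(demo_path, demo_features, demo_labels)
restored_features, restored_labels = load_features(demo_path)
print(restored_features.shape, restored_labels.shape)  # (10, 4, 4, 512) (10,)
```

Caching this way means the expensive `conv_base.predict` pass does not need to be repeated when the Colab session restarts, provided the file is stored somewhere persistent.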

In [None]:
train_features.shape

The extracted features are currently of shape `(samples, 4, 4, 512)`. We will feed them to a densely-connected classifier, so first we must 
flatten them to `(samples, 8192)`:

In [None]:
train_features = np.reshape(train_features, (2000, 4 * 4 * 512))
validation_features = np.reshape(validation_features, (1000, 4 * 4 * 512))
test_features = np.reshape(test_features, (1000, 4 * 4 * 512))
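
As a small aside, the sample counts and the `4 * 4 * 512` product do not have to be hard-coded. A minimal sketch, using a hypothetical helper `flatten_features`, lets NumPy infer the flattened size with `-1`:

```python
import numpy as np

def flatten_features(features):
    """Flatten (samples, h, w, c) feature maps to (samples, h*w*c)."""
    return features.reshape(len(features), -1)

demo = np.zeros((5, 4, 4, 512))
print(flatten_features(demo).shape)  # (5, 8192)
```

This keeps working if the input image size, and therefore the feature-map shape, changes.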

## Classification
At this point, we can define our densely-connected classifier (note the use of dropout for regularization), and train it on the data and 
labels that we just recorded:

In [None]:
from tensorflow.keras import models
from tensorflow.keras import layers
from tensorflow.keras import optimizers

model = models.Sequential()
model.add(layers.Dense(256, activation='relu', input_dim=4 * 4 * 512))
model.add(layers.Dropout(0.5))
model.add(layers.Dense(1, activation='sigmoid'))

model.compile(optimizer=optimizers.RMSprop(learning_rate=2e-5),
              loss='binary_crossentropy',
              metrics=['acc'])

history = model.fit(train_features, train_labels,
                    epochs=30,
                    batch_size=20,
                    validation_data=(validation_features, validation_labels))

Training is very fast, since we only have to deal with two `Dense` layers -- an epoch takes less than one second even on CPU.
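
To get a feel for the size of the classifier we just trained, the short calculation below counts its trainable parameters by hand (the `Dropout` layer has none). The helper name `dense_param_count` is made up for this illustration. `model.summary()` should report the same totals.

```python
def dense_param_count(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units

hidden = dense_param_count(4 * 4 * 512, 256)  # 8192 inputs -> 256 units
output = dense_param_count(256, 1)            # 256 inputs -> 1 unit
print(hidden, output, hidden + output)  # 2097408 257 2097665
```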

Let's take a look at the loss and accuracy curves during training:

In [None]:
import matplotlib.pyplot as plt

acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']

epochs = range(len(acc))

plt.plot(epochs, acc, 'bo', label='Training acc')
plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy')
plt.legend()

plt.figure()

plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()

plt.show()


We reach a validation accuracy of about 90%, which is already impressive. 

However, our plots also indicate that we are overfitting almost from the start -- despite using dropout with a fairly large rate. This is because our dataset is very small (for an image classification task). Training on a larger dataset will help alleviate overfitting.
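
Besides reading the plots, one rough way to see where overfitting sets in is to find the epoch with the lowest validation loss. The sketch below uses a hypothetical helper, `best_epoch`, and a made-up list of losses. In the notebook you would pass `history.history['val_loss']` instead.

```python
def best_epoch(val_loss):
    """Return (1-based epoch, loss) at which validation loss is lowest."""
    index = min(range(len(val_loss)), key=val_loss.__getitem__)
    return index + 1, val_loss[index]

# Made-up validation losses for illustration, not results from this notebook.
example_val_loss = [0.45, 0.33, 0.29, 0.27, 0.28, 0.30]
print(best_epoch(example_val_loss))  # (4, 0.27)
```

Epochs after that point keep lowering the training loss without improving the validation loss.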

Another possible way to improve is to leverage data augmentation, a topic covered in Chapter 5.2 of Chollet's book (not required).

### Performance of the trained classifier

Let's see how the trained classifier performs on the hold-out test dataset:

In [None]:
test_loss, test_acc = model.evaluate(test_features, test_labels)
print('test acc:', test_acc)

Once a model is trained, we can save it and load it whenever needed, rather than retraining it every time.

Note: If you want to try the save/load code below, you need to first mount your Google Drive to Colab.

In [None]:
## This is how to save a trained model. Note that you might need to change the path.
# model.save('/content/drive/My Drive/AMA/cats_and_dogs_small_model.h5')

## This is how to load a trained model. For now I commented it out to avoid accidental mistake.
# from tensorflow.keras.models import load_model
# model = load_model('/content/drive/My Drive/AMA/cats_and_dogs_small_model.h5')

## Prediction

In [None]:
img_path = '/content/dogs_vs_cats_small/test/cats/cat.1700.jpg'

# We preprocess the image into a 4D tensor
from tensorflow.keras.preprocessing import image
import numpy as np

img = image.load_img(img_path, target_size=(150, 150))
img_tensor = image.img_to_array(img)
img_tensor = np.expand_dims(img_tensor, axis=0)
# Remember that the model was trained on inputs
# that were preprocessed in the following way:
img_tensor /= 255.

# Its shape is (1, 150, 150, 3)
print(img_tensor.shape)

In [None]:
features_extracted = conv_base.predict(img_tensor)
features_flattened = np.reshape(features_extracted, (1, 4 * 4 * 512))
predicted = model.predict(features_flattened)
predicted
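
The value in `predicted` is a sigmoid output between 0 and 1, not a class name. The sketch below turns it into a label, assuming the class folders are named `cats` and `dogs`. `flow_from_directory` assigns class indices in alphanumeric order of the folder names, so that would give `cats` → 0 and `dogs` → 1; you can confirm this with `generator.class_indices`. The helper name `decode_prediction` and the 0.5 threshold are choices made for this illustration.

```python
import numpy as np

def decode_prediction(predicted, class_names=("cats", "dogs"), threshold=0.5):
    """Map sigmoid outputs of shape (n, 1) to class names."""
    probs = np.asarray(predicted).reshape(-1)
    return [class_names[int(p > threshold)] for p in probs]

# Hand-picked example values, not outputs of the trained model.
print(decode_prediction(np.array([[0.03]])))  # ['cats']
print(decode_prediction(np.array([[0.97]])))  # ['dogs']
```

For the cat image loaded above, a well-trained model should produce a value close to 0.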

In [None]:
import matplotlib.pyplot as plt

plt.imshow(img_tensor[0])
plt.show()