# Transfer Learning

In [2]:
import matplotlib.pyplot as plt
import scipy
import numpy as np
from PIL import Image
from scipy import ndimage
from keras.preprocessing.image import (
    ImageDataGenerator, array_to_img, img_to_array, load_img)
from keras.models import Sequential
from keras.layers import Dense, Conv2D, MaxPooling2D, Flatten
from keras.optimizers import RMSprop

The idea behind transfer learning is simply to make use of some pre-trained model to make predictions, rather than building a new model from scratch.

The plain fact is that several very powerful image-processing networks have already been built and perfected by scientists who have detailed knowledge about how all the layers of their models work. Moreover, many successful models have been trained on hundreds of thousands, if not millions, of images, so the features they have learned can often be reused for your images as well.

In general, the target will of course be different from the one the model was originally trained to predict. But the idea is that the model will be good at picking up on the *deep features* of images, so we can use *most* of the pre-trained model to extract those deep features, and then just stick on a couple of extra layers at the end that are appropriate for the data we have.
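
To make the "frozen base plus new layers" idea concrete, here is a minimal NumPy sketch. It is not the Keras workflow used later in this notebook: the "pre-trained base" is just a fixed random projection (`W_base`, a hypothetical stand-in), the data are synthetic, and the new "head" is a single logistic-regression layer. The point is only that training updates the head while the base stays untouched.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a pre-trained base: a fixed projection + ReLU.
# In the real workflow this role is played by a pre-trained CNN.
W_base = rng.normal(size=(20, 8)) / np.sqrt(20)

def frozen_base(x):
    """Map raw inputs to 'deep features'; W_base is never updated."""
    return np.maximum(x @ W_base, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_loss(p, y):
    eps = 1e-12
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

# Synthetic binary-classification data, for illustration only.
X = rng.normal(size=(200, 20))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

# Extract features once, then train only the new head on top of them.
features = frozen_base(X)
w_head = np.zeros(features.shape[1])
b_head = 0.0

W_base_before = W_base.copy()
initial_loss = log_loss(sigmoid(features @ w_head + b_head), y)

for _ in range(500):
    p = sigmoid(features @ w_head + b_head)
    grad = p - y                                   # gradient of log loss w.r.t. logits
    w_head -= 0.5 * features.T @ grad / len(y)     # update head weights only
    b_head -= 0.5 * grad.mean()

final_loss = log_loss(sigmoid(features @ w_head + b_head), y)
print(initial_loss, final_loss)
```

Because the base is frozen, its outputs can be computed once and cached, which is the same trick the feature-extraction step later in this notebook relies on.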

In what follows we'll try building a network from scratch on some chest X-ray data. Then we'll see if we can get better accuracy by using VGG19, a leading CNN for image recognition that has been pre-trained on the [ImageNet](https://qz.com/1034972/the-data-that-changed-the-direction-of-ai-research-and-possibly-the-world/) dataset.

## CNN from Scratch

Let's look at some X-rays of lungs!

In [15]:
train_f = './chest_xray/train/'
test_f = './chest_xray/test/'
val_f = './chest_xray/val/'

Keras's `ImageDataGenerator` can read images from a directory (we have JPEGs here) and convert them to tensors of pixel values!

In [1]:
test_generator = ImageDataGenerator(rescale=1./255).flow_from_directory(
        test_f, 
        target_size=(64, 64))

val_generator = ImageDataGenerator(rescale=1./255).flow_from_directory(
        val_f, 
        target_size=(64, 64))

train_generator = ImageDataGenerator(rescale=1./255).flow_from_directory(
        train_f, 
        target_size=(64, 64))
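
The `rescale=1./255` argument above multiplies every pixel value by 1/255, so 8-bit intensities in the range 0 to 255 land in the range 0 to 1. A minimal NumPy sketch of that step, using a randomly generated array as a stand-in for a decoded 64x64 RGB image:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a decoded JPEG: 64x64 pixels, 3 colour channels, 8-bit values.
raw = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)

# Convert to float and rescale, as rescale=1./255 does.
tensor = raw.astype("float32") * (1.0 / 255)

print(tensor.shape, tensor.min(), tensor.max())
```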

`ImageDataGenerator` supports *data augmentation*: when it is configured with transformation arguments, it will take each image and transform it in various ways, ultimately using *only these transformations* as training data. The generators above only rescale the pixel values, so no augmentation is applied in this notebook. [Here's](https://www.pyimagesearch.com/2019/07/08/keras-imagedatagenerator-and-data-augmentation/) a nice resource on Keras's `ImageDataGenerator`.

And [here](https://bair.berkeley.edu/blog/2019/06/07/data_aug/) is a page with more information about data augmentation.
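
As a rough illustration of what augmentation does, here is a small NumPy sketch that applies a random horizontal flip and a random horizontal shift to one image. The `augment` helper is hypothetical and written just for this example; in Keras the corresponding behaviour would be requested through `ImageDataGenerator` arguments such as `horizontal_flip=True` and `width_shift_range`. The wrap-around shift used here is a simplification chosen to keep the sketch short.

```python
import numpy as np

rng = np.random.default_rng(42)

def augment(image, rng):
    """Return a randomly flipped and shifted copy of an (H, W, C) image."""
    out = image
    if rng.random() < 0.5:
        out = out[:, ::-1, :]              # horizontal flip
    shift = int(rng.integers(-4, 5))       # shift by up to 4 pixels left or right
    return np.roll(out, shift, axis=1)     # wrap-around shift, for simplicity

# Random array standing in for one rescaled 64x64 RGB image.
image = rng.random((64, 64, 3))

# Each call produces a different variant of the same underlying image.
batch = np.stack([augment(image, rng) for _ in range(8)])
print(batch.shape)
```

Every variant contains the same pixels as the original, just rearranged, so the label of the image is unchanged while the model sees more varied inputs.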

In [4]:
# Pull a single batch from each generator (batch_size defaults to 32)
train_images, train_labels = next(train_generator)
test_images, test_labels = next(test_generator)
val_images, val_labels = next(val_generator)

In [25]:
train_images[0][0]

### Model Building

In [2]:
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu',
                 input_shape=(64, 64, 3)))
model.add(MaxPooling2D((2, 2)))

model.add(Conv2D(32, (4, 4), activation='relu'))
model.add(MaxPooling2D((2, 2)))

model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))

model.add(Flatten())
model.add(Dense(64, activation='relu'))
# The generators above yield one-hot labels (class_mode defaults to
# 'categorical'), so use a 2-unit softmax with categorical cross-entropy
model.add(Dense(2, activation='softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer="sgd",
              metrics=['acc'])
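
To see how the three convolution/pooling pairs in the model above shrink a 64x64 image, the shapes can be traced by hand. This sketch assumes the Keras defaults that the model relies on: `Conv2D` with `'valid'` padding and stride 1, and `MaxPooling2D` with a stride equal to its pool size. The helper names `conv_out` and `pool_out` are made up for this example.

```python
def conv_out(size, kernel):
    """Output size of a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output size of non-overlapping max pooling (any remainder is dropped)."""
    return size // pool

size = 64
shapes = []
# (kernel size, number of filters) for each Conv2D layer in the model
for kernel, filters in [(3, 32), (4, 32), (3, 64)]:
    size = conv_out(size, kernel)
    size = pool_out(size)
    shapes.append((size, size, filters))

# Number of values that Flatten() passes to the first Dense layer
flattened = shapes[-1][0] * shapes[-1][1] * shapes[-1][2]

print(shapes)      # [(31, 31, 32), (14, 14, 32), (6, 6, 64)]
print(flattened)   # 2304
```

Calling `model.summary()` should report the same output shapes.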

In [3]:
history_log = model.fit(train_images,
                    train_labels,
                    epochs=10,
                    batch_size=32,
                    validation_data=(test_images, test_labels))

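Note the `acc` and `val_acc` scores! Bear in mind that this model was fit on a single batch of images drawn from each generator, not on the full dataset.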

## Now with Transfer Learning!

In [7]:
from keras.applications import VGG19

`VGG19` is a 19-layer CNN from the [Visual Geometry Group](http://www.robots.ox.ac.uk/~vgg/research/very_deep/) at the University of Oxford.

In [4]:
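# weights='imagenet' loads weights pre-trained on ImageNet;
# include_top=False drops the fully connected classifier layers,
# keeping only the convolutional base
cnn_base = VGG19(weights='imagenet',
                 include_top=False,
                 input_shape=(64, 64, 3))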

cnn_base.summary()

In [9]:
batch_size = 8

In [10]:
def extract_features(directory, sample_amount):
    """Run images from `directory` through cnn_base and collect the outputs."""
    # For 64x64 inputs, cnn_base outputs a (2, 2, 512) feature map per image
    features = np.zeros(shape=(sample_amount, 2, 2, 512))
    labels = np.zeros(shape=(sample_amount))
    generator = ImageDataGenerator(rescale=1./255).flow_from_directory(
        directory, target_size=(64, 64),
        batch_size=batch_size,
        class_mode='binary')  # labels are 0/1 rather than one-hot
    i = 0
    for inputs_batch, labels_batch in generator:
        features_batch = cnn_base.predict(inputs_batch)
        features[i * batch_size: (i + 1) * batch_size] = features_batch
        labels[i * batch_size: (i + 1) * batch_size] = labels_batch
        i += 1
        # The generator loops indefinitely, so stop once enough samples are seen
        if i * batch_size >= sample_amount:
            break
    return features, labels
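
The batching logic in `extract_features` can be checked without Keras. In the sketch below, `fake_generator` is a hypothetical stand-in for the directory iterator (like the real one, it loops forever), and multiplying by 10 stands in for `cnn_base.predict`. It shows how the slice `i * batch_size: (i + 1) * batch_size` fills the output arrays one batch at a time, and why the explicit `break` is needed.

```python
import itertools
import numpy as np

def fake_generator(n_samples, batch_size):
    """Hypothetical stand-in for a Keras directory iterator: loops forever."""
    data = np.arange(n_samples, dtype=float)
    for start in itertools.cycle(range(0, n_samples, batch_size)):
        batch = data[start:start + batch_size]
        yield batch, batch % 2          # (inputs, labels)

def collect(generator, sample_amount, batch_size):
    features = np.zeros(sample_amount)
    labels = np.zeros(sample_amount)
    i = 0
    for inputs_batch, labels_batch in generator:
        # Multiplying by 10 stands in for cnn_base.predict(inputs_batch)
        features[i * batch_size:(i + 1) * batch_size] = inputs_batch * 10
        labels[i * batch_size:(i + 1) * batch_size] = labels_batch
        i += 1
        # Without this check the loop would never end
        if i * batch_size >= sample_amount:
            break
    return features, labels

features, labels = collect(fake_generator(16, 8), sample_amount=16, batch_size=8)
print(features)
print(labels)
```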

In [5]:
train_features, train_labels = extract_features(train_f, 5216) 
validation_features, validation_labels = extract_features(val_f, 16) 
test_features, test_labels = extract_features(test_f, 624)

train_features = np.reshape(train_features, (5216, 2048))
validation_features = np.reshape(validation_features, (16, 2048))
test_features = np.reshape(test_features, (624, 2048))
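
The number 2048 in the reshapes above follows from the VGG19 architecture: its convolutional base has five max-pooling stages, each of which halves the height and width, and its final convolutional block has 512 channels. A short sketch of that arithmetic, plus a reshape with `-1` that lets NumPy infer the flattened size (the zero array is only a placeholder with the right shape):

```python
import numpy as np

input_size = 64
n_pool_stages = 5          # VGG19 halves height and width five times
last_block_filters = 512   # channels in VGG19's final convolutional block

spatial = input_size // (2 ** n_pool_stages)
feature_shape = (spatial, spatial, last_block_filters)
flat_dim = spatial * spatial * last_block_filters

# Placeholder batch of 4 feature maps, flattened to one row per image
dummy = np.zeros((4,) + feature_shape)
flat = dummy.reshape(len(dummy), -1)

print(feature_shape)   # (2, 2, 512)
print(flat.shape)      # (4, 2048)
```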

In [6]:
train_features.shape

In [13]:
model = Sequential()
model.add(Dense(256, activation='relu', input_dim=2048))
model.add(Dense(1, activation='sigmoid'))

In [7]:
model.compile(optimizer=RMSprop(),
              loss='binary_crossentropy',
              metrics=['acc'])
history = model.fit(train_features, train_labels,
                    epochs=10,
                    batch_size=10,
                    validation_data=(test_features, test_labels))

## Explore
What other pre-trained networks are available inside Keras?

In [26]:
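import keras.applications
# List the capitalized names (the model constructors) in keras.applications
[name for name in dir(keras.applications) if name[0].isupper()]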

In [27]:
# Exercise: Use transfer learning with another pre-trained CNN on these data.
# See if you can improve on our metrics!

