<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Objectives-for-this-lecture" data-toc-modified-id="Objectives-for-this-lecture-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Objectives for this lecture</a></span></li><li><span><a href="#Introducing-Tensorflow-and-Keras" data-toc-modified-id="Introducing-Tensorflow-and-Keras-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Introducing Tensorflow and Keras</a></span></li><li><span><a href="#VGG-16" data-toc-modified-id="VGG-16-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>VGG-16</a></span></li><li><span><a href="#Loading-VGG-16" data-toc-modified-id="Loading-VGG-16-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Loading VGG-16</a></span><ul class="toc-item"><li><span><a href="#Make-a-prediction-using-an-image" data-toc-modified-id="Make-a-prediction-using-an-image-4.1"><span class="toc-item-num">4.1&nbsp;&nbsp;</span>Make a prediction using an image</a></span></li></ul></li><li><span><a href="#Transfer-Learning" data-toc-modified-id="Transfer-Learning-5"><span class="toc-item-num">5&nbsp;&nbsp;</span>Transfer Learning</a></span></li><li><span><a href="#Example-of-Transfer-Learning" data-toc-modified-id="Example-of-Transfer-Learning-6"><span class="toc-item-num">6&nbsp;&nbsp;</span>Example of Transfer Learning</a></span><ul class="toc-item"><li><span><a href="#Step-1:-load-the-pre-trained-model" data-toc-modified-id="Step-1:-load-the-pre-trained-model-6.1"><span class="toc-item-num">6.1&nbsp;&nbsp;</span>Step 1: load the pre-trained model</a></span></li><li><span><a href="#Step-2:-download-the-new-training-images" data-toc-modified-id="Step-2:-download-the-new-training-images-6.2"><span class="toc-item-num">6.2&nbsp;&nbsp;</span>Step 2: download the new training images</a></span></li><li><span><a href="#Step-3:-download-the-new-labels" data-toc-modified-id="Step-3:-download-the-new-labels-6.3"><span class="toc-item-num">6.3&nbsp;&nbsp;</span>Step 3: download the new 
labels</a></span></li></ul></li></ul></div>

# Objectives for this lecture

After today, you will know how to:
* Load a pre-trained CNN image classifier
* Classify your own image using a pre-trained CNN
* Download remote sensing data from Google Earth Engine
* Build a transfer learning model to make predictions from satellite images

# Introducing Tensorflow and Keras

**TensorFlow** is Google's open-source machine learning library. TensorFlow is very powerful, and it is easy to get overwhelmed if you aren't familiar with it. 

**Keras** is a library that serves as a "wrapper" for TensorFlow. A wrapper is a set of high-level functions that make it easy to accomplish basic tasks without writing a lot of code. Keras is written in Python, and an R interface is also available.

![](notebook_images/keras_tensorflow.jpeg)

For this exercise, we will use Keras. 

# VGG-16

VGG-16 is a convolutional neural network (CNN) developed by the Visual Geometry Group (VGG) at the University of Oxford. The model has 16 layers with trainable weights (13 convolutional and 3 fully connected), hence the name VGG-16. 

Consider the model below, which is taken from the slides in class. The model below has 4 layers (convolution, pooling, convolution, pooling). VGG-16 follows the same concept but is much deeper: 13 convolutional layers interleaved with 5 pooling layers, followed by 3 fully connected layers.

![](notebook_images/simple_model.png)

VGG-16 became famous through the 2014 ImageNet Large Scale Visual Recognition Challenge (ILSVRC), where it placed first in the localization task and second in the classification task. The challenge is to see who can build the best image classifier for a very large set of images spanning 1000 classes. The model above classifies images into 10 classes (you can see the 10 output classes as red dots on the right side). Imagine having 1000 classes - that is what VGG-16 was trained to do. Let's take a look at some example images from the ImageNet dataset. 

![Samples from ImageNet](notebook_images/ImageNet.png)

Each image has a label. For example, the image of pizza slices is labeled "pizza". VGG-16 was trained to classify each image into 1 of the 1000 classes with only about 7% classification error. Not bad! (Strictly speaking, 7% is the *top-5* error rate: a prediction counts as correct if the true label is among the model's 5 most probable classes.) There are better models available now, but the prediction accuracy isn't much higher than that of VGG-16. 
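To make the top-5 idea concrete, here is a minimal sketch of how a top-k error rate could be computed. The function name `top_k_error` and the toy scores are made up for illustration; they are not part of Keras or of the ImageNet evaluation code.

```python
def top_k_error(scores_per_image, true_labels, k=5):
    """Fraction of images whose true label is NOT among the k highest-scoring classes."""
    misses = 0
    for scores, true_label in zip(scores_per_image, true_labels):
        # indices of the k classes with the largest scores
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        if true_label not in ranked[:k]:
            misses += 1
    return misses / len(true_labels)


# Toy example: 2 images, 6 classes, made-up scores
scores = [
    [0.05, 0.40, 0.30, 0.12, 0.09, 0.04],  # true class 2 is ranked second
    [0.50, 0.20, 0.15, 0.10, 0.04, 0.01],  # true class 5 is ranked last
]
labels = [2, 5]

top1 = top_k_error(scores, labels, k=1)
top5 = top_k_error(scores, labels, k=5)
print("top-1 error:", top1)
print("top-5 error:", top5)
```

The first image is a miss under top-1 but a hit under top-5, which is why top-5 error is always less than or equal to top-1 error.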

The Visual Geometry Group has made the pre-trained model freely available for anyone to download and use. This means that we can load the model ourselves and, in theory, classify ImageNet images with about 93% top-5 accuracy. 


# Loading VGG-16 

You must have tensorflow and keras installed in order to execute the following code. [This link](https://www.tensorflow.org/install/) shows you how to install tensorflow using pip. You can also install it from Anaconda. 

In [1]:
# Load the model. This may take a minute because the model weights file is pretty large. 
from keras.applications.vgg16 import VGG16
model = VGG16(weights="imagenet")

# Look at the model architecture. There should be 16 layers with weights
# (13 Conv2D + 3 Dense), plus input, pooling, and flatten layers.
model.summary()


Using TensorFlow backend.


_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_1 (InputLayer)         (None, 224, 224, 3)       0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0         
__________

Note that we do not have to use the pre-trained ImageNet weights. We could also just load the VGG-16 architecture and then train it ourselves. However, training a model from scratch requires a lot of data. 
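The `Param #` and `Output Shape` columns in the summary above can be reproduced by hand. The sketch below assumes what the standard VGG-16 design uses: 3x3 convolution kernels with "same" padding (so convolutions keep the spatial size) and 2x2 max pooling (so each pooling layer halves it). The helper name `conv2d_params` is made up for illustration.

```python
def conv2d_params(kernel_size, in_channels, out_channels):
    """Weights plus one bias per output channel for a 2D convolution layer."""
    return (kernel_size * kernel_size * in_channels + 1) * out_channels


# (layer name, input channels, output channels) for the first four conv layers
first_layers = [
    ("block1_conv1", 3, 64),
    ("block1_conv2", 64, 64),
    ("block2_conv1", 64, 128),
    ("block2_conv2", 128, 128),
]

param_counts = {name: conv2d_params(3, c_in, c_out) for name, c_in, c_out in first_layers}
for name, count in param_counts.items():
    print(name, count)

# Each 2x2 max-pooling layer halves the height and width of the feature map
sizes = [224 // (2 ** n) for n in range(3)]
print("spatial size after 0, 1, 2 pooling layers:", sizes)
```

Pooling layers have no weights, which is why their `Param #` entry is 0.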

## Make a prediction using an image

Let's load a new image and see how the model classifies it. Here is an image of Professor Carlson. How do you think the model will classify this image? 

![](notebook_images/carlson.jpg)

First, we need to pre-process the image so that it is in the format that VGG-16 expects. 

In [2]:
from keras.preprocessing.image import load_img, img_to_array
from keras.applications.vgg16 import preprocess_input

# load an image from file
image = load_img('notebook_images/carlson.jpg', target_size=(224, 224))

# convert the image pixels to a numpy array
image = img_to_array(image)

# reshape data for the model (add a batch dimension at the front: 1 x 224 x 224 x 3)
image = image.reshape((1, image.shape[0], image.shape[1], image.shape[2]))

# the training images were pre-processed a bit. This function applies the same pre-processing to the test image
image = preprocess_input(image)
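
For intuition, here is a rough NumPy sketch of what these two steps do. It assumes the "caffe"-style pre-processing that Keras applies for VGG-16: the channels are reordered from RGB to BGR and the ImageNet per-channel means are subtracted, with no rescaling. The function name `vgg16_preprocess_sketch` is made up; in practice you should keep calling `preprocess_input`.

```python
import numpy as np


def vgg16_preprocess_sketch(batch_rgb):
    """Approximate VGG-16 pre-processing: RGB -> BGR, then subtract ImageNet channel means."""
    imagenet_bgr_means = np.array([103.939, 116.779, 123.68], dtype="float32")
    batch_bgr = batch_rgb[..., ::-1].astype("float32")  # reverse the channel axis
    return batch_bgr - imagenet_bgr_means


# A made-up 224 x 224 black image with one pure-red pixel in the corner
image = np.zeros((224, 224, 3), dtype="float32")
image[0, 0] = [255.0, 0.0, 0.0]

# Add the batch dimension: (224, 224, 3) -> (1, 224, 224, 3)
batch = np.expand_dims(image, axis=0)
processed = vgg16_preprocess_sketch(batch)

print(batch.shape)
print(processed[0, 0, 0])  # the red pixel, now in BGR order and mean-centered
```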

Then, make a prediction!

In [3]:
from keras.applications.vgg16 import decode_predictions

# predict the probability across all output classes
prediction = model.predict(image)

# convert the probabilities to class labels
label = decode_predictions(prediction)

# retrieve the top 5 most likely results, i.e. the 5 highest probabilities
label = label[0][0:5]

# print the classifications for the top 5 predictions
for _, class_name, probability in label:
    print('%s (%.2f%%)' % (class_name, probability * 100))

suit (76.90%)
Windsor_tie (16.07%)
bow_tie (1.67%)
oboe (0.98%)
groom (0.71%)


The model is 76.9% certain that the image corresponds to "suit". There probably is not an ImageNet class called "data science professor". 
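The percentages come from the softmax function in the final layer of VGG-16, which turns one raw score per class into probabilities that sum to 1. Here is a minimal sketch with made-up scores for 5 classes (the real model has 1000):

```python
import math


def softmax(raw_scores):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(raw_scores)  # subtract the max for numerical stability
    exps = [math.exp(score - shift) for score in raw_scores]
    total = sum(exps)
    return [value / total for value in exps]


raw_scores = [4.0, 2.4, 0.2, -0.3, -0.7]  # made-up values, one per class
probs = softmax(raw_scores)
for p in probs:
    print("%.2f%%" % (p * 100))
```

Because the probabilities must sum to 1 over the known classes, the model always spreads its belief across ImageNet labels, even for an image that does not belong to any of them.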

We have now successfully used a pre-trained CNN for the purpose that it was originally designed for (i.e., classifying "typical" images).

What if we wanted to use the pre-trained model for a task other than the one it was designed for?

# Transfer Learning

Transfer learning is the process of taking useful knowledge from one problem and transferring it to another problem. Transfer learning is common practice for developing CNNs because one rarely has enough data to train the network from scratch. If we try to train a network with too few observations, then the model will overfit and will not generalize to the population. 

Instead, we can use the weights from a pre-trained model (like VGG-16 with ImageNet weights) and then re-train some (but not all) of the layers to suit our needs. 

For example, suppose we wanted to predict air quality from satellite images of cities. Satellite images of cities look different from the ImageNet images (pizza, cats, etc.), but not *that* different. After all, they're still photographs, right? The low-level features that correspond to basic elements of images (lines, edges, etc.) are still relevant for the new task. So, what we could do is keep *most* of the ImageNet weights and then train *some* upper-level layers to specialize the model to the new task. 

Let's see how it looks if we load the pre-trained VGG-16 but allow the top 3 layers to be specialized to the new task. 

In [4]:
model = VGG16(weights='imagenet', include_top=True)

# Freeze the layers which you don't want to train. Here I am freezing everything
# except the last 3 layers (the fully connected layers), so the 13 convolutional
# layers keep their ImageNet weights.
for layer in model.layers[:-3]:
    layer.trainable = False

model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_2 (InputLayer)         (None, None, None, 3)     0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, None, None, 64)    1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, None, None, 64)    36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, None, None, 64)    0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, None, None, 128)   73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, None, None, 128)   147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, None, None, 128)   0         
__________

Notice that now we have non-trainable parameters as well as trainable parameters. We are *transferring* the first 13 layers from the ImageNet model. The last 3 layers we can train to suit our needs. 
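As a rough check on the trainable versus non-trainable split, the parameter counts can be worked out by hand. This sketch assumes that the 13 convolutional layers are frozen and the 3 fully connected layers are trainable, and it uses the standard VGG-16 layer widths (3x3 kernels, 4096-unit dense layers, 1000 output classes). The helper names are made up for illustration.

```python
# (input channels, output channels) for the 13 convolutional layers of VGG-16
CONV_CHANNELS = [
    (3, 64), (64, 64),
    (64, 128), (128, 128),
    (128, 256), (256, 256), (256, 256),
    (256, 512), (512, 512), (512, 512),
    (512, 512), (512, 512), (512, 512),
]

# (inputs, outputs) for the 3 fully connected layers;
# 7 * 7 * 512 is the size of the flattened output of the last pooling layer
DENSE_SHAPES = [(7 * 7 * 512, 4096), (4096, 4096), (4096, 1000)]


def conv_params(c_in, c_out, kernel=3):
    """Weights plus biases of a convolution layer."""
    return (kernel * kernel * c_in + 1) * c_out


def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return (n_in + 1) * n_out


frozen = sum(conv_params(c_in, c_out) for c_in, c_out in CONV_CHANNELS)
trainable = sum(dense_params(n_in, n_out) for n_in, n_out in DENSE_SHAPES)

print(f"non-trainable (convolutional): {frozen:,}")
print(f"trainable (fully connected):   {trainable:,}")
print(f"total:                         {frozen + trainable:,}")
```

Most of the parameters sit in the fully connected layers, so training "only" the last 3 layers still means fitting a very large number of weights.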

How many layers of a pre-trained model should we freeze for transfer learning? It depends on the task. 

If the new dataset is **small**, then we should probably just train the last layer or two. Training many layers with few observations will lead to overfitting. 

If the new dataset is **large**, then we can train as many layers as we want. 

# Example of Transfer Learning

Let's do a concrete example of transfer learning for environmental engineering. Suppose we wanted to train a CNN to estimate the population density of a given neighborhood (like Duke campus) from satellite images. Population density can be useful for many environmental applications, such as catastrophe risk analysis or hedonic pricing analysis. 

**Note**: Unfortunately, we cannot actually train a new model during this demonstration because it would take too long! However, we will go over all of the steps so that you can use transfer learning for your own purposes.

We need 3 things in order to train the model. 
* pre-trained CNN (with *some* layers trainable)
* new images (satellite images)
* new labels (population density)

## Step 1: load the pre-trained model

We already did this!

## Step 2: download the new training images

In this case, we want satellite images of neighborhoods. To do this, I first obtained neighborhood shapefiles from [Zillow](https://www.zillow.com/research/data/), then downloaded the corresponding NAIP (National Agriculture Imagery Program) images using [Google Earth Engine](https://earthengine.google.com/). 

(Here is a quick [tour](https://code.earthengine.google.com/2afafdd6e7f906f0d2d40f9ec0416bbd) of Google Earth Engine! Note that you will need a Google Earth Engine account to use this service. It's free.)

## Step 3: download the new labels

I downloaded population density data from the [CoolClimate network](https://coolclimate.org/), and then joined the data to the neighborhood shapefiles. 

Once you've downloaded your data (new images and new labels), the best thing to do is make a pandas dataframe in which each row corresponds to an observation. In our case, we would have one column called "image" and one column called "popden". The "image" column would contain strings that correspond to the filepath of the image. The "popden" column would be numeric. 
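As a minimal sketch of that layout, the dataframe could be built like this. The file paths and density values below are made up purely for illustration; they are not taken from the real dataset.

```python
import pandas as pd

# Hypothetical image paths and made-up population densities, one row per neighborhood
df_example = pd.DataFrame({
    "image": [
        "images/neighborhood_001.jpg",
        "images/neighborhood_002.jpg",
        "images/neighborhood_003.jpg",
    ],
    "popden": [1250.0, 310.5, 4875.2],
})

print(df_example)
print(df_example.dtypes)
```

Keeping the image paths (rather than the pixel data) in the dataframe means the images can be read from disk in batches during training instead of being loaded into memory all at once.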

In [5]:
# take a look at the dataframe
import pandas as pd
df=pd.read_csv("data/Zillow.csv")
df.head()

Unnamed: 0.1,Unnamed: 0,RegionID,ZipCode,Population,PersonsPerHousehold,AverageHouseValue,IncomePerHousehold,Latitude,Longitude,Elevation,...,State.y,County,City.y,Name,AFFGEOID10,GEOID10,ALAND10,AWATER10,image,image_remote
0,1,7787,27107,38788,2.61,93400,36279,36.060969,-80.184068,912,...,NC,Forsyth,Winston-Salem,Waughtown,8600000US27107,27107,168507642,1428803,/Users/Jon/Documents/NEWOS_DL/images/Zillow_NC...,images/Zillow_NC/Zillow7787.jpg
1,2,8382,27006,10762,2.48,166200,51467,36.01038,-80.44734,782,...,NC,Davie,Advance,Bermuda Run,8600000US27006,27006,165428089,2643968,/Users/Jon/Documents/NEWOS_DL/images/Zillow_NC...,images/Zillow_NC/Zillow8382.jpg
2,3,12870,28792,27062,2.38,111300,32820,35.337939,-82.450039,2146,...,NC,Henderson,Hendersonville,Mountain Home,8600000US28792,28792,260576005,716642,/Users/Jon/Documents/NEWOS_DL/images/Zillow_NC...,images/Zillow_NC/Zillow12870.jpg
3,4,19609,28213,25882,2.71,110600,41340,35.291316,-80.780822,721,...,NC,Mecklenburg,Charlotte,Newell,8600000US28213,28213,35846902,231444,/Users/Jon/Documents/NEWOS_DL/images/Zillow_NC...,images/Zillow_NC/Zillow19609.jpg
4,5,24954,28601,46176,2.41,122200,42046,35.75358,-81.3249,969,...,NC,Catawba,Hickory,Green Park,8600000US28601,28601,119209116,8448396,/Users/Jon/Documents/NEWOS_DL/images/Zillow_NC...,images/Zillow_NC/Zillow24954.jpg


Here is the complete code that will allow us to train the model using transfer learning. Don't worry if you don't understand it all right now. The main point is to give you a template.

In [6]:
# User input
num_classes = 1 # how many output classes? 1 for regression
y_col = 'popden' # name of y column in dataframe
x_col = 'image' # name of x column in dataframe
batch_size = 25 # how many images to process in parallel
epochs = 1 # how many passes over the training data
model_path = 'saved_models/vgg16_popden_model' # where to save the trained model
results_path = 'output/popden.csv' # where to save the results

In [7]:
# Load packages
print("Loading Packages...")
import keras
from keras.applications.vgg16 import VGG16, preprocess_input
from keras import backend as K
from keras.models import Sequential, Model
from keras.layers import Flatten, Dense, Input, GlobalMaxPooling2D, Dropout
from keras.callbacks import ModelCheckpoint
import pandas as pd
from sklearn import preprocessing

import sys
sys.path.append('packages/')
from im import ImageDataGenerator

Loading Packages...


In [8]:
# Build model
print("building model...")
base_model = VGG16(weights='imagenet', include_top=False, input_shape=(None, None, 3), pooling='max')

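# Freeze all layers except the last 3 (block5_conv3, block5_pool, and the global
# max-pooling layer). Of these, only block5_conv3 has weights to train.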
for layer in base_model.layers[:-3]:
    layer.trainable = False

# add extra dense and dropout layers on top
x = base_model.output
x = Dense(128, activation='tanh')(x)
x = Dropout(0.5)(x)
predictions = Dense(1)(x) 
model = Model(inputs=base_model.input, outputs=predictions)
model.compile(optimizer='Adam', loss='mse', metrics=['mse'])

building model...


In [9]:
# load data
print("loading data...")
df=pd.read_csv("data/Zillow.csv") # this is the dataframe that contains the image paths and response variable

# create a standard scaler for the response variable
std_scaler = preprocessing.StandardScaler()

df_train = df.sample(frac=0.75) # randomly sample 75% of the rows for training
df_test = df.drop(df_train.index) # the remaining 25% of the rows form the test set

std_scaler.fit(df_train[[y_col]]) # fit the scaler on the training set only
df_train[[y_col]] = std_scaler.transform(df_train[[y_col]]) # standardize the training response
df_test[[y_col]] = std_scaler.transform(df_test[[y_col]]) # apply the same scaling to the test response

loading data...
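
To see what the scaler is doing, here is a small pure-Python sketch of the same idea with made-up population densities. The mean and standard deviation are estimated from the training values only and then reused for the test values, so no information from the test set leaks into training. The helper names are made up; `StandardScaler` does this for you, and the sketch matches its use of the population standard deviation.

```python
import statistics


def fit_scaler(values):
    """Return the mean and (population) standard deviation of the training values."""
    return statistics.mean(values), statistics.pstdev(values)


def transform(values, mean, std):
    return [(v - mean) / std for v in values]


def inverse_transform(scaled, mean, std):
    return [s * std + mean for s in scaled]


train_popden = [100.0, 200.0, 300.0, 400.0]  # made-up training responses
test_popden = [250.0, 500.0]                 # made-up test responses

mean, std = fit_scaler(train_popden)
train_scaled = transform(train_popden, mean, std)
test_scaled = transform(test_popden, mean, std)

# Predictions made on the scaled response can be mapped back to original units
recovered = inverse_transform(test_scaled, mean, std)
print(train_scaled)
print(test_scaled)
print(recovered)
```

Because the model is trained on the standardized response, its predictions and its MSE are in standardized units; `inverse_transform` is what converts predictions back to population density.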


We're not actually going to run the next chunk because it trains the model, which takes a long time. 

In [10]:
# train model
print("training model...")

train = []
test = []

# augment the data
data_generator = ImageDataGenerator(preprocessing_function=preprocess_input, 
                                   rotation_range=30,        # data augmentation
                                   horizontal_flip=True,     # data augmentation
                                   width_shift_range = 0.2,  # data augmentation
                                   height_shift_range = 0.2) # data augmentation

train_generator=data_generator.flow_from_dataframe(dataframe=df_train, 
                                                  directory = None,
                                                  x_col=x_col, 
                                                  y_col= y_col, 
                                                  has_ext=True, 
                                                  class_mode="other", # for a numeric response use "other"; for a categorical response use "categorical"
                                                  #target_size=(1000,1000), 
                                                  batch_size=batch_size,
                                                  seed = 42,
                                                  shuffle = True)

valid_generator=data_generator.flow_from_dataframe(dataframe=df_test,
                                                  directory=None,
                                                  x_col=x_col,
                                                  y_col= y_col,
                                                  has_ext=True,
                                                  class_mode="other",
                                                  batch_size=batch_size,
                                                  seed = 42,
                                                  shuffle = True)

STEP_SIZE_TRAIN=train_generator.n//train_generator.batch_size
STEP_SIZE_VALID=valid_generator.n//valid_generator.batch_size

history = model.fit_generator(generator=train_generator,
                    steps_per_epoch=STEP_SIZE_TRAIN,
                    validation_data=valid_generator,
                    validation_steps=STEP_SIZE_VALID,
                    epochs=epochs,
                    #class_weight = class_weights,
                    callbacks = [ModelCheckpoint(model_path,
                                                 monitor="val_mean_squared_error", # validation MSE; "val_loss" is equivalent here because the loss is MSE
                                                 save_best_only=True)])

train.extend(history.history['mean_squared_error'])
test.extend(history.history['val_mean_squared_error'])

training model...
Found 424 images.
Found 142 images.
Epoch 1/1
1/8 [==>...........................] - ETA: 2:56:27 - loss: 4.4093 - mean_squared_error: 4.4093

KeyboardInterrupt: 

In [None]:
# save results
print("saving results...")

# one row per epoch, with one column per metric
results = pd.DataFrame({'training': train, 'validation': test})
results.to_csv(results_path, index=False)

Let's take a look at the training results.

In [None]:
import matplotlib.pyplot as plt

metrics = pd.read_csv("output/popden.csv", header=0, names=['training', 'validation'])

fig, ax = plt.subplots(nrows=1, ncols=1)
plt.plot(metrics['training'], label = 'training MSE')
plt.plot(metrics['validation'], label = 'validation MSE')
plt.xlabel('Epochs', fontsize=20)
plt.ylabel('Mean Squared Error', fontsize=20)
plt.legend(fontsize=20)
plt.show()