### LULC EuroSAT Satellite Imagery - Saving and Loading - Callbacks

The EuroSAT dataset
EuroSAT consists of 27,000 labelled Sentinel-2 satellite images covering 10 land-use classes: residential, industrial, highway, river, forest, pasture, herbaceous vegetation, annual crop, permanent crop and sea/lake. For reference, see the following papers:

Eurosat: A novel dataset and deep learning benchmark for land use and land cover classification. Patrick Helber, Benjamin Bischke, Andreas Dengel, Damian Borth. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2019.
Introducing EuroSAT: A Novel Dataset and Deep Learning Benchmark for Land Use and Land Cover Classification. Patrick Helber, Benjamin Bischke, Andreas Dengel. 2018 IEEE International Geoscience and Remote Sensing Symposium, 2018.

This notebook uses a roughly stratified subset:
- 4000 training images
- 1000 testing images
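
The notebook does not say how the subset was drawn. As a rough illustration only, one way to draw a roughly stratified sample with NumPy is to keep the same fraction of indices from every class. The function name `stratified_indices` and the toy labels are hypothetical and are not the real EuroSAT labels.

```python
import numpy as np

def stratified_indices(labels, fraction, seed=0):
    """Return shuffled indices that keep roughly `fraction` of each class."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    chosen = []
    for cls in np.unique(labels):
        cls_idx = np.flatnonzero(labels == cls)
        # Keep at least one example so that no class disappears.
        n_keep = max(1, int(round(fraction * len(cls_idx))))
        chosen.append(rng.choice(cls_idx, size=n_keep, replace=False))
    chosen = np.concatenate(chosen)
    rng.shuffle(chosen)
    return chosen

# Toy labels: three classes with 100, 200 and 300 examples (hypothetical data).
toy_labels = np.repeat([0, 1, 2], [100, 200, 300])
subset = stratified_indices(toy_labels, fraction=0.1)
print(np.bincount(toy_labels[subset]))
```

The per-class counts in the sample stay proportional to the per-class counts in the full label array, which is what "roughly stratified" is taken to mean here.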

In [None]:
import tensorflow as tf
from tensorflow.keras.preprocessing.image import load_img, img_to_array
from tensorflow.keras.models import Sequential, load_model
from tensorflow.keras.layers import Dense, Flatten, Conv2D, MaxPooling2D
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping
import os
import numpy as np
import pandas as pd

In [None]:
def load_eurosat_data():
    data_dir = 'data/'
    x_train = np.load(os.path.join(data_dir, 'x_train.npy'))
    y_train = np.load(os.path.join(data_dir, 'y_train.npy'))
    x_test  = np.load(os.path.join(data_dir, 'x_test.npy'))
    y_test  = np.load(os.path.join(data_dir, 'y_test.npy'))
    return (x_train, y_train), (x_test, y_test)

(x_train, y_train), (x_test, y_test) = load_eurosat_data()
x_train = x_train / 255.0
x_test = x_test / 255.0

In [None]:
def get_new_model(input_shape):
    """
    Build and compile a Sequential CNN with two convolutional layers, one max-pooling
    layer and two dense layers. The weights are initialised on construction because
    input_shape is passed to the first layer.
    The model is compiled with the Adam optimiser, sparse categorical cross-entropy
    loss, and accuracy as its single metric.
    """
    model = Sequential([
        Conv2D(16, (3,3), activation='relu', padding='SAME', input_shape=input_shape, name='conv_1'), 
        Conv2D(8, (3,3), activation='relu', padding='SAME', name='conv_2'), 
        MaxPooling2D((8,8), name='pool_1'),
        Flatten(name='flatten'),
        Dense(32, activation='relu', name='dense_1'),
        Dense(10, activation='softmax', name='dense_2')
    ])
    model.compile(optimizer='adam',
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy'])
    return model
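
The model pairs a softmax output with the sparse categorical cross-entropy loss, which takes integer class labels rather than one-hot vectors. The NumPy sketch below shows the arithmetic on toy values. It is not the Keras implementation (which, among other things, clips probabilities for numerical stability), and the function and variable names are illustrative.

```python
import numpy as np

def sparse_categorical_crossentropy(y_true, y_prob):
    """Mean negative log-probability assigned to the true integer class."""
    y_true = np.asarray(y_true)
    y_prob = np.asarray(y_prob, dtype=float)
    # Pick, for every row, the probability predicted for its true class.
    picked = y_prob[np.arange(len(y_true)), y_true]
    return float(np.mean(-np.log(picked)))

# Two toy examples over three classes; each row is a softmax output summing to 1.
toy_probs = [[0.7, 0.2, 0.1],
             [0.1, 0.8, 0.1]]
toy_labels = [0, 1]  # integer class indices, not one-hot vectors
loss = sparse_categorical_crossentropy(toy_labels, toy_probs)
print(round(loss, 4))  # 0.2899, the mean of -ln(0.7) and -ln(0.8)
```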

In [None]:
model = get_new_model(x_train[0].shape)

In [None]:
def get_test_accuracy(model, x_test, y_test):
    """Test model classification accuracy"""
    test_loss, test_acc = model.evaluate(x=x_test, y=y_test, verbose=0)
    print('accuracy: {acc:0.3f}'.format(acc=test_acc))

In [None]:
model.summary()
get_test_accuracy(model, x_test, y_test)
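
The layer shapes and parameter counts reported by `model.summary()` can be checked by hand. The sketch below assumes the inputs are 64x64 RGB images, i.e. `input_shape == (64, 64, 3)`. That shape is an assumption, since the notebook never prints it. The helper names are illustrative.

```python
def conv2d_params(kernel, in_ch, out_ch):
    """Kernel weights plus one bias per output channel."""
    return kernel[0] * kernel[1] * in_ch * out_ch + out_ch

def dense_params(in_units, out_units):
    """Weight matrix plus one bias per output unit."""
    return in_units * out_units + out_units

# Assumption: inputs are 64x64 RGB images.
height, width, channels = 64, 64, 3

params = {
    'conv_1': conv2d_params((3, 3), channels, 16),  # 'SAME' padding keeps 64x64
    'conv_2': conv2d_params((3, 3), 16, 8),         # still 64x64, now 8 channels
}
# An 8x8 max pool (stride defaults to the pool size) shrinks 64x64 to 8x8.
flat_units = (height // 8) * (width // 8) * 8
params['dense_1'] = dense_params(flat_units, 32)
params['dense_2'] = dense_params(32, 10)

total = sum(params.values())
print(flat_units, params, total)
```

Under that assumption the flattened vector has 512 units and the model has 18,354 trainable parameters, most of them in `dense_1`.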

3 Callbacks:
- checkpoint_every_epoch: checkpoint that saves the model weights every epoch during training
- checkpoint_best_only: checkpoint that saves only the weights with the highest validation accuracy. Use the testing data as the validation data.
- early_stopping: early stopping object that ends training if the validation accuracy has not improved in 3 epochs.
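
The patience rule of the early stopping callback can be illustrated without TensorFlow. The following pure-Python sketch approximates the logic on a made-up validation accuracy history. It is a simplification rather than the Keras source, and `early_stopping_epoch` is a hypothetical helper.

```python
def early_stopping_epoch(val_accuracies, patience=3):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float('-inf')
    wait = 0
    for epoch, acc in enumerate(val_accuracies, start=1):
        if acc > best:
            # A strict improvement resets the patience counter.
            best = acc
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

toy_history = [0.50, 0.60, 0.58, 0.59, 0.60, 0.61]
print(early_stopping_epoch(toy_history))  # 5: epochs 3, 4 and 5 fail to beat 0.60
```

In this toy run the sixth epoch would have improved on the best value, but training has already stopped by then. That is the trade-off a small patience value makes.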

In [None]:
def get_checkpoint_every_epoch():
    """
    This function should return a ModelCheckpoint object that:
    - saves the weights only at the end of every epoch
    - saves into a directory called 'checkpoints_every_epoch' inside the current working directory
    - generates filenames in that directory like 'checkpoint_XXX' where
      XXX is the epoch number formatted to have three digits, e.g. 001, 002, 003, etc.
    """
    checkpoint_every_epoch = ModelCheckpoint(
        filepath='checkpoints_every_epoch/checkpoint_{epoch:03d}',
        save_weights_only=True,
        save_freq='epoch',  # the parameter is save_freq; 'frequency' is not a ModelCheckpoint argument
        verbose=1)
    return checkpoint_every_epoch


def get_checkpoint_best_only():
    """
    This function should return a ModelCheckpoint object that:
    - saves only the weights that generate the highest validation (testing) accuracy
    - saves into a directory called 'checkpoints_best_only' inside the current working directory
    - generates a file called 'checkpoints_best_only/checkpoint' 
    """
    checkpoint_best_only = ModelCheckpoint(
        filepath='checkpoints_best_only/checkpoint',
        save_freq='epoch',  # the parameter is save_freq; 'frequency' is not a ModelCheckpoint argument
        save_best_only=True,
        monitor='val_accuracy',
        save_weights_only=True,
        verbose=1)
    return checkpoint_best_only


In [None]:
def get_early_stopping():
    """
    This function should return an EarlyStopping callback that stops training when
    the validation (testing) accuracy has not improved in the last 3 epochs.
    HINT: use the EarlyStopping callback with the correct 'monitor' and 'patience'
    """
    # EarlyStopping is already imported from tensorflow.keras.callbacks above.
    early_stopping = EarlyStopping(monitor='val_accuracy', patience=3)
    return early_stopping

In [None]:
checkpoint_every_epoch = get_checkpoint_every_epoch()
checkpoint_best_only = get_checkpoint_best_only()
early_stopping = get_early_stopping()

Train model using the callbacks
Now train the model using the three callbacks you created. If the callbacks are set up correctly, three things should happen:

- At the end of every epoch, the model weights are saved into a directory called checkpoints_every_epoch.
- At the end of an epoch, the model weights are saved into a directory called checkpoints_best_only only if they give the highest testing accuracy so far.
- Training stops when the testing accuracy has not improved in three epochs.

You should then have two directories:

- A directory called checkpoints_every_epoch containing filenames that include checkpoint_001, checkpoint_002, etc., where 001, 002 correspond to the epoch.
- A directory called checkpoints_best_only containing filenames that include checkpoint, which hold only the weights leading to the highest testing accuracy.
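
To make the expected directory contents concrete, the sketch below simulates which checkpoint names each callback would produce for a made-up validation accuracy history. It only mimics the filename template and the "best so far" rule in plain Python. `files_written` is a hypothetical helper, and the real files on disk carry additional suffixes added by the checkpoint format.

```python
def files_written(val_accuracies):
    """Simulate which checkpoints each callback would write per epoch."""
    template = 'checkpoints_every_epoch/checkpoint_{epoch:03d}'
    every_epoch, best_only_epochs = [], []
    best = float('-inf')
    for epoch, acc in enumerate(val_accuracies, start=1):
        # The every-epoch callback writes a new, epoch-numbered name each time.
        every_epoch.append(template.format(epoch=epoch))
        # The best-only callback overwrites its single file on improvement.
        if acc > best:
            best = acc
            best_only_epochs.append(epoch)
    return every_epoch, best_only_epochs

every_epoch, best_only_epochs = files_written([0.50, 0.60, 0.58, 0.62])
print(every_epoch[0], every_epoch[-1])
print(best_only_epochs)  # [1, 2, 4]: epoch 3 did not improve on 0.60
```

Because the best-only filepath contains no `{epoch}` placeholder, each improving epoch replaces the previous file, so that directory never grows.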

In [None]:
callbacks = [checkpoint_every_epoch, checkpoint_best_only, early_stopping]
model.fit(x_train, y_train, epochs=50, validation_data=(x_test, y_test), callbacks=callbacks)

Create new model instances and load both sets of weights
Use the saved weights in a fresh model. Write two functions, each of which takes a freshly instantiated model:

- model_last_epoch should contain the weights from the latest saved epoch
- model_best_epoch should contain the weights from the saved epoch with the highest testing accuracy

Use the tf.train.latest_checkpoint function to get the filename of the latest saved checkpoint file.
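
For intuition about what `tf.train.latest_checkpoint` returns: when weights are saved in TensorFlow's checkpoint format, the directory also holds a small text file named `checkpoint` that records the most recent checkpoint prefix. The pure-Python sketch below reads such a file. It is a rough stand-in under that assumed file layout, not the library's implementation, and `read_latest_checkpoint` is a hypothetical helper.

```python
import os
import re
import tempfile

def read_latest_checkpoint(checkpoint_dir):
    """Rough stand-in: return the prefix recorded in the 'checkpoint' state file."""
    state_path = os.path.join(checkpoint_dir, 'checkpoint')
    if not os.path.exists(state_path):
        return None
    with open(state_path) as f:
        match = re.search(r'^model_checkpoint_path:\s*"([^"]*)"', f.read(), re.MULTILINE)
    if match is None:
        return None
    return os.path.join(checkpoint_dir, match.group(1))

# Toy directory with a hand-written state file (assumed layout, see the text above).
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, 'checkpoint'), 'w') as f:
    f.write('model_checkpoint_path: "checkpoint_007"\n'
            'all_model_checkpoint_paths: "checkpoint_007"\n')
latest = read_latest_checkpoint(demo_dir)
print(os.path.basename(latest))  # checkpoint_007
```

The returned value is a path prefix rather than a single file, which is why it can be passed directly to `model.load_weights`.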

In [None]:
def get_model_last_epoch(model):
    """
    Load the weights from the last training epoch into the given, freshly
    created model and return that model.
    """
    model.load_weights(tf.train.latest_checkpoint(checkpoint_dir='checkpoints_every_epoch'))
    return model
    
    
def get_model_best_epoch(model):
    """
    Load the weights that gave the highest validation accuracy into the given,
    freshly created model and return that model.
    """
    model.load_weights('checkpoints_best_only/checkpoint')
    return model

In [None]:
# Run this cell to create two models: one with the weights from the last training
# epoch, and one with the weights leading to the highest validation (testing) accuracy.
# Verify that the second has a higher validation (testing) accuracy.

model_last_epoch = get_model_last_epoch(get_new_model(x_train[0].shape))
model_best_epoch = get_model_best_epoch(get_new_model(x_train[0].shape))
print('Model with last epoch weights:')
get_test_accuracy(model_last_epoch, x_test, y_test)
print('')
print('Model with best epoch weights:')
get_test_accuracy(model_best_epoch, x_test, y_test)