<img src="NotebookAddons/blackboard-banner.jpg" width="100%" />
<font face="Calibri">
<br>
<font size="5"> <b>Deep Learning in Earth Observation: RNN Cosine Demo</b> </font>

<br>
<font size="4"> <b> Lichao Mou, German Aerospace Center; Xiaoxiang Zhu, German Aerospace Center & Technical University Munich </b> <br>
</font>

<img src="NotebookAddons/dlr-logo-png-transparent.png" width="170" align="right" border="2"/> <font size="3"> This notebook introduces you to the basic concepts of Deep Learning in Earth Observation. Specifically, it uses the simple example of learning the temporal pattern of a cosine curve to demonstrate the concepts of Recurrent Neural Networks (RNNs). The notebook lets you experiment with several hyper-parameters needed for training Deep Learning Networks such as RNNs, CNNs, or similar.
    
This notebook will introduce the following data analysis concepts:
<br>
- How to set up a recurrent deep network within the Python-based <i>keras/tensorflow</i> environment
- How to create an LSTM (long short-term memory) recurrent network
- How to optimize hyper-parameters when training a deep neural network
</font>
</font>
<hr>

<font face="Calibri" size="5" color="red"> <b>Important Note about JupyterHub</b> </font>
<br><br>
<font face="Calibri" size="3"> <b>Your JupyterHub server will automatically shut down when left idle for more than 1 hour. Your notebooks will not be lost, but you will have to restart their kernels and re-run them from the beginning. You will not be able to seamlessly continue running a partially run notebook.</b> </font>
<hr>

# Predict a cosine wave using RNNs

* A simple tutorial on using LSTM and GRU networks to predict a trigonometric wave.

* Noise can be added to the data to test the robustness of the model.

* Hyperparameters of the RNNs can be tweaked.


<hr>
<font face="Calibri" size="4"> <b>0. Importing Relevant Python Packages </b> </font>

<font size="3">Our first step is to <b>import the necessary Python libraries into your Jupyter Notebook.</b></font>

In [None]:
import sys
import os
from PIL import Image
import glob

import numpy as np
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Dense, Activation, LSTM, GRU, TimeDistributed
from keras.optimizers import RMSprop

from asf_notebook import new_directory

<hr>
<font face="Calibri" size="4"><b>1. Create a working directory for the analysis and change into it:</b></font>

In [None]:
base_path = "/home/jovyan/notebooks/SAR_Training/English/data_RNN_cosine"
new_directory(base_path)
os.chdir(base_path)
print(f"Current working directory: {os.getcwd()}")

<hr>
<font face="Calibri" size="4"><b>2. Get cosine data</b></font>
<br><br>
<font face="Calibri" size="3">Data to train and evaluate the RNN:</font>

- Start, end, and step define the range and sampling interval of the data series.
- Sequence length defines how many past samples the model receives as input; the target is the window of the same length that immediately follows.
- Noise can be added to make the training data imperfect.

<br>
<font face="Calibri" size="3"><b>Write a function to define training data:</b></font>

In [None]:
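# Takes: start and end of the time axis, step size between samples,
# sequence length (number of time steps per input/target window), and
# noise level (standard deviation of Gaussian noise) to create imperfect data
# Returns: X, Y arrays of shape (samples, sequence_length, 1)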


def cosine_data(start, end, step, sequence_length, noise_level=0):

    t = np.arange(start, end, step)
    cosine = np.cos(2 * np.pi * t) + noise_level * \
        np.random.normal(0, 1, np.shape(t))
    cosine = cosine.reshape((cosine.shape[0], 1))

    dX, dY = [], []
    for i in range(len(cosine) - 2*sequence_length):
        dX.append(cosine[i:i + sequence_length])
        dY.append(cosine[i + sequence_length:i + 2*sequence_length])
    dataX = np.array(dX)
    dataY = np.array(dY)
    return dataX, dataY
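
<font face="Calibri" size="3">To make the windowing concrete, the following minimal sketch rebuilds the same input/target pairing on a noise-free curve and prints the resulting array shapes. The helper name <i>make_windows</i> is hypothetical and not part of this notebook, and the time axis is built from an integer range (an assumption made here to sidestep floating-point edge effects in <i>np.arange</i>).</font>

```python
import numpy as np


def make_windows(series, sequence_length):
    """Pair each input window with the window that immediately follows it."""
    series = series.reshape(-1, 1)
    n = len(series) - 2 * sequence_length
    X = np.array([series[i:i + sequence_length] for i in range(n)])
    Y = np.array([series[i + sequence_length:i + 2 * sequence_length]
                  for i in range(n)])
    return X, Y


t = np.arange(500) * 0.02            # 500 samples covering t in [0, 10)
series = np.cos(2 * np.pi * t)       # noise-free cosine with period 1
X, Y = make_windows(series, sequence_length=100)
print(X.shape, Y.shape)              # (samples, time steps, features)
```

<font face="Calibri" size="3">Because consecutive windows are shifted by a single sample, neighboring training samples overlap almost completely.</font>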


<hr>
<font face="Calibri" size="4"><b>3. Create an LSTM model</b></font>

- Linear output activation
- Mean squared error loss

<font face="Calibri" size="3"><b> Write a function that creates an LSTM model:</b></font>

In [None]:
# Takes: number of hidden units in the LSTM layer, number of features to predict, and learning_rate
# Returns: compiled model ready for training


def LSTM_(hidden_neurons, feature_count, learning_rate):
    model = Sequential()
    model.add(LSTM(hidden_neurons, input_shape=(None, feature_count),
                   return_sequences=True))
    model.add(TimeDistributed(Dense(feature_count)))
    model.add(Activation('linear'))
    optimizer = RMSprop(learning_rate=learning_rate)
    model.compile(loss='mean_squared_error',
                  optimizer=optimizer, metrics=['mse'])
    return model
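
<font face="Calibri" size="3">The many-to-many structure of this model can be illustrated without Keras. The sketch below uses random numbers as a stand-in for the recurrent layer's per-time-step output and applies one shared dense layer to every time step, which is what wrapping <i>Dense</i> in <i>TimeDistributed</i> amounts to. All sizes and weights are illustrative assumptions, not values taken from a trained model.</font>

```python
import numpy as np

rng = np.random.default_rng(0)
batch, timesteps, hidden, features = 4, 100, 32, 1

# Stand-in for the recurrent layer output when return_sequences=True:
# one hidden vector per time step.
h = rng.normal(size=(batch, timesteps, hidden))

# One dense layer (hypothetical random weights) shared by all time steps.
W = rng.normal(size=(hidden, features))
b = np.zeros(features)

# Matrix multiplication broadcasts over the batch and time axes,
# so every time step is projected with the same W and b.
y = h @ W + b
print(y.shape)   # one output value per time step and sample
```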


<hr>
<font face="Calibri" size="4"><b>3b. Create a GRU model</b></font>

- Linear output activation
- Mean squared error loss

<font face="Calibri" size="3"><b>Write a function that creates a GRU model:</b></font>

In [None]:
# Takes: number of hidden units in the GRU layer, number of features to predict, and learning_rate
# Returns: compiled model ready for training


def GRU_(hidden_neurons, feature_count, learning_rate):
    model = Sequential()
    model.add(GRU(hidden_neurons, input_shape=(None, feature_count),
                  return_sequences=True))
    model.add(TimeDistributed(Dense(feature_count)))
    model.add(Activation('linear'))
    optimizer = RMSprop(learning_rate=learning_rate)
    model.compile(loss='mean_squared_error',
                  optimizer=optimizer, metrics=['mse'])
    return model
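
<font face="Calibri" size="3">One practical difference between the two recurrent layers is their size. The sketch below counts trainable parameters using the textbook gate formulas: four weight sets for an LSTM and three for a GRU. The function names are hypothetical. Keras' GRU also has a <i>reset_after</i> option that keeps a second bias vector, so the number reported by <i>model.summary()</i> may be the larger of the two GRU values depending on the installed version.</font>

```python
def lstm_params(units, features):
    # 4 weight sets, each with input weights, recurrent weights and a bias
    return 4 * (units * (features + units) + units)


def gru_params(units, features, reset_after=False):
    # 3 weight sets; the reset_after variant keeps separate
    # input and recurrent bias vectors
    n_bias = 2 * units if reset_after else units
    return 3 * (units * (features + units) + n_bias)


def dense_params(inputs, outputs):
    # weight matrix plus one bias per output
    return inputs * outputs + outputs


units, features = 32, 1   # the defaults used later in this notebook
print("LSTM:", lstm_params(units, features))
print("GRU:", gru_params(units, features),
      "or", gru_params(units, features, reset_after=True))
print("Dense head:", dense_params(units, features))
```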

<hr>
<font face="Calibri" size="4"><b>4. Write a function to train an RNN model:</b></font>

In [None]:
# Takes: compiled RNN model, X cosine train data, Y cosine train data,
# batch size (number of samples per gradient update), number of epochs
# (passes over the training set), and a run index used in the plot filename
# Returns: training and validation loss histories


def train_cosine(model, dataX, dataY, batch_size, epoch_count, count):

    # Keras holds out the last 5% of the samples for validation
    history = model.fit(dataX, dataY, batch_size=batch_size,
                        epochs=epoch_count, validation_split=0.05)
    loss_history = np.array(history.history['loss'])
    val_loss_history = np.array(history.history['val_loss'])

    # Plot training and validation loss per epoch and save the figure
    plt.rcParams.update({'font.size': 18})
    fig = plt.figure(figsize=(8, 7))
    ax = fig.add_subplot(1, 1, 1)
    ax.plot(loss_history)
    ax.plot(val_loss_history)
    ax.set_ylabel('loss')
    ax.set_xlabel('epoch')
    ax.legend(['train', 'val'], loc='upper right')
    plt.savefig(f"{os.getcwd()}/loss_{count}.png", dpi=72)
    plt.show()

    return loss_history, val_loss_history
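
<font face="Calibri" size="3">The Keras documentation states that <i>validation_split</i> takes the validation samples from the end of the arrays, before any shuffling. The sketch below mimics that split on plain indices to show which samples are held out. The helper name is hypothetical and the exact rounding is an assumption, so treat the numbers as indicative.</font>

```python
import numpy as np


def split_like_validation_split(n_samples, validation_split):
    """Approximate which sample indices Keras trains on and which it holds out."""
    split_at = int(n_samples * (1.0 - validation_split))
    idx = np.arange(n_samples)
    return idx[:split_at], idx[split_at:]


# 300 is the number of windows produced for t in [0, 10) with
# step 0.02 and sequence_length 100.
train_idx, val_idx = split_like_validation_split(300, 0.05)
print(len(train_idx), len(val_idx), val_idx[0])
```

<font face="Calibri" size="3">Since consecutive windows are shifted by one sample, the held-out windows overlap heavily with the last training windows, so the validation loss is likely to be an optimistic estimate.</font>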

<hr>
<font face="Calibri" size="4"><b>5. Write a function to run an RNN model:</b></font>

In [None]:
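# Takes: number of epochs, run index used in output filenames, noise level in training data,
# sequence length available to train an RNN, learning_rate, batch_size, nb_units (hidden units),
# and a boolean indicating whether results should be plotted
# Returns: training and validation loss histories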


def test_cosine(EPOCHS, count, noise_level=0.3, sequence_length=100, learning_rate=1e-3, batch_size=16, nb_units=32, plot_results=False):

    # Training data on t in [0, 10), optionally with noise
    dataX, dataY = cosine_data(
        0.0, 10, 0.02, sequence_length, noise_level)
    # create and fit the RNN
    print('creating model...')

    # Choose RNN to train
    model = LSTM_(nb_units, 1, learning_rate)
    #model = GRU_(nb_units, 1, learning_rate)

    # Train RNN model
    tr_loss, val_loss = train_cosine(model, dataX, dataY, batch_size, EPOCHS, count)

    # Test on noise-free data from a later time range (t in [15, 21))
    dataX1, dataY1 = cosine_data(15.0, 21.0, 0.02, sequence_length)
    predict = model.predict(dataX1)
    if plot_results:
        plot_RNN_results(dataX, dataX1, dataY1, predict, sequence_length, count)

    return tr_loss, val_loss

<hr>
<font face="Calibri" size="4"><b>6. Write a function to plot RNN results:</b></font>

In [None]:
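# Takes: training inputs, test inputs, test targets, model predictions,
# sequence length, and a run index used in the output filename
# Plots network input, forecast, and ground truth on a shared axis and saves the figure
def plot_RNN_results(dataX, dataX1, dataY1, predict, sequence_length, count):
    # NaN padding keeps each curve in its own part of the 2*sequence_length axis
    nan_array = np.full(sequence_length - 1, np.nan)
    nan_array2 = np.full(sequence_length, np.nan)
    ind = np.arange(2*sequence_length)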
    plt.rcParams.update({'font.size': 18})
    fig = plt.figure(figsize=(8, 7))
    ax = fig.add_subplot(1, 1, 1)
    forecasts = np.concatenate(
        (nan_array, dataX1[0, -1:, 0], predict[0, :, 0]))
    ground_truth = np.concatenate(
        (nan_array, dataX1[0, -1:, 0], dataY1[0, :, 0]))
    network_input = np.concatenate((dataX[0, :, 0], nan_array2))

    ax.plot(ind, network_input, 'b-x', label='Network input')
    ax.plot(ind, forecasts, 'r-x', label='Many to many model forecast')
    ax.plot(ind, ground_truth, 'g-x', label='Ground truth')
    handles, labels = ax.get_legend_handles_labels()
    plt.xlabel('t')
    plt.ylabel('cos(t)')
    plt.title('Cosine Many to Many Forecast')
    text = ax.text(-0.2,1.05, " ", transform=ax.transAxes) #this is dummy text, needed by bbox_inches='tight', which requires >1 artist 
    lgd = ax.legend(handles, labels, bbox_to_anchor=(0.5, -0.1), loc='upper center')
    plt.savefig(f"{os.getcwd()}/cosine_wave_{count}.png", dpi=72, bbox_extra_artists=(lgd, text), bbox_inches='tight')
    plt.show()
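
<font face="Calibri" size="3">The NaN padding used above is what lets input and forecast share one time axis: matplotlib skips NaN values, so each curve only appears in its own segment. A small sketch with a sequence length of 4 shows the alignment. The numbers are placeholders chosen for illustration, not model output.</font>

```python
import numpy as np

sequence_length = 4
network_input = np.array([1.0, 0.5, 0.0, -0.5])   # what the model sees
prediction = np.array([-1.0, -0.5, 0.0, 0.5])     # placeholder forecast values

pad_before = np.full(sequence_length - 1, np.nan)
pad_after = np.full(sequence_length, np.nan)

# The input occupies the first half of the axis.
input_track = np.concatenate((network_input, pad_after))
# The forecast starts at the last input sample so the two curves connect.
forecast_track = np.concatenate((pad_before, network_input[-1:], prediction))

print(input_track)
print(forecast_track)
```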

<hr>
<font face="Calibri" size="4"><b>7. Write functions to group and save the plots created by train_cosine() and plot_RNN_results():</b></font>
<br><br>
<font face="Calibri" size="3"><b>Write a function to group the paths to the plots that will be concatenated:</b></font>

In [None]:
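def group_plot_paths(count):
    # Match only the two per-run plots, so that cosine_wave_loss_{i}.png
    # files left over from earlier runs are not picked up again
    paths = []
    for i in range(count):
        run_plots = glob.glob(f"cosine_wave_{i}.png") + glob.glob(f"loss_{i}.png")
        paths.append(sorted(run_plots))
    return paths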

<font face="Calibri" size="3"><b>Write a function to concatenate and save the plots:</b></font>

In [None]:
def concat_plots(count):
    plot_pairs = group_plot_paths(count)
    for i, pair in enumerate(plot_pairs):
        if not pair:
            continue  # no plots were saved for this run
        images = [Image.open(path) for path in pair]
        widths, heights = zip(*(im.size for im in images))
        # Canvas wide enough to hold all plots side by side
        new_image = Image.new('RGBA', (sum(widths), max(heights)))
        x_offset = 0
        for im in images:
            new_image.paste(im, (x_offset, 0))
            x_offset += im.size[0]
            im.close()
        new_image.save(f"cosine_wave_loss_{i}.png", "png")
    delete_files(plot_pairs)

<font face="Calibri" size="3"><b>Write a function to delete the original separate image files after they have been concatenated:</b></font>

In [None]:
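def delete_files(files_list):
    # Expects a list of lists of file paths
    assert isinstance(files_list, list)
    assert all(isinstance(files, list) for files in files_list)
    for files in files_list:
        for file in files:
            try:
                os.remove(file)
            except FileNotFoundError:
                pass  # file was already removed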
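
<font face="Calibri" size="3">Ignoring a missing file during cleanup can also be written with <i>contextlib.suppress</i> from the standard library. The sketch below is one possible alternative, run against a temporary directory; the helper name <i>remove_quietly</i> is hypothetical.</font>

```python
import os
import tempfile
from contextlib import suppress


def remove_quietly(path):
    """Delete a file, ignoring the case where it no longer exists."""
    with suppress(FileNotFoundError):
        os.remove(path)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "loss_0.png")
    open(target, "w").close()     # create an empty placeholder file
    remove_quietly(target)        # removes the file
    remove_quietly(target)        # second call is a no-op
    still_there = os.path.exists(target)

print(still_there)
```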

<hr>
<font face="Calibri" size="4"><b>8. Train and run the RNN model:</b></font>

In [None]:
nb_epochs = 10
noise_level = 0.0
sequence_length = 100
'''
learning_rate = 1e-3  # 1e-1, 1e-2, 1e-3, 1e-4, 1e-5
batch_size = 16  # 1, 2, 4, 8, 16
nb_units = 32  # 8, 16, 32, 64, 128

_, _ = test_cosine(nb_epochs, 0, noise_level=noise_level,
                   plot_results=True)
'''
# try with different noise
'''
noise_level_range = [0, 0.1, 0.2, 0.3, 0.4, 0.5]
for count, nl in enumerate(noise_level_range):
    tr_loss, val_loss = test_cosine(nb_epochs, count, noise_level=nl)
    print('tr_loss:', tr_loss)
    print('val_loss:', val_loss)
'''

# hyperparameter: 1) lr
learning_rate_range = [1e-1, 1e-2, 1e-3, 1e-4, 1e-5]
count = 0
for lr in learning_rate_range:
    tr_loss, val_loss = test_cosine(nb_epochs, count, learning_rate=lr, plot_results=True)
    count += 1
    print('tr_loss:', tr_loss)
    print('val_loss:', val_loss)
concat_plots(len(learning_rate_range))

'''
# hyperparameter: 2) batch size
batch_size_range = [2, 4, 8, 16, 32]
for count, bs in enumerate(batch_size_range):
    tr_loss, val_loss = test_cosine(nb_epochs, count, batch_size=bs, plot_results=True)
    print('tr_loss:', tr_loss)
    print('val_loss:', val_loss)
'''
'''
# hyperparameter: 3) sequence length
sequence_length_range = [20, 50, 100, 200, 500]
for count, sl in enumerate(sequence_length_range):
    tr_loss, val_loss = test_cosine(nb_epochs, count, sequence_length=sl, plot_results=True)
    print('tr_loss:', tr_loss)
    print('val_loss:', val_loss)
'''
'''
# hyperparameter: 4) nb_units
nb_units_range = [8, 16, 32, 64, 128]
for count, nu in enumerate(nb_units_range):
    tr_loss, val_loss = test_cosine(nb_epochs, count, nb_units=nu, plot_results=True)
    print('tr_loss:', tr_loss)
    print('val_loss:', val_loss)
'''
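
<font face="Calibri" size="3">When sweeping a hyperparameter, it helps to reduce each run to a single number before comparing. The sketch below picks the setting with the lowest final-epoch validation loss. The helper <i>best_setting</i> is hypothetical, and the loss values are placeholders for illustration only; they are not results from this notebook.</font>

```python
import numpy as np


def best_setting(values, val_loss_histories):
    """Return the setting whose final-epoch validation loss is lowest."""
    final = [float(np.asarray(h)[-1]) for h in val_loss_histories]
    best = int(np.argmin(final))
    return values[best], final[best]


# Placeholder numbers, NOT measured results: one validation-loss
# history (two epochs each) per learning rate.
rates = [1e-1, 1e-2, 1e-3]
histories = [[0.9, 0.8], [0.5, 0.2], [0.7, 0.6]]
print(best_setting(rates, histories))
```

<font face="Calibri" size="3">In the notebook, the histories would be the <i>val_loss</i> arrays returned by <i>test_cosine()</i> for each value in the sweep. With only a few epochs per run, the final-epoch loss can be noisy, so repeating runs is advisable before drawing conclusions.</font>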
