# About this Notebook

The purpose of this notebook is to study a convolutional network solution. The MLP performs well when the validation/test data come from the same flight as the training data, but it does not generalize well to other flights. We will therefore try a new architecture able to detect more complex patterns in the data: a convolutional network. See the previous notebook for more details.

# Table of Contents

# Import packages

In [1]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from sklearn.preprocessing import MinMaxScaler
import torch
from torch.utils.data import Dataset, DataLoader
from tabulate import tabulate
from tqdm.notebook import tqdm_notebook
import random
import magnav
import os
from ipywidgets import widgets
from torchinfo import summary
import ast
from matplotlib.collections import LineCollection

In [2]:
# Reproducibility
torch.manual_seed(27)
random.seed(27)
np.random.seed(27)

# 1 - What is a Convolutional Neural Network <a class="anchor" id = "1"></a>

A convolutional neural network (CNN, or ConvNet) is a class of artificial neural network (ANN), most commonly applied to analyze visual imagery. Convolutional neural networks are a specialized type of artificial neural network that uses a mathematical operation called convolution in place of general matrix multiplication in at least one of their layers. CNNs are often compared to the way the brain achieves vision processing in living organisms.

Typical architecture of a convolutional neural network:

<img src="../data/external/Images/CNN.jpeg" alt="CNN" width="700"/>

A convolutional neural network consists of an input layer, hidden layers and an output layer. In a CNN, the hidden layers include layers that perform convolutions. Typically this includes a layer that performs a dot product of the convolution kernel with the layer's input matrix. This product is usually the [Frobenius inner product](https://en.wikipedia.org/wiki/Frobenius_inner_product), and its activation function is commonly [ReLU](https://en.wikipedia.org/wiki/Rectifier_(neural_networks)). As the convolution kernel slides along the input matrix for the layer, the convolution operation generates a feature map, which in turn contributes to the input of the next layer. This is followed by other layers such as pooling layers, fully connected layers, and normalization layers.

<font size='3'><b>Convolutional layers</b></font>

Convolutional layers convolve the input and pass the result to the next layer. This is similar to the response of a neuron in the visual cortex to a specific stimulus. After passing through a convolutional layer, the image becomes abstracted to a feature map, also called an activation map :

<img src="../data/external/Images/2D_Convolution_Animation.gif" alt="CNN" width="300"/>
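The sliding-kernel operation described above can be sketched in a few lines of NumPy. This is only an illustration, not the implementation used by PyTorch: the names `conv2d_valid` and `relu` are hypothetical, the kernel is applied without flipping (cross-correlation, which is what deep learning frameworks call convolution), and there is no padding and a stride of 1.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]
            # Frobenius inner product: element-wise product, then sum
            feature_map[i, j] = np.sum(patch * kernel)
    return feature_map

def relu(x):
    """ReLU activation: negative values are set to zero."""
    return np.maximum(x, 0.0)

# Toy 4x4 "image" whose value at (row, col) is 4*row + col
image = np.arange(16, dtype=float).reshape(4, 4)
# Toy 2x2 kernel computing image[i+1, j+1] - image[i, j]
kernel = np.array([[-1.0, 0.0],
                   [0.0, 1.0]])

feature_map = relu(conv2d_valid(image, kernel))
print(feature_map.shape)  # (3, 3)
print(feature_map)
```

A 4x4 input and a 2x2 kernel give a 3x3 feature map, since the kernel fits in 4 - 2 + 1 = 3 positions along each axis.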

<font size='3'><b>Pooling layers</b></font>

Pooling layers reduce the dimensions of data by combining the outputs of neuron clusters at one layer into a single neuron in the next layer. Local pooling combines small clusters; tiling sizes such as 2x2 are commonly used. Global pooling acts on all the neurons of the feature map. There are two common types of pooling in popular use :
- Max pooling uses the maximum value of each local cluster of neurons in the feature map
- Average pooling takes the average value of each local cluster of neurons in the feature map

Here is an example of Max Pooling :

<img src="../data/external/Images/Max_Pooling_GIFg.gif" alt="CNN" width="300"/>
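As a minimal sketch of the two pooling types listed above, the snippet below applies 2x2 max pooling and average pooling to a small feature map with NumPy. The helper name `pool2d` and the toy values are assumptions made for illustration; trailing rows or columns that do not fill a complete tile are simply dropped here.

```python
import numpy as np

def pool2d(feature_map, size=2, mode="max"):
    """Non-overlapping local pooling with square tiles of side `size`."""
    h, w = feature_map.shape
    # Drop trailing rows/columns that do not fill a complete tile
    h_trim, w_trim = h - h % size, w - w % size
    tiles = feature_map[:h_trim, :w_trim].reshape(
        h_trim // size, size, w_trim // size, size
    )
    if mode == "max":
        return tiles.max(axis=(1, 3))
    if mode == "avg":
        return tiles.mean(axis=(1, 3))
    raise ValueError("mode must be 'max' or 'avg'")

fm = np.array([[1.0, 3.0, 2.0, 0.0],
               [4.0, 6.0, 1.0, 1.0],
               [0.0, 2.0, 9.0, 8.0],
               [1.0, 1.0, 7.0, 5.0]])

max_pooled = pool2d(fm, mode="max")  # [[6, 2], [2, 9]]
avg_pooled = pool2d(fm, mode="avg")  # [[3.5, 1.0], [1.0, 7.25]]
print(max_pooled)
print(avg_pooled)
```

Both variants halve each spatial dimension; they differ only in how the four values of a tile are summarized.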

## Specificity of time-series data for CNNs

In our case we have *multivariate* time series data. This means that there is more than one observation for each time step. We are trying to create a model able to use a sequence of multiple input series and output one time series dependent on the input time series. The input time series are parallel because each series has observations at the same time steps.

# 2 - Import of Data

In [3]:
df2 = pd.read_hdf('../data/interim/Chall_dataset.h5', key='Flt1002')
df3 = pd.read_hdf('../data/interim/Chall_dataset.h5', key='Flt1003')
df4 = pd.read_hdf('../data/interim/Chall_dataset.h5', key='Flt1004')
df5 = pd.read_hdf('../data/interim/Chall_dataset.h5', key='Flt1005')
df6 = pd.read_hdf('../data/interim/Chall_dataset.h5', key='Flt1006')
df7 = pd.read_hdf('../data/interim/Chall_dataset.h5', key='Flt1007')

In [4]:
df2.head()

Time [s],TL_comp_mag3_cl,TL_comp_mag5_cl,V_BAT1,V_BAT2,INS_ACC_X,INS_ACC_Y,INS_ACC_Z,CUR_IHTR,PITCH,ROLL,AZIMUTH,LINE,IGRFMAG1
45100.0,-1026.777805,-44.982774,25.827,2.015,-0.109544,-0.389939,9.353474,1.734,9.19,0.19,204.01,1002.01,-297.343
45100.1,-1023.030351,-40.600326,25.826,2.014,-0.157305,-0.441169,9.317263,1.759,9.08,-0.03,203.95,1002.01,-296.223
45100.2,-1021.28623,-34.817623,25.824,2.013,-0.179486,-0.462637,9.348927,1.783,8.96,-0.22,203.91,1002.01,-295.079
45100.3,-1023.965085,-29.347438,25.82,2.01,-0.208515,-0.496153,9.33383,1.796,8.85,-0.39,203.9,1002.01,-293.939
45100.4,-1030.701663,-25.421394,25.815,2.007,-0.252133,-0.507891,9.261835,1.788,8.73,-0.55,203.91,1002.01,-292.821


# 3 - Data Scaling

## 3.1 - MinMax Scaling

In [5]:
df2.describe()

scaling_range = (-1,1)  # MinMaxScaler documents feature_range as a tuple (min, max)
MinMaxScaler_2 = MinMaxScaler(scaling_range)
MinMaxScaler_3 = MinMaxScaler(scaling_range)
MinMaxScaler_4 = MinMaxScaler(scaling_range)
MinMaxScaler_7 = MinMaxScaler(scaling_range)


df2_scaled = pd.DataFrame()
df3_scaled = pd.DataFrame()
df4_scaled = pd.DataFrame()
df7_scaled = pd.DataFrame()


df2_scaled[df2.drop(columns=['LINE','IGRFMAG1']).columns] = MinMaxScaler_2.fit_transform(df2.drop(columns=['LINE','IGRFMAG1']))
df3_scaled[df3.drop(columns=['LINE','IGRFMAG1']).columns] = MinMaxScaler_3.fit_transform(df3.drop(columns=['LINE','IGRFMAG1']))
df4_scaled[df4.drop(columns=['LINE','IGRFMAG1']).columns] = MinMaxScaler_4.fit_transform(df4.drop(columns=['LINE','IGRFMAG1']))
df7_scaled[df7.drop(columns=['LINE','IGRFMAG1']).columns] = MinMaxScaler_7.fit_transform(df7.drop(columns=['LINE','IGRFMAG1']))


df2_scaled.index = df2.index
df3_scaled.index = df3.index
df4_scaled.index = df4.index
df7_scaled.index = df7.index


df2_scaled[['LINE','IGRFMAG1']] = df2[['LINE','IGRFMAG1']]
df3_scaled[['LINE','IGRFMAG1']] = df3[['LINE','IGRFMAG1']]
df4_scaled[['LINE','IGRFMAG1']] = df4[['LINE','IGRFMAG1']]
df7_scaled[['LINE','IGRFMAG1']] = df7[['LINE','IGRFMAG1']]


df2_scaled.describe()

,TL_comp_mag3_cl,TL_comp_mag5_cl,V_BAT1,V_BAT2,INS_ACC_X,INS_ACC_Y,INS_ACC_Z,CUR_IHTR,PITCH,ROLL,AZIMUTH,LINE,IGRFMAG1
count,207578.0,207578.0,207578.0,207578.0,207578.0,207578.0,207578.0,207578.0,207578.0,207578.0,207578.0,207578.0,207578.0
mean,0.120479,-0.499555,-0.643363,-0.078342,-0.124208,-0.031765,0.058171,-0.026525,0.202881,-0.09027,0.105508,1152.355312,15.822918
std,0.060556,0.148419,0.223182,0.484986,0.168563,0.163822,0.109107,0.352792,0.232093,0.257489,0.537493,603.63303,263.641503
min,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0,158.0,-868.652
25%,0.092208,-0.566069,-0.749049,-0.471795,-0.174118,-0.076751,0.013592,-0.260143,0.078834,-0.167997,-0.246764,1002.03,-106.78725
50%,0.119767,-0.498298,-0.673004,-0.298462,-0.122392,-0.031005,0.05833,-0.072517,0.188985,-0.101912,0.016722,1002.15,24.2195
75%,0.14914,-0.444462,-0.596958,0.36,-0.071264,0.01445,0.103938,0.166514,0.307775,-0.032857,0.608167,1002.2,120.8685
max,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,3086.0,2699.331
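The min-max scaling used in this section maps each feature column linearly so that its minimum lands on the lower bound of the range and its maximum on the upper bound, which is why every scaled feature above has `min = -1` and `max = 1`. The sketch below reproduces that formula with plain NumPy on made-up values. The function name `min_max_scale` is hypothetical, and the sketch assumes no column is constant (otherwise the division by `x_max - x_min` is undefined).

```python
import numpy as np

def min_max_scale(x, feature_range=(-1.0, 1.0)):
    """Scale each column of `x` linearly to `feature_range`."""
    lo, hi = feature_range
    x_min = x.min(axis=0)
    x_max = x.max(axis=0)
    # Map to [0, 1] first, then stretch and shift to [lo, hi]
    x_std = (x - x_min) / (x_max - x_min)
    return x_std * (hi - lo) + lo

# Two toy feature columns with different ranges
x = np.array([[25.0, -1.0],
              [26.0,  0.0],
              [29.0,  3.0]])

x_scaled = min_max_scale(x)
print(x_scaled)
```

Because one scaler is fitted per flight in the cell above, the same physical value can map to different scaled values in different flights.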


## 3.2 - Standard Scaling

# 4 - Input Sequence

Unlike with the MLP, the data are fed to the model as sequences :

$ 
\begin{matrix}
&
\begin{bmatrix}
t_0 & t_1 & t_2 & t_3 & \ldots & t_n
\end{bmatrix} 
\\
\begin{bmatrix}
feature\,1 \\
feature\,2 \\
\vdots \\
feature\,k
\end{bmatrix} 
&
\begin{bmatrix}
f_{11} & f_{12} & f_{13} & \ldots & f_{1n} \\
f_{21} & f_{22} & f_{23} & \ldots & f_{2n} \\
\vdots & \vdots & \vdots & \vdots & \vdots \\
f_{k1} & f_{k2} & f_{k3} & \ldots & f_{kn} 
\end{bmatrix}
\\
\begin{bmatrix}
truth
\end{bmatrix} 
&
\begin{bmatrix}
&\quad
&\quad
&\quad
&\quad
&
y_{n}
\end{bmatrix}
\end{matrix}
$


To ensure that only whole sequences are given to the model, we use the function below to remove the trailing samples that do not fit in a full sequence :

In [6]:
def trim_data(data,seq_length):
    # Remove trailing samples that cannot fill a complete sequence
    remainder = len(data) % seq_length
    if remainder != 0:
        data = data[:-remainder]

    return data

The class below creates the dataset and feeds the data to the model as sequences :

In [7]:
class MagNavDataset(Dataset):
    # split can be 'train', 'val' or 'test'
    def __init__(self, df, seq_length, split):
        
        self.seq_length = seq_length
        
        # Get list of features
        self.features   = df.drop(columns=['LINE','IGRFMAG1']).columns.to_list()
        
        if split == 'train':
            
            # Keeping only the sections of flights 1002, 1003 and 1004 for training, except 1002.14
            sections = np.concatenate([df2.LINE.unique(),df3.LINE.unique(),df4.LINE.unique()]).tolist()
            sections.remove(1002.14)
            self.sections = sections
            # Boolean mask of the rows belonging to any training section
            mask_train = df.LINE.isin(sections)
            
            # Split in X, y for training
            X_train    = df.loc[mask_train,self.features]
            y_train    = df.loc[mask_train,'IGRFMAG1']
            
            # Removing data that can't fit in full sequence and convert it to torch tensor
            self.X = torch.t(trim_data(torch.tensor(X_train.to_numpy(),dtype=torch.float32),seq_length))
            self.y = trim_data(torch.tensor(np.reshape(y_train.to_numpy(),[-1,1]),dtype=torch.float32),seq_length)
            
            
            
        elif split == 'val':
            
            # Selecting 1002.14 for validation
            mask_val   = (df.LINE == 1002.14)
            self.sections = 1002.14
            
            # Split in X, y for validation
            X_val      = df.loc[mask_val,self.features]
            y_val      = df.loc[mask_val,'IGRFMAG1']
            
            # Removing data that can't fit in full sequence and convert it to torch tensor
            self.X = torch.t(trim_data(torch.tensor(X_val.to_numpy(),dtype=torch.float32),seq_length))
            self.y = trim_data(torch.tensor(np.reshape(y_val.to_numpy(),[-1,1]),dtype=torch.float32),seq_length)
            
        elif split == 'test':
            
            # Selecting line 1007.06 of flight 1007 as test
            mask_test = df.LINE.isin([1007.06])
            
            # Split in X, y for test
            X_test     = df.loc[mask_test,self.features]
            y_test     = df.loc[mask_test,'IGRFMAG1']
            
            # Removing data that can't fit in full sequence and convert it to torch tensor
            self.X = torch.t(trim_data(torch.tensor(X_test.to_numpy(),dtype=torch.float32),seq_length))
            self.y = trim_data(torch.tensor(np.reshape(y_test.to_numpy(),[-1,1]),dtype=torch.float32),seq_length)
        
    def __getitem__(self, index):
        X = self.X[:,index:(index+self.seq_length)]
        y = self.y[index+self.seq_length-1]
        return X, y
    
    def __len__(self):
        return len(torch.t(self.X))-self.seq_length
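The indexing logic of `__getitem__` and `__len__` can be checked on a toy array. The sketch below uses NumPy instead of the real flight data, and the names `get_window`, `n_features`, `n_steps` and the values are assumptions made for illustration. It mirrors the class above: window `index` covers time steps `index` to `index + seq_length - 1`, and its target is the truth value at the last time step of the window.

```python
import numpy as np

n_features, n_steps, seq_length = 3, 10, 4

# Toy data laid out as (features, time), like self.X after the transpose
X = np.arange(n_features * n_steps, dtype=float).reshape(n_features, n_steps)
# Toy target: y[t] = t, so the target value reveals which time step was picked
y = np.arange(n_steps, dtype=float)

def get_window(index):
    """Return the window starting at `index` and the target at its last step."""
    window = X[:, index:index + seq_length]
    target = y[index + seq_length - 1]
    return window, target

# Same count as __len__ above: number of time steps minus the sequence length
n_windows = n_steps - seq_length

first_window, first_target = get_window(0)
last_window, last_target = get_window(n_windows - 1)

print(first_window.shape)         # (3, 4)
print(first_target, last_target)  # 3.0 8.0
```

With this length, the targets run from `y[seq_length - 1]` to `y[n_steps - 2]`. One more full window (ending at the very last time step) would fit in the data but is not served.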

# 5 - Model

We have seen in a previous [section](#1) what a CNN is. CNNs are mainly used in the field of computer vision, but they can also be used for time series. CNNs are highly noise-resistant and can extract very informative deep features that are independent of time.

## 5.1 - Differences between 2D CNN and 1D CNN

Unlike a 2D convolution, which moves along two dimensions, a 1D convolution moves along the time axis only, with a kernel whose width corresponds to the number of input signals in the case of a multivariate time series. The feature maps produced by a 1D convolution are therefore vectors, whereas those of a 2D convolution are matrices :

<img src="../data/external/Images/conv1d.gif" alt="CNN1D" width="700"/>
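To make the shapes concrete, here is a small sketch using `torch.nn.Conv1d`. The batch size, sequence length, number of filters and kernel size are arbitrary values chosen for illustration, not the settings of the trained models; only the 11 input channels match the number of feature columns shown earlier (all columns except `LINE` and `IGRFMAG1`).

```python
import torch

batch_size, n_features, seq_length = 8, 11, 20

# Input laid out as (batch, channels, time): one channel per input signal
x = torch.randn(batch_size, n_features, seq_length)

# Each of the 4 filters spans all 11 signals and 3 consecutive time steps
conv = torch.nn.Conv1d(in_channels=n_features, out_channels=4, kernel_size=3)

feature_maps = conv(x)

print(conv.weight.shape)   # torch.Size([4, 11, 3])
print(feature_maps.shape)  # torch.Size([8, 4, 18])
```

Each filter yields one vector of length 20 - 3 + 1 = 18 per sample: the kernel only slides along time, while the feature dimension is consumed entirely by the filter.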

Conv1D has several advantages over Conv2D for time series, such as smoothing the data or converting a multivariate problem into a univariate one that is better suited to classical regression algorithms.<br>
To get better performance, the training is done in the terminal. Here are the steps to follow to train a CNN:

<img src="../data/external/Images/train_CNN.png" alt="CNN1D" width="700"/>

Make sure you are at the root of the MagNav folder before running the commands. Several parameters are available; add ```--help``` after ```./src/models/train_CNN.py``` for more information. Note that to change the architecture of the CNN you have to modify ```train_CNN.py``` directly. All trained models are saved in ```models/CNN_runs```. Each name corresponds to the date when the training finished. In the notebook we then load the saved model to test it. Below, we declare the CNN class so that the saved model can be loaded.

In [8]:
class CNN(torch.nn.Module):
    
    def forward(self, x):
        logits = self.architecture(x)
        return logits

# 6 - Visualization of results

In [9]:
def getfiles(dirpath):
    a = [s for s in os.listdir(dirpath)
         if os.path.isfile(os.path.join(dirpath, s))]
    a.sort(key=lambda s: os.path.getmtime(os.path.join(dirpath, s)))
    return a

def getdirs(dirpath):
    a = [s for s in os.listdir(dirpath)
         if os.path.isdir(os.path.join(dirpath, s))]
    a.sort(key=lambda s: os.path.getmtime(os.path.join(dirpath, s)),reverse=True)
    return a

def getmodel_params(folder_name):
    # Read the parameter file and make sure it is closed afterwards
    with open(f'../models/CNN_runs/{folder_name}/parameters.txt','r') as my_file:
        content = my_file.readlines()
    content = [x.rstrip() for x in content]
    content = list(filter(None, content))

    for i,item in enumerate(content):
        if item == 'Epochs :':
            epochs = int(content[i+1])
        if item == 'Batch_size :':
            batch_size = int(content[i+1])
        if item == 'Loss :':
            loss = content[i+1][:-2]
        if item == 'Scaling :':
            scaling = content[i+1]
        if item == 'Sequence_length :':
            seq_len = int(content[i+1])
        if item == 'Training_device :':
            device = content[i+1]
        if item == 'Execution_time :':
            exec_time = int(float(content[i+1][:-1]))
        if item == 'Input_shape :':
            input_shape = ast.literal_eval(content[i+1])
        if item == 'Features :':
            features = ast.literal_eval(content[i+1])
    
    return epochs, batch_size, loss, scaling, seq_len, device, exec_time, input_shape, features

In [10]:
def models_summary(models_folder):
    folders = getdirs(models_folder)
    
    table = pd.DataFrame()
    
    for folder in folders:
        
        epochs, batch_size, loss, scaling, seq_len, device, exec_time, input_shape, features = getmodel_params(folder)
        
        # Read the validation loss history (one value per line)
        with open(f'../models/CNN_runs/{folder}/val_loss.txt','r') as my_file:
            val_loss = [float(x.rstrip()) for x in my_file]
        
        temp = pd.DataFrame([[folder,min(val_loss),batch_size,seq_len]],columns=['Model number','Best Validation RMSE','Batch size','Sequence size'])
        
        table = pd.concat([table,temp])
    
    table = table.reset_index(drop=True)
    
    return table
    
    
# Store the result under a different name so the function is not overwritten
models_table = models_summary('../models/CNN_runs')

In [11]:
models_table.sort_values(by=['Best Validation RMSE']).head(50)

,Model number,Best Validation RMSE,Batch size,Sequence size
548,CNN_220623_2356,6.701367,16,5
152,CNN_220626_2235,6.714774,16,15
378,CNN_220624_2330,6.927098,16,10
137,CNN_220627_0602,6.944742,16,15
329,CNN_220625_1213,6.980413,16,10
165,CNN_220626_1707,7.038318,16,15
147,CNN_220627_0045,7.068888,16,15
485,CNN_220624_0930,7.092378,16,5
132,CNN_220627_0820,7.133365,16,15
157,CNN_220626_2036,7.171991,16,15


In [12]:
df_concat = pd.concat([df2,df3,df4,df7],ignore_index=True,axis=0)

seq_length = 100

train = MagNavDataset(df_concat,seq_length=seq_length,split='train')
val   = MagNavDataset(df_concat,seq_length=seq_length,split='val')
test  = MagNavDataset(df_concat,seq_length=seq_length,split='test')

In [26]:
def compute_SNR(truth_mag,pred_mag):
    
    error = pred_mag - truth_mag
    std_truth = np.std(truth_mag)
    std_error = np.std(error)
    
    SNR = std_truth / std_error
    
    return SNR
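As a quick sanity check of this metric, the sketch below evaluates the same ratio on made-up values. The function is re-declared under the hypothetical name `compute_snr` only to keep the example self-contained. Note that the ratio is expressed in amplitude (standard deviations), not in decibels, and that a constant offset in the predictions does not change it, because the standard deviation removes the mean of the error.

```python
import numpy as np

def compute_snr(truth_mag, pred_mag):
    """Ratio of the truth standard deviation to the error standard deviation."""
    error = pred_mag - truth_mag
    return np.std(truth_mag) / np.std(error)

# Toy truth signal and predictions with a +/-1 error
truth = np.array([0.0, 10.0, 0.0, -10.0])
pred = truth + np.array([1.0, -1.0, 1.0, -1.0])

snr = compute_snr(truth, pred)

# Adding a constant bias to the predictions leaves the ratio unchanged
snr_biased = compute_snr(truth, pred + 5.0)

print(snr, snr_biased)
```

For this reason it is worth reading the SNR together with the RMSE, which does penalize a constant bias.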

In [42]:
# Load a saved model, evaluate it on the test line and display the results

def visualize_model(model_folder):
    
    # Load model
    model = torch.load(f'../models/CNN_runs/{model_folder}/CNN.pt')
    
    # Get parameters
    epochs, batch_size, loss, scaling, seq_len, device, exec_time, input_shape, features = getmodel_params(model_folder)
    
    # Make predictions for test data
    test  = MagNavDataset(df7,seq_length=seq_len,split='test')
    test_loader    = DataLoader(test,
                           batch_size=batch_size,
                           shuffle=False,
                           num_workers=0,
                           pin_memory=False)
    preds = []
    
    # Switch to evaluation mode once and disable gradient tracking for inference
    model.eval()
    with torch.no_grad():
        for inputs, _ in test_loader:
            preds.append(model(inputs.to(device)).cpu())
    preds = np.concatenate(preds)
    
    # Compute RMSE
    RMSE = magnav.rmse(preds,test.y[seq_len:],False)
    SNR  = compute_SNR(test.y[seq_len:].numpy(),preds)
    
    # Display model summary
    print('\n\033[4mModel info :\033[0m\n')
    print(summary(model,input_size=(1,input_shape[0],input_shape[1]),
                  col_names=["kernel_size","input_size","output_size","num_params"],
                  ))
    
    # Display training info
    print('\n\033[4mTraining parameters :\033[0m\n')
    table = [['Epochs',epochs],
             ['Batch Size',batch_size],
             ['Loss function',loss],
             ['Data scaling',scaling],
             ['Input shape',input_shape],
             ['Sequence length', seq_len],
             ['Training time',magnav.to_hms(exec_time)],
             ['Training device',device]]
    print(tabulate(table,headers=['Parameter','Value'],tablefmt="pipe",stralign='right'))
    
    # Display training features
    print('\n\033[4mTraining features :\033[0m\n')
    print(features)
    
    # Display training curves
    print('\n\033[4mTraining curves and predictions :\033[0m\n')
    with open(f'../models/CNN_runs/{model_folder}/train_loss.txt','r') as my_file:
        train_loss = [float(x.rstrip()) for x in my_file]

    with open(f'../models/CNN_runs/{model_folder}/val_loss.txt','r') as my_file:
        val_loss = [float(x.rstrip()) for x in my_file]
    
    fig, ((ax1,ax2),(ax3,ax4)) = plt.subplots(2,2,figsize=[22,16])
    ax1.plot(train_loss,label='Training')
    ax1.plot(val_loss,label='Validation')
    ax1.set_title('Training loss'),ax1.set_ylabel('RMSE'),ax1.set_xlabel('Epochs')
    ax1.legend()
    ax1.grid()
    
    # Prediction and truth
    ax2.plot(test.y[seq_len:],label='Truth')
    ax2.plot(preds,label='Predictions')
    ax2.set_title('Prediction for flight 1007'),ax2.set_ylabel('[nT]'),ax2.set_xlabel('time step')
    ax2.legend()
    ax2.grid()
    
    # Error plot
    error = preds-np.array(test.y[seq_len:])

    ax4.plot(error,label='Error',color='C3')
    ax4.text(0.007,0.967,f'RMSE={RMSE:.2f}nT',fontsize=12,bbox=dict(facecolor = 'C3',alpha=0.6),transform=ax4.transAxes)
    ax4.text(0.007,0.915,f'SNR={SNR:.2f}',fontsize=12,bbox=dict(facecolor = 'C3',alpha=0.6),transform=ax4.transAxes)
    ax4.set_title('Prediction error for flight 1007'),ax4.set_ylabel('[nT]'),ax4.set_xlabel('time step')
    ax4.legend(loc=1)
    ax4.grid()
    
    # Error map
    df1007 = pd.read_hdf('../data/interim/Flt_data.h5', key=f'Flt1007')
    points = np.array([df1007['LONG'].loc[df1007.LINE==1007.06][seq_len:],df1007['LAT'].loc[df1007.LINE==1007.06][seq_len:]]).T.reshape(-1,1,2)
    segments = np.concatenate([points[:-1],points[1:]],axis=1)

    lc = LineCollection(segments,cmap=plt.get_cmap('Spectral'))
    lc.set_array(error.reshape(-1))
    lc.set_clim(vmin=-50,vmax=50)
    ax3.add_collection(lc)

    cbar = plt.colorbar(lc,ax=ax3,label='[nT]')

    ax3.set_xlim(min(df1007['LONG'])-0.1,max(df1007['LONG'])+0.1)
    ax3.set_ylim(min(df1007['LAT'])-0.1,max(df1007['LAT'])+0.1)
    ax3.set_xlabel('Longitude'), ax3.set_ylabel('Latitude'), ax3.set_title('Flight 1007')
    ax3.grid()
    
    plt.show()

In [43]:
sel_model = widgets.Dropdown(
    options     = getdirs('../models/CNN_runs'),
    description = 'Select model: ',
    disabled    = False)

sel_model_txt = widgets.Text(
    placeholder = 'Model Number',
    description = 'String:'
)


# sel_wid_txt = widgets.interactive(visualize_model, model_folder=sel_model_txt)
# display(sel_wid_txt)
sel_wid = widgets.interactive(visualize_model, model_folder=sel_model)
display(sel_wid)


# 7 - XAI