# Stage 4 AI training: Global hyperparameters

See "Main_AI_training_Stage1.ipynb" for introduction and details about the original and rotated frames, inputs and outputs and machine learning training.

This stage focuses on selecting the optimal global hyperparameters for the model.

In [None]:
# Import all auxiliary functions:
%run Auxiliar_functions.ipynb
# Define magnetometer and datafiles path:
data_path =  './Data/Final_t_BxByBz_zAut_LabFrame/' # Datafiles path
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))

## Load original data and define rotational frames
Load data from files (many segments) and store the information in dataframes, one for each segment. The original data is in the **Laboratory rotational frame (RF1)**. Other rotational frames are defined in a dictionary that specifies the rotation axis and angle of each one. The frame names are arbitrary: each name is a dictionary key, and the corresponding value is a list holding the rotation axis, the angle (in degrees) and the name, in that order.

In [None]:
# Load data:
files = [os.path.join(data_path, file) for file in sorted(os.listdir(data_path))] # Sorted: os.listdir returns entries in arbitrary order
df_all = [pd.read_csv(file) for file in files]
# Define rotational frames in a dictionary: key = name, value = [rotation axis, angle in degrees, name]:
RFs = {
    'RF1': [None,None,'RF1'], # Original RF
    'RF2': [np.array([0.41,0.75,0.52]),90,'RF2'], # Hardest RF
    'RF3': [np.array([0.47,0.79,0.39]),120,'RF3'], # Intermediate RF
}
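
As an illustration of what an axis and angle pair encodes, the following dependency-free sketch rotates a single field vector with Rodrigues' formula. The function name `rotate_about_axis` and the sample field values are hypothetical, and the sign convention (rotating the vector rather than the frame) is an assumption; the actual rotation is performed by the `Time_Wdw` helpers, whose implementation is not shown here.

```python
import math

def rotate_about_axis(vec, axis, angle_deg):
    """Rotate a 3-vector about `axis` by `angle_deg` degrees (Rodrigues' formula)."""
    norm = math.sqrt(sum(a * a for a in axis))
    if norm == 0:
        raise ValueError("rotation axis must be non-zero")
    kx, ky, kz = (a / norm for a in axis)  # unit rotation axis
    vx, vy, vz = vec
    theta = math.radians(angle_deg)
    c, s = math.cos(theta), math.sin(theta)
    # Cross product k x v
    cx = ky * vz - kz * vy
    cy = kz * vx - kx * vz
    cz = kx * vy - ky * vx
    # Dot product k . v
    dot = kx * vx + ky * vy + kz * vz
    # v_rot = v cos(theta) + (k x v) sin(theta) + k (k . v) (1 - cos(theta))
    return (
        vx * c + cx * s + kx * dot * (1 - c),
        vy * c + cy * s + ky * dot * (1 - c),
        vz * c + cz * s + kz * dot * (1 - c),
    )

# Made-up field values in nT, rotated with the axis and angle listed for RF2
B_rf1 = (30000.0, -5000.0, 12000.0)
B_rf2 = rotate_about_axis(B_rf1, (0.41, 0.75, 0.52), 90)
```

A rotation preserves the field magnitude, so only the distribution of the signal among Bx, By and Bz changes between frames.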

## Generate time windows

### Load original data

Each data segment is processed as time windows of fixed length. Each processed segment is stored as a \<Time_Wdw\> object, which holds all the relevant information for magnetic analysis and position labeling, as well as methods for applying arbitrary frame rotations.

In [None]:
# Load time windows from different segments:
wdw_pp = 40 # Number of points for time windows (dt=0.1s)
t_wdws = [] # Initialize list for all segments
norm_aux = 0 # Initialize auxiliary value for normalization
for i_segm in [0,1,2,3,4]:
    t_wdws.append(Time_Wdw(wdw_pp, f'segm{i_segm+1}', 
                              gr_tr= 'zTrue_m' in df_all[i_segm].columns)) # Initiate object
    t_wdws[-1].store_orig_data(df_all[i_segm]) # Store original data 
    #t_wdws[-1].window_data() # Window data without any augmentation
    norm_aux = np.max([norm_aux,np.max(t_wdws[-1].B_RF1)])
# Set normalizing value for future reference [nT]
for t_wdw in t_wdws:
    t_wdw.norm_value = norm_aux/np.sqrt(3) 
# Separate into training and testing datasets:
t_wdws_train = [t_wdws[0],t_wdws[1],t_wdws[2]]
t_wdws_test = [t_wdws[3],t_wdws[4]] # These ones have ground truth
# Print summary:
print('Time window points:',wdw_pp)
print('\n','-'*20,' Training ','-'*20,'\n')
summarize_TW_segments(t_wdws_train)
print('\n','-'*20,' Testing ','-'*20,'\n')
summarize_TW_segments(t_wdws_test)
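
To make the windowing idea concrete, here is a minimal sketch that cuts a sequence of samples into windows of `wdw_pp` consecutive points. The helper `sliding_windows` and its `stride` parameter are hypothetical and only illustrate the concept; how `Time_Wdw.window_data()` actually builds, overlaps or augments its windows is not shown in this notebook.

```python
def sliding_windows(samples, wdw_pp, stride=1):
    """Return every run of `wdw_pp` consecutive samples, stepping by `stride`."""
    if wdw_pp <= 0 or stride <= 0:
        raise ValueError("wdw_pp and stride must be positive")
    last_start = len(samples) - wdw_pp  # last index where a full window fits
    return [samples[i:i + wdw_pp] for i in range(0, last_start + 1, stride)]

# 100 dummy samples at dt = 0.1 s, windows of 40 points (4 s each)
samples = list(range(100))
windows = sliding_windows(samples, wdw_pp=40, stride=20)
print(len(windows), len(windows[0]))  # 4 40
```

With `stride=1` the same 100 samples would yield 61 heavily overlapping windows, so the stride controls the trade-off between dataset size and redundancy.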

### Hyper-parameters - Global

This stage focuses on the ML model's global hyper-parameters: learning rate, optimizer and activation function. Based on the Stage 3 analysis, I've chosen the following model:

* CNN [32 filters, 16 kernel size] + CNN [32 filters, 4 kernel size] + DNN [1024 neurons]+[512 neurons].

**Tunable Hyper-parameters**:
* Activation_Function = ['relu','elu','tanh']
* Optimizer = ['adam', 'adadelta', 'adamax']
* Learning_Rate = [5e-2,1e-2,5e-3,1e-3,5e-4,1e-4,5e-5,1e-5]

Again, every hyper-parameter option is trained for each rotational frame and random initialization seed, with the architecture fixed to the model above.
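
To give a sense of the size of this sweep, the sketch below enumerates the option lists as a full Cartesian product. The option values are copied from the training cell further down; that `train_stage4` visits every combination is an assumption, and the dictionary keys `RF` and `Seed` are hypothetical labels used only for this illustration.

```python
from itertools import product

# Option lists as defined in the training cell of this notebook
activ_opts = ['relu', 'elu', 'tanh']
optim_opts = ['adam', 'adadelta', 'adamax']
lr_opts = [5e-2, 1e-2, 5e-3, 1e-3, 5e-4, 1e-4, 5e-5, 1e-5]
rf_names = ['RF1', 'RF2', 'RF3']
seed_opts = [0]

# One dictionary per training run (assumed full grid)
grid = [
    {"Activation_Function": a, "Optimizer": o, "Learning_Rate": lr,
     "RF": rf, "Seed": s}
    for a, o, lr, rf, s in product(activ_opts, optim_opts, lr_opts,
                                   rf_names, seed_opts)
]
print(len(grid))  # 216
```

Under that assumption the stage amounts to 3 x 3 x 8 x 3 x 1 = 216 training runs, which is why caching already computed results is worthwhile.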

**Rotational Frames**:
* RF1: Original laboratory frame ("Easy"). Here Bx has a very clear correlation with the elevator z-position.
* RF2: "Hard", noise is roughly equally distributed among all magnetic components.
* RF3: "Intermediate" situation.

In [None]:
%run Auxiliar_functions.ipynb
savefigs_path = './Images/Training_models/Stage4/'
results_path = './Results/'
# General hyper-parameters and accuracy criteria:
gen_hyp = {
    "Loss_Function": "mae",
    "Last_Activation_Function": 'linear',
    "Batch_Size": 512,
    "Epochs": 200, 
    "Training_p_val": 0.25,
    "Early_Stop_Monitor": "val_loss",
    "Early_Stop_Min_Delta": 0, # Minimum improvement that counts for early stopping, in [m]
    "Early_Stop_Patience": 15,
    "Early_Stop_Start_From_Epoch":30,
    "Early_Stop_Restore_Best_Weights": True,    
    "z_thres": 1, # in [m]
    "Magnetic_Components": ['Bx','By','Bz'],
    "Time_Window_pp": wdw_pp,
    "Convolutional_Network": True,
    "Conv_Layers": [[32,16],[32,4]],
    "Pool_Layers": [None,None],
    "Dens_Layers": [1024,512],
    "Flatten_Average": True,
    "Dropout_Fraction": 0,
    "Model_Name": "S4_C16C4_NP_Flat_D2048D1024",
    "RF": [RFs["RF1"],RFs["RF2"],RFs["RF3"]],
}

# Options for hyper-parameters:
activ_opts = ['relu','elu','tanh']
optim_opts = ['adam', 'adadelta', 'adamax']
lr_opts = [5e-2,1e-2,5e-3,1e-3,5e-4,1e-4,5e-5,1e-5]

# Options for rotational frames:
RF_opts = [RFs['RF1'],RFs['RF2'],RFs['RF3']]

# Options for random initialization seeds:
seed_opts = [0] # Seeds for training instances

# Train all models (run this only if results have not been generated yet):
check_rep_model = './Results/Train_s1s2s3_Test_s4s5/Stage4_all_Train_s1s2s3_Test_s4s5.csv'
pd_results = train_stage4(gen_hyp,activ_opts,optim_opts,lr_opts,
                          seed_opts,RF_opts,t_wdws_train,t_wdws_test,
                          results_path=results_path,
                          check_rep_model=check_rep_model,
                          quick_timing_test=False)
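
The early-stopping entries in `gen_hyp` can be read as follows. This is a simplified, pure-Python sketch of patience-based early stopping with a warm-up period. It assumes that `train_stage4` forwards these values to a Keras-style early-stopping callback, which is not shown in this notebook; the function `early_stop_epoch` is hypothetical and is not the library's implementation.

```python
def early_stop_epoch(val_losses, patience=15, min_delta=0.0, start_from_epoch=30):
    """Return (stop_epoch, best_epoch) for a list of per-epoch validation losses.

    stop_epoch is None when training runs through every epoch.
    Epochs are 0-indexed.
    """
    best = float("inf")
    best_epoch = None
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if epoch < start_from_epoch:
            continue  # warm-up: these epochs are not monitored
        if loss < best - min_delta:
            # Improvement: remember it and reset the patience counter
            best, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return None, best_epoch

# Loss falls for 50 epochs, then plateaus at a worse value
val_losses = [1.0 - 0.01 * e for e in range(50)] + [0.6] * 50
print(early_stop_epoch(val_losses))  # (64, 49)
```

With `Early_Stop_Restore_Best_Weights` enabled, the weights from `best_epoch` (epoch 49 in this example) would be the ones kept, rather than those from the epoch at which training stopped.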