# Grid Search

This notebook contains the results of the Grid Search hyperparameter optimization process. The process consists of two phases:

1. Optimization of layers' hyperparameters (i.e., configuration of specific layers).
2. Optimization of the number of layers.
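
The two phases can be sketched roughly as below. This is a minimal illustration and not the code in the optimization script: the search space, the `toy_score` function and the parameter names `units` and `n_layers` are hypothetical placeholders, and the real process would train and validate a model where `toy_score` is called.

```python
from itertools import product


def grid_search(grid, score, fixed=None):
    """Evaluate every combination in `grid` and return the best one.

    `grid` maps a hyperparameter name to its candidate values, `score` takes
    a dict of hyperparameters and returns a number (higher is better), and
    `fixed` holds hyperparameters that are kept constant during the search.
    """
    fixed = fixed or {}
    names = list(grid)
    best_params, best_score = None, float('-inf')

    for values in product(*(grid[name] for name in names)):
        params = {**fixed, **dict(zip(names, values))}
        current = score(params)
        if current > best_score:
            best_params, best_score = params, current

    return best_params, best_score


def toy_score(params):
    # Hypothetical stand-in for training and validating a model.
    return (
        -abs(params['lr'] - 0.001)
        - abs(params['units'] - 256) / 1000
        - abs(params.get('n_layers', 1) - 2) / 10
    )


# Phase 1: tune the layers' hyperparameters with the depth left at its default.
phase_1_best, _ = grid_search(
    {'lr': [0.01, 0.001, 0.0001], 'units': [128, 256, 512]}, toy_score
)

# Phase 2: keep the phase 1 winners fixed and search only the number of layers.
phase_2_best, _ = grid_search({'n_layers': [1, 2, 3]}, toy_score, fixed=phase_1_best)

print(phase_1_best)
print(phase_2_best)
```

Splitting the search this way keeps the number of evaluated combinations small, at the cost of assuming that the best layer configuration does not change much with depth.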

The hyperparameters for the following models and data sources are optimized:

- **MLP**: **smartphone** (``mlp_sp``), **smartwatch** (``mlp_sw``) and fused (``mlp_fused``) data. 
- **CNN**: **smartphone** (``cnn_sp``), **smartwatch** (``cnn_sw``) and fused (``cnn_fused``) data.
- **LSTM**: **smartphone** (``lstm_sp``), **smartwatch** (``lstm_sw``) and fused (``lstm_fused``) data.
- **CNN-LSTM**: **smartphone** (``cnn-lstm_sp``), **smartwatch** (``cnn-lstm_sw``) and fused (``cnn-lstm_fused``) data.

The Grid Search process was run with the Python script `02_hyperparameter-optmization.py`, located in `lib/pipeline`.

The results of this notebook are summarized in Table II of the paper.

In [1]:
import os
import json
import pandas as pd

from itables import show

GRID_SEARCH_PATH = os.path.join('01_DATA', '02_GRID-SEARCH')
GRID_PHASE_1 = os.path.join(GRID_SEARCH_PATH, 'PHASE1')
GRID_PHASE_2 = os.path.join(GRID_SEARCH_PATH, 'PHASE2')

In [2]:
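def obtain_best_grid_results(path):
    """Return the hyperparameters of the first entry of every JSON report in `path`, keyed by file name."""
    best_grid_reports = {}

    for file in os.listdir(path):
        if not file.endswith('.json'):
            continue

        with open(os.path.join(path, file), 'r') as f:
            # The file name without its extension (e.g., 'mlp_sp') identifies the model.
            best_grid_reports[file.split('.')[0]] = json.load(f)[0]['hyperparameters']

    return best_grid_reports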


ORDER_MAPPING = {
    'mlp_sp': 1,
    'mlp_sw': 2,
    'mlp_fused': 3,
    'cnn_sp': 4,
    'cnn_sw': 5,
    'cnn_fused': 6,
    'lstm_sp': 7,
    'lstm_sw': 8,
    'lstm_fused': 9,
    'cnn-lstm_sp': 10,
    'cnn-lstm_sw': 11,
    'cnn-lstm_fused': 12,
}

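def results_to_df(results):
    """Flatten the per-model hyperparameters into a DataFrame indexed by (Model, Hyperparameter)."""
    reports = []

    for model, report in results.items():
        for hyperparam, value in report.items():
            reports.append([model, hyperparam, value])

    # Sort the rows by model following ORDER_MAPPING instead of alphabetically.
    return (
        pd.DataFrame(reports, columns=['Model', 'Hyperparameter', 'Value'])
        .sort_values(by=['Model'], key=lambda x: x.map(ORDER_MAPPING))
        .set_index(['Model', 'Hyperparameter'])
    )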
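
The helper `obtain_best_grid_results` reads `json.load(f)[0]['hyperparameters']`, which suggests that each report file is a JSON list whose first element holds the selected configuration. The sketch below illustrates that reading on a temporary directory. The file contents, the `score` key and all values are invented for the example, and the assumption that entries are stored best first is not confirmed by this notebook.

```python
import json
import os
import tempfile

# Assumed layout: a list of candidate configurations, best first.
# The 'score' key and every value here are made up for illustration.
example_report = [
    {'hyperparameters': {'lr': 0.001, 'units': 256}, 'score': 0.9},
    {'hyperparameters': {'lr': 0.01, 'units': 128}, 'score': 0.8},
]

with tempfile.TemporaryDirectory() as tmp_dir:
    with open(os.path.join(tmp_dir, 'mlp_sp.json'), 'w') as f:
        json.dump(example_report, f)

    # A non-JSON file, which the loop below skips.
    with open(os.path.join(tmp_dir, 'notes.txt'), 'w') as f:
        f.write('ignored')

    best = {}
    for file in os.listdir(tmp_dir):
        if not file.endswith('.json'):
            continue
        with open(os.path.join(tmp_dir, file), 'r') as f:
            # Keep only the hyperparameters of the first entry, keyed by model name.
            best[os.path.splitext(file)[0]] = json.load(f)[0]['hyperparameters']

print(best)
```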

## Phase 1: Optimization of layers' hyperparameters

In [3]:
best_phase_1 = obtain_best_grid_results(GRID_PHASE_1)
results_to_df(best_phase_1)

Model,Hyperparameter,Value
mlp_sp,lr,0.001
mlp_sp,hidden_layer,512.0
mlp_sp,input_layer,256.0
mlp_sw,lr,0.001
mlp_sw,hidden_layer,512.0
mlp_sw,input_layer,256.0
mlp_fused,lr,0.001
mlp_fused,input_layer,512.0
mlp_fused,hidden_layer,256.0
cnn_sp,lr,0.0005


## Phase 2: Optimization of the number of layers

In [4]:
best_phase_2 = obtain_best_grid_results(GRID_PHASE_2)
show(results_to_df(best_phase_2))

