# Evaluating multiple feature recipes on the CPAC_N7_09_15_20 dataset

- **Model**: Gradient Boosted Trees (histogram-based)
- **Target(s)**: `TF_Pelvis_Moment_X_BWBH` (`TF_Pelvis_Moment_Y_BWBH`)
- **Features**: 63 alternative _recipes_ (combinations of sensor groups)
- **Results**: $R^2$ scores, feature importances (permutation-based)
- **Evaluation strategy**: cross-validation (leave one subject out)
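
Leave-one-subject-out cross-validation holds out all samples of one subject per fold and trains on the remaining subjects. The following is a minimal, library-free sketch of that splitting logic; the helper name `leave_one_group_out` and the toy subject labels are illustrative assumptions and are not part of this notebook, which relies on scikit-learn's `LeaveOneGroupOut` further below.

```python
from collections import defaultdict


def leave_one_group_out(groups):
    """Yield (held_out_group, train_indices, test_indices) once per distinct group."""
    by_group = defaultdict(list)
    for index, group in enumerate(groups):
        by_group[group].append(index)
    for held_out in sorted(by_group):
        test = by_group[held_out]
        # Every sample that does not belong to the held-out group is used for training.
        train = [i for g, idx in by_group.items() if g != held_out for i in idx]
        yield held_out, sorted(train), test


# Hypothetical subject labels: three subjects with two samples each.
subjects = [1, 1, 2, 2, 3, 3]
splits = list(leave_one_group_out(subjects))
for held_out, train, test in splits:
    print(f"held-out subject {held_out}: train={train} test={test}")
```

With 7 subjects this yields 7 folds, and no subject contributes samples to both the training and the test side of a fold.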

## Libraries

In [1]:
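# Standard library
import os
import re
import warnings


# Third party
import numpy as np
import pandas as pd
import sklearn

# Compare versions numerically: as strings, "0.9" would rank above "0.21".
# permutation_importance (used below) requires scikit-learn >= 0.22.
_sklearn_version = tuple(int(n) for n in re.findall(r"\d+", sklearn.__version__)[:2])
assert _sklearn_version >= (0, 22), "Use the conda_python3_latest kernel!"
# The experimental import is only needed for scikit-learn < 1.0.
from sklearn.experimental import enable_hist_gradient_boosting  # noqa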
from sklearn import (ensemble, metrics, preprocessing, 
                     pipeline, inspection, model_selection)

from IPython.display import display, Markdown


# Local
import utils
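
A note on version checks: version strings compare character by character, so a numeric comparison is the safer choice. The sketch below illustrates the difference; `version_tuple` is a hypothetical helper written for this illustration, not a scikit-learn function.

```python
import re


def version_tuple(version):
    """Return the leading numeric components of a version string as a tuple of ints."""
    parts = []
    for part in version.split("."):
        match = re.match(r"\d+", part)
        if match is None:
            break  # stop at the first non-numeric component, e.g. "dev0"
        parts.append(int(match.group()))
    return tuple(parts)


print("0.9" >= "0.21")                                # True: "9" sorts after "2"
print(version_tuple("0.9") >= version_tuple("0.21"))  # False: 9 < 21
```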

## Load Dataset

In [2]:
df_orig = utils.load_dataset("s3://cpac/ORIG/CPAC_N7_09_15_20/CPAC10S_N7_09_15_20.csv")
df_orig.describe()

| | M_Trial_Num | M_Mass | M_Mass_to_L5S1 | M_sub_task_indices | M_sub_task_num | M_include_overall | M_Index | M_Sub | M_sub_task_num_overall | M_Index_overall | ... | RWEO_03_04_00_00_INSOLE_LY_AP_threshF50_mm | RWEO_01_00_00_00_INSOLE_RFORCE_threshF50_N | RWEO_01_02_00_00_INSOLE_RX_ML_threshF50_mm | RWEO_01_02_00_00_INSOLE_RY_AP_threshF50_mm | RWEF_03_00_00_00_INSOLE_LFORCE_threshF50_BW | RWEF_03_04_00_00_INSOLE_LX_ML_threshF50_BH | RWEF_03_04_00_00_INSOLE_LY_AP_threshF50_BH | RWEF_01_00_00_00_INSOLE_RFORCE_threshF50_BW | RWEF_01_02_00_00_INSOLE_RX_ML_threshF50_BH | RWEF_01_02_00_00_INSOLE_RY_AP_threshF50_BH |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 1366326.0 | 1366326.0 | 1040590.0 | 1366326.0 | 1366326.0 | 1366326.0 | 1366326.0 | 1366326.0 | 1366326.0 | 1366326.0 | ... | 1178199.0 | 1363161.0 | 1223529.0 | 1223529.0 | 1363161.0 | 1178199.0 | 1178199.0 | 1363161.0 | 1223529.0 | 1223529.0 |
| mean | 69.07848 | 10.59573 | 0.3126288 | 291.2306 | 5.583528 | 0.8073256 | 2716.548 | 3.855173 | 238.4674 | 88484.65 | ... | 121.6852 | 394.5501 | 49.03974 | 133.4307 | 0.6155901 | 0.02797499 | 0.06701906 | 0.6472693 | 0.02697077 | 0.07335476 |
| std | 23.60488 | 5.886671 | 0.1644402 | 358.9143 | 4.895601 | 0.3943996 | 2598.654 | 1.998726 | 124.6093 | 65760.9 | ... | 52.77402 | 276.8346 | 8.572305 | 48.81799 | 0.5533887 | 0.005171558 | 0.02932588 | 0.562644 | 0.004770854 | 0.02717074 |
| min | 1.0 | 0.0 | 0.0502642 | 1.0 | 1.0 | 0.0 | 1.0 | 1.0 | 1.0 | -0.1950165 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 25% | 49.0 | 5.0 | 0.1602113 | 116.0 | 1.0 | 1.0 | 595.0 | 2.0 | 128.0 | 28249.0 | ... | 76.83 | 160.315 | 45.02 | 94.33 | 0.2381477 | 0.02584712 | 0.04235504 | 0.2565648 | 0.02446361 | 0.05173437 |
| 50% | 79.0 | 10.0 | 0.2776286 | 235.0 | 4.0 | 1.0 | 1853.5 | 4.0 | 250.0 | 85179.0 | ... | 114.06 | 365.045 | 50.61 | 131.74 | 0.5112327 | 0.02904687 | 0.06220556 | 0.53981 | 0.02793333 | 0.07169364 |
| 75% | 87.0 | 15.0 | 0.4583651 | 379.0 | 8.0 | 1.0 | 4185.0 | 6.0 | 343.0 | 142109.0 | ... | 166.74 | 594.563 | 54.64 | 175.1 | 0.8329281 | 0.03126012 | 0.09096354 | 0.8659949 | 0.02999444 | 0.09523958 |
| max | 96.0 | 23.0 | 0.7376715 | 7238.0 | 22.0 | 1.0 | 14119.0 | 7.0 | 457.0 | 237008.0 | ... | 247.7 | 1558.065 | 74.42 | 250.87 | 5.463242 | 0.04218023 | 0.1440116 | 5.719531 | 0.04326744 | 0.1458547 |


## Associate column names

In [3]:
def _get_columns_with_prefix(df, prefix):
    columns = []
    for column in df.columns:
        if column.startswith(prefix):
            columns.append(column)
    return columns
    
def get_target_names(df):
    return _get_columns_with_prefix(df, "T_")

def get_meta_names(df):
    return _get_columns_with_prefix(df, "M_")    
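
The helpers above select columns purely by name prefix. A small, self-contained sketch of the same idea on a plain list of names follows; `columns_with_prefix` and the first three column names are hypothetical stand-ins. Note that `str.startswith("T_")` does not match a name that begins with `TF_`, such as the target used later in this notebook.

```python
def columns_with_prefix(columns, prefix):
    """Return the column names that start with `prefix`, preserving their order."""
    return [column for column in columns if column.startswith(prefix)]


# Hypothetical column names; only the last one is taken from this notebook.
columns = ["M_Sub", "M_Trial_Num", "T_Example_Target", "TF_Pelvis_Moment_Y_BWBH"]

meta_names = columns_with_prefix(columns, "M_")
target_names = columns_with_prefix(columns, "T_")
tf_names = columns_with_prefix(columns, "TF_")

print(meta_names)
print(target_names)  # the "TF_..." column is not included here
print(tf_names)
```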

## Clean-up dataset

- Remove samples based on `M_include_overall`

In [4]:
df = df_orig[df_orig["M_include_overall"] > 0]
print(f"Number of samples: {df.shape[0]:,d} (before clean-up: {df_orig.shape[0]:,d})")
print(f"Number of trials: {len(df['M_Trial_Name'].unique())} (before clean-up: {len(df_orig['M_Trial_Name'].unique())})")
print(f"Number of subjects: {len(df['M_Sub'].unique())}")

Number of samples: 1,103,070 (before clean-up: 1,366,326)
Number of trials: 162 (before clean-up: 162)
Number of subjects: 7


## Predictor configurations (recipes)

In [5]:
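def predictor_short_name(predictor):
    # Assumes names laid out as "<4-char family>_<nn>_<nn>_<nn>_<nn>_<suffix>":
    # drop the 17-character prefix and keep the descriptive suffix.
    return predictor[17:]

def predictor_sensor_number(predictor):
    # Characters 5-6 hold the first two-digit sensor field after the family prefix.
    return int(predictor[5:7])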

def filter_predictors(all_predictors, patterns):
    if isinstance(patterns, str):
        patterns = (patterns,)
        
    predictors = []
    for predictor in all_predictors:
        for pattern in patterns:
            if pattern in predictor:
                predictors.append(predictor)
                break
    return predictors


feature_sets = {

    "Recipe 1: insole,foot IMUs,shank IMUs,thigh IMUs,pelvis IMU,trunk":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('05_12_00_00', '07_08_00_00', '10_11_00_00', '06_07_00_00', '09_10_00_00', '05_06_00_00', '05_09_00_00', '01_00_00_00', '03_00_00_00', '01_02_00_00', '03_04_00_00', '05_08_00_00', '05_11_00_00', '05_07_00_00', '05_10_00_00', '05_00_00_00', '01_03_00_00', '05_12_01_03', '05_06_01_03', '05_09_01_03', 'FOOT_ANGLE_VL', 'SHANK_ANGLE_VL', 'THIGH_ANGLE_VL', 'TRUNK_ANGLE_VL')
        ),

    "Recipe 2: insole,foot IMUs,shank IMUs,thigh IMUs,pelvis IMU":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('07_08_00_00', '10_11_00_00', '06_07_00_00', '09_10_00_00', '05_06_00_00', '05_09_00_00', '01_00_00_00', '03_00_00_00', '01_02_00_00', '03_04_00_00', '05_08_00_00', '05_11_00_00', '05_07_00_00', '05_10_00_00', '05_00_00_00', '01_03_00_00', '05_06_01_03', '05_09_01_03', 'FOOT_ANGLE_VL', 'SHANK_ANGLE_VL', 'THIGH_ANGLE_VL')
        ),

    "Recipe 3: insole,foot IMUs,shank IMUs,thigh IMUs,trunk":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('07_08_00_00', '10_11_00_00', '06_07_00_00', '09_10_00_00', '01_00_00_00', '03_00_00_00', '01_02_00_00', '03_04_00_00', '01_03_00_00', 'FOOT_ANGLE_VL', 'SHANK_ANGLE_VL', 'THIGH_ANGLE_VL', 'TRUNK_ANGLE_VL')
        ),

    "Recipe 4: insole,foot IMUs,shank IMUs,pelvis IMU,trunk":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('05_12_00_00', '07_08_00_00', '10_11_00_00', '01_00_00_00', '03_00_00_00', '01_02_00_00', '03_04_00_00', '05_08_00_00', '05_11_00_00', '05_07_00_00', '05_10_00_00', '05_00_00_00', '01_03_00_00', '05_12_01_03', 'FOOT_ANGLE_VL', 'SHANK_ANGLE_VL', 'TRUNK_ANGLE_VL')
        ),

    "Recipe 5: insole,foot IMUs,thigh IMUs,pelvis IMU,trunk":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('05_12_00_00', '05_06_00_00', '05_09_00_00', '01_00_00_00', '03_00_00_00', '01_02_00_00', '03_04_00_00', '05_08_00_00', '05_11_00_00', '05_00_00_00', '01_03_00_00', '05_12_01_03', '05_06_01_03', '05_09_01_03', 'FOOT_ANGLE_VL', 'THIGH_ANGLE_VL', 'TRUNK_ANGLE_VL')
        ),

    "Recipe 6: insole,shank IMUs,thigh IMUs,pelvis IMU,trunk":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('05_12_00_00', '06_07_00_00', '09_10_00_00', '05_06_00_00', '05_09_00_00', '01_00_00_00', '03_00_00_00', '01_02_00_00', '03_04_00_00', '05_07_00_00', '05_10_00_00', '05_00_00_00', '01_03_00_00', '05_12_01_03', '05_06_01_03', '05_09_01_03', 'SHANK_ANGLE_VL', 'THIGH_ANGLE_VL', 'TRUNK_ANGLE_VL')
        ),

    "Recipe 7: foot IMUs,shank IMUs,thigh IMUs,pelvis IMU,trunk":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('05_12_00_00', '07_08_00_00', '10_11_00_00', '06_07_00_00', '09_10_00_00', '05_06_00_00', '05_09_00_00', '05_08_00_00', '05_11_00_00', '05_07_00_00', '05_10_00_00', '05_00_00_00', 'FOOT_ANGLE_VL', 'SHANK_ANGLE_VL', 'THIGH_ANGLE_VL', 'TRUNK_ANGLE_VL')
        ),

    "Recipe 8: insole,foot IMUs,shank IMUs,thigh IMUs":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('07_08_00_00', '10_11_00_00', '06_07_00_00', '09_10_00_00', '01_00_00_00', '03_00_00_00', '01_02_00_00', '03_04_00_00', '01_03_00_00', 'FOOT_ANGLE_VL', 'SHANK_ANGLE_VL', 'THIGH_ANGLE_VL')
        ),

    "Recipe 9: insole,foot IMUs,shank IMUs,pelvis IMU":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('07_08_00_00', '10_11_00_00', '01_00_00_00', '03_00_00_00', '01_02_00_00', '03_04_00_00', '05_08_00_00', '05_11_00_00', '05_07_00_00', '05_10_00_00', '05_00_00_00', '01_03_00_00', 'FOOT_ANGLE_VL', 'SHANK_ANGLE_VL')
        ),

    "Recipe 10: insole,foot IMUs,shank IMUs,trunk":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('07_08_00_00', '10_11_00_00', '01_00_00_00', '03_00_00_00', '01_02_00_00', '03_04_00_00', '01_03_00_00', 'FOOT_ANGLE_VL', 'SHANK_ANGLE_VL', 'TRUNK_ANGLE_VL')
        ),

    "Recipe 11: insole,foot IMUs,thigh IMUs,pelvis IMU":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('05_06_00_00', '05_09_00_00', '01_00_00_00', '03_00_00_00', '01_02_00_00', '03_04_00_00', '05_08_00_00', '05_11_00_00', '05_00_00_00', '01_03_00_00', '05_06_01_03', '05_09_01_03', 'FOOT_ANGLE_VL', 'THIGH_ANGLE_VL')
        ),

    "Recipe 12: insole,foot IMUs,thigh IMUs,trunk":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('01_00_00_00', '03_00_00_00', '01_02_00_00', '03_04_00_00', '01_03_00_00', 'FOOT_ANGLE_VL', 'THIGH_ANGLE_VL', 'TRUNK_ANGLE_VL')
        ),

    "Recipe 13: insole,foot IMUs,pelvis IMU,trunk":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('05_12_00_00', '01_00_00_00', '03_00_00_00', '01_02_00_00', '03_04_00_00', '05_08_00_00', '05_11_00_00', '05_00_00_00', '01_03_00_00', '05_12_01_03', 'FOOT_ANGLE_VL', 'TRUNK_ANGLE_VL')
        ),

    "Recipe 14: insole,shank IMUs,thigh IMUs,pelvis IMU":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('06_07_00_00', '09_10_00_00', '05_06_00_00', '05_09_00_00', '01_00_00_00', '03_00_00_00', '01_02_00_00', '03_04_00_00', '05_07_00_00', '05_10_00_00', '05_00_00_00', '01_03_00_00', '05_06_01_03', '05_09_01_03', 'SHANK_ANGLE_VL', 'THIGH_ANGLE_VL')
        ),

    "Recipe 15: insole,shank IMUs,thigh IMUs,trunk":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('06_07_00_00', '09_10_00_00', '01_00_00_00', '03_00_00_00', '01_02_00_00', '03_04_00_00', '01_03_00_00', 'SHANK_ANGLE_VL', 'THIGH_ANGLE_VL', 'TRUNK_ANGLE_VL')
        ),

    "Recipe 16: insole,shank IMUs,pelvis IMU,trunk":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('05_12_00_00', '01_00_00_00', '03_00_00_00', '01_02_00_00', '03_04_00_00', '05_07_00_00', '05_10_00_00', '05_00_00_00', '01_03_00_00', '05_12_01_03', 'SHANK_ANGLE_VL', 'TRUNK_ANGLE_VL')
        ),

    "Recipe 17: insole,thigh IMUs,pelvis IMU,trunk":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('05_12_00_00', '05_06_00_00', '05_09_00_00', '01_00_00_00', '03_00_00_00', '01_02_00_00', '03_04_00_00', '05_00_00_00', '01_03_00_00', '05_12_01_03', '05_06_01_03', '05_09_01_03', 'THIGH_ANGLE_VL', 'TRUNK_ANGLE_VL')
        ),

    "Recipe 18: foot IMUs,shank IMUs,thigh IMUs,pelvis IMU":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('07_08_00_00', '10_11_00_00', '06_07_00_00', '09_10_00_00', '05_06_00_00', '05_09_00_00', '05_08_00_00', '05_11_00_00', '05_07_00_00', '05_10_00_00', '05_00_00_00', 'FOOT_ANGLE_VL', 'SHANK_ANGLE_VL', 'THIGH_ANGLE_VL')
        ),

    "Recipe 19: foot IMUs,shank IMUs,thigh IMUs,trunk":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('07_08_00_00', '10_11_00_00', '06_07_00_00', '09_10_00_00', 'FOOT_ANGLE_VL', 'SHANK_ANGLE_VL', 'THIGH_ANGLE_VL', 'TRUNK_ANGLE_VL')
        ),

    "Recipe 20: foot IMUs,shank IMUs,pelvis IMU,trunk":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('05_12_00_00', '07_08_00_00', '10_11_00_00', '05_08_00_00', '05_11_00_00', '05_07_00_00', '05_10_00_00', '05_00_00_00', 'FOOT_ANGLE_VL', 'SHANK_ANGLE_VL', 'TRUNK_ANGLE_VL')
        ),

    "Recipe 21: foot IMUs,thigh IMUs,pelvis IMU,trunk":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('05_12_00_00', '05_06_00_00', '05_09_00_00', '05_08_00_00', '05_11_00_00', '05_00_00_00', 'FOOT_ANGLE_VL', 'THIGH_ANGLE_VL', 'TRUNK_ANGLE_VL')
        ),

    "Recipe 22: shank IMUs,thigh IMUs,pelvis IMU,trunk":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('05_12_00_00', '06_07_00_00', '09_10_00_00', '05_06_00_00', '05_09_00_00', '05_07_00_00', '05_10_00_00', '05_00_00_00', 'SHANK_ANGLE_VL', 'THIGH_ANGLE_VL', 'TRUNK_ANGLE_VL')
        ),

    "Recipe 23: insole,foot IMUs,shank IMUs":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('07_08_00_00', '10_11_00_00', '01_00_00_00', '03_00_00_00', '01_02_00_00', '03_04_00_00', '01_03_00_00', 'FOOT_ANGLE_VL', 'SHANK_ANGLE_VL')
        ),

    "Recipe 24: insole,foot IMUs,thigh IMUs":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('01_00_00_00', '03_00_00_00', '01_02_00_00', '03_04_00_00', '01_03_00_00', 'FOOT_ANGLE_VL', 'THIGH_ANGLE_VL')
        ),

    "Recipe 25: insole,foot IMUs,pelvis IMU":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('01_00_00_00', '03_00_00_00', '01_02_00_00', '03_04_00_00', '05_08_00_00', '05_11_00_00', '05_00_00_00', '01_03_00_00', 'FOOT_ANGLE_VL')
        ),

    "Recipe 26: insole,foot IMUs,trunk":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('01_00_00_00', '03_00_00_00', '01_02_00_00', '03_04_00_00', '01_03_00_00', 'FOOT_ANGLE_VL', 'TRUNK_ANGLE_VL')
        ),

    "Recipe 27: insole,shank IMUs,thigh IMUs":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('06_07_00_00', '09_10_00_00', '01_00_00_00', '03_00_00_00', '01_02_00_00', '03_04_00_00', '01_03_00_00', 'SHANK_ANGLE_VL', 'THIGH_ANGLE_VL')
        ),

    "Recipe 28: insole,shank IMUs,pelvis IMU":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('01_00_00_00', '03_00_00_00', '01_02_00_00', '03_04_00_00', '05_07_00_00', '05_10_00_00', '05_00_00_00', '01_03_00_00', 'SHANK_ANGLE_VL')
        ),

    "Recipe 29: insole,shank IMUs,trunk":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('01_00_00_00', '03_00_00_00', '01_02_00_00', '03_04_00_00', '01_03_00_00', 'SHANK_ANGLE_VL', 'TRUNK_ANGLE_VL')
        ),

    "Recipe 30: insole,thigh IMUs,pelvis IMU":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('05_06_00_00', '05_09_00_00', '01_00_00_00', '03_00_00_00', '01_02_00_00', '03_04_00_00', '05_00_00_00', '01_03_00_00', '05_06_01_03', '05_09_01_03', 'THIGH_ANGLE_VL')
        ),

    "Recipe 31: insole,thigh IMUs,trunk":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('01_00_00_00', '03_00_00_00', '01_02_00_00', '03_04_00_00', '01_03_00_00', 'THIGH_ANGLE_VL', 'TRUNK_ANGLE_VL')
        ),

    "Recipe 32: insole,pelvis IMU,trunk":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('05_12_00_00', '01_00_00_00', '03_00_00_00', '01_02_00_00', '03_04_00_00', '05_00_00_00', '01_03_00_00', '05_12_01_03', 'TRUNK_ANGLE_VL')
        ),

    "Recipe 33: foot IMUs,shank IMUs,thigh IMUs":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('07_08_00_00', '10_11_00_00', '06_07_00_00', '09_10_00_00', 'FOOT_ANGLE_VL', 'SHANK_ANGLE_VL', 'THIGH_ANGLE_VL')
        ),

    "Recipe 34: foot IMUs,shank IMUs,pelvis IMU":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('07_08_00_00', '10_11_00_00', '05_08_00_00', '05_11_00_00', '05_07_00_00', '05_10_00_00', '05_00_00_00', 'FOOT_ANGLE_VL', 'SHANK_ANGLE_VL')
        ),

    "Recipe 35: foot IMUs,shank IMUs,trunk":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('07_08_00_00', '10_11_00_00', 'FOOT_ANGLE_VL', 'SHANK_ANGLE_VL', 'TRUNK_ANGLE_VL')
        ),

    "Recipe 36: foot IMUs,thigh IMUs,pelvis IMU":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('05_06_00_00', '05_09_00_00', '05_08_00_00', '05_11_00_00', '05_00_00_00', 'FOOT_ANGLE_VL', 'THIGH_ANGLE_VL')
        ),

    "Recipe 37: foot IMUs,thigh IMUs,trunk":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('FOOT_ANGLE_VL', 'THIGH_ANGLE_VL', 'TRUNK_ANGLE_VL')
        ),

    "Recipe 38: foot IMUs,pelvis IMU,trunk":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('05_12_00_00', '05_08_00_00', '05_11_00_00', '05_00_00_00', 'FOOT_ANGLE_VL', 'TRUNK_ANGLE_VL')
        ),

    "Recipe 39: shank IMUs,thigh IMUs,pelvis IMU":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('06_07_00_00', '09_10_00_00', '05_06_00_00', '05_09_00_00', '05_07_00_00', '05_10_00_00', '05_00_00_00', 'SHANK_ANGLE_VL', 'THIGH_ANGLE_VL')
        ),

    "Recipe 40: shank IMUs,thigh IMUs,trunk":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('06_07_00_00', '09_10_00_00', 'SHANK_ANGLE_VL', 'THIGH_ANGLE_VL', 'TRUNK_ANGLE_VL')
        ),

    "Recipe 41: shank IMUs,pelvis IMU,trunk":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('05_12_00_00', '05_07_00_00', '05_10_00_00', '05_00_00_00', 'SHANK_ANGLE_VL', 'TRUNK_ANGLE_VL')
        ),

    "Recipe 42: thigh IMUs,pelvis IMU,trunk":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('05_12_00_00', '05_06_00_00', '05_09_00_00', '05_00_00_00', 'THIGH_ANGLE_VL', 'TRUNK_ANGLE_VL')
        ),

    "Recipe 43: insole,foot IMUs":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('01_00_00_00', '03_00_00_00', '01_02_00_00', '03_04_00_00', '01_03_00_00', 'FOOT_ANGLE_VL')
        ),

    "Recipe 44: insole,shank IMUs":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('01_00_00_00', '03_00_00_00', '01_02_00_00', '03_04_00_00', '01_03_00_00', 'SHANK_ANGLE_VL')
        ),

    "Recipe 45: insole,thigh IMUs":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('01_00_00_00', '03_00_00_00', '01_02_00_00', '03_04_00_00', '01_03_00_00', 'THIGH_ANGLE_VL')
        ),

    "Recipe 46: insole,pelvis IMU":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('01_00_00_00', '03_00_00_00', '01_02_00_00', '03_04_00_00', '05_00_00_00', '01_03_00_00')
        ),

    "Recipe 47: insole,trunk":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('01_00_00_00', '03_00_00_00', '01_02_00_00', '03_04_00_00', '01_03_00_00', 'TRUNK_ANGLE_VL')
        ),

    "Recipe 48: foot IMUs,shank IMUs":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('07_08_00_00', '10_11_00_00', 'FOOT_ANGLE_VL', 'SHANK_ANGLE_VL')
        ),

    "Recipe 49: foot IMUs,thigh IMUs":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('FOOT_ANGLE_VL', 'THIGH_ANGLE_VL')
        ),

    "Recipe 50: foot IMUs,pelvis IMU":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('05_08_00_00', '05_11_00_00', '05_00_00_00', 'FOOT_ANGLE_VL')
        ),

    "Recipe 51: foot IMUs,trunk":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('FOOT_ANGLE_VL', 'TRUNK_ANGLE_VL')
        ),

    "Recipe 52: shank IMUs,thigh IMUs":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('06_07_00_00', '09_10_00_00', 'SHANK_ANGLE_VL', 'THIGH_ANGLE_VL')
        ),

    "Recipe 53: shank IMUs,pelvis IMU":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('05_07_00_00', '05_10_00_00', '05_00_00_00', 'SHANK_ANGLE_VL')
        ),

    "Recipe 54: shank IMUs,trunk":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('SHANK_ANGLE_VL', 'TRUNK_ANGLE_VL')
        ),

    "Recipe 55: thigh IMUs,pelvis IMU":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('05_06_00_00', '05_09_00_00', '05_00_00_00', 'THIGH_ANGLE_VL')
        ),

    "Recipe 56: thigh IMUs,trunk":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('THIGH_ANGLE_VL', 'TRUNK_ANGLE_VL')
        ),

    "Recipe 57: pelvis IMU,trunk":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('05_12_00_00', '05_00_00_00', 'TRUNK_ANGLE_VL')
        ),

    "Recipe 58: insole":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('01_00_00_00', '03_00_00_00', '01_02_00_00', '03_04_00_00', '01_03_00_00')
        ),

    "Recipe 59: foot IMUs":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('FOOT_ANGLE_VL',)
        ),

    "Recipe 60: shank IMUs":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('SHANK_ANGLE_VL',)
        ),

    "Recipe 61: thigh IMUs":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('THIGH_ANGLE_VL',)
        ),

    "Recipe 62: pelvis IMU":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('05_00_00_00',)
        ),

    "Recipe 63: trunk":
        filter_predictors(
            filter_predictors(df.columns, ("SWRF", "SWEF")),
            ('TRUNK_ANGLE_VL',)
        )    
}

for feature_set_name, predictors in feature_sets.items():
    sensors = set(map(predictor_sensor_number, predictors))
    print(f"{feature_set_name}\n\tPredictors: {len(predictors)}, Sensors: {len(sensors)}\n")

Recipe 1: insole,foot IMUs,shank IMUs,thigh IMUs,pelvis IMU,trunk
	Predictors: 80, Sensors: 7

Recipe 2: insole,foot IMUs,shank IMUs,thigh IMUs,pelvis IMU
	Predictors: 58, Sensors: 7

Recipe 3: insole,foot IMUs,shank IMUs,thigh IMUs,trunk
	Predictors: 48, Sensors: 7

Recipe 4: insole,foot IMUs,shank IMUs,pelvis IMU,trunk
	Predictors: 54, Sensors: 5

Recipe 5: insole,foot IMUs,thigh IMUs,pelvis IMU,trunk
	Predictors: 62, Sensors: 3

Recipe 6: insole,shank IMUs,thigh IMUs,pelvis IMU,trunk
	Predictors: 68, Sensors: 5

Recipe 7: foot IMUs,shank IMUs,thigh IMUs,pelvis IMU,trunk
	Predictors: 57, Sensors: 5

Recipe 8: insole,foot IMUs,shank IMUs,thigh IMUs
	Predictors: 41, Sensors: 7

Recipe 9: insole,foot IMUs,shank IMUs,pelvis IMU
	Predictors: 32, Sensors: 5

Recipe 10: insole,foot IMUs,shank IMUs,trunk
	Predictors: 36, Sensors: 5

Recipe 11: insole,foot IMUs,thigh IMUs,pelvis IMU
	Predictors: 40, Sensors: 3

Recipe 12: insole,foot IMUs,thigh IMUs,trunk
	Predictors: 30, Sensors: 3

...
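
Each recipe is built by two substring filters: first on the feature family (`SWRF`/`SWEF`), then on sensor codes or signal names. The sketch below replays that logic on a handful of names. The names are hypothetical, modelled on the column layout visible in the dataset summary above, and `filter_names` is an illustrative re-implementation rather than the function used in this notebook.

```python
def filter_names(names, patterns):
    """Keep the names that contain at least one of the substrings in `patterns`."""
    if isinstance(patterns, str):
        patterns = (patterns,)
    return [name for name in names if any(p in name for p in patterns)]


# Hypothetical names: a 4-character family prefix, four 2-digit fields, then a suffix.
names = [
    "SWEF_01_02_00_00_INSOLE_RX_ML_threshF50_BH",
    "SWRF_05_12_00_00_EXAMPLE_SIGNAL",
    "RWEF_01_02_00_00_INSOLE_RX_ML_threshF50_BH",
    "M_Sub",
]

family = filter_names(names, ("SWRF", "SWEF"))    # stage 1: feature family
selected = filter_names(family, ("01_02_00_00",))  # stage 2: sensor code

for name in selected:
    print(name, "| sensor:", int(name[5:7]), "| short name:", name[17:])
```

Because matching is by substring, a pattern selects every column whose name contains it anywhere, so patterns need to be specific enough to avoid unintended matches.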

## Train and evaluate boosted tree models

In [8]:
def evaluate(target_name, feature_names):
    X, y, groups = df[feature_names], df[target_name], df["M_Sub"]
    
    model = pipeline.Pipeline([
        ('scaler', preprocessing.StandardScaler()),
        ('gboost', ensemble.HistGradientBoostingRegressor())
    ])
    
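    # One fold per subject; the default scorer for regressors is R^2,
    # and the reported score is the unweighted mean over the folds.
    logo = model_selection.LeaveOneGroupOut()
    r2_score = model_selection.cross_val_score(
        model, X, y, cv=logo, groups=groups, n_jobs=-1).mean()

    # Permutation importances are computed in-sample: the model is refit on
    # all data and scored on that same data, not on held-out subjects.
    model.fit(X, y)
    perm_imp = inspection.permutation_importance(model, X, y, n_repeats=5, n_jobs=-1)
    importance = pd.Series(perm_imp.importances_mean, index=X.columns)
    importance = importance.sort_values(ascending=False)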

    return r2_score, importance

def run_experiments(target_name, feature_sets):
    r2_scores = {}
    importances = {}

    # Silence warnings during the runs; the previous filters are restored on exit.
    with warnings.catch_warnings():
        warnings.simplefilter('ignore')
        for feature_set_name, feature_names in feature_sets.items():
            r2_score, importance = evaluate(target_name, feature_names)
            r2_scores[feature_set_name] = r2_score
            importances[feature_set_name] = importance
            display(
                Markdown(
                    "---\n"
                    f"**Target**: {target_name}  \n"
                    f"**Features**: {feature_set_name}  \n"
                    f"**$R^2$ = {r2_score:.3f}**"
                )
            )

    return r2_scores, importances
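
To make the two reported quantities concrete, here is a minimal pure-Python sketch of $R^2$ and of permutation importance (the mean drop in $R^2$ when one feature column is shuffled). It is an illustration under simplifying assumptions, not scikit-learn's implementation: the function names, the toy data, and the stand-in `predict` function are all hypothetical.

```python
import random


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def permutation_importance(predict, rows, y, n_repeats=5, seed=0):
    """Mean drop in R^2 after shuffling each column in turn."""
    rng = random.Random(seed)
    baseline = r2(y, [predict(row) for row in rows])
    importances = []
    for col in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in rows]
            rng.shuffle(shuffled)
            # Rebuild the rows with only this one column permuted.
            permuted = [row[:col] + [v] + row[col + 1:] for row, v in zip(rows, shuffled)]
            drops.append(baseline - r2(y, [predict(row) for row in permuted]))
        importances.append(sum(drops) / n_repeats)
    return importances


# Toy stand-in for a fitted model: it uses column 0 and ignores column 1.
def predict(row):
    return 2.0 * row[0]


rows = [[float(i), float((7 * i) % 10)] for i in range(10)]
y = [2.0 * row[0] for row in rows]

importances = permutation_importance(predict, rows, y)
print(importances)
```

A feature the model ignores gets an importance of zero, while shuffling a feature the model depends on lowers the score. Since `evaluate` measures this on the data the model was fit on, the importances describe what the fitted model relies on rather than how well each feature generalizes to unseen subjects.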


## Run experiments, save data

In [11]:
target_name = "TF_Pelvis_Moment_Y_BWBH"
r2_scores, importances = run_experiments(target_name, feature_sets)

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 1: insole,foot IMUs,shank IMUs,thigh IMUs,pelvis IMU,trunk  
**$R^2$ = 0.872**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 2: insole,foot IMUs,shank IMUs,thigh IMUs,pelvis IMU  
**$R^2$ = 0.850**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 3: insole,foot IMUs,shank IMUs,thigh IMUs,trunk  
**$R^2$ = 0.851**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 4: insole,foot IMUs,shank IMUs,pelvis IMU,trunk  
**$R^2$ = 0.859**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 5: insole,foot IMUs,thigh IMUs,pelvis IMU,trunk  
**$R^2$ = 0.870**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 6: insole,shank IMUs,thigh IMUs,pelvis IMU,trunk  
**$R^2$ = 0.877**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 7: foot IMUs,shank IMUs,thigh IMUs,pelvis IMU,trunk  
**$R^2$ = 0.732**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 8: insole,foot IMUs,shank IMUs,thigh IMUs  
**$R^2$ = 0.740**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 9: insole,foot IMUs,shank IMUs,pelvis IMU  
**$R^2$ = 0.831**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 10: insole,foot IMUs,shank IMUs,trunk  
**$R^2$ = 0.848**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 11: insole,foot IMUs,thigh IMUs,pelvis IMU  
**$R^2$ = 0.843**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 12: insole,foot IMUs,thigh IMUs,trunk  
**$R^2$ = 0.854**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 13: insole,foot IMUs,pelvis IMU,trunk  
**$R^2$ = 0.846**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 14: insole,shank IMUs,thigh IMUs,pelvis IMU  
**$R^2$ = 0.857**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 15: insole,shank IMUs,thigh IMUs,trunk  
**$R^2$ = 0.855**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 16: insole,shank IMUs,pelvis IMU,trunk  
**$R^2$ = 0.860**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 17: insole,thigh IMUs,pelvis IMU,trunk  
**$R^2$ = 0.870**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 18: foot IMUs,shank IMUs,thigh IMUs,pelvis IMU  
**$R^2$ = 0.704**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 19: foot IMUs,shank IMUs,thigh IMUs,trunk  
**$R^2$ = 0.717**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 20: foot IMUs,shank IMUs,pelvis IMU,trunk  
**$R^2$ = 0.718**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 21: foot IMUs,thigh IMUs,pelvis IMU,trunk  
**$R^2$ = 0.723**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 22: shank IMUs,thigh IMUs,pelvis IMU,trunk  
**$R^2$ = 0.739**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 23: insole,foot IMUs,shank IMUs  
**$R^2$ = 0.730**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 24: insole,foot IMUs,thigh IMUs  
**$R^2$ = 0.744**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 25: insole,foot IMUs,pelvis IMU  
**$R^2$ = 0.804**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 26: insole,foot IMUs,trunk  
**$R^2$ = 0.833**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 27: insole,shank IMUs,thigh IMUs  
**$R^2$ = 0.716**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 28: insole,shank IMUs,pelvis IMU  
**$R^2$ = 0.832**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 29: insole,shank IMUs,trunk  
**$R^2$ = 0.847**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 30: insole,thigh IMUs,pelvis IMU  
**$R^2$ = 0.842**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 31: insole,thigh IMUs,trunk  
**$R^2$ = 0.854**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 32: insole,pelvis IMU,trunk  
**$R^2$ = 0.841**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 33: foot IMUs,shank IMUs,thigh IMUs  
**$R^2$ = 0.487**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 34: foot IMUs,shank IMUs,pelvis IMU  
**$R^2$ = 0.670**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 35: foot IMUs,shank IMUs,trunk  
**$R^2$ = 0.712**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 36: foot IMUs,thigh IMUs,pelvis IMU  
**$R^2$ = 0.700**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 37: foot IMUs,thigh IMUs,trunk  
**$R^2$ = 0.716**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 38: foot IMUs,pelvis IMU,trunk  
**$R^2$ = 0.701**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 39: shank IMUs,thigh IMUs,pelvis IMU  
**$R^2$ = 0.710**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 40: shank IMUs,thigh IMUs,trunk  
**$R^2$ = 0.723**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 41: shank IMUs,pelvis IMU,trunk  
**$R^2$ = 0.723**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 42: thigh IMUs,pelvis IMU,trunk  
**$R^2$ = 0.713**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 43: insole,foot IMUs  
**$R^2$ = 0.660**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 44: insole,shank IMUs  
**$R^2$ = 0.707**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 45: insole,thigh IMUs  
**$R^2$ = 0.714**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 46: insole,pelvis IMU  
**$R^2$ = 0.796**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 47: insole,trunk  
**$R^2$ = 0.831**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 48: foot IMUs,shank IMUs  
**$R^2$ = 0.492**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 49: foot IMUs,thigh IMUs  
**$R^2$ = 0.546**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 50: foot IMUs,pelvis IMU  
**$R^2$ = 0.662**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 51: foot IMUs,trunk  
**$R^2$ = 0.688**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 52: shank IMUs,thigh IMUs  
**$R^2$ = 0.426**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 53: shank IMUs,pelvis IMU  
**$R^2$ = 0.684**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 54: shank IMUs,trunk  
**$R^2$ = 0.721**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 55: thigh IMUs,pelvis IMU  
**$R^2$ = 0.689**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 56: thigh IMUs,trunk  
**$R^2$ = 0.714**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 57: pelvis IMU,trunk  
**$R^2$ = 0.626**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 58: insole  
**$R^2$ = 0.642**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 59: foot IMUs  
**$R^2$ = 0.450**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 60: shank IMUs  
**$R^2$ = 0.424**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 61: thigh IMUs  
**$R^2$ = 0.477**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 62: pelvis IMU  
**$R^2$ = 0.519**

---
**Target**: TF_Pelvis_Moment_Y_BWBH  
**Features**: Recipe 63: trunk  
**$R^2$ = 0.638**
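
One way to read these results is to rank the recipes by score and compare that with the number of sensor groups each one needs. The sketch below does this for a few scores copied from the output above (not the full set of 63); `n_sensor_groups` is a hypothetical helper that counts the comma-separated groups in a recipe name.

```python
# A few scores copied from the output above.
r2_scores = {
    "Recipe 1: insole,foot IMUs,shank IMUs,thigh IMUs,pelvis IMU,trunk": 0.872,
    "Recipe 6: insole,shank IMUs,thigh IMUs,pelvis IMU,trunk": 0.877,
    "Recipe 47: insole,trunk": 0.831,
    "Recipe 58: insole": 0.642,
    "Recipe 63: trunk": 0.638,
}


def n_sensor_groups(recipe_name):
    """Count the comma-separated sensor groups listed after 'Recipe N: '."""
    return len(recipe_name.split(": ", 1)[1].split(","))


# Sort from best to worst score.
ranked = sorted(r2_scores.items(), key=lambda item: item[1], reverse=True)
for name, score in ranked:
    print(f"{score:.3f}  groups={n_sensor_groups(name)}  {name}")
```

In this subset, the two-group recipe "insole,trunk" scores within about 0.05 of the best five-group recipe, while either group alone scores markedly lower.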

In [12]:
os.makedirs("results", exist_ok=True)
with pd.ExcelWriter(f"results/CPAC10S_N7_09_15_20 - {target_name}.xlsx") as writer:
    df_results = pd.DataFrame({"R2 Score": r2_scores})
    df_results.to_excel(writer, sheet_name="R2 Score")
    
    for feature_set_name, importance in importances.items():
        df_results = pd.DataFrame(
            {
                # Build a list (not a lazy map object) so the column has a definite length.
                "Short name": [predictor_short_name(name) for name in importance.index],
                "Importance": importance,
            }
        )
        df_results.to_excel(writer, sheet_name=feature_set_name.split(":")[0])
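
The sheet names are derived from the text before the first colon of each recipe name. Excel limits sheet names to 31 characters and does not allow a colon in them, so a slightly more defensive variant could also truncate the result. The sketch below shows this; `sheet_name_for` is a hypothetical helper and the truncation step is an addition that the cell above does not perform.

```python
def sheet_name_for(feature_set_name, max_length=31):
    """Use the text before the first colon, truncated to Excel's sheet-name limit."""
    return feature_set_name.split(":")[0][:max_length]


print(sheet_name_for("Recipe 63: trunk"))
print(sheet_name_for("Recipe 2: insole,foot IMUs,shank IMUs,thigh IMUs,pelvis IMU"))
```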