# 05_Refined_Model_Evaluation
This notebook evaluates the accuracy of the refined model and shows how it improves on a model whose generated features are computed with a single setting.

### Prerequisites:
- `make refined_model`: if you have not already run the prerequisite targets below, this runs them first:
    - `make download`
    - `make features`
    - `make extended_features`
    
### Purpose:
The refined model provides some of the functionality of the extended model, which includes features calculated with multiple hyperparameter settings and epoch sizes, without including every setting the extended model uses. Calculating all of those settings for multiple EDF files is time-consuming and memory-intensive. The idea is to look at each sleep state separately (Active Waking, Quiet Waking, Drowsiness, SWS, and REM) and include the single best epoch and Welch setting for EEG and for ECG for each state. This keeps some of the benefit of computing the same features at several settings. For example, a power spectral density window (Welch size) of 16 seconds may be better for predicting Active Waking, while a Welch size of 1 second may be better for predicting Drowsiness. It also saves time, because the features are calculated only 5 times instead of at the ~40 settings used for the extended model.
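The per-state selection described above can be sketched roughly as follows. The `scores` table and every accuracy value in it are hypothetical placeholders made up for illustration, and `best_setting_per_state` is not a function from this repository; the project's actual selection procedure may differ.

```python
# Hypothetical per-state accuracies (%) for a few (epoch, welch) settings.
# These numbers are invented for illustration and are NOT project results.
scores = {
    (32, 1): {"Active Waking": 88.0, "Drowsiness": 45.0, "REM": 50.0},
    (128, 16): {"Active Waking": 91.0, "Drowsiness": 40.0, "REM": 55.0},
    (128, 4): {"Active Waking": 90.0, "Drowsiness": 42.0, "REM": 58.0},
}


def best_setting_per_state(scores):
    """Return {state: (epoch, welch)} with the highest accuracy for each state."""
    best = {}
    for setting, per_state in scores.items():
        for state, acc in per_state.items():
            # Keep the setting whose accuracy for this state is highest so far.
            if state not in best or acc > scores[best[state]][state]:
                best[state] = setting
    return best


best = best_setting_per_state(scores)
for state, (epoch, welch) in sorted(best.items()):
    print(f"{state}: EPOCH_{epoch}_WELCH_{welch}")
```

With these placeholder values, each state ends up with a different setting, which is the situation the refined model is meant to exploit.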

In [7]:
import pandas as pd
import pytz
import re

In [3]:
import sys
sys.path.insert(0, '..')
import src.models.build_model_LGBM as bmodel
import src.models.build_extended_model_LGBM as emodel

In [4]:
%load_ext autoreload
%autoreload 2

## Load features dataframes

In [22]:
# PST timezone
pst_timezone = pytz.timezone('America/Los_Angeles')

# Load features
basic_features_df = pd.read_csv('../data/processed/features/test12_Wednesday_07_features_with_labels.csv',
                                index_col=0)
refined_features_df = pd.read_csv('../data/processed/features/test12_Wednesday_08_refined_features_with_labels_v3.csv',
                                  index_col=0)


# Set index as DatetimeIndex
basic_features_df.index = pd.DatetimeIndex(basic_features_df.index, tz=pst_timezone)
refined_features_df.index = pd.DatetimeIndex(refined_features_df.index, tz=pst_timezone)

In [23]:
refined_features_df.columns

Index(['Heart Rate', 'Pressure Mean', 'Pressure Std.Dev', 'ODBA Mean',
       'ODBA Std.Dev', 'GyrZ Mean', 'GyrZ Std.Dev', 'Simple.Sleep.Code',
       'EPOCH_32_WELCH_1_EEG_std', 'EPOCH_32_WELCH_1_EEG_iqr',
       ...
       'EPOCH_512_WELCH_512_HR_vlf', 'EPOCH_512_WELCH_512_HR_lf',
       'EPOCH_512_WELCH_512_HR_hf', 'EPOCH_512_WELCH_512_HR_lf/hf',
       'EPOCH_512_WELCH_512_HR_p_total', 'EPOCH_512_WELCH_512_HR_vlf_perc',
       'EPOCH_512_WELCH_512_HR_lf_perc', 'EPOCH_512_WELCH_512_HR_hf_perc',
       'EPOCH_512_WELCH_512_HR_lf_nu', 'EPOCH_512_WELCH_512_HR_hf_nu'],
      dtype='object', length=611)

In [25]:
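# Collect the distinct EPOCH/WELCH/signal prefixes found in the refined feature columns
setting_pattern = re.compile(r'EPOCH_[0-9]+_WELCH_[0-9]+_(?:EEG|HR)')
settings = []
for col in refined_features_df.columns:
    match = setting_pattern.search(str(col))
    if match:  # skip columns without a setting prefix, e.g. 'Heart Rate'
        settings.append(match.group(0))
best_settings = list(set(settings))  # set() removes duplicates; order is arbitrary
print(best_settings)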

['EPOCH_512_WELCH_128_HR', 'EPOCH_512_WELCH_64_HR', 'EPOCH_128_WELCH_4_EEG', 'EPOCH_32_WELCH_1_EEG', 'EPOCH_256_WELCH_64_HR', 'EPOCH_512_WELCH_512_HR', 'EPOCH_128_WELCH_1_EEG', 'EPOCH_64_WELCH_4_EEG', 'EPOCH_128_WELCH_16_EEG']
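To make the printed list easier to read, the prefixes can be split into their epoch size, Welch size, and signal. This is a small sketch using only the standard library; the list is copied from the output above, and the names `pattern` and `by_signal` are introduced here for illustration.

```python
import re
from collections import defaultdict

# Setting prefixes copied from the printed output above.
best_settings = [
    'EPOCH_512_WELCH_128_HR', 'EPOCH_512_WELCH_64_HR', 'EPOCH_128_WELCH_4_EEG',
    'EPOCH_32_WELCH_1_EEG', 'EPOCH_256_WELCH_64_HR', 'EPOCH_512_WELCH_512_HR',
    'EPOCH_128_WELCH_1_EEG', 'EPOCH_64_WELCH_4_EEG', 'EPOCH_128_WELCH_16_EEG',
]

pattern = re.compile(r"EPOCH_(\d+)_WELCH_(\d+)_(EEG|HR)")

# Group (epoch, welch) pairs by signal type.
by_signal = defaultdict(list)
for name in best_settings:
    epoch, welch, signal = pattern.fullmatch(name).groups()
    by_signal[signal].append((int(epoch), int(welch)))

for signal in sorted(by_signal):
    print(signal, sorted(by_signal[signal]))
```

Grouped this way, the list contains five distinct EEG settings and four distinct HR settings.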


## Basic model

In [26]:
accs, class_accs, conf_matrs, conf_matr = bmodel.evaluate_model(basic_features_df, 'Simple.Sleep.Code',
                                                                verbosity=0)

Fold 1/5
Fold 2/5
Fold 3/5
Fold 4/5
Fold 5/5
Overall accuracy: 76.38%

Mean class accuracies across folds:
Active Waking    90.36
Drowsiness       40.46
Quiet Waking     51.28
REM              57.99
SWS              81.99
Unscorable        0.00
dtype: float64

Overall confusion matrix:
                      Predicted_Active_Waking    Predicted_Quiet_Waking    Predicted_Drowsiness    Predicted_SWS    Predicted_REM    Predicted_Unscorable
------------------  -------------------------  ------------------------  ----------------------  ---------------  ---------------  ----------------------
True_Active_Waking                     122389                      7135                    1415             3375              117                       0
True_Quiet_Waking                       12455                     21347                    3184             2454             2766                       0
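True_Drowsiness                          3020                      5076                   12067             1805              364                       0
True_SWS                                 6378                      3716                    1063            46377              391                       0
True_REM                                 1416                      5476                     267             1287            22898                       0
True_Unscorable                          6156                       183                      26               82                7                       0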

In [27]:
conf_matr

| | Predicted_Active_Waking | Predicted_Quiet_Waking | Predicted_Drowsiness | Predicted_SWS | Predicted_REM | Predicted_Unscorable |
|---|---|---|---|---|---|---|
| True_Active_Waking | 122389 | 7135 | 1415 | 3375 | 117 | 0 |
| True_Quiet_Waking | 12455 | 21347 | 3184 | 2454 | 2766 | 0 |
| True_Drowsiness | 3020 | 5076 | 12067 | 1805 | 364 | 0 |
| True_SWS | 6378 | 3716 | 1063 | 46377 | 391 | 0 |
| True_REM | 1416 | 5476 | 267 | 1287 | 22898 | 0 |
| True_Unscorable | 6156 | 183 | 26 | 82 | 7 | 0 |
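As a sanity check, the overall accuracy can be recomputed from the confusion matrix above as the diagonal sum divided by the total count. The sketch below copies the matrix values by hand and uses plain Python lists; the variable names are introduced here for illustration. The per-class values it prints are pooled over all folds, so they need not match the "Mean class accuracies across folds" reported above, which are averaged fold by fold.

```python
labels = ["Active Waking", "Quiet Waking", "Drowsiness", "SWS", "REM", "Unscorable"]

# Rows are true classes, columns are predicted classes (copied from conf_matr above).
conf = [
    [122389, 7135, 1415, 3375, 117, 0],
    [12455, 21347, 3184, 2454, 2766, 0],
    [3020, 5076, 12067, 1805, 364, 0],
    [6378, 3716, 1063, 46377, 391, 0],
    [1416, 5476, 267, 1287, 22898, 0],
    [6156, 183, 26, 82, 7, 0],
]

# Overall accuracy: correctly classified samples (diagonal) over all samples.
total = sum(sum(row) for row in conf)
correct = sum(conf[i][i] for i in range(len(conf)))
overall_accuracy = 100 * correct / total
print(f"Overall accuracy: {overall_accuracy:.2f}%")  # 76.38%, matching the report

# Pooled per-class recall: diagonal entry over the row total for each true class.
pooled_recall = {
    label: 100 * conf[i][i] / sum(conf[i]) for i, label in enumerate(labels)
}
for label, value in pooled_recall.items():
    print(f"{label:<14} {value:6.2f}")
```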


## Refined model

In [28]:
accs_ref, class_accs_ref, conf_matrs_ref, conf_matr_ref = bmodel.evaluate_model(refined_features_df,
                                                                                'Simple.Sleep.Code',
                                                                                verbosity=0)

Fold 1/5
Fold 2/5
Fold 3/5
Fold 4/5
Fold 5/5
Overall accuracy: 80.09%

Mean class accuracies across folds:
Active Waking    92.94
Drowsiness       47.76
Quiet Waking     56.58
REM              62.67
SWS              84.07
Unscorable        0.00
dtype: float64

Overall confusion matrix:
                      Predicted_Active_Waking    Predicted_Quiet_Waking    Predicted_Drowsiness    Predicted_SWS    Predicted_REM    Predicted_Unscorable
------------------  -------------------------  ------------------------  ----------------------  ---------------  ---------------  ----------------------
True_Active_Waking                     126043                      6173                     693             1302              220                       0
True_Quiet_Waking                       12037                     23350                    3424             1273             2122                       0
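True_Drowsiness                          2146                      4954                   13846             1317               69                       0
True_SWS                                 4595                      3216                    1188            48042              884                       0
True_REM                                 1162                      4001                      37             1415            24729                       0
True_Unscorable                          6315                       122                      17                0                0                       0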

In [29]:
conf_matr_ref

| | Predicted_Active_Waking | Predicted_Quiet_Waking | Predicted_Drowsiness | Predicted_SWS | Predicted_REM | Predicted_Unscorable |
|---|---|---|---|---|---|---|
| True_Active_Waking | 126043 | 6173 | 693 | 1302 | 220 | 0 |
| True_Quiet_Waking | 12037 | 23350 | 3424 | 1273 | 2122 | 0 |
| True_Drowsiness | 2146 | 4954 | 13846 | 1317 | 69 | 0 |
| True_SWS | 4595 | 3216 | 1188 | 48042 | 884 | 0 |
| True_REM | 1162 | 4001 | 37 | 1415 | 24729 | 0 |
| True_Unscorable | 6315 | 122 | 17 | 0 | 0 | 0 |
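To compare the two models side by side, the fold-mean class accuracies reported in the two evaluation outputs can be differenced. This is a minimal sketch with the reported percentages typed in by hand; the dictionaries `basic`, `refined`, and `gains` are names introduced here, not objects returned by `evaluate_model`.

```python
# Mean class accuracies across folds (%), copied from the two outputs above.
basic = {
    "Active Waking": 90.36, "Drowsiness": 40.46, "Quiet Waking": 51.28,
    "REM": 57.99, "SWS": 81.99, "Unscorable": 0.00,
}
refined = {
    "Active Waking": 92.94, "Drowsiness": 47.76, "Quiet Waking": 56.58,
    "REM": 62.67, "SWS": 84.07, "Unscorable": 0.00,
}

# Gain in percentage points for each sleep state (refined minus basic).
gains = {state: round(refined[state] - basic[state], 2) for state in basic}

# Print states from largest to smallest gain.
for state, gain in sorted(gains.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{state:<14} {gain:+.2f}")

# Overall accuracy gain, from the reported 76.38% and 80.09%.
overall_gain = round(80.09 - 76.38, 2)
print(f"{'Overall':<14} {overall_gain:+.2f}")
```

On these reported figures, the largest gains are for Drowsiness and Quiet Waking, and Unscorable stays at 0 in both models.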
