# 05_Refined_Model_Evaluation
This notebook reports the accuracy of the refined model and shows how it improves on a model that uses a single setting for the generated features.

### Prerequisites:
- `make refined_model`: if you haven't already run its prerequisites, this also runs
    - `make download`
    - `make features`
    - `make extended_features`
    
### Purpose:
The refined model provides some of the functionality of the extended model, which includes features calculated with multiple hyperparameter settings and epoch sizes, without including every setting the extended model uses, because calculating all of them for multiple EDF files is time-consuming and memory-intensive. The idea is to consider each sleep state separately (Active Waking, Quiet Waking, Drowsiness, SWS, and REM) and include only the single best epoch and Welch setting for EEG and ECG for each state. This keeps some of the benefit of computing the same features at several settings: for example, a power spectral density window (a.k.a. Welch size) of 16 seconds may be better for predicting Active Waking, while a Welch size of 1 second may be better for predicting Drowsiness. It also saves time, because the features are calculated only 5 times instead of with the ~40 different settings used for the extended model.
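The selection idea described above can be sketched in a few lines. This is a minimal illustration and not the project's actual selection code: the helper `best_setting_per_class` is hypothetical, and the accuracy values are placeholders that only mirror the "maybe" example in the paragraph above, not measured results.

```python
# Hypothetical per-class accuracies for two feature settings.
# The numbers are placeholders chosen only to illustrate the selection step.
per_class_accuracy = {
    "EPOCH_128_WELCH_16_EEG": {"Active Waking": 0.90, "Drowsiness": 0.40},
    "EPOCH_32_WELCH_1_EEG": {"Active Waking": 0.85, "Drowsiness": 0.50},
}


def best_setting_per_class(scores):
    """Return {class_name: setting_name}, keeping the top-scoring setting per class."""
    classes = {c for class_scores in scores.values() for c in class_scores}
    return {
        c: max(scores, key=lambda s: scores[s].get(c, float("-inf")))
        for c in sorted(classes)
    }


refined_settings = best_setting_per_class(per_class_accuracy)
print(refined_settings)
```

With these placeholder scores, each sleep state ends up paired with a different setting, which is the situation the refined model is meant to exploit.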

In [1]:
import pandas as pd
import pytz

In [6]:
import sys
sys.path.insert(0, '..')
import src.models.build_model_LGBM as bmodel
import src.models.build_extended_model_LGBM as emodel

In [3]:
%load_ext autoreload
%autoreload 2

## Load features dataframes

In [11]:
# US Pacific timezone (covers both PST and PDT)
pst_timezone = pytz.timezone('America/Los_Angeles')

# Load features
basic_features_df = pd.read_csv('../data/processed/test12_Wednesday_07_features_with_labels.csv',
                                index_col=0)
refined_features_df = pd.read_csv('../data/processed/test12_Wednesday_08_refined_features_with_labels_v3.csv',
                                  index_col=0)


# Set index as DatetimeIndex
basic_features_df.index = pd.DatetimeIndex(basic_features_df.index, tz=pst_timezone)
refined_features_df.index = pd.DatetimeIndex(refined_features_df.index, tz=pst_timezone)

FileNotFoundError: [Errno 2] No such file or directory: '../data/processed/test12_Wednesday_07_features_with_labels.csv'

In [5]:
custom_settings = [
    'EPOCH_128_WELCH_16_EEG',
    'EPOCH_32_WELCH_1_EEG',
    'EPOCH_128_WELCH_1_EEG',
    'EPOCH_128_WELCH_4_EEG',
    'EPOCH_64_WELCH_4_EEG',
    'EPOCH_512_WELCH_64_HR',
    'EPOCH_512_WELCH_512_HR',
    'EPOCH_512_WELCH_128_HR',
    'EPOCH_256_WELCH_64_HR'
]
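The setting names above appear to follow an `EPOCH_<n>_WELCH_<n>_<signal>` pattern. A small sketch for splitting them into their parts is shown below; the helper `parse_setting` and the regular expression are assumptions based only on the names in this list, not part of `src.models`.

```python
import re
from collections import Counter

custom_settings = [
    "EPOCH_128_WELCH_16_EEG",
    "EPOCH_32_WELCH_1_EEG",
    "EPOCH_128_WELCH_1_EEG",
    "EPOCH_128_WELCH_4_EEG",
    "EPOCH_64_WELCH_4_EEG",
    "EPOCH_512_WELCH_64_HR",
    "EPOCH_512_WELCH_512_HR",
    "EPOCH_512_WELCH_128_HR",
    "EPOCH_256_WELCH_64_HR",
]

# Assumed naming scheme: EPOCH_<epoch size>_WELCH_<Welch window size>_<signal>
SETTING_PATTERN = re.compile(r"^EPOCH_(\d+)_WELCH_(\d+)_(EEG|HR)$")


def parse_setting(name):
    """Split a setting name into epoch size, Welch size, and signal type."""
    match = SETTING_PATTERN.match(name)
    if match is None:
        raise ValueError(f"Unrecognized setting name: {name!r}")
    epoch, welch, signal = match.groups()
    return {"epoch": int(epoch), "welch": int(welch), "signal": signal}


parsed_settings = [parse_setting(name) for name in custom_settings]
signal_counts = Counter(p["signal"] for p in parsed_settings)
print(parsed_settings[0])
print(dict(signal_counts))
```

Parsing the names this way makes it easy to check, for instance, how many of the selected settings are EEG-based and how many are heart-rate-based.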

## Basic model

In [12]:
accs, class_accs, conf_matrs, conf_matr = bmodel.evaluate_model(basic_features_df, 'Simple.Sleep.Code',
                                                                verbosity=0)

Fold 1/5
Fold 2/5
Fold 3/5
Fold 4/5
Fold 5/5
Overall accuracy: 77.52%

Mean class accuracies across folds:
Active Waking    91.45
Drowsiness       46.92
Quiet Waking     48.47
REM              62.07
SWS              82.08
Unscorable        0.00
dtype: float64

Overall confusion matrix:
                      Predicted_Active_Waking    Predicted_Quiet_Waking    Predicted_Drowsiness    Predicted_SWS    Predicted_REM    Predicted_Unscorable
------------------  -------------------------  ------------------------  ----------------------  ---------------  ---------------  ----------------------
True_Active_Waking                     124040                      7078                    1129             1824              360                       0
True_Quiet_Waking                       13683                     20280                    3225             1672             3346                       0
True_Drowsiness                          2219                      4290                   13559             1868              396                       0
True_SWS                                 5209                      3145                    1386            46638             1547                       0
True_REM                                 1341                      4122                     134             1829            23918                       0
True_Unscorable                          6264                       160                       0               16               14                       0

In [13]:
conf_matr

                      Predicted_Active_Waking    Predicted_Quiet_Waking    Predicted_Drowsiness    Predicted_SWS    Predicted_REM    Predicted_Unscorable
------------------  -------------------------  ------------------------  ----------------------  ---------------  ---------------  ----------------------
True_Active_Waking                     124040                      7078                    1129             1824              360                       0
True_Quiet_Waking                       13683                     20280                    3225             1672             3346                       0
True_Drowsiness                          2219                      4290                   13559             1868              396                       0
True_SWS                                 5209                      3145                    1386            46638             1547                       0
True_REM                                 1341                      4122                     134             1829            23918                       0
True_Unscorable                          6264                       160                       0               16               14                       0
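As a sanity check, the overall accuracy printed for the basic model can be recomputed from this confusion matrix: correct predictions lie on the diagonal, and the total is the sum of all cells. The sketch below uses plain Python and copies the counts from the matrix above; the per-class figure it computes is the pooled recall (diagonal divided by row sum), which is an assumption about what "class accuracy" means here and need not match the fold-averaged values printed earlier.

```python
labels = ["Active Waking", "Quiet Waking", "Drowsiness", "SWS", "REM", "Unscorable"]

# Rows are true classes, columns are predicted classes (same order as labels).
matrix = [
    [124040, 7078, 1129, 1824, 360, 0],
    [13683, 20280, 3225, 1672, 3346, 0],
    [2219, 4290, 13559, 1868, 396, 0],
    [5209, 3145, 1386, 46638, 1547, 0],
    [1341, 4122, 134, 1829, 23918, 0],
    [6264, 160, 0, 16, 14, 0],
]

# Overall accuracy: diagonal (correct predictions) over all predictions.
correct = sum(matrix[i][i] for i in range(len(labels)))
total = sum(sum(row) for row in matrix)
overall_accuracy = correct / total

# Pooled per-class recall: correct predictions for a class over its true count.
pooled_recall = {
    label: (row[i] / sum(row) if sum(row) else float("nan"))
    for i, (label, row) in enumerate(zip(labels, matrix))
}

print(f"Overall accuracy: {overall_accuracy:.2%}")  # Overall accuracy: 77.52%
for label, value in pooled_recall.items():
    print(f"{label:<14} {value:.2%}")
```

The recomputed overall accuracy agrees with the 77.52% reported by `evaluate_model`, and the empty `Predicted_Unscorable` column shows that the basic model never predicts that class.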


## Refined model

In [None]:
accs_ref, class_accs_ref, conf_matrs_ref, conf_matr_ref = bmodel.evaluate_model(refined_features_df,
                                                                                'Simple.Sleep.Code',
                                                                                verbosity=0)