In this notebook, your task is to examine the inner workings of the sleep stage classification model you have been creating for the [Haaglanden Medisch Centrum Sleep Staging Database](https://physionet.org/content/hmc-sleep-staging/1.1/).


# Instructions

There are two ways you can complete this assignment:
1. **Modify HW3-3:** Make a copy of your HW3-3 submission and rename it so that it is clear you are submitting a new file. Copy the cells below to the end of that file and complete them once you have modified your original code.
2. **Use this file:** Copy all of the necessary cells from HW3-3 into the beginning of this file.

# Exercise 1: Overall Feature Importance

This exercise assumes that you are using a model that is able to report some form of feature importance (e.g., `.coef_`, `.feature_importances_`). If not, please switch to one for the sake of this exercise.

**(Part 1)** Since you are training one model per fold, it is likely that different features are going to be important for different folds. With that in mind, modify your training loop so that you aggregate the feature importance scores across all folds of the cross-validation procedure into a single `DataFrame`, with one row per fold and one column per feature. The `DataFrame` should look something like the following:

| Fold # | Feature 1 | Feature 2 | ... |
| ------- | ------- | ------- | ------- |
| 1 | 0.01 | 0.04 | ... |
| 2 | 0.01 | 0.05 | ... |
| 3 | 0.04 | 0.02 | ... |
| ... | ... | ... | ... |

*Hint:* If you are using some form of feature selection, you may want to investigate the `.get_support()` method for identifying the indices of the features that were selected in a given fold. This also means that your `DataFrame` may have missing entries:

| Fold # | Feature 1 | Feature 2 | ... |
| ------- | ------- | ------- | ------- |
| 1 | 0.01 | 0.04 | ... |
| 2 | 0.01 | NA | ... |
| 3 | NA | 0.02 | ... |
| ... | ... | ... | ... |

In these cases, you should fill in the missing values with `0` to indicate no importance.
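
The aggregation described above can be sketched roughly as follows. This is only an illustration under several assumptions: the data is synthetic (`make_classification`), the feature names are placeholders, and `SelectKBest`, `RandomForestClassifier`, and `StratifiedKFold` stand in for whatever selector, model, and splitter your own HW3-3 pipeline uses.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import StratifiedKFold

# Synthetic stand-in for your feature table (names are hypothetical).
X_arr, y = make_classification(n_samples=300, n_features=8, n_informative=4,
                               n_classes=3, random_state=0)
feature_names = [f"feature_{i + 1}" for i in range(X_arr.shape[1])]
X = pd.DataFrame(X_arr, columns=feature_names)

rows = []
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for fold, (train_idx, test_idx) in enumerate(cv.split(X, y), start=1):
    # Feature selection is fit on the training portion of the fold only.
    selector = SelectKBest(f_classif, k=5).fit(X.iloc[train_idx], y[train_idx])
    selected = X.columns[selector.get_support()]  # boolean mask -> column names

    model = RandomForestClassifier(n_estimators=50, random_state=0)
    model.fit(X.iloc[train_idx][selected], y[train_idx])

    # One row per fold, labeled only with the features that fold actually used.
    rows.append(pd.Series(model.feature_importances_, index=selected, name=fold))

# Features a fold did not select come out as NaN, then become 0 importance.
importance_df = pd.DataFrame(rows).reindex(columns=feature_names)
importance_df.index.name = "fold"
importance_df = importance_df.fillna(0.0)
print(importance_df.round(3))
```

Labeling each fold's scores by feature name before combining them is what keeps the columns aligned when different folds select different features.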

In [None]:
# Write code here

**(Part 2)** Plot the overall importance of the features in your modeling pipeline as a box-and-whiskers plot. Each box should correspond to a feature, and the distribution should represent the distribution of that feature's importance scores across all folds.
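
One possible way to draw such a plot is sketched below. The `importance_df` here is filled with made-up numbers purely so the example runs on its own; in your notebook it would be the fold-by-feature table from Part 1. Sorting the features by median importance is an optional choice, not a requirement of the assignment.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical fold-by-feature importance table (values are made up).
rng = np.random.default_rng(0)
importance_df = pd.DataFrame(
    rng.uniform(0.0, 0.3, size=(5, 4)),
    index=pd.RangeIndex(1, 6, name="fold"),
    columns=["feature_1", "feature_2", "feature_3", "feature_4"],
)

# Order features by median importance so the strongest ones appear first.
order = importance_df.median().sort_values(ascending=False).index

fig, ax = plt.subplots(figsize=(8, 4))
# A 2D array is drawn as one box per column, i.e., one box per feature.
box = ax.boxplot(importance_df[order].to_numpy())
ax.set_xticks(range(1, len(order) + 1))
ax.set_xticklabels(order, rotation=45, ha="right")
ax.set_xlabel("Feature")
ax.set_ylabel("Importance across folds")
fig.tight_layout()
```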

In [None]:
# Write code here

**(Part 3)** What did you learn about your modeling pipeline from this exercise? Were there any features that were consistently important across folds?

Write your answer here: ???

# Exercise 2: SHAP Values

Looking at SHAP values with cross-validation and feature selection is possible, but gets a bit tricky since you need to account for different features being used in different folds.

Instead, we are going to pretend that you are ready to "deploy" your model for real-world usage. We will use SHAP to see how we can explain predictions that are made by this final model.

**(Part 1)** Train a single model using all of the data as training data. In other words, remove the cross-validation loop and instead train on 100% of your data.

In [None]:
# Write code here

**(Part 2)** Use this model to generate predictions for all of the data in your dataset. Note that the accuracy of these predictions will not reflect the model's accuracy on new data since you will be testing on the training data. This is strictly for illustrative purposes.
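
As a rough sketch of Parts 1 and 2 together, the snippet below fits one model on all of the data and then predicts on that same data. It assumes synthetic features, a `RandomForestClassifier` as a placeholder model, and string stage labels `N1`, `N2`, `N3`, `R`, and `W`; your own features, labels, and model will differ.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data; the stage labels are an assumption for illustration.
X, y_int = make_classification(n_samples=500, n_features=8, n_informative=4,
                               n_classes=5, random_state=0)
stage_labels = np.array(["N1", "N2", "N3", "R", "W"])
y = stage_labels[y_int]

# Part 1: no cross-validation loop, fit once on 100% of the data.
final_model = RandomForestClassifier(n_estimators=100, random_state=0)
final_model.fit(X, y)

# Part 2: predict on the same data the model was trained on.
train_pred = final_model.predict(X)
train_acc = (train_pred == y).mean()
print(f"Accuracy on the training data: {train_acc:.3f}")
```

The accuracy printed here is computed on data the model has already seen, so it is expected to be optimistic and should not be reported as a performance estimate.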

In [None]:
# Write code here

**(Part 3)** Using `shap.plots.waterfall()`, generate a plot showing the influence of each feature on the model's prediction for the first sample in your dataset.

*Hint:* Since this is a multi-class classifier, you will need to identify the SHAP values associated with the class that was predicted for the first sample. Most libraries sort categorical string labels in alphabetical order, but you can confirm by looking at the `.classes_` attribute of a model.
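
The class lookup mentioned in the hint might look like the sketch below. The tiny dataset and the `DecisionTreeClassifier` are placeholders chosen only so the example is self-contained; the point is how `.classes_` maps a predicted label to a column index.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Toy data with string stage labels given in a deliberately unsorted order.
labels = np.array(["W", "N2", "R", "N1", "N3", "W", "R", "N2", "N1", "N3"])
X = np.arange(20, dtype=float).reshape(10, 2)
model = DecisionTreeClassifier(random_state=0).fit(X, labels)

# scikit-learn stores the class labels in sorted order.
print(model.classes_)  # ['N1' 'N2' 'N3' 'R' 'W']

# Index of the class that was predicted for the first sample.
first_pred = model.predict(X[:1])[0]
class_idx = int(np.where(model.classes_ == first_pred)[0][0])

# Index of the REM class, which is useful when looking at class `R` only.
r_idx = list(model.classes_).index("R")
print(first_pred, class_idx, r_idx)
```

How you use `class_idx` depends on your SHAP version and explainer: if the SHAP values come back as a single array of shape `(samples, features, classes)`, the slice for the first sample would be `[0, :, class_idx]`, whereas some versions return one array per class, in which case you would index the list first.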

In [None]:
# Write code here

**(Part 4)** What was the most important feature for this prediction on this sample?

Write your answer here: ???

**(Part 5)** Using `shap.summary_plot()`, generate a plot showing the influence of each feature on the model's prediction for the entire dataset with respect to REM sleep (class `R`).

In [None]:
# Write code here

**(Part 6)** Which feature(s) were the most positively correlated with positive REM predictions in your model (i.e., higher value led to an `R` prediction)?

Write your answer here: ???

**(Part 7)** Which feature(s) were the most negatively correlated with positive REM predictions in your model (i.e., higher value led to a non-`R` prediction)?

Write your answer here: ???

# Prepare Submission

To get full credit for this assignment, you should submit your assignment in two formats so that we can easily grade and debug your code:
1. **.ipynb:** First, confirm that your code can run from start to finish without any errors. To check this, go to "Runtime" > "Run all" in the Google Colab menu. If everything looks good, you can export your file by going to "File" > "Download" > "Download .ipynb".
2. **.pdf:** Run the function called `colab2pdf()` below. This will automatically convert your notebook to a PDF. Note that while "File" > "Print" > "Save as PDF" also works, it requires you to manually expand all of the cells and may cut off some images.

In [None]:
colab2pdf()
