**INSTRUCTIONS**:

This template maintains a consistent structure for FACET tutorials integrated into the FACET Sphinx documentation. Please observe the following:
1. In the cells below, text in italics provides instructions or guidance on what is needed and can be deleted before finalizing the notebook.
2. Please retain the cell below which incorporates the FACET logo into the Sphinx documentation.
3. When referring to FACET please use all capitals.
4. Please ensure that coding style follows PEP 8 and that keyword arguments are used for clarity.
5. Finally, this cell can be deleted once the tutorial has been drafted.
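
As an illustration of point 4, the sketch below contrasts a positional call with a keyword-argument call. The function `split_sample` and its parameters are hypothetical and exist only for this example; they are not part of FACET.

```python
import random


def split_sample(observations, test_fraction, seed):
    """Hypothetical helper: shuffle observations and split them into train and test lists."""
    shuffled = list(observations)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Harder to read: the meaning of 0.2 and 42 is not obvious at the call site
train, test = split_sample(range(10), 0.2, 42)

# Preferred: keyword arguments document the call
train, test = split_sample(observations=range(10), test_fraction=0.2, seed=42)
```

Both calls behave identically; the second simply makes the role of each argument visible to the reader.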

# *Name of tutorial* with FACET

FACET is composed of the following key components:

- **Model Inspection**

    FACET introduces a new algorithm to quantify dependencies and interactions between features in ML models. This new tool for human-explainable AI adds a new, global perspective to the observation-level explanations provided by the popular [SHAP](https://shap.readthedocs.io/en/latest/) approach. To learn more about FACET's model inspection capabilities, see the getting started example below.


- **Model Simulation**

    FACET's model simulation algorithms use ML models for *virtual experiments* to help identify scenarios that optimise predicted outcomes. To quantify the uncertainty in simulations, FACET utilises a range of bootstrapping algorithms including stationary and stratified bootstraps. For an example of FACET’s bootstrap simulations, see the getting started example below.
    
    
- **Enhanced Machine Learning Workflow**  

    FACET offers an efficient and transparent machine learning workflow, enhancing [scikit-learn](https://scikit-learn.org/stable/index.html)'s tried and tested pipelining paradigm with new capabilities for model selection, inspection, and simulation. FACET also introduces [sklearndf](https://github.com/BCG-Gamma/sklearndf), an augmented version of *scikit-learn* with enhanced support for *pandas* dataframes that ensures end-to-end traceability of features.

***

**Context**

*Please provide relevant context for the tutorial including:*

1. *Question(s) to answer*  
2. *Data source(s) used*

*This section should be kept brief and to the point. Further background information on the data, EDA, etc. should be placed in the Appendix at the end of the notebook.*

***

**Tutorial outline**

*The tutorial should have the following structure, and should always link to the same heading in the notebook:*

1. [Required imports](#Required-imports)
2. [*Your sections*](#Your-sections)  
... *more of your sections*

3. [Summary](#Summary)
4. [What can you do next?](#What-can-you-do-next?)
5. [Appendix](#Appendix)

# Required imports

To run this notebook, we will import not only the FACET package but also other packages useful for solving this task. Overall, we can break down the imports into three categories:

1. Common packages (pandas, matplotlib, etc.)
2. Required FACET classes (inspection, selection, validation, simulation, etc.)
3. Other BCG Gamma packages which simplify pipelining (sklearndf, see on [GitHub](https://github.com/BCG-Gamma/sklearndf)) and support visualization (pytools, see on [GitHub](https://github.com/BCG-Gamma/pytools)) when using FACET

**Common package imports**

In [3]:
# list your usual imports here such as pandas, numpy and others 
# not covered by FACET, sklearndf or pytools

**FACET imports**

In [1]:
# list your FACET imports here

**sklearndf imports**

Instead of using the "regular" scikit-learn package, we are going to use sklearndf (see on [GitHub](https://github.com/BCG-Gamma/sklearndf)). sklearndf is an open-source library designed to address a common issue with scikit-learn: the outputs of transformers are numpy arrays, even when the input is a data frame. However, to inspect a model it is essential to keep track of the feature names. sklearndf retains all the functionality available through scikit-learn and adds the feature traceability and usability associated with pandas data frames. Additionally, the names of all your favourite scikit-learn classes are the same except for `DF` appended at the end. For example, the standard scikit-learn import:

`from sklearn.pipeline import Pipeline`

becomes:

`from sklearndf.pipeline import PipelineDF`
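
The issue described above can be reproduced with plain scikit-learn. The following is a minimal sketch, assuming scikit-learn's default output configuration (no `set_output` call); the data and column names are made up for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Illustrative data; the column names are invented for this example
features = pd.DataFrame(
    {"age": [25.0, 40.0, 55.0], "income": [30.0, 60.0, 90.0]}
)

# With default settings, the transformer returns a numpy array,
# so the column names of the input data frame are lost
scaled = StandardScaler().fit_transform(features)

print(type(scaled).__name__)
print(hasattr(scaled, "columns"))
```

The result no longer carries the feature names `age` and `income`, which is the traceability gap that sklearndf is designed to close.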

In [4]:
# list your sklearndf imports here

**pytools imports**

pytools (see on [GitHub](https://github.com/BCG-Gamma/pytools)) is an open-source library containing general machine learning and visualization utilities, some of which are useful for visualising the advanced model inspection capabilities of FACET.

In [5]:
# list your pytools imports here

# *Your sections*

1. *Your text providing an overview of this section.*
2. *Use as many sections as you need to provide a good high-level structure to the tutorial, such as preprocessing, classifier development or model inspection*

## *Your subsections*

1. *Your text describing the set-up for this section.*
2. *Use subsections as needed to ensure a section has a clear structure. For example, under a section for model inspection you might include subsections for feature importance, SHAP feature interactions, synergy or redundancy.*

# Summary

*This section should list the key conclusions/takeaways from the tutorial.*

# What can you do next?

*If applicable, this section should note what readers of the tutorial can think about or do next to extend what was done or to further develop the skills taught.*

# Appendix

*The Appendix should detail all information relevant to the tutorial that is not included in the main outline above. It should include, at a minimum, the two sections below.*

## Data source and study cohort

*This section should include all information relevant to obtaining the data used in the tutorial and reproducing the starting point for the tutorial. This could include things like:* 

1. *Detailed listings of data sources, including links and how to access*  
2. *How the study population is defined, whether it be all transactions over a certain value or all patients above a certain age*  
3. *If any features or the target require definitions in order to be derived from the data, such as the combination of levels in biomarkers*
4. *Any other information pertinent to the reproducibility of the tutorial*

## Exploratory Data Analysis (EDA)

*This section should perform two functions:*  

1. *Walk the reader through the EDA behind the tutorial, and*  
2. *Provide a quick summary of EDA conclusions relevant to the tutorial, which could include items such as identifying missingness, the need for transformations or outlier mitigation, basic associations between features and the target, etc.*