# User Study on Interpretability - Tutorial

#### Thank you for participating in our study! 
 
The study is structured as follows:
1. Tutorial: Overview of the main concepts that will be used in the rest of the study.
2. Setup Description: Presentation of the dataset used, descriptive statistics, etc. 
3. Main Component: Using the interpretability tool to answer questions about an ML model.
4. A follow-up questionnaire and interview. 

Please keep this tutorial open and handy while you complete the study. **You're welcome (and encouraged!) to refer back to it at any point.**


## Odds, Log Odds, and Odds Ratios

<img align="center" width="900" height="900" src="./images_tutorial/Odds_Diagrams.jpg">

### Odds
Odds are an intuitive way to express chance, very commonly used in the context of betting markets.

Since this study focuses on machine learning, and on classification in particular, we will introduce odds and related concepts using an example from classification.

Suppose we are trying to predict whether a person has a certain disease ($Y=1$ if they're infected, $Y=0$ otherwise) based on some symptoms $X$. 

[//]: # "For the purposes of this study we define odds and related concepts in terms of classification. Supose we have a contiuous variable $X\in\mathbb{R}$ and a binary one $Y \in \{0,1\}$."

The odds of a person being sick are defined as:
$$ O(Y=1) = \frac{P(Y=1)}{P(Y=0)} $$ 

For example, if $P(Y=1)=0.8$, then $O(Y=1)=4$, and we would say that the odds of $Y=1$ are four to one, which can be written as 4:1. 
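
As a quick numeric check of the definition above, here is a minimal Python sketch that converts a probability into odds. The helper name `odds` is ours and is introduced only for illustration.

```python
def odds(p: float) -> float:
    """Return the odds p / (1 - p) of an event that has probability p."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    return p / (1.0 - p)


# P(Y=1) = 0.8 gives odds of 4, i.e. 4:1 (rounded to hide floating-point noise).
print(round(odds(0.8), 6))  # 4.0
```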

### Odds Ratio
Now, to express the association between symptoms $X$ and disease $Y$, we can use the *odds ratio*: 
$$ \text{OR}(Y=1 : X) = \frac{O(Y=1 \mid X)}{O(Y=1)}$$
which tells us how the odds of being sick conditioned on the symptoms compare to the odds without taking those symptoms into account.

It turns out (using Bayes' rule) that this is mathematically equivalent to:
$$ \text{OR}(Y=1 : X) = \frac{P(X \mid Y=1)}{P(X \mid Y=0)} $$

Therefore, we can interpret odds ratios in two complementary ways. For example, $\text{OR}(Y=1 : X_{\text{cough}}=1)=2$ can be interpreted as:
* **(I1):** The odds of being infected double when the person has a cough
* **(I2):** A person is twice as likely to have a cough when infected compared to when they are not
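
The equivalence of the two interpretations can be checked numerically. The sketch below uses made-up probabilities (all three input numbers are hypothetical, chosen only so that the odds ratio comes out to 2) and computes the odds ratio both ways.

```python
# Hypothetical numbers, chosen only for illustration.
p_y1 = 0.2                 # P(Y=1): prior probability of infection
p_cough_given_y1 = 0.6     # P(cough | Y=1)
p_cough_given_y0 = 0.3     # P(cough | Y=0)

# Posterior P(Y=1 | cough) via Bayes' rule.
p_cough = p_cough_given_y1 * p_y1 + p_cough_given_y0 * (1 - p_y1)
p_y1_given_cough = p_cough_given_y1 * p_y1 / p_cough

prior_odds = p_y1 / (1 - p_y1)
posterior_odds = p_y1_given_cough / (1 - p_y1_given_cough)

or_from_odds = posterior_odds / prior_odds                  # interpretation I1
or_from_likelihoods = p_cough_given_y1 / p_cough_given_y0   # interpretation I2

print(round(or_from_odds, 6), round(or_from_likelihoods, 6))  # 2.0 2.0
```

Note that the prior $P(Y=1)$ cancels out: changing `p_y1` changes both the prior and the posterior odds, but not their ratio.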


It is common to work with the logarithm of the odds ratio instead, which allows us to use addition (instead of multiplication) when combining the odds ratios of various events. For example, if $X_{\text{cough}}$ and $X_{\text{fever}}$ are conditionally independent given $Y$, then:

$$\log  \text{OR}(Y=1 : X_{\text{cough}} , X_{\text{fever}} )  = \log  \text{OR}(Y=1 : X_{\text{cough}} )  +  \log  \text{OR}(Y=1 : X_{\text{fever}} )$$

This lets us decompose how much the presence of all the symptoms contributes to the (log) odds of being infected into individual, additive contributions, one per symptom.
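
To see the additivity in action, here is a small sketch with made-up likelihoods. It assumes the two symptoms are conditionally independent given $Y$, so that their joint likelihood is the product of the individual ones; the numbers and variable names are hypothetical.

```python
import math

# Hypothetical likelihoods P(symptom present | Y), for illustration only.
p_cough = {1: 0.6, 0: 0.3}
p_fever = {1: 0.5, 0: 0.1}

log_or_cough = math.log(p_cough[1] / p_cough[0])
log_or_fever = math.log(p_fever[1] / p_fever[0])

# Assumption: conditional independence given Y, so joint likelihoods factorize.
joint = {y: p_cough[y] * p_fever[y] for y in (0, 1)}
log_or_joint = math.log(joint[1] / joint[0])

# The joint log odds ratio equals the sum of the individual ones.
print(math.isclose(log_or_joint, log_or_cough + log_or_fever))  # True
```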



### Checkpoint Questions

Suppose that for a given patient:
* $\log  \text{OR}(Y=1 : X_{\text{cough}}=0 ) = -1$
* $\log  \text{OR}(Y=1 : X_{\text{headache}}=1 ) = 0$
* $\log \text{OR}(Y=1 : X_{\text{fever}} =1)=2$

**Q1:** How would you interpret these facts?  

Answer:


[//]: #  "\log  \frac{O(Y=1 \mid X_1 , X_2 )}{O(Y=1)}  = \log  \frac{O(Y=1 \mid X_1 )}{O(Y=1)}  + \log  \frac{O(Y=1 \mid X_2 )}{O(Y=1)}"

**Q2:** Taking all of these into account, what is the odds ratio of this person being infected? And how would you interpret this? (You can use either of the two equivalent interpretations, I1 or I2.)

Answer:  

## Weight Of Evidence

<img align="center" width="500" height="500" src="./images_tutorial/woe_balance.png">

### Definition

The log of the odds ratio is sometimes referred to as ***the Weight of Evidence*** (WoE for short). In a nutshell, the Weight of Evidence is used to quantify variable importance, and it attempts to answer the question:
> "does the *evidence* speak in favor or against a certain *hypothesis*?"

In this study we will use WoE to 'explain' predictions of machine learning models, so the 'evidence' will be the input features (e.g., symptoms), the 'hypothesis' will be related to the model's prediction (e.g., Y='infected') and the question we seek to answer is:

> "**according to the model**, how much does the input speak in favor of a certain prediction?"

Since the WoE is nothing but the log odds ratio, recall that it can be expressed (and interpreted) in two different ways:
$$ \text{woe}(Y=1 : X_{\text{cough}}) =  \log \frac{O(Y=1 \mid X_{\text{cough}})}{O(Y=1)} =  \log \frac{P(X_{\text{cough}} \mid Y=1)}{P(X_{\text{cough}} \mid Y=0)}$$

In the language of the Weight of Evidence literature, we would say:
* $\text{woe}(Y=1 : X_{\text{cough}}=1) > 0 $  ⟹ the presence of cough ***speaks in favor*** of this patient being infected ($Y=1$)
* $\text{woe}(Y=1 : X_{\text{cough}}=1) < 0 $  ⟹ the presence of cough ***speaks against*** this patient being infected ($Y=1$)


### Binary vs. Multiclass

Since $Y$ is binary in our example so far, there are only two possible hypotheses: 
* the patient is infected (let's say this is the 'primary hypothesis') or
* the patient is not infected (the 'alternative hypothesis').  

Therefore, evidence *against* one of these is evidence *in favor* of the other.

But what if we had a multi-class classification problem? For example, suppose the model must instead predict one of $K$ possible diseases, and that for a given patient the model predicts $Y=$'flu', which we take as the primary hypothesis. The alternative hypothesis $h'$ could be:
* All the other possible diseases (e.g., $h': (Y \neq \text{flu})$)
* Another specific disease (e.g., $h': (Y=\text{cold})$)
* Any other subset of diseases, e.g. 'viral' or 'bacterial'

Each of these might shed light on different aspects of the prediction.
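
The sketch below illustrates how the choice of alternative hypothesis changes the score. It uses the likelihood-ratio form of the WoE and treats a set of classes as a single composite hypothesis whose likelihood is the prior-weighted average of its members' likelihoods. The class names, priors, and likelihoods are hypothetical, and the tool used in the study may compute these quantities differently.

```python
import math

# Hypothetical priors P(Y=k) and likelihoods P(e | Y=k) for one piece of evidence e.
prior = {"flu": 0.5, "cold": 0.3, "strep": 0.2}
likelihood = {"flu": 0.6, "cold": 0.2, "strep": 0.1}


def likelihood_given_set(classes):
    """P(e | Y in classes): prior-weighted average of the per-class likelihoods."""
    mass = sum(prior[k] for k in classes)
    return sum(likelihood[k] * prior[k] for k in classes) / mass


def woe(h, h_alt):
    """Weight of evidence of e for the class set h against the class set h_alt."""
    return math.log(likelihood_given_set(h) / likelihood_given_set(h_alt))


print(round(woe(["flu"], ["cold", "strep"]), 3))  # flu vs. all other diseases
print(round(woe(["flu"], ["cold"]), 3))           # flu vs. one specific disease
```

The same evidence yields a different score for each alternative, which is why it matters to state which $h'$ an explanation is relative to.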

[//]: <> "how much more likely is a given hypothesis h over an alternative h' given evidence e"
[//]: <> "we denote this by $\text{woe}(h/h' : e)$"
[//]: <> "If this quantity is positive, we say that the evidence $e$ speaks in favor of $h$ (and against $h'$). If it's negative, then the roles reverse: $e$ speaks against $h$ (and in favor of $h'$)."


This table provides a simple rule of thumb to decide on how "significant" the WoE is:

|  Weight of Evidence Score  | Odds in favor of hypothesis           | Strength of Evidence|
| ------------- |:-------------:| -----:|
| $ < 1.15$      | less than 3:1 | Not worth mentioning |
| $1.15$ to $2.3$  | between 3:1 and 10:1      |  Substantial |
| $2.3$ to $4.61$ | between 10:1 and 100:1      |    Strong |
| $>4.61$  | more than 100:1     |  Decisive |

Remember: a negative WoE indicates that the evidence speaks in favor of the alternative hypothesis $h'$. In that case, we can apply this table to the absolute value of the WoE to quantify the strength of evidence *against* $h$ (in *favor* of $h'$).
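
The rule of thumb in the table can be written as a small lookup function. This is only a sketch: the function name is ours, and how to treat scores that fall exactly on a threshold is our assumption, since the table does not say.

```python
def strength_of_evidence(woe_score: float) -> str:
    """Map a WoE score to the rule-of-thumb labels in the table above.

    The sign gives the direction (positive favors h, negative favors h'),
    so the thresholds are applied to the absolute value.
    """
    magnitude = abs(woe_score)
    # Assumption: a score exactly on a threshold falls into the stronger bucket.
    if magnitude < 1.15:
        return "Not worth mentioning"
    if magnitude < 2.3:
        return "Substantial"
    if magnitude < 4.61:
        return "Strong"
    return "Decisive"


for score in (0.4, 1.5, -3.0, 5.0):
    side = "h" if score > 0 else "h'"
    print(f"{score:+.1f}: {strength_of_evidence(score)} evidence, favoring {side}")
```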

[//]: <> "according to the model, how much more likely is class A over B given than the input is X"
[//]: <> "As before, a postitive WoE score indicates that the features $X$ make the reference class (A) more likely than the alternative one (B)."

Here's what a full example might look like:

<img align="center" width="900" height="900" src="./images_tutorial/Simple_WoE_Diagram_feat.jpg">


### WoE of Individual Features and Feature Groups

<!--- (<img align="center" width="500" height="500" src="./images_tutorial/dendrogram.png">) --->


When the input has dimension >1, we will usually display the WoE scores for each feature as shown above.

When there is a large number of features, and there is a meaningful way to group them, it is often convenient to show aggregated WoE scores per group.

For our running example, a sensible grouping of the six symptoms would be:  
* 'respiratory' (cough, dyspnea)
* overall 'body' feeling (aches, weakness)
* 'temperature' (chills, fever)

In that case, we could instead display:

<img align="center" width="900" height="900" src="./images_tutorial/Simple_WoE_Diagram_Agg.jpg">

which might let us quickly realize that the most decisive factors supporting this prediction are respiratory.
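
One simple way to obtain group-level scores, relying on the additivity of the WoE discussed earlier, is to sum the per-feature scores within each group. The sketch below does this for the grouping above; the per-feature values are invented for illustration, and the tool used in the study may aggregate differently.

```python
# Hypothetical per-feature WoE scores for one prediction (illustration only).
feature_woe = {
    "cough": 1.2, "dyspnea": 0.9,
    "aches": 0.3, "weakness": -0.2,
    "chills": -0.4, "fever": 0.5,
}

# Grouping of the six symptoms, as in the running example.
groups = {
    "respiratory": ["cough", "dyspnea"],
    "body": ["aches", "weakness"],
    "temperature": ["chills", "fever"],
}

# Assumption: WoE is additive over features, so a group's score is the sum
# of the scores of its members.
group_woe = {
    name: sum(feature_woe[f] for f in members)
    for name, members in groups.items()
}

for name, score in sorted(group_woe.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name:12s} {score:+.2f}")
```

With these made-up values, the respiratory group dominates, while the opposing contributions within the other two groups largely cancel out.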


### Sequential Explanations

For multi-class classification, it is sometimes useful to break down the explanation into various 'steps'. 

For our diagnosis example, suppose the model predicts 'flu'. It might be illuminating to understand, according to the model:
1. What evidence points to *viral diseases* ('flu', 'avian flu', etc.) instead of *bacterial* ones ('strep', etc.)
2. What evidence singles out common 'flu' over other viral diseases.

We can apply the Weight of Evidence sequentially for this purpose:

<img align="center" width="900" height="900" src="./images_tutorial/Simple_WoE_Diagram_Seq.jpg">
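
As a rough numeric illustration of the two steps, the sketch below computes one WoE score per step for a single piece of evidence. The classes, priors, and likelihoods are hypothetical, and a set of classes is treated as a composite hypothesis with a prior-weighted likelihood; this is our assumption, not necessarily how the study's tool computes it.

```python
import math

# Hypothetical priors P(Y=k) and likelihoods P(e | Y=k) for one piece of evidence e.
prior = {"flu": 0.5, "avian flu": 0.1, "strep": 0.4}
likelihood = {"flu": 0.6, "avian flu": 0.3, "strep": 0.1}


def set_likelihood(classes):
    """P(e | Y in classes), as a prior-weighted average over the classes."""
    mass = sum(prior[k] for k in classes)
    return sum(likelihood[k] * prior[k] for k in classes) / mass


viral, bacterial = ["flu", "avian flu"], ["strep"]

# Step 1: does the evidence point to viral rather than bacterial diseases?
step1 = math.log(set_likelihood(viral) / set_likelihood(bacterial))

# Step 2: within viral diseases, does it single out flu over the others?
step2 = math.log(set_likelihood(["flu"]) / set_likelihood(["avian flu"]))

print(f"step 1 (viral vs. bacterial):  {step1:+.2f}")
print(f"step 2 (flu vs. other viral):  {step2:+.2f}")
```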


### Prior Probabilities



### A Real Example

The following is an actual plot produced by our WoE-Explainer tool:

<img align="center" width="900" height="900" src="./images_tutorial/example_expl_2.jpg">


Based on this plot, please answer the following questions:

**Q2:** How would you interpret these facts?  

**Q3:** How would you interpret these facts?  


<!---
From this plot, we can draw the following conclusions:
* The prior log-odds are positive, i.e., absent other information the model is more likely to predict than not. 
* The values of the `econ` and `usage` attributes provide substantial evidence in favor of class A 
* The values of the `demo` and `safety` attributes provide substantial evidence against class A (equiv., in favor of class B)
* The rest of the attributes dont provide substantial evidence 
* Despite opposing effects, the total evidence **in favor** of A is stronger that that **against** it (think about stacking the positive and negative bars together), so the model predicts class A
--->

## Key take-aways:

1. #### The Weight of Evidence (WoE): "how does the presence of some input features (the *evidence*) affect the prediction of a model (the *hypothesis*)?"
2. #### WoE is expressed in terms of log odds ratios, so it is additive over variables (features)
3. #### You can choose to aggregate the WoE scores by feature group or keep them individual for each feature
4. #### You can choose to show one-shot (predicted-vs-other WoE) or sequential (WoE for subsets of the classes at a time) explanations