# User Study on Interpretability - Main Sections

--- 

***BEFORE YOU BEGIN***: 

* Make sure this is running on the **Python 3.6 Kernel** (not Python 3). This can be changed in the 'Kernel' menu above.
* Go to "Cell" -> "Run All" to start executing preliminary commands in the background while you read the instructions below.

--- 

### Description:
    
Throughout this study we will be using the 'Online News Popularity' dataset. 

Each instance in this dataset is a news article published on mashable.com, characterized by 53 features describing its length, contents, polarity and some metadata. We will provide short descriptions of each feature below. The data consists of 33,510 examples (80%/20% training/testing). 

The task is to predict the *channel* ('world', 'tech', 'entertainment', 'business', 'social media' or 'lifestyle') in which each news article was published. 

The 'Preliminaries' section below is a typical ML pipeline: data loading, description, model training and evaluation.

After the model is trained, the interpretability tool will be instantiated and used to explain the predictions of this model.

### Instructions:

* Please read carefully and **execute all cells** (if you did "Run All", the first part will already be executed, no need to run those again).
* At the end of each section you will find some questions, which you can answer in the empty cells provided below them.
* If you have any questions, please let the researcher know.
* Feel free to refer to the tutorial if you need a reminder of any of the concepts introduced there.


---

## Preliminaries: Data, Features, Meta-Features & Models

In [None]:
import sys
import importlib
import numpy as np
import sklearn
import pandas as pd
import matplotlib.pyplot as plt
sys.path.append('../')

The target variable is the 'channel', which has 6 classes, not evenly distributed:

In [None]:
from src.data import load_online_news
X, Y, df, feature_groups, feature_desc = load_online_news(target='channel', transform='log')

This dataset has 53 features, which could be hard to analyze simultaneously. Fortunately, there are many variables that encode similar aspects of the input, like length or polarity. A fairly simple and natural grouping of features is shown below.

**Note**: there is no need to read the description of all features. Should you need them, you can scroll back here and read those that might be relevant for questions later on.

In [None]:
for i,idxs in enumerate(feature_groups.idxs):
    print('\nFeature Group {}: {}'.format(i, feature_groups.names[i].upper()))
    for j in idxs:
        print('    {:30}\t->\t{}'.format(X.names[j], feature_desc[X.names[j]]))

Next, we show basic statistics of each feature. Again, this information is not crucial for answering the questions below, and is provided only for reference. 

In [None]:
fig, ax = plt.subplots(figsize=(18,4))
cols = np.array(df.columns.tolist())[np.concatenate(feature_groups.idxs)]
df.reindex(columns = cols).boxplot(grid=False, rot = 90, ax = ax)
plt.show()

Note that the integer-valued features have been rescaled by taking their logarithm.
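
As a rough illustration of what taking the log does to count-like features, the sketch below applies `math.log1p` to a few made-up counts. The values are hypothetical, and whether `load_online_news` uses `log(1 + x)`, a plain `log(x)`, or some other variant is an assumption, not something this notebook shows.

```python
import math

# Hypothetical integer-valued feature (e.g. word counts); not taken from the dataset.
counts = [0, 9, 99, 999]

# log1p computes log(1 + x), which stays finite when a count is zero.
# Assumption: the loader's transform='log' option may use a different variant.
log_counts = [math.log1p(c) for c in counts]

print([round(v, 2) for v in log_counts])  # -> [0.0, 2.3, 4.61, 6.91]
```

Counts spanning three orders of magnitude end up within a few units of each other, which is why the boxplots above fit on a common axis.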

##### Model Training

We will train a classifier on this data.

In [None]:
import src.classifiers
classifier = src.classifiers.factory(dataset='online_news', model_type=2, load_trained=True)
#classifier.fit(X.train, Y.train)
print('Accuracy on train: {:4.2f}%'.format(100*classifier.score(X.train, Y.train)))
print('Accuracy on test: {:4.2f}%'.format(100*classifier.score(X.test, Y.test)))

In [None]:
assert 'meaning of life' is 42, " This error was purposely added to stop automatic execution. Ignore and continue below."

---
## Part 1

##### Instantiate Explainers

We now create a Weight of Evidence estimator, and an explainer wrapper around it.

In [None]:
import importlib
from src.utils import range_plot
from src.explainers import WOE_Explainer
from src.woe import woe_gaussian

woe_estimator =  woe_gaussian(classifier, X.train, classes = range(len(Y.names)), cond_type='nb')

woeexplainer = WOE_Explainer(classifier, woe_estimator,
                             total_woe_correction=True,
                             classes=Y.names, features=X.names,
                             X=X.train, Y=Y.train,
                             featgroup_idxs = feature_groups.idxs,
                             featgroup_names = feature_groups.names)
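
How `woe_gaussian` computes its estimates is not shown in this notebook. As a minimal sketch of the underlying idea, the code below assumes Gaussian class-conditional densities and the naive Bayes independence assumption (in the spirit of `cond_type='nb'`), and compares one class against a single alternative class for simplicity. The function names, class parameters and input values are all hypothetical.

```python
import math

def gaussian_logpdf(x, mean, std):
    """Log density of a univariate normal distribution at x."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2 * math.pi)

def woe_naive_bayes(x, params_h, params_alt):
    """Per-feature weight of evidence of x for class h versus an alternative.

    Each term is log p(x_i | h) - log p(x_i | alt). Under naive Bayes the
    features are conditionally independent, so the terms simply add up.
    """
    return [
        gaussian_logpdf(xi, *ph) - gaussian_logpdf(xi, *pa)
        for xi, ph, pa in zip(x, params_h, params_alt)
    ]

# Hypothetical (mean, std) per feature for two classes; not estimated from the data.
params_tech = [(2.0, 1.0), (0.0, 1.0)]
params_world = [(0.0, 1.0), (0.0, 1.0)]

x = [2.0, 0.5]
per_feature = woe_naive_bayes(x, params_tech, params_world)
total_woe = sum(per_feature)

print('per-feature WoE:', [round(w, 3) for w in per_feature])
print('total WoE      :', round(total_woe, 3))
```

In this toy setup the first feature separates the two classes and so carries positive evidence for 'tech', while the second feature has the same distribution under both classes and contributes nothing.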


Before explaining specific examples, let's look at the model's prior class probabilities.


In [None]:
fig, ax = plt.subplots(1,1, figsize=(7,4))
woeexplainer.plot_priors(normalize = None, ax = ax) 
plt.show()

As discussed in the tutorial, lower prior log odds require stronger evidence to overcome them. In this case, 'social media' and 'lifestyle' have much lower prior log odds than the other classes (because the data is unbalanced!).
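
To make the link between class imbalance and prior log odds concrete, here is a small sketch using made-up class counts. It assumes the prior log odds of a class are taken against all other classes combined, i.e. log(p / (1 - p)); the counts are hypothetical and are not those of this dataset.

```python
import math

# Hypothetical class counts for illustration; not the actual dataset counts.
class_counts = {'world': 400, 'tech': 350, 'lifestyle': 50}
total = sum(class_counts.values())

prior_log_odds = {}
for name, count in class_counts.items():
    prior = count / total
    # Log odds of this class versus all other classes combined.
    prior_log_odds[name] = math.log(prior / (1 - prior))
    print(f'{name:10s} prior = {prior:.3f}   prior log odds = {prior_log_odds[name]:+.2f}')
```

The rarest class gets the most negative prior log odds, so an example needs more positive evidence before that class becomes the most probable one.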




##### Let's pick an example from the test set:

In [None]:
idx_1 = 4 # Don't change this
x_1   = X.test[idx_1].reshape(1,-1)
y_1   = Y.test[idx_1].reshape(1,)

The first set of questions in this section will be based on this example.

Let's see what the model predicts:

In [None]:
pred_class = classifier.predict(x_1)[0]
pred_proba = classifier.predict_proba(x_1)[0][pred_class]

print(f"Predicted class: {Y.names[pred_class]} (prob: {pred_proba})")
print(f"True class:      {Y.names[y_1.squeeze()]}")

We look at this example's feature values compared to the training data (aggregated by class):

In [None]:
woeexplainer.plot_ranges(x_1, groupby='predicted', annotate='value', rescale=True)
plt.show()

The boxplots have been centered and scaled in this plot to facilitate visualization. 

While the actual values of the features are not too important, the position of the black dots (the example being explained) relative to the training data is useful for understanding how this instance relates to other examples.
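
For reference, centering and scaling can be sketched as below. This assumes a standard z-score (subtract the training mean, divide by the training standard deviation); `plot_ranges` may use a different rescaling, and the numbers are hypothetical.

```python
import statistics

# Hypothetical training values of one feature, and the value for the example being explained.
train_values = [1.0, 2.0, 3.0, 4.0, 5.0]
example_value = 4.5

mean = statistics.mean(train_values)
std = statistics.pstdev(train_values)

# After rescaling, 0 corresponds to the training mean and 1 to one
# standard deviation above it, so all features share a common axis.
scaled_train = [(v - mean) / std for v in train_values]
scaled_example = (example_value - mean) / std

print(f'scaled example value: {scaled_example:.2f}')
```

A positive scaled value means the example lies above the training average for that feature, which is the kind of relative position the black dots convey.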

Now we explain the model's prediction for this example, using the Explainer tool.

**Attention**: Here, **you have to choose** whether to visualize the explanation by features or by feature groups. Don't worry! You can switch as needed.

In [None]:
### Uncomment one to select TYPE of explanation unit

#explanation_units = 'features' 
#explanation_units = 'feature_groups' 

e = woeexplainer.explain(x_1,y_1, totext=False, units=explanation_units)

##### Q1: In plain English, what would you say are the main characteristics of this news article that the model is relying on to make its prediction?

Answer:

In [None]:
# The prediction is mostly based on these characteristics:
# 1. The article ...
# 2. The article ...
# 3. The article ... 

Let's clear the variables before moving on:

In [None]:
if 'idx_1' in globals(): del idx_1
if 'explanation_units' in globals(): del explanation_units

#### We will now take a look at a different example.

In [None]:
idx_2 = 20 # Don't change this 
x_2   = X.test[idx_2].reshape(1,-1)
y_2   = Y.test[idx_2].reshape(1,)

If you want to look at the feature boxplots for this example, uncomment the following:

In [None]:
# woeexplainer.plot_ranges(x_2, groupby='predicted', annotate='value', rescale=True)
# plt.show()

As before, you can **select** how to display the explanation:

In [None]:
### Uncomment one to select TYPE of explanation unit

#explanation_units = 'features' 
#explanation_units = 'feature_groups'

e = woeexplainer.explain(x_2, y_2, units=explanation_units, totext=False)

##### Q2: The model is not very confident about its prediction. Why do you think that is?

Answer:

In [None]:
# This prediction is not very confident because ....

##### Q3: In plain English, how would you modify this article to make the model more confident of its prediction, while not changing the article 'too much'?

Answer:

In [None]:
# I would change ... 

---
## Part 2

For the second part of the study, we will continue working with the same dataset and model, but will now try to answer a different set of questions.

Let's pick another example:

In [None]:
idx_3 = 55 # Don't change this.
x_3   = X.test[idx_3].reshape(1,-1)
y_3   = Y.test[idx_3].reshape(1,)

If you want to look at the feature boxplots for this example, uncomment the following:

In [None]:
#woeexplainer.plot_ranges(x_3, groupby='predicted', annotate='value', rescale=True)
#plt.show()

Let's see what the model predicts in this case:

In [None]:
pred_class = classifier.predict(x_3)[0]
pred_proba = classifier.predict_proba(x_3)[0][pred_class]

print(f"Predicted class: {Y.names[pred_class]} (prob: {pred_proba})")
print(f"True class:      {Y.names[y_3.squeeze()]}")

Let's explain it. Now, **you must choose** whether to produce a **sequential** or **one-shot** explanation. Again, feel free to change between these as needed.

In [None]:
sequential_explanation = # Choose True or False

e = woeexplainer.explain(x_3, y_3, units='feature_groups', totext=False,
                         sequential=sequential_explanation)
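
As a rough sketch of how the two modes might differ, the code below assumes the odds form of Bayes' rule (posterior log odds = prior log odds + total weight of evidence) and that the sequential mode adds one feature group at a time. The group names and numbers are hypothetical, and what `woeexplainer.explain` actually displays may differ.

```python
# Hypothetical prior log odds and per-group weights of evidence for one class.
prior_log_odds = -1.0
group_woe = [('keywords', 2.5), ('length', -0.5), ('polarity', 0.25)]

# One-shot view: all group contributions are combined at once.
posterior_log_odds = prior_log_odds + sum(woe for _, woe in group_woe)
print(f'one-shot posterior log odds: {posterior_log_odds:+.2f}')

# Sequential view: start from the prior and add one group at a time,
# tracking how the running log odds move after each step.
running = [prior_log_odds]
for name, woe in group_woe:
    running.append(running[-1] + woe)
    print(f'after {name:9s} (woe {woe:+.2f}): log odds = {running[-1]:+.2f}')
```

Both views end at the same posterior log odds; the sequential one additionally shows which groups pushed the log odds up or down along the way.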

##### Q4: Why do you think the model didn't predict 'entertainment', 'lifestyle' or 'tech' instead?

Answer:

In [None]:
# It didn't predict these other classes because: 

##### Q5: Why do you think the model didn't predict 'world' instead?

Answer:

In [None]:
# It didn't predict 'world' because:

##### Q6: Suppose this classifier had been trained on a dataset with very few examples labeled 'business', but was otherwise identical. Do you think the prediction for this example would change? If so, what class would be predicted instead?

Answer:

In [None]:
# The prediction [would|wouldn't] change ...

<!--- ##### Q6: Suppose there is another news article with the same exact 'keywords' as this one, but all the other features have changed so that they are equally likely for 'world' than for other classes (i.e., they are not predictive of the class). For this modified article, how much more likely do you think it is that the model would predict 'world' instead of other classes?

Answer:

--->

## Follow-Up Questions

The researcher will now ask you a few general follow-up questions. 