## CSCI 470 Activities and Case Studies

1. For all activities, you are allowed to collaborate with a partner. 
1. For case studies, you should work individually and are **not** allowed to collaborate.

By filling out this notebook and submitting it, you acknowledge that you are aware of the above policies and are agreeing to comply with them.

Some considerations with regard to how these notebooks will be graded:

1. You can add more notebook cells or edit existing notebook cells other than "# YOUR CODE HERE" to test out or debug your code. We actually highly recommend you do so to gain a better understanding of what is happening. However, during grading, **these changes are ignored**. 
2. You must ensure that all your code for the particular task is available in the cells that say "# YOUR CODE HERE"
3. Every cell that says "# YOUR CODE HERE" is followed by a "raise NotImplementedError". You need to remove that line. During grading, if an error occurs then you will not receive points for your work in that section.
4. If your code passes the "assert" statements, then no output will result. If your code fails the "assert" statements, you will get an "AssertionError". Getting an assertion error means you will not receive points for that particular task.
5. If you edit the "assert" statements to make your code pass, they will still fail when they are graded since the "assert" statements will revert to the original. Make sure you don't edit the assert statements.
6. We may sometimes have "hidden" tests for grading. This means that passing the visible "assert" statements is not sufficient. The "assert" statements are there as a guide but you need to make sure you understand what you're required to do and ensure that you are doing it correctly. Passing the visible tests is necessary but not sufficient to get the grade for that cell.
7. When you are asked to define a function, make sure you **don't** use any variables outside of the parameters passed to the function. You can think of the parameters being passed to the function as a hint. Make sure you're using all of those variables.
8. Finally, **make sure you run "Kernel > Restart and Run All"** and pass all the asserts before submitting. If you don't restart the kernel, code that you ran and later deleted may still be in memory, and that stale code may be the only reason your asserts are passing.
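Items 7 and 8 both guard against hidden notebook state. As a minimal sketch (the function names and the `scale` variable are hypothetical, not part of the assignment), the first function below silently depends on a notebook-level variable, while the second receives everything it needs as parameters:

```python
# Hypothetical example: `scale` stands in for a variable defined in some
# earlier notebook cell that may later be edited or deleted.
scale = 2


def rescale_with_global(values):
    # Relies on whatever `scale` currently holds in the notebook.
    return [v * scale for v in values]


def rescale(values, scale):
    # Uses only its parameters, so the result does not depend on other cells.
    return [v * scale for v in values]


before = rescale_with_global([1, 2, 3])

# A later cell quietly changes the variable...
scale = 10
after = rescale_with_global([1, 2, 3])

print("global-dependent:", before, "then", after)
print("parameter-only:  ", rescale([1, 2, 3], 2))
```

The first function returns different results for the same input depending on which cells ran beforehand, which is exactly the kind of behavior that "Restart and Run All" is meant to expose.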

# Model Interpretability

In this exercise you'll use the [alibi](https://docs.seldon.io/projects/alibi/en/stable/) library to explain why some models make the predictions they do.

In [None]:
! pip install alibi

In [None]:
import sklearn
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import LinearSVC
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.metrics import accuracy_score

In [None]:
data = load_iris()

In [None]:
print(data["DESCR"])
features = data["data"]
targets = data["target"]

In [None]:
X_train, X_test, y_train, y_test = train_test_split(features, targets, random_state=0)

In [None]:
print(len(X_test))

In [None]:
## Create 2 classifiers: rf_clf that is a Random Forest model, and svm_clf that is a Linear SVM model
## Train them both on the training data
## Use them to predict the test data - saving it to y_rf_pred and y_svm_pred respectively
## Consider using GridSearchCV to run a hyperparameter search for both models.

rf_param_grid = {
    'max_depth': [80, 90, 100],
    'max_features': [2,3]
}
rf_grid_search = GridSearchCV(estimator=RandomForestClassifier(), param_grid=rf_param_grid, cv=5)
rf_grid_search.fit(X_train, y_train)
rf_clf = rf_grid_search.best_estimator_
y_rf_pred = rf_clf.predict(X_test)

svm_param_grid = {
    'C': [0.01, 0.1, 1, 10, 100, 1000]
}
svm_grid_search = GridSearchCV(estimator=LinearSVC(), param_grid=svm_param_grid, cv=5)
svm_grid_search.fit(X_train, y_train)
svm_clf = svm_grid_search.best_estimator_
y_svm_pred = svm_clf.predict(X_test)



In [None]:
assert len(y_rf_pred) == 38
assert isinstance(rf_clf, RandomForestClassifier) or isinstance(rf_clf, GridSearchCV)
assert len(y_svm_pred) == 38
assert isinstance(svm_clf, LinearSVC) or isinstance(svm_clf, GridSearchCV)

In [None]:
print(f"The random forest model achieved an accuracy of {accuracy_score(y_test, y_rf_pred)}.")
print(f"The support vector machine model achieved an accuracy of {accuracy_score(y_test, y_svm_pred)}.")

In [None]:
# Since we used a Linear SVM, we can easily determine the coefficients for the features:
if isinstance(svm_clf, LinearSVC):
    print(svm_clf.coef_)
elif isinstance(svm_clf, GridSearchCV):
    print(svm_clf.best_estimator_.coef_)

print("Each class gets a coefficient for each feature that helps us determine that feature's importance.")

Now let's look at how we can use explainers, namely the [AnchorTabular](https://docs.seldon.io/projects/alibi/en/stable/methods/Anchors.html#id3) explainer, to understand why the models make the predictions they do.

In [None]:
from alibi.explainers import AnchorTabular

Alibi explainers follow a general structure of:

1. Initialize the explainer, providing a prediction function and any explainer-specific parameters: `exp = Explainer(predict_func, param_1, param_2, ...)`
1. Fit the explainer to the training data (whether this step is needed depends on the explainer): `exp.fit(train_data)`
1. Explain a given sample: `exp.explain(sample)`

First, we reframe the prediction pipeline into a prediction function that we can use with the explainer:

In [None]:
rf_clf_func = lambda x: rf_clf.predict(x)
svm_clf_func = lambda x: svm_clf.predict(x)

Now we can instantiate the explainer using the prediction function and any parameters the explainer requires:

In [None]:
rf_explainer = AnchorTabular(rf_clf_func, data["feature_names"])
rf_explainer.fit(X_train)

In [None]:
svm_explainer = AnchorTabular(svm_clf_func, data["feature_names"])
svm_explainer.fit(X_train)

Once the explainers are set up, we can use them to `.explain` samples! Pick a sample below to explain the two models' predictions.

In [None]:
# Change this value to choose a test sample
index_to_explain = 5


rf_explanation = rf_explainer.explain(X_test[index_to_explain])
svm_explanation = svm_explainer.explain(X_test[index_to_explain])

In [None]:
rf_explanation.anchor, rf_explanation.precision

In [None]:
svm_explanation.anchor, svm_explanation.precision

Here we can see each explainer's anchor for the classification of that sample: a set of feature conditions under which the model's prediction stays the same, together with the precision of that rule. Even with a relatively interpretable model such as a Linear SVM, these explainers can provide a more direct and intuitive explanation for why a sample was labeled the way it was.
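To make the `precision` value more concrete, here is a minimal sketch of how an anchor's precision and coverage could be estimated. Everything in it is a hypothetical stand-in: `toy_predict` replaces the trained classifier, `anchor_holds` is a hand-written rule, and the six rows are made-up measurements. alibi estimates these quantities by sampling perturbed instances rather than from a fixed table.

```python
# Hypothetical rows: (sepal length, sepal width, petal length, petal width) in cm.
samples = [
    (5.1, 3.5, 1.4, 0.2),
    (4.9, 3.0, 1.4, 0.2),
    (6.4, 3.2, 4.5, 1.5),
    (5.0, 2.3, 3.3, 1.0),
    (6.3, 3.3, 6.0, 2.5),
    (5.8, 2.7, 5.1, 1.9),
]


def toy_predict(row):
    """Stand-in classifier: 0 = setosa, 1 = versicolor, 2 = virginica."""
    petal_length = row[2]
    if petal_length < 2.5:
        return 0
    return 1 if petal_length < 4.8 else 2


def anchor_holds(row):
    """Candidate anchor: 'petal width (cm) <= 0.3'."""
    return row[3] <= 0.3


instance = samples[0]
target = toy_predict(instance)

covered = [row for row in samples if anchor_holds(row)]
# Coverage: fraction of samples the anchor applies to.
coverage = len(covered) / len(samples)
# Precision: among covered samples, fraction predicted like the instance.
precision = sum(toy_predict(row) == target for row in covered) / len(covered)

print(f"coverage={coverage:.2f}, precision={precision:.2f}")
```

A rule with high precision but low coverage is reliable yet narrow: it explains the prediction well, but only for a small region of the data.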

Now that you've seen the general approach for these explainers, let's work on something a bit more complex. This time you'll create the models, the prediction functions, and the explainers yourself.

## Explaining MNIST predictions

Explaining predictions made from measured observations is simple enough. Now let's try explaining how images get labeled.

In [None]:
import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPool2D
import matplotlib
import matplotlib.pyplot as plt
import numpy as np

In [None]:
(x_train, y_train), (x_test, y_test) = mnist.load_data()

In [None]:
sample_index = 12
sample = x_train[sample_index]

In [None]:
plt.imshow(sample, cmap="gray")

In [None]:
## Create a neural network model that should do well on the MNIST dataset and save it to mnist_nn
## Make the neural network sufficiently complex (at least 5 layers) and feel free to use Conv2D layers for example

## Save the neural network to mnist_nn
## You'll need to make sure you get at least 80% accuracy

input_shape = x_train[0].reshape(28, 28, 1).shape
layers = [
    # Four Conv2D + MaxPool2D stages with 'same' padding
    Conv2D(32, kernel_size=3, padding='same', activation='relu', input_shape=input_shape),
    MaxPool2D(padding='same'),
    Conv2D(64, kernel_size=3, padding='same', activation='relu'),
    MaxPool2D(padding='same'),
    Conv2D(128, kernel_size=3, padding='same', activation='relu'),
    MaxPool2D(padding='same'),
    Conv2D(128, kernel_size=3, padding='same', activation='relu'),
    MaxPool2D(padding='same'),
    # Dense classifier head ending in a 10-way softmax
    Flatten(),
    Dense(100, activation='relu'),
    Dense(100, activation='relu'),
    Dense(10, activation='softmax'),
]
mnist_nn = Sequential(layers)

mnist_nn.compile(optimizer="adam", loss='sparse_categorical_crossentropy', metrics=['accuracy'])
mnist_nn.fit(x_train.reshape(-1, 28, 28, 1), y_train, epochs=1)
mnist_nn.summary()

In [None]:
assert len(mnist_nn.layers) > 5
assert mnist_nn.evaluate(x_test.reshape(-1, 28, 28 ,1), y_test)[1] > 0.8

In [None]:
from alibi.explainers import AnchorImage

To work with images, we'll use the [AnchorImage](https://docs.seldon.io/projects/alibi/en/stable/methods/Anchors.html#id5) explainer. This explainer requires that we break up the image into "superpixels". We'll use the function in the next cell to do just that. 

In [None]:
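def superpixel(image, size=(4, 4)):
    """Label each pixel with the index of the size[0] x size[1] grid block containing it."""
    # Number of blocks per image row; ceiling division keeps labels unique
    # even when the image width is not a multiple of size[1].
    blocks_per_row = -(-image.shape[1] // size[1])
    segments = np.zeros([image.shape[0], image.shape[1]])
    row_idx, col_idx = np.where(segments == 0)
    for i, j in zip(row_idx, col_idx):
        segments[i, j] = blocks_per_row * (i // size[0]) + j // size[1]
    return segments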

In [None]:
segments = superpixel(x_train[0])
plt.imshow(segments)

Each square in the plot is a superpixel. You can change the code above to try other ways of defining superpixels, or simply change the size from `(4, 4)` to something else and see what happens.
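As one way to experiment with block sizes, the sketch below builds the same kind of grid labeling with NumPy broadcasting instead of a Python loop. The name `grid_superpixels` is hypothetical, and the ceiling division is an assumption added here so that sizes that do not divide the image evenly still give every block its own label.

```python
import numpy as np


def grid_superpixels(image, size=(4, 4)):
    """Return an integer label per pixel, one label per size[0] x size[1] block."""
    rows, cols = image.shape[:2]
    # Ceiling division: number of blocks needed to span one image row.
    blocks_per_row = -(-cols // size[1])
    row_block = np.arange(rows) // size[0]  # block row of each pixel row
    col_block = np.arange(cols) // size[1]  # block column of each pixel column
    # Broadcasting combines the two 1-D index vectors into a 2-D label map.
    return row_block[:, None] * blocks_per_row + col_block[None, :]


blank = np.zeros((28, 28))
for size in [(4, 4), (7, 7), (5, 5)]:
    labels = grid_superpixels(blank, size)
    print(size, "->", len(np.unique(labels)), "superpixels")
```

Larger blocks mean fewer superpixels for the explainer to search over, at the cost of coarser explanations.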

In [None]:
# Create an explainer object using AnchorImage that explains the mnist_nn model you created.
# Make sure to use the superpixel function as the segmentation function 

predict_fn = lambda x: mnist_nn.predict(x)
mnist_explainer = AnchorImage(predict_fn, input_shape, segmentation_fn=superpixel)

In [None]:
assert isinstance(mnist_explainer, AnchorImage)

In [None]:
# Change this number and try out different samples
image_index_to_explain = 2
image_to_explain = x_test.reshape(-1, 28, 28 ,1)[image_index_to_explain]

In [None]:
plt.imshow(image_to_explain[:,:,0], cmap="gray")

In [None]:
# Change the values of p_sample and threshold here to see how the explanation for this sample changes.
mnist_image_explanation = mnist_explainer.explain(image_to_explain, threshold=.9, p_sample=.5)

In [None]:
print(f"The model predicted the number as a {mnist_nn.predict(image_to_explain.reshape(1, 28, 28, 1)).argmax()} because of:")
plt.imshow(mnist_image_explanation.anchor[:,:,0], cmap="gray")

One thing you may have noticed is that the explanations are heavily dependent on the superpixels we identify. Have ideas for a better superpixel definition? Go back and try it!
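To see why the superpixel definition matters so much, it helps to picture an image anchor as a set of superpixel labels: the displayed anchor is the part of the image covered by those labels. The sketch below illustrates that idea only. The 8x8 image, the 4x4 blocks, and the choice of `kept_labels` are made up, and this is not alibi's internal implementation.

```python
import numpy as np

# Hypothetical 8x8 "image" with pixel values 1..64.
image = np.arange(1, 65).reshape(8, 8)

# 4x4 grid superpixels labeled 0..3
# (top-left, top-right, bottom-left, bottom-right).
segments = (np.arange(8)[:, None] // 4) * 2 + (np.arange(8)[None, :] // 4)

# Suppose the anchor consists of the top-left and bottom-right superpixels.
kept_labels = [0, 3]
mask = np.isin(segments, kept_labels)

# Pixels outside the anchor are blanked; only whole superpixels can be kept.
anchor_view = np.where(mask, image, 0)
print(anchor_view)
```

Because the smallest unit an anchor can keep or drop is a whole superpixel, coarse or poorly aligned segments lead to coarse explanations.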

## Explaining newsgroup predictions

With the newsgroup dataset, we'll look at explaining how text gets classified, using [AnchorText](https://docs.seldon.io/projects/alibi/en/v0.2.2/methods/Anchors.html#Initialization).
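For text, an anchor is a set of words whose presence keeps the prediction stable while the other words are perturbed. The sketch below illustrates that idea with made-up pieces: `toy_predictor` is a hypothetical keyword classifier, and replacing words with `UNK` is a simplified stand-in for the perturbation sampling that an anchor explainer performs.

```python
import random


def toy_predictor(texts):
    """Hypothetical classifier: label 1 if the text mentions 'hockey', else 0."""
    return [1 if "hockey" in text.split() else 0 for text in texts]


def estimate_precision(text, anchor_words, n_samples=200, seed=0):
    """Fraction of perturbed texts that keep the original prediction."""
    rng = random.Random(seed)
    words = text.split()
    original = toy_predictor([text])[0]
    perturbed = []
    for _ in range(n_samples):
        # Anchor words are always kept; every other word is masked half the time.
        sample = [
            w if w in anchor_words or rng.random() < 0.5 else "UNK"
            for w in words
        ]
        perturbed.append(" ".join(sample))
    predictions = toy_predictor(perturbed)
    return sum(p == original for p in predictions) / n_samples


text = "the hockey game went to overtime last night"
print("anchor {'hockey'}:", estimate_precision(text, anchor_words={"hockey"}))
print("anchor {'game'}:  ", estimate_precision(text, anchor_words={"game"}))
```

Keeping `hockey` fixed preserves the toy prediction in every perturbed sample, whereas keeping `game` fixed does not, so only the former would qualify as a high-precision anchor.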

In [None]:
from sklearn.datasets import fetch_20newsgroups
import spacy
from alibi.explainers import AnchorText
from alibi.utils.download import spacy_model
from sklearn.feature_extraction.text import TfidfVectorizer

In [None]:
newsgroups = fetch_20newsgroups()

In [None]:
print(newsgroups["DESCR"])
text = newsgroups["data"]
news_labels = newsgroups["target"]
newsgroup_names = newsgroups["target_names"]

text_train, text_test, labels_train, labels_test = train_test_split(text, news_labels, random_state=0)

In [None]:
# Creating a TFIDF vectorizer and Linear SVM classifier to make predictions about the newsgroup dataset

tfidf = TfidfVectorizer()
tfidf.fit(text_train)

clf = LinearSVC()
clf.fit(tfidf.transform(text_train), labels_train)


In [None]:
# Create newsgroup_predictor, a prediction function to use with an AnchorText explainer, built from
# the vectorizer and classifier defined in the cell above.
# Note that you have to transform the text with the vectorizer before predicting on it.

newsgroup_predictor = lambda x: clf.predict(tfidf.transform(x))

In [None]:
assert len(newsgroup_predictor(text_test[:2])) == 2

In [None]:
model = 'en_core_web_md'
spacy_model(model=model)
nlp = spacy.load(model)

In [None]:
# Create the explainer to use
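# Keyword arguments are used because the positional order of `nlp` and the
# predictor differs between alibi releases.
newsgroup_explainer = AnchorText(predictor=newsgroup_predictor, nlp=nlp)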

In [None]:
# Copy the text of an article you find on the internet and save it as article

article = """Sprinting through the office door and leaping onto his stunned father’s lap, 27-year-old Dennis Radomir loudly announced Daddy, I’m hungry Monday as he burst into the background of a work-related video conference. 
Daddy, Daddy, my tummy is grumbling, please can I have my yum yums now, whined the fully grown adult male before taking off his shirt, falling to the ground, and crying loudly after his father refused to give him his favorite 
dino nuggies. Sprinting through the office door and leaping onto his stunned father’s lap, 27-year-old Dennis Radomir loudly announced Daddy, I’m hungry Monday as he burst into the background of a work-related video conference."""

In [None]:
assert len(article) > 500

In [None]:
# Define article_explanation as the explainer's explanation for the article you provided.

article_explanation = newsgroup_explainer.explain(article)

In [None]:
print(f"The model predicted the article as {newsgroup_names[newsgroup_predictor([article])[0]]} because of the word: {article_explanation.anchor}")

In [None]:
# Change this number and try out different samples
test_sample_index = 28
test_sample = text_test[test_sample_index]
print(test_sample)

In [None]:
test_sample_explanation = newsgroup_explainer.explain(test_sample)

In [None]:
print(f"The model predicted the test sample as {newsgroup_names[newsgroup_predictor([test_sample])[0]]} because of the word {test_sample_explanation.anchor}")

## Feedback

In [None]:
def feedback():
    """Provide feedback on the contents of this exercise
    
    Returns:
        string
    """
    return "Other than not getting a chance to see if the article worked, all good!"

In [None]:
print(feedback())