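We import `joblib`, `numpy` as `np`, `pandas` as `pd`, and `csv`. Older scikit-learn releases bundle `joblib` under `sklearn.externals`; that module was removed in scikit-learn 0.23, so the import falls back to the standalone `joblib` package.

In [None]:
try:
    from sklearn.externals import joblib  # bundled copy, scikit-learn < 0.23
except ImportError:
    import joblib  # standalone package
import numpy as np
import pandas as pd
import csv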

## load model

We use `joblib.load()` to load the pickled pipeline from the file `stolen_model.pkl` and store it in `model`.

In [None]:
model = joblib.load('stolen_model.pkl')

## Top 10 attributes class 1

To extract the top 10 attributes leading to one class, we take advantage of our knowledge about the underlying pipeline. The pipeline has a step named `'lin_svc'` that contains the support vector machine. We use `named_steps[]` to get this step from the pipeline and read its `coef_[0]` attribute to obtain the SVM's coefficients (weights), which we store in a variable called `weights`. The code converts the coefficients to a dense array with `toarray()`, sorts the indices by weight in descending order with `np.argsort()`, and keeps the first 10.

In [None]:
weights = model.named_steps['lin_svc'].coef_[0]
weights = weights.toarray()  # sparse row -> dense array of shape (1, n_features)
sorted_index = np.argsort(weights[0])[::-1]  # descending: largest weights first
#print(sorted_index)
top_10 = sorted_index[:10]

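Now that we have the weights we are interested in, we need to connect them to the corresponding terms. We get the named step `'tfidv'` from the pipeline and call `get_feature_names()` to store the terms in a new variable, then print the term at each of the top 10 indices. (In scikit-learn 1.0 and later, the equivalent method is `get_feature_names_out()`.)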

In [None]:
terms = model.named_steps['tfidv'].get_feature_names()
#print(terms)
for ind in top_10:
    print(terms[ind])

## Top 10 attributes class 2

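We also extract the top 10 attributes determining the second class. These are the terms with the smallest (most negative) weights, so this time we sort in ascending order.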

In [None]:
weights = model.named_steps['lin_svc'].coef_[0]
weights = weights.toarray()
sorted_index = np.argsort(weights[0])  # ascending: most negative weights first
#print(sorted_index)
top_10 = sorted_index[:10]
terms = model.named_steps['tfidv'].get_feature_names()
#print(terms)
for ind in top_10:
    print(terms[ind])
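
The two cells above differ only in the sort direction, so the lookup could be folded into one helper. The following is a minimal sketch on toy data rather than the real model; the function name `top_terms` and the sample weights and terms are hypothetical and only illustrate the `np.argsort()` pattern.

```python
import numpy as np

def top_terms(weights, terms, k=10, largest=True):
    """Return the k (term, weight) pairs with the largest or smallest weights."""
    weights = np.asarray(weights, dtype=float).ravel()
    order = np.argsort(weights)      # ascending: most negative first
    if largest:
        order = order[::-1]          # descending: most positive first
    return [(terms[i], float(weights[i])) for i in order[:k]]

# Toy data standing in for the SVM coefficients and the TF-IDF vocabulary
toy_weights = [0.2, -1.5, 0.9, 0.0, -0.3]
toy_terms = ["haus", "demo", "wahl", "und", "protest"]

print(top_terms(toy_weights, toy_terms, k=2, largest=True))
print(top_terms(toy_weights, toy_terms, k=2, largest=False))
```

With the real pipeline, one would presumably pass `weights[0]` and `terms` from the cells above in place of the toy lists.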

## lime

LIME (Local Interpretable Model-agnostic Explanations) is an explanation technique that explains the prediction of any classifier in an interpretable and faithful manner by learning an interpretable model locally around the prediction.

We create a list with the two elements `'activist'` and `'public'`, in this order, and assign it to `class_names`. We then create a `LimeTextExplainer`, passing `class_names=class_names`. Finally, we call its `explain_instance()` method with the text to explain as the first argument, `model.predict_proba` as the second, and `num_features=10`, and save the result to `exp`.

In [None]:
from lime import lime_text
from lime.lime_text import LimeTextExplainer

class_names = ['activist', 'public']
explainer = LimeTextExplainer(class_names=class_names)
exp = explainer.explain_instance("Das ist ein Test", model.predict_proba, num_features=10)
print('Probability:', model.predict_proba(["Das ist ein Test"]))
exp.as_list()

In [None]:
# use the built-in lime visualizations
%matplotlib inline
fig = exp.as_pyplot_figure()
exp.show_in_notebook(text=True)
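
To make the idea of "learning an interpretable model locally" concrete, here is a heavily simplified sketch of the principle. It is not the lime package's implementation, whose sampling, kernel and regression details differ. The classifier `toy_predict_proba`, the helper `explain_locally` and the sample sentence are hypothetical stand-ins: the text is perturbed by removing random words, the classifier is queried on each variant, and a weighted linear model is fitted to the answers.

```python
import numpy as np

def toy_predict_proba(texts):
    """Hypothetical stand-in for model.predict_proba: 'protest' raises P(class 1)."""
    p1 = np.array([0.9 if "protest" in t.split() else 0.2 for t in texts])
    return np.column_stack([1.0 - p1, p1])

def explain_locally(text, predict_proba, num_samples=500, seed=0):
    """Fit a weighted linear surrogate on word-removal perturbations of `text`."""
    rng = np.random.default_rng(seed)
    words = text.split()
    # Each row is a binary mask: 1 keeps the word, 0 removes it
    masks = rng.integers(0, 2, size=(num_samples, len(words)))
    masks[0] = 1  # the first sample is the unperturbed text
    samples = [" ".join(w for w, keep in zip(words, m) if keep) for m in masks]
    target = predict_proba(samples)[:, 1]
    # Proximity weight: fraction of words kept (a simple stand-in for a kernel)
    proximity = masks.mean(axis=1)
    root_w = np.sqrt(proximity)  # scaling rows by sqrt(w) yields weighted least squares
    X = np.column_stack([masks, np.ones(num_samples)])  # last column is the intercept
    coef, *_ = np.linalg.lstsq(X * root_w[:, None], target * root_w, rcond=None)
    # Pair each word with its surrogate weight, most influential first
    return sorted(zip(words, coef[:-1]), key=lambda pair: -abs(pair[1]))

explanation = explain_locally("heute ist protest in berlin", toy_predict_proba)
for word, weight in explanation:
    print(f"{word:>8} {weight:+.3f}")
```

Because the toy classifier reacts only to the word "protest", the surrogate assigns that word a clearly non-zero weight and the other words a weight near zero, which is the kind of per-word attribution that `exp.as_list()` reports for the real model.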

## eli5

**eli5** is a Python package for inspecting machine-learning models and explaining their predictions; it supports scikit-learn estimators and includes its own implementation of the LIME algorithm for text. We use eli5's `show_weights()` function, giving it the named step `'lin_svc'` as the first argument and the named step `'tfidv'` as `vec`. Additionally, we define the number of features we are interested in by setting `top` to an integer of our choice.

In [None]:
# use eli5 to show the top N features contributing to one class
import eli5
eli5.show_weights(model.named_steps['lin_svc'], vec=model.named_steps['tfidv'], top=30)

We can use the eli5 function `show_prediction()` with the named step 'lin_svc' as first argument, any string as the second argument and `vec=model.named_steps['tfidv']` to visualize the most important features for a sample text.

In [None]:
eli5.show_prediction(model.named_steps['lin_svc'], "Das ist ein Test", vec=model.named_steps['tfidv'])
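
For a linear model, the contribution of a term to the decision score is its weight multiplied by its feature value in the text. The sketch below illustrates that decomposition with made-up numbers; the vocabulary, TF-IDF values, weights and intercept are all hypothetical, and treating this as what the visualization conveys is an assumption, not a description of eli5's internals.

```python
import numpy as np

# Hypothetical vocabulary, TF-IDF values for one text, and linear weights
toy_terms = ["das", "ist", "ein", "test"]
tfidf_row = np.array([0.3, 0.3, 0.4, 0.8])
toy_coef = np.array([0.1, -0.2, 0.0, 1.5])
toy_intercept = -0.5

contributions = tfidf_row * toy_coef            # one contribution per term
score = contributions.sum() + toy_intercept     # contributions plus bias give the score

# List the terms by absolute contribution, largest first
for term, value in sorted(zip(toy_terms, contributions), key=lambda p: -abs(p[1])):
    print(f"{term:>5} {value:+.2f}")
print(f"score {score:+.2f}")
```

A term with a large weight but a small TF-IDF value can contribute less than a term with a moderate weight that dominates the text, which is why per-prediction explanations can differ from the global weight ranking.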

## send your candidate messages to the server for evaluation
Use backdoor.py (either paste the code in here or execute it separately)

## save your candidate messages
Save your messages as a .csv using the code below.


In [None]:
messages = [['message 1'],
            ['message 2'],
            ['message 3'],
            ['message 4'],
            ['message 5'],
            ['message 6'],
            ['message 7']]
df = pd.DataFrame(messages)
df.to_csv("your_team.csv", encoding='utf-8', quoting=csv.QUOTE_ALL, header=False, index=False)
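
The `quoting=csv.QUOTE_ALL` setting wraps every field in double quotes, so messages containing commas or quotation marks survive the round trip. The sketch below shows the effect using only the standard-library `csv` module and an in-memory buffer instead of pandas and a file on disk; the sample messages are hypothetical.

```python
import csv
import io

sample_messages = [["message 1"], ["message, with comma"], ['message with "quotes"']]

buffer = io.StringIO()  # in-memory stand-in for the file on disk
writer = csv.writer(buffer, quoting=csv.QUOTE_ALL)
writer.writerows(sample_messages)

raw = buffer.getvalue()
print(raw)

# Reading the text back recovers the original messages unchanged
restored = list(csv.reader(io.StringIO(raw)))
print(restored == sample_messages)
```

Embedded double quotes are escaped by doubling them, which is the convention `csv.reader` expects when parsing the file again.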