# User testing

In this notebook you will see how to test a model with our Biaslyze tool and inspect it for hints of possible bias. Biaslyze uses counterfactual token fairness scores to evaluate the significance of concepts and attributes that are sensitive to discrimination within the model's decisions.
To show you how Biaslyze works, we use data from a Kaggle challenge and build a model that classifies online comments as toxic or not toxic.
The data consists of 226235 online comments. You can get the data on the Kaggle site.

Data source: [https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge](https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge)
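
To make the idea of a counterfactual token score more concrete, here is a minimal sketch. It is not Biaslyze's implementation: `toy_predict_proba` is a hypothetical stand-in for a model and `counterfactual_score_diffs` is a helper invented for this illustration. The sketch swaps one concept token for alternatives from the same concept and records how much the predicted toxicity score changes.

```python
import re


def toy_predict_proba(texts):
    """Hypothetical stand-in for a model: returns a toxicity score per text."""
    # Deliberately biased toy rule: the score is higher when "she" appears.
    return [0.8 if re.search(r"\bshe\b", t.lower()) else 0.3 for t in texts]


def counterfactual_score_diffs(text, original_token, replacement_tokens, predict):
    """Replace one token by each alternative and return the score differences."""
    base_score = predict([text])[0]
    # Match the token as a whole word, ignoring case.
    pattern = re.compile(rf"\b{re.escape(original_token)}\b", flags=re.IGNORECASE)
    diffs = {}
    for token in replacement_tokens:
        counterfactual_text = pattern.sub(token, text)
        diffs[token] = predict([counterfactual_text])[0] - base_score
    return diffs


diffs = counterfactual_score_diffs(
    "He wrote a long comment.", "he", ["she", "they"], toy_predict_proba
)
print(diffs)
```

A difference close to zero suggests the swapped token has little influence on the prediction, while a large difference hints that the model treats the tokens of a concept differently.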

# Installation
First install the Biaslyze Python package (uncomment the line below if it is not installed yet):

In [None]:
#!pip install biaslyze

In [None]:
import numpy as np
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

## Load and prepare data

In [None]:
df = pd.read_csv("../data/jigsaw-toxic-comment-classification/train.csv")
df.head()

## Make the classification problem binary
The dataset labels each comment with six toxicity categories (multi-label). To get a binary problem, we add a `target` column that is `True` if at least one of these labels is set and `False` otherwise.

In [None]:
df["target"] = df[["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]].sum(axis=1) > 0

## Train a BoW-model

In [None]:
train_df, test_df = train_test_split(df, test_size=0.05, random_state=42)

In [None]:
clf = make_pipeline(
    TfidfVectorizer(min_df=10, max_features=30000, stop_words="english"),
    LogisticRegression(C=10),
)

In [None]:
clf.fit(train_df.comment_text, train_df.target)

In [None]:
y_pred = clf.predict(test_df.comment_text)

In [None]:
score = accuracy_score(test_df.target, y_pred)
print("Test accuracy: {:.2%}".format(score))
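
Accuracy on its own can look good when one class dominates, so it helps to compare it with a trivial baseline. The sketch below is an illustration on made-up labels, not a result from this dataset; `majority_baseline_accuracy` is a hypothetical helper that returns the accuracy of always predicting the most frequent class.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    counts = Counter(labels)
    _, majority_count = counts.most_common(1)[0]
    return majority_count / len(labels)


# Made-up labels for illustration: 9 non-toxic comments and 1 toxic comment.
toy_labels = [False] * 9 + [True]
baseline = majority_baseline_accuracy(toy_labels)
print("Majority-class baseline accuracy: {:.2%}".format(baseline))
```

In this notebook the same helper could be called as `majority_baseline_accuracy(test_df.target.tolist())` and compared with the test accuracy printed above.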

## Evaluate the model for bias

Now that we have a model to test, let's evaluate it with the Biaslyze tool and check the sensitive concepts for possible bias.

In [None]:
from biaslyze.bias_detectors import CounterfactualBiasDetector

bias_detector = CounterfactualBiasDetector()

In [None]:
counterfactual_detection_results = bias_detector.process(
    texts=test_df.comment_text,
    labels=test_df.target.tolist(),
    predict_func=clf.predict_proba,
    concepts_to_consider=["religion", "gender"],
    max_counterfactual_samples=None,
)

In [None]:
counterfactual_detection_results.visualize_counterfactual_scores(concept="gender", top_n=20)

In [None]:
counterfactual_detection_results.visualize_counterfactual_scores(concept="religion", top_n=15)

In [None]:
counterfactual_detection_results.visualize_counterfactual_sample_scores(concept="gender", top_n=15)

In [None]:
counterfactual_detection_results.visualize_counterfactual_scores(concept="religion", top_n=20)

In [None]:
from bokeh.io import show, output_notebook

output_notebook()

In [None]:
viz = counterfactual_detection_results.visualize_counterfactual_score_by_sample_histogram()
show(viz)