# Test biaslyze with the toxic comments dataset

Data source: https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
import numpy as np
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.metrics import accuracy_score
import matplotlib.pyplot as plt

## Load and prepare data

In [3]:
df = pd.read_csv("../data/jigsaw-toxic-comment-classification/train.csv")
df.head()

| | id | comment_text | toxic | severe_toxic | obscene | threat | insult | identity_hate |
|---|---|---|---|---|---|---|---|---|
| 0 | 0000997932d777bf | Explanation\nWhy the edits made under my usern... | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | 000103f0d9cfb60f | D'aww! He matches this background colour I'm s... | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | 000113f07ec002fd | Hey man, I'm really not trying to edit war. It... | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | 0001b41b1c6bb37e | "\nMore\nI can't make any real suggestions on ... | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 0001d958c54c6e35 | You, sir, are my hero. Any chance you remember... | 0 | 0 | 0 | 0 | 0 | 0 |


In [4]:
# make the classification problem binary
df["target"] = df[["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]].sum(axis=1) > 0
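
To make the row-wise reduction concrete, here is a minimal sketch on a made-up three-row frame (the values are invented for illustration, not taken from the dataset). A comment counts as positive when at least one of the six label columns is set.

```python
import pandas as pd

label_cols = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

# Toy rows, invented for illustration only.
toy = pd.DataFrame(
    [
        [0, 0, 0, 0, 0, 0],  # clean comment
        [1, 0, 1, 0, 0, 0],  # toxic and obscene
        [0, 0, 0, 0, 0, 1],  # identity hate only
    ],
    columns=label_cols,
)

# Same reduction as in the cell above: positive if any label is set.
toy["target"] = toy[label_cols].sum(axis=1) > 0
toy_targets = toy["target"].tolist()
print(toy_targets)
```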

## Train a BoW-model

In [5]:
clf = make_pipeline(TfidfVectorizer(min_df=10, max_features=30000, stop_words="english"), LogisticRegression(C=10))

In [6]:
clf.fit(df.comment_text, df.target)

lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
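
The warning above means the solver hit its iteration cap (scikit-learn's default `max_iter` for `LogisticRegression` is 100) before converging. One possible remedy, sketched here on a tiny invented corpus, is to pass a larger `max_iter`. The value 1000 is an assumption rather than a tuned setting, and refitting the notebook's model this way may change the numbers reported further down.

```python
import warnings

from sklearn.exceptions import ConvergenceWarning
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy corpus, invented for illustration only.
toy_texts = ["you are great", "you are awful", "what a nice edit", "what an awful edit"] * 5
toy_labels = [0, 1, 0, 1] * 5

# Same pipeline shape as above, but with a higher iteration cap.
toy_clf = make_pipeline(TfidfVectorizer(), LogisticRegression(C=10, max_iter=1000))

# Record any warnings raised during fitting so we can inspect them.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    toy_clf.fit(toy_texts, toy_labels)

convergence_warnings = [w for w in caught if issubclass(w.category, ConvergenceWarning)]
print(len(convergence_warnings))
```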


In [7]:
# accuracy on the training data itself, not a held-out estimate
train_pred = clf.predict(df.comment_text)
print(accuracy_score(df.target, train_pred))

0.9753589311341033
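
Since this accuracy is measured on the data the model was fitted on, it likely overstates how well the model generalizes. A minimal sketch of a held-out evaluation with `train_test_split` follows. The toy corpus, the 25% test share, and the name `holdout_clf` are assumptions for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Toy corpus, invented for illustration only.
toy_texts = ["thanks for the helpful edit", "you are a stupid idiot"] * 20
toy_labels = [0, 1] * 20

# Hold out a stratified 25% of the samples for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    toy_texts, toy_labels, test_size=0.25, stratify=toy_labels, random_state=0
)

holdout_clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
holdout_clf.fit(X_train, y_train)

# Accuracy on samples the model has not been fitted on.
holdout_accuracy = accuracy_score(y_test, holdout_clf.predict(X_test))
print(holdout_accuracy)
```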


## Counterfactual-based bias detection

In [8]:
from biaslyze.bias_detectors import CounterfactualBiasDetector

The dash_html_components package is deprecated. Please replace
`import dash_html_components as html` with `from dash import html`
  import dash_html_components as html


In [9]:
bias_detector = CounterfactualBiasDetector(lang="en")

In [10]:
# draw texts and labels from the same rows so they stay aligned
sample_df = df.sample(10000, random_state=42)
counterfactual_detection_results = bias_detector.process(
    texts=sample_df.comment_text,
    labels=sample_df.target.tolist(),
    predict_func=clf.predict_proba,
    concepts_to_consider=["religion", "gender"],  # , "nationality", "ethnicity"]
    max_counterfactual_samples=None,
)

2023-08-11 15:32:30.708 | INFO     | biaslyze.concept_detectors:detect:47 - Started keyword-based concept detection on 10000 texts...
100%|██████████| 10000/10000 [00:00<00:00, 49588.79it/s]
2023-08-11 15:32:30.917 | INFO     | biaslyze.concept_detectors:detect:64 - Done. Found 8934 texts with protected concepts.
2023-08-11 15:32:30.918 | INFO     | biaslyze.bias_detectors.counterfactual_biasdetector:process:163 - Processing concept religion...
100%|██████████| 8934/8934 [00:50<00:00, 175.70it/s]

In [11]:
counterfactual_detection_results.report()

Concept: religion		Max-Mean Counterfactual Score: 0.01744		Max-Std Counterfactual Score: 0.04918
Concept: gender		Max-Mean Counterfactual Score: 0.03883		Max-Std Counterfactual Score: 0.07872
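
The notebook does not show how these scores are computed. As a rough intuition only, the sketch below assumes a simplified definition: swap a concept keyword for an alternative and take the difference in the predicted positive-class probability. The helper `counterfactual_scores`, the toy corpus, and the keyword lists are hypothetical and are not biaslyze's implementation.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy corpus, invented for illustration only.
toy_texts = ["he is a nice person", "she is a nice person", "he is an idiot", "she is an idiot"] * 5
toy_labels = [0, 0, 1, 1] * 5
toy_clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
toy_clf.fit(toy_texts, toy_labels)


def counterfactual_scores(text, keyword, alternatives, predict_func):
    """Hypothetical helper: P(positive | swapped text) - P(positive | original text)."""
    original = predict_func([text])[0, 1]
    # Replace whole tokens only, so "he" does not match inside "she" or "the".
    swapped = [
        " ".join(alt if tok == keyword else tok for tok in text.split())
        for alt in alternatives
    ]
    return np.array([predict_func([s])[0, 1] for s in swapped]) - original


scores = counterfactual_scores("he is an idiot", "he", ["she", "they"], toy_clf.predict_proba)
print(scores)
```

Under this assumed definition, a score near zero means the prediction barely moves when the keyword changes, while a large absolute score means the model reacts to the keyword itself.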


In [12]:
print(counterfactual_detection_results.concept_results[0].omitted_keywords)

[]


In [13]:
counterfactual_detection_results.dashboard(num_keywords=10)

## Can counterfactual samples reduce bias?

In [14]:
counterfactual_samples = counterfactual_detection_results._get_counterfactual_samples_by_concept(concept="religion")
len(counterfactual_samples)

9481

In [15]:
# prepare texts, labels, and a constant weight of 2 per counterfactual sample
counterfactual_texts = [sample.text for sample in counterfactual_samples]
counterfactual_labels = [sample.label for sample in counterfactual_samples]
counterfacutal_weights = [2] * len(counterfactual_samples)

In [16]:
# weight each original sample 1, followed by the counterfactual sample weights
sample_weights = [1] * len(df) + counterfacutal_weights

In [17]:
# retrain
clf.fit(
    df.comment_text.tolist() + counterfactual_texts,
    df.target.tolist() + counterfactual_labels,
    logisticregression__sample_weight=sample_weights
)


lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
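
The keyword argument `logisticregression__sample_weight` in the retraining cell follows scikit-learn's `<step name>__<parameter>` convention for passing fit parameters to a pipeline step; `make_pipeline` names each step after its lowercased class name. A small sketch on invented data, checking that upweighting the positive samples shifts the fitted model (the helper `fit_with_weights` is hypothetical):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy corpus, invented for illustration only.
toy_texts = ["good edit", "bad edit", "good work", "bad work"]
toy_labels = [0, 1, 0, 1]


def fit_with_weights(weights):
    """Fit a fresh pipeline, routing the weights to the LogisticRegression step."""
    pipe = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    pipe.fit(toy_texts, toy_labels, logisticregression__sample_weight=weights)
    return pipe


uniform = fit_with_weights([1, 1, 1, 1])
upweighted = fit_with_weights([1, 2, 1, 2])  # weight 2 on the positive samples

step_names = list(uniform.named_steps)
# "edit" appears equally in both classes, so it isolates the effect of the weights.
p_uniform = uniform.predict_proba(["edit"])[0, 1]
p_upweighted = upweighted.predict_proba(["edit"])[0, 1]
print(step_names, p_uniform, p_upweighted)
```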



In [18]:
# evaluate again on a fresh random sample, keeping texts and labels aligned
eval_df = df.sample(10000)
mitigaed_counterfactual_detection_results = bias_detector.process(
    texts=eval_df.comment_text,
    labels=eval_df.target.tolist(),
    predict_func=clf.predict_proba,
    concepts_to_consider=["religion"],  # , "gender", "nationality", "ethnicity"]
    max_counterfactual_samples=None,
)

2023-08-11 15:41:37.267 | INFO     | biaslyze.concept_detectors:detect:47 - Started keyword-based concept detection on 10000 texts...
100%|██████████| 10000/10000 [00:00<00:00, 62586.33it/s]
2023-08-11 15:41:37.439 | INFO     | biaslyze.concept_detectors:detect:64 - Done. Found 279 texts with protected concepts.
2023-08-11 15:41:37.442 | INFO     | biaslyze.bias_detectors.counterfactual_biasdetector:process:163 - Processing concept religion...
100%|██████████| 279/279 [00:06<00:00, 41.19it/s]

In [19]:
mitigaed_counterfactual_detection_results.report()

Concept: religion		Max-Mean Counterfactual Score: 0.01335		Max-Std Counterfactual Score: 0.04106
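
Putting the two religion reports side by side, the relative change can be computed directly from the printed numbers. The evaluation run draws its sample without a fixed `random_state`, so the two runs score different texts and the comparison is indicative rather than exact. The helper `relative_reduction` is a hypothetical name used only here.

```python
# Scores copied from the two report() outputs for the concept "religion".
before_mean, before_std = 0.01744, 0.04918
after_mean, after_std = 0.01335, 0.04106


def relative_reduction(before, after):
    """Fraction by which the score dropped relative to its original value."""
    return (before - after) / before


mean_reduction = relative_reduction(before_mean, after_mean)
std_reduction = relative_reduction(before_std, after_std)
print(f"{mean_reduction:.1%}, {std_reduction:.1%}")  # → 23.5%, 16.5%
```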


In [20]:
mitigaed_counterfactual_detection_results.dashboard()


Port 8090 is already in use. Using next free port 8091 instead.

