# Building a classifier for identifying personal attacks with the Wikipedia Database

We take the fantastic [Jigsaw Paper](http://papers.www2017.com.au.s3-website-ap-southeast-2.amazonaws.com/proceedings/p1391.pdf) and its associated data as inspiration for building our own classifier that detects personal attacks in text. Credit goes to [this notebook](https://github.com/ewulczyn/wiki-detox/blob/master/src/figshare/Wikipedia%20Talk%20Data%20-%20Getting%20Started.ipynb) for expediting the process.

In [28]:
import pandas as pd
import numpy as np
import string
import urllib.request
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV

The cell below downloads the labelled dataset from Wikipedia. Volunteers voted on whether each piece of text from the Wikipedia Talk pages was abusive or not. Note that a model trained on this data may not work well for tweets: when swearing occurs on Wikipedia it is usually indicative of abuse, whereas on Twitter this is not the case. However, it is far better than standard sentiment analysis or querying a select corpus!

In [2]:
# download annotated comments and annotations

ANNOTATED_COMMENTS_URL = 'https://ndownloader.figshare.com/files/7554634' 
ANNOTATIONS_URL = 'https://ndownloader.figshare.com/files/7554637' 


def download_file(url, fname):
    urllib.request.urlretrieve(url, fname)

download_file(ANNOTATED_COMMENTS_URL, 'attack_annotated_comments.tsv')
download_file(ANNOTATIONS_URL, 'attack_annotations.tsv')

In [3]:
# panda-fy the data
comments = pd.read_csv('attack_annotated_comments.tsv', sep = '\t', index_col = 0)
annotations = pd.read_csv('attack_annotations.tsv',  sep = '\t')

In [5]:
# scikit-learn's LogisticRegression needs discrete class labels, so use a 'voting' approach:
# a comment is labelled an attack when more than half of its annotators marked it as one
labels = annotations.groupby('rev_id')['attack'].mean() > 0.5

In [6]:
# join labels and comments
comments['attack'] = labels
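
To make the majority vote and the index-aligned join concrete, here is a minimal sketch on toy data. The `toy_annotations` and `toy_comments` frames and their values are made up for illustration; only the column names `rev_id`, `attack`, and `comment` come from the dataset.

```python
import pandas as pd

# Hypothetical annotations: three revisions, each rated by several annotators
toy_annotations = pd.DataFrame({
    'rev_id': [1, 1, 1, 2, 2, 2, 3, 3],
    'attack': [1, 1, 0, 0, 0, 1, 1, 0],
})

# A revision is labelled an attack only if strictly more than half of its annotators said so
toy_labels = toy_annotations.groupby('rev_id')['attack'].mean() > 0.5

# Hypothetical comments indexed by rev_id, deliberately in a different row order
toy_comments = pd.DataFrame(
    {'comment': ['third', 'first', 'second']},
    index=pd.Index([3, 1, 2], name='rev_id'),
)

# Column assignment aligns on the index (rev_id), not on row position
toy_comments['attack'] = toy_labels
print(toy_comments)
```

Revision 3 has a one-to-one tie, so its mean is exactly 0.5 and it is labelled `False`.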

In [7]:
# remove newline and tab tokens
comments['comment'] = comments['comment'].apply(lambda x: x.replace("NEWLINE_TOKEN", " "))
comments['comment'] = comments['comment'].apply(lambda x: x.replace("TAB_TOKEN", " "))

In [8]:
comments.query('attack')['comment'].head()

rev_id
801279             Iraq is not good  ===  ===  USA is bad   
2702703      ____ fuck off you little asshole. If you wan...
4632658         i have a dick, its bigger than yours! hahaha
6545332      == renault ==  you sad little bpy for drivin...
6545351      == renault ==  you sad little bo for driving...
Name: comment, dtype: object

In [10]:
# fit a logistic regression on word n-grams of length 1 to 5 (takes a while)

# note that the data already has train/test labels
train_comments = comments.query("split=='train'")
test_comments = comments.query("split=='test'")

clf = Pipeline([
    ('tfidf', TfidfVectorizer(max_features = 10000, ngram_range = (1,5), analyzer='word', norm='l2')),
    ('clf', LogisticRegression()),
])
clf = clf.fit(train_comments['comment'], train_comments['attack'])
auc = roc_auc_score(test_comments['attack'], clf.predict_proba(test_comments['comment'])[:, 1])
print('Test ROC AUC: %.3f' %auc)

Test ROC AUC: 0.955
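
ROC AUC measures how well the scores rank attacks above non-attacks, which is why the cell above passes the positive-class column of `predict_proba` rather than the hard output of `predict`. A small sketch with made-up labels and scores (all values here are assumptions for illustration) shows the difference:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Made-up ground truth and predicted probabilities of the positive class
y_true = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.4, 0.45, 0.8])

# AUC computed from the continuous scores
auc_from_scores = roc_auc_score(y_true, y_score)

# AUC computed from hard predictions at a 0.5 threshold, which discards the ranking
y_pred = (y_score > 0.5).astype(int)
auc_from_labels = roc_auc_score(y_true, y_pred)

print(auc_from_scores, auc_from_labels)
```

Here the scores rank both positives above both negatives (AUC 1.0), but thresholding at 0.5 turns one positive into a tie with the negatives and the AUC drops to 0.75.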


In [65]:
# correctly classify nice comment
clf.predict(['Thanks for you contribution, you did a great job!'])

array([False], dtype=bool)

Good, so it's not saying this is rude.

In [17]:
# correctly classify nasty comment with obfuscations
clf.predict(['You are a f** cu*t!'])

array([ True], dtype=bool)

The classifier also flags this obfuscated attack, although a single example does not show that it handles obfuscation in general.
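
It is worth checking what the vectorizer actually sees for the obfuscated comment. The sketch below uses `build_analyzer()` to list the n-grams produced with the notebook's `analyzer='word'` setting, and compares them with `analyzer='char'`, an alternative the notebook does not use. The variable names are assumptions for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

text = 'You are a f** cu*t!'

# Same analyzer settings as the notebook's vectorizer: word n-grams of length 1 to 5
word_analyzer = TfidfVectorizer(ngram_range=(1, 5), analyzer='word').build_analyzer()
word_grams = word_analyzer(text)

# Alternative, not used in the notebook: character n-grams of length 1 to 5
char_analyzer = TfidfVectorizer(ngram_range=(1, 5), analyzer='char').build_analyzer()
char_grams = char_analyzer(text)

print(word_grams)
print('asterisk in word n-grams:', any('*' in gram for gram in word_grams))
print('asterisk in char n-grams:', any('*' in gram for gram in char_grams))
```

The default word tokenizer drops punctuation and single-character tokens, so the asterisks never reach the model and the prediction rests on the remaining words. Character n-grams keep the asterisks and may be a better fit if robustness to obfuscation is the goal.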

# Tuning Hyperparameters

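The cells below try to improve our logistic regression model with a grid search over its hyperparameters (the inverse regularisation strength `C` and the penalty type), using 4-fold cross-validation. Because no `scoring` argument is passed, `GridSearchCV` uses the estimator's default score, which for `LogisticRegression` is accuracy rather than ROC AUC.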

In [27]:
# transform the comments to tfidf

tfidf_v =  TfidfVectorizer(max_features = 10000, ngram_range = (1,5), analyzer='word', norm='l2')
out_spars = tfidf_v.fit_transform(comments['comment'])

In [30]:
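# liblinear supports both the l1 and l2 penalties (recent scikit-learn defaults to lbfgs, which rejects l1)
clf = LogisticRegression(solver='liblinear')
params = {'C': [1, 10, 100, 1000], 'penalty': ['l1', 'l2']}

# return_train_score=True is needed in recent scikit-learn to get mean_train_score in cv_results_
gscv = GridSearchCV(clf, cv=4, param_grid=params, return_train_score=True)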

In [31]:
gscv.fit(out_spars,comments['attack'])
gscv.best_params_

{'C': 1, 'penalty': 'l1'}
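
One variation worth considering, sketched here on a made-up toy corpus: put the vectorizer and the classifier in a single `Pipeline` and search over that, so the TF-IDF vocabulary is refit inside each fold rather than once on all comments, and pass `scoring='roc_auc'` so the search optimises the same metric reported earlier. The corpus, the reduced grid, and the names `pipe` and `search` are assumptions for illustration; on the real data you would pass the training split instead.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Made-up toy corpus standing in for the Wikipedia comments
nice = ['thanks for the helpful edit', 'great work on this article',
        'i appreciate your contribution', 'nice sources, well done']
nasty = ['you are an idiot', 'you are a stupid fool',
         'shut up you moron', 'you idiot, get lost']
texts = nice * 5 + nasty * 5
is_attack = np.array([False] * 20 + [True] * 20)

pipe = Pipeline([
    ('tfidf', TfidfVectorizer(ngram_range=(1, 2), analyzer='word')),
    ('clf', LogisticRegression(solver='liblinear')),
])

# Pipeline parameters are addressed as <step name>__<parameter name>
param_grid = {'clf__C': [1, 10], 'clf__penalty': ['l1', 'l2']}

search = GridSearchCV(pipe, param_grid=param_grid, cv=4, scoring='roc_auc')
search.fit(texts, is_attack)
print(search.best_params_, search.best_score_)
```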

In [32]:
gscv.cv_results_

{'mean_fit_time': array([  5.61631161,   5.51555151,  10.01550138,  10.03600347,
         50.25377482,  14.96274614,  86.61616087,  55.14251369]),
 'mean_score_time': array([ 0.0225023 ,  0.02600265,  0.01900196,  0.02100217,  0.02125216,
         0.02050203,  0.02050203,  0.02025205]),
 'mean_test_score': array([ 0.94428813,  0.94051647,  0.93816026,  0.9417593 ,  0.92476524,
         0.93384485,  0.92048436,  0.92488607]),
 'mean_train_score': array([ 0.94892   ,  0.94751318,  0.96419653,  0.9597977 ,  0.97028989,
         0.96686633,  0.9710609 ,  0.97002521]),
 'param_C': masked_array(data = [1 1 10 10 100 100 1000 1000],
              mask = [False False False False False False False False],
        fill_value = ?),
 'param_penalty': masked_array(data = ['l1' 'l2' 'l1' 'l2' 'l1' 'l2' 'l1' 'l2'],
              mask = [False False False False False False False False],
        fill_value = ?),
 'params': ({'C': 1, 'penalty': 'l1'},
  {'C': 1, 'penalty': 'l2'},
  {'C': 10, 'penalty': 

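Conclusion: Use L1 regularisation with C = 1, which gives the highest mean cross-validated score (about 0.944). Larger values of C raise the training score but lower the test score, which suggests overfitting.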