## Introduction

This notebook describes a two-step process for classifying how offensive a comment is. It is an attempt at the Kaggle Toxic Comment Classification Challenge: https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge.

The hypothesis is that two-step classification gives better results, because offensive and non-offensive comments are easier to tell apart when all offence types are pooled in the first step. Offending comments make up only about 10% of all comments, so classifying each offence class separately from the start could leave the smaller offence classes under-predicted. To address this, we propose a two-step classification process.

The first step is to train a binary classifier that identifies whether a comment is offensive. Every sample with at least one offence category marked as 1 is labelled 1, and all others are labelled 0. This step has two versions:

Version 1, where the model is trained on the "as is" data.

Version 2, where offence comments are over-sampled so that they have roughly equal representation to the non-offence class.

The second step is trained purely on offence data and is used to distinguish between offence types (that is, to score each offence). This step also has two versions:

Version 1, where the model is trained on the "as is" offence data.

Version 2, where comments from the smaller offence classes are over-sampled so that they have roughly equal representation to the larger offence classes.

An additional property of this two-step classification is that the second step also serves as a quality check for the first. When the probability score for every offence type is really low, the first step has probably made a mistake, and we can discard the comment as non-offensive.
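
The quality check described above can be sketched as follows. This is only an illustration: the function name `discard_low_confidence` and the threshold of 0.2 are assumptions made for this example, not values taken from the notebook, and a real threshold would need tuning on validation data.

```python
import numpy as np

def discard_low_confidence(is_offensive, class_probs, threshold=0.2):
    """Revert step-one positives whose step-two scores are all below threshold.

    `threshold` is a hypothetical value chosen for illustration only.
    """
    is_offensive = np.asarray(is_offensive, dtype=bool)
    class_probs = np.asarray(class_probs, dtype=float)
    # A comment stays flagged only if at least one offence type scores high enough.
    has_confident_class = (class_probs >= threshold).any(axis=1)
    return is_offensive & has_confident_class

# Step-one decisions and step-two per-offence scores for three toy comments
step_one = [True, True, False]
step_two = [[0.90, 0.05, 0.40],   # clearly offensive
            [0.03, 0.01, 0.02],   # every score is low: likely a step-one mistake
            [0.80, 0.10, 0.10]]   # not flagged in step one, so it stays unflagged
final_flags = discard_low_confidence(step_one, step_two)
```

Only comments that step one already flagged can remain flagged, so the check can remove false positives but never adds new positives.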

## Implementation

### Reading in and exploring the datasets

In [20]:
import pandas as pd
import numpy as np
from pandas import Series

In [21]:
data = pd.read_csv('train.csv')
data.iloc[0,:]

id                                                        22256635
comment_text     Nonsense?  kiss off, geek. what I said is true...
toxic                                                            1
severe_toxic                                                     0
obscene                                                          0
threat                                                           0
insult                                                           0
identity_hate                                                    0
Name: 0, dtype: object

In [25]:
data.shape

(95851, 9)

### Classifying offence vs accepted

Add a binary label column indicating whether each comment is an offence.

In [26]:
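# The six offence label columns; selecting them by name keeps id and comment_text out of the sum
label_cols = ['toxic', 'severe_toxic', 'obscene', 'threat', 'insult', 'identity_hate']
# 1 if the comment carries at least one offence label, otherwise 0
data['binary_label'] = (data[label_cols].sum(axis=1) >= 1).astype(int)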

The proportion of offending comments is:

In [28]:
sum(data['binary_label']/len(data['binary_label']))

0.1021376928774943

Now let's train the classifier to identify whether a comment is offensive.
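
The notebook does not specify which model is used for this step. As a minimal sketch, assuming a TF-IDF bag-of-words representation and logistic regression from scikit-learn (both are assumptions for illustration, as are the toy comments), the binary step could look like this:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy stand-ins for data['comment_text'] and data['binary_label']
toy_comments = [
    "you are an idiot",
    "shut up you fool",
    "what a stupid idiot",
    "thanks for the helpful edit",
    "I agree with this change",
    "please add a source for this claim",
]
toy_labels = [1, 1, 1, 0, 0, 0]

# Vectorise the text and fit a linear classifier in a single pipeline
binary_clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
binary_clf.fit(toy_comments, toy_labels)

# Probability that a new comment is offensive (column 1 corresponds to label 1)
offence_probability = binary_clf.predict_proba(["you fool"])[:, 1]
```

On the real data, the same pipeline would be fitted on a training split of `data['comment_text']` and `data['binary_label']`, with a held-out split kept for evaluation.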

Next, train with the previously mentioned oversampling approach.
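
One simple way to over-sample is to duplicate randomly chosen minority-class rows until both classes have the same size. The sketch below assumes plain random oversampling with replacement; the helper name `oversample_minority` and the toy frame are hypothetical, and the notebook may have intended a different resampling scheme.

```python
import pandas as pd

def oversample_minority(frame, label_col="binary_label", seed=0):
    """Duplicate random rows of the smaller classes until all classes match the largest."""
    counts = frame[label_col].value_counts()
    majority_size = counts.max()
    parts = []
    for label_value, count in counts.items():
        group = frame[frame[label_col] == label_value]
        if count < majority_size:
            # Draw the missing rows with replacement from the same class
            extra = group.sample(n=majority_size - count, replace=True, random_state=seed)
            group = pd.concat([group, extra])
        parts.append(group)
    # Shuffle so the classes are interleaved rather than stacked
    return pd.concat(parts).sample(frac=1, random_state=seed).reset_index(drop=True)

# Toy frame with the same roughly 10% offence rate as the real data
toy = pd.DataFrame({
    "comment_text": [f"comment {i}" for i in range(10)],
    "binary_label": [1] + [0] * 9,
})
balanced = oversample_minority(toy)
```

Oversampling should be applied to the training split only; duplicating rows before splitting would leak copies of the same comment into the evaluation set.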

### Classifying offence data

In [18]:
# Keep only the comments flagged as offences by the binary label created earlier
offence_data = data[data['binary_label'] == 1]

Train the classifier on the "as is" offence data.
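
Because a comment can belong to several offence classes at once, the second step is a multi-label problem. A minimal sketch, assuming one logistic regression per offence type via scikit-learn's `OneVsRestClassifier` (an assumed model choice, with toy comments and only three of the six label columns for brevity):

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline

# Subset of the offence columns, used here only to keep the toy example small
toy_label_cols = ["toxic", "obscene", "insult"]

# Toy stand-ins for offence_data['comment_text'] and the offence label columns
toy_offences = [
    "you are an idiot",
    "what a load of crap",
    "stupid crap from an idiot",
    "shut up, this is garbage",
]
toy_offence_labels = np.array([
    [1, 0, 1],
    [0, 1, 0],
    [1, 1, 1],
    [1, 0, 0],
])

# One binary classifier per offence type, sharing the same TF-IDF features
offence_clf = make_pipeline(
    TfidfVectorizer(),
    OneVsRestClassifier(LogisticRegression()),
)
offence_clf.fit(toy_offences, toy_offence_labels)

# One probability per offence type, in the order of toy_label_cols
offence_scores = offence_clf.predict_proba(["you idiot"])
```

The per-class probabilities produced here are the scores that the quality check on the first step would inspect.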

Now let's train with the previously mentioned oversampling approach.
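
For the offence classes, one possible scheme is to top up every class with random duplicates until it matches the largest class. The helper `oversample_rare_labels` below is a hypothetical sketch of that idea, not the notebook's confirmed method. Because a comment can carry several labels, duplicating rows for a rare class also raises the counts of any co-occurring classes, so the resulting balance is only approximate on real data.

```python
import pandas as pd

def oversample_rare_labels(frame, label_cols, seed=0):
    """Append random duplicates of each rarer label's rows up to the largest label count."""
    target = int(frame[label_cols].sum().max())
    extras = []
    for col in label_cols:
        positives = frame[frame[col] == 1]
        shortfall = target - len(positives)
        if len(positives) > 0 and shortfall > 0:
            # Draw the missing rows with replacement from this label's positives
            extras.append(positives.sample(n=shortfall, replace=True, random_state=seed))
    return pd.concat([frame] + extras, ignore_index=True)

# Toy offence frame with non-overlapping labels, so the result is exactly balanced
toy_cols = ["toxic", "threat", "insult"]
toy_offence_frame = pd.DataFrame({
    "comment_text": [f"offence {i}" for i in range(6)],
    "toxic":  [1, 1, 1, 1, 0, 0],
    "threat": [0, 0, 0, 0, 1, 0],
    "insult": [0, 0, 0, 0, 0, 1],
})
balanced_offences = oversample_rare_labels(toy_offence_frame, toy_cols)
```

In the toy frame the labels never overlap, which is why every class ends up with exactly the same count; with overlapping labels the counts would only be brought closer together.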

### Ideas and code testing

In [32]:
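import spacy
# The 'en' shortcut was removed in spaCy 3; load the small English model by its full name
# (install it first with: python -m spacy download en_core_web_sm)
nlp = spacy.load('en_core_web_sm')
doc = nlp('This is a sentence.')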

In [31]:
import keras

Using TensorFlow backend.
