# Attention Toxicity Model

This notebook trains a variant of the toxicity model that adds an attention mechanism to the base model described below.

The base model is a CNN for text classification, trained on the [Wikipedia Talk Labels: Toxicity dataset](https://figshare.com/articles/Wikipedia_Talk_Labels_Toxicity/4563973) with pre-trained GloVe embeddings, which can be found at:
http://nlp.stanford.edu/data/glove.6B.zip
(source page: http://nlp.stanford.edu/projects/glove/).
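
For intuition about what an attention mechanism contributes, the minimal sketch below pools a sequence of per-token feature vectors into a single vector using softmax-normalized scores. It is not the implementation in `model_tool`, which is not shown in this notebook; the function name `attention_pool`, the scoring vector `w`, and the toy inputs are assumptions made purely for illustration.

```python
import math


def attention_pool(features, w):
    """Pool per-token feature vectors into one vector with softmax attention.

    features: list of equal-length lists, one vector per token.
    w: hypothetical scoring vector, same length as each feature vector.
    Returns (weights, pooled).
    """
    # Score each token by a dot product with the scoring vector.
    scores = [sum(f_i * w_i for f_i, w_i in zip(f, w)) for f in features]
    # Softmax over the scores, subtracting the max for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Attention output: weighted sum of the token vectors.
    dim = len(features[0])
    pooled = [sum(wt * f[d] for wt, f in zip(weights, features))
              for d in range(dim)]
    return weights, pooled


# Toy input: three tokens with two features each.
weights, pooled = attention_pool([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
                                 w=[0.5, 2.0])
print(weights, pooled)
```

The weights sum to one, so tokens with higher scores contribute more to the pooled vector. In a trained model the scoring parameters would be learned rather than fixed by hand.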

This model is adapted from [example code](https://github.com/fchollet/keras/blob/master/examples/pretrained_word_embeddings.py) in the [Keras GitHub repository](https://github.com/fchollet/keras), released under an [MIT license](https://github.com/fchollet/keras/blob/master/LICENSE). The full license text is available [online](https://github.com/fchollet/keras/blob/master/LICENSE) and in this repository in the file KERAS_LICENSE.

## Usage Instructions

Prior to running the notebook, you must:

* Download the [Wikipedia Talk Labels: Toxicity dataset](https://figshare.com/articles/Wikipedia_Talk_Labels_Toxicity/4563973)
* Download pre-trained [GloVe embeddings](http://nlp.stanford.edu/data/glove.6B.zip)
* (optional) To skip the training step, you will need to download a model and tokenizer file. We are looking into the appropriate means for distributing these (sometimes large) files.
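
A quick check along these lines can confirm that the expected files are in place before any cell runs. The CSV paths mirror the ones this notebook reads from `../data/`; how those files are produced from the raw download is not covered here. The GloVe location `../data/glove.6B.zip` and the helper name `missing_files` are assumptions, so adjust them to your setup.

```python
import os


def missing_files(paths):
    """Return the subset of paths that do not exist on disk."""
    return [p for p in paths if not os.path.exists(p)]


required = ['../data/wiki_%s.csv' % split for split in ('train', 'dev', 'test')]
# Assumed location of the GloVe archive; the notebook does not specify one.
required.append('../data/glove.6B.zip')

for path in missing_files(required):
    print('Missing prerequisite: %s' % path)
```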

In [None]:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import pandas as pd

from model_tool import *

## Load Data

In [None]:
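SPLITS = ['train', 'dev', 'test']

# Map each split name to its CSV path for the three dataset variants.
# Note: `random` below is a dict of paths, not the standard-library module.
wiki = {}
debias = {}
random = {}
for split in SPLITS:
    wiki[split] = '../data/wiki_%s.csv' % split
    debias[split] = '../data/wiki_debias_%s.csv' % split
    random[split] = '../data/wiki_debias_random_%s.csv' % split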
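
The training cells pass `text_column = 'comment'` and `label_column = 'is_toxic'`, so each CSV needs both columns. The sketch below checks a file's header row for them using only the standard library. The helper name `missing_columns` and the in-memory sample (including its `rev_id` column) are illustrative assumptions, not part of `model_tool` or the real data.

```python
import csv
import io


def missing_columns(csv_file, required=('comment', 'is_toxic')):
    """Return the required column names absent from the CSV header row."""
    header = next(csv.reader(csv_file), [])
    return [name for name in required if name not in header]


# Illustrative in-memory file; in the notebook you could pass
# an open file handle such as open(wiki['train']) instead.
sample = io.StringIO(u'rev_id,comment,is_toxic\n1,hello there,False\n')
print(missing_columns(sample))
```

An empty list means both columns are present.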

## Train Models

In [None]:
hparams = {'epochs': 20}

### Random attention model

In [None]:
# Train ten models, named with suffixes 100 through 109.
# `range` replaces `xrange`, which does not exist in Python 3.
model_names = ['atn_cnn_random_tox_v4_{}'.format(i) for i in range(100, 110)]
for model_name in model_names:
    random_model = AttentionToxModel(hparams=hparams)
    random_model.train(random['train'], random['dev'],
                       text_column='comment', label_column='is_toxic',
                       model_name=model_name)

In [None]:
# Scores only the last model trained in the loop above (suffix 109).
random_test = pd.read_csv(random['test'])
random_model.score_auc(random_test['comment'], random_test['is_toxic'])
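
`score_auc` is assumed here to report the area under the ROC curve; its implementation lives in `model_tool` and is not shown in this notebook. As a reference point for reading that number, the sketch below computes ROC AUC from its pairwise definition: the fraction of (toxic, non-toxic) pairs in which the toxic comment receives the higher score, with ties counted as one half. The function name `pairwise_auc` and the toy labels and scores are illustrative.

```python
def pairwise_auc(labels, scores):
    """ROC AUC as the share of positive/negative pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    if not pos or not neg:
        raise ValueError('AUC needs at least one positive and one negative')
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Three of the four positive/negative pairs are ordered correctly.
print(pairwise_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

This version is quadratic in the number of examples, so it is suited to small sanity checks rather than a full test set.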

### Plain attention wikipedia model

In [None]:
# Train ten models, named with suffixes 100 through 109.
# `range` replaces `xrange`, which does not exist in Python 3.
model_names = ['atn_cnn_wiki_tox_v4_{}'.format(i) for i in range(100, 110)]
for model_name in model_names:
    wiki_model = AttentionToxModel(hparams=hparams)
    wiki_model.train(wiki['train'], wiki['dev'],
                     text_column='comment', label_column='is_toxic',
                     model_name=model_name)

In [None]:
# Scores only the last model trained in the loop above (suffix 109).
wiki_test = pd.read_csv(wiki['test'])
wiki_model.score_auc(wiki_test['comment'], wiki_test['is_toxic'])

### Debiased attention model

In [None]:
# Train ten models, named with suffixes 100 through 109.
# `range` replaces `xrange`, which does not exist in Python 3.
model_names = ['atn_cnn_debias_tox_v4_{}'.format(i) for i in range(100, 110)]
for model_name in model_names:
    debias_model = AttentionToxModel(hparams=hparams)
    debias_model.train(debias['train'], debias['dev'],
                       text_column='comment', label_column='is_toxic',
                       model_name=model_name)

In [None]:
# Scores only the last model trained in the loop above (suffix 109).
debias_test = pd.read_csv(debias['test'])
debias_model.score_auc(debias_test['comment'], debias_test['is_toxic'])
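
Each training loop produces ten models, while each scoring cell evaluates a single one. If you collect one AUC per model, a summary like the sketch below gives the mean and sample standard deviation across runs. The helper name `summarize` is an assumption, and the numbers are placeholders for illustration only; they are not results from this notebook.

```python
import math


def summarize(values):
    """Return (mean, sample standard deviation) for a list of numbers."""
    n = len(values)
    mean = sum(values) / float(n)
    if n < 2:
        return mean, 0.0
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(variance)


# Placeholder values, NOT measured AUCs from these models.
example_aucs = [0.90, 0.92, 0.91]
mean_auc, std_auc = summarize(example_aucs)
print('AUC: %.3f +/- %.3f' % (mean_auc, std_auc))
```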