# Project 3


# Conversations Toxicity Detection

Jigsaw Unintended Bias in Toxicity Classification 

Detect toxicity across a diverse range of conversations


https://www.kaggle.com/c/jigsaw-unintended-bias-in-toxicity-classification/data#


Try Colab
https://albahnsen.com/2018/07/22/how-to-download-kaggle-data-into-google-colab/


## Description

## Background
At the end of 2017 the Civil Comments platform shut down and chose to make its ~2 million public comments available in a lasting open archive so that researchers could understand and improve civility in online conversations for years to come. Jigsaw sponsored this effort and extended the annotation of this data by human raters for various toxic conversational attributes.

In the data supplied for this competition, the text of the individual comment is found in the comment_text column. Each comment in Train has a toxicity label (target), and models should predict the target toxicity for the Test data. This attribute (and all others) are fractional values which represent the fraction of human raters who believed the attribute applied to the given comment. For evaluation, test set examples with target >= 0.5 will be considered to be in the positive class (toxic).
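
As a minimal sketch of the evaluation rule above, the snippet below converts fractional `target` values into the binary positive class. The sample values and the helper name `to_positive_class` are illustrative assumptions, not part of the competition data.

```python
# Hypothetical fractional labels: share of raters who marked each comment toxic.
targets = [0.0, 0.2, 0.5, 0.83, 1.0]

def to_positive_class(fraction, threshold=0.5):
    """Return True when a comment counts as toxic for evaluation (target >= 0.5)."""
    return fraction >= threshold

labels = [to_positive_class(t) for t in targets]
print(labels)
```

Note that a value of exactly 0.5 falls in the positive class because the comparison is inclusive.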

The data also has several additional toxicity subtype attributes. Models do not need to predict these attributes for the competition; they are included as an additional avenue for research. Subtype attributes are:

- severe_toxicity
- obscene
- threat
- insult
- identity_attack
- sexual_explicit

Additionally, a subset of comments have been labelled with a variety of identity attributes, representing the identities that are mentioned in the comment. The columns corresponding to identity attributes are listed below. Only identities with more than 500 examples in the test set (combined public and private) will be included in the evaluation calculation. These identities are shown in bold.

- **male**
- **female**
- transgender
- other_gender
- heterosexual
- **homosexual_gay_or_lesbian**
- bisexual
- other_sexual_orientation
- **christian**
- **jewish**
- **muslim**
- hindu
- buddhist
- atheist
- other_religion
- **black**
- **white**
- asian
- latino
- other_race_or_ethnicity
- physical_disability
- intellectual_or_learning_disability
- **psychiatric_or_mental_illness**
- other_disability

Note that the data contains different comments that can have the exact same text. Different comments that have the same text may have been labeled with different targets or subgroups.
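
One way to inspect such duplicates is to group rows by text and keep the texts whose labels disagree. This is only a sketch: the rows below are made up for illustration, and the real data would be read from `train.csv`.

```python
from collections import defaultdict

# Hypothetical (comment_text, target) rows; the real data lives in train.csv.
rows = [
    ("thanks for sharing", 0.0),
    ("this is nonsense", 0.6),
    ("this is nonsense", 0.2),
    ("thanks for sharing", 0.0),
]

# Collect every target value seen for each distinct text.
targets_by_text = defaultdict(set)
for comment_text, target in rows:
    targets_by_text[comment_text].add(target)

# Keep only the texts whose duplicates disagree on the label.
conflicting = {t: sorted(v) for t, v in targets_by_text.items() if len(v) > 1}
print(conflicting)
```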

## Examples
Here are a few examples of comments and their associated toxicity and identity labels. Label values range from 0.0 to 1.0 and represent the fraction of raters who believed the label fit the comment.

- Comment: i'm a white woman in my late 60's and believe me, they are not too crazy about me either!!
  - Toxicity Labels: All 0.0
  - Identity Mention Labels: female: 1.0, white: 1.0 (all others 0.0)
- Comment: Why would you assume that the nurses in this story were women?
  - Toxicity Labels: All 0.0
  - Identity Mention Labels: female: 0.8 (all others 0.0)
- Comment: Continue to stand strong LGBT community. Yes, indeed, you'll overcome and you have.
  - Toxicity Labels: All 0.0
  - Identity Mention Labels: homosexual_gay_or_lesbian: 0.8, bisexual: 0.6, transgender: 0.3 (all others 0.0)

In addition to the labels described above, the dataset also provides metadata from Jigsaw's annotation: toxicity_annotator_count and identity_annotator_count, and metadata from Civil Comments: created_date, publication_id, parent_id, article_id, rating, funny, wow, sad, likes, disagree. Civil Comments' label rating is the civility rating Civil Comments users gave the comment.

## Labelling Schema
To obtain the toxicity labels, each comment was shown to up to 10 annotators*. Annotators were asked to: "Rate the toxicity of this comment"

- Very Toxic (a very hateful, aggressive, or disrespectful comment that is very likely to make you leave a discussion or give up on sharing your perspective)
- Toxic (a rude, disrespectful, or unreasonable comment that is somewhat likely to make you leave a discussion or give up on sharing your perspective)
- Hard to Say
- Not Toxic

These ratings were then aggregated, with the target value representing the fraction of annotations that fell within the former two categories (Very Toxic or Toxic).
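
The aggregation can be sketched as follows. The five ratings are invented for illustration, and the sketch assumes that "Hard to Say" answers stay in the denominator, which the description implies but does not state outright.

```python
# Hypothetical ratings from five annotators for a single comment.
ratings = ["Not Toxic", "Toxic", "Very Toxic", "Hard to Say", "Not Toxic"]

# Only the two toxic categories count toward the target.
TOXIC_CHOICES = {"Very Toxic", "Toxic"}

def toxicity_target(ratings):
    """Fraction of annotations that chose 'Very Toxic' or 'Toxic'."""
    return sum(r in TOXIC_CHOICES for r in ratings) / len(ratings)

target = toxicity_target(ratings)
print(target)
```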

To collect the identity labels, annotators were asked to indicate all identities that were mentioned in the comment. An example question that was asked as part of this annotation effort was: "What genders are mentioned in the comment?"

- Male
- Female
- Transgender
- Other gender
- No gender mentioned

Again, these were aggregated into fractional values representing the fraction of raters who said the identity was mentioned in the comment.
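
Because annotators could select several identities at once, each identity gets its own fraction. A small sketch with made-up answers from five annotators:

```python
from collections import Counter

# Hypothetical answers to the gender question; each annotator may select
# several identities, or none at all.
answers = [
    {"female"},
    {"female"},
    {"female", "male"},
    set(),  # "No gender mentioned"
    {"female"},
]

# Count how many annotators selected each identity.
counts = Counter(identity for answer in answers for identity in answer)

# Convert the counts into the fraction of raters who selected each identity.
fractions = {identity: n / len(answers) for identity, n in counts.items()}
print(fractions)
```

The fractions for one comment do not need to sum to 1, since the identities are rated independently.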

The distributions of labels and subgroups in Train and Test can be assumed to be similar, but not identical.

*Note: Some comments were seen by many more than 10 annotators (up to thousands), due to sampling and strategies used to enforce rater accuracy.

## File descriptions
- train.csv - the training set, which includes subgroups
- test.csv - the test set, which does not include subgroups
- sample_submission.csv - a sample submission file in the correct format


# Evaluation

- 50%: Create a solution using a machine learning algorithm and present it. Show only what you did differently or what other teams can learn from your solution.
- 50%: Performance in the Kaggle competition (normalized according to class performance on the private leaderboard).

_____________



### Owners

The project has been developed by the following:

* Andres Felipe Martinez Tunarroza
* Jorge Luis Medina Herrada
* Ana Milena Rodriguez Gómez
* Nicolas David Gil Quijano

Credits to **thousandvoices** ([source](https://www.kaggle.com/thousandvoices/simple-lstm))

Data Mining. University of the Andes.

May, 2019.

In [None]:
import numpy as np
import pandas as pd
import nltk

from keras.models import Model
from keras.layers import Input, Dense, Embedding, SpatialDropout1D, add, concatenate
from keras.layers import CuDNNLSTM, Bidirectional, GlobalMaxPooling1D, GlobalAveragePooling1D
from keras.preprocessing import text, sequence
from keras.callbacks import LearningRateScheduler

nltk.download('stopwords')
from nltk.corpus import stopwords 
stop_words = set(stopwords.words('english'))

import os
print(os.listdir("../input"))

#### Global variables

In [None]:
EMBEDDING_FILES = ['../input/fasttext-crawl-300d-2m/crawl-300d-2M.vec',
                   '../input/glove840b300dtxt/glove.840B.300d.txt']
NUM_MODELS = 2
BATCH_SIZE = 512
LSTM_UNITS = 128
DENSE_HIDDEN_UNITS = 4 * LSTM_UNITS
EPOCHS = 4
MAX_LEN = 220
IDENTITY_COLUMNS = ['male', 'female', 'homosexual_gay_or_lesbian', 'christian', 'jewish',
                    'muslim', 'black', 'white', 'psychiatric_or_mental_illness']
AUX_COLUMNS = ['target', 'severe_toxicity', 'obscene', 'identity_attack', 'insult', 'threat']
TEXT_COLUMN = 'comment_text'
TARGET_COLUMN = 'target'
CHARS_TO_REMOVE = '!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~\t\n“”’\'∞θ÷α•à−β∅³π‘₹´°£€×™√²—'

list_stopwords = {'the','be','to','of','and','a','in','that','have','I','is','it','for','not','on','with','he','as',
                  'you','do','at', 'this','but','his','by','from'}

stopword_Common_names = {'Oliver','Noah','Jack','Liam','Harry', 'Mason','Jacob','Charlie','William','Thomas',
                         'Ethan','George','Michael','Oscar','Alexander','James','Daniel','Amelia','Emma','Olivia',
                         'Isla','Sophia','Emily','Isabella','Poppy','Ava','Mia','Isabella','Jessica','Abigail',
                         'Lily','Madison','Sophie','Charlotte'}

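# preprocessing() compares word.lower() against this set, so store every entry in lowercase
list_stopwords = {word.lower() for word in
                  set.union(stopword_Common_names, list_stopwords, stopwords.words('english'))}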

ids = {'contractions':["' m","' re","' ve","' s","' ll","' d","' n't","'m","'re","'ve","'s","'ll","'d","'n't","n't","[!”#$%&’()*+,-./:;<=>?@[\\]^_`{|}~]"],
      'normal':[" am"," are"," have"," is"," will"," had"," not"," am"," are"," have"," is"," will"," had"," not"," not", " "]}

ids = dict(zip(ids['contractions'], ids['normal']))
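
The notebook defines the `ids` mapping but does not show where it is applied. One possible use, sketched here under that assumption, is to expand contractions before tokenizing. The names `contraction_map` and `expand_contractions` are hypothetical, and only a few of the entries are reproduced.

```python
# A reduced, hypothetical version of the contraction map defined above.
contraction_map = {"n't": " not", "'re": " are", "'ve": " have", "'ll": " will", "'m": " am"}

def expand_contractions(text, mapping):
    """Replace each contraction suffix with its expanded form."""
    # Longer keys first, so that a short key never pre-empts a longer one.
    for contraction in sorted(mapping, key=len, reverse=True):
        text = text.replace(contraction, mapping[contraction])
    return text

expanded = expand_contractions("They're sure we don't know, but I'm certain we'll see", contraction_map)
print(expanded)
```

Plain substring replacement is crude: a word such as "can't" would become "ca not", so a real pipeline might need a few word-level exceptions.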

#### Functions

In [None]:
def preprocessing(df):
    """Remove stop words (case-insensitive) from the first column of df."""
    texts = pd.DataFrame(df).iloc[:, 0]

    def drop_stopwords(comment):
        return ' '.join(word for word in comment.split()
                        if word.lower() not in list_stopwords)

    # Row-wise apply instead of an explicit index loop; the index is preserved
    return texts.apply(drop_stopwords)

def get_coefs(word, *arr):
    return word, np.asarray(arr, dtype='float32')

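def load_embeddings(path):
    # fastText .vec files start with a "<count> <dim>" header line; skip it so
    # it is not stored as a one-element "vector".
    with open(path, encoding='utf-8') as f:
        rows = (line.strip().split(' ') for line in f)
        return dict(get_coefs(*row) for row in rows if len(row) > 2)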

def build_matrix(word_index, path):
    embedding_index = load_embeddings(path)
    embedding_matrix = np.zeros((len(word_index) + 1, 300))
    for word, i in word_index.items():
        try:
            embedding_matrix[i] = embedding_index[word]
        except KeyError:
            pass
    return embedding_matrix
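
To make the data flow of these helpers concrete, here is a self-contained sketch on a toy embedding file with 3-dimensional vectors. The file contents, the word index and the `*_toy_*` helper names are assumptions made for illustration; the notebook itself uses 300-dimensional fastText and GloVe files.

```python
import os
import tempfile

import numpy as np

# Toy vectors in the "<word> <v1> <v2> <v3>" text format.
toy_vectors = "good 0.1 0.2 0.3\nbad 0.4 0.5 0.6\n"

def load_toy_embeddings(path):
    """Parse each line into a word and its float32 vector."""
    index = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            word, *values = line.rstrip().split(" ")
            index[word] = np.asarray(values, dtype="float32")
    return index

def build_toy_matrix(word_index, embedding_index, dim):
    """Row i holds the vector of the word with index i; unknown words stay zero."""
    matrix = np.zeros((len(word_index) + 1, dim))  # row 0 is left for padding
    for word, i in word_index.items():
        if word in embedding_index:
            matrix[i] = embedding_index[word]
    return matrix

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy.vec")
    with open(path, "w", encoding="utf-8") as f:
        f.write(toy_vectors)
    embedding_index = load_toy_embeddings(path)

# Tokenizer-style word index: indices start at 1; "ugly" has no vector.
word_index = {"good": 1, "ugly": 2, "bad": 3}
matrix = build_toy_matrix(word_index, embedding_index, dim=3)
print(matrix.shape)
```

Words missing from the embedding file keep an all-zero row, which is the same fallback that `build_matrix` applies through its `KeyError` handler.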

#### Neural network architecture

* Embedding layer
* Dropout layer
* LSTM layer (Bidirectional with *CuDNNLSTM* implementation)
* Pooling layer (GlobalMaxPooling1D and GlobalAveragePooling1D)
* Dense layer

Other information:
* `loss='binary_crossentropy'`
* `optimizer='adam'`

In [None]:
def build_model(embedding_matrix, num_aux_targets):
    words = Input(shape=(None,))
    x = Embedding(*embedding_matrix.shape, weights=[embedding_matrix], trainable=False)(words)
    x = SpatialDropout1D(0.2)(x)
    x = Bidirectional(CuDNNLSTM(LSTM_UNITS, return_sequences=True))(x)
    x = Bidirectional(CuDNNLSTM(LSTM_UNITS, return_sequences=True))(x)

    hidden = concatenate([
        GlobalMaxPooling1D()(x),
        GlobalAveragePooling1D()(x),
    ])
    hidden = add([hidden, Dense(DENSE_HIDDEN_UNITS, activation='relu')(hidden)])
    hidden = add([hidden, Dense(DENSE_HIDDEN_UNITS, activation='relu')(hidden)])
    result = Dense(1, activation='sigmoid')(hidden)
    aux_result = Dense(num_aux_targets, activation='sigmoid')(hidden)
    
    model = Model(inputs=words, outputs=[result, aux_result])
    model.compile(loss='binary_crossentropy', optimizer='adam')

    return model
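
The `add` calls in `build_model` only work because the pooled features and the dense layer share the same width. The NumPy sketch below traces the shapes for a single comment; the random values and the bias-free dense layer are simplifications for illustration, not the Keras implementation.

```python
import numpy as np

LSTM_UNITS = 128
DENSE_HIDDEN_UNITS = 4 * LSTM_UNITS

# Stand-in for the second bidirectional LSTM output of one comment:
# (timesteps, 2 * LSTM_UNITS), since forward and backward states are concatenated.
rng = np.random.default_rng(0)
sequence_output = rng.normal(size=(220, 2 * LSTM_UNITS))

# Global max and global average pooling both collapse the time axis.
max_pooled = sequence_output.max(axis=0)
avg_pooled = sequence_output.mean(axis=0)

# Concatenating the two gives 4 * LSTM_UNITS features.
hidden = np.concatenate([max_pooled, avg_pooled])

# That width equals DENSE_HIDDEN_UNITS, so the output of a dense layer of that
# size can be added element-wise to its own input (a residual connection).
dense_weights = rng.normal(size=(hidden.shape[0], DENSE_HIDDEN_UNITS)) * 0.01
dense_out = np.maximum(hidden @ dense_weights, 0.0)  # ReLU, bias omitted
hidden = hidden + dense_out
print(hidden.shape)
```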

In [None]:
train_df = pd.read_csv('../input/jigsaw-unintended-bias-in-toxicity-classification/train.csv')
test_df = pd.read_csv('../input/jigsaw-unintended-bias-in-toxicity-classification/test.csv')

train_df.head()

In [None]:
for column in IDENTITY_COLUMNS + [TARGET_COLUMN]:
    train_df[column] = train_df[column] >= 0.5

train_df.head()

In [None]:
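x_train = train_df[TEXT_COLUMN].astype(str)
# Cast labels to float32 so the boolean target and the fractional subtype
# columns end up in a single numeric array
y_train = train_df[TARGET_COLUMN].values.astype(np.float32)
y_aux_train = train_df[AUX_COLUMNS].values.astype(np.float32)
x_test = test_df[TEXT_COLUMN].astype(str)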

In [None]:
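# Stop-word-filtered copy of the training text.
# Note: the tokenizer cells below read x_train and x_test, not x_train_ / x_test_.
x_train_ = preprocessing(pd.DataFrame(x_train))

In [None]:
x_test_ = preprocessing(pd.DataFrame(x_test))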

In [None]:
tokenizer = text.Tokenizer(filters=CHARS_TO_REMOVE)
tokenizer.fit_on_texts(list(x_train) + list(x_test))

In [None]:
x_train = tokenizer.texts_to_sequences(x_train)

In [None]:
x_test = tokenizer.texts_to_sequences(x_test)

In [None]:
x_train = sequence.pad_sequences(x_train, maxlen=MAX_LEN)

In [None]:
x_test = sequence.pad_sequences(x_test, maxlen=MAX_LEN)
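
With its default arguments, `pad_sequences` pads and truncates at the start of each sequence. The pure-Python sketch below mimics that behaviour on toy token ids; `pad_pre` is a hypothetical helper, not a Keras function.

```python
def pad_pre(sequence, maxlen, value=0):
    """Left-pad with `value`, or keep only the last `maxlen` items."""
    truncated = sequence[-maxlen:]
    return [value] * (maxlen - len(truncated)) + truncated

# Hypothetical token-id sequences of different lengths.
short_seq = [5, 9]
long_seq = [1, 2, 3, 4, 5, 6]

padded = [pad_pre(s, maxlen=4) for s in (short_seq, long_seq)]
print(padded)
```

With `MAX_LEN = 220`, comments longer than 220 tokens therefore keep their last 220 tokens, and shorter ones are filled with zeros on the left.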

In [None]:
embedding_matrix = np.concatenate(
    [build_matrix(tokenizer.word_index, f) for f in EMBEDDING_FILES], axis=-1)
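
Concatenating along the last axis places the two 300-dimensional vectors of each word side by side, so every row of the final matrix has 600 features. A shape-only sketch with placeholder values:

```python
import numpy as np

vocab_rows = 4  # corresponds to len(word_index) + 1 in the notebook

# Placeholder matrices standing in for the fastText and GloVe lookups.
fasttext_matrix = np.ones((vocab_rows, 300))
glove_matrix = np.zeros((vocab_rows, 300))

combined = np.concatenate([fasttext_matrix, glove_matrix], axis=-1)
print(combined.shape)
```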

Sample weights used when fitting the neural network

In [None]:
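# Start from a weight of 1 per comment (kept as a Series so .values works later)
sample_weights = pd.Series(np.ones(len(train_df), dtype=np.float32), index=train_df.index)
# Add the number of identities mentioned in the comment
sample_weights += train_df[IDENTITY_COLUMNS].sum(axis=1)
# Toxic comments: add the number of identities that are not mentioned
sample_weights += train_df[TARGET_COLUMN] * (~train_df[IDENTITY_COLUMNS]).sum(axis=1)
# Non-toxic comments: add 5 x the number of identities mentioned
sample_weights += (~train_df[TARGET_COLUMN]) * train_df[IDENTITY_COLUMNS].sum(axis=1) * 5
# Rescale so that the mean weight is 1
sample_weights /= sample_weights.mean()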
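
The same four steps can be traced by hand on a toy example. The labels below are invented, and plain NumPy arrays stand in for the DataFrame columns.

```python
import numpy as np

# Hypothetical binarized labels for four comments and three identity columns.
identities = np.array([
    [False, False, False],  # no identity mentioned
    [True, False, False],   # one identity mentioned
    [True, True, False],    # two identities mentioned
    [False, False, False],  # no identity mentioned
])
target = np.array([False, False, True, True])  # toxic or not

mentions = identities.sum(axis=1)         # identities mentioned per comment
non_mentions = (~identities).sum(axis=1)  # identities not mentioned per comment

weights = np.ones(len(target), dtype=np.float32)
weights += mentions                  # every identity mention adds weight
weights += target * non_mentions     # toxic comments: unmentioned identities
weights += (~target) * mentions * 5  # non-toxic comments that mention identities
raw_weights = weights.copy()
weights /= weights.mean()            # rescale so the mean weight is 1
print(raw_weights, weights)
```

In this toy case the non-toxic comment that mentions an identity receives the largest weight, which reflects the factor of 5 in the last additive term.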

#### Neural Network training and prediction

In [None]:
checkpoint_predictions = []
weights = []

for model_idx in range(NUM_MODELS):
    model = build_model(embedding_matrix, y_aux_train.shape[-1])
    for global_epoch in range(EPOCHS):
        model.fit(
            x_train,
            [y_train, y_aux_train],
            batch_size=BATCH_SIZE,
            epochs=1,
            verbose=2,
            sample_weight=[sample_weights.values, np.ones_like(sample_weights)],
            callbacks=[
                LearningRateScheduler(lambda _: 1e-3 * (0.55 ** global_epoch))
            ]
        )
        checkpoint_predictions.append(model.predict(x_test, batch_size=2048)[0].flatten())
        weights.append(2 ** global_epoch)
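
The loop above ties two schedules to `global_epoch`: the learning rate shrinks by a factor of 0.55 per epoch, while the weight of each saved prediction doubles. The values for `EPOCHS = 4` can be listed directly:

```python
EPOCHS = 4

# Learning rate used in each global epoch: 1e-3 decayed by a factor of 0.55.
learning_rates = [1e-3 * (0.55 ** epoch) for epoch in range(EPOCHS)]

# Weight given to the predictions saved after each epoch.
checkpoint_weights = [2 ** epoch for epoch in range(EPOCHS)]

for epoch, (lr, w) in enumerate(zip(learning_rates, checkpoint_weights)):
    print(f"epoch {epoch}: lr={lr:.6f}, checkpoint weight={w}")
```

Since `NUM_MODELS = 2`, each of these weights appears twice in the final `weights` list, once per model.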

Results Consolidation

In [None]:
predictions = np.average(checkpoint_predictions, weights=weights, axis=0)
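
A toy illustration of this weighted average, using two made-up checkpoints and three test comments:

```python
import numpy as np

# Hypothetical predictions for three test comments from two checkpoints.
checkpoint_predictions = [
    np.array([0.2, 0.8, 0.5]),  # earlier checkpoint
    np.array([0.4, 0.6, 0.5]),  # later checkpoint
]
weights = [1, 2]  # the later checkpoint counts twice as much

# Average across checkpoints (axis 0), one blended score per comment.
blended = np.average(checkpoint_predictions, weights=weights, axis=0)
print(blended)
```

Each blended score is pulled toward the later checkpoint, which is the effect of the `2 ** global_epoch` weights in the training loop.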

#### Submit

In [None]:
submission = pd.DataFrame.from_dict({'id': test_df.id,
                                     'prediction': predictions})

In [None]:
submission.to_csv('submission.csv', index=False)