# HATE SPEECH DETECTION

* Kshitij Sharma (185001080)
* Prasannakumaran D (185001110)
* Praveen kumar (185001113)
* Sai Ashish (185001130)


### Problem Statement

Every year, people are scammed and bullied online by Russian hackers who spread fake news through Twitter. Our objective is to identify such malicious tweets and categorize them by toxicity level. Our proposed methodology uses a Convolutional Neural Network (CNN) as the deep learning model that identifies such harmful tweets.

### Proposed Methodology 
We first train a simple Convolutional Neural Network to recognize various types of hate speech from word patterns, using data obtained from Kaggle. 
We then apply it to tweets associated with the bots of Russia's Internet Research Agency (IRA) to characterize them, check whether the predicted toxicity lines up with the account categories, and see whether the hate speech model gives us any insight into the tweets.

In [None]:
import os
import numpy as np 
import pandas as pd 
import seaborn as sns
import dask.dataframe as ddf
import matplotlib.pyplot as plt
from keras.models import Model
from sklearn.dummy import DummyClassifier
from IPython.display import Markdown, display
from keras.preprocessing import text, sequence
from keras.layers import Dense, Embedding, Input
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from keras.callbacks import EarlyStopping, ModelCheckpoint
from keras.layers import Conv1D, GlobalMaxPool1D, Dropout, concatenate
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

display_markdown = lambda x: display(Markdown(x))
dmc = DummyClassifier()
lrm = LogisticRegression()
rfc = RandomForestClassifier()

## Data Preprocessing and Cleaning

In [None]:
# Network parameters
max_features = 20000
maxlen = 100
# Load the dataset
train_data = pd.read_csv("./toxic/train.csv")
train_data.head()

In [None]:
# Shape of the training data
train_data.shape

### Checking NULL Values

In [None]:
train_data.info()

In [None]:
train_text = train_data["comment_text"].fillna("Invalid").values
list_classes = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]
y = train_data[list_classes].values

## Sequence Generation


The network cannot consume raw text, so each comment is split into word tokens with the `Tokenizer` class. This class vectorizes a text corpus by turning each text either into a sequence of integers (each integer being the index of a token in a dictionary) or into a vector whose coefficient for each token can be binary, a word count, or a tf-idf weight.

* fit_on_texts -- Updates internal vocabulary based on a list of texts
* texts_to_sequences -- Transforms each text in texts to a sequence of integers
* pad_sequences -- Since the texts are not all the same length, each sequence is padded with zeros (or truncated) to a fixed length of `maxlen`
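
To make the three steps concrete, here is a minimal pure-Python sketch of what they do. It is a hand-rolled illustration, not the Keras implementation: the helper names `fit_vocab`, `to_sequences`, and `pad` are hypothetical, and the sketch assumes lower-cased text split on whitespace, indices assigned by descending word frequency starting at 1, and zeros added at the front of each sequence.

```python
from collections import Counter

def fit_vocab(texts, num_words):
    """Map each word to an integer index, most frequent word first (index 1)."""
    counts = Counter(word for t in texts for word in t.lower().split())
    ranked = [w for w, _ in counts.most_common()]
    # Keep only indices below num_words; index 0 is reserved for padding
    return {w: i for i, w in enumerate(ranked, start=1) if i < num_words}

def to_sequences(texts, vocab):
    """Replace each known word by its index and drop unknown words."""
    return [[vocab[w] for w in t.lower().split() if w in vocab] for t in texts]

def pad(seqs, maxlen):
    """Left-pad with zeros, or keep only the last maxlen tokens."""
    return [[0] * (maxlen - len(s[-maxlen:])) + s[-maxlen:] for s in seqs]

texts = ["you are great", "you are not great at all"]
vocab = fit_vocab(texts, num_words=20)
padded = pad(to_sequences(texts, vocab), maxlen=4)
print(padded)
```

The short text gains a leading zero and the long one loses its first words, so every row ends up with exactly `maxlen` entries.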

In [None]:
tokenizer = text.Tokenizer(num_words = max_features)
tokenizer.fit_on_texts(list(train_text))
tokenized_train_text = tokenizer.texts_to_sequences(train_text)
X_train = sequence.pad_sequences(tokenized_train_text, maxlen = maxlen)

## Building the Network

* Embedding : Turns positive integers (indexes) into dense vectors of fixed size.
* Dropout : Randomly sets elements to zero to prevent overfitting
* Conv1D : 1D convolution layer
* Dilation rate : Dilated convolutions introduce another parameter to convolutional layers called the dilation rate. This defines a spacing between the values in a kernel. 
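
The effect of the dilation rate is easiest to see on a toy signal. The sketch below is a plain-Python illustration, not the Keras layer: `dilated_conv1d` is a hypothetical helper that applies an unpadded, stride-1 convolution in which consecutive kernel values are `dilation_rate` steps apart.

```python
def dilated_conv1d(x, kernel, dilation_rate=1):
    """Unpadded 1D convolution (cross-correlation) with gaps between kernel taps."""
    span = dilation_rate * (len(kernel) - 1)   # distance between first and last tap
    return [
        sum(w * x[t + k * dilation_rate] for k, w in enumerate(kernel))
        for t in range(len(x) - span)
    ]

signal = list(range(10))          # 0, 1, ..., 9
kernel = [1, 0, -1]               # first tap minus last tap

for rate in (1, 2, 4):
    out = dilated_conv1d(signal, kernel, dilation_rate=rate)
    print(rate, len(out), out)
```

With the same three weights, a larger dilation rate covers a wider stretch of the input, and the unpadded output gets shorter by `dilation_rate * (kernel_size - 1)`.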


In [None]:
def build_model(max_dilation_rate = 4):
    '''
    Builds the neural network: an embedding of size 128 followed by Conv1D layers with a
    kernel size of 3, one stack of dilated convolutions per dilation rate, and Dropout
    regularization (25 % after the embedding). Uses the binary crossentropy loss, the Adam
    optimizer, and binary accuracy as the metric.
    Input : max_dilation_rate (Integer), number of dilation rates 2**0 .. 2**(max_dilation_rate - 1)
    Output: Returns the built model
    '''
    embed_size = 128
    inp = Input(shape = (maxlen, ))
    x = Embedding(max_features, embed_size)(inp)
    x = Dropout(0.25)(x)
    x = Conv1D(2 * embed_size, kernel_size = 3)(x)
    prefilt_x = Conv1D(2 * embed_size, kernel_size = 3)(x)
    out_conv = []
    
    for dilation_rate in range(max_dilation_rate):
        x = prefilt_x
        for i in range(3):
            x = Conv1D(32*2**(i), kernel_size = 3, dilation_rate = 2 ** dilation_rate)(x)    
        out_conv += [Dropout(0.5)(GlobalMaxPool1D()(x))]
        
    x = concatenate(out_conv, axis = -1)    
    x = Dense(64, activation="relu")(x)
    x = Dropout(0.1)(x)
    x = Dense(6, activation="sigmoid")(x)
    model = Model(inputs = inp, outputs = x)
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['binary_accuracy'])
    return model

model = build_model()

In [None]:
model.summary()
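
As a sanity check on the summary, the sequence lengths can be worked out by hand. The sketch below assumes that `Conv1D` is used without padding and with a stride of 1, so each layer shortens the sequence by `dilation_rate * (kernel_size - 1)`; `conv_out_len` is a hypothetical helper used only for this illustration.

```python
def conv_out_len(length, kernel_size=3, dilation_rate=1):
    """Output length of an unpadded, stride-1 1D convolution."""
    return length - dilation_rate * (kernel_size - 1)

maxlen = 100
# Two undilated convolutions come before the dilated branches
prefilt_len = conv_out_len(conv_out_len(maxlen))

branch_lens = {}
for power in range(4):                      # max_dilation_rate = 4
    rate, length = 2 ** power, prefilt_len
    for _ in range(3):                      # three stacked convolutions per branch
        length = conv_out_len(length, dilation_rate=rate)
    branch_lens[rate] = length

# Each branch ends with 32 * 2**2 = 128 filters; global max pooling keeps one value per filter
concat_width = 4 * 32 * 2 ** 2
print(prefilt_len, branch_lens, concat_width)
```

If the assumptions hold, these numbers should agree with the output shapes that `model.summary()` reports for the corresponding layers.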

## Train the Model
Here we train the model and use model checkpointing and early stopping to keep only the best version of the model
* ModelCheckpoint -- Callback to save the Keras model or model weights at some frequency.
* EarlyStopping -- Stop training when a monitored metric has stopped improving
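
The interplay between the two callbacks can be sketched in plain Python. This is a simplified model of the logic, not the Keras code: `early_stopping_epoch` is a hypothetical helper, the validation losses are made up, and any strict decrease of the loss counts as an improvement.

```python
def early_stopping_epoch(val_losses, patience):
    """Return (epoch at which training stops or None, epoch with the best loss)."""
    best, best_epoch, wait = float("inf"), None, 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            # Improvement: this is where a checkpoint would be saved
            best, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return None, best_epoch

# Made-up validation losses, for illustration only
losses = [0.9, 0.7, 0.6, 0.65, 0.62, 0.61, 0.63]
print(early_stopping_epoch(losses, patience=2))
print(early_stopping_epoch(losses, patience=20))
```

A patience larger than the number of epochs never triggers a stop, so in that case only the checkpoint protects against keeping an overfitted model.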

In [None]:
batch_size = 512
epochs = 10

weights = "weights.hdf5"

checkpoint = ModelCheckpoint(weights, monitor='val_loss', verbose=1, save_best_only = True, mode = 'min')
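# Note: patience (20) exceeds epochs (10), so early stopping cannot trigger in this run; lower patience to enable it
early = EarlyStopping(monitor="val_loss", mode="min", patience=20)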
callbacks_list = [checkpoint, early] 

model.fit(X_train, y, 
          batch_size = batch_size, 
          epochs = epochs, 
          validation_split = 0.1, 
          callbacks = callbacks_list)

model.load_weights(weights)

### Training Results


In [None]:
eval_results = model.evaluate(X_train, y, batch_size=batch_size)
for c_name, c_val in zip(model.metrics_names, eval_results):
    print(c_name, '%2.3f' % (c_val))
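
To interpret the two printed numbers, it helps to see how they are computed for a multi-label output. The following is a plain-Python sketch under stated assumptions, not the Keras code: the labels and scores are made up, predictions are clipped with a small `eps` before taking logarithms, and a score counts as positive when it exceeds 0.5.

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over every sample and label."""
    total, count = 0.0, 0
    for row_true, row_pred in zip(y_true, y_pred):
        for y, p in zip(row_true, row_pred):
            p = min(max(p, eps), 1 - eps)      # clip to avoid log(0)
            total -= y * math.log(p) + (1 - y) * math.log(1 - p)
            count += 1
    return total / count

def binary_accuracy(y_true, y_pred, threshold=0.5):
    """Fraction of individual labels predicted on the correct side of the threshold."""
    hits = [int(p > threshold) == y
            for rt, rp in zip(y_true, y_pred) for y, p in zip(rt, rp)]
    return sum(hits) / len(hits)

# Made-up labels and scores for two comments and three classes
y_true = [[1, 0, 0], [0, 1, 0]]
y_pred = [[0.9, 0.2, 0.1], [0.4, 0.6, 0.3]]
loss = binary_crossentropy(y_true, y_pred)
acc = binary_accuracy(y_true, y_pred)
print('%2.3f %2.3f' % (loss, acc))
```

Because the accuracy is averaged over every label of every comment, it can stay high when positive labels are rare, even if few of the positives are found.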

## Evaluating the model's performance on Unseen Data

In [None]:
test_data_text = pd.read_csv("./toxic/test.csv")
test_data_labels = pd.read_csv("./toxic/test_labels.csv")

In [None]:
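# Assumption: rows labelled -1 in test_labels.csv were not scored (as in the Kaggle release), so drop them
scored = (test_data_labels[list_classes] != -1).all(axis=1).values
test_text = test_data_text["comment_text"].fillna("Invalid").values[scored]
tokenized_test_text = tokenizer.texts_to_sequences(test_text)
X_test = sequence.pad_sequences(tokenized_test_text, maxlen = maxlen)
y_test = test_data_labels[list_classes].values[scored]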

In [None]:
eval_results = model.evaluate(X_test, y_test, batch_size=batch_size)
for c_name, c_val in zip(model.metrics_names, eval_results):
    print(c_name, '%2.3f' % (c_val))

# Using Russian Troll Tweets Dataset

Since the files are large, we use Dask to load the dataset. We then focus on the tweet text (`content`) and the account category (`account_category`) to see whether the hate-speech scores differ between categories.

In [None]:
rustweet_dir = os.path.join('./', 'Russian Troll')
all_tweets_ddf = ddf.read_csv(os.path.join(rustweet_dir, '*.csv'), assume_missing = True)

Display the dataframe

In [None]:
all_tweets_ddf.head(3)

### Extract the content and account category for English Tweets

In [None]:
english_tweets_ddf = all_tweets_ddf[all_tweets_ddf['language'].isin(['English'])]
content_cat_ddf = english_tweets_ddf[['content', 'account_category']]
contents = content_cat_ddf.sample(frac=0.2).compute().drop_duplicates()

### Plot the frequency distribution for different account categories

In [None]:
fig, ax1 = plt.subplots(1,1, figsize = (10, 5))
contents['account_category'].hist(ax=ax1)

### Convert the text to sequence

In [None]:
tweets = contents["content"].fillna("Invalid").values
tokenized_tweets = tokenizer.texts_to_sequences(tweets)
X_tweet = sequence.pad_sequences(tokenized_tweets, maxlen = maxlen)

### Prediction 

In [None]:
y_tweet = model.predict(X_tweet, batch_size=1024, verbose=True)

### Create the Toxicity Dataframe from the output of the model

In [None]:
toxicity_df = pd.DataFrame(y_tweet, columns = list_classes)
toxicity_df['content_category'] = contents['account_category'].values.copy()
toxicity_df['total_hatefulness'] = np.sum(y_tweet, 1)

In [None]:
toxicity_df.head(3)

### Example tweet and prediction

In [None]:
def show_sentence(sent_index):
    display_markdown('### Input Sentence:\n `{}`'.format(tweets[sent_index]))
    c_pred = model.predict(X_tweet[sent_index : sent_index + 1])[0]
    display_markdown('### Scores')
    for k, p in zip(list_classes, c_pred):
        display_markdown('- {}, Prediction: {:2.2f}%'.format(k, 100*p))
show_sentence(100)

### Identity hate levels of different content categories

In [None]:
cat_sample_df = toxicity_df.groupby('content_category').apply(lambda x: x.sample(250, replace = False if x.shape[0] > 1000 else True)).reset_index(drop = True)
# factorplot was renamed to catplot, and its size argument to height, in seaborn 0.9
sns.catplot(y = 'content_category', x = 'identity_hate', kind = 'swarm', data = cat_sample_df, height = 5)

### Classifying bots based on hate speech scores

In [None]:
tx_train_df, tx_valid_df = train_test_split(toxicity_df, 
                                            test_size = 0.25,
                                            random_state = 2018,
                                            stratify=toxicity_df['content_category'])

In [None]:
def fit_and_show(in_skl_model):
    in_skl_model.fit(tx_train_df[list_classes], tx_train_df['content_category'])
    out_pred = in_skl_model.predict(tx_valid_df[list_classes])
    # scikit-learn metrics take the true labels first, then the predictions
    print('%2.2f%%' % (100*accuracy_score(tx_valid_df['content_category'], out_pred)), 'accuracy')
    print(classification_report(tx_valid_df['content_category'], out_pred))
    sns.heatmap(confusion_matrix(tx_valid_df['content_category'], out_pred))
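
When reading the classification report, keep in mind that scikit-learn's metric functions expect the true labels first and the predictions second. Accuracy is the same either way, but precision and recall trade places when the arguments are swapped. The plain-Python sketch below illustrates this; `precision_recall` is a hypothetical helper and the category labels are invented.

```python
def precision_recall(y_true, y_pred, label):
    """Precision and recall for one class, computed from raw label lists."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    predicted = sum(p == label for p in y_pred)
    actual = sum(t == label for t in y_true)
    return tp / predicted, tp / actual

# Made-up category labels, for illustration only
y_true = ["troll", "troll", "troll", "news"]
y_pred = ["troll", "news", "news", "news"]

print(precision_recall(y_true, y_pred, "troll"))   # true labels first
print(precision_recall(y_pred, y_true, "troll"))   # arguments swapped
```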

In [None]:
print("DUMMY CLASSIFIER")
fit_and_show(dmc)

In [None]:
print("RANDOM FOREST CLASSIFIER")
fit_and_show(rfc)

In [None]:
print("LOGISTIC REGRESSION")
fit_and_show(lrm)