<h2> Comment Toxicity Deep Learning Model </h2>
<h5> Created a deep learning model which could accurately detect whether a comment online was "toxic" or not. There are 6 categorical values that a comment can have, being "toxic", "severe toxic", "obscene", "threat", "insult" and "identity hate". </h5>

<p> Going to start off by importing libraries needed for this project and importing the dataset </p>

In [183]:
import tensorflow as tf
import os
import pandas as pd
import numpy as np
df = pd.read_csv("./archive/train.csv")
df.head()

Unnamed: 0,id,comment_text,toxic,severe_toxic,obscene,threat,insult,identity_hate
0,0000997932d777bf,Explanation\nWhy the edits made under my usern...,0,0,0,0,0,0
1,000103f0d9cfb60f,D'aww! He matches this background colour I'm s...,0,0,0,0,0,0
2,000113f07ec002fd,"Hey man, I'm really not trying to edit war. It...",0,0,0,0,0,0
3,0001b41b1c6bb37e,"""\nMore\nI can't make any real suggestions on ...",0,0,0,0,0,0
4,0001d958c54c6e35,"You, sir, are my hero. Any chance you remember...",0,0,0,0,0,0


<h3> Data Preprocessing </h3>

<p> I will start off by preprocessing the text data. This includes lower-casing, removing stop words, punctuation removal and then using TextVectorization, 
which converts words to integers to be used by the model. Also have to do other steps like splitting training and testing data.</p>

In [184]:
from nltk.corpus import stopwords
from tensorflow.keras.layers import TextVectorization

stop = stopwords.words('english')

def remove_stop_words(s):
    return ' '.join(word for word in s.split() if word not in stop)

df['comment_text'] = df['comment_text'].apply(remove_stop_words)


X = df['comment_text']
y = df[df.columns[2:]].values

MAX_FEATURES = 200000 #number of words in the vocab

vectorizer = TextVectorization(max_tokens=MAX_FEATURES, output_sequence_length=1800, output_mode='int', standardize='lower_and_strip_punctuation')
vectorizer.adapt(X.values)
vectorized_text = vectorizer(X.values)
vectorized_text #each integer represents a word in the vocab

<tf.Tensor: shape=(159571, 1800), dtype=int64, numpy=
array([[  591,   140,    59, ...,     0,     0,     0],
       [    1,   145,  2465, ...,     0,     0,     0],
       [  358,   378,    19, ...,     0,     0,     0],
       ...,
       [32414,  7384,   314, ...,     0,     0,     0],
       [   27,   477,    13, ...,     0,     0,     0],
       [   27,     2,    66, ...,     0,     0,     0]])>

In [185]:
dataset = tf.data.Dataset.from_tensor_slices((vectorized_text, y))
dataset = dataset.cache() #improve performance
dataset = dataset.shuffle(160000) # prevent overfitting in case data is arranged in specific way
dataset = dataset.batch(16) # each batch has 16 data points
dataset = dataset.prefetch(8) # while model works on one batch, tensorflow can preload others so theres no bottleneck

In [186]:
dataset.as_numpy_iterator().next() #view the first batch that will be fed into training model

(array([[   10,   577,    49, ...,     0,     0,     0],
        [ 3320,  1326,  1765, ...,     0,     0,     0],
        [  236,   467,    10, ...,     0,     0,     0],
        ...,
        [   19,    80,  2338, ...,     0,     0,     0],
        [  109,   366,     2, ...,     0,     0,     0],
        [18423,  2531, 41041, ...,     0,     0,     0]]),
 array([[0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0]]))

In [191]:
training_data = dataset.take(int(len(dataset) * 0.7)) #take 70% of the data to use for training
validation_data = dataset.skip(int(len(dataset)*.7)).take(int(len(dataset)*.2))
testing_data = dataset.skip(int(len(dataset)*.9)).take(int(len(dataset)*.1))

<h3> Create Deep Learning Model </h3>

In [192]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Bidirectional, Dense, Embedding

In [193]:
model = Sequential()
#create embedding layer which is able to capture relationship between words. Words closer together have more similar meaning.
model.add(Embedding(MAX_FEATURES+1, 32))
#bidirectional lstm is important for nlp - eg phrases like "i don't hate you". need to remember earlier words and consider both directions
model.add(Bidirectional(LSTM(32, activation='tanh')))
#feature extraction
model.add(Dense(128, activation='relu'))
model.add(Dense(256, activation='relu'))
model.add(Dense(128, activation='relu'))
#since final data for 6 categories need to be between 0 and 1
model.add(Dense(6, activation='sigmoid')) 

In [194]:
model.compile(loss="BinaryCrossentropy", optimizer="Adam")

In [195]:
model.summary()

Model: "sequential_4"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding_3 (Embedding)     (None, None, 32)          6400032   
                                                                 
 bidirectional_3 (Bidirectio  (None, 64)               16640     
 nal)                                                            
                                                                 
 dense_12 (Dense)            (None, 128)               8320      
                                                                 
 dense_13 (Dense)            (None, 256)               33024     
                                                                 
 dense_14 (Dense)            (None, 128)               32896     
                                                                 
 dense_15 (Dense)            (None, 6)                 774       
                                                      

In [196]:
from tensorflow.keras.callbacks import EarlyStopping

early_stopping = EarlyStopping(
    monitor='val_loss',  # which metric to monitor.
    min_delta=0,  # minimum change to qualify as an improvement.
    patience=1,  # number of epochs with no improvement to stop training.
    verbose=1,  # print messages.
    restore_best_weights=True  # restore the best weights from the epoch with the best monitored metric.
)


history = model.fit(training_data, epochs = 3, validation_data = validation_data, callbacks = early_stopping)

Epoch 1/3
Epoch 2/3
Epoch 3/3


<h3> Try making some predictions </h3>

In [199]:
input_text = vectorizer("You suck! Balls!")

In [200]:
model.predict(np.expand_dims(input_text, 0)) > 0.5



array([[ True, False,  True, False,  True, False]])

In [203]:
df.columns[2:] 
#comparing against categories shows the comment is toxic, obscene and is an insult. 
#However, it's not a threat, severely toxic or identity hate.

Index(['toxic', 'severe_toxic', 'obscene', 'threat', 'insult',
       'identity_hate'],
      dtype='object')

<h3> Evaluating the model </h3>

In [205]:
from tensorflow.keras.metrics import Precision, Recall, CategoricalAccuracy

precision = Precision() #lower value shows fewer false positives
recall = Recall() #lower value shows fewer false negatives
accuracy = CategoricalAccuracy() # correct predictions / total predictions

In [206]:
for batch in testing_data.as_numpy_iterator():
    X_test_batch, y_test_batch = batch
    
    predict = model.predict(X_test_batch)
    print(predict.shape)
    y_test_batch = y_test_batch.flatten() #true values
    predict = predict.flatten() #predicted values
    
    precision.update_state(y_train_batch, predict)
    recall.update_state(y_train_batch, predict)
    accuracy.update_state(y_train_batch, predict)

(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)
(16, 6)


In [102]:
print(f'Precision: {precision.result().numpy()}, Recall:{recall.result().numpy()}, Accuracy:{accuracy.result().numpy()}')

Precision: 0.8193861246109009, Recall:0.7121595144271851, Accuracy:0.49448344111442566


<h3> Sharing model using Gradio </h3>

<p> The precision is a higher than recall, which means the model is making a lot of false negatives. While trying the model out, I noticed that the model rarely predicts categories like "threat" since they do not occur frequently in the training dataset. A possible improvement would be to assign a higher weight to minority categories </p>

In [207]:
!pip install gradio jinja2



In [208]:
import gradio as gr

In [209]:
model.save("toxicity.h5")
model = tf.keras.models.load_model('toxicity.h5')

In [210]:
def score_comment(comment):
    vectorized_comment = vectorizer([comment])
    results = model.predict(vectorized_comment)
    
    text = ''
    for idx, col in enumerate(df.columns[2:]):
        text += '{}: {}\n'.format(col, results[0][idx]>0.5)
    
    return text

In [212]:
interface = gr.Interface(fn=score_comment, 
                         inputs=gr.inputs.Textbox(lines=2, placeholder='Comment to score'),
                        outputs='text')

In [213]:
interface.launch()

Running on local URL:  http://127.0.0.1:7861

To create a public link, set `share=True` in `launch()`.






In [None]:
for col in df.columns[2:]:
    print(df[col].value_counts())