# Toxicity detection with `transformer` models 

Return to the [castle](https://github.com/Nkluge-correa/teeny-tiny_castle).

**Toxicity detection is a tool commonly used in discussion forums and other social media applications since moderation is crucial to promoting healthy online discussions.** 

**We can define toxicity as:**

- **_“abusive speech targeting specific group characteristics, such as ethnic origin, religion, gender, or sexual orientation.”_**

**For an in-depth discussion on toxicity detection as a machine learning problem, we recommend "_[Learning from the worst: Dynamically generated datasets to improve online hate detection](https://scholar.google.com/scholar_url?url=https://arxiv.org/abs/2012.15761&hl=pt-BR&sa=T&oi=gsb&ct=res&cd=0&d=7265559494033067667&ei=QUJYY6TJL4iKmgHXk5LgDQ&scisig=AAGBfm3gsyOD5eqcUPLFvWmVm8PlLcMr3g)_".**

<img src="https://miro.medium.com/max/1400/1*d4k-PRw-warACDpklCh1mw.png" alt="toxic-image" width="800"/>


**In this notebook, we will be using a dataset created from the [Toxic Comment Classification Challenge Dataset](https://github.com/tianqwang/Toxic-Comment-Classification-Challenge), created by the [Conversation AI](https://conversationai.github.io/) team, a research initiative founded by [Jigsaw](https://jigsaw.google.com/) and Google (both a part of Alphabet).**

## ☣️ DISCLAIMER/WARNING ☣️

### _The dataset for this competition contains text that may be considered profane, vulgar, or offensive._

**The original dataset contains an unequal distribution of “_hate_” and “_not hate_” samples for multi-class classification. However, we created a smaller version of the original dataset (`toxic_content_dataset.csv`, available for download at [this](https://drive.google.com/uc?export=download&id=1ZvZtrsE1dAl7CiHt16Jstp2rhkDKnGlL) link). This smaller dataset contains an equal number of “_hate_” and “_not hate_” samples, for a total of $70157$ samples.**
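**The text above does not say how the balanced subset was built. One common way to balance a binary dataset is to downsample the majority class, sketched below on a toy `DataFrame`. The helper `balance_by_downsampling` and the toy data are hypothetical and only illustrate the idea; they are not the procedure used to create `toxic_content_dataset.csv`.**

```python
import pandas as pd

# Toy stand-in for the real data; the column names mirror the notebook's CSV.
toy = pd.DataFrame({
    "comment_text": [f"comment {i}" for i in range(10)],
    "toxic": [1, 1, 1, 1, 1, 1, 1, 0, 0, 0],
})


def balance_by_downsampling(df, label_col="toxic", seed=42):
    """Hypothetical helper: keep as many rows per class as the rarest class has."""
    n_min = df[label_col].value_counts().min()
    parts = [
        group.sample(n=n_min, random_state=seed)
        for _, group in df.groupby(label_col)
    ]
    # Concatenate the per-class samples and shuffle the rows
    balanced_df = pd.concat(parts).sample(frac=1, random_state=seed)
    return balanced_df.reset_index(drop=True)


balanced = balance_by_downsampling(toy)
print(balanced["toxic"].value_counts().to_dict())
```

**Running `df["toxic"].value_counts()` on the real dataset is a quick way to confirm that the two classes are indeed balanced.**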

In [1]:
import pandas as pd
import urllib.request

urllib.request.urlretrieve(
    'https://drive.google.com/uc?export=download&id=1ZvZtrsE1dAl7CiHt16Jstp2rhkDKnGlL', 
    'toxic_content_dataset.csv'
)

df = pd.read_csv("toxic_content_dataset.csv")
display(df)

Unnamed: 0.1,Unnamed: 0,comment_text,toxic
0,0,explanation edits made username hardcore metal...,1
1,1,aww matches background colour seemingly stuck ...,1
2,2,hey man really trying edit war guy constantly ...,1
3,3,make real suggestions improvement wondered sec...,1
4,4,sir hero chance remember page,1
...,...,...,...
70152,70191,lol gay never know good feels fuck woman ass,0
70153,70192,fuck pansy jew would whine nai brith beat pale...,0
70154,70193,shalom semite get fuck kill son bitch leave wi...,0
70155,70194,think gay fag,0


**The original dataset had a lot of symbols and emojis that may hinder the performance of a language model trained for text classification. However, we have already performed the cleaning procedure. Below you can find the function we used to `preprocess` the text data, removing unwanted characters and so on.**

In [None]:
import re
import nltk
import numpy as np
from nltk.corpus import stopwords
from nltk.tokenize import RegexpTokenizer
from sklearn.model_selection import train_test_split
from nltk.stem import WordNetLemmatizer,PorterStemmer

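lemmatizer = WordNetLemmatizer()
stemmer = PorterStemmer()
# Build the stop-word set once (needs the NLTK 'stopwords' corpus to be downloaded)
stop_words = set(stopwords.words('english'))

def preprocess(sentence):
    """Lowercase the text and strip HTML tags, URLs, digits, short words, and stop words."""
    sentence = str(sentence).lower()
    sentence = sentence.replace('{html}', "")
    cleantext = re.sub(r'<.*?>', '', sentence)   # remove HTML tags
    rem_url = re.sub(r'http\S+', '', cleantext)  # remove URLs
    rem_num = re.sub(r'[0-9]+', '', rem_url)     # remove digits
    tokenizer = RegexpTokenizer(r'\w+')
    tokens = tokenizer.tokenize(rem_num)
    # Keep words longer than 2 characters that are not stop words
    filtered_words = [w for w in tokens if len(w) > 2 and w not in stop_words]
    # Stemmed and lemmatized variants are computed but NOT used in the returned text
    stem_words = [stemmer.stem(w) for w in filtered_words]
    lemma_words = [lemmatizer.lemmatize(w) for w in stem_words]
    return " ".join(filtered_words)

df['comment_text'] = df['comment_text'].map(preprocess)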

x = list(df.comment_text)
y = list(df.toxic)

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=42)


y_train = np.array(y_train).astype(float)
y_test = np.array(y_test).astype(float)
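**To see what this kind of cleaning does to a single comment, here is a minimal, dependency-free sketch of the same steps. It uses `re.findall` in place of `RegexpTokenizer` and a tiny, hypothetical stop-word set (`DEMO_STOP_WORDS`) in place of NLTK's English list, so its output on real data will differ from the `preprocess` function above.**

```python
import re

# Hypothetical, tiny stop-word list used only for illustration;
# the notebook relies on NLTK's full English list instead.
DEMO_STOP_WORDS = {"the", "and", "this", "you", "are", "for"}


def preprocess_sketch(sentence):
    """Simplified stand-in for `preprocess` that needs no NLTK downloads."""
    text = str(sentence).lower()
    text = re.sub(r"<.*?>", "", text)    # strip HTML tags
    text = re.sub(r"http\S+", "", text)  # strip URLs
    text = re.sub(r"[0-9]+", "", text)   # strip digits
    tokens = re.findall(r"\w+", text)    # split into word tokens
    kept = [t for t in tokens if len(t) > 2 and t not in DEMO_STOP_WORDS]
    return " ".join(kept)


sample = "<p>Check THIS out: http://example.com you are 100% wrong!!</p>"
cleaned = preprocess_sketch(sample)
print(cleaned)  # check out wrong
```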

**Text is one of the most widespread forms of sequence data and discrete signals (as opposed to continuous signals, like _images_ or _audio_). These sequences can be sequences of characters, syllables, or words.**

**Deep learning for NLP is pattern recognition applied to paragraphs, sentences, and words, just as computer vision is pattern recognition applied to videos, images, and pixels.**

**Like all neural networks, language models based on deep-learning architectures don't take raw text as input, i.e., you _cannot multiply a word by a weight matrix, add a bias, and apply a ReLU function at the end_. Neural networks only work with numeric tensors. Thus, we need to _vectorize_ our text data, i.e., transform the text into numeric tensors.**

**For a comprehensive guide on how to vectorize text data, we recommend Chapter 6: Deep learning for text and sequences, in [_Deep Learning with Python_](https://tanthiamhuat.files.wordpress.com/2018/03/deeplearningwithpython.pdf). Below we will be using the `Tokenizer` class from the [Keras](https://keras.io/) library.** 

**In terms of preprocessing, you can also pass symbols you may want to filter, by using the `filters` argument.**

**_For simplicity, we are creating a tokenizer with a maximum of 3000 words. If you explore the `JSON` file where we saved our tokenizer, you can see the value attributed to each word in our data._**
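**The sketch below illustrates, in plain Python, the idea behind a frequency-ranked vocabulary with a size cap and an out-of-vocabulary (OOV) token. It is a simplified approximation of what the Keras `Tokenizer` does, not its actual implementation; the helper names and the toy corpus are hypothetical.**

```python
from collections import Counter


def fit_word_index(texts, oov_token="<OOV>"):
    """Rank words by frequency; index 1 is reserved for the OOV token."""
    counts = Counter(word for text in texts for word in text.lower().split())
    word_index = {oov_token: 1}
    for rank, (word, _) in enumerate(counts.most_common(), start=2):
        word_index[word] = rank
    return word_index


def texts_to_sequences(texts, word_index, num_words, oov_token="<OOV>"):
    """Map words to indices; unknown or too-rare words get the OOV index."""
    oov = word_index[oov_token]
    sequences = []
    for text in texts:
        seq = []
        for word in text.lower().split():
            idx = word_index.get(word, oov)
            seq.append(idx if idx < num_words else oov)
        sequences.append(seq)
    return sequences


corpus = ["you are nice", "you are rude", "you rock"]
word_index = fit_word_index(corpus)
sequences = texts_to_sequences(
    ["you are awesome", "nice rock"], word_index, num_words=4
)
print(word_index)
print(sequences)  # [[2, 3, 1], [1, 1]]
```

**With `num_words=4`, only the two most frequent words keep their own index; every other word, seen or unseen, collapses onto the OOV index.**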

In [47]:
import io
import json
import tensorflow as tf
from tensorflow import keras
from keras_preprocessing.sequence import pad_sequences
from keras.preprocessing.text import Tokenizer, tokenizer_from_json


vocab_size = 3000
embed_size = 50
max_len = 256
tokenizer = Tokenizer(num_words=vocab_size,
                      filters='!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~\t\n',
                      lower=True,
                      split=" ",
                      oov_token="<OOV>")

tokenizer.fit_on_texts(x_train)
training_sequences = tokenizer.texts_to_sequences(x_train)
training_padded = pad_sequences(
    training_sequences, maxlen=max_len, truncating='post')


tokenizer_json = tokenizer.to_json()
with io.open('models\\tokenizer_toxic_detection.json', 'w', encoding='utf-8') as f:
    f.write(json.dumps(tokenizer_json, ensure_ascii=False))
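**The call to `pad_sequences` above uses `truncating='post'` and leaves `padding` at its default (`'pre'`). A minimal NumPy sketch of that combination is shown below; `pad_pre_truncate_post` is a hypothetical helper written for illustration, not part of Keras.**

```python
import numpy as np


def pad_pre_truncate_post(sequences, maxlen, value=0):
    """Left-pad short sequences with `value`; cut long ones at the end."""
    out = np.full((len(sequences), maxlen), value, dtype="int32")
    for row, seq in enumerate(sequences):
        trimmed = list(seq)[:maxlen]  # keep the first `maxlen` tokens
        if trimmed:
            out[row, -len(trimmed):] = trimmed  # right-align the tokens
    return out


padded = pad_pre_truncate_post([[5, 6], [1, 2, 3, 4, 5, 6], []], maxlen=4)
print(padded)
```

**Short comments are therefore right-aligned with zeros in front, while comments longer than `max_len` lose their final tokens.**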

**[Recurrent neural networks](https://en.wikipedia.org/wiki/Recurrent_neural_networks "Recurrent neural networks"), like `LSTM` and `GRU`, and convolutional networks, like `1D Convnets`, are great options for dealing with problems involving NLP. In [other notebooks](https://github.com/Nkluge-correa/teeny-tiny_castle/blob/bbe9c0a77499fa68de7c6d53bf5ef7e0b43a25e0/ML%20Explainability/NLP%20Interpreter%20(en)/model_maker_en.ipynb) of our repository, you will see many examples of how to build these networks for tasks like sentiment analysis.**

**However, in this notebook, we will be using a `transformer` model, an extremely versatile and scalable architecture proposed by Vaswani et al. in [Attention Is All You Need](https://arxiv.org/abs/1706.03762).**

<img src="https://machinelearningmastery.com/wp-content/uploads/2021/08/attention_research_1.png" alt="drawing" height="450"/>

**A `transformer` is a deep learning model that adopts the mechanism of [self-attention](https://en.wikipedia.org/wiki/Attention_(machine_learning) "Attention (machine learning)"), differentially weighting the significance of each part of the input data. Like RNNs, transformers are designed to process sequential input data. However, unlike RNNs, transformers process the entire input at once (_not sequentially_), so they do not have to process one word at a time. This allows for more [parallelization](https://en.wikipedia.org/wiki/Parallel_computing "Parallel computing"), which reduces training times and makes training on larger datasets feasible.**
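**As a rough illustration of self-attention, the NumPy sketch below implements single-head scaled dot-product attention as described in Attention Is All You Need. The dimensions and random weights are arbitrary assumptions chosen for the example; the Keras `MultiHeadAttention` layer used later adds multiple heads, biases, and an output projection.**

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over one sequence."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    # Every token scores every other token in one matrix product
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ v, weights


rng = np.random.default_rng(0)
seq_len, d_model, d_head = 5, 8, 4  # arbitrary toy sizes
x = rng.normal(size=(seq_len, d_model))
w_q, w_k, w_v = (rng.normal(size=(d_model, d_head)) for _ in range(3))

output, weights = self_attention(x, w_q, w_k, w_v)
print(output.shape, weights.shape)  # (5, 4) (5, 5)
```

**Because the scores for all token pairs come from a single matrix product, no step has to wait for the previous token, which is what makes the computation easy to parallelize.**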

**For an extremely _comprehensive_ and _illustrated_ explanation of what "_attention_" is and how a "_transformer works_", we recommend the work of _Jay Alammar_:**

- [The Illustrated Transformer](https://jalammar.github.io/illustrated-transformer/);
- [The Illustrated GPT-2](https://jalammar.github.io/illustrated-gpt2/).

**Using only the _encoder_ component of the original transformer architecture, we will implement a small transformer model in this notebook (a.k.a. an encoder-only transformer). The original transformer architecture, which consists of both an encoder and a decoder, was designed for _sequence-to-sequence_ tasks like translation.**

**If you are interested in learning about the transformer architecture, check our _[sequence-to-sequence](https://github.com/Nkluge-correa/teeny-tiny_castle/blob/bbe9c0a77499fa68de7c6d53bf5ef7e0b43a25e0/ML%20Intro%20Course/seuqnece-to-sequence.ipynb)_ tutorial.**

**In general, text classification can be done using the encoder component. It's a very generic module that learns to transform a sequence into a more useful representation after ingesting it.**
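**To make this concrete, the sketch below shows one generic way to turn an encoder's output (one vector per token) into a single toxicity-style probability: average the token vectors and pass the result through a sigmoid unit. The shapes and random weights are assumptions for illustration, and the pooling axis here is a common choice rather than a copy of the pooling layer used in the model below.**

```python
import numpy as np


def classify_from_encoding(encoded, w, b):
    """Mean-pool token vectors, then apply a single sigmoid unit."""
    pooled = encoded.mean(axis=0)  # (d_model,) summary of the whole sequence
    logit = pooled @ w + b
    return 1.0 / (1.0 + np.exp(-logit))


rng = np.random.default_rng(0)
seq_len, d_model = 256, 50  # sizes borrowed from this notebook
encoded = rng.normal(size=(seq_len, d_model))  # stand-in for encoder output
w = rng.normal(size=d_model)
b = 0.0

prob = classify_from_encoding(encoded, w, b)
print(float(prob))
```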

**This model is small: it has 4 transformer blocks, each with 4 attention heads, and handles sequences of up to 256 tokens. Its embedding layer (where the dense word vectors are created) is likewise restricted to 50 dimensions and a vocabulary of 3000 tokens.**
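**These sizes translate directly into parameter counts. The arithmetic below is a sketch: the embedding and layer-normalization figures match the model summary printed by the training cell (150000 and 100), while the attention and feed-forward figures assume the usual Keras layout (biases enabled, `value_dim` defaulting to `key_dim`, and `kernel_size=1` convolutions) and cannot be checked against the truncated summary.**

```python
vocab_size, embed_size = 3000, 50
num_heads, head_size, ff_dim = 4, 256, 4

# Embedding: one 50-dimensional vector per vocabulary entry
embedding_params = vocab_size * embed_size

# LayerNormalization: one scale and one offset per feature
layer_norm_params = 2 * embed_size

# Multi-head attention (assumed layout): Q, K, V projections plus output projection
qkv_params = 3 * (embed_size * num_heads * head_size + num_heads * head_size)
attn_out_params = num_heads * head_size * embed_size + embed_size
attention_params = qkv_params + attn_out_params

# Feed-forward part: two kernel_size=1 convolutions (50 -> ff_dim -> 50)
ff_params = (embed_size * ff_dim + ff_dim) + (ff_dim * embed_size + embed_size)

print(embedding_params, layer_norm_params, attention_params, ff_params)
```

**Under these assumptions, most of the capacity of each block sits in the attention projections, since `ff_dim=4` makes the feed-forward part very narrow.**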

In [50]:
import tensorflow as tf
from tensorflow import keras
from keras import layers

def transformer_encoder(inputs, head_size, num_heads, ff_dim, dropout=0):
    # Normalization and Attention
    x = layers.LayerNormalization(epsilon=1e-6)(inputs)
    x = layers.MultiHeadAttention(
        key_dim=head_size, num_heads=num_heads, dropout=dropout
    )(x, x)
    x = layers.Dropout(dropout)(x)
    res = x + inputs

    # Feed Forward Part
    x = layers.LayerNormalization(epsilon=1e-6)(res)
    x = layers.Conv1D(filters=ff_dim, kernel_size=1, activation="relu")(x)
    x = layers.Dropout(dropout)(x)
    x = layers.Conv1D(filters=inputs.shape[-1], kernel_size=1)(x)
    return x + res

def build_model(
    input_shape,
    head_size,
    num_heads,
    ff_dim,
    num_transformer_blocks,
    mlp_units,
    dropout=0,
    mlp_dropout=0,
):
    vocab_size = 3000
    embed_size = 50
    max_len = 256
    inputs = tf.keras.Input(shape=input_shape, dtype="int32")
    x = tf.keras.layers.Embedding(input_dim=vocab_size,
                              output_dim=embed_size,
                              input_length=max_len)(inputs)
    for _ in range(num_transformer_blocks):
        x = transformer_encoder(x, head_size, num_heads, ff_dim, dropout)

    x = layers.GlobalAveragePooling1D(data_format="channels_first")(x)
    for dim in mlp_units:
        x = layers.Dense(dim, activation="relu")(x)
        x = layers.Dropout(mlp_dropout)(x)
    outputs = layers.Dense(1, activation="sigmoid")(x)
    return keras.Model(inputs, outputs)

# Shape of one sample as a tuple; the trailing comma matters: (256,) rather than 256
input_shape = (training_padded.shape[1],)

model = build_model(
    input_shape,
    head_size=256,
    num_heads=4,
    ff_dim=4,
    num_transformer_blocks=4,
    mlp_units=[128],
    mlp_dropout=0.4,
    dropout=0.2,
)

model.compile(loss=tf.losses.BinaryCrossentropy(),
              optimizer='adam',
              metrics=['accuracy'])
print("Version: ", tf.__version__)
print("Eager mode: ", tf.executing_eagerly())
print("GPU is", "available" if tf.config.list_physical_devices('GPU') else "NOT AVAILABLE")
model.summary()
callbacks = [keras.callbacks.EarlyStopping(monitor="val_loss",
                                            patience=10, 
                                            restore_best_weights=True)]
model.fit(training_padded,
          y_train,
          validation_split = 0.2,
          epochs=20,
          batch_size=16,
          verbose=1,
          callbacks=callbacks)

test_sequences = tokenizer.texts_to_sequences(x_test)
test_padded = pad_sequences(test_sequences, maxlen=256, truncating='post')

test_loss_score, test_acc_score = model.evaluate(test_padded, y_test)

print(f'Final Loss: {round(test_loss_score, 2)}.')
print(f'Final Performance: {round(test_acc_score * 100, 2)} %.')
model.save("models\\toxic_detection_transformer.h5")

Version:  2.10.0
Eager mode:  True
GPU is available
Model: "model_2"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
 input_3 (InputLayer)           [(None, 256)]        0           []                               
                                                                                                  
 embedding_2 (Embedding)        (None, 256, 50)      150000      ['input_3[0][0]']                
                                                                                                  
 layer_normalization_8 (LayerNo  (None, 256, 50)     100         ['embedding_2[0][0]']            
 rmalization)                                                                                     
                                                                                                  
 multi_head_attention_4 (MultiH  (None, 

**Below we can test our pre-trained model.** 🙃

In [5]:
import json
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow import keras
from keras_preprocessing.sequence import pad_sequences
from keras.preprocessing.text import Tokenizer, tokenizer_from_json

model = keras.models.load_model('models\\toxic_detection_transformer.h5')

with open('models\\tokenizer_toxic_detection.json') as f:
    data = json.load(f)
    tokenizer = tokenizer_from_json(data)
    word_index = tokenizer.word_index

strings = [
    'I think you should shut up your big mouth',
    'I do not agree with you'
]

preds = model.predict(
    pad_sequences(
        tokenizer.texts_to_sequences(strings),
        maxlen=256,
        truncating='post'
    ),
    verbose=0)

for i, string in enumerate(strings):
    print(f'{string}\n')
    print(f'Text has toxic content 😔 {round((1 - preds[i][0]) * 100, 2)}% | Text has no toxic content 😊 {round(preds[i][0] * 100, 2)}\n\n{"*" * 50}')

I think you should shut up your big mouth

Text has toxic content 😔 98.89% | Text has no toxic content 😊 1.11

**************************************************
I do not agree with you

Text has toxic content 😔 7.25% | Text has no toxic content 😊 92.75

**************************************************
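**In the printout above, the model's sigmoid output is treated as the probability that a text has _no_ toxic content, so the toxic probability is `1 - pred`. A minimal sketch of turning that score into a hard label follows; the `describe` helper and the 0.5 threshold are assumptions for illustration, and the two inputs are the scores reported above.**

```python
def describe(prob_not_toxic, threshold=0.5):
    """Hypothetical helper: convert the model's output into a label and a percentage."""
    prob_toxic = 1.0 - prob_not_toxic
    label = "toxic" if prob_toxic >= threshold else "not toxic"
    return label, round(prob_toxic * 100, 2)


# Scores taken from the two example sentences above (1.11% and 92.75% "no toxic content")
results = [describe(p) for p in (0.0111, 0.9275)]
for label, percent_toxic in results:
    print(f"{label}: {percent_toxic}% toxic")
```

**In a moderation setting, the threshold would normally be tuned on validation data, since false positives and false negatives rarely have the same cost.**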


**You can try to repurpose this architecture for other applications and tasks, like multi-class classification instead of binary classification.**
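**For the multi-class case, the main change is at the output: the single sigmoid unit becomes one unit per class with a softmax, and the loss becomes categorical cross-entropy. In Keras, this would roughly mean a `Dense(num_classes, activation="softmax")` output layer trained with a sparse categorical cross-entropy loss. The NumPy sketch below shows the underlying computation on made-up logits for three hypothetical classes.**

```python
import numpy as np


def softmax(logits):
    """Row-wise, numerically stable softmax."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)


def categorical_cross_entropy(probs, labels):
    """Mean negative log-probability assigned to the true class."""
    true_class_probs = probs[np.arange(len(labels)), labels]
    return float(-np.log(true_class_probs).mean())


# Made-up logits for two samples and three hypothetical classes
logits = np.array([[2.0, 0.5, -1.0],
                   [0.1, 0.2, 3.0]])
labels = np.array([0, 2])  # integer class ids

probs = softmax(logits)
loss = categorical_cross_entropy(probs, labels)
print(probs.argmax(axis=-1), loss)
```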
