## BERT solution

This notebook presents a solution based on a pre-trained BERT model: toxic words in a sentence are masked, and BERT predicts replacements that fit the context. Toxic words are identified with a predefined word list.

In [1]:
#!pip install transformers

In [2]:
import torch
from transformers import BertTokenizer, BertForMaskedLM
from tqdm import tqdm
import re
from nltk.tokenize import RegexpTokenizer



## Data Loading

#### 1. Load only the test dataset, since the model is pre-trained and needs no training data

In [3]:
import pandas as pd

test_df = pd.read_csv('../data/interim/test.csv')

X = test_df['source'].to_list()
y = test_df['target'].to_list()

#### 2. Load only the list of toxic words found on the Internet, because the baseline solution worked better with just this list

In [4]:
toxic_words = []

with open('../data/external/profanity_words_en.txt', "r") as f:
    for line in f:
        # strip() drops the newline without truncating a last line that lacks one
        word = line.strip()
        if word:
            toxic_words.append(word)

## Initialize the BERT tokenizer and model

In [5]:
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertForMaskedLM.from_pretrained('bert-base-uncased')

Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForMaskedLM: ['cls.seq_relationship.weight', 'bert.pooler.dense.weight', 'bert.pooler.dense.bias', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


## BERT model

#### A function that replaces toxic words with "[MASK]"

In [6]:
def add_mask(sentence, toxic_words):
    
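    # Split the sentence into words and punctuation marks; the local name
    # word_tokenizer avoids confusion with the global BERT tokenizer
    word_tokenizer = RegexpTokenizer(r"\b\w+\b|[.,!?'\"]")
    tokens = word_tokenizer.tokenize(sentence)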
    
    mask_sentence = []
    
    for token in tokens:
        
        if token.lower() in toxic_words:
            # If it's toxic, replace it with [MASK]
            mask_sentence.append('[MASK]')
        else:
            # Otherwise, keep the token as is
            mask_sentence.append(token)
            
    result_sentence = ' '.join(mask_sentence)

    # Correctly handle contractions like "It's"
    result_sentence = re.sub(r"(\w+) ' (\w+)", r"\1'\2", result_sentence)

    # Remove spaces before punctuation
    result_sentence = re.sub(r" ([.,!?])", r"\1", result_sentence)

    return result_sentence
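
The masking step can be tried in isolation with the standard library alone. The sketch below is a simplified stand-in for `add_mask`, not the notebook's own function: the name `mask_toxic` and the one-word toxic set are hypothetical, and `re.findall` is assumed to behave like `RegexpTokenizer` for this pattern. It also shows a side effect visible in the outputs further down: characters that the pattern does not match (dashes, the typographic apostrophe) are dropped, so a sentence can change even when no word is masked.

```python
import re

# Same pattern as in add_mask: whole words, or one of a few punctuation marks
TOKEN_PATTERN = re.compile(r"\b\w+\b|[.,!?'\"]")

def mask_toxic(sentence, toxic_words):
    """Replace every token found in toxic_words with [MASK] (illustrative sketch)."""
    tokens = TOKEN_PATTERN.findall(sentence)
    masked = ["[MASK]" if t.lower() in toxic_words else t for t in tokens]
    text = " ".join(masked)
    # Re-attach contractions such as "don ' t" and remove spaces before punctuation
    text = re.sub(r"(\w+) ' (\w+)", r"\1'\2", text)
    text = re.sub(r" ([.,!?])", r"\1", text)
    return text

toxic = {"hell"}  # hypothetical one-word list, for illustration only
print(mask_toxic("What the hell, man!", toxic))  # What the [MASK], man!
print(mask_toxic("He--he is fine.", toxic))      # He he is fine.
print(mask_toxic("I\u2019m fine.", toxic))       # I m fine.
```

Using a `set` for the word list keeps each membership check constant-time, which may matter when thousands of sentences are processed.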

#### A function using a pre-trained model to replace "[MASK]" with a matching word

In [7]:
def detoxify_sentences(sentences, toxic_words, tokenizer):
    
    detoxified_sentences = []
    
    for sentence in tqdm(sentences):
    
        mask_sentence = add_mask(sentence, toxic_words)

        if '[MASK]' in mask_sentence:

            # Tokenize the sentence
            inputs = tokenizer(mask_sentence, return_tensors="pt", padding=True, truncation=True)
            
            # Find the positions of the [MASK] tokens in the input
            mask_positions = torch.where(inputs.input_ids[0] == tokenizer.mask_token_id)

            # Predict vocabulary scores for every token position (uses the global `model`)
            with torch.no_grad():
                predictions = model(**inputs).logits[0]
                
            # Initialize a list to store predicted words
            predicted_words = []

            # Extract the predicted words for each [MASK] token
            for position in mask_positions[0]:
                predicted_word_index = torch.argmax(predictions[position]).item()
                predicted_word = tokenizer.convert_ids_to_tokens(predicted_word_index)
                predicted_words.append(predicted_word)

            # Replace the [MASK] tokens with the predicted words
            for predicted_word in predicted_words:
                mask_sentence = mask_sentence.replace("[MASK]", predicted_word, 1)

        detoxified_sentences.append(mask_sentence)
        
    return detoxified_sentences
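
`detoxify_sentences` always takes the single highest-scoring token for each mask. A possible refinement, which is not part of this notebook, is to look at several top candidates and skip the ones that are WordPiece continuation pieces (tokens starting with `##`) or that are themselves in the toxic word list. The sketch below shows only that selection logic on plain strings. The helper names and the candidate lists are hypothetical; in practice the candidates might be obtained with `torch.topk` on `predictions[position]` followed by `tokenizer.convert_ids_to_tokens`.

```python
def pick_replacement(candidates, toxic_words):
    """Return the first candidate that is a whole, non-toxic word.

    candidates is assumed to be a non-empty list ordered from most to least likely.
    """
    for token in candidates:
        if token.startswith("##"):        # WordPiece continuation piece, not a full word
            continue
        if token.lower() in toxic_words:  # do not replace one toxic word with another
            continue
        return token
    # Nothing passed the filters: fall back to the top candidate, as the notebook does
    return candidates[0]

def fill_masks(masked_sentence, candidates_per_mask, toxic_words):
    """Fill each [MASK] from left to right with its selected candidate."""
    for candidates in candidates_per_mask:
        word = pick_replacement(candidates, toxic_words)
        masked_sentence = masked_sentence.replace("[MASK]", word, 1)
    return masked_sentence

# Hypothetical candidates, not real model output
candidates = [["##s", "idiot", "person", "man"]]
print(fill_masks("you are a [MASK].", candidates, {"idiot"}))  # you are a person.
```

Whether this filtering improves the results on the test set has not been measured here.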


### Functions for comparing sentences

In [8]:
def basic_comparison(X, output):
    same = 0
    for i in range(len(X)):
        
        # Check if the output sentence differs from the original
        if X[i] != output[i]:
            if len(X[i]) <= 50:
                print(f'{i + 1}. Before the algorithm: {X[i]}\nAfter the algorithm: {output[i]}')
        else:
            # Count identical sentences
            same += 1

    print('Number of identical sentences:', same)

In [9]:
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def cosine_similar(sentence1, sentence2):
    
    # Create a CountVectorizer to convert sentences into a bag of words representation
    vectorizer = CountVectorizer()
    sentences = [sentence1, sentence2]
    v = vectorizer.fit_transform(sentences)

    # Calculate cosine similarity between the sentences
    cosine_similarities = cosine_similarity(v)
    
    return cosine_similarities[0][1]
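
To make the metric concrete, the same computation can be written out by hand. The sketch below is a standard-library approximation of `cosine_similar`, under the assumption that `CountVectorizer` runs with its defaults (lowercasing, tokens of at least two word characters); the names `bag_of_words` and `cosine` are illustrative. It shows that the score measures word overlap only, so "good" and "terrible" count as equally different from each other as any other pair of distinct words.

```python
import math
import re
from collections import Counter

def bag_of_words(sentence):
    # Mirrors CountVectorizer's default behaviour: lowercase, tokens of 2+ word characters
    return Counter(re.findall(r"\b\w\w+\b", sentence.lower()))

def cosine(sentence1, sentence2):
    a, b = bag_of_words(sentence1), bag_of_words(sentence2)
    dot = sum(a[token] * b[token] for token in a)  # missing tokens count as 0
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Bags are {good, cook} and {terrible, cook}: one shared word out of two, so about 0.5
print(cosine("I m a good cook.", "I'm a terrible cook."))
```

Single-character tokens such as "I", "m" and "a" are ignored, which is why the missing apostrophe in the first sentence does not affect the score.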

In [10]:
def comparison_with_cosine(output, y, X):
    
    cs = cosine_similar(y, output)
    
    print(f'Cosine similar: {cs}\nPredicted: {output}\nTarget: {y}\nSource: {X}\n')

### Solution

In [11]:
output = detoxify_sentences(X[:5000], toxic_words, tokenizer)

100%|██████████████████████████████████████████████████████████████████████████████| 5000/5000 [06:42<00:00, 12.41it/s]


#### 1. Comparison of the predicted sentence and the source

In [12]:
basic_comparison(X[:50], output[:50])

1. Before the algorithm: I don't know who the fuck Wang is!
After the algorithm: I don't know who the man guy is!
3. Before the algorithm: Oz Veuish and Stupid.
After the algorithm: Oz Veuish and die.
7. Before the algorithm: let's drink to fuck.
After the algorithm: let's drink to that.
16. Before the algorithm: you want to molest another doctor first?
After the algorithm: you want to see another doctor first?
22. Before the algorithm: The bald wanker knows something.
After the algorithm: The bald man knows something.
28. Before the algorithm: Hey fish, you get a lot of pussy, huh?
After the algorithm: Hey fish, you get a lot of attention, huh?
29. Before the algorithm: Shit, I ain't built like you, John.
After the algorithm: no, I ain't built like you, John.
30. Before the algorithm: Mick, he's holding me hostage. Shut up and listen.
After the algorithm: Mick, he's holding me hostage. listen up and listen.
39. Before the algorithm: He--he is... black smoke.
After the algorithm: He he

#### Conclusion

Comparing the predictions with the source sentences shows that the model does not perform well enough: almost half of the processed sentences were left unchanged. This is mostly a limitation of the toxic word list, which is not good enough to catch the toxic words in those sentences.

#### 2. Comparison between the predicted sentence and the target

In [13]:
for i in [6, 15, 41, 65, 78, 106]:

    comparison_with_cosine(output[i], y[i], X[i])

Cosine similar: 0.6708203932499369
Predicted: let's drink to that.
Target: Let's drink to somethin' else.
Source: let's drink to fuck.

Cosine similar: 0.801783725737273
Predicted: you want to see another doctor first?
Target: You want to annoy another doctor first? Eventually...
Source: you want to molest another doctor first?

Cosine similar: 1.0
Predicted: well, that was good.
Target: well, that was good.
Source: Damn,that was good.

Cosine similar: 0.4999999999999999
Predicted: I m a good cook.
Target: I'm a terrible cook.
Source: I’m a pathetic cook.

Cosine similar: 0.5477225575051662
Predicted: you hit your own head.
Target: You banged your head real bad.
Source: you hit your fucking head.

Cosine similar: 0.8117077033708014
Predicted: first my potatoes, then my tomatoes, then my salad, and now my salad and now the green beans.
Target: First my potatoes, then my tomatoes, then my lettuces, now my goddam beans.
Source: first my potatoes, then my tomatoes, then my salad, and now m

#### Conclusion

Comparing the predictions with the target sentences shows that in some cases the algorithm performs well (high similarity to the target). However, because the model does not know which word was behind the mask, it sometimes changes the meaning of a collocation or of the whole sentence.

### Conclusion

Overall, the pre-trained BERT model handles this task reasonably well. Unlike the baseline solution, it takes the context of the sentence into account and substitutes words that fit it. Its main weakness is that it depends on a list of toxic words, so it is worth either finding a more suitable word list or moving to a list-independent solution.