# Train a PyTorch transformer

To implement the PyTorch transformer, I followed the [PyTorch translation transformer tutorial](https://pytorch.org/tutorials/beginner/translation_transformer.html?highlight=seq2seq).

For the whole implementation, check the `src/models/pytorch_transformer` folder.

In [None]:
# Installations needed if using Colab
# !pip install tensorflow-gpu==2.8.0
# !pip install torchtext

## Cloning the repository

In [4]:
!git clone https://github.com/leiluk1/text-detoxification.git

Cloning into 'text-detoxification'...
remote: Enumerating objects: 159, done.
remote: Counting objects: 100% (136/136), done.
remote: Compressing objects: 100% (99/99), done.
remote: Total 159 (delta 54), reused 109 (delta 29), pack-reused 23
Receiving objects: 100% (159/159), 44.15 MiB | 15.86 MiB/s, done.
Resolving deltas: 100% (55/55), done.


In [5]:
# If using Colab
%cd ./text-detoxification/

/content/text-detoxification


## Making a dataset

In [6]:
!python ./src/data/make_dataset.py 

Done successfully! Check data/interim folder.


## Training a model

In [7]:
!python ./src/models/pytorch_transformer/train.py --batch_size 32 --epochs 10

100% 5000/5000 [07:16<00:00, 11.45it/s]
100% 1094/1094 [00:27<00:00, 39.59it/s]
Epoch: 1, Train loss: 3.935, Val loss: 3.144
100% 5000/5000 [07:23<00:00, 11.29it/s]
100% 1094/1094 [00:27<00:00, 39.88it/s]
Epoch: 2, Train loss: 3.012, Val loss: 2.821
100% 5000/5000 [07:22<00:00, 11.30it/s]
100% 1094/1094 [00:27<00:00, 39.08it/s]
Epoch: 3, Train loss: 2.740, Val loss: 2.661
100% 5000/5000 [07:21<00:00, 11.31it/s]
100% 1094/1094 [00:27<00:00, 39.64it/s]
Epoch: 4, Train loss: 2.578, Val loss: 2.581
100% 5000/5000 [07:22<00:00, 11.31it/s]
100% 1094/1094 [00:27<00:00, 39.62it/s]
Epoch: 5, Train loss: 2.457, Val loss: 2.535
100% 5000/5000 [07:23<00:00, 11.26it/s]
100% 1094/1094 [00:27<00:00, 39.61it/s]
Epoch: 6, Train loss: 2.358, Val loss: 2.494
100% 5000/5000 [07:23<00:00, 11.28it/s]
100% 1094/1094 [00:27<00:00, 39.54it/s]
Epoch: 7, Train loss: 2.274, Val loss: 2.472
100% 5000/5000 [07:24<00:00, 11.25it/s]
100% 1094/1094 [00:27<00:00, 39.82it/s]
Epoch: 8, Train loss: 2.197, Val loss: 2.472
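
In the log above, the validation loss stops improving after epoch 7 (2.472 at both epochs 7 and 8) while the training loss keeps falling. A minimal sketch for pulling the best epoch out of such a log, assuming the `Epoch: N, Train loss: X, Val loss: Y` line format shown; the `parse_log` and `best_epoch` helpers are hypothetical and not part of the repository:

```python
import re

# Epoch lines copied from the training output above.
LOG = """\
Epoch: 1, Train loss: 3.935, Val loss: 3.144
Epoch: 2, Train loss: 3.012, Val loss: 2.821
Epoch: 3, Train loss: 2.740, Val loss: 2.661
Epoch: 4, Train loss: 2.578, Val loss: 2.581
Epoch: 5, Train loss: 2.457, Val loss: 2.535
Epoch: 6, Train loss: 2.358, Val loss: 2.494
Epoch: 7, Train loss: 2.274, Val loss: 2.472
Epoch: 8, Train loss: 2.197, Val loss: 2.472
"""

# Assumed line format; adjust the pattern if train.py prints something different.
PATTERN = re.compile(r"Epoch: (\d+), Train loss: ([\d.]+), Val loss: ([\d.]+)")


def parse_log(text):
    """Return a list of (epoch, train_loss, val_loss) tuples found in the log text."""
    return [(int(e), float(t), float(v)) for e, t, v in PATTERN.findall(text)]


def best_epoch(records):
    """Return the record with the lowest validation loss; ties go to the earlier epoch."""
    return min(records, key=lambda r: (r[2], r[0]))


records = parse_log(LOG)
best = best_epoch(records)
print(f"Best epoch: {best[0]} (val loss {best[2]:.3f})")
```

Breaking ties in favor of the earlier epoch is a design choice: when two checkpoints validate equally well, the one with less training is the less likely to have overfitted.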


## Making predictions on the test set

The test set consists of 5000 text examples drawn from the preprocessed dataset.

In [9]:
!python ./src/models/pytorch_transformer/predict.py

Generating predictions...: 100% 5000/5000 [05:38<00:00, 14.77it/s]
Done!


In [10]:
import pandas as pd

In [11]:
res = pd.read_csv('./data/interim/transformer_results.csv')
res.head()

Unnamed: 0,reference,detox_reference,tranformer_result
0,"one is a little fucked up, but they're...","One is higher-pitched, but they're...","one is a little messed up, but they're..."
1,"I dated my naked ass last year, but now that I...","Last year I went at this half-assed, but now I...","i've been banging my naked last year, but now..."
2,"So stop your fucking whining, and go into your...","so stop crying, and go to your beautiful house...",so stop whining and go to your beautiful$ 3 m...
3,"""useless.""","""And nothing.""",""" no use."""
4,you're useless as a nurse.,Not much of a nurse are you!,you're no use as a sister.
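
The `Unnamed: 0` column is most likely the saved row index being read back as data. A small sketch of one way to avoid it, assuming the CSV was written with pandas' default index: pass `index_col=0` to `pd.read_csv`. The in-memory sample below is a stand-in for `transformer_results.csv` and reuses two of the rows shown above.

```python
import io

import pandas as pd

# Stand-in for ./data/interim/transformer_results.csv, built from rows 3 and 4 above.
# The column name 'tranformer_result' is kept exactly as it appears in the output.
sample_csv = io.StringIO('''\
,reference,detox_reference,tranformer_result
3,"""useless.""","""And nothing.""",""" no use."""
4,you're useless as a nurse.,Not much of a nurse are you!,you're no use as a sister.
''')

# index_col=0 tells pandas to use the first column as the row index
# instead of loading it as a data column named 'Unnamed: 0'.
res = pd.read_csv(sample_csv, index_col=0)
print(res.columns.tolist())
```

With the real file, the same call would be `pd.read_csv('./data/interim/transformer_results.csv', index_col=0)`.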


## Inference: Some examples

In [17]:
!python ./src/models/pytorch_transformer/predict.py --inference "What a stupid joke."

Detoxified text:  what a joke. 


In [19]:
!python ./src/models/pytorch_transformer/predict.py --inference "Fucking damn joke!"

Detoxified text:  you got ta be kidding me. 


PyTorch transformer results:

| Source sentence       | Detoxified sentence        |
|-----------------------|----------------------------|
| "What a stupid joke." | "what a joke."             |
| "Fucking damn joke!"  | "you got ta be kidding me."|


In these examples, the PyTorch transformer removes the toxic wording, but part of the original meaning is lost. As the table shows, the first output simply drops the offensive word, while the second is replaced by a different phrase altogether, so the detoxified sentences are less expressive than the originals. Further improvements are needed to make the detoxified sentences more informative and natural-sounding.
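
One rough way to put a number on the lost information is the share of source words that survive in the output. The sketch below is a crude lexical proxy and not a metric used in this notebook; the `tokens` and `kept_ratio` helpers and the tokenization rule are assumptions made for illustration.

```python
import re


def tokens(text):
    """Lowercase the text and return its distinct word tokens as a set."""
    return set(re.findall(r"[a-z']+", text.lower()))


def kept_ratio(source, output):
    """Fraction of distinct source words that also appear in the output."""
    src = tokens(source)
    if not src:
        return 0.0
    return len(src & tokens(output)) / len(src)


# The two inference examples from the table above.
pairs = [
    ("What a stupid joke.", "what a joke."),
    ("Fucking damn joke!", "you got ta be kidding me."),
]

for source, output in pairs:
    print(f"{source!r} -> {output!r}: {kept_ratio(source, output):.2f}")
```

For the two examples this gives 0.75 and 0.00. The ratio cannot tell a removed toxic word from a removed content word, so a score of 1.0 is not the goal; it is only useful for spotting outputs that share little or nothing with their source.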