<a href="https://colab.research.google.com/github/sljm12/machine_learning_notebooks/blob/master/nlp/yoda_translator/Yoda_translator_with_T5.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Few-shot text generation with T5 Transformer**

This notebook is adapted from https://towardsdatascience.com/poor-mans-gpt-3-few-shot-text-generation-with-t5-transformer-51f1b01f843e.

I updated it to work with transformers 4.19.2; the original targeted version 2.9.

The goal is to see whether the same few-shot fine-tuning approach can also produce a Yoda translator.

## 1. Install libraries

In [None]:
!pip install transformers sentencepiece --quiet

In [None]:
# Check we have a GPU and check the memory size of the GPU
!nvidia-smi

Thu May 19 05:39:30 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   57C    P8    10W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

## 2. Prepare Model

In [None]:

import random

import numpy as np
import torch
# transformers.AdamW is deprecated; torch.optim.AdamW accepts the same arguments used below
from torch.optim import AdamW
from transformers import T5ForConditionalGeneration, T5Tokenizer

def set_seed(seed):
  # Seed every random number generator the notebook touches so runs are repeatable
  random.seed(seed)
  np.random.seed(seed)
  torch.manual_seed(seed)

set_seed(42)
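
As a quick illustration of what seeding buys, the sketch below re-seeds Python's built-in generator and checks that the same draws come back. It covers only the `random` module; `seed_python_rng` is a hypothetical, trimmed-down stand-in for `set_seed`, and NumPy and PyTorch are assumed to behave analogously.

```python
import random

def seed_python_rng(seed: int) -> None:
    """Seed only Python's built-in generator (hypothetical, trimmed-down set_seed)."""
    random.seed(seed)

seed_python_rng(42)
first_draw = [random.random() for _ in range(3)]

seed_python_rng(42)
second_draw = [random.random() for _ in range(3)]

# Re-seeding with the same value replays the same sequence.
print(first_draw == second_draw)  # True
```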

In [None]:
ENGLISH_TO_YODA = "e_to_y:"
YODA_TO_ENGLISH = "y_to_e:"
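
The two prefixes tell a single T5 model which direction to translate. A minimal sketch of how a prefix is combined with a sentence, mirroring the string concatenation used in the training loop; `build_input` is a hypothetical helper that does not exist in the notebook.

```python
ENGLISH_TO_YODA = "e_to_y:"
YODA_TO_ENGLISH = "y_to_e:"

def build_input(prefix: str, sentence: str) -> str:
    """Prepend the task prefix so the model knows which direction to translate."""
    return f"{prefix} {sentence.strip()}"

print(build_input(YODA_TO_ENGLISH, "Earned it I have"))
print(build_input(ENGLISH_TO_YODA, "i have earned it"))
```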

In [None]:
from transformers import T5Tokenizer, T5ForConditionalGeneration
tokenizer = T5Tokenizer.from_pretrained('t5-base')
t5_model = T5ForConditionalGeneration.from_pretrained('t5-base')


Downloading:   0%|          | 0.00/773k [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/1.17k [00:00<?, ?B/s]

For now, this behavior is kept to avoid breaking backwards compatibility when padding/encoding with `truncation is True`.
- Be aware that you SHOULD NOT rely on t5-base automatically truncating your input to 512 when padding/encoding.
- If you want to encode/pad to sequences longer than 512 you can either instantiate this tokenizer with `model_max_length` or pass `max_length` when encoding/padding.


Downloading:   0%|          | 0.00/850M [00:00<?, ?B/s]

In [None]:
t5_model.cuda()

T5ForConditionalGeneration(
  (shared): Embedding(32128, 768)
  (encoder): T5Stack(
    (embed_tokens): Embedding(32128, 768)
    (block): ModuleList(
      (0): T5Block(
        (layer): ModuleList(
          (0): T5LayerSelfAttention(
            (SelfAttention): T5Attention(
              (q): Linear(in_features=768, out_features=768, bias=False)
              (k): Linear(in_features=768, out_features=768, bias=False)
              (v): Linear(in_features=768, out_features=768, bias=False)
              (o): Linear(in_features=768, out_features=768, bias=False)
              (relative_attention_bias): Embedding(32, 12)
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (1): T5LayerFF(
            (DenseReluDense): T5DenseReluDense(
              (wi): Linear(in_features=768, out_features=3072, bias=False)
              (wo): Linear(in_features=3072, out_features=768, bias=False)
              (dropout): Dr

In [None]:
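# optimizer
# The two groups exist so that biases and layer norms can be exempted from weight
# decay. Both groups use weight_decay=0.0 here, so the split currently has no effect.
# T5 names its norm layers "layer_norm", so the "LayerNorm.weight" pattern matches nothing.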
no_decay = ["bias", "LayerNorm.weight"]
optimizer_grouped_parameters = [
    {
        "params": [p for n, p in t5_model.named_parameters() if not any(nd in n for nd in no_decay)],
        "weight_decay": 0.0,
    },
    {
        "params": [p for n, p in t5_model.named_parameters() if any(nd in n for nd in no_decay)],
        "weight_decay": 0.0,
    },
]
optimizer = AdamW(optimizer_grouped_parameters, lr=3e-4, eps=1e-8)




In [None]:
from pathlib import Path
# yoda.txt holds one pair per line: the Yoda-style sentence, a tab, then plain English
text = Path("yoda.txt").read_text("utf8")
data = [t.split("\t") for t in text.split("\n") if len(t.strip()) > 0]
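
The loading step can be sketched on an in-memory string, which also shows how the reversed English-to-Yoda pairs are derived. This is an illustrative variant, not the notebook's code: `parse_pairs` is a hypothetical helper, and unlike the one-liner above it strips whitespace and raises on lines that do not have exactly two fields.

```python
from typing import List

def parse_pairs(text: str) -> List[List[str]]:
    """Split tab-separated text into [yoda, english] pairs, skipping blank lines."""
    pairs = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        if not line.strip():
            continue
        fields = line.split("\t")
        if len(fields) != 2:
            raise ValueError(
                f"line {line_no}: expected 2 tab-separated fields, got {len(fields)}"
            )
        pairs.append([fields[0].strip(), fields[1].strip()])
    return pairs

# Two sample pairs taken from the dataset, separated by a blank line.
sample = "Earned it I have\ti have earned it\n\nConsume you, it will\tit will consume you\n"

yoda_to_english = parse_pairs(sample)
english_to_yoda = [[english, yoda] for yoda, english in yoda_to_english]

print(yoda_to_english)
print(english_to_yoda)
```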

In [None]:
yodaish_english = data

In [None]:
english_yodaish = [[i[1], i[0]] for i in yodaish_english]

In [None]:
yodaish_english

[['Consume you, it will', 'it will consume you'],
 ['Patience you must have my young Padawan.',
  'you must have patience my young padawan'],
 ['Powerful you have become, the dark side I sense in you',
  'you have become powerful, i sense in you the dark side'],
 ['The shadow of greed, that is', 'that is the shadow of greed'],
 ['Truly wonderful the mind of a child is',
  'the mind of a child is truly wonderful'],
 ['Judge me by my size, do you', 'do  you judge me by my size'],
 ['Impossible to see the light, the future is',
  'the future is impossible to see the light'],
 ['To answer power with power, the Jedi way this is not',
  'The jedi way is not to answer power with power'],
 ['In this war, a danger there is, of losing who we are',
  'There is a danger of losing who we are in this war'],
 ['Named must your fear be before banish it you can.',
  'Your fear must be named before you can banish it'],
 ['Earned it I have', 'i have earned it'],
 ['Soon will I rest, yes, forever sleep',


In [None]:
english_yodaish

[['it will consume you', 'Consume you, it will'],
 ['you must have patience my young padawan',
  'Patience you must have my young Padawan.'],
 ['you have become powerful, i sense in you the dark side',
  'Powerful you have become, the dark side I sense in you'],
 ['that is the shadow of greed', 'The shadow of greed, that is'],
 ['the mind of a child is truly wonderful',
  'Truly wonderful the mind of a child is'],
 ['do  you judge me by my size', 'Judge me by my size, do you'],
 ['the future is impossible to see the light',
  'Impossible to see the light, the future is'],
 ['The jedi way is not to answer power with power',
  'To answer power with power, the Jedi way this is not'],
 ['There is a danger of losing who we are in this war',
  'In this war, a danger there is, of losing who we are'],
 ['Your fear must be named before you can banish it',
  'Named must your fear be before banish it you can.'],
 ['i have earned it', 'Earned it I have'],
 ['Yes soon i will rest, forever sleep',
 

In [None]:
list_tokens = [len(tokenizer.encode(i[0])) for i in yodaish_english]
max_length = max(list_tokens)
print(max_length)

16


In [None]:
for yoda, english in yodaish_english:
    print(yoda, "-", english)

Consume you, it will - it will consume you
Patience you must have my young Padawan. - you must have patience my young padawan
Powerful you have become, the dark side I sense in you - you have become powerful, i sense in you the dark side
The shadow of greed, that is - that is the shadow of greed
Truly wonderful the mind of a child is - the mind of a child is truly wonderful
Judge me by my size, do you - do  you judge me by my size
Impossible to see the light, the future is - the future is impossible to see the light
To answer power with power, the Jedi way this is not - The jedi way is not to answer power with power
In this war, a danger there is, of losing who we are - There is a danger of losing who we are in this war
Named must your fear be before banish it you can. - Your fear must be named before you can banish it
Earned it I have - i have earned it
Soon will I rest, yes, forever sleep - Yes soon i will rest, forever sleep
When you look at the dark side, careful you must be - you 

## 3. Train Loop

In [None]:
def train(data, init_words, num_epochs=10):
  t5_model.train()

  for epoch in range(num_epochs):
    print ("epoch ",epoch)
    for source, target in data:
      # The tokenizer appends </s> itself, so it is not added by hand
      input_sent = init_words + " " + source
      output_sent = target

      # padding="max_length" replaces the deprecated pad_to_max_length=True
      tokenized_inp = tokenizer.encode_plus(input_sent, max_length=96, padding="max_length", truncation=True, return_tensors="pt").to('cuda')
      tokenized_output = tokenizer.encode_plus(output_sent, max_length=96, padding="max_length", truncation=True, return_tensors="pt").to('cuda')

      input_ids = tokenized_inp["input_ids"]
      attention_mask = tokenized_inp["attention_mask"]

      lm_labels = tokenized_output["input_ids"]
      decoder_attention_mask = tokenized_output["attention_mask"]

      # the forward function automatically creates the correct decoder_input_ids
      model_output = t5_model(input_ids=input_ids, labels=lm_labels,
                              decoder_attention_mask=decoder_attention_mask, attention_mask=attention_mask)
      loss = model_output[0]

      # One optimizer step per sentence pair (batch size 1)
      loss.backward()
      optimizer.step()
      optimizer.zero_grad()
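
Because every target is padded to 96 tokens, most label positions are padding. A common refinement, which this notebook does not use, is to replace pad ids in the labels with -100, the value that PyTorch's cross-entropy loss ignores by default. The sketch below shows the idea on a plain list; `mask_padding` is a hypothetical helper, the token ids are made up, and the pad id of 0 is an assumption about T5 that is better read from `tokenizer.pad_token_id`.

```python
from typing import List

PAD_TOKEN_ID = 0      # assumption: T5's pad id; prefer tokenizer.pad_token_id in the notebook
IGNORE_INDEX = -100   # label value skipped by PyTorch's cross-entropy loss by default

def mask_padding(label_ids: List[int], pad_id: int = PAD_TOKEN_ID) -> List[int]:
    """Replace padding positions with IGNORE_INDEX so they do not contribute to the loss."""
    return [IGNORE_INDEX if token_id == pad_id else token_id for token_id in label_ids]

# Hypothetical token ids: three real tokens, the end-of-sequence id 1, then padding.
labels = [27, 43, 4964, 1, 0, 0, 0]
print(mask_padding(labels))
```

On the tensors in the training loop, the equivalent would be `lm_labels[lm_labels == tokenizer.pad_token_id] = -100` before the forward pass.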

In [None]:
train(yodaish_english, YODA_TO_ENGLISH)

epoch  0
epoch  1
epoch  2
epoch  3
epoch  4


In [None]:
train(english_yodaish, ENGLISH_TO_YODA)

epoch  0
epoch  1
epoch  2
epoch  3
epoch  4



## 4. Test model

In [None]:
def test_text(s, init_words):
  # The tokenizer appends </s> itself, so it is not added by hand
  test_sent = init_words + ' ' + s
  test_tokenized = tokenizer.encode_plus(test_sent, return_tensors="pt").to('cuda')

  test_input_ids = test_tokenized["input_ids"]
  test_attention_mask = test_tokenized["attention_mask"]

  t5_model.eval()
  # Beam search: explore 10 beams, return the 3 best, and forbid repeated bigrams
  beam_outputs = t5_model.generate(
      input_ids=test_input_ids, attention_mask=test_attention_mask,
      max_length=96,
      early_stopping=True,
      num_beams=10,
      num_return_sequences=3,
      no_repeat_ngram_size=2
  )

  for beam_output in beam_outputs:
      sent = tokenizer.decode(beam_output, skip_special_tokens=True, clean_up_tokenization_spaces=True)
      print (sent)
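
Beam search can return candidates that differ only trivially, for example by capitalization. A small post-processing sketch, assuming the decoded sentences have been collected into a list instead of printed; `unique_candidates` is a hypothetical helper and not part of the notebook.

```python
from typing import List

def unique_candidates(sentences: List[str]) -> List[str]:
    """Drop candidates that differ only by case, surrounding whitespace, or a final period."""
    seen = set()
    unique = []
    for sentence in sentences:
        key = sentence.strip().rstrip(".").lower()
        if key not in seen:
            seen.add(key)
            unique.append(sentence)
    return unique

candidates = [
    "The mind of a child is truly wonderful",
    "Truly wonderful the mind of a child is",
    "the mind of a child is truly wonderful",
]
print(unique_candidates(candidates))
```

The first spelling of each candidate is kept, so the ranking produced by beam search is preserved.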

In [None]:
# A reordered variant of a training sentence; this exact wording is not in the training set
test_text("Named must be your fear before banish it you can",YODA_TO_ENGLISH)

Your fear must be before banish it you can.
Named must your fear be before banish it you can.
Named must be your fear be before banish it you can.


In [None]:
# This sentence is not in the training set
test_text("When nine hundred years old you reach, look as good you will not",YODA_TO_ENGLISH)

You will not look as good when you reach nine hundred years old
You will not look good when you reach nine hundred years old
You will not look when you reach nine hundred years old


In [None]:
# Almost identical to a training sentence; only the punctuation differs
test_text("Truly wonderful, the mind of a child is.",YODA_TO_ENGLISH)

The mind of a child is truly wonderful
Truly wonderful the mind of a child is
the mind of a child is truly wonderful


In [None]:
test_text("Luminous beings are we…not this crude matter",YODA_TO_ENGLISH)

We are not this crude matter
We are Luminous beings not this crude matter
We are brilliant beings, not this crude matter


In [None]:
test_text("I am sick of you",ENGLISH_TO_YODA)

Sick of you, I am
Saill of you, I am
Sick of you I am


In [None]:
t5_model.save_pretrained("./yoda_translator")

In [None]:
!tar -czvf model.zip /content/sample_data/*

tar: Removing leading `/' from member names
/content/sample_data/anscombe.json
/content/sample_data/california_housing_test.csv
/content/sample_data/california_housing_train.csv
/content/sample_data/config.json
/content/sample_data/mnist_test.csv
/content/sample_data/mnist_train_small.csv
/content/sample_data/pytorch_model.bin
/content/sample_data/README.md
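
Note that `tar -czvf` produces a gzip-compressed tar archive, so `model.zip` is not a zip file despite its name. If an actual zip is wanted, one option is Python's standard library. The sketch below is illustrative only: `archive_directory` is a hypothetical helper, and the directory and placeholder files are created in a temporary folder to stand in for a saved model.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path

def archive_directory(source_dir: Path, archive_stem: Path) -> Path:
    """Create <archive_stem>.zip containing the files under source_dir."""
    return Path(shutil.make_archive(str(archive_stem), "zip", root_dir=str(source_dir)))

with tempfile.TemporaryDirectory() as tmp:
    model_dir = Path(tmp) / "yoda_translator"   # stands in for the save_pretrained output
    model_dir.mkdir()
    (model_dir / "config.json").write_text("{}")            # placeholder files for the demo
    (model_dir / "pytorch_model.bin").write_bytes(b"\x00")

    archive_path = archive_directory(model_dir, Path(tmp) / "model")
    with zipfile.ZipFile(archive_path) as zf:
        names = sorted(zf.namelist())
    print(archive_path.name, names)
```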


In [None]:
!ls -lh /content

total 771M
-rw-r--r-- 1 root root 771M May 18 05:50 model.zip
drwxr-xr-x 1 root root 4.0K May 18 05:48 sample_data
drwxr-xr-x 2 root root 4.0K May 18 05:48 save_model


In [None]:
torch.save(t5_model, "./a")

PermissionError: [Errno 13] Permission denied: './a'