<a href="https://colab.research.google.com/github/samin9796/arg2keypoint/blob/main/Few_shot_learning_with_T5.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Few-shot text generation with T5 Transformer**

Author: Ramsri Goutham Golla

Linkedin : https://www.linkedin.com/in/ramsrig/

Twitter: https://twitter.com/ramsri_goutham

## 1. Install libraries

In [None]:
!pip install transformers==2.9.0

Collecting transformers==2.9.0
  Downloading transformers-2.9.0-py3-none-any.whl (635 kB)
Collecting tokenizers==0.7.0
  Downloading tokenizers-0.7.0-cp37-cp37m-manylinux1_x86_64.whl (5.6 MB)
Collecting sacremoses
  Downloading sacremoses-0.0.53.tar.gz (880 kB)
Collecting sentencepiece
  Downloading sentencepiece-0.1.96-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.2 MB)
Building wheels for collected packages: sacremoses
  Building wheel for sacremoses (setup.py) ... done
  Created wheel for sacremoses: filename=sacremoses-0.0.53-py3-none-any.whl size=895260 sha256=7482abd8eb22c6d6eae6f035dce5f336dcd8818854279879d5cbab0227b27464
  Stored in directory: /root/.cache/pip/wheels/87/39/dd/a83eeef36d0bf98e7a4d1933a

In [None]:
# Check that a GPU is available and how much memory it has
!nvidia-smi

Mon May  9 15:01:35 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla K80           Off  | 00000000:00:04.0 Off |                    0 |
| N/A   66C    P8    31W / 149W |      0MiB / 11441MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

## 2. Prepare Model

In [None]:

import random
import pandas as pd
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader

from transformers import (
    AdamW,
    T5ForConditionalGeneration,
    T5Tokenizer,
    get_linear_schedule_with_warmup
)

def set_seed(seed):
  random.seed(seed)
  np.random.seed(seed)
  torch.manual_seed(seed)

set_seed(42)

In [None]:
tokenizer = T5Tokenizer.from_pretrained('t5-base')
t5_model1 = T5ForConditionalGeneration.from_pretrained('t5-base')


In [None]:
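# optimizer: parameters are split into two groups so that weight decay can be set per group.
# Both groups use weight_decay 0.0 here, so the split currently has no effect.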
no_decay = ["bias", "LayerNorm.weight"]
optimizer_grouped_parameters = [
    {
        "params": [p for n, p in t5_model1.named_parameters() if not any(nd in n for nd in no_decay)],
        "weight_decay": 0.0,
    },
    {
        "params": [p for n, p in t5_model1.named_parameters() if any(nd in n for nd in no_decay)],
        "weight_decay": 0.0,
    },
]
optimizer = AdamW(optimizer_grouped_parameters, lr=3e-4, eps=1e-8)
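The grouping above routes parameters by substring match on their names. A minimal sketch of that matching logic on its own, using a hypothetical helper `split_by_weight_decay` and illustrative parameter names (neither comes from the notebook), may help when setting a non-zero decay for the first group:

```python
def split_by_weight_decay(param_names, no_decay=("bias", "LayerNorm.weight")):
    """Split parameter names into (decayed, not_decayed) by substring match.

    Mirrors the `any(nd in n for nd in no_decay)` test used for the optimizer groups.
    """
    decayed, not_decayed = [], []
    for name in param_names:
        if any(nd in name for nd in no_decay):
            not_decayed.append(name)
        else:
            decayed.append(name)
    return decayed, not_decayed


# Illustrative names only; inspect model.named_parameters() for the real ones.
names = [
    "encoder.block.0.layer.0.SelfAttention.q.weight",
    "encoder.block.0.layer.0.SelfAttention.relative_attention_bias.weight",
    "some_module.LayerNorm.weight",
]
decayed, not_decayed = split_by_weight_decay(names)
print(decayed)
print(not_decayed)
```

Because the match is a plain substring test, any name containing `bias` lands in the no-decay group, including names where `bias` is only part of a longer word.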



In [None]:
# dataset preparation: each tuple is (argument, key point, label), where label is "matched" or "unmatched"

arg_keypoint_tuples = [
                               ("a person should have the right to be able to choose if they want to live or die.","Assisted suicide reduces suffering", "unmatched"),
                               ("assisted suicide allows one to end terminal pain and suffering and should be allowed to continue.","Assisted suicide gives dignity to the person that wants to commit it", "unmatched"),
                               ("assisted suicide allows people who have terrible health conditions to die with dignity. if assisted suicide is a crime they may have to endure many more years living in agony.","People should have the freedom to choose to end their life", "unmatched"),
                               ("assisted suicide allows terminally ill people to die with dignity and should not be criminalized.","People should have the freedom to choose to end their life", "unmatched"),
                               ("assisted suicide allows terminally ill people to die with dignity and should not be criminalized.","The terminally ill would benefit from assisted suicide", "unmatched"),
                               ("assisted suicide helps terminally ill people end their suffering.","Assisted suicide gives dignity to the person that wants to commit it", "unmatched"),
                               ("Assisted suicide helps those who are in pain due to a devastating disease end their own lives on their own terms.","Assisted suicide gives dignity to the person that wants to commit it", "unmatched"),
                               ("assisted suicide is a human right and would spare the terminally i'll pain.","Assisted suicide gives dignity to the person that wants to commit it", "unmatched"),
                               ("helping someone commit suicide is like killing them yourself.","Assisted suicide is akin to killing someone","matched"),
                               ("helping someone kill themself is murder", "Assisted suicide is akin to killing someone", "matched"),
                               ("the vow of celibacy is unnatural and causes harm.","Celibacy is unhealthy/unnatural","matched"),
                               ("marriage is a sacred bond that is highly regarded in various religions.","Marriage is important for people, either generally or because of religious/traditional reasons","matched"),
                               ("abolishing intellectual property rights would damage the economy since there would be less incentive to bring new and innovative products to market","Intellectual property rights incentivize investment in developing new products","matched")

]
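Each tuple is later turned into a source prompt and a target string. The sketch below puts that formatting in one place and counts the labels as a quick balance check. `format_example` is a hypothetical helper, the two sample tuples are copied from the list above, and the prompt wording follows the test prompt used in section 4 (with a space before "are").

```python
from collections import Counter


def format_example(argument, keypoint, label):
    """Return (source, target) strings for one (argument, key point, label) tuple."""
    source = "The argument: " + argument + " and the keypoint: " + keypoint + " are </s>"
    target = label + " </s>"
    return source, target


# Two tuples copied from the dataset above, used here as a small sample.
sample = [
    ("helping someone kill themself is murder",
     "Assisted suicide is akin to killing someone", "matched"),
    ("assisted suicide helps terminally ill people end their suffering.",
     "Assisted suicide gives dignity to the person that wants to commit it", "unmatched"),
]

label_counts = Counter(label for _, _, label in sample)
source, target = format_example(*sample[0])
print(label_counts)
print(source)
print(target)
```

Running the same `Counter` over the full `arg_keypoint_tuples` list shows how many examples of each label the model sees per epoch.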

## 3. Train Loop

In [None]:
t5_model1.train()

epochs = 5

for epoch in range(epochs):
  print ("epoch ",epoch)
  for input1, input2, output in arg_keypoint_tuples:
    # Build the prompt in the same format as the test prompt (note the space before "are")
    input_sent = "The argument: "+input1+ " and the keypoint: "+input2+ " are </s>"
    output_sent = output+" </s>"

    tokenized_inp = tokenizer.encode_plus(input_sent,  max_length=96, pad_to_max_length=True,return_tensors="pt")
    tokenized_output = tokenizer.encode_plus(output_sent, max_length=96, pad_to_max_length=True,return_tensors="pt")


    input_ids  = tokenized_inp["input_ids"]
    attention_mask = tokenized_inp["attention_mask"]

    lm_labels= tokenized_output["input_ids"]
    decoder_attention_mask=  tokenized_output["attention_mask"]


    # the forward function automatically creates the correct decoder_input_ids
    # (lm_labels is the argument name in transformers 2.9.0; later releases call it labels)
    outputs = t5_model1(input_ids=input_ids, lm_labels=lm_labels,decoder_attention_mask=decoder_attention_mask,attention_mask=attention_mask)
    loss = outputs[0]

    loss.backward()
    optimizer.step()
    optimizer.zero_grad()




epoch  0
epoch  1
epoch  2
epoch  3
epoch  4
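The loop pads every target to 96 tokens and passes the padded ids as labels, so padding positions also contribute to the loss. A common variation, not part of the original notebook, replaces padding ids in the labels with -100, the default `ignore_index` of `torch.nn.CrossEntropyLoss`. The sketch below shows the idea on plain Python lists. `mask_padding` is a hypothetical helper, the token ids are made up, and whether the installed `transformers` version ignores -100 in its T5 loss is an assumption to verify.

```python
PAD_TOKEN_ID = 0      # pad token id of the T5 tokenizer
IGNORE_INDEX = -100   # default ignore_index of torch.nn.CrossEntropyLoss


def mask_padding(label_ids, pad_token_id=PAD_TOKEN_ID, ignore_index=IGNORE_INDEX):
    """Replace padding ids with ignore_index so they are skipped by the loss."""
    return [ignore_index if token == pad_token_id else token for token in label_ids]


# Made-up ids: one content token, the end-of-sequence id (1), then padding.
padded = [1234, 1, 0, 0, 0]
masked = mask_padding(padded)
print(masked)
```

With tensors, the same effect can be had by assigning `lm_labels[lm_labels == tokenizer.pad_token_id] = -100` before the forward call.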


## 4. Test model

In [None]:
test_sent = 'The argument: It can be counterproductive to subject a child to the side effects of vaccines. and the keypoint: The parents and not the state should decide are </s>'
test_tokenized = tokenizer.encode_plus(test_sent, return_tensors="pt")

test_input_ids  = test_tokenized["input_ids"]
test_attention_mask = test_tokenized["attention_mask"]

t5_model1.eval()
beam_outputs = t5_model1.generate(
    input_ids=test_input_ids,attention_mask=test_attention_mask,
    max_length=64,
    early_stopping=True,
    num_beams=10,
    num_return_sequences=3,
    no_repeat_ngram_size=2
)

for beam_output in beam_outputs:
    sent = tokenizer.decode(beam_output, skip_special_tokens=True,clean_up_tokenization_spaces=True)
    print (sent)



unmatched
matched
a child's innate ability to learn is matched
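The third beam above is free text rather than a label, so the raw generations need a small post-processing step before they can be used as predictions. A minimal sketch, assuming the beams are returned best-first and that only the two training labels count as valid answers (`pick_label` is a hypothetical helper):

```python
VALID_LABELS = ("matched", "unmatched")


def pick_label(candidates, valid_labels=VALID_LABELS):
    """Return the first candidate that is exactly a valid label, else None."""
    for text in candidates:
        cleaned = text.strip().lower()
        if cleaned in valid_labels:
            return cleaned
    return None


# The three beams printed by the cell above.
beams = ["unmatched", "matched", "a child's innate ability to learn is matched"]
prediction = pick_label(beams)
print(prediction)
```

Returning `None` when no beam is a clean label keeps malformed generations visible instead of silently mapping them to a class.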


In [None]:
test_sent = 'The argument: Death with dignity is having the option to decide your own fate. You are able to take control of your own fate. It allows you to leave the world with your true self intact. means: </s>'
test_tokenized = tokenizer.encode_plus(test_sent, return_tensors="pt")

test_input_ids  = test_tokenized["input_ids"]
test_attention_mask = test_tokenized["attention_mask"]

t5_model1.eval()
beam_outputs = t5_model1.generate(
    input_ids=test_input_ids,attention_mask=test_attention_mask,
    max_length=64,
    early_stopping=True,
    num_beams=10,
    num_return_sequences=3,
    no_repeat_ngram_size=2
)

for beam_output in beam_outputs:
    sent = tokenizer.decode(beam_output, skip_special_tokens=True,clean_up_tokenization_spaces=True)
    print (sent)



False
True
Argument: Death with dignity is having the option to decide your own fate
