# Fine-tuning the 🤗 t5 model for end-to-end question generation (answer-agnostic)
In this notebook, we're going to learn to fine-tune the 🤗 t5 model to **generate questions without providing answers** and use [Weights & Biases](https://wandb.ai/site) for measurements and logs.

### Dataset 🛢️
As the dataset, we use [SQuAD v1.1](https://rajpurkar.github.io/SQuAD-explorer/): *Stanford Question Answering Dataset is a reading comprehension dataset, consisting of questions posed by crowdworkers on a set of Wikipedia articles, where the answer to every question is a segment of text, or span, from the corresponding reading passage.*

### Example 🤖
The process:
- You provide the context (the text you want to generate questions from).
- The model generates multiple questions simultaneously.

`Context: 
"Python is an interpreted, high-level, general-purpose programming language. Created by Guido van Rossum and first released in 1991, Python's design philosophy emphasizes code readability with its notable use of significant whitespace."`

`Questions:`

- `Who created Python?`
- `When was Python first released?`
- `What is Python's design philosophy?`

### Requirements
- This is **not an introduction** to the Hugging Face Transformers library; it's a **hands-on guide to fine-tuning t5** for this specific task.
- If you're not familiar with Hugging Face, **you can watch the HF Course on Transformer models** (it's free) [here](https://huggingface.co/course/chapter1)
- 🏗️ This notebook is a work in progress, some elements (check todo at the end) will change.

### Sources 📚
- [Transformer-based End-to-End Question Generation's Paper](https://arxiv.org/pdf/2005.01107v1.pdf)
- [Patil Suraj's work on question generation](https://github.com/patil-suraj/question_generation/tree/bffa0a51e3ecba3922cafd13f424521135677303)

## Download and install the packages 📦

In [None]:
!pip install transformers
!pip install datasets
!pip install sentencepiece

!pip install tqdm

!pip install wandb

!sudo apt-get install git-lfs

In [1]:
import dataclasses
import logging
import os
import sys
from dataclasses import dataclass, field
from typing import Dict, List, Optional

import numpy as np
import torch
from tqdm import tqdm

from datasets import load_dataset
from huggingface_hub import notebook_login
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    EvalPrediction,
    T5ForConditionalGeneration,
    T5Tokenizer,
    T5TokenizerFast,
    Trainer,
    TrainingArguments,
)




- Connect to Weights & Biases:

In [4]:
import wandb
wandb.login()

%env WANDB_PROJECT=flant5-end-to-end-questions-generation


## Connect to Hugging Face 🤗
- To be able to share the model on the Hub, we need to **store our authentication token from the Hugging Face website**.


In [None]:
notebook_login()

Login successful
Your token has been saved to /root/.huggingface/token
Authenticated through git-crendential store but this isn't the helper defined on your machine.
You will have to re-authenticate when pushing to the Hugging Face Hub. Run the following command in your terminal to set it as the default

git config --global credential.helper store


- Then add your email and username to the Git config (Git LFS was already installed in the first cell).

In [None]:
!git config --global user.email "youremail@gmail.com"
!git config --global user.name "userName"

## Loading the dataset 📚
- We use [SQuAD v1.1](https://rajpurkar.github.io/SQuAD-explorer/), but a **modified version** where questions for a context are **concatenated**.
- You need to [download the file here](https://www.simoninithomas.com/hfdataset/squad_modified_for_t5_qg.zip), unzip it and upload it in the next cell.
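
To make the "concatenated" format concrete, here is a minimal sketch of how question/context pairs could be grouped into one target string per context. The helper name `group_questions_by_context` and the toy records are hypothetical, and the actual `squad_modified_for_t5_qg.py` script may build its examples differently. The `generate questions: ` prefix and the `{sep_token}` placeholder follow the dataset example printed further down.

```python
from collections import OrderedDict


def group_questions_by_context(records):
    """Merge all questions that share a context into one training example."""
    grouped = OrderedDict()
    for record in records:
        grouped.setdefault(record["context"], []).append(record["question"])
    return [
        {
            # Task prefix used by the modified dataset
            "context": "generate questions: " + context,
            # Placeholder that is later replaced by the real <sep> token
            "questions": " {sep_token} ".join(questions),
        }
        for context, questions in grouped.items()
    ]


# Toy SQuAD-style records (illustrative, not taken from the dataset)
records = [
    {"context": "Python was created by Guido van Rossum.",
     "question": "Who created Python?"},
    {"context": "Python was created by Guido van Rossum.",
     "question": "What did Guido van Rossum create?"},
]
examples = group_questions_by_context(records)
print(examples[0]["questions"])
```

Both toy questions end up in a single `questions` string, so the model learns to emit all questions for a context in one pass.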

In [None]:
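# Assumes the notebook runs in Google Colab, which provides this upload helper
from google.colab import files

files.upload()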

Saving squad_modified_for_t5_qg.py to squad_modified_for_t5_qg.py



In [2]:
raw_dataset = load_dataset("squad_modified_for_t5_qg.py")
#raw_dataset = load_dataset("squad")
#raw_dataset = load_dataset("race",'middle') #RACE dataset contains question with `_`.
#  e.g.  'question': 'Bae Seul-Ki   _   in the MV of the song according to the passage.', 'options': ['sang', 'danced', 'cried', 'laughed']}

Found cached dataset squad_modified_for_t5_qg (/home/ubuntu/.cache/huggingface/datasets/squad_modified_for_t5_qg/plain_text/1.0.0/02ae0815e8483cc76579286179faeb8c8fdbdd328e6741f5c465d9b0bddb8a77)
100%|██████████| 2/2 [00:00<00:00, 563.49it/s]


- Let's look at one example from the dataset:

In [3]:
raw_dataset["train"][0]

{'context': 'generate questions: Architecturally, the school has a Catholic character. Atop the Main Building\'s gold dome is a golden statue of the Virgin Mary. Immediately in front of the Main Building and facing it, is a copper statue of Christ with arms upraised with the legend "Venite Ad Me Omnes". Next to the Main Building is the Basilica of the Sacred Heart. Immediately behind the basilica is the Grotto, a Marian place of prayer and reflection. It is a replica of the grotto at Lourdes, France where the Virgin Mary reputedly appeared to Saint Bernadette Soubirous in 1858. At the end of the main drive (and in a direct line that connects through 3 statues and the Gold Dome), is a simple, modern stone statue of Mary.',
 'questions': 'To whom did the Virgin Mary allegedly appear in 1858 in Lourdes France? {sep_token} What is in front of the Notre Dame Main Building? {sep_token} The Basilica of the Sacred heart at Notre Dame is beside to which structure? {sep_token} What is the Grotto

## Preprocessing the data 🔧
- We first load the model `"google/flan-t5-base"` and the `T5TokenizerFast` tokenizer.


In [6]:
checkpoint = "google/flan-t5-base"
model = T5ForConditionalGeneration.from_pretrained(checkpoint)
tokenizer = T5TokenizerFast.from_pretrained(checkpoint)

- Because we separate each of our questions with a `<sep>` token, we need to add it to the tokenizer's vocabulary.

In [5]:
tokenizer.sep_token = '<sep>'

In [6]:
tokenizer.add_tokens(['<sep>'])
model.resize_token_embeddings(len(tokenizer))

Embedding(32101, 768)

In [7]:
# Check the sep_token_id to verify that it was added to the tokenizer
tokenizer.sep_token_id

32100

- Now, we need to preprocess the data in 3 steps:
1. `add_eos_examples`: Add `</s>` (end of sequence) at the end of each context and each concatenated questions string.
2. `add_special_tokens`: Replace `{sep_token}` with the `<sep>` token between each question.
3. `convert_to_features`: Tokenize the examples with the tokenizer, truncating and padding contexts to `max_input_length` and questions to `max_target_length`.

In [8]:
max_input_length =  512
max_target_length = 64

In [9]:
# Tokenize the contexts (inputs) and the questions (targets)
def convert_to_features(example_batch):
    # Truncate/pad every context to max_input_length tokens
    input_encodings = tokenizer.batch_encode_plus(
        example_batch['context'],
        max_length=max_input_length,
        add_special_tokens=True,
        truncation=True,
        padding='max_length',  # replaces the deprecated pad_to_max_length=True
    )

    # Truncate/pad every questions string to max_target_length tokens
    target_encodings = tokenizer.batch_encode_plus(
        example_batch['questions'],
        max_length=max_target_length,
        add_special_tokens=True,
        truncation=True,
        padding='max_length',
    )

    encodings = {
        'input_ids': input_encodings['input_ids'],
        'attention_mask': input_encodings['attention_mask'],
        'decoder_input_ids': target_encodings['input_ids'],
        'decoder_attention_mask': target_encodings['attention_mask'],
    }

    return encodings

def add_eos_examples(example):
  example['context'] = example['context'] + " </s>"
  example['questions'] = example['questions'] + " </s>"
  return example


def add_special_tokens(example):
  example['questions'] = example['questions'].replace("{sep_token}", '<sep>')
  return example
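
To illustrate what truncation combined with fixed-length padding does in `convert_to_features`, here is a simplified pure-Python sketch on a list of token ids. The helper `pad_or_truncate` is hypothetical and assumes a pad id of 0; the real tokenizer additionally handles special tokens such as `</s>`, which this sketch ignores.

```python
def pad_or_truncate(token_ids, max_length, pad_id=0):
    """Return (input_ids, attention_mask), both exactly max_length long."""
    kept = token_ids[:max_length]                  # truncation
    padding = [pad_id] * (max_length - len(kept))  # padding up to max_length
    input_ids = kept + padding
    # 1 marks real tokens, 0 marks padding positions
    attention_mask = [1] * len(kept) + [0] * len(padding)
    return input_ids, attention_mask


# Arbitrary illustrative ids, not real vocabulary lookups
short_ids, short_mask = pad_or_truncate([71, 1712, 1], max_length=5)
long_ids, long_mask = pad_or_truncate([5, 6, 7, 8, 9, 10, 11], max_length=5)
print(short_ids, short_mask)
print(long_ids, long_mask)
```

Every example ends up with the same length, which is what later allows the collator to stack examples into a batch tensor.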

In [10]:
tokenized_dataset  = raw_dataset.map(add_eos_examples)
tokenized_dataset = tokenized_dataset.map(add_special_tokens)
tokenized_dataset  = tokenized_dataset.map(convert_to_features,  batched=True)

Loading cached processed dataset at /home/ubuntu/.cache/huggingface/datasets/squad_modified_for_t5_qg/plain_text/1.0.0/02ae0815e8483cc76579286179faeb8c8fdbdd328e6741f5c465d9b0bddb8a77/cache-8455c1ae1804434a.arrow
Loading cached processed dataset at /home/ubuntu/.cache/huggingface/datasets/squad_modified_for_t5_qg/plain_text/1.0.0/02ae0815e8483cc76579286179faeb8c8fdbdd328e6741f5c465d9b0bddb8a77/cache-9b4b3b88a22547bf.arrow
Loading cached processed dataset at /home/ubuntu/.cache/huggingface/datasets/squad_modified_for_t5_qg/plain_text/1.0.0/02ae0815e8483cc76579286179faeb8c8fdbdd328e6741f5c465d9b0bddb8a77/cache-6de7eb60155bfaaa.arrow
Loading cached processed dataset at /home/ubuntu/.cache/huggingface/datasets/squad_modified_for_t5_qg/plain_text/1.0.0/02ae0815e8483cc76579286179faeb8c8fdbdd328e6741f5c465d9b0bddb8a77/cache-4ec7b805a3acced8.arrow
100%|██████████| 19/19 [00:06<00:00,  2.95ba/s]
100%|██████████| 3/3 [00:00<00:00,  3.51ba/s]


In [11]:
tokenized_dataset["train"][0]["context"]

'generate questions: Architecturally, the school has a Catholic character. Atop the Main Building\'s gold dome is a golden statue of the Virgin Mary. Immediately in front of the Main Building and facing it, is a copper statue of Christ with arms upraised with the legend "Venite Ad Me Omnes". Next to the Main Building is the Basilica of the Sacred Heart. Immediately behind the basilica is the Grotto, a Marian place of prayer and reflection. It is a replica of the grotto at Lourdes, France where the Virgin Mary reputedly appeared to Saint Bernadette Soubirous in 1858. At the end of the main drive (and in a direct line that connects through 3 statues and the Gold Dome), is a simple, modern stone statue of Mary. </s>'

- Finally, we remove the `context` and `questions` columns, which are no longer needed, and split `tokenized_dataset` into training and validation datasets.

In [12]:
tokenized_dataset = tokenized_dataset.remove_columns(
    ["context", "questions"]
)

train_dataset = tokenized_dataset["train"]
valid_dataset = tokenized_dataset["validation"]

columns = ['input_ids', 'decoder_input_ids', 'attention_mask', 'decoder_attention_mask']
train_dataset.set_format(type='torch', columns=columns)
valid_dataset.set_format(type='torch', columns=columns)

In [13]:
torch.save(train_dataset, 'SQUAD1_train_data.pt')
torch.save(valid_dataset, 'SQUAD1_valid_data.pt')

## Fine-Tuning the t5 model 🧮
- We built a custom DataCollator. A DataCollator **will form a batch using a list of dataset elements as input.** 

In [15]:
# This dataclass implementation is taken from Suraj Patil: https://github.com/patil-suraj/question_generation
@dataclass
class T2TDataCollator():
  def __call__(self, batch: List) -> Dict[str, torch.Tensor]:
    """
    Take a list of samples from a Dataset and collate them into a batch.
    Returns:
    A dictionary of tensors
    """
    
    input_ids = torch.stack([example['input_ids'] for example in batch])
    lm_labels = torch.stack([example['decoder_input_ids'] for example in batch])
    # Pad tokens (id 0 for T5) are set to -100 so that the loss ignores them
    lm_labels[lm_labels[:, :] == 0] = -100
    attention_mask = torch.stack([example['attention_mask'] for example in batch])
    decoder_attention_mask = torch.stack([example['decoder_attention_mask'] for example in batch])
    
    return {
        'input_ids': input_ids, 
        'attention_mask': attention_mask,
        'labels': lm_labels, 
        'decoder_attention_mask': decoder_attention_mask
    }
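
The `-100` replacement in the collator is what keeps padding out of the loss. As a minimal sketch of that step, here it is with plain Python lists instead of tensors. The helper `mask_pad_labels` and the token ids are hypothetical, and the pad id is assumed to be 0, as in the collator.

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss skips by default


def mask_pad_labels(label_ids, pad_id=0):
    """Replace every pad token in a batch of label sequences with IGNORE_INDEX."""
    return [
        [IGNORE_INDEX if token == pad_id else token for token in row]
        for row in label_ids
    ]


# Two padded target sequences with made-up token ids
batch = [[363, 19, 8, 1, 0, 0], [2645, 1, 0, 0, 0, 0]]
masked = mask_pad_labels(batch)
print(masked)
```

Only the real target tokens, including the end-of-sequence id, keep their values and therefore contribute to the training loss.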

- We define the `TrainingArguments` object, which contains all the hyperparameters (learning rate, number of epochs...).

In [17]:
training_args = TrainingArguments(output_dir="./gdrive/My Drive/models", 
                                  per_device_train_batch_size=4, 
                                  per_device_eval_batch_size=4,
                                  gradient_accumulation_steps=16,
                                  learning_rate=1e-4, 
                                  num_train_epochs=7,
                                  logging_steps=100,
                                  run_name="end2end-questions-generation",
                                  evaluation_strategy="steps",
                                  save_steps=500,
                                  report_to="tensorboard",
                                  #push_to_hub=True,
                                  #push_to_hub_model_id="t5-end2end-questions-generation"
                                  )
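
With these arguments, the effective batch size and the number of optimizer steps follow from simple arithmetic. The sketch below assumes 18,896 training examples and a single device, which are the figures reported in the training log of this run. The helper `training_schedule` is hypothetical, and its rounding is chosen to mirror what the log reports rather than a documented guarantee.

```python
import math


def training_schedule(num_examples, per_device_batch_size, grad_accum_steps,
                      num_epochs, num_devices=1):
    """Estimate the effective batch size and the total number of optimizer steps."""
    effective_batch_size = per_device_batch_size * grad_accum_steps * num_devices
    # Number of mini-batches the dataloader yields in one epoch
    batches_per_epoch = math.ceil(num_examples / (per_device_batch_size * num_devices))
    # One optimizer update happens every grad_accum_steps mini-batches
    updates_per_epoch = max(batches_per_epoch // grad_accum_steps, 1)
    total_steps = updates_per_epoch * num_epochs
    return effective_batch_size, total_steps


effective_batch_size, total_steps = training_schedule(
    num_examples=18896,
    per_device_batch_size=4,
    grad_accum_steps=16,
    num_epochs=7,
)
print(effective_batch_size, total_steps)
```

Gradient accumulation lets a small per-device batch of 4 behave like a batch of 64 at the cost of fewer optimizer updates per epoch.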

In [None]:
%load_ext tensorboard
%tensorboard --logdir "./gdrive/My Drive/models/runs"

In [18]:
logger = logging.getLogger(__name__)

# Initialize our Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=valid_dataset,
    data_collator=T2TDataCollator()
)

# Training
trainer.train()

# When training is done, we push the fine-tuned model to the Hub
#trainer.push_to_hub("t5-end2end-questions-generation")

#wandb.finish()

***** Running training *****
  Num examples = 18896
  Num Epochs = 7
  Instantaneous batch size per device = 4
  Total train batch size (w. parallel, distributed & accumulation) = 64
  Gradient Accumulation steps = 16
  Total optimization steps = 2065
  Number of trainable parameters = 247536384


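| Step | Training Loss | Validation Loss |
|------|---------------|-----------------|
| 100  | 2.11   | 1.575641 |
| 200  | 1.6966 | 1.543346 |
| 300  | 1.661  | 1.532504 |
| 400  | 1.5904 | 1.519844 |
| 500  | 1.5811 | 1.497026 |
| 600  | 1.5675 | 1.513585 |
| 700  | 1.5262 | 1.492449 |
| 800  | 1.5222 | 1.500104 |
| 900  | 1.5161 | 1.501535 |
| 1000 | 1.4843 | 1.490878 |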

***** Running Evaluation *****
  Num examples = 2067
  Batch size = 4
Saving model checkpoint to ./gdrive/My Drive/models/checkpoint-500
Configuration saved in ./gdrive/My Drive/models/checkpoint-500/config.json
Model weights saved in ./gdrive/My Drive/models/checkpoint-500/pytorch_model.bin
Saving model checkpoint to ./gdrive/My Drive/models/checkpoint-1000
Configura

TrainOutput(global_step=2065, training_loss=1.5312545887205848, metrics={'train_runtime': 11379.5298, 'train_samples_per_second': 11.624, 'train_steps_per_second': 0.181, 'total_flos': 9.055484356696474e+16, 'train_loss': 1.5312545887205848, 'epoch': 7.0})

In [19]:
torch.save(model, 'flanT5-base-finetuned_SQUAD1.pth')

In [5]:
# The log directory contains a space, so the path must be quoted
%tensorboard --logdir "/home/ubuntu/Questions_generation/gdrive/My Drive/models/runs/Dec19_21-10-00_NOAM-GPU"

## Testing the model 📝
- You can now load the model from the Hugging Face Hub and test it.

In [None]:
from transformers import T5ForConditionalGeneration, T5TokenizerFast

hfmodel = T5ForConditionalGeneration.from_pretrained("ThomasSimonini/t5-end2end-question-generation")

https://huggingface.co/ThomasSimonini/t5-end2end-question-generation/resolve/main/config.json not found in cache or force_download set to True, downloading to /root/.cache/huggingface/transformers/tmp6oatqhgd


Downloading:   0%|          | 0.00/1.42k [00:00<?, ?B/s]

storing https://huggingface.co/ThomasSimonini/t5-end2end-question-generation/resolve/main/config.json in cache at /root/.cache/huggingface/transformers/725b67a54a33b8368edb40e14fc2d26c696b4354f1714ebe937e6850c6f4c6eb.4edcc66a57a8e743bb658b9203a2f0622cdbd48c300484c35e3879da7466d60b
creating metadata file for /root/.cache/huggingface/transformers/725b67a54a33b8368edb40e14fc2d26c696b4354f1714ebe937e6850c6f4c6eb.4edcc66a57a8e743bb658b9203a2f0622cdbd48c300484c35e3879da7466d60b
loading configuration file https://huggingface.co/ThomasSimonini/t5-end2end-question-generation/resolve/main/config.json from cache at /root/.cache/huggingface/transformers/725b67a54a33b8368edb40e14fc2d26c696b4354f1714ebe937e6850c6f4c6eb.4edcc66a57a8e743bb658b9203a2f0622cdbd48c300484c35e3879da7466d60b
Model config T5Config {
  "_name_or_path": "t5-base",
  "architectures": [
    "T5ForConditionalGeneration"
  ],
  "d_ff": 3072,
  "d_kv": 64,
  "d_model": 768,
  "decoder_start_token_id": 0,
  "dropout_rate": 0.1,
  "eo

Downloading:   0%|          | 0.00/892M [00:00<?, ?B/s]

storing https://huggingface.co/ThomasSimonini/t5-end2end-question-generation/resolve/main/pytorch_model.bin in cache at /root/.cache/huggingface/transformers/b3ab715ac4d5a7e4fad3a8bf63ae67dd211b36254f503656006b1f63e0da745d.62e907d42d8a4405a3bc3fb77d990be88dff6e888bba9ad3c0816ea2a6107878
creating metadata file for /root/.cache/huggingface/transformers/b3ab715ac4d5a7e4fad3a8bf63ae67dd211b36254f503656006b1f63e0da745d.62e907d42d8a4405a3bc3fb77d990be88dff6e888bba9ad3c0816ea2a6107878
loading weights file https://huggingface.co/ThomasSimonini/t5-end2end-question-generation/resolve/main/pytorch_model.bin from cache at /root/.cache/huggingface/transformers/b3ab715ac4d5a7e4fad3a8bf63ae67dd211b36254f503656006b1f63e0da745d.62e907d42d8a4405a3bc3fb77d990be88dff6e888bba9ad3c0816ea2a6107878
All model checkpoint weights were used when initializing T5ForConditionalGeneration.

All the weights of T5ForConditionalGeneration were initialized from the model checkpoint at ThomasSimonini/t5-end2end-question-g

In [None]:
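def hf_run_model(input_string, **generator_args):
  # Default generation settings; keyword arguments passed by the caller override them
  generation_kwargs = {
  "max_length": 256,
  "num_beams": 4,
  "length_penalty": 1.5,
  "no_repeat_ngram_size": 3,
  "early_stopping": True,
  }
  generation_kwargs.update(generator_args)
  # Same prefix and end-of-sequence marker as in the training data
  input_string = "generate questions: " + input_string + " </s>"
  input_ids = tokenizer.encode(input_string, return_tensors="pt")
  res = hfmodel.generate(input_ids, **generation_kwargs)
  output = tokenizer.batch_decode(res, skip_special_tokens=True)
  # Split the generated text into individual questions
  output = [item.split("<sep>") for item in output]
  return output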

In [None]:
text = "Forrest Gump is a 1994 American comedy-drama film directed by Robert Zemeckis and written by Eric Roth. \
It is based on the 1986 novel of the same name by Winston Groom and stars Tom Hanks, Robin Wright, Gary Sinise, \
Mykelti Williamson and Sally Field. The story depicts several decades in the life of Forrest Gump (Hanks), \
a slow-witted but kind-hearted man from Alabama who witnesses and unwittingly influences several defining \
historical events in the 20th century United States. The film differs substantially from the novel."

In [None]:
hf_run_model(text)

[['Who directed the 1994 film Forrest Gump?',
  ' Who wrote the 1994 movie?',
  ' What is the film based on?',
  ' Which movie stars Tom Hanks, Robin Wright and Gary Sinise?',
  '']]
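
The raw output above still contains leading spaces and an empty trailing string. A minimal post-processing sketch follows, assuming the generated text separates questions with `<sep>`; `clean_questions` is a hypothetical helper, not part of the notebook or of any library.

```python
def clean_questions(generated_text, sep="<sep>"):
    """Split generated text into questions, trimming whitespace and dropping empties."""
    parts = (part.strip() for part in generated_text.split(sep))
    return [part for part in parts if part]


raw = ("Who directed the 1994 film Forrest Gump?<sep> Who wrote the 1994 movie?"
       "<sep> What is the film based on?<sep>")
questions = clean_questions(raw)
print(questions)
```

The same helper could be applied to each decoded string inside `hf_run_model` in place of the plain `split("<sep>")`.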

In [None]:
text= "The abolition of feudal privileges by the National Constituent Assembly on 4 August 1789 and the Declaration \
of the Rights of Man and of the Citizen (La Déclaration des Droits de l'Homme et du Citoyen), drafted by Lafayette \
with the help of Thomas Jefferson and adopted on 26 August, paved the way to a Constitutional Monarchy \
(4 September 1791 – 21 September 1792). Despite these dramatic changes, life at the court continued, while the situation \
in Paris was becoming critical because of bread shortages in September. On 5 October 1789, a crowd from Paris descended upon Versailles \
and forced the royal family to move to the Tuileries Palace in Paris, where they lived under a form of house arrest under \
the watch of Lafayette's Garde Nationale, while the Comte de Provence and his wife were allowed to reside in the \
Petit Luxembourg, where they remained until they went into exile on 20 June 1791."

In [None]:
hf_run_model(text)

[['When did the National Constituent Assembly abolish feudal privileges?',
  ' Who drafted the Declaration of the Rights of Man and of the Citizen?',
  ' When was the Constitutional Monarchy established?',
  ' What was the name of the Declaration that paved the way to a constitutional monarchy?',
  '']]

## What's next?
- **This notebook is a work in progress**. The next step is to add an evaluation using ROUGE metrics; if you don't know this metric, check this [article](https://towardsdatascience.com/the-ultimate-performance-metric-in-nlp-111df6c64460).
- As explained in [the paper](https://arxiv.org/pdf/2005.01107v1.pdf), most of the questions are closed questions. This is because 88.26% of the questions in the SQuAD training set are identification-type questions => **you can improve the model by adding other datasets, starting with SQuAD v2**.
- What about making a webapp? Check [Spaces](https://huggingface.co/spaces)
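
For the ROUGE evaluation mentioned in the first bullet, here is a deliberately simplified sketch of ROUGE-1 F1 (clipped unigram overlap). It assumes lowercase whitespace tokenization with no stemming, and `rouge1_f1` is a hypothetical helper; scores from an established ROUGE implementation may differ.

```python
from collections import Counter


def rouge1_f1(candidate, reference):
    """ROUGE-1 F1: harmonic mean of unigram precision and recall."""
    cand_tokens = candidate.lower().split()
    ref_tokens = reference.lower().split()
    if not cand_tokens or not ref_tokens:
        return 0.0
    # Clipped overlap: each unigram counts at most as often as it appears in both texts
    overlap = sum((Counter(cand_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


score = rouge1_f1("Who created Python?", "Who created the Python language?")
print(score)
```

Because punctuation stays attached to words under whitespace tokenization, `python?` and `python` do not match here; a real evaluation would normalize the text first.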


## My TODO:
- Add Rouge eval test
- Wandb didn't record the training loss, only the evaluation loss.
- Add SQuAD v2
- Pushing the SQuAD version for question generation on HF Hub (instead of using this upload .py file system that's not scalable)
- Solve the issue with Accelerated Inference API => because of the tokenizer

✅ Improve the postprocessing of questions

✅ Make a Spaces web app?
