<a href="https://colab.research.google.com/github/hduongck/AI-ML-Learning/blob/master/Huggingface/3_Question_Answering.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

https://rsilveira79.github.io/fermenting_gradients/machine_learning/nlp/pytorch/pytorch-transformer-squad/

# Step-by-step guide to finetune and use question and answering models with pytorch-transformers

I have used question and answering systems for some time now, and I’m really impressed how these algorithms evolved recently. My first interaction with QA algorithms was with the BiDAF model (Bidirectional Attention Flow) 1 from the great AllenNLP team. It was back in 2017, and ELMo embeddings 2 were not even used in this BiDAF model (I believe they were using GLove vectors in this first model). Since then, a lot of stuff is happened in the NLP arena, such as the Transformer 3, BERT 4 and the many other members of the Sesame Street family (now there are a whole BERT-like-family such as Facebook RoBERTa 4, VilBERT and maybe(why not?) one day, DilBERT).

There are lots of great materias out there (see Probe Further section for more details), so it will be much easier to go on and watch these awesome video materials instead of detailing each model in a blog post.

**I would really want to spend time in the practical usage of question and answering models, as they can be very helpful for real-life applications (besides some challenges that will be addressed in other posts - such as model size, response time, model quantization/pruning, etc).**

In this regard, all the ML community should give a massive shout-out to Hugging Face team. They are really pushing the limits to make the latest and greatest algorithms available for the broader community, and it is really cool to see how their project is growing rapidly in github (at the time I’m writing this they already surpassed more than 10k ⭐️on github for the pytorch-transformer repo, for example). I will focus on SQuAD 1.1 dataset, more details on how fine-tune/use these models with SQuAD 2.0 dataset will be described in further posts.

## Inside pytorch-transformers

The pytorch-transformers lib has some special classes, and the nice thing is that they try to be consistent with this architecture independently of the model (BERT, XLNet, RoBERTa, etc). These 3 important classes are:

- **Config** → this is the class that defines all the configurations of the model in hand, such as number of hidden layers in Transformer, number of attention heads in the Transformer encoder, activation function, dropout rate, etc. Usually, there are 2 default configurations [base, large], but it is possible to tune the configurations to have different models. The file format of the configuration file is a .json file.

- **Tokenizer** → the tokenizer class deals with some linguistic details of each model class, as specific tokenization types are used (such as WordPiece for BERT or SentencePiece for XLNet). It also handles begin-of-sentence (bos), end-of-sentence (eod), unknown, separation, padding, mask and any other special tokens. The tokenizer file can be loaded as a .txt file.

- **Model** → finally, we need to specify the model class. In this specific case, we are going to use special classes for Question and Answering [BertForQuestionAnswering, XLNetForQuestionAnswering], but there are other classes for different downstream tasks that can be used. These downstream classes inherit [BertModel, XLNetModel] classes, which will then go into more specific details (embedding type, Transformer configuration, etc). The weights of a fine-tuned downstream task mode are stored in a .bin file.



## Download Fine-tuned models

[BERT MODEL for SQuAD 1.1](https://drive.google.com/open?id=1OnvT5sKgi0WVWTXnTaaOPTE5KIh-xg_E)

[XLNet Model for SQuAD 1.1](https://drive.google.com/open?id=1e7wu9yI-rGkSzjoPU2TpCC9FMvlKvl8R)


Watch out! The BERT model I downloaded directly from Hugging Face repo, the XLNet model I fine-tuned myself for 3 epochs in a Nvidia 1080ti. Also, I noticed that the XLNet model maybe needs some more training - see Results section



## Finetuning scripts

To run the fine-tuning scripts, the Hugging Face team makes available some dataset-specific files that can be found [here](https://github.com/huggingface/pytorch-transformers/tree/master/examples). These fine-tuning scripts can be highly customizable, for example by passing a config file for a model specified in .json file **e.g. --config_name xlnet_m2.json**. The examples below are showing BERT finetuning with base configuration, and xlnet configuration with specific parameters (**n_head,n_layer**). The models provided for download both use the **large** config.

### Finetuning BERT



```
python run_squad.py \
  --model_type bert \
  --model_name_or_path bert-base-cased \
  --do_train \
  --do_eval \
  --evaluate_during_training \
  --do_lower_case \
  --train_file $SQUAD_DIR/train-v1.1.json \
  --predict_file $SQUAD_DIR/dev-v1.1.json \
  --save_steps 10000 \
  --learning_rate 3e-5 \
  --num_train_epochs 5.0 \
  --max_seq_length 384 \
  --doc_stride 128 \
  --output_dir /home/roberto/tmp/finetuned_xlnet \
  --overwrite_output_dir \
  --overwrite_cache
```



### Finetuning XLNet



```
python -u run_squad.py \
  --model_type xlnet \
  --model_name_or_path xlnet-large-cased \
  --do_train \
  --do_eval \
  --config_name xlnet_m2.json \
  --evaluate_during_training \
  --do_lower_case \
  --train_file $SQUAD_DIR/train-v1.1.json \
  --predict_file $SQUAD_DIR/dev-v1.1.json \
  --save_steps 10000 \
  --learning_rate 3e-5 \
  --num_train_epochs 5.0 \
  --max_seq_length 384 \
  --doc_stride 128 \
  --per_gpu_train_batch_size 1 \
  --output_dir /home/roberto/tmp/finetuned_xlnet \
  --overwrite_output_dir \
  --overwrite_cache
```

**Config xlnet_m2.json**



```
{
  "attn_type": "bi",
  "bi_data": false,
  "clamp_len": -1,
  "d_head": 64,
  "d_inner": 4096,
  "d_model": 1024,
  "dropatt": 0.1,
  "dropout": 0.1,
  "end_n_top": 5,
  "ff_activation": "gelu",
  "finetuning_task": null,
  "init": "normal",
  "init_range": 0.1,
  "init_std": 0.02,
  "initializer_range": 0.02,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "mem_len": null,
  "n_head": 16,
  "n_layer": 18,
  "n_token": 32000,
  "num_labels": 2,
  "output_attentions": false,
  "output_hidden_states": false,
  "reuse_len": null,
  "same_length": false,
  "start_n_top": 5,
  "summary_activation": "tanh",
  "summary_last_dropout": 0.1,
  "summary_type": "last",
  "summary_use_proj": true,
  "torchscript": false,
  "untie_r": true
}
```



## Using the trained models

Now to the fun part: using these models for question and answering!

First things first, let’s import the model classes from pytorch-transformers

In [7]:
from google.colab import drive
drive.mount('/content/drive')

Go to this URL in a browser: https://accounts.google.com/o/oauth2/auth?client_id=947318989803-6bn6qk8qdgf4n4g3pfee6491hc0brc4i.apps.googleusercontent.com&redirect_uri=urn%3Aietf%3Awg%3Aoauth%3A2.0%3Aoob&scope=email%20https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdocs.test%20https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdrive%20https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdrive.photos.readonly%20https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fpeopleapi.readonly&response_type=code

Enter your authorization code:
··········
Mounted at /content/drive


In [1]:
!pip install pytorch-transformers

Collecting pytorch-transformers
[?25l  Downloading https://files.pythonhosted.org/packages/a3/b7/d3d18008a67e0b968d1ab93ad444fc05699403fa662f634b2f2c318a508b/pytorch_transformers-1.2.0-py3-none-any.whl (176kB)
[K     |█▉                              | 10kB 14.6MB/s eta 0:00:01[K     |███▊                            | 20kB 1.7MB/s eta 0:00:01[K     |█████▋                          | 30kB 2.5MB/s eta 0:00:01[K     |███████▍                        | 40kB 1.6MB/s eta 0:00:01[K     |█████████▎                      | 51kB 2.0MB/s eta 0:00:01[K     |███████████▏                    | 61kB 2.4MB/s eta 0:00:01[K     |█████████████                   | 71kB 2.8MB/s eta 0:00:01[K     |██████████████▉                 | 81kB 3.2MB/s eta 0:00:01[K     |████████████████▊               | 92kB 3.6MB/s eta 0:00:01[K     |██████████████████▋             | 102kB 2.7MB/s eta 0:00:01[K     |████████████████████▍           | 112kB 2.7MB/s eta 0:00:01[K     |██████████████████████▎     

In [0]:
import os
import time
import torch
from pytorch_transformers import BertConfig, BertTokenizer,BertForQuestionAnswering
from pytorch_transformers import XLNetConfig, XLNetForQuestionAnswering,XLNetTokenizer

These are the 3 important classes:

In [0]:
MODEL_CLASSES = {
    'bert': (BertConfig, BertForQuestionAnswering, BertTokenizer),
    'xlnet': (XLNetConfig, XLNetForQuestionAnswering, XLNetTokenizer)
}

I’ve made this special class to handles all the feature preparation and output formating for both BERT and XLNet, but this could be done in different ways:

In [0]:
class QuestionAnswering(object):
    def __init__(self, config_file, weight_file, tokenizer_file, model_type ):
        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        self.config_class, self.model_class, self.tokenizer_class = MODEL_CLASSES[model_type]
        self.config = self.config_class.from_json_file(config_file)
        self.model = self.model_class(self.config)
        self.model.load_state_dict(torch.load(weight_file, map_location=self.device))
        self.tokenizer = self.tokenizer_class(tokenizer_file)
        self.model_type = model_type
    
    def to_list(self, tensor):
        return tensor.detach().cpu().tolist()

    def get_reply(self, question, passage):
        self.model.eval()
        with torch.no_grad():
            input_ids, _ , tokens = self.prepare_features(question, passage)
            if self.model_type == 'bert':
                span_start,span_end= self.model(input_ids)
                answer = tokens[torch.argmax(span_start):torch.argmax(span_end)+1]
                answer = self.bert_convert_tokens_to_string(answer)
            elif self.model_type == 'xlnet':
                input_vector = {'input_ids': input_ids,
                                'start_positions': None,
                                'end_positions': None }
                outputs = self.model(**input_vector)
                answer = tokens[self.to_list(outputs[1])[0][torch.argmax(outputs[0])]:self.to_list(outputs[3])[0][torch.argmax(outputs[2])]+1]
                answer = self.xlnet_convert_tokens_to_string(answer)
        return answer
    
    def bert_convert_tokens_to_string(self, tokens):
        out_string = ' '.join(tokens).replace(' ##', '').strip()
        if '@' in tokens:
            out_string = out_string.replace(' ', '')
        return out_string

    def xlnet_convert_tokens_to_string(self, tokens):
        out_string = ''.join(tokens).replace('▁', ' ').strip()
        return out_string

    def prepare_features(self, question,  passage, max_seq_length = 300, 
                 zero_pad = False, include_CLS_token = True, include_SEP_token = True):
        ## Tokenzine Input
        tokens_a = self.tokenizer.tokenize(question)
        tokens_b = self.tokenizer.tokenize(passage)
        ## Truncate
        if len(tokens_a) > max_seq_length - 2:
            tokens_a = tokens_a[0:(max_seq_length - 2)]
        ## Initialize Tokens
        tokens = []
        if include_CLS_token:
            tokens.append(self.tokenizer.cls_token)
        ## Add Tokens and separators
        for token in tokens_a:
            tokens.append(token)
        if include_SEP_token:
            tokens.append(self.tokenizer.sep_token)
        for token in tokens_b:
            tokens.append(token)
        ## Convert Tokens to IDs
        input_ids = self.tokenizer.convert_tokens_to_ids(tokens)
        ## Input Mask 
        input_mask = [1] * len(input_ids)
        ## Zero-pad sequence lenght
        if zero_pad:
            while len(input_ids) < max_seq_length:
                input_ids.append(0)
                input_mask.append(0)
        return torch.tensor(input_ids).unsqueeze(0), input_mask, tokens

Finally we just need to instantiate these models and start using them!

### BERT:

In [0]:
'''
bert = QuestionAnswering(
    config_file =   'bert-large-cased-whole-word-masking-finetuned-squad-config.json',
    weight_file=    'bert-large-cased-whole-word-masking-finetuned-squad-pytorch_model.bin',
    tokenizer_file= 'bert-large-cased-whole-word-masking-finetuned-squad-vocab.txt',
    model_type =    'bert'
)
'''

### XLNet


In [0]:
xlnet = QuestionAnswering(
    config_file =   '/content/drive/My Drive/xlnet-cased-finetuned-squad.json',
    weight_file=    '/content/drive/My Drive/xlnet-cased-finetuned-squad.bin',
    tokenizer_file= '/content/drive/My Drive/xlnet-large-cased-spiece.txt',
    model_type =    'xlnet'
)

## Results

I’ve included some sample facts and questions to give these algorithms a go:



```
facts = " My wife is great. \
My complete name is Roberto Pereira Silveira. \
I am 40 years old. \
My dog is cool. \
My dog breed is jack russel. \
My dog was born in 2014.\
My dog name is Mallu. \
My dog is 5 years old. \
I am an engineer. \
I was born in 1979. \
My e-mail is rsilveira79@gmail.com."

questions = [
    "What is my complete name?",
    "What is dog name?",
    "What is my dog age?",
    "What is my age?",
    "What is my dog breed?",
    "When I was born?",
    "What is my e-mail?"
]
```
And here are the results! As you could see I should have trained XLNet a bit more, but it is already returning good results:



```
QUESTION: What is my complete name?
  BERT: roberto pereira silveira
  XLNET: Roberto Pereira Silveira
--------------------------------------------------
QUESTION: What is dog name?
  BERT: mallu
  XLNET: Roberto Pereira Silveira. I am 40 years old. My dog is cool. My dog breed is jack russel. My dog was born in 2014.My dog name is Mallu
--------------------------------------------------
QUESTION: What is my dog age?
  BERT: 5 years old
  XLNET: 40 years old
--------------------------------------------------
QUESTION: What is my age?
  BERT: 40
  XLNET: 40 years old
--------------------------------------------------
QUESTION: What is my dog breed?
  BERT: jack russel
  XLNET: jack russel
--------------------------------------------------
QUESTION: When I was born?
  BERT: 1979
  XLNET: 1979
--------------------------------------------------
QUESTION: What is my e-mail?
  BERT: rsilveira79@gmail.com
  XLNET: rsilveira79@gmail.com
--------------------------------------------------
```




In [0]:
question ='what is check-in time?'
context ='CHECK IN/CHECK OUT Check in time is 3:00 p.m. Accommodations before this time cannot be guaranteed. To facilitate the check in process, we suggest that you present a major credit card at the Front Desk upon arrival. A $50.00 deposit will be required at check in for all persons not using a credit card. This fee covers incidental charges that may be accrued. Room reservations will be held until 4:00 p.m. unless the reservation is accompanied by a one night’s deposit or guaranteed with a credit card. Check out time is 12:00 noon. A charge of one full night’s room and tax will be assessed as the Early Departure fee, meaning any guest that checks out before the departure date that was made with the reservation. The following programs were developed to better service the guest: 1-800-CHECK-IN: Allows you to check-in by telephone the day of your arrival. EXPRESS CHECK IN: A self check-in kiosk is conveniently located next to the Front Desk for guests paying by credit card. EXPRESS CHECK OUT: Guests who are utilizing a credit card and are wishing to leave the hotel without stopping by the Front Desk may do so by leaving their keys in their room or dropping them in the express box located at the front desk. VIDEO CHECK OUT: A guest may check out or review their bill from their guest room by simply pressing "88" on the television or the green guest services button on the remote control. The bill is prepared at the front desk and may be retrieved shortly after the guest checks out. COAT CHECK Coat check services may be arranged for meetings, conventions or social events. We recommend one attendant per every 150 guests. A hosted coat check has a minimum charge of $100.00 per attendant. Cash on delivery (COD) coat check charges are $1.00 per garment, with a minimum guarantee. Please contact your Catering/Convention Services Manager for details on securing these services. CONCIERGE SERVICES Our concierge staff located in the main lobby at the Bellstand is at your service daily. You may seek assistance from them for dinner or airline reservations. They can also assist you in planning activities around the area, arranging baby-sitting services, along with other services. 11 CREDIT CARDS The following credit cards are accepted for your convenience: American Express Diners Club Discover JCB MasterCard Visa CURRENCY EXCHANGE Foreign currency exchange may be accommodated at the Hotel’s front desk. For currency services not provided by Hyatt Regency we recommend the UMB’s International Department located at 9th and Grand. They can be reached at (816) 860-7000. DANCE FLOOR The hotel has portable dance floor available for your social events. Dance floor pieces are available in 3’x3’ interlocking sections, we also have a few 4’x4’ interlocking sections available. Arrangements for dance floors should be made through your Catering/Convention Services Manager. DIRECTIONS TO HYATT REGENCY CROWN CENTER KCI Airport: Take I-29 south from the airport to U.S. 169 South/Downtown Exit. Take 169 South over the bridge and proceed onto Broadway. Turn left on 20th street. Follow 20th Street to McGee Street and make a right on McGee. The Hyatt is on the left. I-70 East Bound: Follow I-70 East to I-670 East. Continue on I-670 East to I-35 North; then take the Broadway Exit (2A). Follow Broadway to 20th Street and make a left. Follow 20th street to McGee Street and take a right onto McGee. The Hyatt is on the left. I-70 West Bound: Follow I-70 West to the I-35 South Exit. Follow I-35 south to the Broadway Exit (2A). Follow Broadway to 20th Street and make a left. Follow 20th street to McGee Street and take a right onto McGee. The Hyatt is on the left. I-35 South Bound: Follow I-35 south to the Broadway Exit (2A). Follow Broadway to 20th Street and make a left. Follow 20th street to McGee Street and take a right onto McGee. The Hyatt is on the left. I-35 North Bound: Follow I-35 North to the Broadway Exit (2A). Take a right on Broadway. Take a left onto Pershing. Turn left on McGee and the Hyatt is on the right. 12 DIRECTIONS TO HYATT REGENCY CROWN CENTER (CONTINUED) Hwy 71 North Bound: Exit onto 22nd Street and turn left. Take 22nd Street to McGee Street and turn left. The Hyatt is on your left. Hwy 71 South Bound: Follow Hwy 71 south to I-35 South. Follow I-35 south to the Broadway Exit (2A). Follow Broadway to 20th Street and make a left. Follow 20th street to McGee Street and take a right onto McGee. The Hyatt is on the left. ELECTRICAL Capabilities for meeting rooms and ballrooms: Regency Ballroom: 200 and 100 AMP, 208 volt 3-phase, 208 volt single phase and 120 volt spyder box (multiple individual circuits) capability. Pershing Hall: 120 volt, 208 volt single phase and 208 volt 3-phase Empire A: 1 Spyder box B: None (wall outlets only) C: 1 Spyder box Chouteau A: None (wall outlet only) B: 1 Spyder box Van Horn A: None (wall outlets only) B: None (wall outlets only) C: 1 Spyder box Benton A: 1 Spyder box B: None (wall outlets only) Fremont: 1 Spyder box Northrup: 1 Spyder box Executive Boardroom: None (wall outlets only) Electrical requirements must be submitted to your Convention Services Manager before your convention. Drayage companies will provide exhibitors with electrical booth service forms in their exhibitor kits. Electrical forms should be mailed or faxed to the hotel prior to arrival. The Convention Service’s fax line is (816) 398-4931. The current charge for a spyder box is $200.00 per unit/per day. Please ask for a power tie-in estimate and please refer to the booth services agreement form for all other electrical pricing. 13 ENGINEERING SERVICES The Engineering Department is available to provide assistance with all of your mechanical and electrical needs. All electrical needs for meetings must be confirmed with your Convention Services Manager before the convention. The Hotel cannot guarantee availability of electrical resources without advance notice. Please consult your Convention Services Manager for pricing and order forms. Refer to “Electrical” for individual meeting room capabilities. Should you require a lock change for a meeting room, please contact your Catering/Convention Services Manager at least one week in advance. EXHIBIT HALLS Pershing Hall, our 15,360 square feet exhibit hall, can hold approximately (85) 8’x10’ or (75) 10’x10’ exhibit booths. It is located on the lobby level of the hotel. Crown Center Exhibit Hall, a 52,000 square foot exhibit hall is located on the North side of the hotel. It consists of two halls: • Hall A is approximately 37,000 square feet • Hall B is approximately 15,000 square feet There is a 7,000 square foot pre-function space adjacent to both sections of the Center. The ceiling height is approximately 20 feet high with 100 foot adjustable candle lighting. Exhibition Crown Center Halls A and B holds approximately (284) 10’x10’ booths or (324) 8’x10’ booths. It is the responsibility of the group to have the exhibit area clean and clear by the contracted ending date. This includes all trash, boxes, skids and miscellaneous items. This is typically contracted with the Exhibit and Drayage Company. If there is an excess of trash left in the hall, a service charge for disposal will be applied to the Group’s Master Account. The hotel does not provide the use of its ladders or electrical lift for guest or vendor use. For more information on either exhibit hall, please contact your Convention Services Manager. FAX MACHINES The Front Desk can assist you in sending or receiving faxes. The fax number for guests to receive faxes is (816) 435-4190. The secure fax number that credit card authorizations should be sent to is (816) 329-2340. Should you have a fax to send to the Catering/Convention Services department, the fax number is (816) 398-4931.'

In [19]:
xlnet.get_reply(question,context)

'3:00 p.m'

In [22]:
xlnet.get_reply("What is check-out and check-in time?",context)

'12:00 noon'

In [23]:
xlnet.get_reply("can we check-in early?",context)

'Allows you to check-in by telephone the day of your arrival'