### Question Answer Application
The goal of question answering is to find the answer to a question within an accompanying context. The predicted answer is either a span of text taken from the context or an empty string, which indicates that the question cannot be answered from the context.
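To make the "span or empty string" convention concrete, here is a minimal sketch in plain Python. The helper name `gold_answer` is hypothetical, and the record layout (`answers`, `answer_start`, `text`, `is_impossible`) assumes the SQuAD-style format of the data files loaded in this notebook.

```python
def gold_answer(context, qa):
    """Return the reference answer for one question, or "" if it is unanswerable."""
    if qa.get("is_impossible") or not qa["answers"]:
        return ""
    first = qa["answers"][0]
    start = first["answer_start"]
    # The answer is a character span of the context, not free text.
    return context[start:start + len(first["text"])]


context = ("Mistborn is a series of epic fantasy novels written by "
           "American author Brandon Sanderson.")
answerable = {
    "answers": [{"answer_start": 71, "text": "Brandon Sanderson"}],
    "is_impossible": False,
}
unanswerable = {"answers": [], "is_impossible": True}

print(gold_answer(context, answerable))
print(repr(gold_answer(context, unanswerable)))
```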

In [1]:
!pip install simpletransformers


In [1]:
import json
with open(r"train.json", "r") as read_file:
    train = json.load(read_file)

In [2]:
train

[{'context': 'Mistborn is a series of epic fantasy novels written by American author Brandon Sanderson.',
  'qas': [{'answers': [{'answer_start': 71, 'text': 'Brandon Sanderson'}],
    'id': '00001',
    'is_impossible': False,
    'question': 'Who is the author of the Mistborn series?'}]},
 {'context': 'The first series, published between 2006 and 2008, consists of The Final Empire,The Well of Ascension, and The Hero of Ages.',
  'qas': [{'answers': [{'answer_start': 28, 'text': 'between 2006 and 2008'}],
    'id': '00002',
    'is_impossible': False,
    'question': 'When was the series published?'},
   {'answers': [{'answer_start': 63,
      'text': 'The Final Empire, The Well of Ascension, and The Hero of Ages'}],
    'id': '00003',
    'is_impossible': False,
    'question': 'What are the three books in the series?'},
   {'answers': [],
    'id': '00004',
    'is_impossible': True,
    'question': 'Who is the main character in the series?'}]}]
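In the training data above, the answer text for question `00003` has a space after the first comma that the context lacks, so the recorded span cannot be found verbatim. A small consistency check can catch this kind of mismatch before training. The sketch below is plain Python; the function name `find_misaligned_answers` is hypothetical, and the sample simply repeats the second training entry.

```python
def find_misaligned_answers(dataset):
    """Return (question id, expected text, actual slice) for answers whose offsets do not match."""
    problems = []
    for entry in dataset:
        context = entry["context"]
        for qa in entry["qas"]:
            for ans in qa["answers"]:
                start = ans["answer_start"]
                actual = context[start:start + len(ans["text"])]
                if actual != ans["text"]:
                    problems.append((qa["id"], ans["text"], actual))
    return problems


sample = [{
    "context": ("The first series, published between 2006 and 2008, consists of "
                "The Final Empire,The Well of Ascension, and The Hero of Ages."),
    "qas": [
        {"id": "00002", "is_impossible": False,
         "question": "When was the series published?",
         "answers": [{"answer_start": 28, "text": "between 2006 and 2008"}]},
        {"id": "00003", "is_impossible": False,
         "question": "What are the three books in the series?",
         "answers": [{"answer_start": 63,
                      "text": "The Final Empire, The Well of Ascension, and The Hero of Ages"}]},
    ],
}]

for qid, expected, actual in find_misaligned_answers(sample):
    print(qid, repr(expected), "!=", repr(actual))
```

Only question `00003` is reported. Correcting either the context or the answer text so that the two agree character for character would resolve it.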

In [3]:
with open(r"test.json", "r") as read_file:
    test = json.load(read_file)

In [4]:
test

[{'context': 'The series primarily takes place in a region called the Final Empire on a world called Scadrial, where the sun and sky are red, vegetation is brown, and the ground is constantly being covered under black volcanic ashfalls.',
  'qas': [{'answers': [{'answer_start': 38,
      'text': 'region called the Final Empire'},
     {'answer_start': 74, 'text': 'world called Scadrial'}],
    'id': '00001',
    'is_impossible': False,
    'question': 'Where does the series take place?'}]},
 {'context': '"Mistings" have only one of the many Allomantic powers, while "Mistborns" have all the powers.',
  'qas': [{'answers': [{'answer_start': 21, 'text': 'one'}],
    'id': '00002',
    'is_impossible': False,
    'question': 'How many powers does a Misting possess?'},
   {'answers': [],
    'id': '00003',
    'is_impossible': True,
    'question': 'What are Allomantic powers?'}]}]

In [5]:
import logging

from simpletransformers.question_answering import QuestionAnsweringModel, QuestionAnsweringArgs

In [63]:
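# Pick an architecture; the branches below map it to a pretrained checkpoint.
model_type = "bert"
model_name = "bert-base-cased"  # default, replaced below for other model types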
if model_type == "bert":
    model_name = "bert-base-cased"

elif model_type == "roberta":
    model_name = "roberta-base"

elif model_type == "distilbert":
    model_name = "distilbert-base-cased"

elif model_type == "distilroberta":
    model_type = "roberta"
    model_name = "distilroberta-base"

elif model_type == "electra-base":
    model_type = "electra"
    model_name = "google/electra-base-discriminator"

elif model_type == "electra-small":
    model_type = "electra"
    model_name = "google/electra-small-discriminator"

elif model_type == "xlnet":
    model_name = "xlnet-base-cased"
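The same mapping can also be written as a lookup table, which keeps each alias and its checkpoint on one line and fails loudly on a typo. This is only an alternative sketch: `MODEL_REGISTRY` and `resolve_model` are hypothetical names, and the entries mirror the branches above.

```python
# Alias -> (model_type, model_name), mirroring the if/elif branches.
MODEL_REGISTRY = {
    "bert": ("bert", "bert-base-cased"),
    "roberta": ("roberta", "roberta-base"),
    "distilbert": ("distilbert", "distilbert-base-cased"),
    "distilroberta": ("roberta", "distilroberta-base"),
    "electra-base": ("electra", "google/electra-base-discriminator"),
    "electra-small": ("electra", "google/electra-small-discriminator"),
    "xlnet": ("xlnet", "xlnet-base-cased"),
}


def resolve_model(alias):
    """Return the (model_type, model_name) pair for an alias, or raise ValueError."""
    try:
        return MODEL_REGISTRY[alias]
    except KeyError:
        known = ", ".join(sorted(MODEL_REGISTRY))
        raise ValueError(f"Unknown model alias {alias!r}; expected one of: {known}") from None


resolved_type, resolved_name = resolve_model("distilroberta")
print(resolved_type, resolved_name)
```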

In [39]:
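# Configure the model with a QuestionAnsweringArgs object.
# Note: this object is not passed to the model created below,
# which is built from the train_args dict instead.
model_args = QuestionAnsweringArgs()
model_args.train_batch_size = 16
model_args.evaluate_during_training = True
model_args.n_best_size = 3
model_args.num_train_epochs = 5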


In [73]:
# Advanced configuration: training arguments as a plain dict.
# These are the settings actually passed to the model.
train_args = {
    "reprocess_input_data": True,
    "overwrite_output_dir": True,
    "use_cached_eval_features": True,
    "output_dir": f"outputs/{model_type}",
    "best_model_dir": f"outputs/{model_type}/best_model",
    "evaluate_during_training": True,
    "max_seq_length": 128,
    "num_train_epochs": 5,
    "evaluate_during_training_steps": 1000,
    "wandb_project": "Question Answer Application",
    "wandb_kwargs": {"name": model_name},
    "save_model_every_epoch": False,
    "save_eval_checkpoints": False,
    "n_best_size": 3,
    # "use_early_stopping": True,
    # "early_stopping_metric": "mcc",
    # "n_gpu": 2,
    # "manual_seed": 4,
    # "use_multiprocessing": False,
    "train_batch_size": 128,
    "eval_batch_size": 64,
    # "config": {
    #     "output_hidden_states": True
    # }
}
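Batch size interacts with dataset size. The rough arithmetic below assumes one optimizer step per batch (no gradient accumulation) and uses the four questions in the training data shown earlier. The helper name `steps_per_epoch` is hypothetical.

```python
import math


def steps_per_epoch(num_examples, batch_size):
    """Number of batches needed to see every example once."""
    return math.ceil(num_examples / batch_size)


num_examples = 4   # questions in the training data above
num_epochs = 5     # matches "num_train_epochs"

for batch_size in (16, 128):
    per_epoch = steps_per_epoch(num_examples, batch_size)
    print(f"batch_size={batch_size}: {per_epoch} step(s) per epoch, "
          f"{per_epoch * num_epochs} step(s) in total")
```

With so few examples, both batch sizes give a single step per epoch, so a step-based evaluation interval of 1000 would never be reached on this dataset.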

In [75]:
model = QuestionAnsweringModel(
    model_type, model_name, args=train_args
)

Some weights of the model checkpoint at bert-base-cased were not used when initializing BertForQuestionAnswering: ['cls.seq_relationship.weight', 'cls.predictions.bias', 'cls.seq_relationship.bias', 'cls.predictions.decoder.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.weight']
- This IS expected if you are initializing BertForQuestionAnswering from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForQuestionAnswering from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForQuestionAnswering were not initialized from the model checkpoint at bert-base-cased and a

In [74]:
### Remove output folder
!rm -rf outputs

In [76]:
# Train the model
model.train_model(train, eval_data=test)


convert squad examples to features:   0%|          | 0/4 [00:00<?, ?it/s]
Could not find answer: 'The Final Empire,The Well of Ascension, and The Hero of Ages.' vs. 'The Final Empire, The Well of Ascension, and The Hero of Ages'
convert squad examples to features: 100%|██████████| 4/4 [00:00<00:00, 395.83it/s]

add example index and unique id: 100%|██████████| 4/4 [00:00<00:00, 18275.83it/s]


Epoch:   0%|          | 0/5 [00:00<?, ?it/s]


Running Epoch 0 of 5:   0%|          | 0/1 [00:00<?, ?it/s]



Running Evaluation:   0%|          | 0/1 [00:00<?, ?it/s]

Running Epoch 1 of 5:   0%|          | 0/1 [00:00<?, ?it/s]

Running Evaluation:   0%|          | 0/1 [00:00<?, ?it/s]

Running Epoch 2 of 5:   0%|          | 0/1 [00:00<?, ?it/s]

Running Evaluation:   0%|          | 0/1 [00:00<?, ?it/s]

Running Epoch 3 of 5:   0%|          | 0/1 [00:00<?, ?it/s]

Running Evaluation:   0%|          | 0/1 [00:00<?, ?it/s]

Running Epoch 4 of 5:   0%|          | 0/1 [00:00<?, ?it/s]

Running Evaluation:   0%|          | 0/1 [00:00<?, ?it/s]

(5,
 {'correct': [0, 0, 0, 0, 0],
  'eval_loss': [-0.2003173828125,
   -0.260009765625,
   -0.296630859375,
   -0.320556640625,
   -0.33251953125],
  'global_step': [1, 2, 3, 4, 5],
  'incorrect': [1, 1, 1, 1, 1],
  'similar': [2, 2, 2, 2, 2],
  'train_loss': [5.065276145935059,
   5.1174397468566895,
   4.639975547790527,
   4.307891845703125,
   3.9419758319854736]})
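The second element of the returned tuple holds one list per metric, with one value per evaluation. Turning it into per-step rows makes it easier to read. The sketch below copies the values printed above into a literal; `history_rows` is a hypothetical helper name.

```python
history = {
    "global_step": [1, 2, 3, 4, 5],
    "correct": [0, 0, 0, 0, 0],
    "similar": [2, 2, 2, 2, 2],
    "incorrect": [1, 1, 1, 1, 1],
    "train_loss": [5.065276145935059, 5.1174397468566895, 4.639975547790527,
                   4.307891845703125, 3.9419758319854736],
    "eval_loss": [-0.2003173828125, -0.260009765625, -0.296630859375,
                  -0.320556640625, -0.33251953125],
}


def history_rows(details):
    """Convert a dict of equal-length lists into a list of per-step dicts."""
    keys = list(details)
    return [dict(zip(keys, values)) for values in zip(*details.values())]


rows = history_rows(history)
for row in rows:
    print(f"step {row['global_step']}: train_loss={row['train_loss']:.3f} "
          f"correct={row['correct']} similar={row['similar']} incorrect={row['incorrect']}")

best = min(rows, key=lambda row: row["train_loss"])
print("lowest train_loss at step", best["global_step"])
```

The training loss falls over the five steps while the `correct` count stays at zero, which is not surprising for a model fine-tuned on only four questions.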

In [77]:
# Evaluate the model
result, texts = model.eval_model(test)

Running Evaluation:   0%|          | 0/1 [00:00<?, ?it/s]
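The training output reports evaluation counts under `correct`, `similar`, and `incorrect`. The sketch below shows one plausible way such buckets could be assigned: exact match, then substring overlap, then everything else. It is an illustrative assumption, not the library's actual implementation, and `bucket_prediction` is a hypothetical name.

```python
def bucket_prediction(predicted, gold_texts):
    """Classify a predicted answer against the reference answers (illustrative rules only)."""
    predicted = predicted.strip()
    # An unanswerable question has no reference answers; treat its gold answer as "".
    golds = [g.strip() for g in gold_texts] or [""]
    if predicted in golds:
        return "correct"
    # Guard against empty strings, since "" is a substring of every string.
    if any(predicted and g and (predicted in g or g in predicted) for g in golds):
        return "similar"
    return "incorrect"


gold = ["region called the Final Empire", "world called Scadrial"]
print(bucket_prediction("Final Empire", gold))
print(bucket_prediction("one", ["one"]))
print(bucket_prediction("many", []))
print(bucket_prediction("", []))
```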

In [78]:
# Make predictions with the model
to_predict = [
    {
        "context": "Vin is a Mistborn of great power and skill.",
        "qas": [
            {
                "question": "What is Vin's speciality?",
                "id": "0",
            }
        ],
    }
]

In [79]:
answers, probabilities = model.predict(to_predict)

print(answers)


convert squad examples to features: 100%|██████████| 1/1 [00:00<00:00, 4744.69it/s]

add example index and unique id: 100%|██████████| 1/1 [00:00<00:00, 658.45it/s]


Running Prediction:   0%|          | 0/1 [00:00<?, ?it/s]

[{'id': '0', 'answer': ['is a Mistborn', 'is a Mi', 'born']}]
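Each prediction carries up to `n_best_size` candidate answers, three in this run. To keep a single answer per question, a small post-processing step is enough. The sketch assumes candidates are listed from most to least likely. `top_answers` is a hypothetical helper, and the input literal repeats the output printed above.

```python
def top_answers(predictions):
    """Map each question id to its first candidate answer, or "" if there is none."""
    return {p["id"]: (p["answer"][0] if p["answer"] else "") for p in predictions}


predictions = [{"id": "0", "answer": ["is a Mistborn", "is a Mi", "born"]}]
print(top_answers(predictions))
```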
