#Goal:
Train a Question-Answering model with the SQuAD 2.0 dataset and bert-base-uncased, then use it to predict answers

#SQuAD 2.0

SQuAD2.0 combines the 100,000 questions in SQuAD1.1 with over 50,000 unanswerable questions written adversarially by crowdworkers to look similar to answerable ones.

# Install packages

In [None]:
%%capture
!pip install transformers

In [None]:
import json
from pathlib import Path
import torch
from torch.utils.data import DataLoader

#Mount Google Drive

In [None]:
from google.colab import drive
drive.mount('/content/drive', force_remount=True)

Mounted at /content/drive


#Download the dataset
( only needs to be done once )

In [None]:
!wget https://rajpurkar.github.io/SQuAD-explorer/dataset/train-v2.0.json -O /content/drive/MyDrive/讀書會/bert/SQuAD20/train-v2.0.json
!wget https://rajpurkar.github.io/SQuAD-explorer/dataset/dev-v2.0.json -O /content/drive/MyDrive/讀書會/bert/SQuAD20/dev-v2.0.json

In [None]:
%cd /content/drive/MyDrive/讀書會/bert/SQuAD20
%ls

/content/drive/MyDrive/讀書會/bert/SQuAD20
dev-v2.0.json  test_model  test_model2  train-v2.0.json


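#Dataset format
+ version: dataset version
+ data:
  + title: article title
  + id: article ID ( not present in the file )
  + paragraphs:
    + id: paragraph ID ( not present in the file )
    + context: paragraph text
    + qas:
      + answers: may be \[ ] ( depending on is_impossible )
        + answer_start: character offset of text within the context
        + text: answer text
      + id: question ID
      + is_impossible: true means the question is unanswerable, false means it is answerable
      + plausible_answers: ( only present for unanswerable questions )
        + answer_start: character offset of text within the context
        + text: answer text
      + question: question text

In the training set each answerable question has a single gold answer. In the dev set a question can have several reference answers, identical or different, and evaluation takes the best score across them.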
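
As a minimal sketch of walking this structure, the snippet below uses a tiny hand-written dictionary in the same layout and counts answerable versus unanswerable questions. The sample content and the `count_questions` helper are hypothetical and are not part of the dataset or of this notebook.

```python
# A tiny hand-made sample in the SQuAD 2.0 layout (illustrative content only).
sample = {
    "version": "v2.0",
    "data": [{
        "title": "Example",
        "paragraphs": [{
            "context": "Normandy is a region in France.",
            "qas": [
                {"id": "q1", "question": "Where is Normandy?",
                 "is_impossible": False,
                 "answers": [{"text": "France", "answer_start": 24}]},
                {"id": "q2", "question": "Who founded Normandy in 1500?",
                 "is_impossible": True,
                 "answers": [],
                 "plausible_answers": [{"text": "France", "answer_start": 24}]},
            ],
        }],
    }],
}


def count_questions(squad_dict):
    """Return (answerable, unanswerable) question counts."""
    answerable = unanswerable = 0
    for article in squad_dict["data"]:
        for paragraph in article["paragraphs"]:
            for qa in paragraph["qas"]:
                if qa.get("is_impossible", False):
                    unanswerable += 1
                else:
                    answerable += 1
    return answerable, unanswerable


print(count_questions(sample))
```

The same three nested loops (article, paragraph, question) are what any reader of the real JSON files has to go through.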

      

In [None]:
import json
from pprint import pprint
with open('train-v2.0.json') as file:
  train_data = json.load(file)

pprint("version: "+ train_data['version'])
pprint("title: "+ train_data['data'][0]['title'])
pprint(train_data['data'][0]['paragraphs'][0])


'version: v2.0'
'title: Beyoncé'
{'context': 'Beyoncé Giselle Knowles-Carter (/biːˈjɒnseɪ/ bee-YON-say) (born '
            'September 4, 1981) is an American singer, songwriter, record '
            'producer and actress. Born and raised in Houston, Texas, she '
            'performed in various singing and dancing competitions as a child, '
            'and rose to fame in the late 1990s as lead singer of R&B '
            "girl-group Destiny's Child. Managed by her father, Mathew "
            "Knowles, the group became one of the world's best-selling girl "
            "groups of all time. Their hiatus saw the release of Beyoncé's "
            'debut album, Dangerously in Love (2003), which established her as '
            'a solo artist worldwide, earned five Grammy Awards and featured '
            'the Billboard Hot 100 number-one singles "Crazy in Love" and '
            '"Baby Boy".',
 'qas': [{'answers': [{'answer_start': 269, 'text': 'in the late 1990s'}],
          'id': '

In [None]:
import json
from pprint import pprint
with open('dev-v2.0.json') as file:
  val_data = json.load(file)

pprint("version: "+ val_data['version'])
pprint("title: "+ val_data['data'][0]['title'])
pprint(val_data['data'][0]['paragraphs'][0])


'version: v2.0'
'title: Normans'
{'context': 'The Normans (Norman: Nourmands; French: Normands; Latin: '
            'Normanni) were the people who in the 10th and 11th centuries gave '
            'their name to Normandy, a region in France. They were descended '
            'from Norse ("Norman" comes from "Norseman") raiders and pirates '
            'from Denmark, Iceland and Norway who, under their leader Rollo, '
            'agreed to swear fealty to King Charles III of West Francia. '
            'Through generations of assimilation and mixing with the native '
            'Frankish and Roman-Gaulish populations, their descendants would '
            'gradually merge with the Carolingian-based cultures of West '
            'Francia. The distinct cultural and ethnic identity of the Normans '
            'emerged initially in the first half of the 10th century, and it '
            'continued to evolve over the succeeding centuries.',
 'qas': [{'answers': [{'answer_start': 159, 

#Read the data

In [None]:
train_path = Path('/content/drive/MyDrive/讀書會/bert/SQuAD20/train-v2.0.json')
val_path = Path('/content/drive/MyDrive/讀書會/bert/SQuAD20/dev-v2.0.json')

def read_data(path, limit):
    # Load the SQuAD JSON file
    with open(path, 'rb') as f:
        squad_dict = json.load(f)

    contexts = []
    questions = []
    answers = []
    for group in squad_dict['data']:
        for passage in group['paragraphs']:
            context = passage['context']
            for qa in passage['qas']:
                question = qa['question']
                # Check whether the spans are stored under 'answers' or 'plausible_answers'
                # (unanswerable questions only have 'plausible_answers')
                if 'plausible_answers' in qa.keys():
                    access = 'plausible_answers'
                else:
                    access = 'answers'
                for answer in qa[access]:
                    contexts.append(context)
                    questions.append(question)
                    answers.append(answer)
                # Stop early once more than `limit` examples have been collected
                if limit is not None and len(contexts) > limit:
                    return contexts, questions, answers

    return contexts, questions, answers

In [None]:
train_contexts, train_questions, train_answers = read_data(train_path,4000)
val_contexts, val_questions, val_answers = read_data(val_path,2000)

#Check the loaded data

In [None]:
print(len(train_contexts))
print(len(train_questions))
print(len(train_answers))

4001
4001
4001


In [None]:
print("Context: ",train_contexts[0])  
print("Question: ",train_questions[0])
print("Answer: ",train_answers[0])

Context:  Beyoncé Giselle Knowles-Carter (/biːˈjɒnseɪ/ bee-YON-say) (born September 4, 1981) is an American singer, songwriter, record producer and actress. Born and raised in Houston, Texas, she performed in various singing and dancing competitions as a child, and rose to fame in the late 1990s as lead singer of R&B girl-group Destiny's Child. Managed by her father, Mathew Knowles, the group became one of the world's best-selling girl groups of all time. Their hiatus saw the release of Beyoncé's debut album, Dangerously in Love (2003), which established her as a solo artist worldwide, earned five Grammy Awards and featured the Billboard Hot 100 number-one singles "Crazy in Love" and "Baby Boy".
Question:  When did Beyonce start becoming popular?
Answer:  {'text': 'in the late 1990s', 'answer_start': 269}


In [None]:
print(len(val_contexts))
print(len(val_questions))
print(len(val_answers))

2001
2001
2001


In [None]:
print("Context: ",val_contexts[0])  
print("Question: ",val_questions[0])
print("Answer: ",val_answers[0])

Context:  The Normans (Norman: Nourmands; French: Normands; Latin: Normanni) were the people who in the 10th and 11th centuries gave their name to Normandy, a region in France. They were descended from Norse ("Norman" comes from "Norseman") raiders and pirates from Denmark, Iceland and Norway who, under their leader Rollo, agreed to swear fealty to King Charles III of West Francia. Through generations of assimilation and mixing with the native Frankish and Roman-Gaulish populations, their descendants would gradually merge with the Carolingian-based cultures of West Francia. The distinct cultural and ethnic identity of the Normans emerged initially in the first half of the 10th century, and it continued to evolve over the succeeding centuries.
Question:  In what country is Normandy located?
Answer:  {'text': 'France', 'answer_start': 159}


In [None]:
print(val_contexts[20])
print(val_questions[20])
print(val_answers[20])

The Normans (Norman: Nourmands; French: Normands; Latin: Normanni) were the people who in the 10th and 11th centuries gave their name to Normandy, a region in France. They were descended from Norse ("Norman" comes from "Norseman") raiders and pirates from Denmark, Iceland and Norway who, under their leader Rollo, agreed to swear fealty to King Charles III of West Francia. Through generations of assimilation and mixing with the native Frankish and Roman-Gaulish populations, their descendants would gradually merge with the Carolingian-based cultures of West Francia. The distinct cultural and ethnic identity of the Normans emerged initially in the first half of the 10th century, and it continued to evolve over the succeeding centuries.
Who gave their name to Normandy in the 1000's and 1100's
{'text': 'Normans', 'answer_start': 4}


#Add the end position of each answer
( this is the character position within the context )
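
The idea can be sketched on a single hand-written sentence: the end index is the start index plus the answer length (exclusive), and if the annotated start is slightly off, the span is shifted left until it matches. The `locate_answer` helper and the example sentence are hypothetical illustrations, not the notebook's own function.

```python
def locate_answer(context, text, start, max_shift=2):
    """Return (start, end) with an exclusive end, trying left shifts of 0..max_shift.

    Returns None when the text is not found at any of the tried offsets.
    """
    for shift in range(max_shift + 1):
        s = start - shift
        e = s + len(text)
        if s >= 0 and context[s:e] == text:
            return s, e
    return None


context = "She rose to fame in the late 1990s as lead singer."
answer_text = "in the late 1990s"

print(locate_answer(context, answer_text, 17))  # annotation is exact
print(locate_answer(context, answer_text, 18))  # annotation is one character late
print(locate_answer(context, answer_text, 25))  # too far off to recover
```

Because the end index is exclusive, `context[start:end]` returns exactly the answer text.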

In [None]:
# def add_end_idx(answers):
#     for answer in answers:
#         gold_text = answer['text']
#         start_idx = answer['answer_start']
#         if gold_text == '':
#           end_idx = 0
#         else:
#           end_idx = start_idx + len(gold_text) # Find end character index of answer in context
#         answer['answer_end'] = end_idx

# add_end_idx(train_answers)
# add_end_idx(val_answers)

In [None]:
def add_end_idx(answers, contexts):
    for answer, context in zip(answers, contexts):
        # gold_text: the answer text we expect to find in the context
        gold_text = answer['text']
        start_idx = answer['answer_start']
        end_idx = start_idx + len(gold_text)

        # Default to the annotated span so that 'answer_end' is always set
        answer['answer_end'] = end_idx
        # Some annotations are off by one or two characters,
        # so try shifting the span left by 1-2 positions
        if context[start_idx:end_idx] != gold_text:
            for n in [1, 2]:
                if context[start_idx-n:end_idx-n] == gold_text:
                    answer['answer_start'] = start_idx - n
                    answer['answer_end'] = end_idx - n
                    break

add_end_idx(train_answers, train_contexts)
add_end_idx(val_answers, val_contexts)

In [None]:
print(val_answers[0])
print(val_answers[20])

{'text': 'France', 'answer_start': 159, 'answer_end': 165}
{'text': 'Normans', 'answer_start': 4, 'answer_end': 11}


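#Tokenizer
Converts the input data into input_ids, token_type_ids and attention_mask.

* **input_ids**: each token (a word, a WordPiece subword, or a punctuation mark) is mapped to an integer ID in the vocabulary.
* **token_type_ids**: the context and the question are packed into one sequence, so different IDs (0 and 1) mark which segment each token belongs to.
* **attention_mask**: marks the tokens the model should attend to; padding positions are set to 0.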
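
To make the three fields concrete, here is a toy sketch that lays out a pre-split context and question the way a BERT-style pair encoding is arranged: `[CLS] context [SEP] question [SEP]` followed by padding. The `toy_encode` helper and its on-the-fly vocabulary are assumptions for illustration only. The IDs it produces are not the real bert-base-uncased vocabulary IDs, and it does no subword splitting or truncation.

```python
def toy_encode(first, second, max_length):
    """Mimic the layout [CLS] first [SEP] second [SEP] [PAD]... with toy IDs."""
    tokens = ["[CLS]"] + first + ["[SEP]"] + second + ["[SEP]"]
    # Segment 0 covers [CLS], the first sequence and its [SEP];
    # segment 1 covers the second sequence and the final [SEP].
    token_type_ids = [0] * (len(first) + 2) + [1] * (len(second) + 1)
    attention_mask = [1] * len(tokens)

    # Pad up to max_length; padded positions get mask 0 and segment 0.
    pad = max_length - len(tokens)
    tokens += ["[PAD]"] * pad
    token_type_ids += [0] * pad
    attention_mask += [0] * pad

    # Toy vocabulary: each new token gets the next free integer.
    vocab = {}
    input_ids = [vocab.setdefault(tok, len(vocab)) for tok in tokens]
    return tokens, input_ids, token_type_ids, attention_mask


context_tokens = ["normandy", "is", "in", "france", "."]
question_tokens = ["where", "is", "normandy", "?"]
tokens, input_ids, token_type_ids, attention_mask = toy_encode(
    context_tokens, question_tokens, max_length=14)

for row in zip(tokens, input_ids, token_type_ids, attention_mask):
    print(row)
```

Identical tokens receive identical IDs, the question tokens carry segment ID 1, and only the padding positions have an attention mask of 0.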

In [None]:
from transformers import AutoTokenizer, BertTokenizerFast, BertTokenizer, BertForQuestionAnswering
tokenizer_auto = AutoTokenizer.from_pretrained("bert-base-uncased")
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
tokenizer_fast = BertTokenizerFast.from_pretrained("bert-base-uncased")

train_encodings = tokenizer_fast(train_contexts, train_questions, truncation=True, padding=True)
val_encodings = tokenizer_fast(val_contexts, val_questions, truncation=True, padding=True)

In [None]:
print('tokenizer_auto max input length = ' ,tokenizer_auto.model_max_length)
print('tokenizer max input length = ' ,tokenizer.model_max_length)
print('tokenizer_fast max input length = ' ,tokenizer_fast.model_max_length)

tokenizer_auto max input length =  512
tokenizer max input length =  512
tokenizer_fast max input length =  512


In [None]:
train_encodings.keys()

dict_keys(['input_ids', 'token_type_ids', 'attention_mask'])

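The input_ids can also be converted back to text, which shows that the tokenizer automatically added the two special tokens `[CLS]` and `[SEP]`.

* `[CLS]`: placed at the very start of the sequence.
* `[SEP]`: placed at the end of the sequence, or between two segments.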

In [None]:
print("input_ids\n" ,train_encodings['input_ids'][0])
print("length = ",len(train_encodings['input_ids'][0]))

print("\ninput_ids convert_ids_to_tokens \n",tokenizer_fast.convert_ids_to_tokens(train_encodings['input_ids'][0]))
print("\ninput_ids decode\n", tokenizer_fast.decode(train_encodings['input_ids'][0]))

print("\ntoken_type_ids\n",train_encodings['token_type_ids'][0])
print("length = ",len(train_encodings['token_type_ids'][0]))

print("\nattention_mask\n",train_encodings['attention_mask'][0])
print("length = ",len(train_encodings['attention_mask'][0]))


input_ids
 [101, 20773, 21025, 19358, 22815, 1011, 5708, 1006, 1013, 12170, 23432, 29715, 3501, 29678, 12325, 29685, 1013, 10506, 1011, 10930, 2078, 1011, 2360, 1007, 1006, 2141, 2244, 1018, 1010, 3261, 1007, 2003, 2019, 2137, 3220, 1010, 6009, 1010, 2501, 3135, 1998, 3883, 1012, 2141, 1998, 2992, 1999, 5395, 1010, 3146, 1010, 2016, 2864, 1999, 2536, 4823, 1998, 5613, 6479, 2004, 1037, 2775, 1010, 1998, 3123, 2000, 4476, 1999, 1996, 2397, 4134, 2004, 2599, 3220, 1997, 1054, 1004, 1038, 2611, 1011, 2177, 10461, 1005, 1055, 2775, 1012, 3266, 2011, 2014, 2269, 1010, 25436, 22815, 1010, 1996, 2177, 2150, 2028, 1997, 1996, 2088, 1005, 1055, 2190, 1011, 4855, 2611, 2967, 1997, 2035, 2051, 1012, 2037, 14221, 2387, 1996, 2713, 1997, 20773, 1005, 1055, 2834, 2201, 1010, 20754, 1999, 2293, 1006, 2494, 1007, 1010, 2029, 2511, 2014, 2004, 1037, 3948, 3063, 4969, 1010, 3687, 2274, 8922, 2982, 1998, 2956, 1996, 4908, 2980, 2531, 2193, 1011, 2028, 3895, 1000, 4689, 1999, 2293, 1000, 1998, 1000, 3336,

#Add the answer's start_positions and end_positions
( the token positions where the answer starts and ends )
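
The character-to-token lookup can be sketched without any library: each token covers a character span, and a character index maps to the token whose span contains it. The `toy_char_to_token` helper and the hand-written spans below are assumptions for illustration; the real fast tokenizer also inserts `[CLS]` and splits words into subwords, so actual indices differ. The example also shows why looking up the exclusive end index can land on the following token, or on no token at all.

```python
def toy_char_to_token(offsets, char_index):
    """Return the index of the token whose (start, end) span covers char_index, else None."""
    for token_index, (start, end) in enumerate(offsets):
        if start <= char_index < end:
            return token_index
    return None


context = "Normandy is in France."
# Hand-written character spans, one per toy token; whitespace belongs to no token.
offsets = [(0, 8), (9, 11), (12, 14), (15, 21), (21, 22)]

answer_start, answer_end = 15, 21  # "France", end is exclusive

print(toy_char_to_token(offsets, answer_start))    # token of the first answer character
print(toy_char_to_token(offsets, answer_end))      # the "." token, one past the answer
print(toy_char_to_token(offsets, answer_end - 1))  # token of the last answer character
print(toy_char_to_token(offsets, 8))               # a space: no token
```

Looking up `answer_end - 1` (the last character that is actually part of the answer) avoids both problems.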

In [None]:
def add_token_positions(encodings, answers):
    cnt1 = 0  # number of positions lost to truncation
    cnt2 = 0  # number of left shifts needed to find an end token
    start_positions = []
    end_positions = []
    for i in range(len(answers)):
        start_char = answers[i]['answer_start']
        # answer_end is exclusive, so the last answer character is answer_end - 1
        last_char = answers[i]['answer_end'] - 1
        start_positions.append(encodings.char_to_token(i, start_char))
        end_positions.append(encodings.char_to_token(i, last_char))

        # A start position of None means the answer was truncated away
        if start_positions[-1] is None:
            start_positions[-1] = tokenizer.model_max_length
            cnt1 += 1

        # If the end position is None, shift left until a token is found,
        # but never past the start of the answer (this avoids an endless loop)
        shift = 1
        while end_positions[-1] is None and last_char - shift >= start_char:
            end_positions[-1] = encodings.char_to_token(i, last_char - shift)
            shift += 1
            cnt2 += 1

        # If the end position is still None, the answer was truncated away
        if end_positions[-1] is None:
            end_positions[-1] = tokenizer.model_max_length
            cnt1 += 1

    # Store the start and end positions in the encodings
    encodings.update({'start_positions': start_positions, 'end_positions': end_positions})
    print("* ",start_positions)
    print("# ",end_positions)
    print(cnt1," ",cnt2)

add_token_positions(train_encodings, train_answers)
add_token_positions(val_encodings, val_answers)

*  [67, 55, 128, 47, 69, 81, 124, 91, 69, 72, 124, 128, 141, 72, 124, 57, 94, 141, 67, 145, 53, 103, 13, 23, 67, 94, 118, 22, 188, 198, 147, 198, 6, 194, 78, 89, 78, 114, 198, 78, 55, 85, 139, 41, 29, 67, 129, 41, 33, 67, 129, 139, 11, 31, 106, 29, 150, 102, 77, 11, 31, 69, 150, 68, 120, 195, 68, 119, 269, 195, 2, 3, 40, 68, 119, 241, 49, 209, 291, 12, 20, 49, 210, 291, 20, 49, 73, 165, 291, 32, 62, 136, 35, 77, 136, 23, 30, 136, 23, 7, 47, 73, 171, 117, 126, 23, 81, 126, 133, 236, 15, 67, 91, 149, 178, 22, 42, 89, 92, 115, 22, 11, 48, 91, 92, 31, 48, 93, 106, 32, 11, 48, 31, 247, 11, 54, 106, 247, 164, 107, 258, 2, 140, 199, 257, 9, 19, 49, 178, 33, 71, 83, 83, 26, 77, 82, 7, 33, 77, 148, 12, 35, 35, 111, 133, 22, 116, 191, 12, 87, 98, 116, 191, 9, 62, 207, 354, 422, 17, 393, 354, 422, 2, 9, 37, 137, 367, 12, 76, 103, 129, 215, 76, 131, 135, 182, 12, 76, 107, 131, 10, 60, 83, 107, 120, 10, 89, 83, 120, 60, 10, 60, 83, 120, 11, 11, 63, 11, 15, 34, 53, 3, 15, 39, 53, 2, 70, 92, 60, 116,

In [None]:
train_encodings.keys()

dict_keys(['input_ids', 'token_type_ids', 'attention_mask', 'start_positions', 'end_positions'])

#Define the Dataset and convert the data to tensors

In [None]:
class SquadDataset(torch.utils.data.Dataset):
    def __init__(self, encodings):
        self.encodings = encodings

    def __getitem__(self, idx):
        return {key: torch.tensor(val[idx]) for key, val in self.encodings.items()}

    def __len__(self):
        return len(self.encodings.input_ids)

# Build the datasets
train_dataset = SquadDataset(train_encodings)
val_dataset = SquadDataset(val_encodings)

In [None]:
pprint(val_dataset[0])

{'attention_mask': tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0

#Build the BERT model and set the hyperparameters
* **Batch size:** 8, 16
* **Learning rate (lr):**  5e−5, 3e−5, 2e−5
* **epochs:**  3
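
As a rough sketch of what these settings imply, the snippet below enumerates the candidate combinations and computes the number of optimizer steps per epoch as the example count divided by the batch size, rounded up. It assumes the 4001 training examples loaded earlier; the `steps_per_epoch` helper is a hypothetical name used only here.

```python
import math
from itertools import product

batch_sizes = [8, 16]
learning_rates = [5e-5, 3e-5, 2e-5]
epochs = 3
num_train_examples = 4001  # size of the training subset loaded earlier


def steps_per_epoch(num_examples, batch_size):
    """Number of batches per epoch; the last batch may be smaller."""
    return math.ceil(num_examples / batch_size)


for batch_size, lr in product(batch_sizes, learning_rates):
    steps = steps_per_epoch(num_train_examples, batch_size)
    print(f"batch_size={batch_size:2d} lr={lr:.0e} "
          f"steps/epoch={steps} total steps={steps * epochs}")
```

With a batch size of 16 this gives 251 steps per epoch, which is the count shown on the training progress bar.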

In [None]:
from torch.utils.data import DataLoader
from torch.optim import AdamW  # the AdamW class in transformers is deprecated
from tqdm import tqdm

# Use the GPU if one is available
device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')
# Load the model architecture ( BertForQuestionAnswering )
model = BertForQuestionAnswering.from_pretrained('bert-base-uncased').to(device)

# How many training examples are fed to the model at a time
# (each iteration processes 8 or 16 examples)
# batch_size = 8
batch_size = 16

# Initialize the AdamW optimizer (learning rate: how far to step once the direction is known)
optim = AdamW(model.parameters(), lr=5e-5)
# optim = AdamW(model.parameters(), lr=3e-5)
# optim = AdamW(model.parameters(), lr=2e-5)

# How many passes the model makes over the whole training set
epochs = 3

Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForQuestionAnswering: ['cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.seq_relationship.bias', 'cls.predictions.bias', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.decoder.weight']
- This IS expected if you are initializing BertForQuestionAnswering from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForQuestionAnswering from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForQuestionAnswering were not initialized from the model checkpoint at bert-base-uncased a

#Train the model

In [None]:
# Wrap the data in a DataLoader
# shuffle=True reshuffles the training data every epoch before batching,
# which helps make training more stable
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)

for epoch in range(epochs):
    # Put the model in training mode
    model.train()
    # Show a progress bar with tqdm (leave: keep the bar after it finishes)
    loop = tqdm(train_loader, leave=True)
    for batch in loop:
        # Reset the gradients to zero
        optim.zero_grad()
        
        input_ids = batch['input_ids'].to(device)
        attention_mask = batch['attention_mask'].to(device)
        start_positions = batch['start_positions'].to(device)
        end_positions = batch['end_positions'].to(device)

        # forward pass
        outputs = model(input_ids, attention_mask=attention_mask,
                        start_positions=start_positions,
                        end_positions=end_positions)
        
        loss = outputs[0]
        # Backpropagate: compute the gradient of the loss for every trainable parameter
        loss.backward()
        # Update the parameters
        optim.step()
        # Show the epoch and the current batch loss on the progress bar
        loop.set_description(f'Epoch {epoch}')
        loop.set_postfix(loss=loss.item())

Epoch 0: 100%|██████████| 251/251 [06:15<00:00,  1.50s/it, loss=0.0357]
Epoch 1: 100%|██████████| 251/251 [06:15<00:00,  1.50s/it, loss=4.19]
Epoch 2: 100%|██████████| 251/251 [06:15<00:00,  1.50s/it, loss=0.0872]


# Predict and compute the score

In [None]:
model.eval()

# Wrap the data in a DataLoader
# Evaluation does not need shuffling, so the order is kept fixed
val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False)

score = []
for batch in val_loader:
    # No gradients are needed because nothing is being trained
    with torch.no_grad():
        input_ids = batch['input_ids'].to(device)
        attention_mask = batch['attention_mask'].to(device)
        # Gold start and end positions
        start_true = batch['start_positions'].to(device)
        end_true = batch['end_positions'].to(device)
        
        outputs = model(input_ids, attention_mask=attention_mask)

        # Predict
        # start_logits and end_logits: the model's score for each token
        # being the start / end of the answer
        start_pred = torch.argmax(outputs['start_logits'], dim=1)
        end_pred = torch.argmax(outputs['end_logits'], dim=1)

        # Per-batch accuracy of the start and of the end position
        score.append(((start_pred == start_true).sum()/len(start_pred)).item())
        score.append(((end_pred == end_true).sum()/len(end_pred)).item())
# Average the per-batch accuracies
score = sum(score)/len(score)

In [None]:
print(score)

0.48933531746031744
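
The score above is an average of two position accuracies (start and end), not the official SQuAD exact-match or F1 metric. A small sketch on made-up predictions shows how it is formed, alongside a stricter variant that requires both positions to be right. The numbers and the `position_accuracy` helper are hypothetical.

```python
def position_accuracy(pred, true):
    """Fraction of positions predicted exactly right."""
    return sum(p == t for p, t in zip(pred, true)) / len(true)


# Made-up token positions for four examples.
start_pred, start_true = [3, 10, 7, 0], [3, 10, 8, 0]
end_pred, end_true = [4, 12, 9, 2], [4, 11, 9, 5]

start_acc = position_accuracy(start_pred, start_true)
end_acc = position_accuracy(end_pred, end_true)
mean_acc = (start_acc + end_acc) / 2

# Stricter variant: an example counts only if both start and end are correct.
both_correct = sum(
    sp == st and ep == et
    for sp, st, ep, et in zip(start_pred, start_true, end_pred, end_true)
) / len(start_true)

print(start_acc, end_acc, mean_acc, both_correct)
```

The averaged score can be noticeably higher than the both-correct rate, so the two should not be compared directly.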


# Save model

In [None]:
# Save model
torch.save(model,"./test_model")