# GPT-2 Pre-trained
GPT-2 is trained with a simple object:  
  predict the next word, given all of the previous words within some text.  
- Zero-Shot Transfer: The pre-training task for GPT-2 is solely language modeling  



In [5]:
#GPU check
import torch
if torch.cuda.is_available():       
    device = torch.device("cuda")
    print('There are %d GPU(s) available.' % torch.cuda.device_count())
    print('We will use the GPU:', torch.cuda.get_device_name(0))
else:
    print('No GPU available, using the CPU instead.')
    device = torch.device("cpu")

There are 2 GPU(s) available.
We will use the GPU: GeForce RTX 2080 Ti


In [6]:
!pip install transformers



1. Dataset Preprocessing  
- ROCStories sentence5개 연속으로 이루어져 있음.  
- train, valid, test로 나누기  
- input:  
  [start_token]+sentence+[end token]  
  -> input=embedded token+positional encoding
  -> self-attention -> ffnn -> output vector  
  -> 결과물은 최종 어텐션 값을 가지고 있다. * Token embedding값과 곱해서 단어들 확률 값 가장 높은 것이 출력되고, 다음의 입력값이 된다.  

- model 마지막에 classification layer 추가해서 true or false story 분류



In [7]:
import numpy as np
import pandas as pd
dataset=pd.read_csv("/home/inglab/sybae/ROCStories_winter2017 - ROCStories_winter2017.csv")
dataset.head(10)

Unnamed: 0,storyid,storytitle,sentence1,sentence2,sentence3,sentence4,sentence5
0,8bbe6d11-1e2e-413c-bf81-eaea05f4f1bd,David Drops the Weight,David noticed he had put on a lot of weight re...,He examined his habits to try and figure out t...,He realized he'd been eating too much fast foo...,He stopped going to burger places and started ...,"After a few weeks, he started to feel much bet..."
1,0beabab2-fb49-460e-a6e6-f35a202e3348,Frustration,Tom had a very short temper.,One day a guest made him very angry.,He punched a hole in the wall of his house.,Tom's guest became afraid and left quickly.,Tom sat on his couch filled with regret about ...
2,87da1a22-df0b-410c-b186-439700b70ba6,Marcus Buys Khakis,Marcus needed clothing for a business casual e...,All of his clothes were either too formal or t...,He decided to buy a pair of khakis.,The pair he bought fit him perfectly.,Marcus was happy to have the right clothes for...
3,2d16bcd6-692a-4fc0-8e7c-4a6f81d9efa9,Different Opinions,Bobby thought Bill should buy a trailer and ha...,Bill thought a truck would be better for what ...,Bobby pointed out two vehicles were much more ...,Bill was set in his ways with conventional thi...,He ended up buying the truck he wanted despite...
4,c71bb23b-7731-4233-8298-76ba6886cee1,Overcoming shortcomings,John was a pastor with a very bad memory.,He tried to memorize his sermons many days in ...,He decided to learn to sing to overcome his ha...,He then made all his sermons into music and sa...,His congregation was delighted and so was he.
5,4d7b022e-25d2-4300-a9b0-24ab35f4045b,Melody's trip to the aquarium.,Melody's parents surprised her with a trip to ...,Melody took a nap during the two hour car ride...,"When they arrived, Melody was energetic and ex...","At the aquarium Melody saw sharks, tropical fi...","After five hours at the aquarium, Melody and h..."
6,8036c905-f23e-4976-83a1-85d679b5e0c2,Pop Quiz,The math teacher announced a pop quiz as class...,"While some students complained, he began passi...",I took out my pencil and began to work.,"About 5 minutes later, I finished.",I stood up feeling confident and turned it in.
7,77338898-07d4-4143-8451-284540c8b082,My first girlfriend,My first girlfriend i met on the internet.,She lives about 4 hours away from me.,Finally after 2 years we met each other.,She stayed with me for a week or two.,We decided we couldn't be apart so she moved i...
8,110fafd1-2bb7-4ffe-aac7-475706165d41,Charlie Horse,I got Charlie Horse when I was four years old.,"He's a brown stuffed horse, and at 35 I still ...","He was my best friend, and always laid at the ...","I laid him next to me, smelling his soft fur e...",I liked to listen to my radio as I fell asleep...
9,13573c2e-5eed-40eb-bbe5-ed259b5c76a6,Corn,Laura loved corn.,So she decided to grow some in her backyard.,The whole process of growing them made her ver...,But she realized that they required too much w...,So Laura quickly abandoned her corn garden idea.


In [9]:
print(len(dataset))
print('max length of sentence 1: ', max(len(x) for x in dataset['sentence1']))
print('max length of sentence 2: ', max(len(x) for x in dataset['sentence2']))
print('max length of sentence 3: ', max(len(x) for x in dataset['sentence3']))
print('max length of sentence 4: ', max(len(x) for x in dataset['sentence4']))
print('max length of sentence 5: ', max(len(x) for x in dataset['sentence5']))

52665
max length of sentence 1:  79
max length of sentence 2:  73
max length of sentence 3:  86
max length of sentence 4:  78
max length of sentence 5:  93


In [10]:
New_dataset=pd.DataFrame(columns=['sentence1', 'sentence2'])
New_dataset['sentence1']=dataset['sentence1']
New_dataset['sentence2']=dataset['sentence2']
print(len(New_dataset))

append_dataset=pd.DataFrame(columns=['sentence1', 'sentence2'])
append_dataset['sentence1']=dataset['sentence2']
append_dataset['sentence2']=dataset['sentence3']
New_dataset=New_dataset.append(append_dataset, ignore_index = True)
print(len(New_dataset))

append_dataset=pd.DataFrame(columns=['sentence1', 'sentence2'])
append_dataset['sentence1']=dataset['sentence3']
append_dataset['sentence2']=dataset['sentence4']
New_dataset=New_dataset.append(append_dataset, ignore_index = True)
print(len(New_dataset))

append_dataset=pd.DataFrame(columns=['sentence1', 'sentence2'])
append_dataset['sentence1']=dataset['sentence4']
append_dataset['sentence2']=dataset['sentence5']
New_dataset=New_dataset.append(append_dataset, ignore_index = True)
print(len(New_dataset))

52665
105330
157995
210660


In [12]:
import sklearn
New_dataset=sklearn.utils.shuffle(New_dataset) 

In [13]:
New_dataset=New_dataset.sample(n = 10000)
print(len(New_dataset))

10000


In [14]:
New_dataset.head(10)
print('max length of sentence 1: ', max(len(x) for x in New_dataset['sentence1']))
print('max length of sentence 2: ', max(len(x) for x in New_dataset['sentence2']))

max length of sentence 1:  71
max length of sentence 2:  78


In [15]:
#Create a very small test set to compare generated text with the reality
train_set=New_dataset[:9000]
valid_set=New_dataset[9000:10000]
test_set=valid_set
#Reset the indexes
valid_set = valid_set.reset_index()
train_set = train_set.reset_index()
test_set = test_set.reset_index()
print(len(test_set)) #1000
print(len(valid_set)) #1000
print(len(train_set)) #9000


1000
1000
9000


In [16]:
train_df=pd.DataFrame(columns=['sent'])
train_df['sent']=train_set["sentence1"]+train_set["sentence2"]
print('max length of train_df: ', max(len(x) for x in train_df['sent']))

valid_df=pd.DataFrame(columns=['sent'])
valid_df['sent']=valid_set["sentence1"]+valid_set["sentence2"]
print('max length of valid_df: ', max(len(x) for x in valid_df['sent']))


max length of train_df:  141
max length of valid_df:  146


In [17]:
train_df.head(20)
print(type(train_df['sent']))

<class 'pandas.core.series.Series'>


In [18]:
valid_df.head(20)

Unnamed: 0,sent
0,The pastor droned on with the service.William ...
1,When they all arrived and were about to start ...
2,Javier loved playing outside.His dad bought hi...
3,"My brothers and I fought, and one of us broke ..."
4,They decided to have dessert at a bakery.They ...
5,Jim was really into books.He wanted to build a...
6,He would visit often.ONe day the river flooded.
7,Cathy wanted to plan a surprise party for Dann...
8,They needed to get a family vehicle.Tom decide...
9,I bought him a copy as a surprise for his birt...


In [19]:
test_set.head(20)

Unnamed: 0,index,sentence1,sentence2
0,65352,The pastor droned on with the service.,William rolled his eyes.
1,185717,When they all arrived and were about to start ...,They arrest the cool kids and instead of popul...
2,48990,Javier loved playing outside.,His dad bought him a baseball and told him to ...
3,153872,"My brothers and I fought, and one of us broke ...",We painstakingly glued hundreds of pieces back...
4,159873,They decided to have dessert at a bakery.,They talked until 2 am!
5,43017,Jim was really into books.,He wanted to build a home library.
6,88901,He would visit often.,ONe day the river flooded.
7,79467,Cathy wanted to plan a surprise party for Dann...,She planned a luncheon at his favorite restaur...
8,103389,They needed to get a family vehicle.,Tom decided to get a minivan.
9,156355,I bought him a copy as a surprise for his birt...,He didn't seem to appreciate it much.


In [20]:
train_df.to_csv('/home/inglab/sybae/train_dataset.csv', index=False) #dataset 저장
valid_df.to_csv('/home/inglab/sybae/valid_dataset.csv', index=False) #dataset 저장
test_set.to_csv('/home/inglab/sybae/test_dataset.csv', index=False) #test set 저장


In [21]:
from torch.utils.data import Dataset, DataLoader, RandomSampler, SequentialSampler
import torch.nn.functional as F

from transformers import GPT2Tokenizer, GPT2LMHeadModel
from transformers import AdamW, get_linear_schedule_with_warmup
import random
import time
from tqdm import tqdm

In [22]:
#gpt-model setting
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2')

In [23]:
max_length=512

class ROCstoriesDataset(Dataset):
    def __init__(self, control_code, truncate=False, max_length=1024):
        super().__init__()
        self.tokenizer = GPT2Tokenizer.from_pretrained('gpt2') 
        self.input_ids=[]
    
        for row in train_df['sent']:
            encodings_dict=self.tokenizer.encode(f"<|{control_code}|>{row[:max_length]}<|endoftext|>")
            self.input_ids.append(torch.tensor(encodings_dict))
        if truncate:
            self.input_ids=self.input_ids[:20000]
        self.count = len(self.input_ids)
    def __len__(self):
        return self.count

    def __getitem__(self, idx):
        return self.input_ids[idx]

In [24]:
train_rocdataset=ROCstoriesDataset(train_df['sent'], truncate=True, max_length=max_length)
valid_rocdataset=ROCstoriesDataset(valid_df['sent'], truncate=True, max_length=max_length)

In [25]:
#Accumulated batch size (since GPT2 is so big)
def pack_tensor(new_tensor, packed_tensor, max_seq_len):
    if packed_tensor is None:
        return new_tensor, True, None
    if new_tensor.size()[1] + packed_tensor.size()[1] > max_seq_len:
        return packed_tensor, False, new_tensor
    else:
        packed_tensor = torch.cat([new_tensor, packed_tensor[:, 1:]], dim=1)
        return packed_tensor, True, None

In [26]:
import os
import math
def train(
    train_rocdataset, valid_rocdataset, model, tokenizer,
    batch_size=4, epochs=5, lr=1e-4, eps=1e-8,
    max_seq_len=400, warmup_steps=200, output_dir=".", output_prefix="wreckgar", 
    test_mode=False, save_model_on_epoch=False,
):

    acc_steps = 100
    
    device=torch.device("cuda")
    model = model.cuda()
    torch.cuda.empty_cache()
    model.train()

    optimizer = AdamW(model.parameters(),
                  lr = lr, # args.learning_rate
                  eps = eps # args.adam_epsilon
                )

    scheduler = get_linear_schedule_with_warmup(
        optimizer, num_warmup_steps=warmup_steps, num_training_steps = -1
        )

    train_dataloader = DataLoader(train_rocdataset, batch_size=1,shuffle=True)
    valid_dataloader = DataLoader(valid_rocdataset, batch_size=1,shuffle=True)
    loss=0
    accumulating_batch_count = 0
    input_tensor = None

    for epoch in range(epochs):

        print(f"Training epoch {epoch}")
        print(loss)
        for idx, entry in tqdm(enumerate(train_dataloader)):
            (input_tensor, carry_on, remainder) = pack_tensor(entry, input_tensor, 768)

            if carry_on and idx != len(train_dataloader) - 1:
                continue

            input_tensor = input_tensor.to(device)
            outputs = model(input_tensor, labels=input_tensor)
            loss = outputs[0]
            loss.backward()

            if (accumulating_batch_count % batch_size) == 0:
                optimizer.step()
                scheduler.step()
                optimizer.zero_grad()
                model.zero_grad()

            accumulating_batch_count += 1
            input_tensor = None

        model.eval()
        epoch_loss=[] 
        val_loss=0
        accumulating_batch_count = 0

        with torch.no_grad():
            for epoch in range(epochs):
                print(val_loss)
                for idx, entry in tqdm(enumerate(valid_dataloader)):
                    (input_tensor, carry_on, remainder) = pack_tensor(entry, input_tensor, 768)

                    if carry_on and idx != len(valid_dataloader) - 1:
                        continue

                    input_tensor = input_tensor.to(device)
                    outputs = model(input_tensor, labels=input_tensor)
                    val_loss = outputs[0]
                    epoch_loss+=[val_loss.cpu().item()]

                    accumulating_batch_count += 1
                    input_tensor = None
    
    epoch_loss=np.mean(epoch_loss)
    perplexity=math.exp(epoch_loss)
    print("perplexity: ",perplexity)

    return model 

In [26]:
#Train the model on the specific data we have
model = train(train_rocdataset,valid_rocdataset, model, tokenizer)

Training epoch 0
0


9000it [03:58, 37.72it/s]


0


9000it [01:12, 124.44it/s]


tensor(2.2927, device='cuda:0')


9000it [01:12, 123.92it/s]


tensor(2.2943, device='cuda:0')


9000it [01:12, 123.67it/s]


tensor(2.2437, device='cuda:0')


9000it [01:12, 123.52it/s]


tensor(2.3143, device='cuda:0')


9000it [01:13, 123.26it/s]


Training epoch 1
tensor(0.2770, device='cuda:0', grad_fn=<NllLossBackward0>)


9000it [03:57, 37.86it/s]


0


9000it [01:13, 122.45it/s]


tensor(2.2663, device='cuda:0')


9000it [01:13, 122.81it/s]


tensor(2.2781, device='cuda:0')


9000it [01:13, 123.25it/s]


tensor(2.2706, device='cuda:0')


9000it [01:12, 123.58it/s]


tensor(2.2583, device='cuda:0')


9000it [01:12, 123.63it/s]


Training epoch 2
tensor(0.2063, device='cuda:0', grad_fn=<NllLossBackward0>)


9000it [03:57, 37.94it/s]


0


9000it [01:12, 123.58it/s]


tensor(2.2964, device='cuda:0')


9000it [01:12, 123.53it/s]


tensor(2.2461, device='cuda:0')


9000it [01:12, 123.67it/s]


tensor(2.2888, device='cuda:0')


9000it [01:12, 123.67it/s]


tensor(2.2275, device='cuda:0')


9000it [01:12, 123.70it/s]


Training epoch 3
tensor(0.2390, device='cuda:0', grad_fn=<NllLossBackward0>)


9000it [03:57, 37.91it/s]


0


9000it [01:12, 123.68it/s]


tensor(2.2265, device='cuda:0')


9000it [01:12, 123.57it/s]


tensor(2.2681, device='cuda:0')


9000it [01:12, 123.60it/s]


tensor(2.2777, device='cuda:0')


9000it [01:12, 123.64it/s]


tensor(2.2407, device='cuda:0')


9000it [01:12, 123.64it/s]


Training epoch 4
tensor(0.2211, device='cuda:0', grad_fn=<NllLossBackward0>)


9000it [03:57, 37.93it/s]


0


9000it [01:12, 123.58it/s]


tensor(2.2799, device='cuda:0')


9000it [01:12, 123.45it/s]


tensor(2.2420, device='cuda:0')


9000it [01:12, 123.56it/s]


tensor(2.2820, device='cuda:0')


9000it [01:12, 123.70it/s]


tensor(2.3048, device='cuda:0')


9000it [01:12, 123.72it/s]

perplexity:  9.797349797497557





In [27]:
torch.save(model,'/home/inglab/sybae/gpt2-generation.pt')

In [27]:
#Load the model to use it
model = torch.load('/home/inglab/sybae/gpt2-generation.pt')

In [101]:
#generation 코드
def generate(
    model,
    tokenizer,
    prompt,
    entry_count=10,
    entry_length=300, #maximum number of words
    top_p=0.85,
    temperature=1.0,
):
    model.eval()

    generated_num = 0
    generated_list = []

    filter_value = -float("Inf")

    with torch.no_grad():

        for entry_idx in range(entry_count):

            entry_finished = False

            generated = torch.tensor(tokenizer.encode(prompt)).unsqueeze(0) #sentent1

            for i in range(entry_length):
                outputs = model(generated, labels=generated)
                loss, logits = outputs[:2]
                logits = logits[:, -1, :] / (temperature if temperature > 0 else 1.0)

                sorted_logits, sorted_indices = torch.sort(logits, descending=True)
                cumulative_probs = torch.cumsum(F.softmax(sorted_logits, dim=-1), dim=-1)

                sorted_indices_to_remove = cumulative_probs > top_p
                sorted_indices_to_remove[..., 1:] = sorted_indices_to_remove[
                    ..., :-1
                ].clone()
                sorted_indices_to_remove[..., 0] = 0

                indices_to_remove = sorted_indices[sorted_indices_to_remove]
                logits[:, indices_to_remove] = filter_value

                next_token = torch.multinomial(F.softmax(logits, dim=-1), num_samples=1)
                generated = torch.cat((generated, next_token), dim=1)

                if next_token in tokenizer.encode("<|endoftext|>"):
                    entry_finished = True

                if entry_finished:

                    generated_num = generated_num + 1

                    output_list = list(generated.squeeze().numpy())
                    output_text = tokenizer.decode(output_list)
                    generated_list.append(output_text)
                    break
            
            if not entry_finished:
              output_list = list(generated.squeeze().numpy())
              output_text = f"{tokenizer.decode(output_list)}<|endoftext|>" 
              generated_list.append(output_text)
                
    return str(generated_list)

In [102]:
import re
#Function to generate multiple sentences. Test data should be a dataframe
def text_generation(text):
  generated_text = []
  for i in range(len(text)):
    x = generate(model.to('cpu'), tokenizer, text['sentence1'][i], entry_count=1)
    #print(x)
    a=text['sentence1'][i] #we want
    #print(a)
    b=x[len(a)+2:-1]
    b=str(b)
    #print(b)
    b=b.replace('\n',"")
    b=b.replace('\n\n',"")
    b=b.replace('\t',"")
    b=b.replace('"',"")
    b=b.replace('<|endoftext|>',"")  
    c=b.split(sep='.')[0]  
    print(i, ": ",c)
    generated_text.append(c+".")
  return generated_text

In [103]:
test_set=pd.read_csv('/home/inglab/sybae/test_dataset.csv')
new_test_set=test_set[:100]
generated_text = text_generation(new_test_set)
new_test_set['generated_text'] = generated_text

0 :  \n\nIn response, Jim apologized and ordered him to take a seat
1 :   The cop fired a shot
2 :  \n\nNatalie was nervous and stuck to the work trip
3 :  '
4 :  '
5 :  \n\n8:01PM|0|0       During the walk, he dropped down on one knee
6 :  \n\nOne night he bought a new job at the airport
7 :  \n\nThe woman who was assigned to make the party would be happy to hear Danny's name
8 :  \n\nThe people of Bexar County pulled him over
9 :  \n\nHe looked at it carefully, and then gave me a go
10 :   The man ordered his attention back to her
11 :  \n\nSo she should be a fan!'
12 :  \n\nPete said the hike was great
13 :   The BeachTime crew arrived and he was the son of his aunt
14 :  \n\nIt then came to the office and the employee told her that she was working with her boss
15 :  '
16 :  
17 :  \n\nThe credit cards that were on the anniversary are going to expire today
18 :  '
19 :  \n\nMiguel didn't know he could do anything to get it
20 :  '
21 :  \n\nThe fair was a raucous affair
22 :  '
23 

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  after removing the cwd from sys.path.


In [104]:
new_test_set.head(20)

Unnamed: 0,index,sentence1,sentence2,generated_text
0,65352,The pastor droned on with the service.,William rolled his eyes.,"\n\nIn response, Jim apologized and ordered hi..."
1,185717,When they all arrived and were about to start ...,They arrest the cool kids and instead of popul...,The cop fired a shot.
2,48990,Javier loved playing outside.,His dad bought him a baseball and told him to ...,\n\nNatalie was nervous and stuck to the work ...
3,153872,"My brothers and I fought, and one of us broke ...",We painstakingly glued hundreds of pieces back...,'.
4,159873,They decided to have dessert at a bakery.,They talked until 2 am!,'.
5,43017,Jim was really into books.,He wanted to build a home library.,"\n\n8:01PM|0|0 During the walk, he dropp..."
6,88901,He would visit often.,ONe day the river flooded.,\n\nOne night he bought a new job at the airport.
7,79467,Cathy wanted to plan a surprise party for Dann...,She planned a luncheon at his favorite restaur...,\n\nThe woman who was assigned to make the par...
8,103389,They needed to get a family vehicle.,Tom decided to get a minivan.,\n\nThe people of Bexar County pulled him over.
9,156355,I bought him a copy as a surprise for his birt...,He didn't seem to appreciate it much.,"\n\nHe looked at it carefully, and then gave m..."


In [105]:
print(len(new_test_set))

print(new_test_set['generated_text'][18])
print(type(new_test_set['generated_text'][18]))
print(len(new_test_set['generated_text'][18]))

for i,s in enumerate(new_test_set['generated_text']):
    if(len(s)<=5):
        new_test_set=new_test_set.drop(index=i, axis=0)

print(len(new_test_set))


100
'.
<class 'str'>
2
72


In [106]:
new_test_set.head(50)

Unnamed: 0,index,sentence1,sentence2,generated_text
0,65352,The pastor droned on with the service.,William rolled his eyes.,"\n\nIn response, Jim apologized and ordered hi..."
1,185717,When they all arrived and were about to start ...,They arrest the cool kids and instead of popul...,The cop fired a shot.
2,48990,Javier loved playing outside.,His dad bought him a baseball and told him to ...,\n\nNatalie was nervous and stuck to the work ...
5,43017,Jim was really into books.,He wanted to build a home library.,"\n\n8:01PM|0|0 During the walk, he dropp..."
6,88901,He would visit often.,ONe day the river flooded.,\n\nOne night he bought a new job at the airport.
7,79467,Cathy wanted to plan a surprise party for Dann...,She planned a luncheon at his favorite restaur...,\n\nThe woman who was assigned to make the par...
8,103389,They needed to get a family vehicle.,Tom decided to get a minivan.,\n\nThe people of Bexar County pulled him over.
9,156355,I bought him a copy as a surprise for his birt...,He didn't seem to appreciate it much.,"\n\nHe looked at it carefully, and then gave m..."
10,97266,At first he was shy and scared of the baby.,Sam didn't like the baby because he would cry ...,The man ordered his attention back to her.
11,33576,Neil loves watching rock concerts.,He went to a show with his girlfriend.,\n\nSo she should be a fan!'.


In [99]:
new_test_set.to_csv('/home/inglab/sybae/test_dataset100.csv', index=False)

In [100]:
# metrix 평가
test_set_load=pd.read_csv('/home/inglab/sybae/test_dataset002.csv')

In [74]:
#Using BLEU score to compare the real sentences with the generated ones
import statistics
from nltk.translate.bleu_score import sentence_bleu
sentence2=[]
generated=[]

scores=[]

for i in range(len(test_set_load)):
  sentence2.append(test_set_load['sentence2'][i])
  generated.append(test_set_load['generated_text'][i])
weights = (0.25,0.25,0.25,0.25)
score1 = sentence_bleu([sentence2], generated, weights)
score1

0

In [63]:
!pip install rouge



In [75]:
#Rouge score
from rouge import Rouge
rouge=Rouge()

rouge.get_scores(test_set_load['generated_text'], test_set_load['sentence2'], avg=True)

{'rouge-1': {'r': 0.08076212676212674,
  'p': 0.11050310118731169,
  'f': 0.0871779689622092},
 'rouge-2': {'r': 0.02080919080919081,
  'p': 0.020285714285714285,
  'f': 0.018692810094783296},
 'rouge-l': {'r': 0.07767299367299366,
  'p': 0.10478881547302599,
  'f': 0.08317128558710475}}