### Fine-tuning gpt2-small

GPT-2 is a transformers model pretrained on a very large corpus of English data in a self-supervised fashion. This means it was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots of publicly available data) with an automatic process to generate inputs and labels from those texts. More precisely, it was trained to guess the next word in sentences.

Moreover, inputs are sequences of continuous text of a certain length and the targets are the same sequence, shifted one token (word or piece of word) to the right. Internally, the model uses a masking mechanism to make sure the prediction for token i only uses the inputs from 1 to i and none of the future tokens.
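To make the shifting and the masking concrete, here is a minimal sketch in plain Python. The token list and the helper name `causal_mask` are made up for illustration and are not part of the notebook or of the transformers library.

```python
# Toy illustration of next-token targets and a causal mask (hypothetical tokens).
tokens = ["You", " have", " an", " array"]

# Inputs are all tokens but the last; targets are the same sequence shifted by one.
inputs = tokens[:-1]
targets = tokens[1:]

def causal_mask(n):
    """Return an n x n matrix where row i may look at columns 0..i only."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]

mask = causal_mask(len(inputs))
for tok, tgt, row in zip(inputs, targets, mask):
    print(f"{tok!r:>8} -> {tgt!r:<8} visible positions: {row}")
```

Each row of the mask is the set of positions the model may use when predicting the target on that row, so no prediction can peek at later tokens.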

This is the smallest version of GPT-2, with 124M parameters. 

Connect to Google Drive and load the dataset

In [49]:
try:
  from google.colab import drive
  drive.mount('/content/drive')
  import sys
  path_to_project = '/content/drive/MyDrive/NLP_Project'
  sys.path.append(path_to_project)
  IN_COLAB = True
except ImportError:  # google.colab is missing, so we are not running in Colab
  IN_COLAB = False

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [51]:
import pandas as pd
import torch
import numpy as np

In [52]:
dataset_path = path_to_project + '/final_ds.csv' if IN_COLAB else '/final_ds.csv'
df = pd.read_csv(dataset_path)

Import libraries that are going to be used in the following subsections

In [None]:
!pip install -q transformers
!pip install datasets

In [None]:
from datasets import Dataset, DatasetDict

#### Dataset Pre-processing



To fine-tune gpt2-small we decided to keep the first 15000 samples. Training on this subset lasts approximately 2 hours and, given the size and the pre-training of the model, gpt2 performed better with a larger number of samples.

In [None]:
df = df.head(15000).copy()
df.shape

Split the data into a training set, a validation set and a test set. We decided to use 10% of the dataset for testing, and to split the remainder 80/20 between training and validation.

In [57]:
from sklearn.model_selection import train_test_split

train_val, test = train_test_split(df, test_size=0.1)
train, val = train_test_split(train_val, test_size=0.2)

In [58]:
print('# train instances: ', train.shape[0])
print('# test instances:  ', test.shape[0])
print('# val instances:   ', val.shape[0])

# train instances:  10800
# test instances:   1500
# val instances:    2700
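The sizes printed above follow directly from the two split fractions. The sketch below reproduces them with integer arithmetic; it does not call scikit-learn, and the variable names are illustrative only.

```python
# Reproduce the split sizes with integer arithmetic (percentages as in the cells above).
n_total = 15000
n_test = n_total * 10 // 100       # 10% held out for testing
n_train_val = n_total - n_test
n_val = n_train_val * 20 // 100    # 20% of the remainder for validation
n_train = n_train_val - n_val

print(n_train, n_val, n_test)
print(f"train {n_train / n_total:.0%}, val {n_val / n_total:.0%}, test {n_test / n_total:.0%}")
```

Relative to the full 15000 samples, the effective split is therefore 72% training, 18% validation and 10% test.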


Now we check the device

In [59]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(device)

cuda


#### Model and Tokenizer

Retrieve the model and the tokenizer

In [64]:
from transformers import GPT2Tokenizer, GPT2LMHeadModel

In [65]:
model_name = 'gpt2'

Load the tokenizer

In [66]:
tokenizer_gpt2 = GPT2Tokenizer.from_pretrained(model_name)

In [67]:
# GPT-2 has no dedicated padding token, so the end-of-text token is reused for padding
tokenizer_gpt2.pad_token = tokenizer_gpt2.eos_token

In [68]:
print("vocabulary size: ", tokenizer_gpt2.vocab_size)

vocabulary size:  50257


In [69]:
list(tokenizer_gpt2.get_vocab().items())[600:610]

[('int', 600),
 ('ress', 601),
 ('ations', 602),
 ('ail', 603),
 ('Ġ4', 604),
 ('ical', 605),
 ('Ġthem', 606),
 ('Ġher', 607),
 ('ount', 608),
 ('ĠCh', 609)]

In [70]:
text = "You have an array in input, order the elements in it in O(n) time complexity. Add a wrong wordd"
encoded_input = tokenizer_gpt2.tokenize(text)  # public method instead of the private _tokenize
print(encoded_input)

['You', 'Ġhave', 'Ġan', 'Ġarray', 'Ġin', 'Ġinput', ',', 'Ġorder', 'Ġthe', 'Ġelements', 'Ġin', 'Ġit', 'Ġin', 'ĠO', '(', 'n', ')', 'Ġtime', 'Ġcomplexity', '.', 'ĠAdd', 'Ġa', 'Ġwrong', 'Ġword', 'd']
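In the token list above, the character 'Ġ' is how GPT-2's byte-level BPE writes a leading space. The sketch below, with a hypothetical helper `detokenize`, rebuilds the sentence from the printed tokens. It is only valid for plain ASCII text like this example; in practice the tokenizer's own `convert_tokens_to_string` or `decode` should be preferred.

```python
# Tokens copied from the output above; 'Ġ' stands for a space before the token.
tokens = ['You', 'Ġhave', 'Ġan', 'Ġarray', 'Ġin', 'Ġinput', ',', 'Ġorder', 'Ġthe',
          'Ġelements', 'Ġin', 'Ġit', 'Ġin', 'ĠO', '(', 'n', ')', 'Ġtime',
          'Ġcomplexity', '.', 'ĠAdd', 'Ġa', 'Ġwrong', 'Ġword', 'd']

def detokenize(tokens):
    """Join the sub-word pieces and turn the 'Ġ' marker back into a space (ASCII-only sketch)."""
    return "".join(tokens).replace("Ġ", " ")

text = detokenize(tokens)
print(len(tokens), "tokens")
print(text)
```

Note how the misspelled word "wordd" is not in the vocabulary as a whole and is split into 'Ġword' and 'd'.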


In [71]:
encoded_ids = tokenizer_gpt2(text)['input_ids']
print(encoded_ids)

[1639, 423, 281, 7177, 287, 5128, 11, 1502, 262, 4847, 287, 340, 287, 440, 7, 77, 8, 640, 13357, 13, 3060, 257, 2642, 1573, 67]


It was necessary to use GPT2LMHeadModel instead of GPT2Model, because the latter is not compatible with the Trainer: it has no language-modeling head, so its forward pass does not accept labels and does not return a loss.

In [73]:
gpt2 = GPT2LMHeadModel.from_pretrained(model_name, device_map=device)
gpt2.resize_token_embeddings(len(tokenizer_gpt2))

Embedding(50257, 768)

In [74]:
print(gpt2)

GPT2LMHeadModel(
  (transformer): GPT2Model(
    (wte): Embedding(50257, 768)
    (wpe): Embedding(1024, 768)
    (drop): Dropout(p=0.1, inplace=False)
    (h): ModuleList(
      (0-11): 12 x GPT2Block(
        (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
        (attn): GPT2Attention(
          (c_attn): Conv1D(nf=2304, nx=768)
          (c_proj): Conv1D(nf=768, nx=768)
          (attn_dropout): Dropout(p=0.1, inplace=False)
          (resid_dropout): Dropout(p=0.1, inplace=False)
        )
        (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
        (mlp): GPT2MLP(
          (c_fc): Conv1D(nf=3072, nx=768)
          (c_proj): Conv1D(nf=768, nx=3072)
          (act): NewGELUActivation()
          (dropout): Dropout(p=0.1, inplace=False)
        )
      )
    )
    (ln_f): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
  )
  (lm_head): Linear(in_features=768, out_features=50257, bias=False)
)


Consider now the number of parameters of the model

In [75]:
n_params = sum(param.numel() for param in gpt2.parameters())
n_params

124439808
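The count above can be recovered by hand from the layer shapes printed by `print(gpt2)`. The sketch below does the arithmetic in plain Python; the helper `linear` and the variable names are illustrative. The `lm_head` matrix is not added separately because GPT-2 shares it with the token embedding `wte`, so it is counted once.

```python
# Sizes read from the printed architecture: vocabulary, context length, hidden size, layers.
vocab, n_ctx, d, n_layer = 50257, 1024, 768, 12

def linear(n_in, n_out):
    """Parameters of a dense layer with bias (Conv1D in the GPT-2 implementation)."""
    return n_in * n_out + n_out

layer_norm = 2 * d  # weight and bias

block = (
    layer_norm             # ln_1
    + linear(d, 3 * d)     # attn.c_attn (query, key and value projections together)
    + linear(d, d)         # attn.c_proj
    + layer_norm           # ln_2
    + linear(d, 4 * d)     # mlp.c_fc
    + linear(4 * d, d)     # mlp.c_proj
)

embeddings = vocab * d + n_ctx * d           # wte + wpe
total = embeddings + n_layer * block + layer_norm  # plus the final ln_f

print(f"per block: {block:,}")
print(f"total:     {total:,}")
```

Roughly 31% of the parameters sit in the token embedding matrix, and the remaining ones are almost entirely in the 12 transformer blocks.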

In [76]:
for name, param in gpt2.named_parameters():
    print(f"Parameter name: {name}")
    print(f"Parameter shape: {param.size()}")
    print(f"Is trainable: {param.requires_grad}")
    print()

Parameter name: transformer.wte.weight
Parameter shape: torch.Size([50257, 768])
Is trainable: True

Parameter name: transformer.wpe.weight
Parameter shape: torch.Size([1024, 768])
Is trainable: True

Parameter name: transformer.h.0.ln_1.weight
Parameter shape: torch.Size([768])
Is trainable: True

Parameter name: transformer.h.0.ln_1.bias
Parameter shape: torch.Size([768])
Is trainable: True

Parameter name: transformer.h.0.attn.c_attn.weight
Parameter shape: torch.Size([768, 2304])
Is trainable: True

Parameter name: transformer.h.0.attn.c_attn.bias
Parameter shape: torch.Size([2304])
Is trainable: True

Parameter name: transformer.h.0.attn.c_proj.weight
Parameter shape: torch.Size([768, 768])
Is trainable: True

Parameter name: transformer.h.0.attn.c_proj.bias
Parameter shape: torch.Size([768])
Is trainable: True

Parameter name: transformer.h.0.ln_2.weight
Parameter shape: torch.Size([768])
Is trainable: True

Parameter name: transformer.h.0.ln_2.bias
Parameter shape: torch.Size([7

#### Training


Now that we have both Tokenizer and Model we can tokenize the dataset and train the model

The EOS tokens have to be added by hand to obtain a single string for each 'conversation'. Each string contains the user input (the description of the problem plus the time and space complexities) followed by the expected answer from the assistant.

In [None]:
def apply_eos_token(idx, df, eos_token):
    # build the user input including the problem description, the time complexity and the space complexity
    chat_string = 'User:' + df.loc[idx, 'problem_description'] + ' Time complexity: ' + df.loc[idx, 'time_complexity_inferred'] + '; Space complexity: ' + df.loc[idx, 'space_complexity_inferred']
    # now add the eos token and the response from the assistant, the code solution for the problem
    chat_string = chat_string + eos_token + 'Assistant: ' + df.loc[idx, 'solution_code'] + eos_token
    return chat_string

In [None]:
train_str = [apply_eos_token(idx, train, tokenizer_gpt2.eos_token) for idx in train.index]
test_str = [apply_eos_token(idx, test, tokenizer_gpt2.eos_token) for idx in test.index]
val_str = [apply_eos_token(idx, val, tokenizer_gpt2.eos_token) for idx in val.index]
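The sketch below shows the layout of one such string and how it can later be split back into its two parts. The function `build_chat` and the example row are hypothetical; only the 'User:' / 'Assistant: ' layout and the position of the EOS tokens mirror `apply_eos_token`.

```python
EOS = "<|endoftext|>"  # GPT-2 end-of-text token

def build_chat(description, time_c, space_c, solution, eos=EOS):
    """Same layout as apply_eos_token, but taking plain strings instead of a DataFrame row."""
    user = f"User:{description} Time complexity: {time_c}; Space complexity: {space_c}"
    return user + eos + "Assistant: " + solution + eos

# Made-up example row, for illustration only.
chat = build_chat("Sum two integers.", "O(1)", "O(1)",
                  "print(sum(map(int, input().split())))")

# Splitting on the EOS token yields the user part, the assistant part and an empty tail.
user_part, assistant_part, tail = chat.split(EOS)
print(repr(user_part))
print(repr(assistant_part))
print(repr(tail))
```

The empty third element comes from the trailing EOS token, which is why later cells unpack the split into three names.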

In [79]:
train_str[0]

"User: Naruto has sneaked into the Orochimaru's lair and is now looking for Sasuke. There are T rooms there. Every room has a door into it, each door can be described by the number n of seals on it and their integer energies a_1, a_2, ..., a_n. All energies a_i are nonzero and do not exceed 100 by absolute value. Also, n is even.\n\nIn order to open a door, Naruto must find such n seals with integer energies b_1, b_2, ..., b_n that the following equality holds: a_{1} ⋅ b_{1} + a_{2} ⋅ b_{2} + ... + a_{n} ⋅ b_{n} = 0. All b_i must be nonzero as well as a_i are, and also must not exceed 100 by absolute value. Please find required seals for every room there.\n\nInput\n\nThe first line contains the only integer T (1 ≤ T ≤ 1000) standing for the number of rooms in the Orochimaru's lair. The other lines contain descriptions of the doors.\n\nEach description starts with the line containing the only even integer n (2 ≤ n ≤ 100) denoting the number of seals.\n\nThe following line contains the s

The strings with the EOS tokens applied are now wrapped in Dataset objects, each with a single 'chat' column

In [80]:
train_data = Dataset.from_dict({'chat': train_str})
test_data = Dataset.from_dict({'chat': test_str})
val_data = Dataset.from_dict({'chat': val_str})

In [81]:
data = DatasetDict()
data['train'] = train_data
data['val'] = val_data
data['test'] = test_data

Tokenize the data

In [82]:
def tokenize_function(examples):
    input_encodings = tokenizer_gpt2(examples["chat"],
        truncation=True,
        padding="max_length",
        max_length=512)
    sample = {
        'input_ids': input_encodings.input_ids,
        'attention_mask': input_encodings.attention_mask,
        'labels': input_encodings.input_ids.copy()
    }
    return sample

tokenized_data = data.map(tokenize_function, batched=True)
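As a rough picture of what `truncation=True` with `padding="max_length"` does to each example, the sketch below pads or cuts a list of ids to a fixed length and builds the matching attention mask. It is a simplified stand-in, not the tokenizer's implementation: `pad_or_truncate` is a hypothetical helper, the target length 6 replaces 512 for readability, and padding on the right is assumed.

```python
EOS_ID = 50256  # id of '<|endoftext|>' in the GPT-2 vocabulary, reused here as the pad id

def pad_or_truncate(ids, max_length, pad_id=EOS_ID):
    """Cut ids to max_length, then pad on the right; the mask is 1 for real tokens, 0 for padding."""
    ids = ids[:max_length]
    n_pad = max_length - len(ids)
    input_ids = ids + [pad_id] * n_pad
    attention_mask = [1] * len(ids) + [0] * n_pad
    return input_ids, attention_mask

short_ids = [1639, 423, 281, 7177]   # shorter than the limit: gets padded
long_ids = list(range(10))           # longer than the limit: gets truncated

print(pad_or_truncate(short_ids, 6))
print(pad_or_truncate(long_ids, 6))
```

One consequence worth keeping in mind is that a conversation longer than the limit loses its tail, which for long problem statements may include part or all of the solution code.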

Map:   0%|          | 0/10800 [00:00<?, ? examples/s]

Map:   0%|          | 0/2700 [00:00<?, ? examples/s]

Map:   0%|          | 0/1500 [00:00<?, ? examples/s]

In [83]:
# collator for causal language modeling (mlm=False): stacks the examples into a batch and builds the labels from input_ids
from transformers import DataCollatorForLanguageModeling
data_collator = DataCollatorForLanguageModeling(tokenizer_gpt2, mlm=False)
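As far as we understand the collator, with `mlm=False` it rebuilds the labels from `input_ids` and replaces every position equal to the pad token id with -100, the index that the cross-entropy loss ignores. Because the pad token was set to the EOS token, this would also mask the genuine EOS tokens inside each conversation. The sketch below imitates that rule in plain Python; `causal_lm_labels` is a hypothetical helper and the ids are a toy example.

```python
IGNORE_INDEX = -100  # positions with this label do not contribute to the loss
EOS_ID = 50256       # GPT-2 end-of-text id, also used as pad id in this notebook

def causal_lm_labels(input_ids, pad_id=EOS_ID):
    """Copy input_ids and mask every pad position (assumed collator behaviour)."""
    return [IGNORE_INDEX if tok == pad_id else tok for tok in input_ids]

# Toy sequence: user tokens, EOS, assistant tokens, EOS, then two padding positions.
input_ids = [1639, 423, EOS_ID, 3060, 257, EOS_ID, EOS_ID, EOS_ID]
labels = causal_lm_labels(input_ids)
print(labels)
```

If this assumption holds, the model is never directly trained to emit the EOS token, which may be one reason generation tends to run until `max_new_tokens` is reached.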

Before training, we measure the baseline behaviour of the pre-trained model.

#### Calculate Perplexity before fine-tuning

Get a random conversation from the test set

In [84]:
import random

random.seed(43)

idx = random.choice(range(len(test_data))) # select a random conversation
print(idx)
dialogue = test_data['chat'][idx]
print(dialogue)

78
User: Pavel has several sticks with lengths equal to powers of two.

He has a_0 sticks of length 2^0 = 1, a_1 sticks of length 2^1 = 2, ..., a_{n-1} sticks of length 2^{n-1}. 

Pavel wants to make the maximum possible number of triangles using these sticks. The triangles should have strictly positive area, each stick can be used in at most one triangle.

It is forbidden to break sticks, and each triangle should consist of exactly three sticks.

Find the maximum possible number of triangles.

Input

The first line contains a single integer n (1 ≤ n ≤ 300 000) — the number of different lengths of sticks.

The second line contains n integers a_0, a_1, ..., a_{n-1} (1 ≤ a_i ≤ 10^9), where a_i is the number of sticks with the length equal to 2^i.

Output

Print a single integer — the maximum possible number of non-degenerate triangles that Pavel can make.

Examples

Input


5
1 2 2 2 2


Output


3


Input


3
1 1 1


Output


0


Input


3
3 3 3


Output


3

Note

In the first example,

In [85]:
# now take only the 'input part' and the 'output part'
# parse string
test_input, test_output, end = dialogue.split('<|endoftext|>')

test_input = test_input + tokenizer_gpt2.eos_token + ' Assistant: '
test_output = test_output + tokenizer_gpt2.eos_token

print('User input: ', test_input)
print('####')
print('Correct solution output: ', test_output)

User input:  User: Pavel has several sticks with lengths equal to powers of two.

He has a_0 sticks of length 2^0 = 1, a_1 sticks of length 2^1 = 2, ..., a_{n-1} sticks of length 2^{n-1}. 

Pavel wants to make the maximum possible number of triangles using these sticks. The triangles should have strictly positive area, each stick can be used in at most one triangle.

It is forbidden to break sticks, and each triangle should consist of exactly three sticks.

Find the maximum possible number of triangles.

Input

The first line contains a single integer n (1 ≤ n ≤ 300 000) — the number of different lengths of sticks.

The second line contains n integers a_0, a_1, ..., a_{n-1} (1 ≤ a_i ≤ 10^9), where a_i is the number of sticks with the length equal to 2^i.

Output

Print a single integer — the maximum possible number of non-degenerate triangles that Pavel can make.

Examples

Input


5
1 2 2 2 2


Output


3


Input


3
1 1 1


Output


0


Input


3
3 3 3


Output


3

Note

In the firs

We decided to use the pipeline helper from the transformers library, with the task 'text-generation', to generate the output of the model

In [None]:
!pip install -q transformers evaluate
from transformers import pipeline
import evaluate

In [87]:
pipe = pipeline("text-generation", model= gpt2, tokenizer=tokenizer_gpt2)

Device set to use cuda


In [88]:
gen_output = pipe(test_input, max_new_tokens=256)
print("Generated text:\n", gen_output[0]["generated_text"])

Generated text:
 User: Pavel has several sticks with lengths equal to powers of two.

He has a_0 sticks of length 2^0 = 1, a_1 sticks of length 2^1 = 2, ..., a_{n-1} sticks of length 2^{n-1}. 

Pavel wants to make the maximum possible number of triangles using these sticks. The triangles should have strictly positive area, each stick can be used in at most one triangle.

It is forbidden to break sticks, and each triangle should consist of exactly three sticks.

Find the maximum possible number of triangles.

Input

The first line contains a single integer n (1 ≤ n ≤ 300 000) — the number of different lengths of sticks.

The second line contains n integers a_0, a_1, ..., a_{n-1} (1 ≤ a_i ≤ 10^9), where a_i is the number of sticks with the length equal to 2^i.

Output

Print a single integer — the maximum possible number of non-degenerate triangles that Pavel can make.

Examples

Input


5
1 2 2 2 2


Output


3


Input


3
1 1 1


Output


0


Input


3
3 3 3


Output


3

Note

In the 

The generated text has nothing to do with Python code and includes references to pop culture, which were probably present in the pre-training data of the model

**Perplexity**

In [89]:
# get inputs from test_data
test_input = [dialogue.split('<|endoftext|>')[0] + tokenizer_gpt2.eos_token for dialogue in test_data['chat'][:3]]

# get outputs from test data
test_output = [dialogue.split('<|endoftext|>')[1] + tokenizer_gpt2.eos_token for dialogue in test_data['chat'][:3]]

In [90]:
perplexity_metric = evaluate.load("perplexity", module_type="metric")
perplexity_before_finetuning = perplexity_metric.compute(
    predictions= test_input,
    add_start_token=False,
    model_id= 'gpt2'
)
print("Perplexity before fine-tuning:", perplexity_before_finetuning)

Downloading builder script:   0%|          | 0.00/8.46k [00:00<?, ?B/s]

  0%|          | 0/1 [00:00<?, ?it/s]

Perplexity before fine-tuning: {'perplexities': [28.182695388793945, 14.845183372497559, 14.715679168701172], 'mean_perplexity': np.float64(19.24785264333089)}


#### Fine-Tuning
Now we can start the fine-tuning

In [None]:
from transformers import TrainingArguments, Trainer
import os

In [None]:
os.environ["WANDB_DISABLED"] = "true"

In [None]:
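training_args = TrainingArguments(
    "gpt2_trainer",
    # evaluation_strategy="steps",  # left disabled, so eval_steps below has no effect
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=8,  # effective batch size: 2 * 8 = 16 per device
    num_train_epochs=3,
    save_steps=500,
    eval_steps=500,
    learning_rate=6.25e-5,
    lr_scheduler_type="linear",
    bf16=True,  # bfloat16 mixed precision, requires hardware support
    report_to="none",  # disable logging integrations such as W&B
)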
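With `lr_scheduler_type="linear"` the learning rate decays linearly from its initial value to zero over the training run. The sketch below writes that schedule out in plain Python, assuming no warmup steps (none are set in the arguments) and a total of 2025 optimizer steps, which is what this configuration yields for 10800 training samples on one device. `linear_lr` is an illustrative helper, not a transformers function.

```python
def linear_lr(step, total_steps, base_lr=6.25e-5, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

total_steps = 2025  # assumed: 10800 samples / (2 * 8) per step * 3 epochs
for step in (0, 500, 1000, 1500, 2000, 2025):
    print(f"step {step:>4}: lr = {linear_lr(step, total_steps):.3e}")
```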

Using the `WANDB_DISABLED` environment variable is deprecated and will be removed in v5. Use the --report_to flag to control the integrations used for logging result (for instance --report_to none).


In [None]:
trainer = Trainer(
    model= gpt2,
    args= training_args,
    train_dataset= tokenized_data['train'],
    eval_dataset= tokenized_data['val'],
    data_collator= data_collator
)

Measure the time taken by the model to finish training

In [None]:
import time
begin = time.time()

In [None]:
trainer.train()

Step,Training Loss
500,1.1339
1000,0.3411
1500,0.2369
2000,0.2114


TrainOutput(global_step=2025, training_loss=0.4775619372615108, metrics={'train_runtime': 7315.882, 'train_samples_per_second': 4.429, 'train_steps_per_second': 0.277, 'total_flos': 8465861836800000.0, 'train_loss': 0.4775619372615108, 'epoch': 3.0})
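The figures in the training output can be cross-checked against the training arguments. The sketch below assumes a single GPU (the notebook does not state the device count) and uses only numbers that appear above.

```python
n_train = 10800           # training examples
per_device_batch = 2      # per_device_train_batch_size
grad_accum = 8            # gradient_accumulation_steps
epochs = 3
n_devices = 1             # assumption: one GPU
runtime_s = 7315.882      # train_runtime reported above

effective_batch = per_device_batch * grad_accum * n_devices
steps_per_epoch = n_train // effective_batch
total_steps = steps_per_epoch * epochs

samples_per_second = n_train * epochs / runtime_s
steps_per_second = total_steps / runtime_s

print(f"effective batch size: {effective_batch}")
print(f"optimizer steps:      {total_steps}")
print(f"samples/s: {samples_per_second:.3f}, steps/s: {steps_per_second:.3f}")
print(f"runtime:   {runtime_s / 3600:.2f} hours")
```

The resulting step count and throughput agree with `global_step`, `train_samples_per_second` and `train_steps_per_second` in the output, and the runtime is just over two hours.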

In [None]:
end = time.time()
print("Training time: ", end - begin)

Training time:  7316.4667019844055


In [None]:
# Get all of the model's parameters as a list of tuples.
params = list(gpt2.named_parameters())

print('The GPT-2 model has {:} different named parameters.\n'.format(len(params)))

print('==== Embedding Layer ====\n')

for p in params[0:2]:
    print("{:<55} {:>12}".format(p[0], str(tuple(p[1].size()))))

print('\n==== First Transformer ====\n')

for p in params[2:14]:
    print("{:<55} {:>12}".format(p[0], str(tuple(p[1].size()))))

print('\n==== Output Layer ====\n')

for p in params[-2:]:
    print("{:<55} {:>12}".format(p[0], str(tuple(p[1].size()))))

The GPT-2 model has 148 different named parameters.

==== Embedding Layer ====

transformer.wte.weight                                  (50257, 768)
transformer.wpe.weight                                   (1024, 768)

==== First Transformer ====

transformer.h.0.ln_1.weight                                   (768,)
transformer.h.0.ln_1.bias                                     (768,)
transformer.h.0.attn.c_attn.weight                       (768, 2304)
transformer.h.0.attn.c_attn.bias                             (2304,)
transformer.h.0.attn.c_proj.weight                        (768, 768)
transformer.h.0.attn.c_proj.bias                              (768,)
transformer.h.0.ln_2.weight                                   (768,)
transformer.h.0.ln_2.bias                                     (768,)
transformer.h.0.mlp.c_fc.weight                          (768, 3072)
transformer.h.0.mlp.c_fc.bias                                (3072,)
transformer.h.0.mlp.c_proj.weight                        (3072

In [None]:
gpt2.eval()

GPT2LMHeadModel(
  (transformer): GPT2Model(
    (wte): Embedding(50257, 768)
    (wpe): Embedding(1024, 768)
    (drop): Dropout(p=0.1, inplace=False)
    (h): ModuleList(
      (0-11): 12 x GPT2Block(
        (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
        (attn): GPT2Attention(
          (c_attn): Conv1D(nf=2304, nx=768)
          (c_proj): Conv1D(nf=768, nx=768)
          (attn_dropout): Dropout(p=0.1, inplace=False)
          (resid_dropout): Dropout(p=0.1, inplace=False)
        )
        (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
        (mlp): GPT2MLP(
          (c_fc): Conv1D(nf=3072, nx=768)
          (c_proj): Conv1D(nf=768, nx=3072)
          (act): NewGELUActivation()
          (dropout): Dropout(p=0.1, inplace=False)
        )
      )
    )
    (ln_f): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
  )
  (lm_head): Linear(in_features=768, out_features=50257, bias=False)
)

Save the fine-tuned model and the tokenizer

In [None]:
from datetime import datetime

gpt2_finetune_path = path_to_project + '/Transformer-trained-models/' + f"gpt2_fine_tuning_(chat+code)_{datetime.now().strftime('%Y_%m_%d_%H_%M_%S')}"
tokenizer_gpt2.save_pretrained(gpt2_finetune_path)
gpt2.save_pretrained(gpt2_finetune_path)
print(f"Checkpoint saved at: '{gpt2_finetune_path}'")

Checkpoint saved at: '/content/drive/MyDrive/UnB/NLP_Project/Transformer-trained-models/gpt2_fine_tuning_(chat+code)_2025_05_15_13_50_58'


#### Testing

Retrieve the trained model

In [None]:
!pip install evaluate
import evaluate

In [92]:
device = 'cuda'

gpt2_finetune_path = '/content/drive/MyDrive/NLP_Project/Transformer-trained-models/gpt2_fine_tuning_(chat+code)_2025_05_15_13_50_58'
tokenizer_gpt2 = GPT2Tokenizer.from_pretrained(gpt2_finetune_path)
gpt2 = GPT2LMHeadModel.from_pretrained(gpt2_finetune_path, device_map=device)

To test the model, we first extract one chat at random from the test data, give it as input to the model, and inspect the response

In [93]:
import random

random.seed(43)

idx = random.choice(range(len(test_data))) # select a random conversation
print(idx)
dialogue = test_data['chat'][idx]
print(dialogue)

78
User: Pavel has several sticks with lengths equal to powers of two.

He has a_0 sticks of length 2^0 = 1, a_1 sticks of length 2^1 = 2, ..., a_{n-1} sticks of length 2^{n-1}. 

Pavel wants to make the maximum possible number of triangles using these sticks. The triangles should have strictly positive area, each stick can be used in at most one triangle.

It is forbidden to break sticks, and each triangle should consist of exactly three sticks.

Find the maximum possible number of triangles.

Input

The first line contains a single integer n (1 ≤ n ≤ 300 000) — the number of different lengths of sticks.

The second line contains n integers a_0, a_1, ..., a_{n-1} (1 ≤ a_i ≤ 10^9), where a_i is the number of sticks with the length equal to 2^i.

Output

Print a single integer — the maximum possible number of non-degenerate triangles that Pavel can make.

Examples

Input


5
1 2 2 2 2


Output


3


Input


3
1 1 1


Output


0


Input


3
3 3 3


Output


3

Note

In the first example,

In [94]:
# now take only the 'input part' and the 'output part'
# parse string
test_input, test_output, end = dialogue.split('<|endoftext|>')

# add eos token at the end of test_input and test_output
test_input = test_input + tokenizer_gpt2.eos_token + ' Assistant: '
test_output = test_output + tokenizer_gpt2.eos_token

print('User input: \n', test_input)
print('####')
print('Correct solution output: \n', test_output)

User input: 
 User: Pavel has several sticks with lengths equal to powers of two.

He has a_0 sticks of length 2^0 = 1, a_1 sticks of length 2^1 = 2, ..., a_{n-1} sticks of length 2^{n-1}. 

Pavel wants to make the maximum possible number of triangles using these sticks. The triangles should have strictly positive area, each stick can be used in at most one triangle.

It is forbidden to break sticks, and each triangle should consist of exactly three sticks.

Find the maximum possible number of triangles.

Input

The first line contains a single integer n (1 ≤ n ≤ 300 000) — the number of different lengths of sticks.

The second line contains n integers a_0, a_1, ..., a_{n-1} (1 ≤ a_i ≤ 10^9), where a_i is the number of sticks with the length equal to 2^i.

Output

Print a single integer — the maximum possible number of non-degenerate triangles that Pavel can make.

Examples

Input


5
1 2 2 2 2


Output


3


Input


3
1 1 1


Output


0


Input


3
3 3 3


Output


3

Note

In the fir

In [None]:
!pip install -q transformers
from transformers import pipeline

pipe_finetuned = pipeline("text-generation", model= gpt2, tokenizer=tokenizer_gpt2)

In [96]:
gen_output = pipe_finetuned(test_input, max_new_tokens=256)
print("Generated text after the fine-tuning: \n", gen_output[0]["generated_text"])

Generated text after the fine-tuning: 
 User: Pavel has several sticks with lengths equal to powers of two.

He has a_0 sticks of length 2^0 = 1, a_1 sticks of length 2^1 = 2, ..., a_{n-1} sticks of length 2^{n-1}. 

Pavel wants to make the maximum possible number of triangles using these sticks. The triangles should have strictly positive area, each stick can be used in at most one triangle.

It is forbidden to break sticks, and each triangle should consist of exactly three sticks.

Find the maximum possible number of triangles.

Input

The first line contains a single integer n (1 ≤ n ≤ 300 000) — the number of different lengths of sticks.

The second line contains n integers a_0, a_1, ..., a_{n-1} (1 ≤ a_i ≤ 10^9), where a_i is the number of sticks with the length equal to 2^i.

Output

Print a single integer — the maximum possible number of non-degenerate triangles that Pavel can make.

Examples

Input


5
1 2 2 2 2


Output


3


Input


3
1 1 1


Output


0


Input


3
3 3 3


Ou

As can be seen, after the fine-tuning the model is able to generate Python code. The generated text starts by repeating the user input, but it is able to go on and include an answer from the 'Assistant'. The model is not yet able to generate a correct solution for the problem, but given the small size of gpt2-small, this was expected.

#### Evaluation Metrics

We consider two metrics: Perplexity and BLEU.

**Perplexity**

In [97]:
# get inputs from test_data
test_input = [dialogue.split('<|endoftext|>')[0] + tokenizer_gpt2.eos_token for dialogue in test_data['chat'][:3]]

# get outputs from test data
test_output = [dialogue.split('<|endoftext|>')[1] + tokenizer_gpt2.eos_token for dialogue in test_data['chat'][:3]]

In [98]:
perplexity_metric = evaluate.load("perplexity", module_type="metric")
perplexity = perplexity_metric.compute(
    predictions= test_input,
    add_start_token=False,
    model_id= gpt2_finetune_path
)
print("Perplexity post fine-tuning:", perplexity)

  0%|          | 0/1 [00:00<?, ?it/s]

Perplexity post fine-tuning: {'perplexities': [1.0604777336120605, 1.047598958015442, 1.0766164064407349], 'mean_perplexity': np.float64(1.0615643660227458)}


Before the fine-tuning, the perplexities on the same three test inputs were 28.18, 14.85 and 14.72, with a mean of 19.25.

After the fine-tuning they are 1.06, 1.05 and 1.08, with a mean of 1.06. The perplexity values have systematically decreased, indicating that the fine-tuning process was able to improve the performance of the model on this specific task.
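For reference, perplexity is the exponential of the average negative log-likelihood per token, so a value of 1 would mean the model assigns probability 1 to every token. The sketch below defines it in plain Python on made-up token probabilities, and then recomputes the two mean values from the per-text numbers reported in the outputs. The helper `perplexity` is illustrative and is not the `evaluate` implementation.

```python
import math

def perplexity(token_log_probs):
    """exp of the mean negative log-likelihood (natural log) over the tokens."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))

# Toy check: a model that gives every token probability 1/4 has perplexity 4.
uniform = [math.log(1 / 4)] * 10
print(perplexity(uniform))

# Per-text perplexities copied from the two outputs of this notebook.
before = [28.182695388793945, 14.845183372497559, 14.715679168701172]
after = [1.0604777336120605, 1.047598958015442, 1.0766164064407349]

mean_before = sum(before) / len(before)
mean_after = sum(after) / len(after)
print(f"mean before: {mean_before:.4f}, mean after: {mean_after:.4f}")

# Average negative log-likelihood per token implied by each per-text perplexity.
print([round(math.log(p), 3) for p in before])
print([round(math.log(p), 3) for p in after])
```

A perplexity this close to 1 on only three inputs is worth treating with some caution, since it says the fine-tuned model predicts these problem statements almost token for token.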

**BLEU**

BLEU focuses on precision by counting matching n-grams.

In [None]:
# BLEU Score
bleu_metric = evaluate.load("bleu")
bleu_results = bleu_metric.compute(
    # note: by default generated_text contains the prompt followed by the continuation
    predictions=[pipe_finetuned(test_input[0], max_new_tokens=256)[0]['generated_text']],
    references=[test_output[0]]
)
print("BLEU Score:", bleu_results)

Downloading builder script:   0%|          | 0.00/5.94k [00:00<?, ?B/s]

Downloading extra modules:   0%|          | 0.00/1.55k [00:00<?, ?B/s]

Downloading extra modules:   0%|          | 0.00/3.34k [00:00<?, ?B/s]

BLEU Score: {'bleu': 0.12839728594651886, 'precisions': [0.3142857142857143, 0.14537444933920704, 0.09271523178807947, 0.06415929203539823], 'brevity_penalty': 1.0, 'length_ratio': 1.6666666666666667, 'translation_length': 455, 'reference_length': 273}
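The reported score can be rebuilt from the other fields of the output: BLEU is the geometric mean of the four n-gram precisions multiplied by a brevity penalty. The sketch below uses only the numbers printed above; `brevity_penalty` is an illustrative helper, not the `evaluate` API.

```python
import math

# Values copied from the BLEU output above (1-gram to 4-gram precisions).
precisions = [0.3142857142857143, 0.14537444933920704,
              0.09271523178807947, 0.06415929203539823]
translation_length, reference_length = 455, 273

def brevity_penalty(candidate_len, reference_len):
    """1.0 when the candidate is longer than the reference, otherwise exp(1 - r/c)."""
    if candidate_len > reference_len:
        return 1.0
    return math.exp(1 - reference_len / candidate_len)

geo_mean = math.exp(sum(math.log(p) for p in precisions) / len(precisions))
bleu = brevity_penalty(translation_length, reference_length) * geo_mean

print(f"geometric mean of precisions: {geo_mean:.6f}")
print(f"BLEU: {bleu:.6f}")
```

The candidate is about 1.67 times longer than the reference, so no brevity penalty applies. That length is consistent with the candidate text containing the prompt as well as the generated continuation, which would lower the n-gram precisions.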


The BLEU scores are still low overall, but they are much higher than the BLEU scores of the T5 models, further indicating that, among the similarly sized models, the gpt2 model achieved the best performance