## Optimizing the finetuned custom GPT2 using Reinforcement Learning from Human Feedback (RLHF) 

Instead of human feedback as a reward mechanism, we use a text generation evaluation metric like `BERTScore` to automate human evaluation. 

##### Prerequisite

In [None]:
%%capture

!pip install jupyter==1.0.0
!pip install ipywidgets==8.0.4
!pip install transformers==4.26.0
!pip install datasets==2.9.0
!pip install wandb==0.13.9
!pip install evaluate==0.4.0
!pip install bert-score==0.3.12
!pip install -e git+https://arunprsh:43211b1b75fad82266961eff3b85a061b53daae5@github.com/lvwerra/trl.git@v0.2.1#egg=trl

#### Imports 

In [3]:
from trl import AutoModelForCausalLMWithValueHead
from transformers import GPT2Tokenizer
from transformers import set_seed
from datasets import load_dataset
from transformers import pipeline
from datasets import Dataset
from random import choices
from trl import PPOTrainer
from trl import PPOConfig
from evaluate import load
from tqdm import tqdm
import transformers 
import pandas as pd
import numpy as np
import bert_score
import ipywidgets
import datasets
import evaluate
import logging
import jupyter
import random
import torch
import wandb
import time
import trl
import os

##### Setup logging

In [4]:
logger = logging.getLogger('sagemaker')
logger.setLevel(logging.DEBUG)
logger.addHandler(logging.StreamHandler())

##### Log versions of dependencies 

In [5]:
logger.info(f'[Using transformers version: {transformers.__version__}]')
logger.info(f'[Using bert_score version: {bert_score.__version__}]')
logger.info(f'[Using evaluate version: {evaluate.__version__}]')
logger.info(f'[Using datasets version: {datasets.__version__}]')
logger.info(f'[Using wandb version: {wandb.__version__}]')
logger.info(f'[Using trl version: {trl.__version__}]')

[Using transformers version: 4.18.0]
[Using bert_score version: 0.3.12]
[Using evaluate version: 0.4.0]
[Using datasets version: 2.9.0]
[Using wandb version: 0.13.9]
[Using trl version: 0.2.1]


#### Setup essentials 

In [6]:
pd.options.display.max_colwidth = None
np.random.seed(123)
tqdm.pandas()
set_seed(123)

In [7]:
!wandb login <USE YOUR WEIGHTS & BIASES API KEY HERE>

[34m[1mwandb[0m: Appending key for api.wandb.ai to your netrc file: /root/.netrc


In [8]:
path = os.path.abspath('01-rlhf.ipynb')
os.environ['WANDB_NOTEBOOK_NAME'] = path

In [9]:
bertscore = load('bertscore')

##### Set constants 

In [10]:
MODEL_PATH = '.././02-finetune/model/custom-finetuned'
BOS_TOKEN = '<|startoftext|>'
EOS_TOKEN = '<|endoftext|>'
PAD_TOKEN = '<|pad|>'
MAX_LEN = 128

FORWARD_BATCH_SIZE = 16
BATCH_SIZE = FORWARD_BATCH_SIZE * 2

##### Setup configs

In [11]:
config = PPOConfig(model_name=MODEL_PATH, 
                   batch_size=BATCH_SIZE,
                   learning_rate=1.41e-6,
                   forward_batch_size=FORWARD_BATCH_SIZE,
                   remove_unused_columns=False,
                   log_with='wandb')

#### Load models 

In [12]:
active_model = AutoModelForCausalLMWithValueHead.from_pretrained(MODEL_PATH)

In [13]:
ref_model = AutoModelForCausalLMWithValueHead.from_pretrained(MODEL_PATH)

#### Load tokenizer 

In [14]:
tokenizer = GPT2Tokenizer.from_pretrained('../01-tokenize/vocab-custom', 
                                          bos_token=BOS_TOKEN, 
                                          eos_token=EOS_TOKEN, 
                                          pad_token=PAD_TOKEN, 
                                          lower=True,
                                          return_tensors='pt')
# tokenizer.padding_side = 'left'
tokenizer.model_max_length = MAX_LEN
logger.info(f'Tokenizer: {tokenizer}')

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Tokenizer: PreTrainedTokenizer(name_or_path='../01-tokenize/vocab-custom', vocab_size=50257, model_max_len=128, is_fast=False, padding_side='right', truncation_side='right', special_tokens={'bos_token': AddedToken("<|startoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=True), 'eos_token': AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=True), 'unk_token': AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=True), 'pad_token': '<|pad|>'})


#### Load dataset

In [15]:
dataset = load_dataset('csv', 
                       data_files='.././01-tokenize/data/faq_test.csv',  
                       delimiter=',', 
                       split='train[:100%]',
                       download_mode='force_redownload')
dataset

Using custom data configuration default-128f60e33d0bd468


Downloading and preparing dataset csv/default (download: 237.42 KiB, generated: 240.12 KiB, post-processed: Unknown size, total: 477.54 KiB) to /root/.cache/huggingface/datasets/csv/default-128f60e33d0bd468/0.0.0/6b34fb8fcf56f7c8ba51dc895bfa2bfbe43546f190a60fcf74bb5e8afdcc2317...


Downloading data files:   0%|          | 0/1 [00:00<?, ?it/s]

Extracting data files:   0%|          | 0/1 [00:00<?, ?it/s]

Generating train split:   0%|          | 0/681 [00:00<?, ? examples/s]

Dataset csv downloaded and prepared to /root/.cache/huggingface/datasets/csv/default-128f60e33d0bd468/0.0.0/6b34fb8fcf56f7c8ba51dc895bfa2bfbe43546f190a60fcf74bb5e8afdcc2317. Subsequent calls will reuse this data.


Dataset({
    features: ['question', 'answer'],
    num_rows: 681
})

In [16]:
def tokenize(samples: list):
    questions = samples['question']
    ground_truth = samples['answer']
    
    input_ids = []
    query = []
    
    for question in questions:
        prompted_input = f'question: {question}\nanswer:'
        query.append(prompted_input)
        tokenized_input = tokenizer(prompted_input, 
                                    truncation=True)
        input_ids.append(torch.tensor(tokenized_input['input_ids'], dtype=torch.long))
        
    return {'input_ids': input_ids, 'query': query, 'ground_truth': ground_truth, 'questions': questions}

In [17]:
dataset = dataset.map(tokenize, 
                      batched=True, 
                      #num_proc=num_proc, 
                      load_from_cache_file=False, 
                      remove_columns=['question', 'answer'])
dataset.set_format('pt', 
                   columns=['input_ids', 'query', 'ground_truth'],
                   output_all_columns=True)
dataset

  0%|          | 0/1 [00:00<?, ?ba/s]

Dataset({
    features: ['input_ids', 'query', 'ground_truth', 'questions'],
    num_rows: 681
})

##### Create data collator

In [18]:
def collator(dataset):
    result = {}
    for key in dataset[0]:
        values = []
        for d in dataset:
            values.append(d[key])
        result[key] = values
    return result

#### Create Trainer for PPO (Proximal Policy Optimization)

In [19]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

In [20]:
ppo_trainer = PPOTrainer(config, active_model, ref_model, tokenizer, dataset=dataset, data_collator=collator)

[34m[1mwandb[0m: Currently logged in as: [33mshankar-arunp[0m. Use [1m`wandb login --relogin`[0m to force relogin


#### Define CTRL tokens 

In [21]:
ctrl_str = ['[bad]', '[good]']
ctrl_tokens = dict((s, tokenizer.encode(s, return_tensors='pt').squeeze().to(device)) for s in ctrl_str)
ctrl_tokens

{'[bad]': tensor([   59, 32171,    61], device='cuda:0'),
 '[good]': tensor([   59, 13071,    61], device='cuda:0')}

#### Load BERT Pipeline from evaluation phase to generate reward logits 

In [22]:
bert_pipe = pipeline('sentiment-analysis', 
                     model='.././03-evaluate/model', 
                     return_all_scores=True)

#### Define Reward function

In [23]:
def logits_to_reward(logit, task):
    for i in range(len(logits)):
        if task[i]=='[bad]':
            logit[i] = -logit[i]
        elif task[i]=='[good]':
            pass
        else:
            raise ValueError('Task has to be in [0, 1]!')
    return logit

#### Training Loop

In [24]:
for epoch in range(1):
    for i, batch in tqdm(enumerate(ppo_trainer.dataloader)):
        if len(batch['input_ids']) == BATCH_SIZE:
            logger.info(f'Epoch = {epoch+1} | Batch = {i+1} | Size = {BATCH_SIZE}')
            logs, game_data,  = dict(), dict()
            
            task_list = choices(ctrl_str, k=BATCH_SIZE)
            game_data['query'] = [t+q for t, q in zip(task_list, batch['query'])]
            query_tensors = [torch.cat((ctrl_tokens[t], input_ids)) for t, input_ids in zip(task_list, batch['input_ids'])]
            
            bert_scores = []
            ground_truth_responses = batch['ground_truth']
            questions = batch['questions']
            response_tensors = []

            for query, ground_truth_response, question in zip(query_tensors, ground_truth_responses, questions):
                gt_len = len(question.split()) + len(ground_truth_response.split()) + 1
                response = ppo_trainer.generate(query, 
                                                do_sample=True, 
                                                top_k=1, 
                                                min_new_tokens=gt_len,
                                                max_new_tokens=gt_len, 
                                                repetition_penalty=10.0,
                                                length_penalty=-0.1,
                                                pad_token_id=tokenizer.eos_token_id,
                                                eos_token_id=-1,
                                                top_p=1.0)
                response_tensors.append(response.squeeze())
                
            game_data['response'] = [tokenizer.decode(response, skip_special_tokens=True) for response in response_tensors]

            pipe_outputs = bert_pipe(game_data['response'])
       
            logits = [torch.tensor(output[1]['score']) for output in pipe_outputs]
            rewards = logits_to_reward(logits, task_list)
            
            stats = ppo_trainer.step(query_tensors, response_tensors, rewards)
            
            for cs in ctrl_str:
                key = 'env/reward_' + cs.strip('[]')
                stats[key] = np.mean([r.cpu().numpy() for r, t in zip(rewards, task_list) if t==cs])
                
            ppo_trainer.log_stats(stats, game_data, rewards)

0it [00:00, ?it/s]Epoch = 1 | Batch = 1 | Size = 32
1it [00:31, 31.50s/it]Epoch = 1 | Batch = 2 | Size = 32
2it [01:02, 31.38s/it]Epoch = 1 | Batch = 3 | Size = 32
3it [01:32, 30.67s/it]Epoch = 1 | Batch = 4 | Size = 32
4it [02:04, 31.07s/it]Epoch = 1 | Batch = 5 | Size = 32
5it [02:36, 31.36s/it]Epoch = 1 | Batch = 6 | Size = 32
6it [03:08, 31.68s/it]Epoch = 1 | Batch = 7 | Size = 32
7it [03:41, 32.07s/it]Epoch = 1 | Batch = 8 | Size = 32
8it [04:13, 32.15s/it]Epoch = 1 | Batch = 9 | Size = 32
9it [04:44, 31.71s/it]Epoch = 1 | Batch = 10 | Size = 32
10it [05:14, 31.35s/it]Epoch = 1 | Batch = 11 | Size = 32
11it [05:47, 31.83s/it]Epoch = 1 | Batch = 12 | Size = 32
12it [06:18, 31.40s/it]Epoch = 1 | Batch = 13 | Size = 32
13it [06:48, 31.10s/it]Epoch = 1 | Batch = 14 | Size = 32
14it [07:21, 31.66s/it]Epoch = 1 | Batch = 15 | Size = 32
15it [07:53, 31.79s/it]Epoch = 1 | Batch = 16 | Size = 32
16it [08:26, 32.05s/it]Epoch = 1 | Batch = 17 | Size = 32
17it [08:58, 32.09s/it]Epoch = 1 | Ba

##### Save optimized PPO model to local dir

In [25]:
active_model.save_pretrained('./model/gpt2-ppo-bertscore')
tokenizer.save_pretrained('./model/gpt2-ppo-bertscore')

('./model/gpt2-ppo-bertscore/tokenizer_config.json',
 './model/gpt2-ppo-bertscore/special_tokens_map.json',
 './model/gpt2-ppo-bertscore/vocab.json',
 './model/gpt2-ppo-bertscore/merges.txt',
 './model/gpt2-ppo-bertscore/added_tokens.json')

### Compare the PPO tuned models with the reference GPT2 model 

In [26]:
active_model = AutoModelForCausalLMWithValueHead.from_pretrained('./model/gpt2-ppo-bertscore')

Some weights of the model checkpoint at ./model/gpt2-ppo-bertscore were not used when initializing GPT2LMHeadModel: ['v_head.summary.bias', 'v_head.summary.weight']
- This IS expected if you are initializing GPT2LMHeadModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing GPT2LMHeadModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


In [27]:
test_df = pd.read_csv('.././01-tokenize/data/faq_test.csv')
test_df = test_df.sample(50)
test_df.count()

question    50
answer      50
dtype: int64

In [28]:
def predict(question: str, ground_truth: str, tokenizer: GPT2Tokenizer, model: AutoModelForCausalLMWithValueHead) -> str:
    # create a prompt in compliance with the one used during training without the answer part
    prompt = f'question: {question}\nanswer:'
    # generate tokens
    input_ids = tokenizer(prompt, return_tensors='pt').input_ids
    input_ids = input_ids.to('cuda:0')
    # predict response (answer)
    gt_len = len(question.split()) + len(ground_truth_response.split()) + 1
    model.to(device)
    response = model.generate(input_ids, 
                              do_sample=True, 
                              top_k=1, 
                              min_new_tokens=gt_len,
                              max_new_tokens=gt_len, 
                              repetition_penalty=10.0,
                              length_penalty=-0.1,
                              pad_token_id=tokenizer.eos_token_id,
                              eos_token_id=-1,
                              top_p=1.0)
    # decode the predicted tokens into texts
    response_text = tokenizer.decode(response[0], skip_special_tokens=True)
    answer = response_text.split('answer: ')[-1]
    return answer

In [29]:
ref_gpt2_answers = []
ppo_gpt2_answers_good = []
ppo_gpt2_answers_bad = []

for _, row in tqdm(test_df.iterrows()):
    question, ground_truth = row
    answer = predict(question, ground_truth, tokenizer, ref_model)
    ref_gpt2_answers.append(answer)
    
    answer = predict('[good]'+question, ground_truth, tokenizer, active_model)
    ppo_gpt2_answers_good.append(answer)
    
    answer = predict('[bad]'+question, ground_truth, tokenizer, active_model)
    ppo_gpt2_answers_bad.append(answer)

50it [01:00,  1.21s/it]


In [30]:
bert_score_ref_gpt2 = bertscore.compute(predictions=ref_gpt2_answers, references=test_df['answer'].to_list(), lang='en')['f1']
bert_score_ppo_gpt2_good = bertscore.compute(predictions=ppo_gpt2_answers_good, references=test_df['answer'].to_list(), lang='en')['f1']
bert_score_ppo_gpt2_bad = bertscore.compute(predictions=ppo_gpt2_answers_bad, references=test_df['answer'].to_list(), lang='en')['f1']

test_df['ref_gpt2_answers'] = ref_gpt2_answers
test_df['ppo_gpt2_answers_good'] = ppo_gpt2_answers_good
test_df['ppo_gpt2_answers_bad'] = ppo_gpt2_answers_bad

test_df['bert_score_ref_gpt2'] = bert_score_ref_gpt2
test_df['bert_score_ppo_gpt2_good'] = bert_score_ppo_gpt2_good
test_df['bert_score_ppo_gpt2_bad'] = bert_score_ppo_gpt2_bad

Downloading:   0%|          | 0.00/482 [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/878k [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/446k [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/1.33G [00:00<?, ?B/s]

In [31]:
test_df.head(50)

Unnamed: 0,question,answer,ref_gpt2_answers,ppo_gpt2_answers_good,ppo_gpt2_answers_bad,bert_score_ref_gpt2,bert_score_ppo_gpt2_good,bert_score_ppo_gpt2_bad
631,what if i was due to start training but the school canceled the training program?,"there are measures in place to assure trade affected workers meet their training enrollment deadline.if all criteria are met, a “training unavailable” waiver can be applied to secure your benefits.your career advisor from the pa careerlink® office will contact you via email with instructions on your next steps.","you should contact your local health department or state public education agency. please consult with their medical professional for advice about rescheduling, and follow up on any changes that may be needed in order not later than 30 days after starting work (see guidance","yes. you should contact your local health department and ask for a waiver from their usual enrollment requirements, including an extension of eligibility that would have expired on or after march 30th 2020 (or at least 60 days before schools are scheduled)","yes. you should contact your local health department and ask for a delay in starting or finishing work, which is why it's so important that we follow all of our protocols as well!abay'mon however this isn't always",0.833093,0.840821,0.83462
375,get dressed like you’re going to class in person.,"if you stay in your pajamas all day, it’s going to be super tough to pay attention in class. try waking up, eating breakfast, taking a shower, and putting on real clothes to get yourself in the school mindset. you can even put on a pair of shoes to really trick your brain into thinking you’re going somewhere.","if your school is closed, try using a dress code that says “no clothes or shoes!” then go on the shopping list and see what they want for their kids instead of just buying pajamas from","if your school is open, they may have a few options for students who are not comfortable wearing masks or coverings and can choose between using cloth face covers (or putting them on the back of their own).","if your school is open, they may have a few options for students who are not comfortable wearing masks or keeping their distance from others when out and about (or even just staying home). this will help keep",0.849176,0.835661,0.836425
131,can i still apply for my educator certification or complete my educator preparation program (epps),"for questions about educator certification or the new jersey department of education's education preparation programs, please see the department's resource page on educator preparation programs and certification. source: https://www.nj.gov/education/covid19/teacherresources/eppcert.shtml","yes, you may be eligible to receive your education. if so please contact the funding icuc career development center at 1-800‑4255 and ask questions about eligibility requirements related specifically covid19 relief programs that are available on our website","yes, you can continue to provide your education with the same educational materials as usual. however please note that if a school is closed due of covid-19 and/or has an altered schedule in which students are unableto attend classes on","yes, you can continue to provide your education with the same educational materials as other students. however if a student is unable due their school’s closure and/or has been placed on leave without pay because of covid-19 concerns they",0.831177,0.813092,0.811913
573,i have a reason i cannot wear a face covering. am i required to document or prove that?,"no, the order does not require that. please do not contact your healthcare provider to obtain documentation.","yes, you can use cloth masks if they are not medical-grade and don’t provide any protection against covid 19 (or respiratory droplets) from your mouth/mouth contact with an infected person wearing the mask while in public without physical distancing measures at","no, you are not obligated under the pennsylvania order and can be fined for wearing one of these coverings in public places without having an excuse from your governor!abayffrance is exempt if someone refuses voluntarily complying with this executive orders because they do so","no, you are not obligated under the pennsylvania order and can be fined for wearing one of these coverings in public places without having an excuse from your governor!abay rescinded this requirement on july 2nd when gov murphy announced new jersey’s mask mandate",0.840175,0.859676,0.858914
676,should a mask be worn during school?,"evidence continues to mount on the importance of universal face coverings in interrupting the spread of sars-cov-2. school staff and older students (middle or high school) may be able to wear cloth face coverings safely and consistently and should be encouraged to do so. children under 2 years and anyone who has trouble breathing or is unconscious, incapacitated, or otherwise unable to remove a face covering without assistance should not wear cloth face coverings.","yes, masks are required in schools. if you have questions about wearing face coverings and the rules around masking (including when children under 2 years old can not wear them), please contact your local health","yes. masks are not required in schools, but they should still have an ear loops or nose wire to insert into the fabric of your face covering for at least 20 seconds (or 1 metre)","yes. the cdc does not recommend wearing masks in schools, but it encourages everyone to wear them when they are around people who do or have covid-19 and for adults with underlying health conditions (",0.849636,0.831789,0.840038
491,check in on your loved ones in the morning.,"consider chatting with people while you eat meals or at other downtime during the day. if they are also working remotely this should be easy to make happen. if you or your family gets the virus, that becomes the first priority.","if you’re feeling anxious, call ahead to see how they are doing and talk with them about what is going right now! try setting up a video chat or phone conversation so that everyone can stay calm","if you’re feeling anxious, call ahead to see how they are doing and talk with them about what is going well for their day so that everyone can get a sense of where things go from here!","if you’re feeling confused or scared, call ahead to see how they are doing and what their actions can help them cope with it!abayix® is a registered nurse licensed by pennsylvania state health",0.861488,0.850587,0.833988
78,additional resources:,the u.s. centers for disease control and prevention (cdc) is the best source for information about covid-19 in people and risks to animals. cdc information about covid-19 in animals may be found at www.cdc.gov/coronavirus/2019-ncov/daily-life-coping/animals.html.,the department of health is working closely with other public and private entities to provide support for covid-19 response. please visit https//www1dhhsphoriagov/,"please note that the cdc does not provide any additional information on covid-19.abay, however!” if you’re unsure about what to do with your","if you’re not sure what your child is doing, call ahead to ask them.abay thank the school district for providing information and support during this time of crisis",0.819391,0.811321,0.805682
452,how can i find out if i can now have the elective surgery the mtf scheduled me for before cancelling it due to covid-19?,"work with your military medical hospital or clinic, or your dental treatment facility. staff should be able to talk to you about the status of your surgery or procedure. they can guide you on what need to do to reschedule it.","you should contact your local health department or hospital. please visit our website at wwwhealthscccsnursepracticecovid 19 webpage and follow cdc guidance on rescheduling surgeries, including recommendations from providers regarding patient care during this pandemic (see faq). see frequently asked questions about delaying surgical procedures","yes. you should contact your local health department and ask them how long they will allow this procedure, which is a few weeks after when we are supposed back in business again!abay thank goodness that surgeons continue their normal procedures as well so there’s no need at all not","yes. you should contact your local health department and ask them about rescheduling, including any changes in patient care plans that may be needed during this pandemic or when surgeries are rescheduled because of a lack thereof (including delays related specifically with coronavirus). please note they will not provide specific information on",0.851786,0.844723,0.863034
57,can i catch covid-19 from the faeces of someone with the disease?,"while initial investigations suggest the virus may be present in faeces in some cases, to date, there have not been reports of faecal-oral transmission of covid-19. additionally, there is no evidence to date on the survival of the covid-19 virus in water or sewage. who is assessing ongoing research on the ways covid-19 is spread and will continue to share new findings on this topic.","no. there is currently not a vaccine available for this virus, and it’s unlikely that any person who has had contact to an infected surface will be able or willing (or even want) get sick again after they","no. this is not a common virus, but it can be spread by people who are infected and don't have any symptoms (including fever or cough). however there's always some chance that you could get sick if they","no. this is not a common virus, but it can be spread by people who are infected and don't have any symptoms (including fever or cough). however there's always some chance that you could get sick if they",0.836049,0.836124,0.836124
670,what is the food and nutrition service (fns) doing to ensure children have food to eat while schools are closed?,"states switch to their summer food service program (sfsp) or seamless summer option (sso) to serve meals to children when schools are closed. through these summer meal programs, usda allows sites to serve up to two free meals a day to children 18 and under.","fsis has been working closely with school districts, local health departments in new york state as well. they will continue providing meals for students who need them most during this time of closure or may be able provide additional services if needed by other organizations that can help address","yes. usda has provided guidance on how school meals can be delivered, including a list of foods that should not include any additional information about covid-19 in this article for students who may need them most during these times due their health or financial concerns related specifically with","yes, usda has taken steps in recent weeks including distributing meals of any kind. however it does not provide guidance on how much school meal programs can be provided during this time as well – which may change depending upon local conditions or other circumstances that might dictate when students",0.842897,0.836163,0.847232


In [32]:
np.mean(test_df['bert_score_ref_gpt2'])

0.838792290687561

In [33]:
np.mean(test_df['bert_score_ppo_gpt2_good'])

0.8369160592556

In [34]:
np.mean(test_df['bert_score_ppo_gpt2_bad'])

0.8360180878639221