# How ChatGPT Works Part 2: The Reward Model

> Given a prompt and a response, the reward model predicts a scalar value representing how good the response is.

## Data Collection and Labelling

The reward model is trained on a dataset in which each prompt is paired with several candidate responses.

To construct this dataset, human labellers were asked to rank the different responses to the same prompt.

When there are too many options, decision paralysis can set in: labellers take too long to choose between them.
A common way to simplify the task is to present only a limited set of options at a time.
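
A ranking over a small set of responses can be unpacked into pairwise comparisons, which is a convenient form to train on. The sketch below assumes each labelled example is a list of responses ordered from best to worst; the function name `ranking_to_pairs` and the example strings are hypothetical and not part of the original dataset.

```python
from itertools import combinations

def ranking_to_pairs(ranked_responses):
    """Turn a best-to-worst ranking into (preferred, rejected) pairs.

    Assumes ranked_responses[0] is the best response (an assumption of this sketch).
    """
    # combinations() preserves input order, so the first element of each pair
    # always ranks higher than the second.
    return list(combinations(ranked_responses, 2))

ranking = ["response A", "response B", "response C"]  # made-up labeller ranking
pairs = ranking_to_pairs(ranking)
for preferred, rejected in pairs:
    print(f"{preferred!r} preferred over {rejected!r}")
```

A ranking of K responses yields K(K-1)/2 pairs, so keeping K small also keeps the number of comparisons per prompt manageable.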

## The Training Objective



> The absolute value of the reward predicted by the model is not important. What matters is the difference between the rewards predicted for different responses to the same prompt.
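
One common way to train on differences and not on absolute values is a pairwise ranking loss, `-log(sigmoid(r_preferred - r_rejected))`. This notebook does not state its loss, so treat this as an assumed objective; `pairwise_loss` and the reward values are illustrative only. The sketch also shows that adding the same constant to both rewards leaves the loss unchanged.

```python
import math

def pairwise_loss(reward_preferred, reward_rejected):
    """Pairwise ranking loss: -log(sigmoid(r_preferred - r_rejected))."""
    diff = reward_preferred - reward_rejected
    # -log(sigmoid(d)) equals softplus(-d); this form avoids overflow in exp().
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))

loss = pairwise_loss(2.0, 0.5)
shifted_loss = pairwise_loss(2.0 + 10.0, 0.5 + 10.0)  # same gap, both rewards shifted
wrong_order_loss = pairwise_loss(0.5, 2.0)            # model favours the rejected response

print(loss, shifted_loss, wrong_order_loss)
```

The loss depends only on the gap between the two rewards, and it grows when the model scores the rejected response higher than the preferred one.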


In [None]:
from transformers import GPT2LMHeadModel, GPT2Tokenizer

model = GPT2LMHeadModel.from_pretrained('gpt2')
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')


In [None]:
import torch
import pandas as pd
import random

class RewardDataset(torch.utils.data.Dataset):
    def __init__(self, tokenizer, max_len=512):
        """Initializes the dataset."""
        self.data = pd.read_csv('data.csv')
        self.tokenizer = tokenizer
        self.max_len = max_len

    def __len__(self):
        """Returns the length of the dataset."""
        return len(self.data)

    def __getitem__(self, idx):
        """Returns the (context, response) pair at the given index."""
        # Use .iloc for positional row access; self.data[idx] would look up a column.
        row = self.data.iloc[idx]
        return row['context'], row['response']
        # text = self.tokenizer.encode(text, add_special_tokens=True, max_length=self.max_len, truncation=True)
        # return torch.tensor(text)


dataset = RewardDataset(tokenizer)
print("Dataset length:", len(dataset))
example_idx = random.randint(0, len(dataset) - 1)  # randint is inclusive on both ends
print(dataset[example_idx])


In [None]:


class ChatGPT2RewardModel:
    def __init__(self, model, tokenizer):
        self.model = model
        self.tokenizer = tokenizer

    def __call__(self, context, response):
        """
        Returns a scalar value representing the reward for this response, given the context.
        Args:
            context (str): The context, a.k.a. the prompt.
            response (str): The response to the prompt.
        Returns:
            float: The reward for generating this response given the context.
        """
        context = self.tokenizer.encode(context, return_tensors='pt')
        response = self.tokenizer.encode(response, return_tensors='pt')
        input_ids = torch.cat([context, response], dim=-1)

        return 
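
The `__call__` method above stops before computing a reward. One possible way to finish it, shown here as a sketch and not as the author's method, is to put a linear layer on top of the hidden state of the last token and read off a single number. To keep the example runnable without downloading weights, a small `nn.Embedding` stands in for the language model; `ScalarRewardHead`, `backbone`, and `hidden_size` are hypothetical names, and the token ids are made up.

```python
import torch
from torch import nn

class ScalarRewardHead(nn.Module):
    """Maps token ids to one scalar reward per sequence (illustrative sketch)."""

    def __init__(self, backbone, hidden_size):
        super().__init__()
        # backbone: any module mapping (batch, seq) ids to (batch, seq, hidden) states.
        self.backbone = backbone
        self.value_head = nn.Linear(hidden_size, 1)

    def forward(self, input_ids):
        hidden = self.backbone(input_ids)               # (batch, seq, hidden)
        last_token = hidden[:, -1, :]                   # (batch, hidden)
        return self.value_head(last_token).squeeze(-1)  # (batch,)

torch.manual_seed(0)
backbone = nn.Embedding(num_embeddings=100, embedding_dim=16)  # stand-in, not GPT-2
reward_model = ScalarRewardHead(backbone, hidden_size=16)

context = torch.tensor([[5, 17, 42]])  # made-up token ids for a prompt
response = torch.tensor([[8, 3]])      # made-up token ids for a response
input_ids = torch.cat([context, response], dim=-1)  # same concatenation as above

reward = reward_model(input_ids)
print(reward.shape)
```

With a real GPT-2, the backbone would need to return the final hidden states and not the vocabulary logits that `GPT2LMHeadModel` produces by default.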
