<a href="https://colab.research.google.com/github/parthasarathydNU/gen-ai-coursework/blob/main/advanced-llms/direct-preference-optimization/DPODataSetGeneration.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Direct Preference Optimization

Direct Preference Optimization (DPO) is a technique proposed in the paper https://arxiv.org/abs/2305.18290. It fine-tunes a model directly on pairs of preferred and rejected responses, so the model learns our preferences without training a separate reward model or running a reinforcement learning loop.
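To make the objective concrete, here is a minimal sketch of the DPO loss for a single preference pair, written in plain Python. The function name, the example log-probabilities, and the default `beta=0.1` are illustrative assumptions, not values taken from this notebook.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) pair of sequence log-probabilities."""
    # Implicit rewards: how much the policy raised each response's
    # log-probability relative to the frozen reference model
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(margin)), written as log(1 + exp(-margin))
    return math.log1p(math.exp(-margin))

# Policy identical to the reference: the margin is 0 and the loss is log(2)
loss_at_start = dpo_loss(-12.0, -15.0, -12.0, -15.0)

# Policy prefers the chosen response more than the reference does: lower loss
loss_improved = dpo_loss(-10.0, -18.0, -12.0, -15.0)
```

The loss only falls when the policy widens the gap between the chosen and the rejected response relative to the reference model, which is why the dataset built in this notebook stores one chosen and one rejected response per prompt.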

**Check out this YouTube Video to learn more about DPO.**

In [None]:
from IPython.display import YouTubeVideo

# ID of the YouTube video to embed
video_id = 'HCFTXTn1PHA'

# Create a YouTubeVideo object
video = YouTubeVideo(video_id)

# Display the video
display(video)


**Logging in to Hugging Face to get access to datasets and models**

In [2]:
!git config --global credential.helper store

# Install the huggingface_hub library
!pip install huggingface_hub -q

In [3]:
# Import the notebook_login method
from huggingface_hub import notebook_login

# Log in interactively
notebook_login()


### In this notebook we will generate a preference dataset to later fine-tune a base model

#### Step 1: Extract the instructions from the LIMA dataset

In [3]:
!pip install datasets -q --no-cache

In [4]:
from datasets import load_dataset

# Login using e.g. `huggingface-cli login` to access this dataset
ds = load_dataset("GAIR/lima")

**Extract the `conversations` from the `train` subset and sample 50 instructions**

In [19]:
import random

# Set the random seed for reproducibility
random.seed(42)  # You can choose any number as the seed

# Extract conversations from the train subset
conversations = ds['train']['conversations']

# Sample 50 random instructions
sampled_conversations = random.sample(conversations, 50)

# Keep only the first turn of each conversation (the user instruction)
sample_conversation_list = []

# Display the sampled instructions
for i, conversation in enumerate(sampled_conversations, 1):
    sample_conversation_list.append(conversation[0])
    print(f"{i}. {conversation[0]}")

1. Can I spend the night alone in a tent in a forest outside Stockholm in -20°C without risking my life?

The backstory
From the end of January, I'm starting my studies in a suburb of Stockholm. I've decided to, if it turns out plausible, not rent an apartment, but live in a tent. (This is not out of frugality, but out of a will to try something new.)
I do have friends who I could visit once a week or so to prepare food and wash my clothes, so I think I can solve the practical problems, or at least those that I've come to think of. I'd camp in one of the forests, maybe 1 km from "civilisation". I'd have access to showers etc at university every day.
However: I don't want to freeze to death in my sleep! That's very important to me. I've read that the nights can get as cold as -20°C (-4°F). With the proper preparations, would this be a plausible way of living, at least for a month or so?
I do have camping experience, and have been hiking for three weeks, but only in summer.
2. This claim
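The sampling step can be sketched on toy data to show why the seed matters. The `toy_conversations` list and the `sample_instructions` helper are hypothetical stand-ins for `ds['train']['conversations']` and the cell above; the sketch uses a local `random.Random` generator instead of the global seed.

```python
import random

# Hypothetical stand-in for ds['train']['conversations']:
# each item is a list of turns, and the first turn is the instruction
toy_conversations = [[f"instruction {n}", f"answer {n}"] for n in range(20)]

def sample_instructions(conversations, k, seed):
    # A local generator keeps the global random state untouched
    rng = random.Random(seed)
    sampled = rng.sample(conversations, k)
    # Keep only the first turn of each sampled conversation
    return [turns[0] for turns in sampled]

first_run = sample_instructions(toy_conversations, 5, seed=42)
second_run = sample_instructions(toy_conversations, 5, seed=42)
```

With the same seed, both runs return the same five instructions, so the sampled subset can be regenerated later without relying on the saved JSON file.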

In [21]:
import json

# Define the path for the output file
output_path = 'sampled_conversations.json'

# Write the sampled conversations to a JSON file
with open(output_path, mode='w', encoding='utf-8') as file:
    json.dump(sample_conversation_list, file)

#### Step 2

**Loading Mistral-7B-Instruct-v0.2**

Quantizing a model can significantly reduce its memory footprint and computational requirements, making it easier to load onto a GPU with limited memory. The transformers library supports model quantization using the bitsandbytes library for 8-bit and 4-bit quantization.

In [None]:
!pip install -q git+https://github.com/huggingface/transformers peft accelerate bitsandbytes safetensors sentencepiece

In [22]:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = 'mistralai/Mistral-7B-Instruct-v0.2'

def load_quantized_model(model_name: str):
    """
    Load the model with 4-bit NF4 quantization via bitsandbytes.

    :param model_name: Name or path of the model to be loaded.
    :return: Loaded quantized model.
    """
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_use_double_quant=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16
    )

    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        torch_dtype=torch.bfloat16,
        quantization_config=bnb_config
    )

    return model

def initialize_tokenizer(model_name: str):
    """
    Initialize the tokenizer with the specified model_name.

    :param model_name: Name or path of the model for tokenizer initialization.
    :return: Initialized tokenizer.
    """
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    tokenizer.bos_token_id = 1  # Set beginning of sentence token id
    return tokenizer


model = load_quantized_model(model_name)

tokenizer = initialize_tokenizer(model_name)

# Define stop token ids
stop_token_ids = [0]

`low_cpu_mem_usage` was None, now set to True since model is quantized.

## What is quantization

As introduced in Step 2, quantization stores the model weights at lower numerical precision, which reduces the memory footprint enough to fit a 7B model on a GPU with limited memory. The cell above uses 4-bit NF4 quantization with double quantization, configured through `BitsAndBytesConfig`. The snippets below show the shorter `load_in_8bit` / `load_in_4bit` form of the same idea.
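As a rough illustration of the idea, the sketch below quantizes a handful of floats to 8-bit integers with a single scale factor (absmax quantization) and then restores them. This is a simplified teaching example and an assumption on our part; it is not the scheme that `bitsandbytes` implements for 8-bit or NF4 loading.

```python
def quantize_absmax_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate floats from the stored integers."""
    return [q * scale for q in quantized]

weights = [0.12, -0.5, 0.33, 0.0, 0.49]
q, scale = quantize_absmax_int8(weights)
restored = dequantize(q, scale)

# Rounding error per weight is at most half a quantization step
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each weight now needs one byte instead of two or four, at the cost of a small rounding error bounded by half of `scale`.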

### Explanation

1. **Quantized Model Loading**:
   ```python
   model = AutoModelForCausalLM.from_pretrained(
       "mistralai/Mistral-7B-Instruct-v0.2",
       load_in_8bit=True,
       device_map="auto"
   )
   ```
   This call loads the model with 8-bit quantization using the `bitsandbytes` library. The `device_map="auto"` parameter ensures that the model is split across available devices (GPU/CPU) as needed.

2. **Tokenization and Generation**:
   The `generate_response` function (defined in the next section) tokenizes the input text, generates a response using the model, and then decodes the output tokens back into human-readable text.

3. **Device Handling**:
   The input tensors are moved to the device the model was loaded on (`model.device`), which is CUDA if available and CPU otherwise.

### Optional: Using 4-bit Quantization

If you want to use 4-bit quantization instead of 8-bit, replace `load_in_8bit=True` with `load_in_4bit=True` (assuming the installed `transformers` and `bitsandbytes` versions support 4-bit quantization for this model).

```python
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.2",
    load_in_4bit=True,
    device_map="auto"
)
```

By following these steps, you should be able to load and use a quantized version of your model, making it fit more comfortably within the memory limits of your GPU.
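A back-of-the-envelope estimate shows how much memory the weights need at each precision. The parameter count of 7e9 is a round-number assumption for a "7B" model, and the helper name is hypothetical; the estimate covers weights only and ignores activations, the KV cache, and quantization metadata.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Approximate memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9

NUM_PARAMS = 7e9  # assumption: round figure for a "7B" model

# Memory needed at 16-bit, 8-bit, and 4-bit precision
estimates = {bits: weight_memory_gb(NUM_PARAMS, bits) for bits in (16, 8, 4)}

for bits, gigabytes in estimates.items():
    print(f"{bits}-bit weights: about {gigabytes:.1f} GB")
```

The 16-bit figure of about 14 GB is in line with the three safetensors shards downloaded in Step 2 (4.94 GB, 5.00 GB, and 4.54 GB), while the 4-bit figure of about 3.5 GB fits on a much smaller GPU.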

**Testing response generation**

In [31]:
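def generate_response(prompt):
  # Wrap the prompt in Mistral's instruction markers
  text = f"[INST] {prompt} [/INST]"
  encoded = tokenizer(text, return_tensors="pt", add_special_tokens=False)
  model_input = encoded.to(model.device)
  # Sample up to 1000 new tokens; sampling makes repeated calls return different responses
  generated_ids = model.generate(**model_input, max_new_tokens=1000, do_sample=True)
  decoded = tokenizer.batch_decode(generated_ids)
  # Remove the echoed prompt, leaving only the generated text
  return decoded[0].replace(text, '').strip()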

In [27]:
# Workaround for "RuntimeError: cutlassF: no kernel found to launch", see
# https://stackoverflow.com/questions/77803696/runtimeerror-cutlassf-no-kernel-found-to-launch-when-running-huggingface-tran
torch.backends.cuda.enable_mem_efficient_sdp(False)
torch.backends.cuda.enable_flash_sdp(False)

In [32]:
prompt = "How AI will replace Engineers"
response = generate_response(prompt)
print(response)

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


AI and Question Answering models like me are currently able to assist and augment the work of engineers, but we are not yet capable of completely replacing them. Engineering work involves a high degree of complexity, problem-solving, creativity, and manual dexterity, which require human expertise and intuition.

Engineering fields like mechanical, electrical, civil, and automotive design require a deep understanding of physics, mathematics, and materials science, which AI models, including me, do not possess. AI models can analyze data, generate designs, optimize processes, and provide suggestions, but they lack the ability to physically build, test, and refine designs like human engineers.

Moreover, engineering work often involves dealing with unexpected problems and making real-time decisions, which requires human judgment and experience. AI models can analyze data and generate potential solutions, but they rely on humans to make a final decision based on their expertise and the con

### Step 3

Use mistralai/Mistral-7B-Instruct-v0.2 to generate 5 responses for each instruction, making sure to use the appropriate chat template for the model.
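The single-turn prompt format can be sketched with plain string handling. The layout below (`<s>[INST] ... [/INST]`) follows the model card as far as we can tell, and both helper names are hypothetical; in practice `tokenizer.apply_chat_template` is the safer way to produce the prompt, since it reads the template shipped with the tokenizer.

```python
def build_mistral_prompt(instruction, bos="<s>"):
    """Wrap one user instruction in the Mistral instruct markers."""
    return f"{bos}[INST] {instruction} [/INST]"

def extract_completion(decoded, eos="</s>"):
    """Keep the text after the last [/INST] marker and drop the EOS token."""
    completion = decoded.rsplit("[/INST]", 1)[-1]
    return completion.replace(eos, "").strip()

prompt_text = build_mistral_prompt("Hi")

# Example of a decoded model output that echoes the prompt
decoded_output = "<s>[INST] Hi [/INST] Hello there!</s>"
completion = extract_completion(decoded_output)
```

Splitting on the last `[/INST]` marker avoids depending on the decoded text matching the prompt character for character, although it would misfire if the response itself contained that marker.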

In [38]:
import json

results = {}

In [44]:
counter = 0
for i, conversation in enumerate(sampled_conversations, 1):
    responses = []
    for _ in range(5):
        counter += 1
        response = generate_response(conversation[0])
        print(counter)
        responses.append(response)
    results[i] = responses

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


1


...
200


In [46]:
# Save results as JSON
with open('generated_responses.json', 'w') as f:
    json.dump(results, f, indent=4)

## Generating the Preference Dataset

Here we use both the instructions and the set of 5 generated responses for each instruction.

We then use the `llm-blender/PairRM` model to rank the responses for each instruction, and build the preference dataset from that ranking.

### Setting up PairRM

In [1]:
!pip install llm-blender -q

In [4]:
import llm_blender
blender = llm_blender.Blender()
blender.loadranker("llm-blender/PairRM") # load PairRM

Successfully loaded ranker from  /root/.cache/huggingface/hub/llm-blender/PairRM


In [6]:
inputs = ["hello, how are you!", "I love you!"]
candidates_texts = [["get out!", "hi! I am fine, thanks!", "bye!"],
                    ["I love you too!", "I hate you!", "Thanks! You're a good guy!"]]
ranks = blender.rank(inputs, candidates_texts, return_scores=False, batch_size=1)
# ranks is a list of ranks
# ranks[i][j] is the rank of candidate j for input i
# Expected result:
# array([[3, 1, 2],   # "hi! I am fine, thanks!" ranks 1st, "bye!" ranks 2nd, and "get out!" ranks 3rd
#        [1, 3, 2]],  # "I love you too!" ranks 1st, and "I hate you!" ranks 3rd
#       dtype=int32)


Ranking candidates: 100%|██████████| 2/2 [00:01<00:00,  1.21it/s]

In [7]:
ranks

array([[3, 1, 2],
       [1, 3, 2]], dtype=int32)
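To read the rank matrix, the short sketch below picks the candidate with rank 1 for each input. It copies the ranks printed above into plain lists so it runs without PairRM; the `best_candidates` helper is a hypothetical name.

```python
def best_candidates(candidates, ranks):
    """Return the candidate ranked 1st for each input."""
    best = []
    for texts, row in zip(candidates, ranks):
        # Rank 1 marks the preferred candidate for this input
        best.append(texts[list(row).index(1)])
    return best

candidates_texts = [["get out!", "hi! I am fine, thanks!", "bye!"],
                    ["I love you too!", "I hate you!", "Thanks! You're a good guy!"]]
ranks = [[3, 1, 2],
         [1, 3, 2]]

top = best_candidates(candidates_texts, ranks)
```

Here `top` holds `"hi! I am fine, thanks!"` for the first input and `"I love you too!"` for the second.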

### Preparing the dataset for use in PairRM

PairRM needs two lists: a list of instructions, and a list of lists that holds the set of candidate responses for each instruction.

In [8]:
# read json to memory
import json

with open('/kaggle/input/generated-response-options/generated_responses.json', 'r') as f:
  generated_responses = json.load(f)

with open('/kaggle/input/d/dhruvparthasarathy/sampled-instructions/sampled_conversations-2.json', 'r') as f:
  sampled_instructions = json.load(f)


In [12]:
sampled_instructions[0]

'Can I spend the night alone in a tent in a forest outside Stockholm in -20°C without risking my life?\n\nThe backstory\nFrom the end of January, I\'m starting my studies in a suburb of Stockholm. I\'ve decided to, if it turns out plausible, not rent an apartment, but live in a tent. (This is not out of frugality, but out of a will to try something new.)\nI do have friends who I could visit once a week or so to prepare food and wash my clothes, so I think I can solve the practical problems, or at least those that I\'ve come to think of. I\'d camp in one of the forests, maybe 1 km from "civilisation". I\'d have access to showers etc at university every day.\nHowever: I don\'t want to freeze to death in my sleep! That\'s very important to me. I\'ve read that the nights can get as cold as -20°C (-4°F). With the proper preparations, would this be a plausible way of living, at least for a month or so?\nI do have camping experience, and have been hiking for three weeks, but only in summer.'

In [24]:
generated_responses["1"]

["Camping in freezing temperatures like -20°C (−4°F) outside Stockholm can be challenging and potentially risky if you're not adequately prepared. Here are some considerations that may help you decide if it's plausible for your situation:\n\n1. The right gear: Make sure you have the appropriate equipment for cold-weather camping. This includes:\n- A high-quality, well-insulated four-season tent designed for cold temperatures. Make sure it's rated for the colder end of the temperature range.\n- A good sleeping bag designed for extreme cold conditions (ideally close to or below -20°C).\n- A camping stove and fuel to prepare hot meals and melt snow for water.\n- Insulated cooking utensils and a thermal mug to keep drinks warm.\n- Warm, insulating clothing, including down jackets, base layers, insulating layers, and a waterproof outer layer. Gloves, hats, and insulated boots are also essential.\n\n2. Shelter and fire: Build a secure and insulated shelter using snow or building a quinzhee (

We need to strip the `</s>` end-of-sequence token that is present at the end of the generated responses.

In [40]:
reIndexedResponses = []

# We need to strip the '</s>' from the responses
for index in generated_responses:

    responses = generated_responses[index]
    
    strippedResponses = []
    for response in responses:
        strippedResponses.append(response.replace('</s>', ''))
        
    reIndexedResponses.append(strippedResponses)

In [41]:
len(reIndexedResponses)

50

In [42]:
reIndexedResponses[0]

["Camping in freezing temperatures like -20°C (−4°F) outside Stockholm can be challenging and potentially risky if you're not adequately prepared. Here are some considerations that may help you decide if it's plausible for your situation:\n\n1. The right gear: Make sure you have the appropriate equipment for cold-weather camping. This includes:\n- A high-quality, well-insulated four-season tent designed for cold temperatures. Make sure it's rated for the colder end of the temperature range.\n- A good sleeping bag designed for extreme cold conditions (ideally close to or below -20°C).\n- A camping stove and fuel to prepare hot meals and melt snow for water.\n- Insulated cooking utensils and a thermal mug to keep drinks warm.\n- Warm, insulating clothing, including down jackets, base layers, insulating layers, and a waterproof outer layer. Gloves, hats, and insulated boots are also essential.\n\n2. Shelter and fire: Build a secure and insulated shelter using snow or building a quinzhee (

In [49]:
print(len(results))
print(results[0])

50
[-4.387891   5.56333   -5.870703   1.2523438  3.44292  ]


**Here we see that PairRM's rank call has scored each of the candidate responses: the array above holds one score per response for the first prompt. Now let's convert this into a format that can be saved as a dataset for the DPO task.**
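
As a small illustration of how these scores can be read, the sketch below picks the best and worst candidates from the row printed above. It assumes, as the rest of this notebook does, that a higher score means a more preferred response; the variable names are illustrative only.

```python
# Scores printed above for the first prompt (one score per candidate response)
scores = [-4.387891, 5.56333, -5.870703, 1.2523438, 3.44292]

# Index of the highest-scoring and lowest-scoring candidate
best_index = max(range(len(scores)), key=scores.__getitem__)
worst_index = min(range(len(scores)), key=scores.__getitem__)

# Full ordering of candidate indices, from most to least preferred
ranking = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)

print(best_index, worst_index, ranking)
```

For this row, the second response (index 1) scores highest and the third (index 2) scores lowest.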

We will use [snorkelai/Snorkel-Mistral-PairRM-DPO-Dataset](https://huggingface.co/datasets/snorkelai/Snorkel-Mistral-PairRM-DPO-Dataset) as the reference for the format and structure.

Then we will push the result to the Hugging Face Hub.

In [54]:
import json
with open('ranked_responses.json', 'w') as f:
    json.dump(results.tolist(), f, indent=4)

**Creating the Preference Dataset**

To create the preference dataset, we need the following columns:
- `prompt_id`: hash (we need to generate this)
- `prompt`: string (we already have this)
- `chosen`: list (we need to generate this)
- `rejected`: list (we need to generate this)
- `all_generated_resopnses`: sequence (we already have this as a list)
- `all_rm_scores`: sequence (taken directly from `results`)
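
Before building all 50 records, it can help to check one record against this layout. The sketch below is one possible sanity check, not part of the original workflow: `check_record`, `REQUIRED_COLUMNS`, and the toy record are hypothetical names and data, and the column names are copied verbatim from this notebook.

```python
REQUIRED_COLUMNS = {
    "prompt_id", "prompt", "chosen", "rejected",
    "all_generated_resopnses", "all_rm_scores",
}


def check_record(record: dict) -> None:
    """Raise ValueError if a record does not match the column layout above."""
    missing = REQUIRED_COLUMNS - set(record)
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    # chosen and rejected are two-message chats: the prompt, then one response
    for column in ("chosen", "rejected"):
        roles = [message["role"] for message in record[column]]
        if roles != ["user", "assistant"]:
            raise ValueError(f"{column} must be a user/assistant pair, got {roles}")
    if len(record["all_generated_resopnses"]) != len(record["all_rm_scores"]):
        raise ValueError("expected one score per generated response")


# Toy record with made-up values, used only to exercise the check
toy_record = {
    "prompt_id": "abc123",
    "prompt": "Say hi",
    "chosen": [{"content": "Say hi", "role": "user"},
               {"content": "Hi!", "role": "assistant"}],
    "rejected": [{"content": "Say hi", "role": "user"},
                 {"content": "No.", "role": "assistant"}],
    "all_generated_resopnses": ["Hi!", "No."],
    "all_rm_scores": [1.2, -0.7],
}
check_record(toy_record)  # passes silently when the layout is correct
```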

In [57]:
import hashlib

def create_md5_hash(input_string):
    # Create a new MD5 hash object
    hasher = hashlib.md5()
    # Update the hash object with the bytes of the string
    hasher.update(input_string.encode('utf-8'))
    # Obtain the digest, which is the hash value as a hexadecimal string
    return hasher.hexdigest()
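
The point of hashing the prompt is that the same text always yields the same `prompt_id`. The short sketch below shows this with an equivalent single-expression version of the helper; the sample string is arbitrary.

```python
import hashlib


def create_md5_hash(input_string: str) -> str:
    """Return the MD5 digest of a string as 32 hexadecimal characters."""
    return hashlib.md5(input_string.encode("utf-8")).hexdigest()


first = create_md5_hash("hello")
second = create_md5_hash("hello")

print(first)            # 32 hexadecimal characters
print(first == second)  # identical prompts map to identical ids
```

MD5 is used here only as a stable identifier, not for any security purpose.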

In [60]:
preference_dataset_list = []

for index in range(len(sampled_instructions)):
    prompt = sampled_instructions[index]
    prompt_id = create_md5_hash(prompt)
    all_generated_resopnses = reIndexedResponses[index]
    all_rm_scores = results[index].tolist()

    # The highest-scoring response becomes "chosen",
    # the lowest-scoring response becomes "rejected"
    max_index = all_rm_scores.index(max(all_rm_scores))
    min_index = all_rm_scores.index(min(all_rm_scores))

    # Both entries are two-message chats: the user prompt followed by one response
    chosen = [{"content": prompt, "role": "user"}, {"content": all_generated_resopnses[max_index], "role": "assistant"}]
    rejected = [{"content": prompt, "role": "user"}, {"content": all_generated_resopnses[min_index], "role": "assistant"}]

    preference_dataset_list.append({
        'prompt_id': prompt_id,
        'prompt': prompt,
        'chosen': chosen,
        'rejected': rejected,
        'all_generated_resopnses': all_generated_resopnses,
        'all_rm_scores': all_rm_scores,
    })

In [61]:
preference_dataset_list[0]

{'prompt_id': '57563f0b803851479c4e8a632b6016fc',
 'prompt': 'Can I spend the night alone in a tent in a forest outside Stockholm in -20°C without risking my life?\n\nThe backstory\nFrom the end of January, I\'m starting my studies in a suburb of Stockholm. I\'ve decided to, if it turns out plausible, not rent an apartment, but live in a tent. (This is not out of frugality, but out of a will to try something new.)\nI do have friends who I could visit once a week or so to prepare food and wash my clothes, so I think I can solve the practical problems, or at least those that I\'ve come to think of. I\'d camp in one of the forests, maybe 1 km from "civilisation". I\'d have access to showers etc at university every day.\nHowever: I don\'t want to freeze to death in my sleep! That\'s very important to me. I\'ve read that the nights can get as cold as -20°C (-4°F). With the proper preparations, would this be a plausible way of living, at least for a month or so?\nI do have camping experience

In [62]:
# Convert list of dictionaries to a format suitable for creating a Dataset
data_dict = {key: [dic[key] for dic in preference_dataset_list] for key in preference_dataset_list[0]}
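
The comprehension above turns a list of row dictionaries into a dictionary of columns. A toy example with made-up rows may make the reshaping easier to see; the names `rows`, `columns`, and `restored` are illustrative only.

```python
# Two made-up rows with the same keys, standing in for preference_dataset_list
rows = [
    {"prompt": "p1", "all_rm_scores": [0.1, -0.2]},
    {"prompt": "p2", "all_rm_scores": [1.5, 0.3]},
]

# Row-oriented -> column-oriented: one list per key, in row order
columns = {key: [row[key] for row in rows] for key in rows[0]}
print(columns)

# Column-oriented -> row-oriented, to confirm nothing was lost
restored = [dict(zip(columns, values)) for values in zip(*columns.values())]
print(restored == rows)
```

This relies on every row having the same keys as the first one. Recent versions of `datasets` also provide `Dataset.from_list`, which may accept the list of dictionaries directly.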

In [64]:
from datasets import Dataset

# Create a Hugging Face dataset
dataset = Dataset.from_dict(data_dict)

# Output the dataset to check it
print(dataset)

Dataset({
    features: ['prompt_id', 'prompt', 'chosen', 'rejected', 'all_generated_resopnses', 'all_rm_scores'],
    num_rows: 50
})


In [65]:
dataset.push_to_hub("Mistral-7B-Instruct-v2.0-PairRM-DPO-Dataset")

Uploading the dataset shards:   0%|          | 0/1 [00:00<?, ?it/s]

Creating parquet from Arrow format:   0%|          | 0/1 [00:00<?, ?ba/s]

CommitInfo(commit_url='https://huggingface.co/datasets/DhruvParth/Mistral-7B-Instruct-v2.0-PairRM-DPO-Dataset/commit/f0bd4adfb1f788017bdc4c9d0647b3c98871c985', commit_message='Upload dataset', commit_description='', oid='f0bd4adfb1f788017bdc4c9d0647b3c98871c985', pr_url=None, pr_revision=None, pr_num=None)