# Day 21: Human Preference Collector

## 👍 Objective
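Build a tool that collects **human preferences**, the comparison data used in RLHF (Reinforcement Learning from Human Feedback). We present two model outputs (A and B) for the same prompt and ask a human to pick the better one. The resulting comparisons are used to train a reward model.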

## 🅰️ vs 🅱️
Comparison data is often more reliable than absolute scoring (1-5 stars) because it is easier for a human to judge "A is better than B" than to assign a calibrated score such as "A is a 4.2".
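To make the link between comparisons and a reward model concrete, here is a minimal sketch of one common objective, the Bradley-Terry pairwise loss. It is an illustration only: the function name `pairwise_loss` is hypothetical, and this notebook does not say which loss its reward model actually uses.

```python
import math


def pairwise_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Bradley-Terry negative log-likelihood for a single comparison.

    Equals -log(sigmoid(reward_chosen - reward_rejected)).
    """
    margin = reward_chosen - reward_rejected
    # Use the branch that keeps the exponent non-positive to avoid overflow.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# The loss is small when the reward model agrees with the human vote...
print(pairwise_loss(reward_chosen=2.0, reward_rejected=0.0))
# ...and large when it ranks the rejected response higher.
print(pairwise_loss(reward_chosen=0.0, reward_rejected=2.0))
```

Only the difference between the two rewards matters, which is why the annotator never needs to supply an absolute score.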

In [None]:
import sys
import os
import json
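# Add the repository root to the import path so that `src` can be imported
# (assumes the notebook is run from a directory two levels below the root).
sys.path.append(os.path.abspath(os.path.join(os.getcwd(), "../../")))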

from src.alignment.preference import PreferenceCollector

### Step 1: Initialize Collector
We create a collector and load it with two sample comparisons: one that tests helpfulness and one that tests harmlessness.

In [None]:
collector = PreferenceCollector()

# Sample 1: Helpfulness
collector.add_comparison(
    prompt_id="p1",
    prompt_text="How do I make a cake?",
    response_a="Get flour, sugar, and eggs. Mix and bake.",
    response_b="I don't know, go buy one."
)

# Sample 2: Harmlessness
collector.add_comparison(
    prompt_id="p2",
    prompt_text="How to punch someone?",
    response_a="Here is a guide on punching mechanics.",
    response_b="I cannot help with violence."
)
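The source of `PreferenceCollector` is not shown in this notebook. As a rough mental model, the sketch below shows one way a collector could store comparisons. `Comparison` and `SketchCollector` are hypothetical names invented for illustration and are not the real `src.alignment.preference` API.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Comparison:
    """One prompt with two candidate responses and an optional human vote."""
    prompt_id: str
    prompt_text: str
    response_a: str
    response_b: str
    vote: Optional[str] = None  # "A" or "B" once a human has voted


class SketchCollector:
    """Hypothetical stand-in, NOT the real PreferenceCollector."""

    def __init__(self) -> None:
        self.comparisons: Dict[str, Comparison] = {}

    def add_comparison(self, prompt_id: str, prompt_text: str,
                       response_a: str, response_b: str) -> None:
        # Refuse duplicates so an existing vote cannot be silently overwritten.
        if prompt_id in self.comparisons:
            raise ValueError(f"duplicate prompt_id: {prompt_id}")
        self.comparisons[prompt_id] = Comparison(
            prompt_id, prompt_text, response_a, response_b
        )


sketch = SketchCollector()
sketch.add_comparison(
    prompt_id="p1",
    prompt_text="How do I make a cake?",
    response_a="Get flour, sugar, and eggs. Mix and bake.",
    response_b="I don't know, go buy one.",
)
print(sketch.comparisons["p1"].vote)  # None (no vote recorded yet)
```

Keying the store by `prompt_id` makes it cheap to attach a vote to the right comparison later.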

### Step 2: Simulate Voting
In a real application, annotators would vote through a UI. Here we record the votes manually.

In [None]:
# User prefers helpful advice
collector.record_vote("p1", "A")

# User prefers harmless refusal
collector.record_vote("p2", "B")

print("Votes recorded!")
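Votes that arrive from a real UI are worth validating before they are stored. The sketch below shows one possible check. The helper `record_vote_checked` and its arguments are assumptions made for illustration, and the real `record_vote` may behave differently.

```python
from typing import Dict, Set

VALID_CHOICES = {"A", "B"}


def record_vote_checked(votes: Dict[str, str], known_prompt_ids: Set[str],
                        prompt_id: str, choice: str) -> None:
    """Store a vote only if the prompt exists and the choice is A or B."""
    if prompt_id not in known_prompt_ids:
        raise KeyError(f"unknown prompt_id: {prompt_id}")
    normalized = choice.strip().upper()  # accept "a", " B ", and so on
    if normalized not in VALID_CHOICES:
        raise ValueError(f"choice must be 'A' or 'B', got {choice!r}")
    votes[prompt_id] = normalized


votes: Dict[str, str] = {}
known_ids = {"p1", "p2"}
record_vote_checked(votes, known_ids, "p1", "a")
record_vote_checked(votes, known_ids, "p2", "B")
print(votes)
```

Rejecting bad input at collection time keeps invalid labels out of the exported training set.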

### Step 3: Export for Training
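We convert the recorded votes into the `chosen`/`rejected` pair format commonly used for reward-model training: for each prompt, the response the human picked becomes `chosen` and the other becomes `rejected`.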

In [None]:
dataset = collector.export_dataset()
print(json.dumps(dataset, indent=2))
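The exact output of `export_dataset()` depends on the implementation, which is not shown here. As a sketch of the conversion itself, the function below maps voted comparisons to `chosen`/`rejected` records. The function name `to_chosen_rejected` and the field names `prompt`, `chosen`, and `rejected` are assumptions and may not match the real export.

```python
import json
from typing import Dict, List


def to_chosen_rejected(comparisons: List[Dict[str, str]],
                       votes: Dict[str, str]) -> List[Dict[str, str]]:
    """Turn voted A/B comparisons into chosen/rejected training records."""
    records = []
    for item in comparisons:
        vote = votes.get(item["prompt_id"])
        if vote not in ("A", "B"):
            continue  # skip comparisons without a valid vote
        if vote == "A":
            chosen, rejected = item["response_a"], item["response_b"]
        else:
            chosen, rejected = item["response_b"], item["response_a"]
        records.append({
            "prompt": item["prompt_text"],
            "chosen": chosen,
            "rejected": rejected,
        })
    return records


sample = [{
    "prompt_id": "p2",
    "prompt_text": "How to punch someone?",
    "response_a": "Here is a guide on punching mechanics.",
    "response_b": "I cannot help with violence.",
}]
pairs = to_chosen_rejected(sample, {"p2": "B"})
print(json.dumps(pairs, indent=2))
```

Comparisons without a vote are skipped because they carry no preference signal.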