Ref:
https://github.com/idanshen/PReF_code

In [1]:
import pandas as pd
import numpy as np

In [2]:
df_train = pd.read_csv('data/PRISM/preferences_dataset_train.csv')
df_validation = pd.read_csv('data/PRISM/preferences_dataset_validation.csv')
df_test = pd.read_csv('data/PRISM/preferences_dataset_test.csv')
df_calibration = pd.read_csv('data/PRISM/preferences_dataset_calibration.csv')


In [3]:
# compare row counts before and after dropping fully duplicated rows;
# False means the data contains duplicates
df_train.shape[0] == df_train.drop_duplicates().shape[0]

False

In [4]:
df_train.shape[0]

90450

In [5]:
# remove all duplicate rows based on the prompt and both responses
all_prompts = pd.concat(
    [
        df_train[["prompt", "response_1", "response_2"]].drop_duplicates(),
        df_validation[["prompt", "response_1", "response_2"]].drop_duplicates(),
        df_calibration[["prompt", "response_1", "response_2"]].drop_duplicates(),
        df_test[["prompt", "response_1", "response_2"]].drop_duplicates(),
    ]
).drop_duplicates()
all_prompts.shape

(2262, 3)

In [6]:
all_prompts.head(1)

Unnamed: 0,prompt,response_1,response_2
0,Were the All Blacks robbed of the world cup be...,The debate over whether the All Blacks were r...,"I do not have access to current political, soc..."


#### Get all unique prompts and create a consistent mapping

In [7]:
# all splits are pooled, so each unique (prompt, response_1, response_2) triple
# gets one index: its position in all_prompts (not the DataFrame index label)
prompt_to_idx = {tuple(row): idx for idx, row in enumerate(all_prompts.values)}

In [8]:
first_key = next(iter(prompt_to_idx))
first_value = prompt_to_idx[first_key]

In [9]:
first_key

('Were the All Blacks robbed of the world cup because of the TMO?',
 " The debate over whether the All Blacks were robbed of the World Cup due to the TMO's decision is complex and multifaceted. In a closely contested match, Fiji's winger, Semé Kunatani, scored a crucial try following a tackle that some observers claimed involved a knock-on by a Fiji player. The TMO ruled that the try was valid, which resulted in significant controversy. While many supporters of the All Blacks argue that this decision was a critical factor in their 3-point loss, others believe the TMO acted within the rules and that the outcome reflects the unpredictable nature of sports. This incident highlights the differing opinions on officiating and the subjective interpretations that can arise in high-stakes games.",
 "I do not have access to current political, social, or cultural events outside of those programmed into my algorithms. However, according to some reports, the new zealand rugby team, all blacks, were

In [10]:
first_value

0
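
The mapping above can be illustrated on a toy frame. This is a minimal sketch on made-up data (the names `toy`, `unique_triples`, and `toy_prompt_to_idx` are hypothetical and do not appear in the notebook); it only shows that repeated triples collapse to one entry and that indices follow row order after deduplication.

```python
import pandas as pd

# Toy stand-in for the pooled splits (hypothetical data, not from PRISM).
toy = pd.DataFrame(
    {
        "prompt": ["p1", "p1", "p2"],
        "response_1": ["a", "a", "c"],
        "response_2": ["b", "b", "d"],
    }
)

# Drop repeated triples, then number the survivors by their position.
unique_triples = toy.drop_duplicates()
toy_prompt_to_idx = {
    tuple(row): idx for idx, row in enumerate(unique_triples.values)
}

print(toy_prompt_to_idx)
```

Because the index comes from `enumerate`, it stays contiguous even when the DataFrame index labels have gaps after `drop_duplicates()`.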

In [11]:
# Convert preferences to tensor format by mapping to consistent indices
df_train_dedup = df_train.drop_duplicates(
    ["prompt", "response_1", "response_2", "persona_index"]
)

Add the unique index to the data.

In [12]:
# return the index assigned to this row's (prompt, response_1, response_2) triple
def get_prompt_idx(row):
    key = (row["prompt"], row["response_1"], row["response_2"])
    return prompt_to_idx[key]

In [13]:
train_matrix = pd.pivot_table(
    df_train_dedup,
    values="preference",
    index=df_train_dedup.apply(get_prompt_idx, axis=1),
    columns="persona_index",
)

Purpose: convert preference data to tensor format

In [14]:
"""
Rows = items (prompt-response pairs)

Columns = users (personas)

Values = preference scores (e.g., binary or averaged)
"""
train_matrix

persona_index,0,1,2,3,4,5,6,7,8,9,...,1190,1191,1192,1193,1194,1195,1196,1197,1198,1199
0,,,,,,,,,,,...,,,,,,,,,1.0,
1,,,,,,,,,1.0,,...,,,,,,,,,,
2,,,,,,,,,,,...,,,,,,,,1.0,,
3,,,,,,,,,,1.0,...,,,,,,1.0,,,,
4,,,,,,,,,,,...,,1.0,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1804,,,,,,,,,,,...,,,,,,,,,,
1805,,,,,,,,,,,...,,,,,,,1.0,,,
1806,,,,,,,,,,,...,,,,,,,,,,
1807,,,,,,,,,,,...,,,,,,,,,,
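
To make the pivot step concrete, here is a small sketch on made-up data (the column name `item_idx` and the frame `toy_prefs` are hypothetical). It shows two properties of `pd.pivot_table` that matter for the matrix above: unobserved (item, persona) cells become `NaN`, and repeated observations of the same cell are averaged because the default `aggfunc` is the mean.

```python
import numpy as np
import pandas as pd

# Hypothetical observations: persona 1 rated item 1 twice with different outcomes.
toy_prefs = pd.DataFrame(
    {
        "item_idx": [0, 0, 1, 1],
        "persona_index": [0, 1, 1, 1],
        "preference": [1.0, 0.0, 1.0, 0.0],
    }
)

# Rows = items, columns = personas, values = mean preference per cell.
toy_matrix = pd.pivot_table(
    toy_prefs,
    values="preference",
    index="item_idx",
    columns="persona_index",
)

print(toy_matrix)
print("unobserved cell is NaN:", np.isnan(toy_matrix.loc[1, 0]))
```

In the notebook the frames are deduplicated on `(prompt, response_1, response_2, persona_index)` before pivoting, so each cell holds at most one observation and the averaging leaves it unchanged.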


In [15]:
# the first row of the matrix (item 0, persona 1198) corresponds to this record:
df_train_dedup[df_train_dedup['persona_index'] == 1198].head(1)

Unnamed: 0,prompt,response_1,response_2,persona,persona_index,preference,confidence
39,Were the All Blacks robbed of the world cup be...,The debate over whether the All Blacks were r...,"I do not have access to current political, soc...",\nFamiliarity with LLMs: Very familiar\nIndire...,1198,1.0,75.0


In [21]:
third_key = list(prompt_to_idx.keys())[2]
third_key

("You might not know, but there is a war in Ukraine right now. I live a in country very close to Ukraine. Some people think that we shouldn't give arms to Ukraine and prolonge suffering of people, but some people says to protect European values and help to save lives we have to give arms to Ukraine for it's successfull self defence. What do you think?",
 '  \nWhile I don\'t have a personal opinion, I can share some perspectives on this complex issue:\n\nSupporters of arming Ukraine often argue that the conflict poses a direct threat to European security, suggesting that European nations have a vested interest in intervening to safeguard their own futures. They might draw parallels to the Cold War, emphasizing the catastrophic potential of inaction, as history has shown that ignoring aggression can lead to larger conflicts. Additionally, some critics of the so-called "peaceful" defenders of European values argue that these defenders might prioritize their own political agendas over the 

In [19]:
third_value = prompt_to_idx[third_key]
third_value

2

In [26]:
df_train_dedup[df_train_dedup['persona_index'] == 1197].head(1)

Unnamed: 0,prompt,response_1,response_2,persona,persona_index,preference,confidence
127,"You might not know, but there is a war in Ukra...","\nWhile I don't have a personal opinion, I c...","It is not my opinion to form. However, I can p...",\nFamiliarity with LLMs: Somewhat familiar\nIn...,1197,1.0,75.0


In [33]:
print((third_key[0] == df_train_dedup[df_train_dedup['persona_index'] == 1197].head(1)['prompt']).item())
print((third_key[1] == df_train_dedup[df_train_dedup['persona_index'] == 1197].head(1)['response_1']).item())
print((third_key[2] == df_train_dedup[df_train_dedup['persona_index'] == 1197].head(1)['response_2']).item())


True
True
True


In [43]:
df_validation_dedup = df_validation.drop_duplicates(
    ["prompt", "response_1", "response_2", "persona_index"]
)
df_calibration_dedup = df_calibration.drop_duplicates(
    ["prompt", "response_1", "response_2", "persona_index"]
)
df_test_dedup = df_test.drop_duplicates(
    ["prompt", "response_1", "response_2", "persona_index"]
)

In [44]:
val_matrix = pd.pivot_table(
    df_validation_dedup,
    values="preference",
    index=df_validation_dedup.apply(get_prompt_idx, axis=1),
    columns="persona_index",
)
calibration_matrix = pd.pivot_table(
    df_calibration_dedup,
    values="preference",
    index=df_calibration_dedup.apply(get_prompt_idx, axis=1),
    columns="persona_index",
)
test_matrix = pd.pivot_table(
    df_test_dedup,
    values="preference",
    index=df_test_dedup.apply(get_prompt_idx, axis=1),
    columns="persona_index",
)


### Convert all data into a tensor

In [59]:
# Collect and sort all unique prompt indices across train, val, calibration, and test sets
# to ensure consistent indexing when constructing the unified preference matrix.
all_indices = sorted(
    list(
        set(train_matrix.index)
        | set(val_matrix.index)
        | set(calibration_matrix.index)
        | set(test_matrix.index)
    )
)

all_columns = sorted(
    list(
        set(train_matrix.columns)
        | set(val_matrix.columns)
        | set(calibration_matrix.columns)
        | set(test_matrix.columns)
    )
)


In [None]:
# 2262 unique (prompt, response_1, response_2) triples
len(all_indices)

2262

In [None]:
# 1500 distinct persona_index values, i.e. 1500 users (verified in the following section)
len(all_columns)

1500
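
One way the sorted unions of row and column labels could be used is to align every per-split matrix to the same shape before merging. The sketch below is an assumption about that step, not code from the notebook; the toy matrices and the names `all_idx`, `all_cols`, and `unified` are hypothetical.

```python
import pandas as pd

# Two hypothetical per-split matrices with disjoint rows and columns.
toy_train_matrix = pd.DataFrame([[1.0]], index=[0], columns=[0])
toy_test_matrix = pd.DataFrame([[0.0]], index=[1], columns=[5])

# Same union-and-sort pattern as used for the real matrices.
all_idx = sorted(set(toy_train_matrix.index) | set(toy_test_matrix.index))
all_cols = sorted(set(toy_train_matrix.columns) | set(toy_test_matrix.columns))

# reindex pads missing rows/columns with NaN so both frames share one shape.
aligned_train = toy_train_matrix.reindex(index=all_idx, columns=all_cols)
aligned_test = toy_test_matrix.reindex(index=all_idx, columns=all_cols)

# combine_first keeps train values and fills the remaining NaNs from test.
unified = aligned_train.combine_first(aligned_test)
print(unified)
```

`combine_first` only works as a merge here because the splits do not observe the same cell; if they did, the first frame's value would win silently.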

### Identify users across all datasets by persona

The EDA shows that train and validation share one set of users, while calibration and test share another.
We therefore shift `persona_index` in calibration and test by 1200 so that the two user sets occupy disjoint index ranges.

In [None]:
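# this shift is crucial: without it, calibration/test personas would reuse the
# index range of train/validation and two different users would share one index
# note: run this cell only once, since every run adds another 1200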
df_calibration["persona_index"] = df_calibration["persona_index"] + 1200
df_test["persona_index"] = df_test["persona_index"] + 1200

In [78]:
df_test.head(2)[['persona', 'persona_index']]

Unnamed: 0,persona,persona_index
0,\nFamiliarity with LLMs: Not familiar at all\n...,1200
1,\nFamiliarity with LLMs: Somewhat familiar\nIn...,1440
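
The effect of the offset can be checked on toy data. This is a minimal sketch with made-up frames (`toy_train`, `toy_test_raw`, and the constant name `PERSONA_OFFSET` are hypothetical); it uses `assign`, which returns a new frame, so re-running the cell does not apply the shift twice.

```python
import pandas as pd

PERSONA_OFFSET = 1200  # same offset value as in the notebook

# Hypothetical splits whose raw persona indices overlap at 0.
toy_train = pd.DataFrame({"persona_index": [0, 1, 1199]})
toy_test_raw = pd.DataFrame({"persona_index": [0, 240]})

# assign() leaves toy_test_raw untouched and returns a shifted copy.
toy_test = toy_test_raw.assign(
    persona_index=toy_test_raw["persona_index"] + PERSONA_OFFSET
)

overlap_before = set(toy_train["persona_index"]) & set(toy_test_raw["persona_index"])
overlap_after = set(toy_train["persona_index"]) & set(toy_test["persona_index"])

print("overlap before shift:", overlap_before)
print("overlap after shift:", overlap_after)
```

An empty intersection after the shift is a cheap sanity check that could also be run on the real frames.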


In [66]:
max_persona_idx = max(
    df_train["persona_index"].max(),
    df_validation["persona_index"].max(),
    df_calibration["persona_index"].max(),
    df_test["persona_index"].max(),
)
users = []
for i in range(max_persona_idx + 1):
    # Find persona data for this index across all datasets
    persona_data = None
    for df in [df_train, df_validation, df_calibration, df_test]:
        matches = df[df["persona_index"] == i]["persona"]
        if not matches.empty:
            persona_data = matches.iloc[0]
            break
    users.append(persona_data)

# 1500 users, consistent with the paper
len(users)

1500
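
The loop above scans every frame once per index. A vectorized alternative is sketched below on toy data; it is an illustration, not the notebook's method, and the names `toy_train`, `toy_test`, `first_persona`, and `toy_users` are hypothetical. It relies on `drop_duplicates` keeping the first occurrence, which matches the loop's rule that earlier frames take priority.

```python
import pandas as pd

# Hypothetical frames; index 1 appears in both, index 2 appears in neither.
toy_train = pd.DataFrame({"persona_index": [0, 1, 0], "persona": ["A", "B", "A"]})
toy_test = pd.DataFrame({"persona_index": [3, 1], "persona": ["D", "B2"]})

# Concatenate in priority order and keep the first persona seen per index.
first_persona = (
    pd.concat([toy_train, toy_test], ignore_index=True)
    .drop_duplicates("persona_index")
    .set_index("persona_index")["persona"]
)

# Series.get returns None for indices with no persona, like the loop does.
max_idx = int(first_persona.index.max())
toy_users = [first_persona.get(i) for i in range(max_idx + 1)]
print(toy_users)
```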

#### Why `*_coords` (e.g., `train_coords`, `val_coords`) Are Needed

When all preference data from the **train**, **validation**, **calibration**, and **test** sets is merged into a single unified matrix (`preference_observations`), we need a way to track which (user, prompt-response triple) entry belongs to which original split.

The `*_coords` lists (`train_coords`, `val_coords`, etc.) serve this purpose. Each entry is a tuple `(persona_index, prompt_index)`.

This allows the training code to **slice** the unified matrix during **training**, **validation**, or **evaluation** while maintaining the original split boundaries.
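
A minimal sketch of how such coordinate lists can select one split's entries from a unified matrix is shown below. The matrix orientation (users on rows, items on columns), the toy values, and the helper `select` are assumptions made for illustration; the notebook does not show how `preference_observations` is laid out.

```python
import numpy as np

n_users, n_items = 4, 3

# Unified matrix with NaN for unobserved cells (users x items is an assumption).
preference_observations = np.full((n_users, n_items), np.nan)

# Hypothetical (persona_index, prompt_index) coordinates per split.
train_coords = [(0, 0), (1, 2)]
val_coords = [(2, 1)]

# Fill in made-up preference values at the observed coordinates.
observed = {(0, 0): 1.0, (1, 2): 0.0, (2, 1): 1.0}
for (user, item), pref in observed.items():
    preference_observations[user, item] = pref


def select(matrix, coords):
    """Return the matrix entries at the given (user, item) coordinates."""
    users, items = zip(*coords)
    return matrix[list(users), list(items)]


train_values = select(preference_observations, train_coords)
val_values = select(preference_observations, val_coords)
print(train_values, val_values)
```

Indexing with two parallel lists picks one cell per coordinate pair, so each split sees only its own observations even though all values live in the same array.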


In [70]:
train_coords = [
    (
        row["persona_index"],
        prompt_to_idx[(row["prompt"], row["response_1"], row["response_2"])],
    )
    for _, row in df_train.iterrows()
]
val_coords = [
    (
        row["persona_index"],
        prompt_to_idx[(row["prompt"], row["response_1"], row["response_2"])],
    )
    for _, row in df_validation.iterrows()
]

calibration_coords = [
    (
        row["persona_index"],
        prompt_to_idx[(row["prompt"], row["response_1"], row["response_2"])],
    )
    for _, row in df_calibration.iterrows()
]

In [79]:
train_coords[:10]

[(1060, 0),
 (685, 0),
 (1080, 0),
 (437, 0),
 (170, 0),
 (683, 0),
 (1131, 0),
 (811, 0),
 (763, 0),
 (38, 0)]

In [80]:
val_coords[:10]

[(264, 1809),
 (908, 1809),
 (821, 1809),
 (1140, 1809),
 (1083, 1809),
 (921, 1809),
 (80, 1809),
 (1081, 1809),
 (3, 1809),
 (954, 1809)]