# Convert `genies` datasets to [open_pref_eval](https://github.com/wassname/open_pref_eval)


Here I take the GENIE datasets and
1. convert them to a preference format (compatible with open_pref_eval)
2. host them on Hugging Face


## Setup

```sh
python -m venv .venv --prompt GENIES
. .venv/bin/activate
pip install wheel fire requests
pip install -r requirements.txt
python ./download_data.py
```

In [3]:
%reload_ext autoreload
%autoreload 2

In [5]:
import pandas as pd
import numpy as np
from datasets import load_dataset
import datasets

from pathlib import Path
import json

In [8]:
path_to_distribution_shift_pairs = Path('../distribution_shifts/all.json')
pairs_data = json.loads(path_to_distribution_shift_pairs.read_text())  # read_text() closes the file
pairs_data

[{'source': 'alpaca_easy', 'target': 'alpaca_hard'},
 {'source': 'arc_easy', 'target': 'arc_hard'},
 {'source': 'math_easy', 'target': 'math_hard'},
 {'source': 'code_easy', 'target': 'code_hard'},
 {'source': 'ranking_logic_easy', 'target': 'ranking_logic_hard'},
 {'source': 'raven_easy', 'target': 'raven_matrices'},
 {'source': 'alpaca_mmlu', 'target': 'spanish_input'},
 {'source': 'alpaca_mmlu', 'target': 'spanish_output'},
 {'source': 'alpaca_mmlu', 'target': 'comma_separated_input'},
 {'source': 'alpaca_mmlu', 'target': 'comma_separated_output'},
 {'source': 'alpaca_mmlu', 'target': 'ranking_logic'},
 {'source': 'alpaca_mmlu', 'target': 'raven_matrices'},
 {'source': 'alpaca_mmlu', 'target': 'word_swap'},
 {'source': 'code', 'target': 'counterfactual_python'},
 {'source': 'code', 'target': 'us_history'},
 {'source': 'code', 'target': 'change_my_view'},
 {'source': 'cooking', 'target': 'math'},
 {'source': 'cooking', 'target': 'raven_matrices'},
 {'source': 'math', 'target': 'chang

{'id': 'alpaca_easy', 'external_datasets': [], 'overlapping_datasets': []}

In [48]:
from datasets import DatasetInfo, Dataset

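def genie2ds(train: list) -> pd.DataFrame:
    """Take records in the GENIE format and convert them to a dataframe in preference format."""
    outs = []
    for i, row in enumerate(train):
        # as used here, `responses` maps each response text to a score (1 = preferred, 0 = not)
        s = pd.Series(row['responses'])
        chosen = s[s==1].index[0]  # first response scored 1; raises IndexError if there is none
        rejected = s[s==0].index  # every response scored 0
        # one preference row per rejected response; `i` is the index of the source record
        outs += [dict(prompt=row['prompt'], chosen=chosen, rejected=r, i=i) for r in rejected]

    df = pd.DataFrame(outs)
    return df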



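def json2ds(source_dir: Path) -> datasets.DatasetDict:
    """Load one GENIE distribution directory and return its train/test splits in preference format."""
    # read_text() opens and closes each file, so no handles are left open
    test = json.loads((source_dir / 'test.json').read_text())
    train = json.loads((source_dir / 'train.json').read_text())
    metadata = json.loads((source_dir / 'metadata.json').read_text())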
    ds_info = DatasetInfo(
        description= f"GENIE:{metadata['id']}",
        citation= """@misc{clymer2023generalizationanalogiestestbedgeneralizing,
        title={Generalization Analogies: A Testbed for Generalizing AI Oversight to Hard-To-Measure Domains}, 
        author={Joshua Clymer and Garrett Baker and Rohan Subramani and Sam Wang},
        year={2023},
        eprint={2311.07723},
        archivePrefix={arXiv},
        primaryClass={cs.AI},
        url={https://arxiv.org/abs/2311.07723}, 
    }""",
        homepage= "https://joshuaclymer.github.io/generalization-analogies-website/",
        license= "MIT",
        config_name=f"{metadata['id']}",
    )


    df_train = genie2ds(train)
    df_test = genie2ds(test)
    dataset2 = datasets.DatasetDict(
        {'train': datasets.Dataset.from_pandas(df_train, info=ds_info),
         'test': datasets.Dataset.from_pandas(df_test, info=ds_info)}
    )
    return dataset2
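
To make the conversion concrete, here is a minimal sketch of the same pairing logic on a made-up record, written in plain Python so it runs without pandas. The record contents and the helper name `genie_rows` are hypothetical. The `prompt`/`responses` layout is inferred from `genie2ds` above, not taken from the GENIE documentation.

```python
def genie_rows(records: list) -> list:
    """Pair the first response scored 1 with every response scored 0 (hypothetical helper)."""
    rows = []
    for i, rec in enumerate(records):
        chosen = [text for text, score in rec["responses"].items() if score == 1]
        rejected = [text for text, score in rec["responses"].items() if score == 0]
        if not chosen:
            # genie2ds would raise IndexError here; this sketch skips the record instead
            continue
        rows += [dict(prompt=rec["prompt"], chosen=chosen[0], rejected=r, i=i) for r in rejected]
    return rows


# Made-up record: one preferred response and two non-preferred ones
toy = [{"prompt": "2 + 2 =", "responses": {"4": 1, "5": 0, "22": 0}}]
rows = genie_rows(toy)
for row in rows:
    print(row)
```

Each rejected response becomes its own row, so a record with one preferred and two non-preferred responses yields two preference pairs that share the same `prompt`, `chosen`, and `i`.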

In [53]:
dist_dir = Path('../distributions')

for pair in pairs_data:
    for key in ['source', 'target']:
        source_dir = dist_dir / pair[key]

        dataset2 = json2ds(source_dir)
        config_name = dataset2['train'].info.config_name
        print(source_dir, config_name, dataset2)

        dataset2.push_to_hub("wassname/genie_dpo", config_name=config_name)

../distributions/alpaca_easy alpaca_easy
../distributions/alpaca_hard alpaca_hard
../distributions/arc_easy arc_easy
../distributions/arc_hard arc_hard
../distributions/math_easy math_easy
../distributions/math_hard math_hard
../distributions/code_easy code_easy
../distributions/code_hard code_hard
../distributions/ranking_logic_easy ranking_logic_easy
../distributions/ranking_logic_hard ranking_logic_hard
../distributions/raven_easy raven_easy
../distributions/raven_matrices raven_matrices
../distributions/alpaca_mmlu alpaca_mmlu
../distributions/spanish_input spanish_input
../distributions/alpaca_mmlu alpaca_mmlu
../distributions/spanish_output spanish_output
../distributions/alpaca_mmlu alpaca_mmlu
../distributions/comma_separated_input comma_separated_input
../distributions/alpaca_mmlu alpaca_mmlu
../distributions/comma_separated_output comma_separated_output
../distributions/alpaca_mmlu alpaca_mmlu
../distributions/ranking_logic ranking_logic
../distributions/alpaca_mmlu alpaca_mmlu
../distributions/raven_matrices raven_matrices
../distributions/alpaca_mmlu alpaca_mmlu
../distributions/word_swap word_swap
../distributions/code code
../distributions/counterfactual_python counterfactual_python
../distributions/code code
../distributions/us_history us_history
../distributions/code code
../distributions/change_my_view change_my_view
../distributions/cooking cooking
../distributions/math math
../distributions/cooking cooking
../distributions/raven_matrices raven_matrices
../distributions/math math
../distributions/change_my_view change_my_view
../distributions/math math
../distributions/cooking cooking
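
Because the loop handles each pair independently, a distribution that appears in several pairs (for example `alpaca_mmlu`) is converted and pushed once per appearance, as the log shows. Below is a minimal sketch of collecting each distribution name once, in first-seen order, before pushing. `unique_distributions` is a hypothetical helper, and the sample list is a small subset of `pairs_data`.

```python
def unique_distributions(pairs: list) -> list:
    """Return each distribution name once, in first-seen order (hypothetical helper)."""
    seen = {}
    for pair in pairs:
        for key in ("source", "target"):
            seen.setdefault(pair[key], None)  # dicts keep insertion order
    return list(seen)


# A few entries copied from pairs_data
sample_pairs = [
    {"source": "alpaca_easy", "target": "alpaca_hard"},
    {"source": "alpaca_mmlu", "target": "spanish_input"},
    {"source": "alpaca_mmlu", "target": "spanish_output"},
    {"source": "alpaca_mmlu", "target": "raven_matrices"},
]
names = unique_distributions(sample_pairs)
print(names)
```

The push loop could then iterate over `unique_distributions(pairs_data)` and call `json2ds(dist_dir / name)` once per name, which should avoid the repeated uploads.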