# Learning the value systems of societies from preferences - submitted for ECAI 2025
This notebook is designed to execute the experiments for the ECAI paper titled "Learning the value systems of societies from preferences". The paper presents a novel approach to learning value systems (value-based preferences) and value groundings (domain-specific value alignment measures) of a society of agents or stakeholders from examples of pairwise preferences between alternatives in a decision-making problem domain.

In the paper we use the Apollo dataset [`apollo_swissRouteChoiceData`](https://rdrr.io/cran/apollo/man/apollo_swissRouteChoiceData.html), which records train route choices in Switzerland. The dataset includes features such as cost, time, headway, and interchanges, which are used to model agent preferences based on values. Although the approach also works for sequential decision making, the paper focuses on the non-sequential decision-making use case that the Apollo dataset covers.
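To make the idea of value-based pairwise preferences concrete, the sketch below shows one common way to turn feature-based alignments into a preference probability: a Bradley-Terry (logistic) model over a weighted sum of alignments. This is an illustrative assumption, not the paper's implementation. The function `toy_grounding`, the two example routes, and the weight vectors are all hypothetical.

```python
import math

# Hypothetical alternatives described by the four features named above.
route_a = {"cost": 12.0, "time": 55.0, "headway": 30.0, "interchanges": 1}
route_b = {"cost": 18.0, "time": 40.0, "headway": 15.0, "interchanges": 0}

def toy_grounding(route):
    """Assumed grounding: alignment is the negated feature (lower is better)."""
    return {name: -float(x) for name, x in route.items()}

def utility(route, value_system):
    """Weighted sum of alignments; value_system maps each name to a weight."""
    alignment = toy_grounding(route)
    return sum(value_system[name] * alignment[name] for name in value_system)

def preference_probability(first, second, value_system):
    """Bradley-Terry probability that `first` is preferred over `second`."""
    diff = utility(first, value_system) - utility(second, value_system)
    return 1.0 / (1.0 + math.exp(-diff))

# Two hypothetical value systems that weight the features differently.
cost_focused = {"cost": 1.0, "time": 0.0, "headway": 0.0, "interchanges": 0.0}
time_focused = {"cost": 0.0, "time": 1.0, "headway": 0.0, "interchanges": 0.0}

p_cost = preference_probability(route_a, route_b, cost_focused)
p_time = preference_probability(route_a, route_b, time_focused)
print(f"P(A over B | cost-focused) = {p_cost:.3f}")
print(f"P(A over B | time-focused) = {p_time:.3f}")
```

Under these assumptions the same pair of routes is ranked in opposite ways by the two value systems, which is the kind of disagreement that pairwise preference examples can reveal.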

There are three main executables:
- **`generate_dataset_one_shot_tasks.py`**: Generates the dataset for the experiments.
- **`train_vsl_non_sequential.py`**: Trains the reward models using the generated dataset. This script supports running multiple seeds in parallel.
- **`evaluate_results.py`**: Evaluates the trained models and generates plots to visualize the results.

This notebook is divided into three main sections:
1. **Dataset Generation**: Generates the Apollo dataset.
2. **Training**: Trains the reward models using a certain number of seeds in parallel.
3. **Evaluation**: Evaluates the results and displays the plots directly in the notebook.

## 1. Dataset Generation
In this section, we generate the Apollo dataset using the `generate_dataset_one_shot_tasks.py` script. This dataset will be used for training and evaluation in subsequent steps.

In [None]:
!python generate_dataset_one_shot_tasks.py --environment apollo --dataset_name apollo_data --seed 26

## 2. Training
In this section, we train the reward models using the `train_vsl_non_sequential.py` script. The training process runs `N_SEEDS` seeds in parallel, one process per seed. The cell below sets `N_SEEDS = 1`; raising it (for example to 5) trains with several seeds to check the robustness of the results.

In [None]:
N_SEEDS = 1

In [None]:
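import subprocess
import sys
from multiprocessing.pool import ThreadPool

def train_with_seed(seed):
    # Launch one training run as a subprocess and return its exit code
    cmd = [sys.executable, "train_vsl_non_sequential.py", "-dname", "apollo_data",
           "-ename", f"ecai_apollo_s{seed}", "--seed", str(seed),
           "--max_iter", "5000", "--environment", "apollo"]
    return subprocess.run(cmd).returncode

# List of seeds to run in parallel
seeds = [26 + i for i in range(N_SEEDS)]

# Run training in parallel. Threads are enough here because each worker only
# waits on a subprocess, and this avoids pickling a notebook-defined function.
with ThreadPool(len(seeds)) as pool:
    exit_codes = pool.map(train_with_seed, seeds)

# Report any run that exited with a non-zero status
for seed, code in zip(seeds, exit_codes):
    if code != 0:
        print(f"Seed {seed} failed with exit code {code}")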
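
Before launching a long training run, it can help to preview the exact commands and experiment names that each seed will produce. The sketch below is a dry run that only prints the commands and starts no training. The helper `build_train_command` and the constant `N_SEEDS_PREVIEW` are hypothetical names introduced for illustration; the flags and base seed are copied from the training cell above.

```python
import shlex

N_SEEDS_PREVIEW = 3  # hypothetical value, for illustration only
BASE_SEED = 26       # same base seed as the training cell above

def build_train_command(seed):
    """Return the training command for one seed as an argument list."""
    return [
        "python", "train_vsl_non_sequential.py",
        "-dname", "apollo_data",
        "-ename", f"ecai_apollo_s{seed}",
        "--seed", str(seed),
        "--max_iter", "5000",
        "--environment", "apollo",
    ]

preview_seeds = [BASE_SEED + i for i in range(N_SEEDS_PREVIEW)]
commands = [build_train_command(seed) for seed in preview_seeds]

# Print each command as a single shell-quoted line; nothing is executed.
for cmd in commands:
    print(shlex.join(cmd))
```

Each seed gets its own experiment name (`ecai_apollo_s26`, `ecai_apollo_s27`, and so on), so parallel runs do not write to the same experiment.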

## 3. Evaluation
In this section, we evaluate the trained models using the `evaluate_results.py` script. The evaluation will generate plots to visualize the results, and these plots will be displayed directly in the notebook.

In [None]:
!python evaluate_results.py --experiment_name ecai_apollo -lrcfrom  --show_plots