# Learning the value systems of societies from preferences - submitted for ECAI 2025
This notebook runs the experiments for the ECAI 2025 submission "Learning the value systems of societies from preferences". The paper presents a novel approach for learning the value systems (value-based preferences) and value groundings (domain-specific value alignment measures) of a society of agents or stakeholders from examples of pairwise preferences between alternatives in a decision-making domain.

In the paper we use the Apollo dataset ([apollo_swissRouteChoiceData](https://rdrr.io/cran/apollo/man/apollo_swissRouteChoiceData.html)), which records train route choices in Switzerland. The dataset includes features such as cost, time, headway, and interchanges, which are used to model agent preferences based on values. Although the approach also applies to sequential decision making, the paper focuses on the non-sequential use case that the Apollo dataset covers.
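To make the idea of a pairwise preference over these features concrete, the sketch below scores two train alternatives with a linear function of cost, time, headway, and interchanges, and turns the score difference into a preference probability with a Bradley-Terry (logistic) model. This is only an illustration: the feature values, the weights, the "lower is better" sign convention, and the function names are assumptions made here, and the paper's actual value groundings and value systems are learned by the training script rather than fixed by hand.

```python
import math

FEATURES = ("cost", "time", "headway", "interchanges")


def alignment(alternative, weights):
    """Linear score of an alternative (assumption: lower feature values are better)."""
    return -sum(weights[name] * alternative[name] for name in FEATURES)


def preference_probability(first, second, weights):
    """Bradley-Terry probability that `first` is preferred over `second`."""
    difference = alignment(first, weights) - alignment(second, weights)
    return 1.0 / (1.0 + math.exp(-difference))


# Hypothetical alternatives and weights, chosen only for illustration.
route_a = {"cost": 40, "time": 100, "headway": 30, "interchanges": 0}
route_b = {"cost": 50, "time": 120, "headway": 60, "interchanges": 1}
weights = {"cost": 0.05, "time": 0.02, "headway": 0.01, "interchanges": 0.3}

p_a_over_b = preference_probability(route_a, route_b, weights)
print(f"P(route_a preferred over route_b) = {p_a_over_b:.2f}")
```

Because `route_a` is better on every feature, any non-negative weighting gives it a probability above 0.5. Different agents would correspond to different weight vectors.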

There are three main executables:
- **`generate_dataset_one_shot_tasks.py`**: Generates the dataset for the experiments.
- **`train_vsl_non_sequential.py`**: Trains the reward models using the generated dataset. This script supports running multiple seeds in parallel.
- **`evaluate_results.py`**: Evaluates the trained models and generates plots to visualize the results.

This notebook is divided into three main sections:
1. **Dataset Generation**: Generates the Apollo dataset.
2. **Training**: Trains the reward models with `N_SEEDS` different seeds in parallel.
3. **Evaluation**: Evaluates the results and displays the plots directly in the notebook.

## 1. Dataset Generation
In this section, we generate the Apollo dataset using the `generate_dataset_one_shot_tasks.py` script. This dataset will be used for training and evaluation in subsequent steps.

In [1]:
BASE_SEED = 26 # Actual seed in the paper is 26
N_SEEDS = 5

In [4]:
import os
# Use the gentr flag to generate the information of trajectories/alternatives.
# Use the genpf flag to generate the preferences between trajectories/alternatives.
os.system(f'python generate_dataset_one_shot_tasks.py --environment apollo --dataset_name ecai_apollo --seed {BASE_SEED} -gentr -genpf')

Namespace(dataset_name='ecai_apollo', gen_trajs=True, gen_preferences=True, dtype=<class 'numpy.float32'>, algorithm='pc', config_file='algorithm_config.json', environment='apollo', seed=26, test_size=0.0, reward_epsilon=0.0)


Saving the dataset (1/1 shards): 100%|██████████| 18/18 [00:00<00:00, 1125.08 examples/s]

TESTING DATA COHERENCE. It is safe to stop this program now...


Dataset generated correctly.
CRETING DATASET FOR AGENTS:  ['14831', '21120', '18762', '15246', '17727', '14951', '14992', '17525', '14010', '20808', '23497', '22762', '20161', '20149', '19904', '79809', '21492', '22616', '15904', '12931', '23939', '9364', '13608', '14167', '18991', '20836', '19251', '16176', '15319', '15318', '18614', '13972', '16188', '15481', '14576', '23480', '17645', '12729', '17876', '20572', '22403', '14823', '17443', '17669', '12063', '19888', '17655', '15056', '17906', '22003', '23321', '14572', '76862', '22599', '21912', '12670', '19902', '16204', '15296', '13181', '18942', '22410', '18968', '18997', '21694', '12712', '21509', '20818', '13034', '16617', '13784', '21044', '19295', '77558', '19901', '12049', '22820', '12713', '19646', '19691', '20010', '12505', '22439', '22377', '21619', '20063', '12748', '82401', '15147', '22933', '80438', '18852', '16116', '16102', '14700', '16117', '18674', '14728', '12359', '78194', '22265', '23147', '17779', '14264', '23386

0
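The trailing `0` above is the exit status returned by `os.system`, which has to be inspected by hand. As a hedged alternative, the sketch below builds the same command as an argument list and runs it with `subprocess.run(..., check=True)`, which raises an exception if the script exits with a non-zero status. The helper names are hypothetical, and `sys.executable` is used in place of `python` on the assumption that the scripts should run with the notebook's own interpreter.

```python
import shlex
import subprocess
import sys


def build_generation_command(seed, dataset_name="ecai_apollo"):
    """Argument list equivalent to the os.system call used in this section."""
    return [
        sys.executable, "generate_dataset_one_shot_tasks.py",
        "--environment", "apollo",
        "--dataset_name", dataset_name,
        "--seed", str(seed),
        "-gentr",  # generate the trajectory/alternative information
        "-genpf",  # generate the preferences between alternatives
    ]


def run_checked(command):
    """Run the command; raises subprocess.CalledProcessError on failure."""
    subprocess.run(command, check=True)


generation_command = build_generation_command(26)
print(shlex.join(generation_command))
# run_checked(generation_command)  # uncomment to actually generate the dataset
```

Passing a list instead of a single shell string also avoids quoting problems when an argument contains spaces or quotes.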

## 2. Training
In this section, we train the reward models using the `train_vsl_non_sequential.py` script. We run the training process with `N_SEEDS` different seeds in parallel.

In [None]:
from multiprocessing import Pool

def train_with_seed(seed):
    import os
    # The -O option is important: it disables many costly debugging operations (assertions) in the code.
    os.system(f"python -O train_vsl_non_sequential.py --dataset_name ecai_apollo -ename ecai_test_s{seed} -s={seed} -e apollo -cf=algorithm_config_L3.json")

# Seeds to run in parallel: BASE_SEED, BASE_SEED + 1, ..., BASE_SEED + N_SEEDS - 1
seeds = [BASE_SEED + i for i in range(N_SEEDS)]

# Run training in parallel
with Pool(len(seeds)) as pool:
    pool.map(train_with_seed, seeds)

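On platforms where `multiprocessing` starts workers with the `spawn` method (the default on macOS and Windows), functions defined inside a notebook cell cannot be imported by the worker processes, so a `Pool` may stall. Since each worker only waits for an external training process, threads are enough. The following is a minimal sketch of that alternative, not the notebook's original method: `build_train_command` and `train_all` are hypothetical helpers, and the command-line flags are copied from the cell above.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def build_train_command(seed):
    """Argument list for one training run; -O disables costly debug checks."""
    return [
        sys.executable, "-O", "train_vsl_non_sequential.py",
        "--dataset_name", "ecai_apollo",
        "-ename", f"ecai_test_s{seed}",
        f"-s={seed}",
        "-e", "apollo",
        "-cf=algorithm_config_L3.json",
    ]


def train_all(seeds, runner=subprocess.call):
    """Launch one process per seed from a thread pool; return {seed: exit code}."""
    with ThreadPoolExecutor(max_workers=len(seeds)) as executor:
        exit_codes = executor.map(lambda seed: runner(build_train_command(seed)), seeds)
        return dict(zip(seeds, exit_codes))


# Dry run with a stand-in runner that launches nothing and reports success.
dry_run_codes = train_all([26, 27], runner=lambda command: 0)
print(dry_run_codes)
# exit_codes = train_all([26 + i for i in range(5)])  # real run, one process per seed
```

A non-zero value in the returned dictionary identifies the seed whose training run failed.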

## 3. Evaluation
In this section, we evaluate the trained models using the `evaluate_results.py` script. The evaluation will generate plots to visualize the results, and these plots will be displayed directly in the notebook.

In [None]:
import os

seed = BASE_SEED
# Experiment names must match the -ename values used during training (ecai_test_s<seed>).
experiments_all_seeds = ','.join([f"ecai_test_s{seed+i}" for i in range(N_SEEDS)])

os.system(f"python evaluate_results.py -ename ecai_test_{seed} --lrcfrom={experiments_all_seeds}")
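The list passed to `--lrcfrom` has to name the experiments that training actually produced. The sketch below derives those names from a single helper so that training and evaluation cannot drift apart. It assumes that `evaluate_results.py` looks experiments up by the `-ename` given to the training script (`ecai_test_s<seed>`), which this notebook does not confirm. The helper names are hypothetical.

```python
import sys


def experiment_name(seed, prefix="ecai_test_s"):
    """Name passed as -ename to the training script for a given seed (assumed pattern)."""
    return f"{prefix}{seed}"


def build_evaluation_command(base_seed, n_seeds, evaluation_name):
    """Argument list for evaluate_results.py over n_seeds consecutive seeds."""
    names = [experiment_name(base_seed + i) for i in range(n_seeds)]
    return [
        sys.executable, "evaluate_results.py",
        "-ename", evaluation_name,
        f"--lrcfrom={','.join(names)}",
    ]


evaluation_command = build_evaluation_command(26, 5, evaluation_name="ecai_test_26")
print(" ".join(evaluation_command))
```

The resulting list can be passed to `subprocess.run(evaluation_command, check=True)` to run the evaluation and fail loudly on errors.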