Pragmatic Policy Development via Interpretable Behavior Cloning

This repository contains the code used for the experiments in our ML4H 2025 paper Pragmatic Policy Development via Interpretable Behavior Cloning.

Installation

First clone the following repositories from GitHub:

git clone https://github.com/antmats/ppdev.git
git clone https://github.com/antmats/ReassessDTR.git

The ReassessDTR repository is used for training the offline reinforcement learning policies included in our experiments.

We use pixi as our package manager. Before installing dependencies, make sure pixi is installed by following the instructions here.

Once pixi is installed, run the following command in each of the cloned repositories:

pixi install
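As a convenience, the two installs can be wrapped in a small shell function. This is only a sketch: the name install_envs is hypothetical, and it assumes both repositories were cloned side by side into the same parent directory.

```shell
# Hypothetical helper: run `pixi install` in each cloned repository.
# Assumes ppdev and ReassessDTR live next to each other under the
# parent directory given as the first argument (default: current directory).
install_envs() {
  parent="${1:-.}"
  for repo in ppdev ReassessDTR; do
    # Run in a subshell so the caller's working directory is unchanged.
    (cd "$parent/$repo" && pixi install) || return 1
  done
}
```

For example, if both repositories were cloned into your home directory, call install_envs "$HOME".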

Configuration files

For each experiment (rheumatoid arthritis (RA) and sepsis), a corresponding configuration file is provided in the configs directory. These configuration files specify details such as the path to the dataset, the evaluation metrics to be used, and the directory where the results will be saved.

Data

RA

To preprocess the RA data and extract the relevant patient cohort, run:

pixi run scripts/make_data.py --config_path configs/ra.yml --extract_cohort

Note that the RA data are available from CorEvitas, LLC through a commercial subscription agreement and are not publicly available.

Sepsis

The Sepsis data were preprocessed as described in Komorowski et al. (2018). To obtain the preprocessed dataset, follow the instructions provided here.

Experiments

The results and figures presented in the paper are available in this notebook.

We conducted our experiments on the Alvis cluster using Slurm for workload management. If you have access to a cluster that uses Slurm, you can reproduce our experiments by following the steps below. Note that you will need to update some variables at the top of the bash scripts used for launching jobs (e.g., Slurm accounting information).

Containers

We use Apptainer containers to run the code. To create a container, copy the file container.def to a storage directory with sufficient space. Then, assuming the ppdev repository is cloned to your home directory and you are located in the storage directory, run the following command:

apptainer build --bind $HOME:/mnt ppdev_env.sif container.def

To verify the container, run the following command:

apptainer exec ppdev_env.sif python --version

A separate container for the ReassessDTR repository is needed and can be created in the same way.

Reproducing the RA experiment

Run the following commands to reproduce the results for the RA experiment.

cd ~/ppdev
./scripts/slurm/estimator_fit_all.sh seeds.csv configs/ra.yml dt dt_switch dt_stage_switch rnn

The command above creates an experiment directory. Assume its path is stored in the variable experiment_dir.
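One way to set this variable in your shell is sketched below. The helper name prepare_experiment_dir is hypothetical and the path in the usage note is a placeholder; substitute the directory that estimator_fit_all.sh actually created. The helper also makes sure a logs subdirectory exists, since Slurm does not create missing directories for the --output pattern.

```shell
# Hypothetical helper: validate an experiment directory and print its path.
# Fails if the path is not an existing directory; otherwise ensures that
# the logs/ subdirectory used by the sbatch --output patterns is present.
prepare_experiment_dir() {
  dir="$1"
  if [ ! -d "$dir" ]; then
    echo "Not a directory: $dir" >&2
    return 1
  fi
  mkdir -p "$dir/logs"
  printf '%s\n' "$dir"
}
```

Usage, with a placeholder path: experiment_dir="$(prepare_experiment_dir /path/to/experiment)"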

Next, train the reinforcement learning policies.

sbatch --output="${experiment_dir}/logs/%x_%A_%a.out" --job-name="fit_rl_policy" --array=1-50 scripts/slurm/rl_policy_fit.sh "$experiment_dir"
cd ~/ReassessDTR
sbatch --output="${experiment_dir}/logs/%x_%A_%a.out" --job-name="rl_ra" --array=1-50 run_ra_experiment.sh "${experiment_dir}"

After all jobs have completed, run off-policy evaluation.

cd ~/ppdev
./scripts/slurm/ope_run_all.sh "$experiment_dir" dt_stage_switch
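To check that the array jobs submitted above have left the queue before launching off-policy evaluation, one option is to poll squeue. The following is a sketch and not part of the repository: count_jobs and wait_for_jobs are hypothetical helpers, and the job names are the ones passed to --job-name above.

```shell
# Hypothetical helper: count the current user's queued or running jobs
# whose name matches the given comma-separated list of job names.
count_jobs() {
  squeue --noheader --user="$(id -un)" --name="$1" | wc -l | tr -d ' '
}

# Hypothetical helper: block until no matching jobs remain in the queue.
# The second argument is the polling interval in seconds (default: 60).
wait_for_jobs() {
  names="$1"
  interval="${2:-60}"
  while [ "$(count_jobs "$names")" -gt 0 ]; do
    sleep "$interval"
  done
}
```

For the RA experiment this could be called as wait_for_jobs "fit_rl_policy,rl_ra"; for the Sepsis experiment, replace rl_ra with rl_sepsis. Note that this only confirms the jobs are no longer queued or running, not that they succeeded, so the logs in the experiment directory are still worth inspecting.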

Reproducing the Sepsis experiment

Run the following commands to reproduce the results for the Sepsis experiment.

cd ~/ppdev
./scripts/slurm/estimator_fit_all.sh seeds.csv configs/sepsis.yml dt dt_switch dt_stage_switch rnn

Assume the experiment directory path is stored in the variable experiment_dir. Train the reinforcement learning policies using the following commands.

sbatch --output="${experiment_dir}/logs/%x_%A_%a.out" --job-name="fit_rl_policy" --array=1-50 scripts/slurm/rl_policy_fit.sh "$experiment_dir"
cd ~/ReassessDTR
sbatch --output="${experiment_dir}/logs/%x_%A_%a.out" --job-name="rl_sepsis" --array=1-50 run_mimic_experiment.sh "${experiment_dir}"

After all jobs have completed, run off-policy evaluation.

cd ~/ppdev
./scripts/slurm/ope_run_all.sh "$experiment_dir" dt_stage_switch

Notes

Training reinforcement learning policies using the ReassessDTR repository requires a Weights & Biases account. An API key must be set in the scripts run_ra_experiment.sh and run_mimic_experiment.sh.

For the Sepsis experiment, the data must be saved as a zip file named mimictable.zip in this directory.
