Accepted at ICML 2023.
Paper: https://arxiv.org/abs/2302.14015
Install the basic requirements with conda/mamba, e.g. by running

```bash
conda env create -f env.yml
```

Make sure you are installing the correct torch version (CPU vs GPU).
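A quick way to check which torch build ended up in the environment (standard PyTorch API):

```python
import torch

# Report the installed build and whether a CUDA device is visible.
print(torch.__version__)
print("CUDA available:", torch.cuda.is_available())
```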
To reproduce the experiments in the paper, follow the instructions below. The first example uses 4 discrete treatments.

MODEL

Each of the four treatments has its own reward (outcome) likelihood; see the paper for the exact specification.
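Purely as an illustration (the names and functional forms below are assumptions, not the repository's model), a simulator with a scalar context, four discrete treatments, and Gaussian reward likelihoods might look like this:

```python
import torch

def sample_reward(context, treatment, psi):
    """Illustrative reward likelihood with one parameter pair per treatment.

    context:   [batch] scalar contexts
    treatment: [batch] long tensor with values in {0, 1, 2, 3}
    psi:       [4, 2] per-treatment slope and intercept, drawn from a prior
    """
    slope, intercept = psi[treatment, 0], psi[treatment, 1]
    mean = slope * context + intercept          # treatment-specific mean reward
    return mean + 0.1 * torch.randn_like(mean)  # Gaussian observation noise
```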
Run an experiment
The model is implemented as the `ContinuousContextDiscreteTreatment` class, defined in `bayesian_simulators/continuous_scalar_context_discrete_treatment.py`.
The training loop is implemented in `research_experiments/discrete_design_main.py`. In addition to the training loop (`train_net`), that script also:
- defines a function to compute UCB designs: `compute_ucb_design` (an illustrative sketch of such a rule follows this list);
- defines a function to calculate regret: `calculate_regret`. The "real process" is `ContinuousContextDiscreteTreatment` with a fixed realisation of the parameters $\psi$, drawn from the prior; in the code, the real-world simulator is distinguished from the model emulator by writing the former in ALL CAPS;
- defines `main`, which defines a prior, sets up logging, runs all of the above (training loop, regret evaluation), stores the outputs and creates some plots.
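As a rough sketch of what a UCB design rule does (not the repository's `compute_ucb_design`, whose signature and details may differ), each treatment is scored by its posterior mean plus a scaled posterior standard deviation:

```python
import torch

def ucb_design(post_mean, post_std, level):
    """Pick, per context, the treatment maximising mean + level * std.

    post_mean, post_std: [num_contexts, num_treatments] posterior summaries
    level:               exploration coefficient (cf. the --ucb-baseline flag)
    """
    scores = post_mean + level * post_std
    return scores.argmax(dim=-1)  # one treatment index per context
```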
To run CO-BED:

```bash
python discrete_design_main.py \
  --batch-size 512 \
  --hidden-dim 512 \
  --encoding-dim 16 \
  --lr 0.001 \
  --gamma 0.9 \
  --num-jobs -1 \
  --num-steps 500 \
  --design-dim 10 \
  --seed-reps 5 \
  --num-true-models-to-sample 3 \
  --tau 5 \
  --optimise-design \
  --device <cpu/cuda>
```
To run the random baseline, simply remove the `--optimise-design` flag:
```bash
python discrete_design_main.py \
  --batch-size 512 \
  --hidden-dim 512 \
  --encoding-dim 16 \
  --lr 0.001 \
  --gamma 0.9 \
  --num-jobs -1 \
  --num-steps 500 \
  --design-dim 10 \
  --seed-reps 5 \
  --num-true-models-to-sample 3 \
  --tau 5 \
  --device <cpu/cuda>
```
To run the UCB baseline, use the `--ucb-baseline` flag, set to the appropriate level:
```bash
python discrete_design_main.py \
  --batch-size 512 \
  --hidden-dim 512 \
  --encoding-dim 16 \
  --lr 0.001 \
  --gamma 0.9 \
  --num-jobs -1 \
  --num-steps 500 \
  --design-dim 10 \
  --seed-reps 5 \
  --num-true-models-to-sample 3 \
  --tau 5 \
  --ucb-baseline 0.0 \
  --device <cpu/cuda>
```
To run the Thompson sampling baseline, use the `--thompson-sampling-baseline` flag:
```bash
python discrete_design_main.py \
  --batch-size 512 \
  --hidden-dim 512 \
  --encoding-dim 16 \
  --lr 0.001 \
  --gamma 0.9 \
  --num-jobs -1 \
  --num-steps 500 \
  --design-dim 10 \
  --seed-reps 5 \
  --num-true-models-to-sample 3 \
  --tau 5 \
  --thompson-sampling-baseline \
  --device <cpu/cuda>
```
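For intuition, a Thompson sampling step draws a single realisation from the current posterior and acts greedily under it. A minimal sketch (the repository's implementation may differ):

```python
import torch

def thompson_step(posterior_reward_draws):
    """One Thompson-sampling action choice.

    posterior_reward_draws: [num_draws, num_treatments] predicted reward per
                            treatment under each posterior draw of the model
                            parameters
    """
    i = torch.randint(posterior_reward_draws.shape[0], (1,)).item()
    return posterior_reward_draws[i].argmax()  # greedy under the sampled model
```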
MODEL

For the continuous treatment example we use a model with a scalar context and a scalar treatment; see the paper for the exact specification.
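Again purely as an illustration (the functional form below is an assumption, not the repository's model), a smooth reward surface over a scalar context and a scalar treatment could look like:

```python
import torch

def sample_reward(context, treatment, psi):
    """Hypothetical smooth reward surface over (context, treatment) tensors;
    the actual likelihood and parameters psi are specified in the paper."""
    mean = psi[0] * context + psi[1] * treatment - psi[2] * treatment ** 2
    return mean + 0.1 * torch.randn_like(mean)  # Gaussian observation noise
```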
Run an experiment
The model is implemented as the `ContinuousContextAndTreatment` class, defined in `bayesian_simulators/continuous_scalar_context_scalar_treatment.py`.
The training loop is implemented in `research_experiments/continuous_design_main.py`. In addition to the training loop (`train_net`), that script also:
- defines a function to compute UCB designs: `compute_ucb_design`;
- defines a function to calculate regret: `calculate_regret` (an illustrative sketch of a regret calculation follows this list). The "real process" is `ContinuousContextAndTreatment` with a fixed realisation of the parameters $\psi$, drawn from the prior; in the code, the real-world simulator is distinguished from the model emulator by writing the former in ALL CAPS;
- defines `main`, which defines a prior, sets up logging, runs all of the above (training loop, regret evaluation), stores the outputs and creates some plots.
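For reference, regret is typically measured against the best action under the fixed "real" parameters. A minimal sketch of such a calculation (not necessarily the repository's `calculate_regret`):

```python
import torch

def regret(true_expected_rewards, chosen):
    """Per-context regret against the best action under the real parameters.

    true_expected_rewards: [num_contexts, num_treatments] expected reward
                           under the fixed real-world realisation of psi
    chosen:                [num_contexts] long tensor of selected treatments
    """
    best = true_expected_rewards.max(dim=-1).values
    picked = true_expected_rewards.gather(-1, chosen.unsqueeze(-1)).squeeze(-1)
    return best - picked
```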
To run CO-BED:

```bash
python continuous_design_main.py \
  --batch-size 512 \
  --hidden-dim 512 \
  --encoding-dim 16 \
  --lr 0.001 \
  --gamma 0.9 \
  --num-jobs -1 \
  --num-steps 500 \
  --design-dim 10 \
  --seed-reps 5 \
  --num-true-models-to-sample 3 \
  --optimise-design \
  --device <cpu/cuda>
```
To run the random baseline, simply remove the `--optimise-design` flag:
```bash
python continuous_design_main.py \
  --batch-size 512 \
  --hidden-dim 512 \
  --encoding-dim 16 \
  --lr 0.001 \
  --gamma 0.9 \
  --num-jobs -1 \
  --num-steps 500 \
  --design-dim 10 \
  --seed-reps 5 \
  --num-true-models-to-sample 3 \
  --device <cpu/cuda>
```
To run the UCB baseline, use the `--ucb-baseline` flag, set to the appropriate level:
```bash
python continuous_design_main.py \
  --batch-size 512 \
  --hidden-dim 512 \
  --encoding-dim 16 \
  --lr 0.001 \
  --gamma 0.9 \
  --num-jobs -1 \
  --num-steps 500 \
  --design-dim 10 \
  --seed-reps 5 \
  --num-true-models-to-sample 3 \
  --ucb-baseline 0.0 \
  --device <cpu/cuda>
```
To run the Thompson sampling baseline, use the `--thompson-sampling-baseline` flag:
```bash
python continuous_design_main.py \
  --batch-size 512 \
  --hidden-dim 512 \
  --encoding-dim 16 \
  --lr 0.001 \
  --gamma 0.9 \
  --num-jobs -1 \
  --num-steps 500 \
  --design-dim 10 \
  --seed-reps 5 \
  --num-true-models-to-sample 3 \
  --thompson-sampling-baseline \
  --device <cpu/cuda>
```
If you want to learn a batch of $D=60$ experiments, modify `design_dim` accordingly, e.g.:

```python
design_dim: 60
```
Logging is done with MLflow. To start a local tracking server, run

```bash
mlflow ui
```

and navigate to the appropriate experiment to view all logged metrics.
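If you prefer to inspect runs programmatically rather than through the UI, the standard MLflow client API works; for example (assuming a recent MLflow and the default local tracking store, `./mlruns`, which is the same store `mlflow ui` serves):

```python
import mlflow

# List all experiments recorded in the local tracking store.
client = mlflow.tracking.MlflowClient()
for exp in client.search_experiments():
    print(exp.experiment_id, exp.name)
```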
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party's policies.