This is the codebase accompanying our ICLR 2025 paper "q-exponential Family for Policy Optimization".
It includes the Gaussian, heavy-tailed, and light-tailed distributions.
All run statistics are logged to a MySQL database server. The schema can be found in configs/schema/default-schema.yaml.
- This codebase uses some features that are only available in Python 3.10+.
- Install requirements:
  pip install -r requirements.txt
- Add the correct db credentials to configs/db/credentials.yaml. You can log the results to a local database or to a Google Cloud server (recommended). I would recommend making credentials-local.yaml and adding it to .gitignore.
- Set db_prefix to your username in configs/config.yaml. This is especially required if hosting on CC because they only allow you to make databases that start with your CC username.
- Run the following command for a sweep of 2400 runs as defined in configs/config.yaml. The args.run argument is the run id for the first experiment in the sweep. Other experiments are automatically assigned a run id of args.run + sweep_id in the database.
  python main.py run=0
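The run id assignment described above can be sketched as follows. This is a minimal illustration, assuming sweep ids are consecutive integers starting at 0; the helper name sweep_run_ids is hypothetical and not part of this codebase.

```python
def sweep_run_ids(first_run: int, num_runs: int) -> list[int]:
    """Return the database run ids for one sweep.

    Assumes sweep_id ranges over 0 .. num_runs - 1 (an assumption, not
    confirmed by the codebase); each run id is first_run + sweep_id.
    """
    if num_runs < 0:
        raise ValueError("num_runs must be non-negative")
    return [first_run + sweep_id for sweep_id in range(num_runs)]


if __name__ == "__main__":
    # A sweep of 2400 runs launched with run=0.
    ids = sweep_run_ids(first_run=0, num_runs=2400)
    print(ids[0], ids[-1], len(ids))
```

Under this assumption, launching a second sweep with run=2400 would keep its run ids from overlapping with those of the first sweep.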
Generate scripts:
cd configs/
python config_v0.py
chmod +x scripts/tasks_*
cd ..
Run the script from the root.
Test one run:
./configs/scripts/tasks_0.sh
All runs:
parallel ./configs/scripts/tasks_{}.sh ::: $(seq 0 6224)
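On machines without GNU parallel, the same fan-out can be approximated in Python. This is a minimal sketch, assuming the generated scripts are named tasks_0.sh through tasks_6224.sh as in the command above; the helper names and the default worker count are illustrative and not part of this codebase.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def task_scripts(last_index: int, script_dir: str = "./configs/scripts") -> list[str]:
    """Paths tasks_0.sh .. tasks_<last_index>.sh, mirroring `seq 0 <last_index>`."""
    return [f"{script_dir}/tasks_{i}.sh" for i in range(last_index + 1)]


def run_script(path: str) -> int:
    """Run one script and return its exit code."""
    return subprocess.run([path], check=False).returncode


def run_all(paths: list[str], max_workers: int = 4) -> list[int]:
    """Run the scripts concurrently and return their exit codes in order.

    Threads are enough here because each job runs in its own subprocess.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_script, paths))


# Example, run from the repository root:
#   exit_codes = run_all(task_scripts(6224))
#   print(sum(code != 0 for code in exit_codes), "scripts failed")
```

max_workers caps how many scripts run at once; a reasonable value depends on how many cores and how much memory each run needs.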
If you find this code helpful, please consider citing our paper:
@inproceedings{Zhu2025-qExpPolicy,
title={q-exponential Family for Policy Optimization},
author={Lingwei Zhu and Haseeb Shah and Han Wang and Yukie Nagai and Martha White},
booktitle={International Conference on Learning Representations (ICLR)},
year={2025},
}