q-exponential family for policy optimization

This is the codebase accompanying our ICLR 2025 paper, "q-exponential Family for Policy Optimization".

We include the Gaussian, heavy-tailed, and light-tailed distributions.

online results

All run statistics are logged to a MySQL database server. The schema can be found in configs/schema/default-schema.yaml.

How to run

This codebase uses features that are only available in Python 3.10+.
Install requirements:

pip install -r requirments.txt

Add the correct database credentials to configs/db/credentials.yaml. You can log results to a local database or to a Google Cloud server (recommended). We recommend creating credentials-local.yaml and adding it to .gitignore so that credentials are not committed.
Set db_prefix to your username in configs/config.yaml. This is required when hosting on CC, which only allows you to create databases whose names start with your CC username.
Run the following command to launch the sweep of 2400 runs defined in configs/config.yaml. The run argument (args.run) is the run id of the first experiment in the sweep; every other experiment is automatically assigned the run id args.run + sweep_id in the database.

python main.py run=0
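
The run id rule described above can be sketched as follows. This is only an illustration, not code from the repository: the helper name sweep_run_ids is hypothetical, and it assumes sweep_id is zero-based and contiguous, which the README does not state explicitly.

```python
def sweep_run_ids(first_run: int, num_runs: int) -> list[int]:
    """Return the database run ids for a sweep.

    Assumes (not confirmed by the repository) that sweep_id counts
    0, 1, ..., num_runs - 1, so each id is first_run + sweep_id.
    """
    if first_run < 0 or num_runs < 0:
        raise ValueError("first_run and num_runs must be non-negative")
    return [first_run + sweep_id for sweep_id in range(num_runs)]


# With run=0 and a sweep of 2400 runs, the ids span 0..2399.
ids = sweep_run_ids(first_run=0, num_runs=2400)
print(ids[0], ids[-1], len(ids))
```

Under the same assumption, a second sweep that should not collide with the first in the database would start at run=2400.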

offline results

Generate scripts:

cd configs/
python config_v0.py
chmod +x scripts/tasks_*
cd ..

Run the scripts from the repository root.

Test one run:

./configs/scripts/tasks_0.sh

All runs:

parallel ./configs/scripts/tasks_{}.sh ::: $(seq 0 6224)
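
If GNU parallel is not available, the same fan-out could be approximated with the Python standard library. The sketch below is an assumption-laden alternative, not part of the repository: the helper names (task_script, run_task, run_all) and the default of 4 workers are hypothetical, and it assumes the generated scripts live in configs/scripts and are executable, as in the steps above.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def task_script(index: int, scripts_dir: str = "configs/scripts") -> Path:
    """Path of the generated script for one task index."""
    return Path(scripts_dir) / f"tasks_{index}.sh"


def run_task(index: int) -> int:
    """Run one task script and return its exit code."""
    return subprocess.run([str(task_script(index))], check=False).returncode


def run_all(first: int, last: int, max_workers: int = 4) -> list[int]:
    """Run tasks first..last (inclusive); return the indices that failed."""
    indices = range(first, last + 1)
    # Threads are enough here: each worker only waits on a child process.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        codes = list(pool.map(run_task, indices))
    return [i for i, code in zip(indices, codes) if code != 0]


# Example (run from the repository root):
# failed = run_all(0, 6224)
# print(f"{len(failed)} task(s) failed: {failed}")
```

Returning the failed indices makes it possible to rerun only those tasks instead of the whole range.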

citing

If you find this code helpful, please consider citing our paper:

@inproceedings{Zhu2025-qExpPolicy,
title={q-exponential Family for Policy Optimization},
author={Lingwei Zhu and Haseeb Shah and Han Wang and Yukie Nagai and Martha White},
booktitle={International Conference on Learning Representations (ICLR)},
year={2025},
}
