lingweizhu/qexp

q-exponential family for policy optimization

This is the codebase accompanying our ICLR 2025 paper, "q-exponential Family for Policy Optimization".

We include Gaussian, heavy-tailed, and light-tailed distributions.

online results

All run statistics are logged to a MySQL database server. The schema can be found in configs/schema/default-schema.yaml.

How to run

  1. This codebase uses some features that are only available in Python 3.10+.

  2. Install requirements:

pip install -r requirments.txt
  3. Add the correct db credentials to configs/db/credentials.yaml. You can log the results to a local database or to a Google Cloud server (recommended). I would recommend making credentials-local.yaml and adding it to .gitignore.

  4. Set db_prefix to your username in configs/config.yaml. This is required if hosting on CC, because they only allow you to create databases whose names start with your CC username.

  5. Run the following command for a sweep of 2400 runs as defined in configs/config.yaml. The args.run argument is the run id for the first experiment in the sweep. Other experiments are automatically assigned a run id of args.run + sweep_id in the database.

python main.py run=0
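
The run id arithmetic described in the last step can be sketched as below. This is only an illustration: `sweep_run_ids` is a hypothetical helper that does not exist in this repo, and it assumes that sweep_id counts from 0 up to the sweep size minus one.

```python
def sweep_run_ids(first_run: int, sweep_size: int = 2400) -> range:
    """Run ids stored in the database for one sweep.

    Assumption (not confirmed by the repo): sweep_id goes 0..sweep_size-1,
    so experiment k gets run id first_run + k.
    """
    return range(first_run, first_run + sweep_size)


ids = sweep_run_ids(0)
print(ids[0], ids[-1], len(ids))  # 0 2399 2400
```

Under this assumption, a second sweep started with run=2400 would not collide with the ids of the first one.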

offline results

Generate scripts:

cd configs/
python config_v0.py
chmod +x scripts/tasks_*
cd ..

Run the scripts from the repository root.

Test one run:

./configs/scripts/tasks_0.sh

All runs:

parallel ./configs/scripts/tasks_{}.sh ::: $(seq 0 6224)
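
If GNU parallel is not available, or you only want to run a slice of the tasks, something like the following may work. It is a rough sketch, not part of this repo: `task_scripts` and `run_scripts` are hypothetical helpers, the script naming is taken from the command above, and the worker count is an arbitrary choice.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def task_scripts(start: int = 0, stop: int = 6224) -> list[str]:
    """Paths tasks_<start>.sh .. tasks_<stop>.sh, inclusive like `seq`."""
    return [f"./configs/scripts/tasks_{i}.sh" for i in range(start, stop + 1)]


def run_scripts(scripts: list[str], workers: int = 4) -> list[int]:
    """Run scripts (from the repo root) with at most `workers` at once.

    Returns the exit code of each script, in the same order as `scripts`.
    """
    def run_one(path: str) -> int:
        return subprocess.run([path], check=False).returncode

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_one, scripts))


print(len(task_scripts()))  # 6225 scripts, indices 0..6224
```

For example, `run_scripts(task_scripts(0, 9))` would run only the first ten task scripts.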

citing

If you find this code helpful, please consider citing our paper:

@inproceedings{Zhu2025-qExpPolicy,
title={q-exponential Family for Policy Optimization},
author={Lingwei Zhu and Haseeb Shah and Han Wang and Yukie Nagai and Martha White},
booktitle={International Conference on Learning Representations (ICLR)},
year={2025},
}
