Reflex: Reinforcement Learning with Reflection Symmetry Exploitation in State-Based Continuous Control
This project follows the CleanRL style and uses modules from the CleanRL ecosystem.
- CleanRL repository: https://github.com/vwxyzjn/cleanrl
- CleanRL documentation: https://docs.cleanrl.dev/
Please follow the CleanRL installation instructions to install dependencies.
Notes:
- If you run dm_control/* tasks, make sure shimmy[dm-control] is installed.
Note: CPU-only execution is recommended for faster and more stable experiments, because in state-based RL most of the runtime is spent on environment interaction rather than policy optimization. We used a Mac M4 (10 CPU cores, 32 GB RAM) in practice.
run_exp.py is the experiment launcher. By default, it uses multiprocessing to train multiple random seeds in parallel.
Run experiments with:
python run_exp.py -c configs/baseline_sac.yaml
python run_exp.py -c configs/reflex_sac.yaml
Use different config files to run different experiment settings.
You can disable multiprocessing and run with a single process:
# Use the first seed listed in the config file
python run_exp.py -c configs/baseline_sac.yaml --single-process
run_exp.py reads these fields from the selected config file:
- model: training script path
- environments: environment list
- seeds: random seed list
- total_timesteps: per-environment training steps
So selecting a different file in configs/ will run a different training script and experiment setup.
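
As a rough illustration of how these four fields could combine into individual training runs, the sketch below expands a config into one job per (environment, seed) pair. It is not the actual run_exp.py logic: the `expand_jobs` helper, the placeholder script path, and the example values are all hypothetical, and it assumes that single-process mode keeps only the first seed while still covering every listed environment.

```python
from itertools import product

# Hypothetical config contents. The field names come from the list above;
# the values (script path, environment IDs, seeds, step count) are made up.
example_config = {
    "model": "path/to/train_script.py",
    "environments": ["dm_control/cheetah-run-v0", "dm_control/walker-walk-v0"],
    "seeds": [1, 2, 3],
    "total_timesteps": 1_000_000,
}


def expand_jobs(config, single_process=False):
    """Return one job dict per (environment, seed) pair.

    Assumption: in single-process mode only the first listed seed is used.
    """
    seeds = list(config["seeds"])
    if single_process:
        seeds = seeds[:1]
    return [
        {
            "model": config["model"],
            "env": env,
            "seed": seed,
            "total_timesteps": config["total_timesteps"],
        }
        for env, seed in product(config["environments"], seeds)
    ]


if __name__ == "__main__":
    for job in expand_jobs(example_config):
        print(job)
```

Under these assumptions, a launcher would hand each job to a worker process by default, or run the reduced job list sequentially when multiprocessing is disabled.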