We are currently designing safety tasks for safe RL based on MuJoCo environments and implementing more algorithms in the benchmark.
Figure 1. Safety-MuJoCo environments: SafetyWalker-v4 (a), SafetyHumanoidStandup-v4 (b), SafetyReacher-v4 (c), SafetyHopper-v4 (d), SafetyAnt-v4 (e), SafeHalfCheetah-v4 (f), SafetyPusher-v4 (g), and SafetyHumanoid-v4 (h).
If you find the repository useful, please cite the paper:
@inproceedings{gu2024balance,
title={Balance Reward and Safety Optimization for Safe Reinforcement Learning: A Perspective of Gradient Manipulation},
author={Gu, Shangding and Sel, Bilgehan and Ding, Yuhao and Wang, Lu and Lin, Qingwei and Jin, Ming and Knoll, Alois},
booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
volume={38},
number={19},
pages={21099--21106},
year={2024}
}