[Bug Report] Maze/AntMaze issues #155
Comments
Your test for sparse rewards is not correct, since using random actions is very unlikely to get you to the terminal state. Does this also affect the |
Thanks for taking a look at this issue.
Gymnasium-Robotics/gymnasium_robotics/envs/maze/ant_maze.py Lines 277 to 289 in d0a10dc
and comparing to the PointMaze step: Gymnasium-Robotics/gymnasium_robotics/envs/maze/point_maze.py Lines 379 to 391 in d0a10dc
This issue previously existed for PointMaze. I’m proposing to make the same fix as commit ace181e, but for AntMaze.
These changes would require a new revision ("AntMaze_UMaze-v4").
@alexdavey feel free to make a PR.
Hi, I believe I have found a couple of issues in the Maze/AntMaze environments. I have resolved both of these issues in commit 5573d5e, and I’m happy to submit a PR.
1) AntMaze sparse reward always zero
For continuing tasks in AntMazeEnv, the sparse reward is always zero. This is because at each step, .compute_terminated() resets the goal when the Ant is sufficiently close, before the reward is calculated.
Here’s a code example to test this:
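As a standalone illustration of the ordering problem, here is a minimal toy sketch. It is not the AntMazeEnv code: `ToyGoalEnv`, its 1-D dynamics, the two alternating goals, and the `reward_first` switch are all hypothetical stand-ins, and only the method names `compute_reward` and `compute_terminated` and the 0.45 radius mirror the report.

```python
class ToyGoalEnv:
    """Toy 1-D continuing task (hypothetical stand-in, not AntMazeEnv)."""

    GOAL_RADIUS = 0.45  # radius quoted in the report

    def __init__(self, reward_first):
        self.reward_first = reward_first
        self.pos = 0.0
        self.goals = [1.0, -1.0]  # the goal alternates between these two points
        self.goal_idx = 0

    @property
    def goal(self):
        return self.goals[self.goal_idx]

    def compute_reward(self):
        # Sparse reward: 1 only while the agent is inside the goal radius.
        return 1.0 if abs(self.pos - self.goal) <= self.GOAL_RADIUS else 0.0

    def compute_terminated(self):
        # Continuing task: never terminates, but moves the goal once reached.
        if abs(self.pos - self.goal) <= self.GOAL_RADIUS:
            self.goal_idx = 1 - self.goal_idx
        return False

    def step(self, action):
        self.pos += action
        if self.reward_first:
            reward = self.compute_reward()
            terminated = self.compute_terminated()
        else:
            # Problematic order: the goal has already moved when the reward is computed.
            terminated = self.compute_terminated()
            reward = self.compute_reward()
        return reward, terminated


def total_reward(reward_first, n_steps=40):
    env = ToyGoalEnv(reward_first)
    total = 0.0
    for _ in range(n_steps):
        action = 0.25 if env.goal > env.pos else -0.25  # always walk toward the goal
        reward, _ = env.step(action)
        total += reward
    return total


print("terminated before reward:", total_reward(reward_first=False))
print("reward before terminated:", total_reward(reward_first=True))
```

In this toy, even a policy that walks straight to every goal collects nothing when the goal is reset before the reward is computed, while swapping the two calls yields a positive return.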
Currently this returns exactly zero since no reward is collected, but placing .compute_reward() above .compute_terminated() gives a non-zero reward. In PointMaze, this issue was fixed in commit ace181e.
2) AntMaze can reset into a terminal state
The Ant will sometimes start within the goal radius. This is because there is a maze_size_scaling factor missing in the distance check in MazeEnv.generate_reset_pos().
In AntMaze, maze_size_scaling = 4, so the xy position noise can be up to 1.0 in each direction. An unfortunate combination of goal noise and reset noise can cause the Ant to start within 0.45 of self.goal. This issue does not affect PointMaze because there maze_size_scaling = 1.
Here’s a code example to test this:
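The arithmetic behind this can be checked without the simulator. The sketch below is a hand-worked example under stated assumptions, not the library code: the helper `too_close` is hypothetical, the unscaled rejection threshold of 0.5 is an assumption about the existing check, and the noise values are picked by hand within the 1.0 bound mentioned above. It shows a goal and a reset position drawn from the same maze cell.

```python
import math

GOAL_RADIUS = 0.45  # radius quoted in the report


def too_close(reset_xy, goal_xy, maze_size_scaling, scale_threshold):
    """Hypothetical stand-in for the distance check: True means 'reject this reset cell'.

    The base threshold of 0.5 is an assumption, not taken from the report.
    """
    threshold = 0.5 * maze_size_scaling if scale_threshold else 0.5
    return math.dist(reset_xy, goal_xy) <= threshold


# AntMaze-like case: maze_size_scaling = 4, noise up to 1.0 per axis.
cell_center = (0.0, 0.0)
goal = (0.6, 0.6)    # cell center plus hand-picked goal noise
start = (0.5, 0.5)   # same cell center plus hand-picked reset noise

unscaled_rejects = too_close(cell_center, goal, 4, scale_threshold=False)
scaled_rejects = too_close(cell_center, goal, 4, scale_threshold=True)
starts_in_goal = math.dist(start, goal) <= GOAL_RADIUS

# PointMaze-like case: maze_size_scaling = 1, so noise is at most 0.25 per axis
# (assuming the noise bound scales linearly with maze_size_scaling).
point_goal = (0.25, 0.25)  # worst-case goal noise
point_rejects = too_close(cell_center, point_goal, 1, scale_threshold=False)

print("unscaled check rejects the goal's own cell:", unscaled_rejects)
print("scaled check rejects the goal's own cell:", scaled_rejects)
print("resulting start is inside the goal radius:", starts_in_goal)
print("PointMaze-like unscaled check rejects:", point_rejects)
```

Under these assumptions, the unscaled check lets the goal's own cell through when the scaling is 4, because the goal noise alone can exceed the threshold, whereas with a scaling of 1 the worst-case noise stays below it.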