[Question] Algorithm / Parameters for Ant Maze #193
Comments
Hey, it sounds like your agent is incapable of locomotion. Can you solve PointEnv? Can you get the Ant environment to locomote by itself? @rodrigodelazcano may know which algorithms can solve it.
Hi, However, I have to add that I changed the reward a bit. I remember that SAC with sparse rewards and HER worked really well on the Ant4Rooms environment, as in the hierarchical actor-critic paper by Levy et al.: https://www.youtube.com/watch?v=TY3gr4SRmPk&ab_channel=AndrewLevy After I found that your original version of AntMaze with the original reward function did not work with SAC, with either sparse or dense rewards, I tried the reward [-1,0], but without improvement. Also, I played around with My team's general goal is to add a working config for AntMaze to our Scilab-RL framework, which you can find here: https://scilab-rl.github.io/Scilab-RL/ Any hint is appreciated!
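For readers unfamiliar with the [-1,0] reward mentioned above: it is presumably the usual sparse goal-conditioned reward used with HER, which gives -1 on every step until the agent is close enough to the goal and 0 afterwards. A minimal sketch follows. The function name `sparse_goal_reward` and the `threshold` default of 0.45 are assumptions for illustration, not values confirmed in this thread.

```python
import math
from typing import Sequence


def sparse_goal_reward(
    achieved_goal: Sequence[float],
    desired_goal: Sequence[float],
    threshold: float = 0.45,  # assumed success radius, adjust to your environment
) -> float:
    """Return 0.0 when the goal is reached, otherwise -1.0."""
    # Euclidean distance between where the agent is and where it should be.
    distance = math.dist(achieved_goal, desired_goal)
    return 0.0 if distance <= threshold else -1.0
```

With this shape, the return of an episode is just the negative number of steps spent away from the goal, which is why it is normally paired with goal relabeling rather than used on its own.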
Oh, and yes, the locomotion itself works. At least sometimes, and only until it flips over.
Hey, I am familiar with this trained locomotion behavior of the Ant model from the Where:
This was addressed in You could try implementing that by creating a wrapper or forking the environment class.
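One way to read the wrapper suggestion, given the flipping problem described in this thread, is to end the episode as soon as the torso is upside-down. This is only a sketch under that assumption and not necessarily what the referenced fix does. `TerminateOnFlip`, `torso_up_z`, `quat_fn`, `min_up_z` and `flip_penalty` are hypothetical names. The class is duck-typed on the Gymnasium five-tuple `step` API so that it runs without extra dependencies; in a real project you would probably subclass `gymnasium.Wrapper` instead.

```python
from typing import Any, Callable, Sequence


def torso_up_z(quat: Sequence[float]) -> float:
    """World z-component of the body's local up axis for a (w, x, y, z) quaternion.

    Returns 1.0 when upright, 0.0 when lying on its side, -1.0 when fully flipped.
    """
    w, x, y, z = quat
    norm_sq = w * w + x * x + y * y + z * z  # tolerate non-normalized input
    return (w * w - x * x - y * y + z * z) / norm_sq


class TerminateOnFlip:
    """Ends the episode (optionally with a penalty) once the torso has flipped."""

    def __init__(
        self,
        env: Any,
        quat_fn: Callable[[Any], Sequence[float]],  # hypothetical: returns torso (w, x, y, z)
        min_up_z: float = 0.0,       # assumption: below this the ant counts as flipped
        flip_penalty: float = 0.0,   # assumption: extra cost for flipping
    ) -> None:
        self.env = env
        self.quat_fn = quat_fn
        self.min_up_z = min_up_z
        self.flip_penalty = flip_penalty

    def reset(self, **kwargs: Any):
        return self.env.reset(**kwargs)

    def step(self, action: Any):
        obs, reward, terminated, truncated, info = self.env.step(action)
        flipped = torso_up_z(self.quat_fn(self.env)) < self.min_up_z
        if flipped:
            terminated = True
            reward -= self.flip_penalty
        info = dict(info, flipped=flipped)  # copy so the inner env's dict is untouched
        return obs, reward, terminated, truncated, info
```

How `quat_fn` reads the torso orientation depends on the simulator state layout, so check the quaternion ordering of your environment before relying on it.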
You could also check
Question
I tried hard to train an agent to solve any of the AntMaze environments. I tried the Stable-Baselines3 implementations of SAC (dense and sparse rewards) and PPO, but could not solve even a small open AntMaze environment. I tried random goal and starting positions as well as fixed positions. Has anyone successfully trained an agent on one of the maze envs so far? If so, which algorithm and parameters were used, and on which variant of the environment?
What happens a lot in my case is that the Ant flips upside-down. It cannot walk on its "elbows" in that position, so it remains stuck.
Thanks a lot!
Manfred (Hamburg University of Technology, Germany)