Unsuccessful SAC training for demo generation #11

Closed
harryzhangOG opened this issue Oct 10, 2021 · 9 comments

@harryzhangOG

harryzhangOG commented Oct 10, 2021

Hi, we made a new environment with just a floating gripper, and we are trying to generate demo data with a SAC agent. However, for the door-opening task, when the handle is "horizontal", the SAC agent still fails after training (see the video below, on env 1001 and 1002). I have also attached the command I used. I changed the seed from 0 to 10, and it still didn't work. Please let us know if we have done anything wrong. Thanks.

"python -m tools.run_rl configs/sac/sac_mani_skill_state_1M_train.py --seed=10 --cfg-options \"env_cfg.env_name={}\" \"rollout_cfg.type=Rollout\" \"rollout_cfg.num_procs=1\" \"eval_cfg.num_procs=1\" --gpu-ids=1".format(gripper_env)
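
The command above is a Python format string with escaped quotes. As a minimal sketch, the same arguments could be built as a list, which avoids the escaping. The flags and config path are copied from the command above; the function name `build_train_command` and the environment name `MyGripperEnv-v0` are hypothetical placeholders, not names from the repository.

```python
import shlex


def build_train_command(env_name, seed=10, gpu_id=1):
    """Return the training command as an argument list (no shell quoting needed)."""
    return [
        "python", "-m", "tools.run_rl",
        "configs/sac/sac_mani_skill_state_1M_train.py",
        f"--seed={seed}",
        "--cfg-options",
        f"env_cfg.env_name={env_name}",
        "rollout_cfg.type=Rollout",
        "rollout_cfg.num_procs=1",
        "eval_cfg.num_procs=1",
        f"--gpu-ids={gpu_id}",
    ]


# "MyGripperEnv-v0" is a hypothetical stand-in for the custom gripper environment.
cmd = build_train_command("MyGripperEnv-v0")
print(shlex.join(cmd))  # shell-safe string form, useful for logging
```

A list like this can be passed directly to `subprocess.run(cmd)` without going through a shell.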

[Attached GIF: ezgif.com-gif-maker]

@xuanlinli17
Collaborator

Could you repost the video? I can't see it.

@harryzhangOG
Author

Sorry, I just updated the post with a new GIF.

@xuanlinli17
Collaborator

What was your reward function?

@harryzhangOG
Author

harryzhangOG commented Oct 10, 2021

I just used the default dense reward. Other environments worked fine, but in some of them the agent converged to the wrong motion.

@harryzhangOG
Author

Is it possible to evaluate on a specific level, as a sanity check to see whether the agent memorizes the demo data?

@xuanlinli17
Collaborator

You can use env.reset(level=some_level) to reset the environment to a specific level, such as one used in the demos.

(this is similar to https://github.com/haosulab/ManiSkill-Learn/blob/main/tools/convert_state.py)
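
To illustrate the idea of resetting to a fixed level, here is a minimal sketch. It does not use ManiSkill itself: `ToyLevelEnv` is a hypothetical stand-in whose `reset(level=...)` only mimics the call mentioned above, and the level value 42 is made up. The sketch assumes that the level fully determines the initial state.

```python
import random


class ToyLevelEnv:
    """Hypothetical stand-in for a level-based environment (not ManiSkill itself)."""

    def reset(self, level=None):
        # With no level given, pick one at random, as a fresh episode would.
        if level is None:
            level = random.randrange(1_000_000)
        self.level = level
        # In this toy, the level seeds the generator that creates the initial state.
        rng = random.Random(level)
        self.state = [rng.uniform(-1.0, 1.0) for _ in range(3)]
        return list(self.state)


env = ToyLevelEnv()
demo_level = 42  # hypothetical: the level recorded with a demo trajectory
first_obs = env.reset(level=demo_level)
second_obs = env.reset(level=demo_level)
# Resetting to the same level reproduces the same starting observation.
same_start = first_obs == second_obs
```

With a real environment, the same pattern would let a trained agent be replayed on exactly the level a demo was recorded on.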

@harryzhangOG
Author

Is there a way to do this during evaluation? We have a demo trajectory, and we want to train on it and then test on the same level in eval as a sanity check. Thanks.

@xuanlinli17
Collaborator

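Modify evaluation.py, for example at the reset call below, so that it passes the level you want (as in env.reset(level=some_level)):

self.recent_obs = self.env.reset()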
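
As a rough sketch of what such a modification could look like, the loop below runs one evaluation episode per chosen level. Everything here is an assumption for illustration: `ToyLevelEnv`, `evaluate_on_levels`, the copy-the-observation policy, and the level values are hypothetical and are not taken from evaluation.py.

```python
import random


class ToyLevelEnv:
    """Hypothetical level-based environment, used only for illustration."""

    def reset(self, level=None):
        self.level = random.randrange(1_000_000) if level is None else level
        # The level determines the target action for the whole episode.
        self.target = random.Random(self.level).choice([0, 1])
        self.steps = 0
        return self.target  # the observation reveals the target

    def step(self, action):
        self.steps += 1
        reward = 1.0 if action == self.target else 0.0
        done = self.steps >= 5  # fixed five-step episodes
        return self.target, reward, done, {}


def evaluate_on_levels(env, policy, levels):
    """Run one episode per level and return {level: episode return}."""
    returns = {}
    for level in levels:
        obs = env.reset(level=level)  # pin the episode to the chosen level
        done, total = False, 0.0
        while not done:
            obs, reward, done, _ = env.step(policy(obs))
            total += reward
        returns[level] = total
    return returns


demo_levels = [3, 7]  # hypothetical levels taken from the demo trajectories
results = evaluate_on_levels(ToyLevelEnv(), lambda obs: obs, demo_levels)
```

If the agent has memorized the demo data, its returns on the demo levels should be high even when returns on unseen levels are not.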

@harryzhangOG
Author

OK, thanks. That's what we thought.
