Unsuccessful SAC training for demo generation #11

harryzhangOG · 2021-10-10T03:36:50Z

Hi, we made a new environment with just a floating gripper, and we are trying to generate demo data with a SAC agent. However, for the door-opening task, when the handle is "horizontal", the SAC agent fails to succeed after training (as seen in the video below, on env 1001 and 1002). I have also attached the command I was using. I switched the seed from 0 to 10 and it still didn't work. Please advise if we have done anything wrong. Thanks.

"python -m tools.run_rl configs/sac/sac_mani_skill_state_1M_train.py --seed=10 --cfg-options \"env_cfg.env_name={}\" \"rollout_cfg.type=Rollout\" \"rollout_cfg.num_procs=1\" \"eval_cfg.num_procs=1\" --gpu-ids=1".format(gripper_env)

xuanlinli17 · 2021-10-10T03:38:54Z

Could you repost the video? I can't see it.

harryzhangOG · 2021-10-10T04:15:16Z

Sorry, just updated with a new GIF.

xuanlinli17 · 2021-10-10T07:34:30Z

What was your reward function?

harryzhangOG · 2021-10-10T18:18:06Z

I just used the default dense reward. Other environments worked fine, but in some of them the agent just converged to the wrong move.

harryzhangOG · 2021-10-11T22:46:02Z

Is it possible to evaluate on a certain level? Just as a sanity check to see if the agent memorizes the demo data?

xuanlinli17 · 2021-10-11T23:45:09Z

You can use env.reset(level=some_level) to reset the environment to a level that appears in the demos.

(this is similar to https://github.com/haosulab/ManiSkill-Learn/blob/main/tools/convert_state.py)
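A minimal sketch of that sanity check, assuming only that reset accepts a `level` keyword as described above. `ToyLevelEnv` and `demo_level` are hypothetical stand-ins so the snippet runs without ManiSkill installed; with the real environment the reset call would look the same.

```python
import random


class ToyLevelEnv:
    """Hypothetical stand-in for an environment whose reset takes a level."""

    def reset(self, level=None):
        # A fixed level seeds the initial state; no level means a random one.
        if level is None:
            level = random.randrange(10**6)
        self.level = level
        rng = random.Random(level)
        return [rng.random() for _ in range(3)]


env = ToyLevelEnv()
demo_level = 42  # hypothetical: a level index taken from the demo data

obs_a = env.reset(level=demo_level)
obs_b = env.reset(level=demo_level)
same_start = obs_a == obs_b  # same level should give the same initial observation
```

If the two observations differ on the real environment, the level is not being applied and the sanity check would not be meaningful.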

harryzhangOG · 2021-10-12T01:23:03Z

Is there a way to do it in evaluation? We have a demo trajectory and want to train on the demo trajectory and test on the same level as a sanity check in eval. Thanks.

xuanlinli17 · 2021-10-12T01:25:32Z

Modify evaluation.py so that the reset call passes the level, e.g. at this line:

ManiSkill-Learn/mani_skill_learn/env/evaluation.py

Line 115 in 8d968d2

self.recent_obs = self.env.reset()
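One way such a change could look, sketched under assumptions: rather than calling `self.env.reset()` with no arguments, the evaluator draws the next level from a fixed list of demo levels. The names `FixedLevelEvaluator`, `levels`, and `EchoEnv` are hypothetical and are not the actual contents of evaluation.py; only the `self.recent_obs = self.env.reset(...)` assignment mirrors the quoted line.

```python
from itertools import cycle


class EchoEnv:
    """Hypothetical stub: reset simply reports the level it was given."""

    def reset(self, level=None):
        return {"level": level}


class FixedLevelEvaluator:
    """Sketch of an evaluator that resets only to chosen demo levels."""

    def __init__(self, env, levels):
        self.env = env
        self._levels = cycle(levels)  # loop over the demo levels indefinitely
        self.recent_obs = None

    def reset(self):
        # Stands in for the argument-free self.env.reset() quoted above.
        self.recent_obs = self.env.reset(level=next(self._levels))
        return self.recent_obs


evaluator = FixedLevelEvaluator(EchoEnv(), levels=[3, 7])
visited = [evaluator.reset()["level"] for _ in range(3)]
```

Cycling through the list keeps every evaluation episode on a level the agent has seen in the demos, which is what the sanity check needs.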

harryzhangOG · 2021-10-12T01:29:47Z

ok, thanks. That's what we thought.

lz1oceani closed this as completed Feb 2, 2022
