
Reproducing Figure 11 and reporting success rate #357

Closed
rasoolfa opened this issue Jan 9, 2022 · 9 comments


rasoolfa commented Jan 9, 2022

Hi all and @avnishn,

I've been trying to reproduce the results from Figure 11 in https://arxiv.org/pdf/1910.10897.pdf using https://github.com/rlworkgroup/garage/blob/08492007d6e2d9ead9beb83a8a4247e52019ac7d/metaworld_examples/sac_metaworld.py and the hyper-parameters reported in Table 3. Should I use Table 3 for the hyper-parameters?

One thing that is not clear to me is how the success rate is reported. I noticed that env.step returns 'success', but I want to verify that this is what is reported in the paper. Here is the code that I use to report results (random actions are used for simplicity):

from metaworld.envs import ALL_V2_ENVIRONMENTS_GOAL_OBSERVABLE
env_cls = ALL_V2_ENVIRONMENTS_GOAL_OBSERVABLE['hammer-v2-goal-observable']
eval_env = env_cls(seed=0)
eval_env.seed(0)
avg_reward = 0
success_rate = 0
num_evals = 2

for _ in range(num_evals):
    obs = eval_env.reset()
    done = False
    stp = 0
    while not done and stp < eval_env.max_path_length:
        obs, reward, done, info = eval_env.step(eval_env.action_space.sample())
        avg_reward += reward
        stp += 1
        if 'success' in info:
            success_rate += info['success']
avg_reward /= num_evals
success_rate /= num_evals

Is this the right way to report the success rate like Figure 11?
Thanks for your help.
Rasool

@rasoolfa (Author)

I should add that my results are much worse than the reported results.
Any help would be highly appreciated. Thanks.


avnishn commented Jan 12, 2022

Hi @rasoolfa,

sorry for the late response.

This is the correct way of computing success:

num_evals = 10
num_successful_eval_trajectories = 0

for _ in range(num_evals):
    obs = eval_env.reset()
    done = False
    success_curr_time_step = False
    stp = 0
    while not done and stp < eval_env.max_path_length:
        obs, reward, done, info = eval_env.step(eval_env.action_space.sample())
        stp += 1
        success_curr_time_step |= bool(info['success'])  # episode counts as a success if this is ever 1.0
    num_successful_eval_trajectories += int(success_curr_time_step)

success_rate = num_successful_eval_trajectories/num_evals
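(A side note: for reproducing Figure 11, the actions would come from the trained SAC policy rather than eval_env.action_space.sample(); the random actions above are just a stand-in to show the success bookkeeping.)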

Neither the code nor the environments have changed, so it's unlikely that a performance regression happened, unless one was caused by one of the dependencies (e.g. an upgraded version of torch; I used 1.8).

Thanks,
@avnishn


rasoolfa commented Jan 12, 2022

Thanks @avnishn
Can you also please comment on the following questions?

  1. So the value of 'success' in info, i.e. info = {'success': val}, is not important then and should be ignored?
    In addition, with the approach you mentioned, wouldn't the success always be 100%, since "success" is always present in "info" (see below)? I've checked 8 different environments and all of them always return an "info" dict containing the "success" flag.

"info" contains the followings:

{'success': 0.0,
 'near_object': 0.919886337009916,
 'grasp_success': False,
 'grasp_reward': 0.018269567161941357,
 'in_place_reward': 0.07328687668107409,
 'obj_to_target': 0,
 'unscaled_reward': 0.43810542967701377}
  2. Also, should I use Table 3 as a reference for the hyper-parameters?

  3. ALL_V2_ENVIRONMENTS_GOAL_OBSERVABLE should be used to create an env for single-task experiments, is that right? e.g.

from metaworld.envs import ALL_V2_ENVIRONMENTS_GOAL_OBSERVABLE
env_cls = ALL_V2_ENVIRONMENTS_GOAL_OBSERVABLE['hammer-v2-goal-observable']
eval_env = env_cls(seed=0)
eval_env.seed(0)

I appreciate your help.


avnishn commented Jan 12, 2022

  1. Whoops, sorry. I edited my answer; I gave you the wrong answer the first time.
  2. Yes, and they should be the same as the hparams in the launcher that you linked.
  3. Yes.

@rasoolfa (Author)

Thanks again for your help.


avnishn commented Jan 13, 2022

No problem! I'm gonna go ahead and close this for now, but if you have more questions, I'd recommend joining our Slack community (link in the README), where a lot of questions like these have been answered. Of course, feel free to post here again if you'd like.

avnishn closed this as completed Jan 13, 2022
@rasoolfa (Author)

Thanks again @avnishn. One last thing: do you happen to have learning curves or log files for these experiments (i.e. Figure 11)? I just want to compare, as I still can't reproduce the paper's results.


krzentner commented Jan 23, 2022

To be clear, since the above conversation was unclear to me: in MetaWorld, an episode is considered successful if info['success'] ever becomes 1.0 during that episode. SuccessRate therefore needs to be computed across many episodes to be meaningful.
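For concreteness, here is a minimal self-contained sketch of that definition (the evaluate_success_rate helper is illustrative, not part of Meta-World's API; random actions stand in for a trained policy, so the reported rate will be near zero):

from metaworld.envs import ALL_V2_ENVIRONMENTS_GOAL_OBSERVABLE

def evaluate_success_rate(env, num_episodes=50, policy=None):
    # An episode counts as successful if info['success'] ever becomes 1.0 during it.
    successes = 0
    for _ in range(num_episodes):
        obs = env.reset()
        episode_success = False
        for _ in range(env.max_path_length):
            action = env.action_space.sample() if policy is None else policy(obs)
            obs, reward, done, info = env.step(action)
            episode_success |= bool(info['success'])
            if done:
                break
        successes += int(episode_success)
    return successes / num_episodes

env = ALL_V2_ENVIRONMENTS_GOAL_OBSERVABLE['hammer-v2-goal-observable'](seed=0)
print(evaluate_success_rate(env, num_episodes=10))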

@AnukritiSinghh

I wanted to know: is success_rate calculated during the evaluation phase, and is that what is reported in the paper?
