Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix eval max episodes length bug in mtsac #1952

Closed
wants to merge 1 commit into from

Conversation

TianhongDai
Copy link
Contributor

@TianhongDai TianhongDai commented Aug 20, 2020

Hi - I think you forget to add max episode length in mtsac evaluation: https://github.com/rlworkgroup/garage/blob/master/src/garage/torch/algos/mtsac.py#L183-L186 . If you don't add max episode length, it will generate errors when run mtsac examples: python examples/torch/mtsac_metaworld_mt10.py.

here is the log:

2020-08-20 23:41:08 | [mtsac_metaworld_mt10] Obtaining samples...
Traceback (most recent call last):
  File "examples/torch/mtsac_metaworld_mt10.py", line 102, in <module>
    mtsac_metaworld_mt10()
  File "/home/tianhong/anaconda3/envs/test-garage/lib/python3.6/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/home/tianhong/anaconda3/envs/test-garage/lib/python3.6/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/home/tianhong/anaconda3/envs/test-garage/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/tianhong/anaconda3/envs/test-garage/lib/python3.6/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/tianhong/bicv-projects/imitation-learning/test-fix/garage/src/garage/experiment/experiment.py", line 362, in __call__
    result = self.function(ctxt, **kwargs)
  File "examples/torch/mtsac_metaworld_mt10.py", line 99, in mtsac_metaworld_mt10
    runner.train(n_epochs=epochs, batch_size=batch_size)
  File "/home/tianhong/bicv-projects/imitation-learning/test-fix/garage/src/garage/experiment/local_runner.py", line 501, in train
    average_return = self._algo.train(self)
  File "/home/tianhong/bicv-projects/imitation-learning/test-fix/garage/src/garage/torch/algos/sac.py", line 209, in train
    last_return = self._evaluate_policy(runner.step_itr)
  File "/home/tianhong/bicv-projects/imitation-learning/test-fix/garage/src/garage/torch/algos/mtsac.py", line 186, in _evaluate_policy
    num_eps=self._num_evaluation_episodes))
  File "/home/tianhong/bicv-projects/imitation-learning/test-fix/garage/src/garage/np/_functions.py", line 57, in obtain_evaluation_episodes
    deterministic=True)
  File "/home/tianhong/bicv-projects/imitation-learning/test-fix/garage/src/garage/sampler/utils.py", line 61, in rollout
    es = env.step(a)
  File "/home/tianhong/bicv-projects/imitation-learning/test-fix/garage/src/garage/envs/multi_env_wrapper.py", line 206, in step
    es = self._env.step(action)
  File "/home/tianhong/bicv-projects/imitation-learning/test-fix/garage/src/garage/envs/normalized_env.py", line 102, in step
    es = self._env.step(scaled_action)
  File "/home/tianhong/bicv-projects/imitation-learning/test-fix/garage/src/garage/envs/gym_env.py", line 214, in step
    observation, reward, done, info = self._env.step(action)
  File "/home/tianhong/anaconda3/envs/test-garage/lib/python3.6/site-packages/metaworld/envs/mujoco/multitask_env.py", line 161, in step
    obs, reward, done, info = self.active_env.step(action)
  File "/home/tianhong/anaconda3/envs/test-garage/lib/python3.6/site-packages/metaworld/envs/mujoco/sawyer_xyz/sawyer_reach_push_pick_place.py", line 147, in step
    self.do_simulation([action[-1], -action[-1]])
  File "/home/tianhong/anaconda3/envs/test-garage/lib/python3.6/site-packages/metaworld/envs/mujoco/mujoco_env.py", line 118, in do_simulation
    raise ValueError('Maximum path length allowed by the benchmark has been exceeded')
ValueError: Maximum path length allowed by the benchmark has been exceeded

@TianhongDai TianhongDai requested a review from a team as a code owner August 20, 2020 22:34
@TianhongDai TianhongDai requested review from krzentner and removed request for a team August 20, 2020 22:34
@mergify mergify bot requested review from a team, maliesa96 and nicolengsy and removed request for a team August 20, 2020 22:34
@TianhongDai TianhongDai changed the title fix eval max episodes length eval in mtsac fix eval max episodes length bug in mtsac Aug 20, 2020
@ryanjulian
Copy link
Member

@TianhongDai thank you so much for the PR! I think @avnishn just barely beat you to it, with this PR which was just merged on master: cc163fb

Does that fix your issue?

@TianhongDai
Copy link
Contributor Author

TianhongDai commented Aug 20, 2020

@TianhongDai thank you so much for the PR! I think @avnishn just barely beat you to it, with this PR which was just merged on master: cc163fb

Does that fix your issue?

@ryanjulian Hi - It doesn't, still not add max_episode_length in the mtsac:

obtain_evaluation_episodes(
self.policy,
self._eval_env,
num_eps=self._num_evaluation_episodes))
. It will still generate this error:

Traceback (most recent call last):
  File "examples/torch/mtsac_metaworld_mt10.py", line 102, in <module>
    mtsac_metaworld_mt10()
  File "/home/tianhong/anaconda3/envs/test-garage/lib/python3.6/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/home/tianhong/anaconda3/envs/test-garage/lib/python3.6/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/home/tianhong/anaconda3/envs/test-garage/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/tianhong/anaconda3/envs/test-garage/lib/python3.6/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/tianhong/bicv-projects/imitation-learning/test-fix/garage/src/garage/experiment/experiment.py", line 362, in __call__
    result = self.function(ctxt, **kwargs)
  File "examples/torch/mtsac_metaworld_mt10.py", line 99, in mtsac_metaworld_mt10
    runner.train(n_epochs=epochs, batch_size=batch_size)
  File "/home/tianhong/bicv-projects/imitation-learning/test-fix/garage/src/garage/experiment/local_runner.py", line 501, in train
    average_return = self._algo.train(self)
  File "/home/tianhong/bicv-projects/imitation-learning/test-fix/garage/src/garage/torch/algos/sac.py", line 209, in train
    last_return = self._evaluate_policy(runner.step_itr)
  File "/home/tianhong/bicv-projects/imitation-learning/test-fix/garage/src/garage/torch/algos/mtsac.py", line 186, in _evaluate_policy
    num_eps=self._num_evaluation_episodes))
  File "/home/tianhong/bicv-projects/imitation-learning/test-fix/garage/src/garage/np/_functions.py", line 57, in obtain_evaluation_episodes
    deterministic=True)
  File "/home/tianhong/bicv-projects/imitation-learning/test-fix/garage/src/garage/sampler/utils.py", line 61, in rollout
    es = env.step(a)
  File "/home/tianhong/bicv-projects/imitation-learning/test-fix/garage/src/garage/envs/multi_env_wrapper.py", line 206, in step
    es = self._env.step(action)
  File "/home/tianhong/bicv-projects/imitation-learning/test-fix/garage/src/garage/envs/normalized_env.py", line 102, in step
    es = self._env.step(scaled_action)
  File "/home/tianhong/bicv-projects/imitation-learning/test-fix/garage/src/garage/envs/gym_env.py", line 214, in step
    observation, reward, done, info = self._env.step(action)
  File "/home/tianhong/anaconda3/envs/test-garage/lib/python3.6/site-packages/metaworld/envs/mujoco/multitask_env.py", line 161, in step
    obs, reward, done, info = self.active_env.step(action)
  File "/home/tianhong/anaconda3/envs/test-garage/lib/python3.6/site-packages/metaworld/envs/mujoco/sawyer_xyz/sawyer_reach_push_pick_place.py", line 147, in step
    self.do_simulation([action[-1], -action[-1]])
  File "/home/tianhong/anaconda3/envs/test-garage/lib/python3.6/site-packages/metaworld/envs/mujoco/mujoco_env.py", line 118, in do_simulation
    raise ValueError('Maximum path length allowed by the benchmark has been exceeded')
ValueError: Maximum path length allowed by the benchmark has been exceeded

@TianhongDai TianhongDai reopened this Aug 20, 2020
@avnishn
Copy link
Member

avnishn commented Aug 20, 2020

Hi @TianhongDai you're correct to say that cc163fb doesn't catch your issue, it actually only corrects the underlying algorithm SAC. Thanks for catching this!

@TianhongDai
Copy link
Contributor Author

Hi @TianhongDai you're correct to say that cc163fb doesn't catch your issue, it actually only corrects the underlying algorithm SAC. Thanks for catching this!

You're welcome!

@ryanjulian
Copy link
Member

@avnishn can you please help @TianhongDai get this change merged and backported?

@mergify mergify bot requested a review from a team August 20, 2020 23:36
@avnishn
Copy link
Member

avnishn commented Aug 20, 2020

@Mergifyio rebase

@mergify
Copy link
Contributor

mergify bot commented Aug 20, 2020

Command rebase: success

Branch has been successfully rebased

@maliesa96
Copy link
Contributor

Thanks for this!

@mergify mergify bot requested a review from a team August 24, 2020 20:08
@ryanjulian
Copy link
Member

@Mergifyio rebase

@mergify
Copy link
Contributor

mergify bot commented Aug 26, 2020

Command rebase: success

Branch has been successfully rebased

@avnishn
Copy link
Member

avnishn commented Aug 27, 2020

Hi @TianhongDai, it seems that we're having some problem with running pr's that are based on branches from forks. For this reason, I've opened a separate pr that has your changes, and have listed you as a co-author on the change. That pr is #1975. Let me know if you have any questions.

@avnishn avnishn closed this Aug 27, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants