Various MTSAC bug fixes #1975

Merged
merged 1 commit into master from Avnish-mtsac-bug-fixes on Aug 28, 2020

avnishn (Member) commented Aug 27, 2020

fixes to Examples to use the correct num_tasks

fixes to max_episode_length_eval being used by the algorithm

Co-authored-by: Tianhong Dai <tianhongdai914@gmail.com>

@avnishn avnishn requested a review from a team as a code owner August 27, 2020 17:15
@avnishn avnishn requested review from ahtsan and removed request for a team August 27, 2020 17:15
@mergify mergify bot requested review from a team, AiRuiChen and nicolengsy and removed request for a team August 27, 2020 17:15
avnishn (Member, Author) commented Aug 27, 2020

@TianhongDai

TianhongDai (Contributor) commented Aug 27, 2020

@TianhongDai

@avnishn Thanks Avnishn! I think I forgot to read the contribution guidelines when I submitted the PR. Sorry for causing you so much trouble.

codecov bot commented Aug 27, 2020

Codecov Report

Merging #1975 into master will decrease coverage by 0.10%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #1975      +/-   ##
==========================================
- Coverage   93.50%   93.40%   -0.11%     
==========================================
  Files         192      192              
  Lines       10182    10184       +2     
  Branches     1268     1269       +1     
==========================================
- Hits         9521     9512       -9     
- Misses        438      446       +8     
- Partials      223      226       +3     
| Impacted Files | Coverage Δ |
| --- | --- |
| src/garage/torch/algos/sac.py | 98.23% <ø> (ø) |
| src/garage/torch/algos/mtsac.py | 93.33% <100.00%> (+0.31%) ⬆️ |
| src/garage/plotter/plotter.py | 59.77% <0.00%> (-3.45%) ⬇️ |
| ...rage/tf/optimizers/conjugate_gradient_optimizer.py | 83.16% <0.00%> (-2.05%) ⬇️ |
| src/garage/misc/tensor_utils.py | 78.94% <0.00%> (-1.76%) ⬇️ |
| src/garage/sampler/multiprocessing_sampler.py | 89.26% <0.00%> (-1.35%) ⬇️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update a2fb966...c83d34e.

env_spec=ml1_train_envs.spec,
num_tasks=50,
steps_per_epoch=epoch_cycles,
replay_buffer=replay_buffer,
Member

Please pass args only as args and kwargs only as kwargs.
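As a minimal sketch of what this review comment asks for, Python can enforce the convention in the signature itself. The function `build_algo` and its parameters are hypothetical stand-ins, not garage's actual constructor; `/` marks positional-only parameters and `*` marks keyword-only ones (Python 3.8+).

```python
def build_algo(env_spec, policy, /, *, num_tasks, steps_per_epoch=1):
    """Hypothetical stand-in for an algorithm constructor (not garage's API).

    Parameters before `/` must be passed positionally; parameters after `*`
    must be passed by keyword.
    """
    return {
        'env_spec': env_spec,
        'policy': policy,
        'num_tasks': num_tasks,
        'steps_per_epoch': steps_per_epoch,
    }


def call_is_rejected(*args, **kwargs):
    """Return True if build_algo refuses this calling style."""
    try:
        build_algo(*args, **kwargs)
    except TypeError:
        return True
    return False


# Args passed as args, kwargs passed as kwargs.
algo = build_algo('mt50-spec', 'policy', num_tasks=50, steps_per_epoch=2)
```

With a signature like this, passing `num_tasks` positionally or `env_spec` by keyword raises a `TypeError` at the call site, so the mixed style never reaches review.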

'The correct number of tasks?')
obs = torch.Tensor([env.reset()[0]] * buffer_batch_size)
with pytest.raises(ValueError, match=error_string):
    mtsac._get_log_alpha(dict(observation=obs))
Member

no need to test a private method?

Member Author

This is true; however, the tests are in place to verify the correctness of this implementation, and I feel more comfortable having them than not. At the same time, it makes no sense to expose this method publicly, because it has no use outside of the algorithm.
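To illustrate the kind of check the test above exercises, here is a pure-Python sketch of selecting a per-task entropy coefficient from a one-hot task ID appended to the observation. The function name `select_log_alpha`, its arguments, and the error messages are assumptions made for this example; it is not garage's `_get_log_alpha`, which operates on batched torch tensors.

```python
def select_log_alpha(observation, log_alphas, num_tasks):
    """Pick the log-alpha of the task encoded in the observation's tail.

    Hypothetical sketch: assumes the last `num_tasks` entries of
    `observation` are a one-hot task ID and `log_alphas` holds one
    value per task.
    """
    if num_tasks < 1 or len(log_alphas) != num_tasks:
        raise ValueError('log_alphas must have exactly num_tasks entries.')
    if len(observation) < num_tasks:
        raise ValueError('Observation is shorter than num_tasks.')
    one_hot = list(observation[-num_tasks:])
    # A valid one-hot vector has a single 1 and zeros everywhere else.
    if sorted(one_hot) != [0] * (num_tasks - 1) + [1]:
        raise ValueError('Observation tail is not a one-hot task ID. '
                         'Was the correct number of tasks passed?')
    return log_alphas[one_hot.index(1)]


def raises_value_error(observation, log_alphas, num_tasks):
    """Return True if select_log_alpha rejects these inputs."""
    try:
        select_log_alpha(observation, log_alphas, num_tasks)
    except ValueError:
        return True
    return False
```

A check like this catches only some mismatches: if `num_tasks` is too small, the sliced tail is often all zeros and is rejected, but a tail that happens to contain the single 1 would still pass and select the wrong task.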

@ryanjulian (Member)

Please reference the issues you're fixing

@avnishn avnishn linked an issue Aug 27, 2020 that may be closed by this pull request
runner.setup(algo=mtsac,
             env=mt10_train_envs,
             sampler_cls=LocalSampler,
             n_workers=1)
Contributor

Why limit the number of workers here, and can't we use ray?

@mergify mergify bot requested a review from a team August 28, 2020 01:35
avnishn (Member, Author) commented Aug 28, 2020

@maliesa96 We can't. TL;DR: using the ray sampler with the old metaworld envs uses more memory than we have on the lab machines.

@maliesa96 (Contributor)

Damn alright, LGTM then.

@mergify mergify bot requested a review from a team August 28, 2020 06:19
@mergify mergify bot requested a review from a team August 28, 2020 06:26
@mergify mergify bot merged commit fc3ddc6 into master Aug 28, 2020
@mergify mergify bot deleted the Avnish-mtsac-bug-fixes branch August 28, 2020 07:26
avnishn pushed a commit that referenced this pull request Sep 3, 2020
Backport #1905, #1975, #1908 to fix problems
with max_eval_path_length not being used by
mtsac and sac, and add a check for an incorrect
num_tasks being set in mtsac.
avnishn pushed a commit that referenced this pull request Sep 9, 2020
Backport #1905, #1975, #1908 to fix problems
with max_eval_path_length not being used by
mtsac and sac, and add a check for an incorrect
num_tasks being set in mtsac.

Timelimit.truncated modified only when necessary

This issue occurs when multiple garage envs
are nested, or when timelimit truncated = False
is included in the environment keys.
Previously, our timelimit truncated logic
assumed that the key was added only when
a time limit truncation occurred. If an
environment already had timelimit truncated = False
in its keys, the previous behavior was to set
Done = True, which is incorrect.

That was causing performance degradation
in MTSAC and MTPPO/TRPO.

Now Done is only true in the normal/trivial case,
never if timelimit truncated is False.
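
The behavior change described in this commit message can be sketched as follows. This is one plausible reconstruction, not garage's actual code: the key name `'TimeLimit.truncated'` (the key used by gym's TimeLimit wrapper), the helper names, and the assumption that the wrapper clears `done` on a time-limit cutoff so that truncation is not treated as a true termination are all assumptions made for illustration.

```python
TRUNCATED_KEY = 'TimeLimit.truncated'  # assumed key name, as used by gym


def done_before_fix(done, info):
    """Old logic (sketch): assumes the key exists only on truncation.

    If the key is present but False, this wrongly turns done into True.
    """
    if TRUNCATED_KEY in info:
        done = not info[TRUNCATED_KEY]
    return done


def done_after_fix(done, info):
    """New logic (sketch): touch done only when truncation really happened."""
    if info.get(TRUNCATED_KEY, False):
        # Episode was cut off by the time limit, not ended by the task.
        done = False
    return done


# A mid-episode step from an env that always reports the key as False.
step_info = {TRUNCATED_KEY: False}
buggy_result = done_before_fix(False, step_info)
fixed_result = done_after_fix(False, step_info)
```

Under these assumptions, the old logic reports a termination on every step whose info already carries the key with value False, which matches the failure mode the message describes for nested envs.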
avnishn pushed a commit that referenced this pull request Sep 11, 2020
avnishn pushed a commit that referenced this pull request Sep 11, 2020
mergify bot pushed a commit that referenced this pull request Sep 11, 2020

Successfully merging this pull request may close these issues.

mtsac_metaworld_mt50.py sets num_tasks=10