Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix a typo to allow evaluating algos deterministically #1617

Merged
merged 3 commits into from Jul 6, 2020

Conversation

maciejwolczyk
Copy link
Contributor

When running some experiments with SAC I have discovered that my algorithm does not act deterministically during the evaluation (i.e. the action is sampled from the policy distribution instead of taking the mean/mode of the distribution). The code for obtaining evaluation samples uses the rollout function from sampler.utils with the argument deterministic=True.

The rollout function is then supposed to look into agent_info dictionary and use the mean value stored there. Unfortunately, currently in the code it looks into the agent_infos (with s at the end), which is a list containing agent_info dictionaries and as such obviously does not contain the mean key. This means that the stochastic, sampled action is used instead. My pull request solves this issue by fixing the typo.

Technical sidenote - maybe there should be an exception raised if deterministic=True and there is no mean key in the dict?

@krzentner krzentner added the backport-to-2019.10 Backport this PR to release-2019.10 label Jun 25, 2020
@krzentner
Copy link
Contributor

Oh, I thought we had a test case for this code path. I guess not.

Honestly, looking for mean in the first place is kinda a hack. It only really makes sense for gaussian policies. I've wondered for some time if we should add a argument to get_action/get_actions that makes the actions deterministic.

@ryanjulian ryanjulian added the backport-to-2020.06 Backport this PR to release-2020.06 label Jun 27, 2020
Copy link
Member

@avnishn avnishn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching this!

@avnishn
Copy link
Member

avnishn commented Jun 29, 2020

We should probably add a corresponding test for this bug somewhere in this file:

https://github.com/rlworkgroup/garage/blob/master/tests/garage/sampler/test_utils.py

@ryanjulian
Copy link
Member

@Mergifyio rebase

@mergify
Copy link
Contributor

mergify bot commented Jun 30, 2020

Command rebase: success

Branch has been successfully rebased

@ryanjulian
Copy link
Member

@maciejwolczyk thanks so much for the PR!

Can you add a test which would have detected this bug (as suggested by @avnishn )?

After that, I think it's ready to merge.

@ryanjulian ryanjulian added this to In Progress in v2019.10 backports via automation Jul 2, 2020
@ryanjulian ryanjulian added this to TODO in v2020.06 backports via automation Jul 2, 2020
@maciejwolczyk
Copy link
Contributor Author

I'm glad I could help and sorry for the delay!

Is this test okay? I've run the test on the old code with the typo and it does fail, and of course it should pass after the fix.

Copy link
Member

@ryanjulian ryanjulian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this test looks good to me! please just make the CI pass and we will merge it :)

thank you!

v2020.06 backports automation moved this from TODO to In Progress Jul 2, 2020
@ryanjulian
Copy link
Member

@Mergifyio rebase

@mergify
Copy link
Contributor

mergify bot commented Jul 2, 2020

Command rebase: success

Branch has been successfully rebased

@ryanjulian
Copy link
Member

@Mergifyio rebase

@mergify
Copy link
Contributor

mergify bot commented Jul 3, 2020

Command rebase: success

@mergify mergify bot requested a review from a team July 3, 2020 19:12
@ryanjulian
Copy link
Member

@Mergifyio rebase

@mergify
Copy link
Contributor

mergify bot commented Jul 5, 2020

Command rebase: success

@ryanjulian
Copy link
Member

@maciejwolczyk it looks like the CI is failing due to a formatting issue. Can you correct it and re-upload?

Note that this should be handled for you if you setup pre-commit, as detailed in CONTRIBUTING.md

@ryanjulian ryanjulian mentioned this pull request Jul 5, 2020
5 tasks
@maciejwolczyk
Copy link
Contributor Author

maciejwolczyk commented Jul 5, 2020

That's a bit weird, earlier I've checked the commits with the pre-commit hooks enabled and also the CI passed when it was ran for the first time... I guess maybe the failure now is because the pre-commit script changed recently?

I've pushed a commit which passes the pre-commit scripts. It was actually detecting a formatting error in lines that I haven't touched, so I guess it just checks the whole file. If you run yapf --recursive . in the repository dir then it finds some more formatting inconsistencies, but I think a separate pull request would be more appropriate for fixing that.

Anyway, I hope it will pass now.

@maciejwolczyk
Copy link
Contributor Author

It's failing now, but on a later stage (normal tests) and the failing test is test_time_limit_env for the BulletEnv. I'm not sure if this issue is connected with the current pull request, I've checked that some of the CI for different PRs fail on this test as well.

@AiRuiChen
Copy link
Contributor

It's failing now, but on a later stage (normal tests) and the failing test is test_time_limit_env for the BulletEnv. I'm not sure if this issue is connected with the current pull request, I've checked that some of the CI for different PRs fail on this test as well.

Hi, just sent #1710 to fix this failing test. Once this is merged please try rebasing the latest master branch and check CI again.

@maciejwolczyk
Copy link
Contributor Author

@Mergifyio rebase

@mergify
Copy link
Contributor

mergify bot commented Jul 6, 2020

@maciejwolczyk is not allowed to run commands

@ryanjulian
Copy link
Member

@Mergifyio rebase

@mergify
Copy link
Contributor

mergify bot commented Jul 6, 2020

Command rebase: success

@ryanjulian ryanjulian merged commit f8377f5 into rlworkgroup:master Jul 6, 2020
v2019.10 backports automation moved this from In Progress to Done Jul 6, 2020
v2020.06 backports automation moved this from In Progress to Done Jul 6, 2020
@ryanjulian
Copy link
Member

@Mergifyio backport release-2020.06

@ryanjulian
Copy link
Member

@Mergifyio backport release-2019.10

@mergify
Copy link
Contributor

mergify bot commented Jul 7, 2020

Command backport release-2020.06: failure

No backport have been created

  • Backport to branch release-2020.06 failed

Cherry-pick of f8377f5 has failed:

On branch mergify/bp/release-2020.06/pr-1617
Your branch is up to date with 'origin/release-2020.06'.

You are currently cherry-picking commit f8377f5f.
  (fix conflicts and run "git cherry-pick --continue")
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

Changes to be committed:

	modified:   src/garage/sampler/utils.py
	modified:   tests/garage/sampler/test_utils.py

Unmerged paths:
  (use "git add <file>..." to mark resolution)

	both modified:   tests/fixtures/policies/dummy_policy.py

@mergify
Copy link
Contributor

mergify bot commented Jul 7, 2020

Command backport release-2019.10: failure

No backport have been created

  • Backport to branch release-2019.10 failed

Cherry-pick of f8377f5 has failed:

On branch mergify/bp/release-2019.10/pr-1617
Your branch is up to date with 'origin/release-2019.10'.

You are currently cherry-picking commit f8377f5f.
  (fix conflicts and run "git cherry-pick --continue")
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

Changes to be committed:

	modified:   tests/garage/sampler/test_utils.py

Unmerged paths:
  (use "git add <file>..." to mark resolution)

	both modified:   src/garage/sampler/utils.py
	both modified:   tests/fixtures/policies/dummy_policy.py

ryanjulian pushed a commit that referenced this pull request Jul 7, 2020
* Fix deterministic policy evaluation
* Add tests for deterministic policy eval
* Fix formatting in dummy policy init
ryanjulian pushed a commit that referenced this pull request Jul 7, 2020
* Fix deterministic policy evaluation
* Add tests for deterministic policy eval
* Fix formatting in dummy policy init
ryanjulian added a commit that referenced this pull request Jul 7, 2020
* Fix deterministic policy evaluation
* Add tests for deterministic policy eval
* Fix formatting in dummy policy init

Co-authored-by: Maciej Wołczyk <raihid888@gmail.com>
yeukfu added a commit that referenced this pull request Aug 5, 2020
mergify bot pushed a commit that referenced this pull request Aug 5, 2020
* Backport #1617

* Fix docstring

* Fix test_off_policy_vec_sampler_integration

Co-authored-by: ruofu <ruofuwan@usc.edu>
irisliucy pushed a commit that referenced this pull request Aug 18, 2020
* Fix deterministic policy evaluation
* Add tests for deterministic policy eval
* Fix formatting in dummy policy init

Co-authored-by: Maciej Wołczyk <raihid888@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-to-2019.10 Backport this PR to release-2019.10 backport-to-2020.06 Backport this PR to release-2020.06 ready-to-merge
Projects
Development

Successfully merging this pull request may close these issues.

None yet

6 participants