Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New worker for PEARL #1124

Merged
merged 12 commits into from
Mar 13, 2020
Merged

New worker for PEARL #1124

merged 12 commits into from
Mar 13, 2020

Conversation

lywong92
Copy link
Member

@lywong92 lywong92 commented Jan 8, 2020

Adding a new worker for PEARL sampling because it needs to be able to store context and resample belief in the policy when obtaining samples.

@lywong92 lywong92 requested a review from a team as a code owner January 8, 2020 13:13
@codecov
Copy link

codecov bot commented Jan 8, 2020

Codecov Report

Merging #1124 into master will increase coverage by 0.04%.
The diff coverage is 97.56%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1124      +/-   ##
==========================================
+ Coverage   88.00%   88.04%   +0.04%     
==========================================
  Files         188      189       +1     
  Lines        8885     8926      +41     
  Branches     1124     1131       +7     
==========================================
+ Hits         7819     7859      +40     
  Misses        862      862              
- Partials      204      205       +1     
Impacted Files Coverage Δ
src/garage/torch/algos/pearl.py 97.56% <97.56%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e26a7f6...c918641. Read the comment docs.

@lywong92 lywong92 changed the title New sampler for PEARL New sampler and buffers for PEARL Jan 9, 2020
@ryanjulian
Copy link
Member

Assuming everything passes muster, I'm OK merging this now if KR is okay with it. Clearly, unifying sampling for meta-RL is an area we need to work on.

Copy link
Member

@avnishn avnishn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Based on comments and code similarities, not sure if we need a new meta-rl replay buffer. Perhaps a refactor using an existing replay buffer can come later if we decide.
  2. What is the reasoning behind a multi task replay buffer for PEARL? I saw your comment:

The algorithm keeps a buffer for each task, and I think this looks cleaner. I just followed the implementation on oyster, but I suppose I can keep a list of buffers in the algo without this buffer class.

I'm with you on keeping a list of buffers as opposed to a separate MT replay buffer

  1. Can you create your own rollout function separate from the sampler/utils rollout function.
    It seems that you have modified the return types of the rollout function and added some other features. A question I find myself asking is "Will the added functionality be used by algorithms besides PEARL?"

src/garage/sampler/pearl_sampler.py Outdated Show resolved Hide resolved
src/garage/sampler/pearl_sampler.py Outdated Show resolved Hide resolved

# save the latent context that generated this trajectory
if accum_context:
path['context'] = self.policy.z.detach().cpu().numpy()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does self.policy.z ever update during the duration of the sampling loop? If not, can we store this value instead of querying ever time.

src/garage/sampler/utils.py Outdated Show resolved Hide resolved
src/garage/sampler/utils.py Outdated Show resolved Hide resolved
@lywong92 lywong92 changed the title New sampler and buffers for PEARL New sampler for PEARL Mar 10, 2020
'WorkerFactory',
'Worker',
'DefaultWorker',
'PEARLSampler',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this be achieved by adding a PEARLWorker class @krzentner ?

Anson does a similar pattern for RL2: He implements an RL2 worker which lives in garage.tf.algos.rl2 rather than the central sampler package.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can definitely be achieved with that.
Alternatively, just put z in agent_infos. It's there for a reason.

@@ -0,0 +1,156 @@
# pylint: disable=unnecessary-pass
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can deal with this by just omitting pass from your empty methods.

@lywong92 lywong92 changed the title New sampler for PEARL New worker for PEARL Mar 11, 2020
'WorkerFactory',
'Worker',
'DefaultWorker',
'PEARLWorker',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please put this class in src/garage/torch/algos/pearl.py -- since it is only used by the PEARL algorithm.

Its import path will be from garage.torch.algos.pearl import PEARLWorker

@ryanjulian
Copy link
Member

Please address my comments before submit. @krzentner should also probably take a look.

pass
self._agent_infos['context'] = [self.agent.z.detach().cpu().numpy()
] * self._max_path_length
self.agent.sample_from_belief()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might make more sense to put this at the start, since agent updates happen in between calls to this method.

Copy link
Contributor

@krzentner krzentner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly looks good. There's one small thing you should check, but feel free to submit.

@lywong92 lywong92 merged commit 776d6e4 into master Mar 13, 2020
@ryanjulian ryanjulian deleted the add_in_place_sampler branch April 8, 2020 21:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants