Issue108 unflatten rollouts #141
Conversation
…rvations and memory structure independent of the number of agents
Wow, unflattening these tensors required more changes than I expected. It looks really nice. I didn't read the last few files super closely (especially the model architectures) but if you've verified it returns reasonable output then I'm happy.
* Addressed initial reviews from PR141
* Memory and observations exclude agent dimension if no agent is present
* ActorCritic outputs, stored rewards, etc. always have agent dimension
* Hacks to make robothor work without dataset
* Single-agent robothor, minigrid, babyai working
… + partly adapted pointnav baseline models
…ttened names (only really used for rnn_memory viz)
Very nice changes. Everything looks reasonable, but the number of changes is large enough that I didn't give everything a close reading. As you mentioned, we'll likely want some tests in the future to ensure the model changes work: there were quite a few of them, and it would have been easy for a bug to creep in. I need to read storage.py a little more closely, but I was hoping you could first respond to my comment there, as it would help me ground myself when reading.
I actually thought I had addressed your comment by adding typing. I'll take a second look!
Oh, I meant my most recent (new) comment about flat->unflat conversions, which it looks like you've already answered 👍
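For context, the flat->unflat conversion discussed here amounts to reshaping between a single collapsed batch dimension and explicit sampler/agent dimensions. A minimal sketch of that pattern (the helper names and shapes are my illustration, not the PR's actual code):

```python
import torch

def unflatten_batch(flat: torch.Tensor, num_samplers: int, num_agents: int) -> torch.Tensor:
    # [steps * samplers * agents, ...] -> [steps, samplers, agents, ...]
    return flat.view(-1, num_samplers, num_agents, *flat.shape[1:])

def flatten_batch(unflat: torch.Tensor) -> torch.Tensor:
    # [steps, samplers, agents, ...] -> [steps * samplers * agents, ...]
    return unflat.reshape(-1, *unflat.shape[3:])

x = torch.randn(4 * 2 * 3, 16)  # 4 steps, 2 samplers, 3 agents, 16 features
assert flatten_batch(unflatten_batch(x, 2, 3)).shape == x.shape
```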
Some minor comments about typing, and a possible suggestion for simplifying something in storage.py.
Below I've added a summary of the breaking changes introduced in this PR.

**Breaking changes**

**Unflattened tensors**
What? Why? What do you have to do?

**More flexible memory**
What? Why? What do you have to do?

```python
def _recurrent_memory_specification(self):
    return {
        "rnn": (
            (
                ("layer", self.num_recurrent_layers),
                ("sampler", None),
                ("hidden", self.recurrent_hidden_state_size),
            ),
            torch.float32,
        )
    }
```

where we specify a dimension for layers, a placeholder dimension for sampler (batch), and a dimension for the hidden state size; or

```python
def _recurrent_memory_specification(self):
    return None
```

if the model uses no recurrent memory. The `forward` method then reads and updates memory through its accessors:

```python
def forward(self, observations, memory, prev_actions, masks):
    rnn_out, new_rnn_hidden = self.state_encoder(
        x=observations[self.input_uuid],
        hidden_states=memory.tensor("rnn"),
        masks=masks,
    )
    out, _ = self.ac_nonrecurrent_head(
        observations={self.head_uuid: rnn_out},
        memory=None,
        prev_actions=prev_actions,
        masks=masks,
    )
    return (
        out,
        memory.set_tensor("rnn", new_rnn_hidden),
    )
```
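To make the spec concrete, here is a hedged sketch of how an engine might instantiate such a specification into zero-initialized tensors, filling the `None` sampler placeholder once the number of samplers is known. The concrete sizes and the instantiation loop are my own illustration of the pattern, not the PR's exact code:

```python
import torch

# Hypothetical spec mirroring the example above, with concrete sizes.
spec = {
    "rnn": (
        (("layer", 2), ("sampler", None), ("hidden", 512)),
        torch.float32,
    )
}

num_samplers = 8  # the None placeholder is only known at rollout time
memory_tensors = {
    key: torch.zeros(
        *[num_samplers if size is None else size for _, size in dims],
        dtype=dtype,
    )
    for key, (dims, dtype) in spec.items()
}

assert memory_tensors["rnn"].shape == (2, 8, 512)
```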
I cleaned up some details about the recurrent memory specification and moved the automatic addition of a …
Merging as agreed.
Still missing support for most plugin tasks and viz, but I want to make sure I'm going in the right direction.
The main idea is that all tensors I move around the engine will have step, sampler, and agent dimensions, and I'm trying to make more extensive use of the memory functionality in storage. Eventually all storage could be grouped into memories, but I'm going step by step.
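As a concrete illustration of that layout (names and sizes below are hypothetical, not taken from the PR), every tensor moved around the engine would keep explicit step, sampler, and agent dimensions rather than one flattened batch dimension:

```python
import torch

num_steps, num_samplers, num_agents = 128, 8, 2
obs_dim = 64

# Rollout storage with explicit step/sampler/agent dimensions.
rollouts = {
    "observations": torch.zeros(num_steps, num_samplers, num_agents, obs_dim),
    "rewards": torch.zeros(num_steps, num_samplers, num_agents, 1),
    "masks": torch.zeros(num_steps, num_samplers, num_agents, 1),
}

# Reading one step for the actor-critic keeps sampler and agent dims intact:
step_obs = rollouts["observations"][0]  # shape: [num_samplers, num_agents, obs_dim]
```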