
Issue108 unflatten rollouts #141

Merged

merged 27 commits from issue108_unflatten_rollouts into master on Aug 22, 2020

Conversation

@jordis-ai2 (Collaborator) commented Aug 15, 2020

Still missing support for most plugin tasks and viz, but just to make sure I'm going in the right direction.

The main idea is that all tensors I will move around the engine will have step, sampler and agent dimensions, and I'm trying to make more extensive use of memory functionality in storage. Eventually all storage could be grouped in memories, but I'm going step by step.
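A hypothetical sketch of that dimension convention (illustrative shapes only, not the engine's actual code):

```python
import torch

# Illustrative sketch: every tensor moved around the engine carries
# leading (step, sampler, agent) dimensions. Shapes are hypothetical.
num_steps, num_samplers, num_agents = 5, 4, 2

# e.g. per-step rewards for every sampler/agent pair:
rewards = torch.zeros(num_steps, num_samplers, num_agents)

# e.g. RGB observations gain the same leading dimensions:
obs = torch.zeros(num_steps, num_samplers, num_agents, 3, 64, 64)

print(rewards.shape)  # torch.Size([5, 4, 2])
print(obs.shape)      # torch.Size([5, 4, 2, 3, 64, 64])
```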

@jordis-ai2 jordis-ai2 added this to the 0.1 milestone Aug 15, 2020
@jordis-ai2 jordis-ai2 linked an issue Aug 15, 2020 that may be closed by this pull request
@Lucaweihs (Collaborator) left a comment


Wow, unflattening these tensors required more changes than I expected. It looks really nice. I didn't read the last few files super closely (especially the model architectures) but if you've verified it returns reasonable output then I'm happy.

Review threads (resolved) on: core/algorithms/onpolicy_sync/light_engine.py, core/algorithms/onpolicy_sync/storage.py, core/base_abstractions/task.py, core/models/basic_models.py
* Addressed initial reviews from PR141
* Memory and observations exclude agent dimension if no agent is present
* ActorCritic outputs, stored rewards, etc. always have agent dimension
* Hacks to make robothor work without dataset
* Single-agent robothor, minigrid, babyai working
…ttened names (only really used for rnn_memory viz)
@jordis-ai2 jordis-ai2 marked this pull request as ready for review August 20, 2020 00:36
@Lucaweihs (Collaborator) left a comment


Very nice changes. Everything looks reasonable but the number of changes is large enough that I didn't give everything a close reading. As you had mentioned, we'll likely want some tests in the future to ensure the model changes work as there were quite a few of them and I suspect it may have been easy for a bug to creep in. I need to read storage.py a little more closely but I was hoping you could first respond to my comment there as it would help me ground myself when reading.

Review threads on: core/algorithms/onpolicy_sync/light_engine.py, core/algorithms/onpolicy_sync/storage.py, core/models/basic_models.py, plugins/robothor_plugin/robothor_models.py
@jordis-ai2 (Collaborator, Author)

I need to read storage.py a little more closely but I was hoping you could first respond to my comment there as it would help me ground myself when reading.

I actually thought I had addressed your comment by adding typing. I'll take a second look!

@Lucaweihs (Collaborator)

Oh, I meant my most recent (new) comment about flat->unflat conversions, which it looks like you've already answered 👍

@Lucaweihs (Collaborator) left a comment


Some minor comments about typing and a possible suggestion for simplifying something in storage.py.

Review threads on: core/algorithms/onpolicy_sync/engine.py, core/algorithms/onpolicy_sync/storage.py, core/models/basic_models.py, plugins/babyai_plugin/babyai_models.py, plugins/minigrid_plugin/minigrid_models.py, plugins/robothor_plugin/robothor_models.py, projects/objectnav_baselines/models/object_nav_models.py
@Lucaweihs (Collaborator)

Below I've added the summary of the breaking changes introduced in this PR.

Breaking changes

Unflattened tensors

What?


When passing observations to an actor-critic model, we previously flattened the dimensions corresponding to the time step and the task sampler index. This meant that a batch of, for example, RGB images was given to your model with dimensions ((# steps in rollout) x (# tasks running in parallel), 3, height, width). This makes it easy to run a CNN on the observations but is very annoying when you want to plug those observations into an RNN (you need to implicitly remember how things are ordered). We've done away with this flattening, so now you'll be given, for example, a tensor with shape (# steps in rollout, # tasks running in parallel, 3, height, width).

Why?


This means you may need to do some more processing (e.g., flattening the first two dimensions before running a CNN), but it makes things much more consistent and, in my experience, makes debugging a lot easier.

What do you have to do?


Update the forward pass of your models to account for these new dimensions.
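For instance, a minimal sketch (hypothetical shapes and module, assuming PyTorch; not allenact's exact API) of merging the two leading dimensions before a CNN and restoring them afterwards:

```python
import torch
import torch.nn as nn

# Hypothetical sketch: run a CNN over observations shaped
# (steps, samplers, C, H, W) by temporarily merging the two
# leading dimensions into a standard 4-D batch.
steps, samplers, C, H, W = 4, 8, 3, 32, 32
obs = torch.randn(steps, samplers, C, H, W)

cnn = nn.Conv2d(C, 16, kernel_size=3, padding=1)

# Merge step and sampler dims so the CNN sees a (N, C, H, W) batch...
flat_feats = cnn(obs.view(steps * samplers, C, H, W))

# ...then restore the leading (steps, samplers) dims for an RNN.
feats = flat_feats.view(steps, samplers, *flat_feats.shape[1:])
print(feats.shape)  # torch.Size([4, 8, 16, 32, 32])
```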

More flexible memory

What?


We're making some breaking changes to how allenact handles memory as we gear up to the release.

Why?


We built allenact from a library that assumed that the only type of recurrent networks you'd ever care about are LSTMs or GRUs. Because of this, a lot of naming/code conventions were designed with these RNNs in mind and it became a bit awkward to, for example, define a recurrent map-based memory when confined to these conventions. For instance, it was previously required that an actor-critic model define num_recurrent_layers and recurrent_hidden_state_size attributes which make little sense for non-LSTM/GRU models.

What do you have to do?


For an ActorCriticModel to work, we now require that you:

Define an explicit memory specification. For example, for an LSTM/GRU memory in a model with a single agent, we can write something like:

    def _recurrent_memory_specification(self):  
        return {  
            "rnn": (  
                (  
                    ("layer", self.num_recurrent_layers),  
                    ("sampler", None),  
                    ("hidden", self.recurrent_hidden_state_size),  
                ),  
                torch.float32,  
            )  
        }

where we specify a dimension for layers, a placeholder dimension for sampler (batch), a dimension for hidden-dims and a data type, or

    def _recurrent_memory_specification(self):  
        return None

if the ActorCriticModel is memory-less.

Update your forward definitions. In forward, we now receive a memory (not a recurrent_hidden_state), so if you were using an LSTM/GRU-based memory you need to extract the hidden state via memory.tensor("rnn") (this assumes the LSTM/GRU definition of _recurrent_memory_specification above). Once we're done, we need to pack the resulting hidden state back into a memory, which can be done with memory.set_tensor("rnn", new_hidden_state). Putting it together, it should look similar to:

    def forward(self, observations, memory, prev_actions, masks):  
        rnn_out, new_rnn_hidden = self.state_encoder(  
            x=observations[self.input_uuid],  
            hidden_states=memory.tensor("rnn"),  
            masks=masks,  
        )
        out, _ = self.ac_nonrecurrent_head(  
            observations={self.head_uuid: rnn_out},  
            memory=None,  
            prev_actions=prev_actions,  
            masks=masks,  
        )
        return (  
            out,  
            memory.set_tensor("rnn", new_rnn_hidden),  
        )

@jordis-ai2 (Collaborator, Author)

I cleaned up some details about recurrent memory specification and moved the automatic addition of a step dim to storage (which is now the only object used during training that sees/uses that dim).

@jordis-ai2 (Collaborator, Author)

Merging as agreed.

@jordis-ai2 jordis-ai2 merged commit 6cf9603 into master Aug 22, 2020
@jordis-ai2 jordis-ai2 deleted the issue108_unflatten_rollouts branch August 22, 2020 00:35
@jordis-ai2 jordis-ai2 mentioned this pull request Aug 24, 2020
Successfully merging this pull request may close these issues.

Unflatten tensors in RolloutStorage