Support multi-modal policies #108

eric-heiden · 2018-05-30T20:35:02Z

To support visuomotor control learning and similar problems, we need a way to use policies composed of submodules that each handle one input modality, such as images or vectors. OpenAI Gym already supports this on the observation side through its Tuple space (gym.spaces.Tuple), which combines several different spaces. The most common multi-modal observation space is a combination of 2D images and vectors.
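For reference, here is a minimal sketch of such an image-plus-vector observation space built from Gym's existing Tuple and Box spaces. The image size (64x64 RGB) and the 2D vector bounds are assumptions chosen for illustration, not part of this proposal, and the import falls back to gymnasium in case only the successor package is installed.

```python
import numpy as np

try:
    from gym import spaces
except ImportError:  # assumption: the gymnasium successor exposes the same spaces API
    from gymnasium import spaces

# Illustrative shapes only: a 64x64 RGB image and a 2D position vector.
image_space = spaces.Box(low=0, high=255, shape=(64, 64, 3), dtype=np.uint8)
vector_space = spaces.Box(low=-1.0, high=1.0, shape=(2,), dtype=np.float32)

# A Tuple space groups the two modalities into one observation space.
obs_space = spaces.Tuple((image_space, vector_space))

# Sampling yields one element per sub-space, in order.
obs = obs_space.sample()
image, vector = obs
```

Each sub-space stays accessible through obs_space.spaces, so a policy can inspect the modalities it has to handle.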

The exact specification still needs to be worked out, but for now the task items are as follows:

add a new space representing 2D images
implement a test environment whose observation space is a tuple space consisting of an image and a vector (e.g. a reacher with a top-down view image and the 2D end-effector position)
additionally, a wrapper would be useful that adds a visual output to an existing environment (it renders a user-defined camera to a 2D pixel array and adds it to the tuple space, or creates a tuple space if the environment was unimodal before)
implement a multi-modal policy that builds convolutional submodules for image spaces and MLPs for vector spaces, and merges the top layers of these submodules via an MLP that computes the final output
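The last item could look roughly like the following NumPy-only sketch: one convolutional submodule for the image, one MLP for the vector, and a merging MLP on top. Everything here (the class name MultiModalPolicy, the get_action method, layer sizes, stride, random untrained weights) is hypothetical and only meant to make the data flow concrete; a real implementation would use the project's own network primitives and policy interfaces.

```python
import numpy as np

rng = np.random.default_rng(0)
STRIDE = 2  # assumed stride for the single convolutional layer


def relu(x):
    return np.maximum(x, 0.0)


def init_dense(n_in, n_out):
    """Return (weights, bias) for one fully connected layer."""
    return rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out)


def mlp(x, layers):
    """Apply dense layers with ReLU in between (none after the last layer)."""
    for i, (w, b) in enumerate(layers):
        x = x @ w + b
        if i < len(layers) - 1:
            x = relu(x)
    return x


def conv2d(image, kernels):
    """Naive valid convolution: image (H, W, C), kernels (k, k, C, F)."""
    k, n_filters = kernels.shape[0], kernels.shape[3]
    out_h = (image.shape[0] - k) // STRIDE + 1
    out_w = (image.shape[1] - k) // STRIDE + 1
    out = np.empty((out_h, out_w, n_filters))
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * STRIDE, j * STRIDE
            patch = image[r:r + k, c:c + k, :]
            # Contract the (k, k, C) patch against every filter at once.
            out[i, j] = np.tensordot(patch, kernels, axes=3)
    return out


class MultiModalPolicy:
    """Hypothetical policy for (image, vector) tuple observations."""

    def __init__(self, image_shape, vector_dim, action_dim,
                 n_filters=4, kernel=3, hidden=16):
        h, w, c = image_shape
        self.kernels = rng.normal(0.0, 0.1, size=(kernel, kernel, c, n_filters))
        conv_out = (((h - kernel) // STRIDE + 1)
                    * ((w - kernel) // STRIDE + 1) * n_filters)
        self.vector_layers = [init_dense(vector_dim, hidden)]
        # The merging MLP consumes the concatenated submodule features.
        self.merge_layers = [init_dense(conv_out + hidden, hidden),
                             init_dense(hidden, action_dim)]

    def get_action(self, observation):
        image, vector = observation
        image_features = relu(conv2d(image / 255.0, self.kernels)).ravel()
        vector_features = relu(mlp(vector, self.vector_layers))
        merged = np.concatenate([image_features, vector_features])
        return mlp(merged, self.merge_layers)


policy = MultiModalPolicy(image_shape=(16, 16, 3), vector_dim=2, action_dim=2)
image = rng.integers(0, 256, size=(16, 16, 3), dtype=np.uint8)
vector = np.array([0.1, -0.3])
action = policy.get_action((image, vector))
```

Keeping one submodule per modality means further modalities could be added by appending another feature extractor before the concatenation, without touching the merging MLP's structure beyond its input width.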

Your feedback on this issue is most welcome so that we can split up this feature into smaller tasks.


ryanjulian · 2018-06-11T05:05:36Z

See rlworkgroup/garage#6

eric-heiden added the "big" label (big multi-feature projects) May 30, 2018

ryanjulian mentioned this issue Jun 11, 2018

Support multi-modal policies rlworkgroup/garage#6

Closed

ryanjulian closed this as completed Jun 11, 2018
