
Conversation

@asolano commented Jan 17, 2018

Add preliminary support for multiple observations in the PPO module. The current implementation has a problem handling multiple observations in the trainer class.

This issue is resolved by creating a separate set of convolution layers for each observation and then merging all the streams at the first fully connected layer. The history buffer implementation is also updated to create keys for the observations dynamically.
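For context, here is a minimal sketch of that layout, written against the TF 1.x graph-mode API that ML-Agents used at the time. The function name, variable names, and layer sizes are illustrative only, not the PR's actual code or the ML-Agents API:

```python
import tensorflow as tf  # TF 1.x graph-mode API

def encode_cameras(visual_in, h_size):
    """Illustrative only: one conv stack per camera, merged at the first FC layer.

    `visual_in` is a list of [batch, height, width, channels] placeholders,
    one entry per camera observation.
    """
    streams = []
    for i, obs in enumerate(visual_in):
        with tf.variable_scope("visual_encoder_%d" % i):
            conv1 = tf.layers.conv2d(obs, 16, [8, 8], strides=[4, 4], activation=tf.nn.elu)
            conv2 = tf.layers.conv2d(conv1, 32, [4, 4], strides=[2, 2], activation=tf.nn.elu)
            flat = tf.layers.flatten(conv2)
            streams.append(tf.layers.dense(flat, h_size, activation=tf.nn.tanh))
    # Merge all camera streams along the feature axis so the shared fully
    # connected layers see a single hidden vector per batch element.
    return tf.concat(streams, axis=1)

# The dynamically keyed history buffer mentioned above could index entries in a
# similar per-camera way (key names here are hypothetical):
# history = {"observations%d" % i: [] for i in range(num_cameras)}
```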

@awjuliani (Contributor) commented:

Hi @asolano,

This is awesome! Thanks for taking the time to write these changes to fully enable multiple observations in the PPO code. Would it be possible to make a PR into our development-0.3 branch? This way we can roll it up into the other features we are adding and release it with the next version.

@awjuliani (Contributor) commented Jan 18, 2018

Hi @asolano,

The internal team had a discussion around the inclusion of this, and we've decided to merge it into master. There are incompatible changes we will be making in the next release (specifically reworking how our experience buffer works), but we will re-implement the relevant parts of code here to ensure this continues working going forward. The benefit of allowing people to use this now outweighs the small extra effort on our part.

Very excited to see what kinds of multi-camera agent scenarios you and others come up with in the future!

@asolano (Author) commented Jan 18, 2018

Hi @awjuliani ,

That's amazing, thank you very much! We have just started using it ourselves too, looking forward to what the community does with it 👍

Review thread on this hunk from the PPO model changes:

```python
height_size, width_size = brain.camera_resolutions[i]['height'], brain.camera_resolutions[i]['width']
bw = brain.camera_resolutions[i]['blackAndWhite']
encoders.append(self.create_visual_encoder(height_size, width_size, bw, h_size, 2, tf.nn.tanh, num_layers))
hidden_visual = [tf.concat(encoders, axis=1)]
```
Contributor commented:

I am not sure this will work in the case of continuous state: `encoders` is a list of lists of tensors, effectively [num_streams, num_observations, h_size], so this tf.concat will not work under these conditions.

@asolano (Author) replied:

Good catch! We were focusing on discrete control, so we missed it 😅

The continuous control case can be fixed by changing the axis argument from 1 to 2 and not wrapping the result in a list. Please refer to this commit. We added a couple of cameras to the 3DBall environment and it seems to be working.
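To make the shape issue concrete, here is a small hypothetical example (not the actual commit): with two policy/value streams and two cameras, `encoders` holds one [batch, h_size] tensor per stream per camera, and concatenating within each stream along the feature axis yields one merged hidden vector per stream:

```python
import tensorflow as tf  # TF 1.x

batch, h_size = 32, 64
num_streams, num_cameras = 2, 2

# Stand-in encoder outputs: encoders[stream][camera] has shape [batch, h_size].
encoders = [[tf.zeros([batch, h_size]) for _ in range(num_cameras)]
            for _ in range(num_streams)]

# Passing the nested `encoders` list to a single tf.concat does not produce the
# intended per-stream hidden vectors. Concatenating within each stream along
# the feature axis gives one [batch, num_cameras * h_size] tensor per stream.
hidden_visual = [tf.concat(stream, axis=1) for stream in encoders]
print([h.get_shape().as_list() for h in hidden_visual])  # [[32, 128], [32, 128]]
```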

@awjuliani (Contributor) commented:

Thanks @asolano!

@awjuliani merged commit a1d35bf into Unity-Technologies:master Jan 19, 2018
@asolano deleted the dev-multiple-observations branch January 19, 2018 01:32
@github-actions bot locked as resolved and limited conversation to collaborators May 20, 2021