Initial support for multiple observations #256
Hi @asolano, This is awesome! Thanks for taking the time to write these changes to fully enable multiple observations in the PPO code. Would it be possible to make a PR into our
Hi @asolano, The internal team had a discussion around the inclusion of this, and we've decided to merge it into master. There are incompatible changes we will be making in the next release (specifically reworking how our experience buffer works), but we will re-implement the relevant parts of code here to ensure this continues working going forward. The benefit of allowing people to use this now outweighs the small extra effort on our part. Very excited to see what kinds of multi-camera agent scenarios you and others come up with in the future!
Hi @awjuliani, That's amazing, thank you very much! We have just started using it ourselves too, looking forward to what the community does with it 👍
python/ppo/models.py
Outdated
height_size, width_size = brain.camera_resolutions[i]['height'], brain.camera_resolutions[i]['width']
bw = brain.camera_resolutions[i]['blackAndWhite']
encoders.append(self.create_visual_encoder(height_size, width_size, bw, h_size, 2, tf.nn.tanh, num_layers))
hidden_visual = [tf.concat(encoders, axis=1)]
I am not sure this will work in the case of continuous state: encoders is a list of lists of tensors, with shape [num_streams, num_observations, h_size]. This tf.concat will not work under these conditions.
Good catch! We were focusing on discrete control, so we missed it 😅
The continuous control case can be fixed by changing the axis argument from 1 to 2 and not wrapping the result in a list. Please refer to this commit. We added a couple of cameras to the 3DBall environment and it seems to be working.
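For anyone following the shape discussion, here is a minimal NumPy sketch of why the axis matters. NumPy stands in for TensorFlow only to keep the example self-contained. It assumes each camera's encoder returns a list of per-stream tensors of shape [batch, h_size]; that structure and all of the sizes are assumptions made for illustration, not something confirmed in this thread.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
num_cameras, num_streams, batch, h_size = 3, 2, 4, 8

# Assumed structure: one entry per camera, each a list of per-stream
# activations with shape [batch, h_size]. Camera k is filled with the value k
# so its features are easy to spot after merging.
encoders = [
    [np.full((batch, h_size), float(cam)) for _ in range(num_streams)]
    for cam in range(num_cameras)
]

# Stacking each camera's list gives arrays of shape [num_streams, batch, h_size].
stacked = [np.stack(streams) for streams in encoders]

# axis=1 joins the cameras along the batch dimension, which mixes samples.
wrong = np.concatenate(stacked, axis=1)

# axis=2 joins the cameras along the feature dimension, so every stream sees
# the features of all cameras for the same batch element.
merged = np.concatenate(stacked, axis=2)

print(wrong.shape)   # (2, 12, 8)
print(merged.shape)  # (2, 4, 24)
```

Under these assumptions, `merged[0]` and `merged[1]` can be indexed directly as the per-stream inputs, which is one way to read "not making a list".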
Thanks @asolano!
Add preliminary support for multiple observations in the PPO module. The current implementation has a problem handling multiple observations in the trainer class.
This is resolved by creating a separate set of convolution layers for each observation and then merging all the streams in the first fully connected layer. The history buffer implementation is also updated to create the keys for observations dynamically.
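As a rough illustration of the "keys created dynamically" idea, here is a minimal plain-Python sketch of a history buffer that gets one key per observation. The function names (`empty_history`, `append_step`) and the key naming scheme are hypothetical and are not taken from the actual patch.

```python
def empty_history(num_observations):
    """Build an empty history with one list per visual observation.

    The key names ('observations0', 'observations1', ...) are an assumption
    made for this sketch.
    """
    history = {"states": [], "actions": [], "rewards": []}
    for i in range(num_observations):
        history["observations%d" % i] = []
    return history


def append_step(history, observations, state, action, reward):
    """Append one time step; `observations` holds one frame per camera."""
    for i, frame in enumerate(observations):
        history["observations%d" % i].append(frame)
    history["states"].append(state)
    history["actions"].append(action)
    history["rewards"].append(reward)


# Usage: two cameras, one recorded step with placeholder frames.
history = empty_history(num_observations=2)
append_step(history, observations=["frame_a", "frame_b"],
            state=[0.1, 0.2], action=1, reward=0.5)
print(sorted(history.keys()))
```

The point of the design is that the number of cameras is read from the environment at run time, so nothing in the buffer needs to be hard-coded for a single observation.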