
Adds SubprocessUnityEnvironment for parallel envs #1751

Merged
harperj merged 2 commits into develop from develop-jh-parallel-envs on Apr 3, 2019

Conversation


@harperj harperj commented Feb 22, 2019

This commit adds support for running Unity environments in parallel.
An abstract base class was created for UnityEnvironment, from which the
new SubprocessUnityEnvironment inherits.

SubprocessUnityEnvironment communicates with its worker processes over
pipes, sending them commands that are run in parallel.

A few significant changes needed to be made as a side effect:

  • UnityEnvironments are created via a factory method (a closure)
    rather than being directly created by the main process.
  • In mlagents-learn "worker-id" has been replaced by "base-port"
    and "num-envs", and worker_ids are automatically assigned across runs.
  • BrainInfo objects now convert all fields to Python lists to avoid
    serialization issues.
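
To make the mechanism above concrete, here is a minimal, self-contained sketch of the pattern (not the actual ml-agents implementation): a top-level environment factory is handed to each worker process, the environment is created inside the subprocess, and the parent exchanges commands and results over pipes. The names FakeEnv, worker, and env_factory are invented for illustration; the real code additionally has to serialize the factory and the returned BrainInfo objects across the process boundary, which is what motivates the list conversion noted above.

    from multiprocessing import Pipe, Process


    class FakeEnv:
        """Stand-in for UnityEnvironment so the sketch runs without Unity."""

        def __init__(self, worker_id):
            self.worker_id = worker_id

        def reset(self):
            return {"worker": self.worker_id, "obs": [0.0]}

        def step(self, action):
            return {"worker": self.worker_id, "obs": [action]}

        def close(self):
            pass


    def env_factory(worker_id):
        # In ml-agents this factory would construct a UnityEnvironment bound to a
        # unique worker_id (and hence port); a stub keeps the example runnable anywhere.
        return FakeEnv(worker_id)


    def worker(remote, factory, worker_id):
        env = factory(worker_id)              # the environment is created inside the subprocess
        try:
            while True:
                cmd, payload = remote.recv()  # commands arrive over the pipe
                if cmd == "reset":
                    remote.send(env.reset())
                elif cmd == "step":
                    remote.send(env.step(payload))
                elif cmd == "close":
                    break
        finally:
            env.close()
            remote.close()


    if __name__ == "__main__":
        parents, procs = [], []
        for wid in range(2):                  # analogue of num-envs=2
            parent_conn, child_conn = Pipe()
            proc = Process(target=worker, args=(child_conn, env_factory, wid))
            proc.start()
            parents.append(parent_conn)
            procs.append(proc)

        for conn in parents:                  # broadcast a command to every worker...
            conn.send(("reset", None))
        print([conn.recv() for conn in parents])   # ...then gather results from all of them

        for conn, proc in zip(parents, procs):
            conn.send(("close", None))
            proc.join()

Creating the environment inside the subprocess, rather than passing a live UnityEnvironment across, is what motivates the factory-method change above: a running environment holds resources (a launched executable, an open socket) that cannot be sent to another process.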


Sohojoe commented Feb 24, 2019

Hi @harperj - I have been playing around with spawning environments in my fork of marathon-envs and can share some thoughts and insights:

You can see my code here - https://github.com/Sohojoe/marathon-envs/tree/feature/multiple_physics_scenes

To switch between a single physics scene and multiple physics scenes, flip line 76 of Assets/ML-Agents/Scripts/EnvSpawner.cs

          bool multiverse = true;

To change the number of agents, update the InferenceNumEnvs setting. Select the agent from the pulldown menu (Terrain agents don't work yet in multi-physics as I haven't updated the ray trace code).


  1. It is much easier to maintain multiple environment instances via a setting or command line than managing this manually in a scene. I regularly train with 128 instances, and that is far too many to handle by hand. I implemented a setting for training and for inference that can be overridden on the command line (num_envs=128).

  2. I have experimented with using a single physics scene vs using a unique physics scene per environment; here are the pros/cons of each

SinglePhysicsScene Pros:

  • Runs Faster (only tested in inference mode - 75fps)


SinglePhysicsScene Cons:

  • It can be hard to adapt the environment and code - I offset the spawn position of each environment, so the code needs to be updated to work relative to the spawn position. This was painful and can be hard to debug.

MultiPhysicsScene Pros:

  • Easier for users to implement, as one does not have to change the spawn position. The one change users will need to make is for ray tracing to use the correct scene's physics. Note: I have not tested with visual observations; I assume that a camera within a sub-scene will only render that scene, but this should be checked.

MultiPhysicsScene Cons:

  • Runs slower (only tested in inference mode - 50fps, so about 1/3 slower)


So it would be ideal to support both so that users can manage the tradeoff.

  3. Regarding the performance boost when training with multiple agents - I have some insights in my paper from the AAAI workshop.

  4. Let me know how I can support this - I would love to see it in core ml-agents and implemented properly (my code is not great, as I didn't spend the time to learn how ml-agents passes data between Python and Unity beyond hacking something together). I think I need 2018.2 or .3 as I take advantage of the new prefabs (but this may only be for the implementation of my environments). I also implemented support for multiple environments from a list - this will enable me to ship binaries for marathon-envs so that users can specify the environment from the command line (however, this is not finished).


harperj commented Feb 25, 2019

Hi @Sohojoe --

It sounds like you're working on multiple training environments (we have been calling them "areas") within a single Unity executable. I am definitely interested in the separation using different scenes since, as you mentioned, adapting the code for a single scene can be tricky (especially when starting from an existing codebase). It may be worth setting that up as a separate feature to consider in the future.

This PR deals with the independent issue of launching multiple Unity executables, each of which might contain many areas. I think it's still an open question how to balance more areas within a single executable against more parallel executables.

H

@harperj harperj changed the base branch from develop-jh-actors to develop February 25, 2019 23:11
@harperj harperj requested a review from eshvk March 5, 2019 21:35

harperj commented Mar 26, 2019

Feedback addressed and tests fixed -- PTAL.


eshvk commented Mar 29, 2019

👍


harperj commented Apr 2, 2019

Squashed and rebased. Note that subprocess environments won't support custom observations at this time. Just a heads up if there are any final comments -- cc @vincentpierre


harperj commented Apr 2, 2019

@Sohojoe I'm also interested in whether you've had a chance to take another look at this. We are close to making our next release branch, and I wonder how this change interacts with the improvements you've made for the marathon environments.


eshvk commented Apr 3, 2019

@harperj Please rebase and also fix the Codacy issue. @vincentpierre: Can you review the note regarding custom observations as well?


harperj commented Apr 3, 2019

@eshvk the Codacy complaint is that the subclass supports an additional optional parameter (related to custom protos). I think this should be fine.
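
For context, a hedged sketch of the kind of signature difference a static analyzer may flag here; the method name and extra parameter below are invented for illustration and are not the actual ml-agents signatures. Because the added parameter is optional and defaults to None, callers written against the base interface keep working unchanged.

    class BaseUnityEnvironment:
        """Illustrative base interface (names are hypothetical)."""

        def reset(self, config=None):
            return {}


    class SubprocessUnityEnvironment(BaseUnityEnvironment):
        # The extra optional keyword argument is the sort of thing linters report as a
        # signature mismatch (pylint's "arguments-differ" check, for example); since it
        # defaults to None, existing callers of the base interface are unaffected.
        def reset(self, config=None, custom_reset_parameters=None):
            return {"config": config, "custom": custom_reset_parameters}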

@harperj harperj merged commit e59eff4 into develop Apr 3, 2019
@awjuliani awjuliani deleted the develop-jh-parallel-envs branch July 23, 2019 20:18
LeSphax pushed a commit to LeSphax/ml-agents-1 that referenced this pull request May 3, 2020
@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 18, 2021