From 77850b057083d587dfdc2b41b756b770073fad4f Mon Sep 17 00:00:00 2001
From: Tom Thompson
Date: Thu, 4 Jun 2020 18:30:03 -0400
Subject: [PATCH 1/2] doc updates

getting started page now uses consistent run-id
re-order create-new docs to have less back/forth between unity and text editor
---
 docs/Getting-Started.md                 |  2 +-
 docs/Learning-Environment-Create-New.md | 59 +++++++++++++------------
 2 files changed, 32 insertions(+), 29 deletions(-)

diff --git a/docs/Getting-Started.md b/docs/Getting-Started.md
index 9762e25835..d9352e72b7 100644
--- a/docs/Getting-Started.md
+++ b/docs/Getting-Started.md
@@ -236,7 +236,7 @@ If you've quit the training early using `Ctrl+C` and want to resume training,
 run the same command again, appending the `--resume` flag:
 
 ```sh
-mlagents-learn config/ppo/3DBall.yaml --run-id=firstRun --resume
+mlagents-learn config/ppo/3DBall.yaml --run-id=first3DBallRun --resume
 ```
 
 Your trained model will be at `results//.nn` where
diff --git a/docs/Learning-Environment-Create-New.md b/docs/Learning-Environment-Create-New.md
index 863b7cae7e..b2adae0c44 100644
--- a/docs/Learning-Environment-Create-New.md
+++ b/docs/Learning-Environment-Create-New.md
@@ -269,7 +269,7 @@ component, `rBody`, using the `Rigidbody.AddForce` function:
 Vector3 controlSignal = Vector3.zero;
 controlSignal.x = action[0];
 controlSignal.z = action[1];
-rBody.AddForce(controlSignal * speed);
+rBody.AddForce(controlSignal * forceMultiplier);
 ```
 
 #### Rewards
@@ -313,14 +313,14 @@ With the action and reward logic outlined above, the final version of the
 `OnActionReceived()` function looks like:
 
 ```csharp
-public float speed = 10;
+public float forceMultiplier = 10;
 public override void OnActionReceived(float[] vectorAction)
 {
     // Actions, size = 2
     Vector3 controlSignal = Vector3.zero;
     controlSignal.x = vectorAction[0];
     controlSignal.z = vectorAction[1];
-    rBody.AddForce(controlSignal * speed);
+    rBody.AddForce(controlSignal * forceMultiplier);
 
     // Rewards
     float distanceToTarget = Vector3.Distance(this.transform.localPosition, Target.localPosition);
@@ -340,33 +340,9 @@ public override void OnActionReceived(float[] vectorAction)
 }
 ```
 
-Note the `speed` class variable is defined before the function. Since `speed` is
+Note the `forceMultiplier` class variable is defined before the function. Since `forceMultiplier` is
 public, you can set the value from the Inspector window.
 
-## Final Editor Setup
-
-Now, that all the GameObjects and ML-Agent components are in place, it is time
-to connect everything together in the Unity Editor. This involves changing some
-of the Agent Component's properties so that they are compatible with our Agent
-code.
-
-1. Select the **RollerAgent** GameObject to show its properties in the Inspector
-   window.
-1. Add the `Decision Requester` script with the Add Component button from the
-   RollerAgent Inspector.
-1. Change **Decision Period** to `10`.
-1. Drag the Target GameObject from the Hierarchy window to the RollerAgent
-   Target field.
-1. Add the `Behavior Parameters` script with the Add Component button from the
-   RollerAgent Inspector.
-1. Modify the Behavior Parameters of the Agent :
-   - `Behavior Name` to _RollerBall_
-   - `Vector Observation` > `Space Size` = 8
-   - `Vector Action` > `Space Type` = **Continuous**
-   - `Vector Action` > `Space Size` = 2
-
-Now you are ready to test the environment before training.
-
 ## Testing the Environment
 
 It is always a good idea to first test your environment by controlling the Agent
@@ -392,6 +368,31 @@ the platform. Make sure that there are no errors displayed in the Unity Editor
 Console window and that the Agent resets when it reaches its target or falls
 from the platform.
 
+## Final Editor Setup
+
+Now that all the GameObjects and ML-Agent components are in place, it is time
+to connect everything together in the Unity Editor. This involves changing some
+of the Agent Component's properties so that they are compatible with our Agent
+code.
+
+1. Select the **RollerAgent** GameObject to show its properties in the Inspector
+   window.
+1. Add the `Decision Requester` script with the Add Component button from the
+   RollerAgent Inspector.
+
+1. Change **Decision Period** to `10`.
+1. Drag the Target GameObject from the Hierarchy window to the RollerAgent
+   Target field.
+1. Add the `Behavior Parameters` script with the Add Component button from the
+   RollerAgent Inspector.
+1. Modify the Behavior Parameters of the Agent:
+   - `Behavior Name` to _RollerBall_
+   - `Vector Observation` > `Space Size` = 8
+   - `Vector Action` > `Space Type` = **Continuous**
+   - `Vector Action` > `Space Size` = 2
+
+Now you are ready to test the environment before training.
+
 ## Training the Environment
 
 The process is the same as described in the
@@ -427,6 +428,8 @@ behaviors:
     summary_freq: 10000
 ```
 
+Hyperparameters are explained in [the training configuration file documentation](Training-Configuration-File.md).
+
 Since this example creates a very simple training environment with only a few
 inputs and outputs, using small batch and buffer sizes speeds up the training
 considerably. However, if you add more complexity to the environment or change

From 779140faaf46f939ea93cfc83a61a3eab9dcfb9d Mon Sep 17 00:00:00 2001
From: Tom Thompson
Date: Thu, 4 Jun 2020 18:46:17 -0400
Subject: [PATCH 2/2] add link explaining decisions where we tell the reader to modify its parameter

---
 docs/Learning-Environment-Create-New.md | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/docs/Learning-Environment-Create-New.md b/docs/Learning-Environment-Create-New.md
index b2adae0c44..b34605d0ed 100644
--- a/docs/Learning-Environment-Create-New.md
+++ b/docs/Learning-Environment-Create-New.md
@@ -379,8 +379,7 @@ code.
    window.
 1. Add the `Decision Requester` script with the Add Component button from the
    RollerAgent Inspector.
-
-1. Change **Decision Period** to `10`.
+1. Change **Decision Period** to `10`. For more information on decisions, see [the Agent documentation](Learning-Environment-Design-Agents.md#decisions).
 1. Drag the Target GameObject from the Hierarchy window to the RollerAgent
    Target field.
 1. Add the `Behavior Parameters` script with the Add Component button from the
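
For reference, the "Testing the Environment" section that these hunks reorder has the reader control the agent from the keyboard, which ML-Agents does through the Agent's `Heuristic()` override. A minimal sketch of such an override follows; it is not part of either patch, and it assumes the `float[]`-based ML-Agents 1.0 `Agent` API (`Unity.MLAgents` namespace) matching the `OnActionReceived(float[] vectorAction)` signature shown in the diff, along with the tutorial's `RollerAgent` class name.

```csharp
using UnityEngine;
using Unity.MLAgents;

public class RollerAgent : Agent
{
    // Keyboard control for manual testing: map Unity's default
    // "Horizontal"/"Vertical" input axes onto the two continuous actions
    // that the tutorial's OnActionReceived() applies as x/z forces.
    // Takes effect when Behavior Type is set to "Heuristic Only" in the
    // Behavior Parameters component.
    public override void Heuristic(float[] actionsOut)
    {
        actionsOut[0] = Input.GetAxis("Horizontal");
        actionsOut[1] = Input.GetAxis("Vertical");
    }
}
```

With that in place, pressing Play lets you steer the agent and confirm the reset-on-target and reset-on-fall behavior before launching `mlagents-learn`.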