docs/Getting-Started.md: 2 changes (1 addition, 1 deletion)
@@ -236,7 +236,7 @@ If you've quit the training early using `Ctrl+C` and want to resume training,
run the same command again, appending the `--resume` flag:

```sh
-mlagents-learn config/ppo/3DBall.yaml --run-id=firstRun --resume
+mlagents-learn config/ppo/3DBall.yaml --run-id=first3DBallRun --resume
```

Your trained model will be at `results/<run-identifier>/<behavior_name>.nn` where
docs/Learning-Environment-Create-New.md: 58 changes (30 additions, 28 deletions)
@@ -269,7 +269,7 @@ component, `rBody`, using the `Rigidbody.AddForce` function:
Vector3 controlSignal = Vector3.zero;
controlSignal.x = action[0];
controlSignal.z = action[1];
-rBody.AddForce(controlSignal * speed);
+rBody.AddForce(controlSignal * forceMultiplier);
```

#### Rewards
@@ -313,14 +313,14 @@ With the action and reward logic outlined above, the final version of the
`OnActionReceived()` function looks like:

```csharp
-public float speed = 10;
+public float forceMultiplier = 10;
public override void OnActionReceived(float[] vectorAction)
{
// Actions, size = 2
Vector3 controlSignal = Vector3.zero;
controlSignal.x = vectorAction[0];
controlSignal.z = vectorAction[1];
-rBody.AddForce(controlSignal * speed);
+rBody.AddForce(controlSignal * forceMultiplier);

// Rewards
float distanceToTarget = Vector3.Distance(this.transform.localPosition, Target.localPosition);
@@ -340,33 +340,9 @@ public override void OnActionReceived(float[] vectorAction)
}
```

-Note the `speed` class variable is defined before the function. Since `speed` is
+Note the `forceMultiplier` class variable is defined before the function. Since `forceMultiplier` is
public, you can set the value from the Inspector window.

-## Final Editor Setup
-
-Now, that all the GameObjects and ML-Agent components are in place, it is time
-to connect everything together in the Unity Editor. This involves changing some
-of the Agent Component's properties so that they are compatible with our Agent
-code.
-
-1. Select the **RollerAgent** GameObject to show its properties in the Inspector
-   window.
-1. Add the `Decision Requester` script with the Add Component button from the
-   RollerAgent Inspector.
-1. Change **Decision Period** to `10`.
-1. Drag the Target GameObject from the Hierarchy window to the RollerAgent
-   Target field.
-1. Add the `Behavior Parameters` script with the Add Component button from the
-   RollerAgent Inspector.
-1. Modify the Behavior Parameters of the Agent :
-   - `Behavior Name` to _RollerBall_
-   - `Vector Observation` > `Space Size` = 8
-   - `Vector Action` > `Space Type` = **Continuous**
-   - `Vector Action` > `Space Size` = 2
-
-Now you are ready to test the environment before training.

## Testing the Environment

It is always a good idea to first test your environment by controlling the Agent
@@ -392,6 +368,30 @@ the platform. Make sure that there are no errors displayed in the Unity Editor
Console window and that the Agent resets when it reaches its target or falls
from the platform.
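
Manual control like this is typically wired up through the Agent's `Heuristic()` method, with **Behavior Type** set to **Heuristic Only** in the Behavior Parameters. A minimal sketch of such an override for this version of the API, assuming the two continuous actions are driven by Unity's standard `Horizontal` and `Vertical` input axes:

```csharp
// Map keyboard/joystick input onto the two continuous actions so the
// agent can be driven around the platform in Play mode before training.
public override void Heuristic(float[] actionsOut)
{
    actionsOut[0] = Input.GetAxis("Horizontal");
    actionsOut[1] = Input.GetAxis("Vertical");
}
```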

+## Final Editor Setup
+
+Now that all the GameObjects and ML-Agent components are in place, it is time
+to connect everything together in the Unity Editor. This involves changing some
+of the Agent Component's properties so that they are compatible with our Agent
+code.
+
+1. Select the **RollerAgent** GameObject to show its properties in the Inspector
+   window.
+1. Add the `Decision Requester` script with the Add Component button from the
+   RollerAgent Inspector.
+1. Change **Decision Period** to `10`. For more information on decisions, see
+   [the Agent documentation](Learning-Environment-Design-Agents.md#decisions).
+1. Drag the Target GameObject from the Hierarchy window to the RollerAgent
+   Target field.
+1. Add the `Behavior Parameters` script with the Add Component button from the
+   RollerAgent Inspector.
+1. Modify the Behavior Parameters of the Agent:
+   - `Behavior Name` to _RollerBall_
+   - `Vector Observation` > `Space Size` = 8
+   - `Vector Action` > `Space Type` = **Continuous**
+   - `Vector Action` > `Space Size` = 2
+
+Now you are ready to test the environment before training.
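
For context, the `Vector Observation` > `Space Size` of 8 matches the eight float values the agent observes, and the `Vector Action` > `Space Size` of 2 matches the two continuous actions applied in `OnActionReceived()`. As a reminder of where the 8 comes from, here is a sketch of a `CollectObservations()` method, assuming the observations used earlier in this tutorial (Target position, Agent position, and the Agent's x/z velocity):

```csharp
public override void CollectObservations(VectorSensor sensor)
{
    // Target and Agent positions: 3 + 3 = 6 values
    sensor.AddObservation(Target.localPosition);
    sensor.AddObservation(this.transform.localPosition);

    // Agent velocity along the x and z axes: 2 values
    sensor.AddObservation(rBody.velocity.x);
    sensor.AddObservation(rBody.velocity.z);
}
```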

## Training the Environment

The process is the same as described in the
@@ -427,6 +427,8 @@ behaviors:
summary_freq: 10000
```

+Hyperparameters are explained in [the training configuration file documentation](Training-Configuration-File.md).
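
For reference, a trainer configuration for this example typically begins along these lines. This is an abridged sketch only; the values shown for `batch_size`, `buffer_size`, and `learning_rate` are illustrative, and the full (collapsed) configuration block above is authoritative:

```yml
behaviors:
  RollerBall:
    trainer_type: ppo
    hyperparameters:
      batch_size: 10
      buffer_size: 100
      learning_rate: 3.0e-4
    network_settings:
      hidden_units: 128
      num_layers: 2
    max_steps: 500000
    summary_freq: 10000
```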

Since this example creates a very simple training environment with only a few
inputs and outputs, using small batch and buffer sizes speeds up the training
considerably. However, if you add more complexity to the environment or change