diff --git a/README.md b/README.md
index cec91e6cb6..40b1eadf84 100644
--- a/README.md
+++ b/README.md
@@ -38,7 +38,7 @@ developer communities.
* For more information, in addition to installation and usage instructions, see our [documentation home](docs/Readme.md).
* If you are a researcher interested in a discussion of Unity as an AI platform, see a pre-print of our [reference paper on Unity and the ML-Agents Toolkit](https://arxiv.org/abs/1809.02627). Also, see below for instructions on citing this paper.
-* If you have used a version of the ML-Agents toolkit prior to v0.6, we strongly
+* If you have used an earlier version of the ML-Agents toolkit, we strongly
  recommend our [guide on migrating from earlier versions](docs/Migrating.md).

## Additional Resources

diff --git a/docs/Basic-Guide.md b/docs/Basic-Guide.md
index e55f0434e9..626e36d844 100644
--- a/docs/Basic-Guide.md
+++ b/docs/Basic-Guide.md
@@ -10,10 +10,8 @@ the basic concepts of Unity.

## Setting up the ML-Agents Toolkit within Unity

-In order to use the ML-Agents toolkit within Unity, you need to change some
-Unity settings first. You will also need to have appropriate inference backends
-installed in order to run your models inside of Unity. See [here](Inference-Engine.md)
-for more information.
+In order to use the ML-Agents toolkit within Unity, you first need to change a few
+Unity settings.

1. Launch Unity
2. On the Projects dialog, choose the **Open** option at the top of the window.
@@ -22,26 +20,43 @@ for more information.
4. Go to **Edit** > **Project Settings** > **Player**
5. For **each** of the platforms you target (**PC, Mac and Linux Standalone**, **iOS** or **Android**):
-   1. Option the **Other Settings** section.
+   1. Expand the **Other Settings** section.
   2. Select **Scripting Runtime Version** to **Experimental (.NET 4.6 Equivalent or .NET 4.x Equivalent)**
6. Go to **File** > **Save Project**

+## Setting up the Inference Engine
+
+We provide pre-trained models for all the agents in all our demo environments.
+To be able to run those models, you'll first need to set up the Inference
+Engine. The Inference Engine is a general API to
+run neural network models in Unity that leverages existing inference libraries such
+as TensorFlowSharp and Apple's Core ML. Since the ML-Agents Toolkit uses TensorFlow
+for training neural network models, the output model format is TensorFlow and
+the model files have a `.tf` extension. Consequently, you need to install
+the TensorFlowSharp backend to be able to run these models within the Unity Editor. You can find instructions
+on how to install the TensorFlowSharp backend [here](Inference-Engine.md).
+Once the backend is installed, you will need to reimport the models: Right-click
+on the `.tf` model and select `Reimport`.
+
+
## Running a Pre-trained Model

1. In the **Project** window, go to `Assets/ML-Agents/Examples/3DBall/Scenes` folder and open the `3DBall` scene file.
2. In the **Project** window, go to `Assets/ML-Agents/Examples/3DBall/Prefabs` folder and select the `Game/Platform` prefab.
-3. In the `Ball 3D Agent` Component: Drag the **3DBallLearning** located into
+3. In the `Ball 3D Agent` Component: Drag the **3DBallLearning** Brain located in
   `Assets/ML-Agents/Examples/3DBall/Brains` into the `Brain` property of the `Ball 3D Agent`.
4. Make sure that all of the Agents in the Scene now have **3DBallLearning** as `Brain`.
-   __Note__ : You can modify multiple game objects in a scene by selecting them all at once using the search bar in the Scene Hierarchy.
+   __Note__: You can modify multiple game objects in a scene by selecting them all at
+   once using the search bar in the Scene Hierarchy.
5. In the **Project** window, locate the `Assets/ML-Agents/Examples/3DBall/TFModels` folder.
-6. Drag the `3DBall` model file from the `Assets/ML-Agents/Examples/3DBall/TFModels`
-   folder to the **Model** field of the **3DBallLearning**.
+6. Drag the `3DBallLearning` model file from the `Assets/ML-Agents/Examples/3DBall/TFModels`
+   folder to the **Model** field of the **3DBallLearning** Brain.
7. Click the **Play** button and you will see the platforms balance the balls using the pretrained model.
@@ -63,15 +78,26 @@ More information and documentation is provided in the

### Adding a Brain to the training session

-Since we are going to build this environment to conduct training, we need to add
-the Brain to the training session. This allows the Agents linked to that Brain
-to communicate with the external training process when making their decisions.
-
-1. Assign the **3DBallLearning** to the agents you would like to train and the **3DBallPlayer** Brain to the agents you want to control manually.
-   __Note:__ You can only perform training with an `Learning Brain`.
+To set up the environment for training, you will need to specify which agents are contributing
+to the training and which Brain is being trained. You can only perform training with
+a `Learning Brain`.
+
+1. Assign the **3DBallLearning** Brain to the agents you would like to train.
+   __Note:__ You can assign the same Brain to multiple agents at once: To do so, you can
+   use the prefab system. When an agent is created from a prefab, modifying the prefab
+   will modify the agent as well. If the agent does not synchronize with the prefab, you
+   can hit the Revert button at the top of the Inspector.
+   Alternatively, you can select multiple agents in the scene and modify their `Brain`
+   property all at once.
2. Select the **Ball3DAcademy** GameObject and make sure the **3DBallLearning** Brain is in the Broadcast Hub. In order to train, you need to toggle the `Control` checkbox.
+
+__Note:__ Assigning a Brain to an agent (dragging a Brain into the `Brain` property of
+the agent) means that the Brain will be making decisions for that agent, whereas dragging
+a Brain into the Broadcast Hub means that the Brain will be exposed to the Python process.
+The `Control` checkbox means that in addition to being exposed to Python, the Brain will
+be controlled by the Python process (required for training).

![Set Brain to External](images/mlagents-SetBrainToTrain.png)

diff --git a/docs/FAQ.md b/docs/FAQ.md
index e58fc5589a..da73c5ed09 100644
--- a/docs/FAQ.md
+++ b/docs/FAQ.md
@@ -15,7 +15,7 @@ Unity](Installation.md#setting-up-ml-agent-within-unity) for solution.

## Cannot drag Model into Learning Brain

-You migh not have the appropriate backend required to import the model. Refer to the
+You might not have the appropriate backend required to import the model. Refer to the
[Inference Engine](Inference-Engine.md) for more information on how to import backends
and reimport the asset.

diff --git a/docs/Getting-Started-with-Balance-Ball.md b/docs/Getting-Started-with-Balance-Ball.md
index bf885de0f3..1bf4221c9f 100644
--- a/docs/Getting-Started-with-Balance-Ball.md
+++ b/docs/Getting-Started-with-Balance-Ball.md
@@ -53,10 +53,10 @@ to speed up training since all twelve agents contribute to training in
parallel. The Academy object for the scene is placed on the Ball3DAcademy GameObject.
When you look at an Academy component in the inspector, you can see several properties that control how the environment works.
-The **Broadcast Hub** keeps track of which brains will send data during training,
-If a brain is added to the hub, his data will be sent to the external training
+The **Broadcast Hub** keeps track of which Brains will send data during training.
+If a Brain is added to the hub, its data will be sent to the external training
process. If the `Control` checkbox is checked, the training process will be able to
-control the agents linked to the brain to train them.
+control the agents linked to the Brain to train them.
The **Training** and **Inference Configuration** properties set the graphics and timescale properties for the Unity application. The Academy uses the **Training Configuration** during training and the
@@ -89,16 +89,16 @@ environment around the Agents.

### Brain

Brains are assets that exist in your project folder. The Ball3DAgents are connected
-to a brain, for example : the **3DBallLearning**.
+to a Brain, for example: the **3DBallLearning**.
A Brain doesn't store any information about an Agent, it just routes the Agent's collected observations to the decision making process and returns the chosen action to the Agent. Thus, all Agents can share the same Brain, but act independently. The Brain settings tell you quite a bit about how an Agent works.
-You can create brain objects by selecting `Assets ->
-Create -> ML-Agents -> Brain`. There are 3 kinds of brains :
-The **Learning Brain** is a brain that uses a Neural Network to take decisions.
+You can create Brain objects by selecting `Assets ->
+Create -> ML-Agents -> Brain`. There are 3 kinds of Brains:
+The **Learning Brain** is a Brain that uses a Neural Network to make decisions.
When the Brain is checked as `Control` in the Academy **Broadcast Hub**, the external process will be taking decisions for the agents and generate a neural network when the training is over. You can also use the
@@ -225,7 +225,7 @@ environment first.
The `--train` flag tells the ML-Agents toolkit to run in training mode.

**Note**: You can train using an executable rather than the Editor. To do so,
-follow the intructions in
+follow the instructions in
[Using an Executable](Learning-Environment-Executable.md).

### Observing Training Progress

diff --git a/docs/Learning-Environment-Best-Practices.md b/docs/Learning-Environment-Best-Practices.md
index f52b466ac3..09c90f54bb 100644
--- a/docs/Learning-Environment-Best-Practices.md
+++ b/docs/Learning-Environment-Best-Practices.md
@@ -3,7 +3,7 @@
## General

* It is often helpful to start with the simplest version of the problem, to
-  ensure the agent can learn it. From there increase complexity over time. This
+  ensure the agent can learn it. From there, increase complexity over time. This
  can either be done manually, or via Curriculum Learning, where a set of
  lessons which progressively increase in difficulty are presented to the agent
  ([learn more here](Training-Curriculum-Learning.md)).

diff --git a/docs/Learning-Environment-Create-New.md b/docs/Learning-Environment-Create-New.md
index f6370af47b..df236c8f2a 100644
--- a/docs/Learning-Environment-Create-New.md
+++ b/docs/Learning-Environment-Create-New.md
@@ -157,11 +157,11 @@ in the Inspector window.

## Add Brains

The Brain object encapsulates the decision making process. An Agent sends its
-observations to its Brain and expects a decision in return.
The type of the brain
+observations to its Brain and expects a decision in return. The type of the Brain
(Learning, Heuristic or player) determines how the Brain makes decisions. To create the Brain:

-1. Go to `Assets -> Create -> ML-Agents` and select the type of brain you want to
+1. Go to `Assets -> Create -> ML-Agents` and select the type of Brain you want to
   create. In this tutorial, we will create a **Learning Brain** and a **Player Brain**.
2. Name them `RollerBallBrain` and `RollerBallPlayer` respectively.
@@ -466,7 +466,7 @@ Brain asset to the Agent, changing some of the Agent Components properties, and
setting the Brain properties so that they are compatible with our Agent code.

1. In the Academy Inspector, add the `RollerBallBrain` and `RollerBallPlayer`
-   brains to the **Broadcast Hub**.
+   Brains to the **Broadcast Hub**.
2. Select the RollerAgent GameObject to show its properties in the Inspector window.
3. Drag the Brain `RollerBallPlayer` from the Project window to the
@@ -478,7 +478,7 @@ setting the Brain properties so that they are compatible with our Agent code.
Also, drag the Target GameObject from the Hierarchy window to the RollerAgent Target field.

-Finally, select the the `RollerBallBrain` and `RollerBallPlayer` brains assets
+Finally, select the `RollerBallBrain` and `RollerBallPlayer` Brain assets
so that you can edit their properties in the Inspector window. Set the following properties on both of them:
@@ -493,16 +493,16 @@ Now you are ready to test the environment before training.

## Testing the Environment

It is always a good idea to test your environment manually before embarking on
-an extended training run. The reason we have created the `RollerBallPlayer` brain
+an extended training run. The reason we have created the `RollerBallPlayer` Brain
is so that we can control the Agent using direct keyboard control. But first, you need to define the keyboard to action mapping. Although the RollerAgent only has an `Action Size` of two, we will use one key to specify positive values and one to specify negative values for each action, for a total of four keys.

-1. Select the `RollerBallPlayer` brain to view its properties in the Inspector.
+1. Select the `RollerBallPlayer` Brain to view its properties in the Inspector.
2. Expand the **Continuous Player Actions** dictionary (only visible when using
-   a player brain).
+   a **PlayerBrain**).
3. Set **Size** to 4.
4. Set the following mappings:

diff --git a/docs/Learning-Environment-Design-Academy.md b/docs/Learning-Environment-Design-Academy.md
index 775eaa9f50..c5f129a945 100644
--- a/docs/Learning-Environment-Design-Academy.md
+++ b/docs/Learning-Environment-Design-Academy.md
@@ -50,9 +50,9 @@ logic for creating them in the `AcademyStep()` function.

## Academy Properties

![Academy Inspector](images/academy.png)

-* `Broadcast Hub` - Gathers the brains that will communicate with the external
-  process. Any brain added to the Broadcast Hub will be visible from the external
-  process. In addition, if the checkbox `Control` is checked, the brain will be
+* `Broadcast Hub` - Gathers the Brains that will communicate with the external
+  process. Any Brain added to the Broadcast Hub will be visible from the external
+  process. In addition, if the checkbox `Control` is checked, the Brain will be
  controllable from the external process and will thus be trainable.
* `Max Steps` - Total number of steps per-episode. `0` corresponds to episodes without a maximum number of steps.
Once the step counter reaches maximum, the
diff --git a/docs/Learning-Environment-Design-Agents.md b/docs/Learning-Environment-Design-Agents.md
index bcf95efbfa..3a7dacf9f7 100644
--- a/docs/Learning-Environment-Design-Agents.md
+++ b/docs/Learning-Environment-Design-Agents.md
@@ -19,7 +19,7 @@ The Brain class abstracts out the decision making logic from the Agent itself
so that you can use the same Brain in multiple Agents. How a Brain makes its decisions depends on the kind of Brain it is. A Player Brain allows you to directly control the agent. A Heuristic Brain allows you to create a
-decision script to control the agent with a set of rules. These two brains
+decision script to control the agent with a set of rules. These two Brains
do not involve neural networks but they can be useful for debugging. The Learning Brain allows you to train and use neural network models for your Agents. See [Brains](Learning-Environment-Design-Brains.md).

diff --git a/docs/Learning-Environment-Design-Brains.md b/docs/Learning-Environment-Design-Brains.md
index c6460d21fc..60bb1ae1bf 100644
--- a/docs/Learning-Environment-Design-Brains.md
+++ b/docs/Learning-Environment-Design-Brains.md
@@ -5,9 +5,9 @@ assigned a Brain, but you can use the same Brain with more than one Agent. You
can also create several Brains, attach each of the Brain to one or more than one Agent.

-There are 3 kinds of brains you can use:
+There are 3 kinds of Brains you can use:

-* [Learning](Learning-Environment-Learning-Brains.md) – Use a
+* [Learning](Learning-Environment-Design-Learning-Brains.md) – Use a
  **LearningBrain** to make use of a trained model or train a new model.
* [Heuristic](Learning-Environment-Design-Heuristic-Brains.md) – Use a
  **HeuristicBrain** to hand-code the Agent's logic by extending the Decision class.
@@ -55,7 +55,7 @@ to a Brain component:
* `Action Descriptions` - A list of strings used to name the available actions for the Brain.

-The other properties of the brain depend on the type of Brain you are using.
+The other properties of the Brain depend on the type of Brain you are using.

## Using the Broadcast Feature

diff --git a/docs/Learning-Environment-Design.md b/docs/Learning-Environment-Design.md
index 1480187d5c..0b40f97f84 100644
--- a/docs/Learning-Environment-Design.md
+++ b/docs/Learning-Environment-Design.md
@@ -72,7 +72,7 @@ information.
To train and use the ML-Agents toolkit in a Unity scene, the scene must contain a single Academy subclass and as many Agent subclasses
-as you need. The brain assets are present in the project and should be grouped
+as you need. The Brain assets are present in the project and should be grouped
together and named according to the type of agents they are compatible with. Agent instances should be attached to the GameObject representing that Agent.
@@ -114,18 +114,18 @@ the Academy properties and their uses.
The Brain encapsulates the decision making process. Every Agent must be assigned a Brain, but you can use the same Brain with more than one Agent.

-__Note__:You can assign the same brain to multiple agents by using prefabs
-or by selecting all the agents you want to attach the brain to using the
+__Note__: You can assign the same Brain to multiple agents by using prefabs
+or by selecting all the agents you want to attach the Brain to using the
search bar on top of the Scene Hierarchy window.

To Create a Brain, go to `Assets -> Create -> Ml-Agents` and select the
-type of brain you want to use.
During training, use a **Learning Brain**
and drag it into the Academy's `Broadcast Hub` with the `Control` checkbox checked. When you want to use the trained model, import the model file into the Unity project, add it to the **Model** property of the **Learning Brain** and uncheck the `Control` checkbox of the `Broadcast Hub`. See [Brains](Learning-Environment-Design-Brains.md) for details on using the
-different types of Brains. You can create new kinds of brains if the three
+different types of Brains. You can create new kinds of Brains if the three
built-in don't do what you need.

The Brain class has several important properties that you can set using the

diff --git a/docs/Learning-Environment-Examples.md b/docs/Learning-Environment-Examples.md
index f40036261e..690ed4adf0 100644
--- a/docs/Learning-Environment-Examples.md
+++ b/docs/Learning-Environment-Examples.md
@@ -115,11 +115,11 @@ If you would like to contribute environments, please see our
* Set-up: A platforming environment where the agent can push a block around.
* Goal: The agent must push the block to the goal.
-* Agents: The environment contains one agent linked to a single brain.
+* Agents: The environment contains one agent linked to a single Brain.
* Agent Reward Function:
  * -0.0025 for every step.
  * +1.0 if the block touches the goal.
-* Brains: One brain with the following observation/action space.
+* Brains: One Brain with the following observation/action space.
  * Vector Observation space: (Continuous) 70 variables corresponding to 14 ray-casts each detecting one of three possible objects (wall, goal, or block).
@@ -161,7 +161,7 @@ If you would like to contribute environments, please see our
![Reacher](images/reacher.png)

* Set-up: Double-jointed arm which can move to target locations.
-* Goal: The agents must move it's hand to the goal location, and keep it there.
+* Goal: Each agent must move its hand to the goal location, and keep it there.
* Agents: The environment contains 10 agent linked to a single Brain.
* Agent Reward Function (independent):
  * +0.1 Each step agent's hand is in goal location.

diff --git a/docs/Learning-Environment-Executable.md b/docs/Learning-Environment-Executable.md
index 22210f2ea5..09918062b1 100644
--- a/docs/Learning-Environment-Executable.md
+++ b/docs/Learning-Environment-Executable.md
@@ -28,7 +28,7 @@ environment:
![3DBall Scene](images/mlagents-Open3DBall.png)

Make sure the Brains in the scene have the right type. For example, if you want
-to be able to control your agents from Python, you will need to put the brain
+to be able to control your agents from Python, you will need to set the Brain
controlling the Agents to be a **Learning Brain** and drag it into the Academy's `Broadcast Hub` with the `Control` checkbox checked.

diff --git a/docs/ML-Agents-Overview.md b/docs/ML-Agents-Overview.md
index 64faea827e..5d727ca6ad 100644
--- a/docs/ML-Agents-Overview.md
+++ b/docs/ML-Agents-Overview.md
@@ -224,7 +224,7 @@ inference can proceed.
As mentioned previously, the ML-Agents toolkit ships with several implementations of state-of-the-art algorithms for training intelligent agents.
-In this mode, the only brain used is a **Learning Brain**. More
+In this mode, the only Brain used is a **Learning Brain**. More
specifically, during training, all the medics in the scene send their observations to the Python API through the External Communicator (this is the behavior with an External Brain).
The Python API
@@ -409,7 +409,7 @@ training process.
  observations for all its Agents to the Python API when dragged into the Academy's `Broadcast Hub` with the `Control` checkbox checked. This is helpful for training and later inference. Broadcasting is a feature which can be
-  enabled all types of brains (Player, Learning, Heuristic) where the Agent
+  enabled for all types of Brains (Player, Learning, Heuristic) where the Agent
  observations and actions are also sent to the Python API (despite the fact that the Agent is **not** controlled by the Python API). This feature is leveraged by Imitation Learning, where the observations and actions for a

diff --git a/docs/Migrating.md b/docs/Migrating.md
index 683c96518f..37962bf226 100644
--- a/docs/Migrating.md
+++ b/docs/Migrating.md
@@ -1,22 +1,48 @@
# Migrating

## Migrating from ML-Agents toolkit v0.5 to v0.6
+
### Important
+
-* Brains are now Scriptable Objects instead of MonoBehaviors. This will
-  allow you to set Brains into prefabs and use the same brains across
-  scenes.
+
+* Brains are now Scriptable Objects instead of MonoBehaviours.
+* You can no longer modify the type of a Brain. If you want to switch
+  between `PlayerBrain` and `LearningBrain` for multiple agents,
+  you will need to assign a new Brain to each agent separately.
+  __Note:__ You can pass the same Brain to multiple agents in a scene by
+leveraging Unity's prefab system or by looking for all the agents in a scene
+using the search bar of the `Hierarchy` window with the word `Agent`.
* To update a scene from v0.5 to v0.6, you must:
-  * Remove the `Brain` GameObjects in the scene
+  * Remove the `Brain` GameObjects in the scene. (Delete all of the
+    Brain GameObjects under Academy in the scene.)
  * Create new `Brain` Scriptable Objects using `Assets -> Create ->
-    ML-Agents`
+    ML-Agents` for each type of Brain you plan to use, and put
+    the created files under a folder called Brains within your project.
  * Edit their `Brain Parameters` to be the same as the parameters used
-    in the `Brain` GameObjects
+    in the `Brain` GameObjects.
  * Agents have a `Brain` field in the Inspector, you need to drag the
-    appropriate Brain asset in it.
+    appropriate Brain ScriptableObject in it.

-__Note:__ You can pass the same brain to multiple agents in a scene by
-leveraging Unity's prefab system or look for all the agents in a scene
-using the search bar of the `Hierarchy` window with the word `Agent`.
+  __Note:__ You will need to delete the previous TensorFlowSharp package
+  and install the new one to do inference. To correctly delete the previous
+  TensorFlowSharp package, delete all of the files under the `ML-Agents/Plugins`
+  folder except the files under `ML-Agents/Plugins/ProtoBuffer`.
+
+* We replaced the **Internal** and **External** Brains with the **Learning Brain**.
+  When you need to train a model, drag the Brain into the `Broadcast Hub`
+  inside the `Academy` and check the `Control` checkbox.
+* We removed the `Broadcast` checkbox of the Brain. To use the broadcast
+  functionality, you need to drag the Brain into the `Broadcast Hub`.
+* When training multiple Brains at the same time, each model is now stored
+  in a separate model file rather than in the same file under different
+  graph scopes.
+* We have changed the way ML-Agents models perform inference. All previous `.bytes`
+  files can no longer be used (you will have to retrain them).
The models
+  produced by the training process and the shipped models now have a `.tf`
+  extension and use TensorFlowSharp as a backend for the
+  [Inference Engine](Inference-Engine.md).
+* To use a `.tf` model, drag it into the `Model` property of the `Learning Brain`.
+
+
## Migrating from ML-Agents toolkit v0.4 to v0.5
@@ -72,7 +98,7 @@ using the search bar of the `Hierarchy` window with the word `Agent`.
  [curriculum learning documentation](Training-Curriculum-Learning.md) for detailed information. In summary:
  * Curriculum files for the same environment must now be placed into a folder.
-    Each curriculum file should be named after the brain whose curriculum it
+    Each curriculum file should be named after the Brain whose curriculum it
    specifies.
  * `min_lesson_length` now specifies the minimum number of episodes in a lesson and affects reward thresholding.

diff --git a/docs/Training-Imitation-Learning.md b/docs/Training-Imitation-Learning.md
index 2fb7125ec6..b7ae760e60 100644
--- a/docs/Training-Imitation-Learning.md
+++ b/docs/Training-Imitation-Learning.md
@@ -51,7 +51,7 @@ With offline behavioral cloning, we can use demonstrations (`.demo` files) gener
6. Launch `mlagent-learn`, and providing `./config/offline_bc_config.yaml` as the config parameter, and your environment as the `--env` parameter.
7. (Optional) Observe training performance using Tensorboard.

-This will use the demonstration file to train a nerual network driven agent to directly imitate the actions provided in the demonstration. The environment will launch and be used for evaluating the agent's performance during training.
+This will use the demonstration file to train a neural network driven agent to directly imitate the actions provided in the demonstration. The environment will launch and be used for evaluating the agent's performance during training.

### Online Training

It is also possible to provide demonstrations in realtime during training, witho

1. First create two Brains, one which will be the "Teacher," and the other which will be the "Student." We will assume that the names of the Brain
-   `Assets`s are "Teacher" and "Student" respectively.
+   Assets are "Teacher" and "Student" respectively.
2. The "Teacher" Brain must be a **Player Brain**. You must properly configure the inputs to map to the corresponding actions.
3. The "Student" Brain must be a **Learning Brain**.
-4. The Brain Parameters of both the "Teacher" and "Student" brains must be
+4. The Brain Parameters of both the "Teacher" and "Student" Brains must be
  compatible with the agent.
-5. Drag both the "Teacher" and "Student" brain into the Academy's `Broadcast Hub`
-   and check the `Control` checkbox on the "Student" brain.
+5. Drag both the "Teacher" and "Student" Brains into the Academy's `Broadcast Hub`
+   and check the `Control` checkbox on the "Student" Brain.
4. Link the Brains to the desired Agents (one Agent as the teacher and at least one Agent as a student).
5. In `config/online_bc_config.yaml`, add an entry for the "Student" Brain. Set

diff --git a/docs/Training-ML-Agents.md b/docs/Training-ML-Agents.md
index 43891d18df..30ae9d9913 100644
--- a/docs/Training-ML-Agents.md
+++ b/docs/Training-ML-Agents.md
@@ -168,7 +168,7 @@ environments are included in the provided config file.
| brain\_to\_imitate | For online imitation learning, the name of the GameObject containing the Brain component to imitate.
| (online)BC | | demo_path | For offline imitation learning, the file path of the recorded demonstration file | (offline)BC | | buffer_size | The number of experiences to collect before updating the policy model. | PPO | -| curiosity\_enc\_size | The size of the encoding to use in the forward and inverse models in the Curioity module. | PPO | +| curiosity\_enc\_size | The size of the encoding to use in the forward and inverse models in the Curiosity module. | PPO | | curiosity_strength | Magnitude of intrinsic reward generated by Intrinsic Curiosity Module. | PPO | | epsilon | Influences how rapidly the policy can evolve during training. | PPO | | gamma | The reward discount rate for the Generalized Advantage Estimator (GAE). | PPO | diff --git a/docs/Training-PPO.md b/docs/Training-PPO.md index 7f9047ff5f..2735207974 100644 --- a/docs/Training-PPO.md +++ b/docs/Training-PPO.md @@ -188,7 +188,7 @@ Typical Range: `64` - `512` The below hyperparameters are only used when `use_curiosity` is set to true. -### Curioisty Encoding Size +### Curiosity Encoding Size `curiosity_enc_size` corresponds to the size of the hidden layer used to encode the observations within the intrinsic curiosity module. This value should be @@ -202,7 +202,7 @@ Typical Range: `64` - `256` `curiosity_strength` corresponds to the magnitude of the intrinsic reward generated by the intrinsic curiosity module. This should be scaled in order to -ensure it is large enough to not be overwhelmed by extrnisic reward signals in +ensure it is large enough to not be overwhelmed by extrinsic reward signals in the environment. Likewise it should not be too large to overwhelm the extrinsic reward signal. diff --git a/docs/Training-on-Microsoft-Azure.md b/docs/Training-on-Microsoft-Azure.md index 6651fc89b2..861b35de49 100644 --- a/docs/Training-on-Microsoft-Azure.md +++ b/docs/Training-on-Microsoft-Azure.md @@ -9,7 +9,7 @@ support. ## Pre-Configured Azure Virtual Machine A pre-configured virtual machine image is available in the Azure Marketplace and -is nearly compltely ready for training. You can start by deploying the +is nearly completely ready for training. You can start by deploying the [Data Science Virtual Machine for Linux (Ubuntu)](https://azuremarketplace.microsoft.com/marketplace/apps/microsoft-ads.linux-data-science-vm-ubuntu) into your Azure subscription. Once your VM is deployed, SSH into it and run the following command to complete dependency installation: @@ -75,7 +75,7 @@ mlagents-learn --env= --run-id= --train ``` Where `` is the path to your app (i.e. -`~/unity-volume/3DBallHeadless`) and `` is an identifer you would like +`~/unity-volume/3DBallHeadless`) and `` is an identifier you would like to identify your training run with. If you've selected to run on a N-Series VM with GPU support, you can verify that diff --git a/docs/images/demo_component.png b/docs/images/demo_component.png index 489e77bd10..6cc78380bb 100644 Binary files a/docs/images/demo_component.png and b/docs/images/demo_component.png differ diff --git a/docs/images/demo_inspector.png b/docs/images/demo_inspector.png index ab69f2f764..9cb7a60980 100644 Binary files a/docs/images/demo_inspector.png and b/docs/images/demo_inspector.png differ
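For context on the curiosity hyperparameters whose names are corrected above (`curiosity_enc_size`, `curiosity_strength`): they are set per Brain in the trainer configuration YAML passed to `mlagents-learn`. The sketch below is illustrative only; the `3DBallLearning` section name and the specific values are assumptions rather than part of this change, and the values follow the typical ranges quoted in `Training-PPO.md`.

```yaml
# Illustrative sketch only: a hypothetical per-Brain entry in the trainer config
# passed to mlagents-learn. Key names match the hyperparameters documented in
# Training-ML-Agents.md and Training-PPO.md; the values are example settings.
3DBallLearning:
  trainer: ppo
  buffer_size: 12000        # experiences collected before each policy update
  gamma: 0.99               # reward discount rate used by GAE
  use_curiosity: true       # enable the intrinsic curiosity module
  curiosity_strength: 0.01  # magnitude of the intrinsic reward
  curiosity_enc_size: 128   # encoding size for the forward and inverse models
```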