Contributor

@shihzy shihzy commented Aug 29, 2018

Only the checked items. There are some structure and flow issues to be resolved based on the directory changes. I will tie off with @dericp tomorrow.

Wanted to get these into review before I start the bigger changes.

Please see the GitHub Documentation Pre-Release Checklist for 0.5 for what has been addressed.

@shihzy shihzy changed the base branch from master to release-v0.5 August 29, 2018 20:16
This tutorial walks through the process of creating a Unity Environment. A Unity
Environment is an application built using the Unity Engine which can be used to
-train Reinforcement Learning agents.
+train Reinforcement Learning Agents.
Contributor
Should be lowercased.

steps:

-1. Create an environment for your agents to live in. An environment can range
+1. Create an environment for your Agents to live in. An environment can range
Contributor
Should be lowercased.

The Agent sends the information we collect to the Brain, which uses it to make a
-decision. When you train the agent (or use a trained model), the data is fed
-into a neural network as a feature vector. For an agent to successfully learn a
+decision. When you train the Agent (or use a trained model), the data is fed
Contributor
Should be lowercased.


-* Position of the agent itself within the confines of the floor. This data is
-collected as the agent's distance from each edge of the floor.
+* Position of the Agent itself within the confines of the floor. This data is
Contributor
Should be lowercased.

the task. For example, the RollerAgent reward system provides a small reward if
-the agent moves closer to the target in a step and a small negative reward at
-each step which encourages the agent to complete its task quickly.
+the Agent moves closer to the target in a step and a small negative reward at
Contributor
Should be lowercased.

-Heuristics or Internal brains game sessions. You can then use this data to train
-an agent in a supervised context.
+Heuristics or Internal Brains game sessions. You can then use this data to train
+an Agent in a supervised context.
Contributor
Should be lowercased.

that you can use with the Internal Brain type.

-A __model__ is a mathematical relationship mapping an agent's observations to
+A __model__ is a mathematical relationship mapping an Agent's observations to
Contributor
Should be lowercased.

Reinforcement learning is an artificial intelligence technique that trains
_agents_ to perform tasks by rewarding desirable behavior. During reinforcement
-learning, an agent explores its environment, observes the state of things, and,
+learning, an Agent explores its environment, observes the state of things, and,
Contributor
Should be lowercased.

-state, the agent receives a positive reward. If it leads to a less desirable
-state, then the agent receives no reward or a negative reward (punishment). As
-the agent learns during training, it optimizes its decision making so that it
+state, the Agent receives a positive reward. If it leads to a less desirable
Contributor
Should be lowercased.

[Proximal Policy Optimization (PPO)](https://blog.openai.com/openai-baselines-ppo/).
-PPO uses a neural network to approximate the ideal function that maps an agent's
-observations to the best action an agent can take in a given state. The
+PPO uses a neural network to approximate the ideal function that maps an Agent's
Contributor
Should be lowercased.


**Note:** if you aren't studying machine and reinforcement learning as a subject
-and just want to train agents to accomplish tasks, you can treat PPO training as
+and just want to train Agents to accomplish tasks, you can treat PPO training as
Contributor
Should be lowercased.

a _black box_. There are a few training-related parameters to adjust inside
Unity as well as on the Python training side, but you do not need in-depth
-knowledge of the algorithm itself to successfully create and train agents.
+knowledge of the algorithm itself to successfully create and train Agents.
Contributor
Should be lowercased.

class. The Academy works with Agent and Brain objects in the scene to step
through the simulation. When either the Academy has reached its maximum number
-of steps or all agents in the scene are _done_, one training episode is
+of steps or all Agents in the scene are _done_, one training episode is
Contributor
Should be lowercased.


An _environment_ in the ML-Agents toolkit can be any scene built in Unity. The
-Unity scene provides the environment in which agents observe, act, and learn.
+Unity scene provides the environment in which Agents observe, act, and learn.
Contributor
Should be lowercased.

* You can put your executable on a remote machine for faster training.
* You can use `Headless` mode for faster training.
-* You can keep using the Unity Editor for other tasks while the agents are
+* You can keep using the Unity Editor for other tasks while the Agents are
Contributor
Should be lowercased.

- **Observations** - what the medic perceives about the environment.
Observations can be numeric and/or visual. Numeric observations measure
-attributes of the environment from the point of view of the agent. For our
+attributes of the environment from the point of view of the Agent. For our
Contributor
Should be lowercased.


- Single-Agent. A single Agent linked to a single Brain, with its own reward
-signal. The traditional way of training an agent. An example is any
+signal. The traditional way of training an Agent. An example is any
Contributor
Should be lowercased.

- **Monitoring Agent’s Decision Making** - Since communication in ML-Agents is a
-two-way street, we provide an agent Monitor class in Unity which can display
-aspects of the trained agent, such as the agents perception on how well it is
+two-way street, we provide an Agent Monitor class in Unity which can display
Contributor
Keep as it is. All others in this file should be lowercased.

packages, `mlagents.env` and `mlagents.trainers`. `mlagents.env` can be used
to interact directly with a Unity environment, while `mlagents.trainers`
-contains the classes for training agents.
+contains the classes for training Agents.
Contributor
Should be lowercased.

The ML-Agents toolkit conducts training using an external Python training
process. During training, this external process communicates with the Academy
-object in the Unity scene to generate a block of agent experiences. These
+object in the Unity scene to generate a block of Agent experiences. These
Contributor
Should be lowercased.

Contributor
All in this file should be lowercased.

[Proximal Policy Optimization (PPO)](https://blog.openai.com/openai-baselines-ppo/).
-PPO uses a neural network to approximate the ideal function that maps an agent's
-observations to the best action an agent can take in a given state. The
+PPO uses a neural network to approximate the ideal function that maps an Agent's
Contributor
Should be lowercased.

@shihzy
Contributor Author

shihzy commented Sep 5, 2018

@awjuliani hopefully the last round of capitalizations :). Let me know if any last-minute changes are needed.

@awjuliani
Contributor

Looks good @unityjeffrey! Thanks for making all these changes.

@awjuliani awjuliani merged commit e8393d5 into release-v0.5 Sep 5, 2018
@shihzy shihzy deleted the develop-doc-check-0.5 branch September 6, 2018 16:48
@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 19, 2021
5 participants