Pre-release

@awjuliani awjuliani released this Sep 25, 2018

Fixes and Improvements

  • Fixes typo in documentation.
  • Removes unnecessary gitignore line.
  • Fixes imitation learning scenes.
  • Fixes BananaCollector environment.
  • Enables gym_unity with multiple visual observations.

Acknowledgements

Thanks to everyone at Unity who contributed to v0.5.0a, as well as: @Sohojoe, @fengredrum, and @xiaodi-faith.

Pre-release

@vincentpierre vincentpierre released this Sep 11, 2018 · 230 commits to master since this release

Important

We have reorganized the project repository. Please see Migrating from v0.4 to v0.5 documentation for more information. Highlighted changes to repository structure include:

  • The python folder has been renamed ml-agents. It now contains a python package called mlagents.
  • The unity-environment folder, containing the Unity project, has been renamed UnitySDK.
  • The protobuf definitions used for communication have been added to a new protobuf-definitions folder.
  • Example curricula and the trainer configuration file have been moved to a new config sub-directory.

Environments

To learn more about new and improved environments, see our Example Environments page.

Improved

The following environments have been changed to use Multi Discrete Action:

  • WallJump
  • BananaCollector
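Multi Discrete Action lets an agent choose one option from each of several independent branches every step, rather than picking a single action from one flat list. A minimal sketch of the idea; the branch sizes and function name below are illustrative, not WallJump's actual action layout:

```python
import random

# Hypothetical branch sizes: e.g. forward/back, left/right, rotate, jump.
BRANCH_SIZES = [3, 3, 3, 2]

def sample_multi_discrete(branch_sizes, rng=random):
    """Pick one action index per branch; the agent receives a list with
    one choice for every branch."""
    return [rng.randrange(n) for n in branch_sizes]

action = sample_multi_discrete(BRANCH_SIZES)
print(action)  # e.g. [2, 0, 1, 1] -- one index per branch
```

Treating the branches independently keeps the combined action space tractable: four branches of sizes 3, 3, 3, 2 expose 11 outputs instead of 54 combined actions.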

The following environment has been modified to use Action Masking:

  • GridWorld
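Action Masking works by making disallowed actions unsampleable at a given step. A minimal sketch of the idea, assuming the mask is applied by pushing invalid logits to negative infinity before the softmax (illustrative, not ML-Agents' internal implementation):

```python
import math

def masked_softmax(logits, mask):
    """Send invalid actions' logits to -inf before the softmax so their
    probability becomes exactly zero. mask[i] is True when action i is
    allowed this step."""
    masked = [l if m else float("-inf") for l, m in zip(logits, mask)]
    mx = max(masked)
    exps = [math.exp(l - mx) for l in masked]
    total = sum(exps)
    return [e / total for e in exps]

# In GridWorld, a move into a wall could be masked out (values illustrative):
probs = masked_softmax([1.0, 2.0, 0.5, 0.1], [True, False, True, True])
print(probs[1])  # 0.0 -- the masked action can never be sampled
```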

New Features

  • [Gym] New package gym-unity which provides a gym interface to wrap UnityEnvironment. More information here.

  • [Training] Can now run multiple concurrent training sessions with the --num-runs=<n> command line option. (Training sessions are independent, and do not improve learning performance.)

  • [Unity] Meta-Curriculum. Supports curriculum learning in multi-brain environments.

  • [Unity] Action Masking for Discrete Control - It is now possible to mask invalid actions each step to limit the actions an agent can take.

  • [Unity] Action Branches for Discrete Control - It is now possible to define discrete action spaces which contain multiple branches, each with its own space size.
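The gym interface that gym-unity exposes follows the standard reset/step contract. A self-contained sketch of that contract using a stand-in environment (ToyGymEnv is hypothetical; a real wrapper would launch and drive a built Unity scene):

```python
import random

class ToyGymEnv:
    """Stand-in illustrating the gym-style interface:
    reset() -> observation, step(action) -> (obs, reward, done, info)."""
    def __init__(self, episode_length=5):
        self.episode_length = episode_length
        self._t = 0

    def reset(self):
        self._t = 0
        return 0.0

    def step(self, action):
        self._t += 1
        done = self._t >= self.episode_length
        return float(self._t), 1.0, done, {}

# The usual training-loop skeleton that any gym-compatible tool expects:
env = ToyGymEnv()
obs = env.reset()
total, done = 0.0, False
while not done:
    obs, reward, done, info = env.step(random.randrange(2))
    total += reward
print(total)  # 5.0 after one full episode
```

Because the wrapper speaks this common interface, existing gym-based RL code can train against Unity environments with little or no modification.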

Changes

  • Value estimates from models trained with PPO can now be visualized in Unity using GetValueEstimate().
  • It is now possible to specify which camera the Monitor displays to.
  • Console summaries will now be displayed even when running inference mode from python.
  • Minimum supported Unity version is now 2017.4.

Fixes & Performance Improvements

  • Replaced some activation functions with swish.
  • Visual Observations use PNG instead of JPEG to avoid compression losses.
  • Improved python unit tests.
  • Fix to enable multiple training sessions on single GPU.
  • Curriculum lessons are now tracked correctly.
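For reference, swish is x · sigmoid(x), a smooth, non-monotonic alternative to ReLU. A minimal sketch:

```python
import math

def swish(x):
    """Swish activation: x * sigmoid(x), written here as x / (1 + e^-x).
    Near-identity for large positive x, near-zero for large negative x."""
    return x / (1.0 + math.exp(-x))

print(swish(0.0))  # 0.0
```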

Known Issues

  • Ending training early using CTRL+C does not save the model on Windows.
  • Sequentially opening and closing multiple instances of UnityEnvironment within a single process is not possible.

Acknowledgements

Thanks to everyone at Unity who contributed to v0.5.0, as well as: @sterling000, @bartlomiejwolk, @Sohojoe, @Phantomb.

Pre-release

@awjuliani awjuliani released this Jul 24, 2018 · 231 commits to master since this release

Fixes & Performance Improvements

  • Corrects observation space description for PushBlock environment.
  • Fixes bug that prevented using environments with Python multiprocessing.
  • Fixes bug that prevented agents from being initialized without a brain.

Pre-release

@awjuliani awjuliani released this Jun 29, 2018 · 238 commits to master since this release

Environments

  • Changes to example environments for visual consistency.

Documentation

  • Adjustments to Windows installation documentation.
  • Updates documentation to refer to project as a toolkit.

Changes

  • New Amazon Web Service AMI.
  • Uses swish for continuous control activation function.
  • Corrected version number in setup.py.

Fixes & Performance Improvements

  • Fixes memory leak bug when using visual observations.
  • Fixes use of behavioral cloning with visual observations.
  • Fixes use of curiosity-driven exploration with on-demand decision making.
  • Optimizes visual observations when using the internal brain.

Acknowledgements

Thanks to everyone at Unity who contributed to v0.4.0a, as well as: @tcmxx

Pre-release

@vincentpierre vincentpierre released this Jun 16, 2018 · 286 commits to master since this release

Environments

To learn more about new and improved environments, see our Example Environments page.

New

  • Walker - Humanoid physics-based agent. The agent must move its body toward the goal direction as quickly as possible without falling.

  • Pyramids - Sparse reward environment. The agent must press a button, then topple a pyramid of blocks to get the golden brick at the top. Used to demonstrate Curiosity.

Improved

  • Revamped the Crawler environment

  • Added visual observation-based scenes for:

    • BananaCollector
    • PushBlock
    • Hallway
    • Pyramids
  • Added Imitation Learning-based scenes for:

    • Tennis
    • Bouncer
    • PushBlock
    • Hallway
    • Pyramids

New Features

  • [Unity] In Editor Training - It is now possible to train agents directly in the editor without building the scene. For more information, see here.

  • [Training] Curiosity-Driven Exploration - Addition of curiosity-based intrinsic reward signal when using PPO. Enable by setting use_curiosity brain training hyperparameter to true.

  • [Unity] Support for providing player input using axes within the Player Brain.

  • [Unity] TensorFlowSharp Plugin has been upgraded to version 1.7.1.
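Curiosity-driven exploration rewards the agent for transitions that its learned forward model predicts poorly, drawing it toward novel states. A toy sketch of that intrinsic-reward idea; the function name and scale value are illustrative, not ML-Agents' exact implementation:

```python
def intrinsic_reward(predicted_next_state, actual_next_state, scale=0.01):
    """Curiosity-style intrinsic reward: the forward model's squared
    prediction error, scaled by an illustrative strength hyperparameter.
    Familiar transitions are predicted well and earn little; surprising
    ones earn more."""
    err = sum((p - a) ** 2
              for p, a in zip(predicted_next_state, actual_next_state))
    return scale * err

# A well-predicted (familiar) transition earns no intrinsic reward:
print(intrinsic_reward([0.1, 0.2], [0.1, 0.2]))  # 0.0
# A surprising transition earns more:
print(intrinsic_reward([0.0, 0.0], [1.0, 1.0]))  # 0.02
```

This signal is added to the environment reward during PPO updates, which is why it helps in sparse-reward environments like Pyramids.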

Changes

  • Main ML-Agents code now within MLAgents namespace. Ensure that the MLAgents namespace is added to necessary project scripts such as Agent classes.
  • ASCII art added to learn.py script.
  • Communication now uses gRPC and Protobuf. JSON libraries removed.
  • TensorBoard now reports mean absolute loss per update loop, as opposed to total loss.
  • PPO algorithm now uses a wider Gaussian output for Continuous Control models (increasing performance).

Documentation

  • Added Quick Start and FAQ sections to the documentation.
  • Added documentation explaining how to use ML-Agents on Microsoft Azure.
  • Added benchmark reward thresholds for example environments.

Fixes & Performance Improvements

  • Episode length is now properly reported in TensorBoard in the first episode.
  • Behavioral Cloning now works with LSTM models.

Known Issues

  • Curiosity-driven exploration does not function with On-Demand Decision Making. Expect a fix in v0.4.0a.

Acknowledgements

Thanks to everyone at Unity who contributed to v0.4, as well as: @sterlingcrispin, @ChrisRisner, @akmadian, @animaleja32, @LeighS, and @5665tm.

Pre-release

@vincentpierre vincentpierre released this Apr 19, 2018 · 375 commits to master since this release

Fixes

  • Behavioral cloning fix (uses stored info rather than previous info)
  • Value bootstrapping fixed for PPO

Pre-release

@vincentpierre vincentpierre released this Apr 16, 2018 · 376 commits to master since this release

Fixes

  • Removed links to out-of-date Unity Packages
  • Fixed the CoreInternalBrain for discrete vector observations
  • Retrained the Basic Environment
  • Fixed the normalization of images in the internal brain

Pre-release

@awjuliani awjuliani released this Apr 13, 2018 · 377 commits to master since this release

Features

  • We have upgraded our Docker container, which now supports Brains that contain camera-based Visual Observations.

Documentation

  • We have added a partial Chinese translation of our documentation. It is available here.

Fixes & Performance Improvements

  • Missing component reference in BananaRL environment.
  • Neural Network for multiple visual observations was not properly generated.
  • Episode time-out value estimate bootstrapping used incorrect observation as input.

Acknowledgements

Thanks to everyone at Unity who contributed to v0.3.1, as well as to the following community contributors:

@sterlingcrispin, @andersonaddo, @palomagr, @imankgoyal, @luchris429.

Pre-release

@awjuliani awjuliani released this Mar 21, 2018 · 439 commits to master since this release

Fixes

  • Fixes internal brain for Banana Imitation.
  • Fixes Discrete Control training for Imitation Learning.
  • Fixes Visual Observations in internal brain with non-square inputs.

Pre-release

@vincentpierre vincentpierre released this Mar 16, 2018 · 443 commits to master since this release

Fixes

  • Added the missing Ray Perception components to the agents in the BananaImitation scene.