From 84d7b48c1aef890377a079e0101938ac6c0c9b97 Mon Sep 17 00:00:00 2001
From: Ervin Teng
Date: Fri, 26 Jul 2019 13:48:16 -0700
Subject: [PATCH] Wrote Migrating docs

---
 docs/Migrating.md | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/docs/Migrating.md b/docs/Migrating.md
index 7c99341e12..4b00fcc5e2 100644
--- a/docs/Migrating.md
+++ b/docs/Migrating.md
@@ -1,5 +1,26 @@
 # Migrating
 
+## Migrating from ML-Agents toolkit v0.8 to v0.9
+
+### Important Changes
+* We have changed the way reward signals (including Curiosity) are defined in the
+`trainer_config.yaml`.
+* When using multiple environments, every "step" as recorded in TensorBoard and
+printed in the command line now corresponds to a single step of a single environment.
+Previously, each step corresponded to one step for all environments (i.e., `num_envs` steps).
+
+#### Steps to Migrate
+* If you were overriding any of the following parameters in your config file, remove them
+from the top-level config and follow the steps below:
+  * `gamma` - Define a new `extrinsic` reward signal and set its `gamma` to your new gamma.
+  * `use_curiosity`, `curiosity_strength`, `curiosity_enc_size` - Define a `curiosity` reward signal
+and set its `strength` to `curiosity_strength`, and `encoding_size` to `curiosity_enc_size`. Give it
+the same `gamma` as your `extrinsic` signal to mimic previous behavior.
+See [Reward Signals](Training-RewardSignals.md) for more information on defining reward signals.
+* TensorBoard summaries generated when running multiple environments in v0.8 are not comparable to those generated in
+v0.9 in terms of step count. Multiply your v0.8 step count by `num_envs` for an approximate comparison.
+You may also need to adjust `max_steps` in your config accordingly.
+
 ## Migrating from ML-Agents toolkit v0.7 to v0.8
 
 ### Important Changes
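
The reward-signal migration described in this patch can be sketched as a before/after `trainer_config.yaml` fragment. The behavior name `ExampleBrain` and all parameter values below are hypothetical placeholders, not taken from the patch:

```yaml
# v0.8 (old): gamma and curiosity options sit at the top level
# of a brain's trainer config. All names/values here are illustrative.
ExampleBrain:
    gamma: 0.99
    use_curiosity: true
    curiosity_strength: 0.01
    curiosity_enc_size: 128
---
# v0.9 (new): the same settings expressed as reward signals.
ExampleBrain:
    reward_signals:
        extrinsic:
            strength: 1.0
            gamma: 0.99        # the old top-level gamma
        curiosity:
            strength: 0.01     # was curiosity_strength
            gamma: 0.99        # match extrinsic's gamma to mimic v0.8 behavior
            encoding_size: 128 # was curiosity_enc_size
```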