# Migrating

## Migrating from ML-Agents toolkit v0.8 to v0.9

### Important Changes
* We have changed the way reward signals (including Curiosity) are defined in the
`trainer_config.yaml`.
* When using multiple environments, every "step" as recorded in TensorBoard and
printed in the command line now corresponds to a single step of a single environment.
Previously, each step corresponded to one step for all environments (i.e., `num_envs` steps).
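  For example, with `num_envs: 4` (an illustrative value), a v0.8 run that reported 100,000 steps
  represents the same amount of experience as a v0.9 run reporting 400,000 steps.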

#### Steps to Migrate
* If you were overriding any of the following parameters in your config file, remove them
  from the top-level config and follow the steps below (a before/after sample configuration
  appears after this list):
    * `gamma` - Define a new `extrinsic` reward signal and set its `gamma` to your new gamma.
    * `use_curiosity`, `curiosity_strength`, `curiosity_enc_size` - Define a `curiosity` reward signal
      and set its `strength` to `curiosity_strength`, and `encoding_size` to `curiosity_enc_size`. Give it
      the same `gamma` as your `extrinsic` signal to mimic previous behavior.
  See [Reward Signals](Training-RewardSignals.md) for more information on defining reward signals.
* TensorBoards generated when running multiple environments in v0.8 are not comparable to those generated in
  v0.9 in terms of step count. Multiply your v0.8 step count by `num_envs` for an approximate comparison.
  You may need to change `max_steps` in your config as appropriate as well.
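
As a concrete sketch, here is roughly what this migration might look like for one brain section of
your config. The section name `YourBrain` and all numeric values are placeholders; carry over your
own `gamma`, `curiosity_strength`, and `curiosity_enc_size` values.

Before (v0.8):

```yaml
YourBrain:
    gamma: 0.99
    use_curiosity: true
    curiosity_strength: 0.01
    curiosity_enc_size: 128
    max_steps: 5.0e4
```

After (v0.9):

```yaml
YourBrain:
    reward_signals:
        extrinsic:
            strength: 1.0        # 1.0 leaves environment rewards unscaled
            gamma: 0.99          # former top-level gamma
        curiosity:
            strength: 0.01       # former curiosity_strength
            gamma: 0.99          # match extrinsic to mimic previous behavior
            encoding_size: 128   # former curiosity_enc_size
    max_steps: 2.0e5             # if matching a v0.8 run with num_envs: 4,
                                 # scale max_steps by num_envs as noted above
```

The nesting under `reward_signals` is the structural change; other trainer parameters stay at the
top level of the section.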

## Migrating from ML-Agents toolkit v0.7 to v0.8

### Important Changes