
Removing TensorFlow Trainers #4707

Merged (11 commits, Dec 15, 2020)
1 change: 1 addition & 0 deletions com.unity.ml-agents/CHANGELOG.md
@@ -11,6 +11,7 @@ and this project adheres to
### Major Changes
#### com.unity.ml-agents (C#)
#### ml-agents / ml-agents-envs / gym-unity (Python)
- TensorFlow trainers have been deprecated; please use the Torch trainers instead. (#4707)

### Minor Changes
#### com.unity.ml-agents / com.unity.ml-agents.extensions (C#)
4 changes: 1 addition & 3 deletions docs/ML-Agents-Overview.md
@@ -372,7 +372,7 @@ your agent's behavior:
below).
- `rnd`: represents an intrinsic reward signal that encourages exploration
in sparse-reward environments that is defined by the RND module (see
below). (Not available for TensorFlow trainers)
below).

### Deep Reinforcement Learning

@@ -437,8 +437,6 @@ of the trained model is used as intrinsic reward. The more an Agent visits a state, the
more accurate the predictions and the lower the rewards, which encourages the Agent to
explore new states with higher prediction errors.

__Note:__ RND is not available for TensorFlow trainers (only PyTorch trainers)
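As an aside, the RND mechanism described above fits in a short PyTorch sketch (illustrative only; network sizes and names are assumptions, not the ml-agents implementation):

```python
import torch
from torch import nn

obs_size, feat_size = 8, 32

# Fixed, randomly initialized target network; it is never trained.
target = nn.Sequential(nn.Linear(obs_size, 64), nn.ReLU(), nn.Linear(64, feat_size))
for p in target.parameters():
    p.requires_grad = False

# Predictor network, trained to reproduce the target's outputs.
predictor = nn.Sequential(nn.Linear(obs_size, 64), nn.ReLU(), nn.Linear(64, feat_size))
optimizer = torch.optim.Adam(predictor.parameters(), lr=3e-4)

def rnd_intrinsic_reward(obs: torch.Tensor) -> torch.Tensor:
    # Prediction error is the intrinsic reward: frequently visited states
    # become predictable, so their reward shrinks.
    error = (predictor(obs) - target(obs)).pow(2).mean(dim=-1)
    optimizer.zero_grad()
    error.mean().backward()
    optimizer.step()
    return error.detach()
```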

### Imitation Learning

It is often more intuitive to simply demonstrate the behavior we want an agent
2 changes: 1 addition & 1 deletion docs/Training-Configuration-File.md
@@ -32,7 +32,7 @@ choice of the trainer (which we review in subsequent sections).
| `time_horizon` | (default = `64`) How many steps of experience to collect per-agent before adding it to the experience buffer. When this limit is reached before the end of an episode, a value estimate is used to predict the overall expected reward from the agent's current state. As such, this parameter trades off between a less biased, but higher-variance estimate (long time horizon) and a more biased, but lower-variance estimate (short time horizon). In cases where there are frequent rewards within an episode, or episodes are prohibitively long, a smaller value can be preferable. This number should be large enough to capture all the important behavior within a sequence of an agent's actions. <br><br> Typical range: `32` - `2048` |
| `max_steps` | (default = `500000`) Total number of steps (i.e., observations collected and actions taken) that must be taken in the environment (or across all environments if using multiple in parallel) before ending the training process. If you have multiple agents with the same behavior name within your environment, all steps taken by those agents will contribute to the same `max_steps` count. <br><br>Typical range: `5e5` - `1e7` |
| `keep_checkpoints` | (default = `5`) The maximum number of model checkpoints to keep. Checkpoints are saved after the number of steps specified by the `checkpoint_interval` option. Once the maximum number of checkpoints has been reached, the oldest checkpoint is deleted when saving a new checkpoint. |
| `checkpoint_interval` | (default = `500000`) The number of experiences collected between each checkpoint by the trainer. A maximum of `keep_checkpoints` checkpoints are saved before old ones are deleted. Each checkpoint saves the `.onnx` (and `.nn` if using TensorFlow) files in `results/` folder.|
| `checkpoint_interval` | (default = `500000`) The number of experiences collected between each checkpoint by the trainer. A maximum of `keep_checkpoints` checkpoints are saved before old ones are deleted. Each checkpoint saves the `.onnx` files in the `results/` folder (see the sketch below).|
| `init_path` | (default = None) Initialize trainer from a previously saved model. Note that the prior run should have used the same trainer configurations as the current run, and have been saved with the same version of ML-Agents. <br><br>You should provide the full path to the folder where the checkpoints were saved, e.g. `./models/{run-id}/{behavior_name}`. This option is provided in case you want to initialize different behaviors from different runs; in most cases, it is sufficient to use the `--initialize-from` CLI parameter to initialize all models from the same run. |
| `threaded` | (default = `true`) By default, model updates can happen while the environment is being stepped. This violates the [on-policy](https://spinningup.openai.com/en/latest/user/algorithms.html#the-on-policy-algorithms) assumption of PPO slightly in exchange for a training speedup. To maintain the strict on-policyness of PPO, you can disable parallel updates by setting `threaded` to `false`. There is usually no reason to turn `threaded` off for SAC. |
| `hyperparameters -> learning_rate` | (default = `3e-4`) Initial learning rate for gradient descent. Corresponds to the strength of each gradient descent update step. This should typically be decreased if training is unstable, and the reward does not consistently increase. <br><br>Typical range: `1e-5` - `1e-3` |
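To make the interaction between `keep_checkpoints` and `checkpoint_interval` concrete, here is a small sketch (hypothetical helper, not ml-agents code):

```python
# Which checkpoints remain on disk after collecting `total_steps` experiences?
def surviving_checkpoints(total_steps: int, interval: int, keep: int) -> list:
    saved = list(range(interval, total_steps + 1, interval))
    return saved[-keep:]  # the oldest checkpoints beyond `keep` are deleted

# With the defaults (interval=500000, keep=5), after 5M experiences only the
# checkpoints at 3.0M, 3.5M, 4.0M, 4.5M and 5.0M remain.
print(surviving_checkpoints(5_000_000, 500_000, 5))
```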
3 changes: 0 additions & 3 deletions docs/Training-ML-Agents.md
@@ -317,9 +317,6 @@ behaviors:
      save_steps: 50000
      swap_steps: 2000
      team_change: 100000

    # use TensorFlow backend
    framework: tensorflow
```

Here is an equivalent file if we use an SAC trainer instead. Notice that the
17 changes: 1 addition & 16 deletions docs/Unity-Inference-Engine.md
@@ -19,19 +19,6 @@ Graphics Emulation is set to **OpenGL(ES) 3.0 or 2.0 emulation**. Also there
might be non-fatal build-time errors when the target platform includes a Graphics API
that does not support **Unity Compute Shaders**.

## Supported formats

There are currently two supported model formats:

- Barracuda (`.nn`) files use a proprietary format produced by the
[`tensorflow_to_barracuda.py`]() script.
- ONNX (`.onnx`) files use an
[industry-standard open format](https://onnx.ai/about.html) produced by the
[tf2onnx package](https://github.com/onnx/tensorflow-onnx).

Export to ONNX is used if using PyTorch (the default). To enable it
while using TensorFlow, make sure `tf2onnx>=1.6.1` is installed in pip.
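With the Barracuda path removed, export goes straight from PyTorch to ONNX. A generic sketch of that flow (stand-in network and file name; standard `torch.onnx.export` usage rather than the actual ml-agents serialization code):

```python
import torch
from torch import nn

# Stand-in policy; the real graph comes from the trained model.
policy = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 2))
dummy_obs = torch.zeros(1, 8)  # example input that fixes the exported shapes

torch.onnx.export(
    policy, dummy_obs, "policy.onnx",
    input_names=["obs"], output_names=["action"],
)
```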

## Using the Unity Inference Engine

When using a model, drag the model file into the **Model** field in the
@@ -56,7 +43,5 @@ If you wish to run inference on an externally trained model, you should use
Barracuda directly, instead of trying to run it through ML-Agents.

## Model inference outside of Unity
We do not provide support for inference anywhere outside of Unity. The
`frozen_graph_def.pb` and `.onnx` files produced by training are open formats
for TensorFlow and ONNX respectively; if you wish to convert these to another
We do not provide support for inference anywhere outside of Unity. The `.onnx` files produced by training use the open ONNX format; if you wish to convert these files to another
format or run inference with them, refer to their documentation.
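For the pointer above, a minimal sketch of loading such a file with ONNX Runtime (path and observation size are assumptions; exported ML-Agents models may expose several named inputs):

```python
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("results/MyRun/MyBehavior.onnx")
obs = np.zeros((1, 8), dtype=np.float32)  # one dummy observation
feeds = {session.get_inputs()[0].name: obs}
print(session.run(None, feeds))  # all model outputs
```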
4 changes: 0 additions & 4 deletions ml-agents/mlagents/tf_utils/__init__.py

This file was deleted.

60 changes: 0 additions & 60 deletions ml-agents/mlagents/tf_utils/tf.py

This file was deleted.

21 changes: 18 additions & 3 deletions ml-agents/mlagents/trainers/cli_utils.py
@@ -4,6 +4,21 @@
from mlagents.trainers.exception import TrainerConfigError
from mlagents_envs.environment import UnityEnvironment
import argparse
from mlagents_envs import logging_util

logger = logging_util.get_logger(__name__)


class RaiseDeprecationWarning(argparse.Action):
    """
    Internal custom Action that raises a warning when the argument is used.
    """

    def __init__(self, nargs=0, **kwargs):
        super().__init__(nargs=nargs, **kwargs)

    def __call__(self, arg_parser, namespace, values, option_string=None):
        logger.warning(
            f"The command line argument {option_string} is deprecated and has no effect."
        )
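As a quick illustration (not part of the PR), wiring this Action into a throwaway parser shows the flag still parses but only logs:

```python
# Illustration only: the deprecated flag is accepted but inert apart from the log.
demo_parser = argparse.ArgumentParser()
demo_parser.add_argument("--tensorflow", action=RaiseDeprecationWarning, default=False)
demo_parser.parse_args(["--tensorflow"])
# warning: "The command line argument --tensorflow is deprecated and has no effect."
```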


class DetectDefault(argparse.Action):
@@ -171,14 +186,14 @@ def _create_parser() -> argparse.ArgumentParser:
    argparser.add_argument(
        "--torch",
        default=False,
        action=DetectDefaultStoreTrue,
        action=RaiseDeprecationWarning,
        help="(Deprecated) Use the PyTorch framework. Note that this option is not required anymore as PyTorch is the"
        " default framework, and will be removed in the next release.",
    )
    argparser.add_argument(
        "--tensorflow",
        default=False,
        action=DetectDefaultStoreTrue,
        action=RaiseDeprecationWarning,
Contributor:

Should we remove this one altogether? (Deprecated) usually means that it still works but will be removed in the future, whereas this flag won't do anything

Contributor Author (@vincentpierre, Dec 11, 2020):

I see 3 options:

- Remove --tensorflow and --torch (I think users will complain that a command line that used to work does not anymore)
- Raise a specific error when --tensorflow is used to signal to the user that the argument has been removed
- Only raise a warning when used (the warning can say "feature removed" instead of "feature deprecated")

I have a preference for option 2 but not a strong one. What do you think?

Contributor:

I think keeping it around with a warning for one more release is good practice. For the wording: "tensorflow" is "removed", while "--tensorflow" and "--torch" are deprecated and have no effect.

help="(Deprecated) Use the TensorFlow framework instead of PyTorch. Install TensorFlow "
"before using this option.",
)
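For reference, option 2 from the thread above might look roughly like this sketch (hypothetical class name; the merged PR keeps the warning-only behavior):

```python
# Sketch of the rejected alternative: fail loudly instead of warning.
class RaiseRemovedError(argparse.Action):
    def __init__(self, nargs=0, **kwargs):
        super().__init__(nargs=nargs, **kwargs)

    def __call__(self, arg_parser, namespace, values, option_string=None):
        # argparse converts ArgumentError into a usage error and exits.
        raise argparse.ArgumentError(
            self, f"{option_string} has been removed; only the PyTorch trainers are available."
        )
```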
7 changes: 1 addition & 6 deletions ml-agents/mlagents/trainers/learn.py
@@ -10,7 +10,6 @@

import mlagents.trainers
import mlagents_envs
from mlagents import tf_utils
from mlagents.trainers.trainer_controller import TrainerController
from mlagents.trainers.environment_parameter_manager import EnvironmentParameterManager
from mlagents.trainers.trainer import TrainerFactory
@@ -21,7 +20,7 @@
    GaugeWriter,
    ConsoleWriter,
)
from mlagents.trainers.cli_utils import parser, DetectDefault
from mlagents.trainers.cli_utils import parser
from mlagents_envs.environment import UnityEnvironment
from mlagents.trainers.settings import RunOptions

@@ -135,8 +134,6 @@ def run_training(run_seed: int, options: RunOptions) -> None:
        param_manager=env_parameter_manager,
        init_path=maybe_init_path,
        multi_gpu=False,
        force_torch="torch" in DetectDefault.non_default_args,
        force_tensorflow="tensorflow" in DetectDefault.non_default_args,
    )
    # Create controller and begin training.
    tc = TrainerController(
@@ -242,8 +239,6 @@ def run_cli(options: RunOptions) -> None:
        log_level = logging_util.DEBUG
    else:
        log_level = logging_util.INFO
    # disable noisy warnings from tensorflow
    tf_utils.set_warnings_enabled(False)

    logging_util.set_log_level(log_level)
