Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

2019.10.3

Fixed

Better parameters for example her_ddpg_fetchreach (#1764)
Bug in DiscreteQfDerivedPolicy in which parameters were not returned (#1847)
Bug which made it impossible to evaluate stochastic policies deterministically (#1715)

2019.10.2

Fixed

Use a GitHub Token in the CI to retrieve packages to avoid hitting GitHub API rate limit (#1250)
Avoid installing dev extra dependencies during the conda check (#1296)
Install dm_control from PyPI (#1406)
Pin tfp to 0.8.x to avoid breaking pipenv (#1480)
Force python 3.5 in CI (#1522)
Separate terminal and completion signal in vectorized sampler (#1581)
Disable certificate check for roboti.us (#1595)
Fix advantages shape in compute_advantage() in torch tree (#1209)
Fix plotting using tf.plotter (#1292)
Fix duplicate window rendering when using garage.Plotter (#1299)
Fix setting garage.model parameters (#1363)
Fix two example Jupyter notebooks (#1584)
Fix collecting samples in RaySampler (#1583)

2019.10.1

Added

Integration tests which cover all example scripts (#1078, #1090)
Deterministic mode support for PyTorch (#1068)
Install script support for macOS 10.15.1 (#1051)
PyTorch modules now support either functions or modules for specifying their non-linearities (#1038)

Fixed

Errors in the documentation on implementing new algorithms (#1074)
Broken example for DDPG+HER in TensorFlow (#1070)
Error in the documentation for using garage with conda (#1066)
Broken pickling of environment wrappers (#1061)
garage.torch was not included in the PyPI distribution (#1037)
A few broken examples for garage.tf (#1032)

2019.10.0

Added

Algorithms
- (D)DQN in TensorFlow (#582)
- Maximum-entropy and entropy regularization for policy gradient algorithms in TensorFlow (#632)
- DDPG in PyTorch (#815)
- VPG (i.e. policy gradients) in PyTorch (#883)
- TD3 in TensorFlow (#458)
APIs
- Runner API for executing experiments and LocalRunner implementation for executing them on the local machine (#541, #593, #602, #816)
- New Logger API, provided by a sister project dowel (#464, #660)
Environment wrappers for pixel-based algorithms, especially DQN (#556)
Example for how to use garage with Google Colab (#476)
Advantage normalization for recurrent policies in TF (#626)
PyTorch support (#725, #764)
Autogenerated API docs on garage.readthedocs.io (#802)
GPU version of the pip package (#834)
PathBuffer, a trajectory-oriented replay buffer (#838)
RaySampler, a remote and/or multiprocess sampler based on ray (#793)
Garage is now distributed on PyPI (#870)
rollout option to only sample policies deterministically (#896)
MultiEnvWrapper, which wraps multiple gym.Env environments into a discrete multi-task environment (#946)

Changed

Optimized Dockerfiles for fast rebuilds (#557)
Random seed APIs moved to garage.experiment.deterministic (#578)
Experiment wrapper script is now an ordinary module (#586)
numpy-based modules and algorithms moved to garage.np (#604)
Algorithm constructors now use EnvSpec rather than gym.Env (#575)
Snapshotter API moved from garage.logger to garage.experiment (#658)
Moved process_samples API from the Sampler to algorithms (#652)
Updated Snapshotter API (#699)
Updated Resume API (#777)
All algorithms now have a default sampler (#832)
Experiment launchers now require an explicit snapshot_config to their run_task function (#860)
Various samplers moved from garage.tf.sampler to garage.sampler (#836, #840)
Dockerfiles are now based on Ubuntu 18.04 LTS by default (#763)
dm_control is now an optional dependency, installed using the extra garage[dm_control] (#828)
MuJoCo is now an optional dependency, installed using the extra garage[mujoco] (#848)
Samplers no longer flatten observations and actions (#930, #938, #967)
Implementations, tests, and benchmarks for all TensorFlow primitives, which are now based on garage.tf.Model (#574, #606, #615, #616, #618, #641, #642, #656, #662, #668, #672, #677, #730, #722, #765, #855, #878, #888, #898, #892, #897, #893, #890, #903, #916, #891, #922, #931, #933, #906, #945, #944, #943, #972)
Dependency upgrades:
- mujoco-py to 2.0 (#661)
- gym to 0.12.4 (#661)
- dm_control to 7a36377879c57777e5d5b4da5aae2cd2a29b607a (#661)
- akro to 0.0.6 (#796)
- pycma to 2.7.0 (#861)
- tensorflow to 1.15 (#953)
- pytorch to 1.3.0 (#952)

Removed

garage.misc.autoargs, a tool for decorating classes with autogenerated command-line arguments (#573)
garage.misc.ext, a module with several unrelated utilities (#578)
config_personal.py module, replaced by environment variables where relevant (#578, #747)
contrib.rllab_hyperopt, an experimental module for using hyperopt to tune hyperparameters (#684)
contrib.bichenchao, a module of example launchers (#683)
contrib.alexbeloi, a module with an importance-sampling sampler and examples (these were merged into garage) (#717)
EC2 cluster documentation and examples (#835)
DeterministicMLPPolicy, because it duplicated ContinuousMLPPolicy (#929)
garage.tf.layers, a custom high-level neural network definition API, was replaced by garage.tf.models (#939)
Parameterized, which was replaced by garage.tf.Model (#942)
garage.misc.overrides, whose features are no longer needed due to proper ABC support in Python 3 and sphinx-autodoc (#974)
Serializable, which became a maintainability burden and has now been replaced by regular pickle protocol (__getstate__/__setstate__) implementations, where necessary (#982)
garage.misc.special, a library of mostly-unused math subroutines (#986)
garage.envs.util, superseded by features in akro (#986)
garage.misc.console, a library of mostly-unused helper functions for writing shell scripts (#988)
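
For readers migrating code that relied on Serializable, the regular pickle protocol mentioned above can be sketched roughly as follows. This is a minimal illustration, not garage's implementation: ExamplePolicy, hidden_sizes, and _cache are hypothetical names invented for the example.

```python
import pickle


class ExamplePolicy:
    """Hypothetical class (not part of garage) showing the pickle protocol."""

    def __init__(self, hidden_sizes):
        self.hidden_sizes = hidden_sizes
        # Stand-in for a resource that cannot be pickled, such as a
        # session handle. Lambdas are not picklable by default.
        self._cache = lambda x: x

    def __getstate__(self):
        # Copy the instance dict and drop the unpicklable attribute.
        state = self.__dict__.copy()
        del state['_cache']
        return state

    def __setstate__(self, state):
        # Restore the saved attributes, then rebuild the dropped resource.
        self.__dict__.update(state)
        self._cache = lambda x: x


# Round-trip the object through pickle.
restored = pickle.loads(pickle.dumps(ExamplePolicy((32, 32))))
print(restored.hidden_sizes)
```

Only classes that hold unpicklable state need these two methods; plain attribute containers pickle without them, which is presumably why the entry above says "where necessary".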

Fixed

Bug in ReplayBuffer (#554)
Bug in setup_linux.sh (#560)
Bug in examples/sim_policy.py (#691)
Bug in FiniteDifferenceHvp (#745)
Determinism bug for some samplers (#880)
use_gpu in the experiment runner (#918)

2019.02.2

Fixed

Bug in entropy regularization in TensorFlow PPO/TRPO (#579)
Bug in which advantage normalization was broken for recurrent policies (#626)
Bug in examples/sim_policy.py (#691)
Bug in FiniteDifferenceHvp (#745)

2019.02.1

Fixed

Fix overhead in GaussianMLPRegressor by optionally creating assign operations (#622)

2019.02.0

Added

Epsilon-greedy exploration strategy, DiscreteMLPModel, and QFunctionDerivedPolicy (all needed by DQN)
Base Model class for TensorFlow-based primitives
Dump plots generated with matplotlib to TensorBoard
Relative Entropy Policy Search (REPS) algorithm
GaussianConvBaseline and GaussianConvRegressor primitives
New Dockerfiles, docker-compose files, and Makefiles for running garage using Docker
Vanilla policy gradient loss to NPO
Truncated Natural Policy Gradient (TNPG) algorithm for TensorFlow
Episodic Reward Weighted Regression (ERWR) algorithm for TensorFlow
gym.Env wrappers used for pixel environments
Convolutional Neural Network primitive

Changed

Move dependencies from environment.yml to setup.py
Update dependencies:
- tensorflow-probability to 0.5.x
- dm_control to commit 92f9913
- TensorFlow to 1.12
- MuJoCo to 2.0
- gym to 0.10.11
Move dm_control tests into the unit test tree
Use GitHub standard .gitignore
Improve the implementation of RandomizedEnv (Dynamics Randomization)
Decouple TensorBoard from the logger
Move files from garage/misc/instrument to garage/experiment
Make setup.py canonical in format and use automatic versioning

Removed

Move some garage subpackages into their own repositories:
- garage.viskit to rlworkgroup/viskit
- garage.spaces to rlworkgroup/akro
Remove Theano backend, algorithms, and dependencies
Custom environments which duplicated openai/gym
Some dead files from garage/misc (meta.py and viewer2d.py)
Remove all code coverage tracking providers except CodeCov

Fixed

Clean up warnings in the test suite
Pickling bug in GaussianMLPPolicyWithModel
Namescope in LbfgsOptimizer
Correctly sample paths in OffPolicyVectorizedSampler
Implementation bugs in tf/VPG
Bug when importing Box
Bug in test_benchmark_her

2018.10.1

Fixed

Avoid importing Theano when using the TensorFlow branch
Avoid importing MuJoCo when not required
Implementation bugs in tf/VPG
Bug when importing Box
Bug in test_benchmark_her
Bug in the CI scripts which produced false positives

2018.10.0

Added

PPO and DDPG for the TensorFlow branch
HER for DDPG
Recurrent Neural Network policy support for NPO, PPO and TRPO
Base class for ReplayBuffer, and two implementations: SimpleReplayBuffer and HerReplayBuffer
Sampler classes OffPolicyVectorizedSampler and OnPolicyVectorizedSampler
Base class for off-policy algorithms, OffPolicyRLAlgorithm
Benchmark tests for TRPO, PPO and DDPG to compare their performance with those produced by OpenAI Baselines
Dynamics randomization for MuJoCo environments
Support for dm_control environments
DictSpace support for garage environments
PEP8 checks enforced in the codebase
Support for Python imports: maintain correct ordering and remove unused imports or import errors
Test on TravisCI using Docker images for managing dependencies
Testing code reorganized
Code Coverage measurement with codecov
Pre-commit hooks to enforce PEP8 and to verify imports and commit messages, which are also applied in the Travis CI verification
Docstring verification for added files that are not in the test branch or moved files
TensorBoard support for all key-value/log_tabular calls, plus support for logging distributions
Variable and name scope for symbolic operations in TensorFlow
Top-level base Space class for garage
Asynchronous plotting for Theano and TensorFlow
GPU support for Theano

Changed

Rename rllab to garage, including all the rllab references in the packages and modules inside the project
Rename run_experiment_lite to run_experiment
The file cma_es_lib.py was replaced by the pycma library available on PyPI
Move the contrib package to garage.contrib
Move Theano-dependent code to garage.theano
Move all code from sandbox.rocky.tf to garage.tf
Update several dependencies, mainly:
- Python to 3.6.6
- TensorFlow to 1.9
- Theano to 1.0.2
- mujoco-py to 1.50.1
- gym to 0.10.8
Transfer various dependencies from conda to pip
Separate example script files in the Theano and TensorFlow branch
Update LICENSE, CONTRIBUTING.md and .gitignore
Use convenience imports, that is, import classes and functions that share the same or a similar name with their module in the corresponding __init__.py file of their package
Replace ProxyEnv with gym.Wrapper
Update installation scripts for Linux and macOS

Removed

All unused imports in the Python files
Unused packages from environment.yml
The files under rllab.mujoco_py were removed to use the pip release instead
Empty __init__.py files
The environment class defined by rllab.envs.Env was not carried over to garage; the environment defined by gym.Env is used instead

Fixed

Sleeping processes produced by the parallel sampler. NOTE: although the frequency of this issue has been reduced, our tests in TravisCI occasionally detect the issue and currently it seems to be an issue with re-entrant locks and multiprocessing in Python.
