Major code refactoring #6

haarnoja · 2018-01-29T00:51:27Z

This pull request replaces large parts of the implementation with soft actor-critic (SAC) code for better compatibility and easier maintenance. It also changes how the action bounds are enforced by replacing InputBounds with a squashing function (tanh).
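To illustrate the squashing idea, here is a minimal sketch in plain Python rather than the repository's actual code. The function name `squash_action` and the optional rescaling to `(low, high)` are assumptions made for illustration; the description above only says that tanh replaces InputBounds.

```python
import math


def squash_action(raw_action, low=-1.0, high=1.0):
    """Map unbounded raw action values into the range [low, high] via tanh.

    Hypothetical helper: the rescaling to (low, high) is an assumption,
    the PR itself only mentions tanh squashing.
    """
    # tanh maps every real number into [-1, 1] (strictly inside for moderate inputs).
    squashed = [math.tanh(a) for a in raw_action]
    # Affine rescale from [-1, 1] to [low, high].
    return [low + 0.5 * (s + 1.0) * (high - low) for s in squashed]


raw = [-10.0, 0.0, 0.3, 10.0]
bounded = squash_action(raw)
```

Because the bound is built into the mapping itself, no separate clipping step is needed, and the mapping stays differentiable everywhere.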

…parts (#9)

* Replace `RLAlgorithm` with the version from SAC.
* Replace `ReplayBuffer` and miscellaneous helper functions.
* Replace `NNQFunction` and `Plotter`.
* Replace `StochasticNNPolicy`.
* Change `Kernel` class into a function.

* Eliminate the need for custom rllab
* Instead of a `SerializableTensor` hack, now using a similar serialization scheme as in SAC.
* Add scripts to test serialization.
* Add option to train on EC2.
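A rough sketch of what a serialization round-trip test could look like. This assumes a scheme in which an object remembers its constructor arguments and is rebuilt from them when unpickled. The classes `SerializableSketch` and `ToyPolicy` are hypothetical stand-ins, not the classes used in this repository or in SAC.

```python
import pickle


class SerializableSketch:
    """Hypothetical base class: remembers constructor arguments for pickling."""

    def __init__(self, *args, **kwargs):
        self._init_args = args
        self._init_kwargs = kwargs

    def __getstate__(self):
        # Only the constructor arguments are stored, not the built object.
        return {"args": self._init_args, "kwargs": self._init_kwargs}

    def __setstate__(self, state):
        # Rebuild the object by running the subclass constructor again.
        self.__init__(*state["args"], **state["kwargs"])


class ToyPolicy(SerializableSketch):
    """Hypothetical policy with a few plain attributes."""

    def __init__(self, observation_dim, action_dim, hidden_sizes=(100, 100)):
        super().__init__(observation_dim, action_dim, hidden_sizes=hidden_sizes)
        self.observation_dim = observation_dim
        self.action_dim = action_dim
        self.hidden_sizes = hidden_sizes


policy = ToyPolicy(3, 2, hidden_sizes=(64, 64))
restored = pickle.loads(pickle.dumps(policy))
```

A test script can then compare `restored` against `policy` attribute by attribute. Note that this sketch restores only the constructor arguments; learned parameters would need to be saved and loaded separately.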

* Remove unused files and code
* Add missing `__init__.py`s
* Rename `algos` -> `algorithms` and `envs` -> `environments`
* Overall clean-up here and there

* Sample a different latent for each sample in batch
* Replace hacky `input_bounds` with proper squashing
* Rename `mlp` -> `feedforward_net` and clean it up
* Simplify `SQL` structure and consolidate notation
* Remove temperature parameter
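The first item in the list above can be sketched as follows, in plain Python. The function names, the standard-normal latent distribution, and the shapes are assumptions for illustration; the commit only states that each sample in the batch now gets its own latent.

```python
import random


def sample_latents(batch_size, latent_dim, rng):
    """Draw an independent standard-normal latent vector for every batch row."""
    return [[rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
            for _ in range(batch_size)]


def sample_shared_latent(batch_size, latent_dim, rng):
    """Contrast case: draw one latent vector and reuse it for every row."""
    latent = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
    return [list(latent) for _ in range(batch_size)]


rng = random.Random(0)
per_sample = sample_latents(4, 2, rng)      # four different latent vectors
shared = sample_shared_latent(4, 2, rng)    # the same latent vector four times
```

With a shared latent, the noise fed to a stochastic policy is identical across the whole batch, so the sampled actions are correlated; independent latents avoid that.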

* Add wrapper for Gym environments
* Update multigoal example.
* Revert multigoal env.

* Clean `Dockerfile` and `environment.yml`
* Update `README.md`

haarnoja added 7 commits January 23, 2018 14:25

Better serialization and removed dependency on a custom rllab. (#10)

f644925


Remove unused files and general clean-up (#11)

932f3dc


Refactor/sql/algorithmic improvements (#12)

75cba1b


Refactor/sql/environment fixes (#14)

02882fa


Update environment and README.md (#13)

a178303


Commenting (#17)

0a4718e

haarnoja merged commit 59c0bbb into master Jan 29, 2018

haarnoja deleted the refactor/master branch January 29, 2018 00:51

haarnoja mentioned this pull request Jan 29, 2018

Temperature parameter is not handled properly #5

Closed
