Add PPO agent #126

toslunar · 2017-08-01T05:25:34Z

Resolves #120.

Currently, this only supports environments with discrete actions.

coveralls · 2017-08-01T06:54:57Z

Changes Unknown when pulling aa30724 on toslunar:ppo-agent into chainer:master.

coveralls · 2017-08-02T06:42:01Z

Changes Unknown when pulling 7bf31c1 on toslunar:ppo-agent into chainer:master.

coveralls · 2017-08-03T10:24:29Z

Changes Unknown when pulling 7197d63 on toslunar:ppo-agent into chainer:master.

coveralls · 2017-08-09T07:55:36Z

Changes Unknown when pulling 7da765b on toslunar:ppo-agent into chainer:master.

muupan · 2017-08-11T22:07:27Z

How is the performance?

toslunar · 2017-10-27T07:39:31Z

The GPU tests passed on commit f31626c under Python 3:

nosetests -v -a 'gpu' tests
./test_example.sh 0

muupan · 2017-11-09T01:48:47Z

chainerrl/links/empirical_normalization.py

+        shape (int or tuple of int): Shape of input values except batch axis.
+        batch_axis (int): Batch axis.
+        eps (float): Small value for stability.
+        dtype (dtype): Dtype of input values.


Can you add a docstring entry for the until argument?

muupan · 2017-11-09T04:26:32Z

chainerrl/links/empirical_normalization.py

+        # clear cache
+        self._cached_std_inverse = None
+
+    def __call__(self, x, update=True):


Can you add a docstring, since update is important?

muupan · 2017-11-09T04:27:07Z

chainerrl/links/empirical_normalization.py

+
+        return self._cached_std_inverse
+
+    def experience(self, x):


Can you add a docstring?
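
For readers following the review, here is a minimal sketch of what the two methods in these comments could do, with docstrings of the kind requested. It is not the code from this PR: the class name `RunningNormalizer`, the `size` parameter, and the fixed batch axis 0 are assumptions made for illustration, and it uses plain Python lists rather than Chainer arrays. The `eps` and `update` names mirror the diff above.

```python
import math


class RunningNormalizer:
    """Hypothetical pure-Python stand-in for an empirical normalizer.

    Inputs are batches given as lists of rows; axis 0 is the batch axis.
    """

    def __init__(self, size, eps=1e-2):
        self.eps = eps              # small value added to the variance
        self.count = 0              # number of rows seen so far
        self.mean = [0.0] * size    # running per-feature mean
        self.var = [1.0] * size     # running per-feature variance

    def experience(self, batch):
        """Fold a batch into the running mean and variance.

        Call this alone to learn statistics without normalizing anything.
        """
        n = len(batch)
        if n == 0:
            return
        self.count += n
        rate = n / self.count
        for i in range(len(self.mean)):
            column = [row[i] for row in batch]
            batch_mean = sum(column) / n
            batch_var = sum((v - batch_mean) ** 2 for v in column) / n
            delta = batch_mean - self.mean[i]
            self.mean[i] += rate * delta
            # Uses the already-updated mean, which yields the exact
            # variance of the old data and the new batch combined.
            self.var[i] += rate * (
                batch_var - self.var[i] + delta * (batch_mean - self.mean[i]))

    def __call__(self, batch, update=True):
        """Normalize a batch with the running statistics.

        If update is True, the batch is first folded into the statistics.
        Pass update=False to leave the statistics untouched.
        """
        if update:
            self.experience(batch)
        return [
            [(v - m) / math.sqrt(s + self.eps)
             for v, m, s in zip(row, self.mean, self.var)]
            for row in batch
        ]
```

The `update` flag matters because a call with `update=True` changes the statistics as a side effect, so the same input can normalize differently on later calls. Passing `update=False` keeps them fixed.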

muupan · 2017-11-09T09:37:23Z

chainerrl/policies/gaussian_policy.py

+        layers.append(L.Linear(n_input_channels, n_hidden_channels))
+        for _ in range(n_hidden_layers - 1):
+            layers.append(self.nonlinearity)
+            layers.append(L.Linear(n_hidden_channels, n_hidden_channels))


The nonlinearity is missing before the last layer.
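
A sketch of the ordering this comment asks for, under assumptions: the diff excerpt does not show how the final layer is appended, so the output layer, the helper name `build_mlp_layers`, and the `n_output_channels` parameter are hypothetical. Tuples of the form `("linear", in, out)` stand in for `L.Linear(in, out)` so the example runs without Chainer.

```python
def build_mlp_layers(n_input_channels, n_hidden_channels, n_output_channels,
                     n_hidden_layers, nonlinearity="relu"):
    """Return an ordered layer spec for an MLP (hypothetical helper).

    ("linear", i, o) is a placeholder for L.Linear(i, o).
    """
    if n_hidden_layers < 1:
        raise ValueError("n_hidden_layers must be at least 1")
    layers = [("linear", n_input_channels, n_hidden_channels)]
    for _ in range(n_hidden_layers - 1):
        layers.append(nonlinearity)
        layers.append(("linear", n_hidden_channels, n_hidden_channels))
    # The activation the review points out: without it, the last hidden
    # layer and the output layer collapse into a single linear map.
    layers.append(nonlinearity)
    layers.append(("linear", n_hidden_channels, n_output_channels))
    return layers
```

With two hidden layers, the result alternates linear layers and activations, so every linear layer except the output is followed by the nonlinearity.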

muupan · 2017-11-13T10:28:42Z

LGTM

toslunar added 17 commits July 24, 2017 19:32

.

c3737aa

debug

3d823c1

..

94a4cc7

debug

ba87a4f

use pessimistic loss for value function

c11fe88

flake8

c37ff02

fix default lr

5438831

add option not to clip vf

164fcf5

example ppo

eddf28e

test train_ppo

a1cceaa

add docstring

6981ebd

add test_ppo

c31a76c

weaken test

c02b419

Add gpu support

6fb6547

debug

10ce5e2

debug gpu codes

ca80279

flake8

aa30724

toslunar added 2 commits August 2, 2017 14:16

add statistics

999a8f5

add A3CFFGaussian

7bf31c1

toslunar added 2 commits August 3, 2017 18:53

gaussian policy with state-independent variance

406291c

.

7197d63

debug

7da765b

toslunar added 3 commits August 14, 2017 07:24

cp

7320202

ppo

0217b2f

Use parameters for Atari

b138589

toslunar and others added 16 commits October 23, 2017 21:01

flake8

e3f0ba8

Merge branch 'master' into ppo-agent

634628b

fix 'add obs_filter'

22e157e

add bound_mean option

78a8c30

fix EmpiricalNormalization

8da29e9

move tests

d26d0c1

split tests

c6986ba

fix

a5b911a

Add tests

622cab1

Hide internals

c18e660

misc

d64c9a0

Add tests with Chainer

d0b3714

bugfix

67ac092

misc

fdc0c92

fix

f31626c

keep Python 2 support

3cd14a5

toslunar added 2 commits October 30, 2017 10:28

Add docstring

3e35647

Remove unused variable

2b6d64c

muupan reviewed Nov 9, 2017

muupan requested changes Nov 9, 2017

toslunar added 3 commits November 9, 2017 19:01

bugfix

f3a147d

add docstrings

5fe19fb

misc

877fa8f

muupan approved these changes Nov 13, 2017

muupan merged commit 04e938e into chainer:master Nov 13, 2017

toslunar deleted the ppo-agent branch November 13, 2017 10:34

muupan added the enhancement label Nov 30, 2017

muupan added this to the v0.3 milestone Nov 30, 2017
