Add PPO agent #126
Changes Unknown when pulling aa30724 on toslunar:ppo-agent into chainer:master.
Changes Unknown when pulling 7bf31c1 on toslunar:ppo-agent into chainer:master.
Changes Unknown when pulling 7197d63 on toslunar:ppo-agent into chainer:master.
Changes Unknown when pulling 7da765b on toslunar:ppo-agent into chainer:master.
How is the performance?
Passed the GPU tests on commit f31626c, in Python 3.
    shape (int or tuple of int): Shape of input values except batch axis.
    batch_axis (int): Batch axis.
    eps (float): Small value for stability.
    dtype (dtype): Dtype of input values.
Can you add `until`?
        # clear cache
        self._cached_std_inverse = None

    def __call__(self, x, update=True):
Can you add a docstring, since `update` is important?
        return self._cached_std_inverse

    def experience(self, x):
Can you add a docstring?
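For illustration, the docstrings requested in this thread could look roughly like the NumPy sketch below. This is not the code in this PR: the class name `RunningNormalizer` is hypothetical, the `until` semantics (stop updating once that many samples have been seen) are an assumption, and the Chainer link, `batch_axis`, `dtype`, and the cached inverse standard deviation are all left out.

```python
import numpy as np


class RunningNormalizer:
    """Normalize inputs with running mean and variance (illustrative sketch).

    Args:
        shape (int or tuple of int): Shape of input values except batch axis.
        eps (float): Small value for stability.
        until (int or None): Assumed meaning: once at least this many
            samples have been seen, the statistics are no longer updated.
    """

    def __init__(self, shape, eps=1e-2, until=None):
        self.mean = np.zeros(shape, dtype=np.float64)
        self.var = np.ones(shape, dtype=np.float64)
        self.count = 0
        self.eps = eps
        self.until = until

    def experience(self, x):
        """Fold a batch into the running statistics without normalizing it.

        Args:
            x (array-like): Batch of inputs; axis 0 is the batch axis.
        """
        if self.until is not None and self.count >= self.until:
            return
        x = np.asarray(x, dtype=np.float64)
        batch_size = x.shape[0]
        if batch_size == 0:
            return
        self.count += batch_size
        rate = batch_size / self.count
        batch_mean = x.mean(axis=0)
        batch_var = x.var(axis=0)
        delta = batch_mean - self.mean
        # Update the mean first; the variance update uses the new mean.
        self.mean += rate * delta
        self.var += rate * (
            batch_var - self.var + delta * (batch_mean - self.mean)
        )

    def __call__(self, x, update=True):
        """Return normalized inputs.

        Args:
            x (array-like): Batch of inputs; axis 0 is the batch axis.
            update (bool): If True, fold x into the running statistics
                before normalizing. Pass False to leave them unchanged,
                for example at evaluation time.
        """
        if update:
            self.experience(x)
        x = np.asarray(x, dtype=np.float64)
        return (x - self.mean) / np.sqrt(self.var + self.eps)
```

With this shape, calling `normalizer(x, update=False)` leaves the statistics untouched, which is the behavior a docstring for `update` would need to spell out.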
    layers.append(L.Linear(n_input_channels, n_hidden_channels))
    for _ in range(n_hidden_layers - 1):
        layers.append(self.nonlinearity)
        layers.append(L.Linear(n_hidden_channels, n_hidden_channels))
Nonlinearity is missing before the last layer
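A dependency-free sketch of the ordering this comment asks for is shown below. It is not the PR's code: `build_layer_specs` is a hypothetical helper, the tuples stand in for `L.Linear` links, the string stands in for `self.nonlinearity`, and `n_output_channels` is an assumed name for the output size.

```python
def build_layer_specs(n_input_channels, n_output_channels,
                      n_hidden_layers, n_hidden_channels,
                      nonlinearity="tanh"):
    """Return the layer sequence as plain placeholders (illustrative only)."""
    layers = [("linear", n_input_channels, n_hidden_channels)]
    for _ in range(n_hidden_layers - 1):
        layers.append(nonlinearity)
        layers.append(("linear", n_hidden_channels, n_hidden_channels))
    # The step the review comment points at: apply the nonlinearity to the
    # last hidden activation before the output layer.
    layers.append(nonlinearity)
    layers.append(("linear", n_hidden_channels, n_output_channels))
    return layers
```

In this arrangement every linear layer except the output one is followed by the nonlinearity, so the network does not end with two linear maps back to back.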
LGTM
Resolves #120.
Currently, this only supports environments with discrete actions.