This repository was archived by the owner on Nov 17, 2023. It is now read-only.
@mxnet-label-bot add [Question, Example]
I am referring to the gluon example: actor critic.
According to the code in actor_critic.py, the true return of each state is calculated from the episode's rewards, which is a Monte Carlo method without bootstrapping.
So I think the name should be REINFORCE with Baseline, not Actor Critic; see Section 13.5 of the book Reinforcement Learning: An Introduction.
I also found that PyTorch has the same issue with their example. But anyway, it is just a naming problem. If most people think this should also be treated as Actor Critic, then never mind~
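To make the distinction concrete, here is a minimal sketch of the two kinds of targets. It is not the code from actor_critic.py: the function names, the discount factor, and the sample rewards and value estimates are all assumptions chosen for illustration. The first function computes full Monte Carlo returns (no bootstrapping, as in REINFORCE with Baseline); the second computes one-step bootstrapped targets that rely on the critic's estimate of the next state, which is roughly what a one-step actor-critic update would use.

```python
def monte_carlo_returns(rewards, gamma=0.99):
    """Discounted return G_t for every step, accumulated backward from the episode's end."""
    returns = []
    running = 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def one_step_td_targets(rewards, values, gamma=0.99):
    """Bootstrapped targets r_t + gamma * V(s_{t+1}).

    The value after the final step is taken to be 0 (episode assumed to terminate).
    """
    next_values = list(values[1:]) + [0.0]
    return [r + gamma * v_next for r, v_next in zip(rewards, next_values)]


# Hypothetical three-step episode and hypothetical critic estimates V(s_t).
rewards = [1.0, 1.0, 1.0]
values = [2.5, 1.8, 0.9]

mc = monte_carlo_returns(rewards, gamma=0.5)
td = one_step_td_targets(rewards, values, gamma=0.5)

# REINFORCE with Baseline: V(s_t) is only subtracted from the full return.
mc_advantages = [g - v for g, v in zip(mc, values)]

# One-step actor-critic: V(s_{t+1}) also appears inside the target (bootstrapping).
td_advantages = [t - v for t, v in zip(td, values)]

print("Monte Carlo returns:", mc)
print("One-step TD targets:", td)
```

In the first case the value function only shifts the return and never enters the target itself; in the second it does, and that use of the value estimate inside the target is the bootstrapping I mean above.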