Hello! I have read your paper and your code, both of which have been very helpful to me. Thank you for sharing your valuable work with us!
I have a question:
Why did you design the actor network with one fully connected layer and 3 weights, for a total of 16 learnable parameters, while the critic network is relatively large, with about 1.5k learnable parameters?
Could you please explain the philosophy behind this unbalanced actor-critic design?
LIN Xuan
Hello Lin,
Thank you for your email. As you might know, the Critic network is built
to estimate the value function (i.e. either the simpler state-value 'V' or
the action-value 'Q'), while the Actor network is meant to update the policy
distribution in the direction *suggested by the Critic.* The Critic
is therefore doing the heavy "brain" work; in chess, for example, it is doing
the actual "thinking" part. The Actor is "following" the Critic's
suggestions.
That is why one often finds that the Critic network needs to be more
complicated.
Hope this helps, and I wish you all the best for your research.
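To make the division of labour concrete, here is a minimal sketch of a one-step TD actor-critic update. It is only an illustration under stated assumptions: it uses small tables instead of neural networks, a softmax policy over discrete actions, and hypothetical names (`actor_critic_step`, `theta`, `w`). It is not the architecture or the code from the paper. The point it shows is that the Critic produces the learning signal (the TD error) and the Actor merely scales its policy update by that signal.

```python
import math


def softmax(prefs):
    """Turn action preferences into probabilities (numerically stable)."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def actor_critic_step(theta, w, state, action, reward, next_state, done,
                      gamma=0.99, alpha_actor=0.1, alpha_critic=0.1):
    """One-step TD actor-critic update (toy, tabular; all names hypothetical).

    theta[s][a] : Actor's preference for action a in state s
    w[s]        : Critic's estimate of the state value V(s)
    """
    # Critic: evaluate the transition and compute the TD error.
    target = reward + (0.0 if done else gamma * w[next_state])
    td_error = target - w[state]
    w[state] += alpha_critic * td_error

    # Actor: move the policy in the direction the Critic suggests.
    # For a softmax policy, d log pi(a|s) / d theta[s][b] = 1[a == b] - pi(b|s).
    probs = softmax(theta[state])
    for b in range(len(probs)):
        grad_log_pi = (1.0 if b == action else 0.0) - probs[b]
        theta[state][b] += alpha_actor * td_error * grad_log_pi

    return td_error


# Usage: one state, two actions; action 1 earned a reward of 1 and ended the episode.
theta = [[0.0, 0.0]]
w = [0.0]
td = actor_critic_step(theta, w, state=0, action=1, reward=1.0,
                       next_state=0, done=True)
```

In this toy run the Critic reports a positive TD error, so the Actor raises its preference for the action that was taken. The Actor's update is a simple scaled step, whereas the quality of the whole scheme depends on how accurately the Critic estimates the value function.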
regards, Rajesh
On Fri, Jun 14, 2024 at 6:41 PM LIN Xuan ***@***.***> wrote:
Mr Siraskar,
Hello! I have read your paper and your code, both of which have been very
helpful to me. Thank you for sharing your valuable work with us!
I have a question:
Why did you design the actor network with one fully connected layer and 3
weights, for a total of 16 learnable parameters, while the critic network is
relatively large, with about 1.5k learnable parameters?
Could you please explain the philosophy behind this unbalanced actor-critic
design?
LIN Xuan