
Can you elaborate on running SAC on discrete action space #22

Open
sandipan1 opened this issue Nov 11, 2018 · 6 comments

Comments

@sandipan1
The docs mention that an alternate version of SAC, with a slight change, can be used for discrete action spaces. Please elaborate with some more details.

@jachiam
Contributor

jachiam commented Nov 11, 2018

You're actually the second person to ask about this! The first person sent an email. I'll add a sub-section or a "You Should Know" box to the docs to cover this soon.

@sandipan1
Author

Thanks. Also, since this tutorial favors learning-by-doing rather than being purely theoretical, it would be nice to see explanations with some images of the neural network architectures, to get a quick overview of how to implement them. For example, SAC uses five networks: value, value_target, a Gaussian policy, and two Q networks. It would be easier to understand with a pictorial representation of the networks and how they relate to each other.
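In lieu of a diagram, here is a minimal numpy sketch of the five networks and what each consumes and produces. The layer sizes and `mlp` helper are illustrative assumptions, not the repo's actual implementation:

```python
import numpy as np

def mlp(sizes, rng):
    """Return a list of (W, b) layers for a simple tanh MLP."""
    return [(rng.standard_normal((m, n)) * 0.1, np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(layers, x):
    for W, b in layers[:-1]:
        x = np.tanh(x @ W + b)
    W, b = layers[-1]
    return x @ W + b  # linear output layer

obs_dim, act_dim, hidden = 4, 2, 32
rng = np.random.default_rng(0)

# The five networks in (original) SAC, with their inputs and outputs:
value        = mlp([obs_dim, hidden, 1], rng)            # V(s): state -> scalar
value_target = [(W.copy(), b.copy()) for W, b in value]  # slow copy of V, updated by Polyak averaging
q1           = mlp([obs_dim + act_dim, hidden, 1], rng)  # Q1(s, a): state + action -> scalar
q2           = mlp([obs_dim + act_dim, hidden, 1], rng)  # Q2(s, a): twin Q, min taken to reduce overestimation
policy       = mlp([obs_dim, hidden, 2 * act_dim], rng)  # Gaussian pi: state -> (mean, log_std) per action dim

s = rng.standard_normal(obs_dim)
a = rng.standard_normal(act_dim)
v = forward(value, s)                       # scalar state value
q = forward(q1, np.concatenate([s, a]))     # scalar action value
mu_logstd = forward(policy, s)              # 2 * act_dim policy outputs
```

The Q networks take the action as an extra input, the value networks take only the state, and the policy outputs distribution parameters rather than a single action.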

@etendue

etendue commented Jan 8, 2019

Count me as the third. For a discrete action space, the entropy can be computed directly from the distribution. The policy loss probably needs to maximize advantage * log_probability. What I'm confused about is: do we still need two Q networks and one value network?
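The entropy point can be made concrete. For a categorical policy, both the entropy and the policy objective are exact expectations over the finite action set, so no sampling or reparameterization trick is needed. A minimal numpy sketch, where the logits, Q-values, and temperature are made-up illustrative numbers:

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max()  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

# Hypothetical values for a 3-action discrete problem:
logits = np.array([2.0, 0.5, -1.0])   # output of a categorical policy network
q_vals = np.array([1.0, 0.8, 0.2])    # e.g. elementwise min of the two Q networks, Q(s, .)
alpha  = 0.2                          # entropy temperature

pi = softmax(logits)
log_pi = np.log(pi)

# Exact entropy of the categorical distribution:
# H(pi(.|s)) = -sum_a pi(a|s) * log pi(a|s)
entropy = -(pi * log_pi).sum()

# Policy loss as an exact expectation over all actions:
# E_{a ~ pi}[alpha * log pi(a|s) - Q(s, a)]
policy_loss = (pi * (alpha * log_pi - q_vals)).sum()
```

Minimizing this loss pushes probability mass toward high-Q actions while the entropy term keeps the distribution from collapsing, mirroring the continuous-action objective with the expectation taken in closed form.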

@Wei2Wakeup

Is it just an average over \pi(a|s) for all actions, since the distribution is already parameterized?

@redknightlois

+1
I am just learning RL and am looking to modify SAC for discrete action spaces. If you can elaborate on how to derive the equations, I can implement it and send a PR.

@GusHebblewhite

+1

6 participants