for discrete env #21

ccplxx · 2018-11-09T09:39:39Z

I just read the DIAYN paper and can't understand how to train DIAYN in an environment with discrete actions, since SAC is designed for continuous action spaces. Yet some experiments in the paper are based on mountain car and inverted pendulum. Thank you.

haarnoja · 2018-11-09T14:41:38Z

I'm not too familiar with the DIAYN implementation; maybe @ben-eysenbach can help.

ccplxx · 2018-11-10T02:05:46Z

Thank you, haarnoja. Can SAC be used for environments with discrete actions? If so, how?

haarnoja · 2018-11-12T14:44:49Z

Yeah, you can use SAC with discrete actions too, but this implementation does not support them. You would need to replace the policy with a softmax distribution \pi(\cdot | s) \propto \exp Q(s, \cdot), which you can compute exactly for a finite action space.
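
For readers who want something concrete, the softmax policy described above can be sketched roughly as follows. This is not code from this repository, only a minimal NumPy illustration under stated assumptions: the Q-values for one state are taken to be already available as an array with one entry per action, and the function names (`softmax_policy`, `soft_value`, `sample_action`) and the `temperature` argument are hypothetical. With `temperature=1.0` the policy matches the \exp Q(s, \cdot) form in the comment above.

```python
import numpy as np


def softmax_policy(q_values, temperature=1.0):
    """Return pi(.|s) proportional to exp(Q(s, .) / temperature).

    `temperature` is an assumption added for illustration; 1.0 gives
    the plain exp Q(s, .) form.
    """
    q = np.asarray(q_values, dtype=np.float64) / temperature
    # Subtract the max before exponentiating for numerical stability.
    q = q - q.max(axis=-1, keepdims=True)
    exp_q = np.exp(q)
    return exp_q / exp_q.sum(axis=-1, keepdims=True)


def soft_value(q_values, temperature=1.0):
    """Soft state value: temperature * log sum_a exp(Q(s, a) / temperature)."""
    q = np.asarray(q_values, dtype=np.float64) / temperature
    q_max = q.max(axis=-1, keepdims=True)
    log_sum = np.log(np.exp(q - q_max).sum(axis=-1))
    return temperature * (q_max.squeeze(-1) + log_sum)


def sample_action(q_values, rng, temperature=1.0):
    """Draw one action index from the softmax policy for a single state."""
    probs = softmax_policy(q_values, temperature)
    return int(rng.choice(len(probs), p=probs))


# Usage with made-up Q-values for a state with three actions.
rng = np.random.default_rng(0)
q_s = [1.0, 2.0, 3.0]
probs = softmax_policy(q_s)
action = sample_action(q_s, rng)
```

Because the action set is finite, the normalizing sum runs over every action, so the policy and the soft value are exact and no separate policy network or reparameterized sampling is needed in this sketch.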

haarnoja closed this as completed Dec 26, 2018

hartikainen pushed a commit that referenced this issue Feb 24, 2019

WIP: hypermater tweaks (#21)

25bd332

hyperparameter tweaks

witwolf mentioned this issue Aug 13, 2019

get tfp.distribution.Categorical probs using public interface HorizonRobotics/alf#149

Merged
