Run it with:
python A3C.py
The Gaussian kernel is from Haarnoja. This version is for Atari; it still has some problems, which I will fix soon.
After reading the paper, I think SAC can be made almost like A3C. Because of the entropy term, it does not converge faster than A3C in my experiments.
Run it with:
python sac.py
sac_new.py is a DDPG-style variant with a fixed alpha, just for fun.
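To make "fixed alpha" concrete, here is a minimal sketch of a soft Bellman target with a constant entropy temperature. It is an illustration only, not code taken from sac_new.py: the function name soft_q_target, its arguments, and the default values alpha=0.2 and gamma=0.99 are all assumptions made for this example.

```python
def soft_q_target(reward, done, next_q, next_log_prob, alpha=0.2, gamma=0.99):
    """Soft Bellman target with a fixed temperature alpha (hypothetical helper).

    reward        -- immediate reward r
    done          -- 1.0 if the episode ended at this step, else 0.0
    next_q        -- target critic estimate Q'(s', a') for a' sampled from the policy
    next_log_prob -- log pi(a' | s') of that sampled action
    """
    # Soft value of the next state: Q minus the alpha-weighted log-probability.
    # Subtracting alpha * log pi is what adds the entropy bonus.
    soft_value = next_q - alpha * next_log_prob
    # Standard one-step backup; no bootstrapping past a terminal state.
    return reward + gamma * (1.0 - done) * soft_value


if __name__ == "__main__":
    # Example values are made up for illustration.
    print(soft_q_target(reward=1.0, done=0.0, next_q=2.0, next_log_prob=-1.0))
```

With alpha held constant, the weight of the entropy bonus never changes during training. A larger alpha keeps the policy more random, which fits the note above that the entropy term can slow convergence.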
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor