An implementation of the soft Q-learning algorithm in PyTorch.
Note that this implementation is for discrete action spaces only.
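
For a discrete action space, the soft value function and the policy have closed forms, so no sampling network is needed. The following is a minimal sketch of those formulas, not the code from this repo: the names `ALPHA`, `soft_value`, `soft_policy`, and `soft_q_target`, as well as the default values of `alpha` and `gamma`, are assumptions chosen for illustration.

```python
import torch

ALPHA = 1.0  # entropy temperature (hypothetical value, tune per task)


def soft_value(q_values: torch.Tensor, alpha: float = ALPHA) -> torch.Tensor:
    """V(s) = alpha * log sum_a exp(Q(s, a) / alpha), one value per row."""
    return alpha * torch.logsumexp(q_values / alpha, dim=-1)


def soft_policy(q_values: torch.Tensor, alpha: float = ALPHA) -> torch.Tensor:
    """pi(a | s) = exp((Q(s, a) - V(s)) / alpha), i.e. a softmax over Q / alpha."""
    return torch.softmax(q_values / alpha, dim=-1)


def soft_q_target(reward, next_q_values, done, gamma=0.99, alpha=ALPHA):
    """Bootstrapped target r + gamma * (1 - done) * V(s')."""
    return reward + gamma * (1.0 - done) * soft_value(next_q_values, alpha)


# Toy batch: 2 states, 2 actions each (the Q-values are made up).
q = torch.tensor([[1.0, 2.0], [0.5, 0.5]])
probs = soft_policy(q)
# Actions are sampled from the policy instead of taking an argmax.
action = torch.distributions.Categorical(probs=probs).sample()
```

In a training loop, the output of `soft_q_target` (computed from a target network, without gradients) would typically be regressed against `Q(s, a)` with a mean squared error loss.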
Update: added SQIL (soft Q imitation learning).
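
SQIL reuses the soft Q-learning update and changes only the rewards: demonstration transitions get a constant reward of 1, the agent's own transitions get a reward of 0, and each batch is drawn half from each source. The sketch below illustrates that idea using only the standard library. The class name `SQILBuffers`, the tuple layout `(state, action, reward, next_state, done)`, and the default capacity are assumptions, not this repo's actual data structures.

```python
import random
from collections import deque


def relabel(transitions, reward):
    """Overwrite the reward of each (s, a, r, s2, done) tuple with a constant."""
    return [(s, a, reward, s2, done) for (s, a, _, s2, done) in transitions]


class SQILBuffers:
    """Two replay buffers: expert demonstrations and the agent's own experience."""

    def __init__(self, demos, capacity=10000):
        self.demo = relabel(demos, 1.0)       # expert transitions: reward 1
        self.agent = deque(maxlen=capacity)   # agent transitions: reward 0

    def add(self, state, action, next_state, done):
        # The environment reward is ignored; agent data always gets reward 0.
        self.agent.append((state, action, 0.0, next_state, done))

    def sample(self, batch_size):
        # Balanced batch: half demonstrations, half agent experience.
        # Requires at least one transition in each buffer.
        half = batch_size // 2
        demo_batch = random.choices(self.demo, k=half)
        agent_batch = random.choices(list(self.agent), k=batch_size - half)
        return demo_batch + agent_batch


# Toy usage with placeholder states; the original reward 123.0 is discarded.
buffers = SQILBuffers(demos=[("s0", 0, 123.0, "s1", False)])
buffers.add("s1", 1, "s2", True)
batch = buffers.sample(4)
```

The sampled batch can then be passed to the same soft Q-learning loss used for ordinary training.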
All code is in one file and easy to follow.
- tensorboardX (for logging; you can delete the logging code if you don't need it)
- PyTorch (>= 1.0; 1.0.1 was used in my experiments)
- gym
in CartPole-v0
Reinforcement Learning with Deep Energy-Based Policies
SQIL: Imitation Learning via Reinforcement Learning with Sparse Rewards