Discrete deterministic policy gradient with Gumbel.

Blog post: how to do deterministic policy gradient with Gumbel-softmax, and why you should.
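For readers new to the trick, here is a minimal sketch of Gumbel-softmax sampling in plain Python. It is not the code from gumbel.py (which targets Theano); the function names `sample_gumbel` and `gumbel_softmax` and the `temperature` and `rng` parameters are assumptions made for illustration. The idea is to add Gumbel(0, 1) noise to the action logits and pass the result through a temperature-scaled softmax, which gives a differentiable, nearly one-hot relaxation of a discrete action.

```python
import math
import random


def sample_gumbel(rng=random):
    """Draw one Gumbel(0, 1) sample via inverse transform: -log(-log(U))."""
    # Clamp away from 0 so that log(u) stays finite.
    u = max(rng.random(), 1e-12)
    return -math.log(-math.log(u))


def gumbel_softmax(logits, temperature=1.0, rng=random):
    """Return a relaxed one-hot sample over the given logits.

    Hypothetical helper for illustration: lower temperatures push the
    output closer to a one-hot vector, higher ones toward uniform.
    """
    noisy = [(logit + sample_gumbel(rng)) / temperature for logit in logits]
    # Subtract the max before exponentiating for numerical stability.
    peak = max(noisy)
    exps = [math.exp(x - peak) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]


# Example: relaxed sample over three discrete actions.
action_probs = gumbel_softmax([1.0, 2.0, 0.5], temperature=0.5)
print(action_probs)
```

In an actual agent the logits would come from the policy network and the relaxed sample would be fed to the critic, so that gradients can flow back through the action; the sketch above only shows the sampling step.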

Read us here!

To reproduce, you will need Theano, Lasagne, and Gym. The repo includes a Dockerfile if you prefer containers.

Dockerfile
LICENSE
README.md
gumbel.py
gumbel_dpg_tutorial.ipynb
replay_buffer.py
target_network.py
xvfb
