Discrete deterministic policy gradient with Gumbel-softmax. Blog post: how to do deterministic policy gradient over discrete actions with the Gumbel-softmax relaxation, and why you should. Read us here! To reproduce, you will need Theano + Lasagne and Gym. There is a Dockerfile in the repo if you prefer containers.
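
For readers who want a feel for the core trick before opening the post, here is a minimal sketch of Gumbel-softmax sampling. It is written in plain NumPy rather than the Theano + Lasagne code used in the repo, and it only covers the forward pass (no gradients, no critic). The function name `gumbel_softmax_sample` and its `temperature` parameter are illustrative assumptions, not names taken from the repo.

```python
import numpy as np


def gumbel_softmax_sample(logits, temperature=1.0, rng=None):
    """Draw a relaxed one-hot sample from a categorical distribution.

    Hypothetical helper for illustration only; not the repo's API.
    `logits` are unnormalized log-probabilities over discrete actions,
    with the action dimension last.
    """
    rng = np.random.default_rng() if rng is None else rng
    logits = np.asarray(logits, dtype=float)

    # Perturb each logit with independent Gumbel(0, 1) noise.
    noise = rng.gumbel(loc=0.0, scale=1.0, size=logits.shape)
    scaled = (logits + noise) / temperature

    # Numerically stable softmax over the action dimension.
    scaled = scaled - scaled.max(axis=-1, keepdims=True)
    exp = np.exp(scaled)
    return exp / exp.sum(axis=-1, keepdims=True)


# Two states, three discrete actions each.
example_logits = np.array([[1.0, 2.0, 0.5], [0.0, 0.0, 3.0]])
relaxed_actions = gumbel_softmax_sample(example_logits, temperature=0.5)
print(relaxed_actions)
```

Each row of the output is a point on the probability simplex. Lowering `temperature` pushes the rows toward one-hot vectors, while raising it makes them closer to uniform. In an autodiff framework the same expression is differentiable with respect to the logits, which is what lets a critic's gradient flow back into a policy over discrete actions.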