Backpropagation through the Void: Optimizing control variates for black-box gradient estimation

by Will Grathwohl, Dami Choi, Yuhuai Wu, Geoffrey Roeder, David Duvenaud

We introduce a general framework for learning low-variance, unbiased gradient estimators for black-box functions of random variables, based on gradients of a learned function. These estimators can be jointly trained with model parameters or policies, and are applicable in both discrete and continuous settings. We give unbiased, adaptive analogs of state-of-the-art reinforcement learning methods such as advantage actor-critic. We also demonstrate this framework for training discrete latent-variable models.
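As a rough illustration of the idea, the sketch below compares a plain score-function (REINFORCE) estimator with one that subtracts a surrogate function and adds back the surrogate's reparameterized gradient. It is a toy under stated assumptions, not the code in this repository: the Gaussian setup, the quadratic objective, the fixed (not learned) surrogate, and all names such as `gradient_samples` and `scale` are hypothetical choices made for the example.

```python
import numpy as np


def gradient_samples(mu=1.0, sigma=1.0, target=3.0, scale=0.8,
                     n=200_000, seed=0):
    """Per-sample estimates of d/dmu E[f(x)] for x ~ N(mu, sigma^2).

    f(x) = (x - target)^2 is treated as a black box.
    c(x) = scale * (x - target)^2 is a hypothetical, fixed surrogate;
    in the paper's framework the surrogate would be learned.
    """
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n)
    x = mu + sigma * eps                  # reparameterized sample

    f = (x - target) ** 2                 # black-box objective values
    score = (x - mu) / sigma ** 2         # d/dmu log N(x; mu, sigma^2)

    c = scale * (x - target) ** 2         # surrogate values
    dc_dmu = 2.0 * scale * (x - target)   # d/dmu c(mu + sigma * eps)

    reinforce = f * score                 # score-function estimator
    with_cv = (f - c) * score + dc_dmu    # control-variate estimator
    return reinforce, with_cv


reinforce, with_cv = gradient_samples()
true_grad = 2.0 * (1.0 - 3.0)  # d/dmu [(mu - target)^2 + sigma^2]

print("true gradient:       ", true_grad)
print("REINFORCE mean / var:", reinforce.mean(), reinforce.var())
print("with surrogate:      ", with_cv.mean(), with_cv.var())
```

Both estimators have the same expectation, because the subtracted score-function term and the added reparameterized term cancel on average. The closer the surrogate tracks the objective, the smaller the remaining score-function term and the lower the variance.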

Code for the VAE experiments lives here. The discrete RL experiments can be found at:

A simplified, pure-Python implementation is in /relax-autograd/.

If you have any questions about the code or paper, please contact Will Grathwohl. The code is in a "research state" at the moment and I will be updating it periodically. Feel free to email me and I will do my best to respond. -Will

