Implement measure-valued derivatives #78

Open
HEmile opened this issue Jul 30, 2020 · 1 comment
Labels
enhancement New feature or request

Comments

@HEmile
Owner

HEmile commented Jul 30, 2020

Measure-valued derivatives are an alternative to the REINFORCE/score-function estimator. See https://arxiv.org/pdf/1906.10652.pdf for a clear explanation.

Implementing it raises some problems, though. Samples are drawn from the positive and negative components $p^+$ and $p^-$ rather than from the original distribution $p$, so blindly reusing them for downstream Monte Carlo estimation won't work. We can fix this by importance sampling using the weighting function. Furthermore, to make it compatible with auto-diff, a solution could be the surrogate $\sum_i \theta_i \bot(c_{\theta_i}(f(x_1)-f(x_2)))$, where $x_1 \sim p^+$, $x_2 \sim p^-$, and $\bot$ denotes the stop-gradient operator.
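As a rough illustration of the quantity inside $\bot(\cdot)$, here is a minimal pure-Python sketch for one concrete case that is not taken from this issue: the mean of a Gaussian $\mathcal{N}(\mu, \sigma^2)$, where $c_\mu = 1/(\sigma\sqrt{2\pi})$, $p^+ = \mu + \sigma w$ and $p^- = \mu - \sigma w$ with $w$ Rayleigh(1)-distributed. The function name `mvd_grad_gaussian_mean`, the test function $f(x) = x^2$, and the choice to share the same $w$ between both components (coupling) are assumptions made for the example.

```python
import math
import random


def mvd_grad_gaussian_mean(f, mu, sigma, n_samples=20000, seed=0):
    """Estimate d/dmu E_{x ~ N(mu, sigma^2)}[f(x)] with a measure-valued derivative.

    Hypothetical helper for illustration only; not part of any library.
    """
    rng = random.Random(seed)
    # Normalising constant c_theta of the positive/negative decomposition.
    c = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    total = 0.0
    for _ in range(n_samples):
        # Rayleigh(1) draw via the inverse CDF. Using the same draw for both
        # components (coupling) is a variance-reduction choice, an assumption here.
        w = math.sqrt(-2.0 * math.log(1.0 - rng.random()))
        x_pos = mu + sigma * w  # x_1 ~ p^+
        x_neg = mu - sigma * w  # x_2 ~ p^-
        # This per-sample value is what the surrogate would wrap in stop-gradient.
        total += c * (f(x_pos) - f(x_neg))
    return total / n_samples


# E[x^2] = mu^2 + sigma^2, so the exact gradient with respect to mu is 2 * mu.
estimate = mvd_grad_gaussian_mean(lambda x: x * x, mu=1.5, sigma=1.0)
print(estimate)
```

In an auto-diff framework, the per-sample value would be treated as a constant and multiplied by $\theta_i$, so that differentiating the surrogate with respect to $\theta_i$ returns exactly this estimate.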

@HEmile
Owner Author

HEmile commented Jul 30, 2020

We can fit this somewhat naturally into the DiCE formulation as follows. Use as the multiplicative estimator $\theta_i \cdot \bot(c_{\theta_i} \cdot \frac{p^+(x)}{p(x)})$ for samples from the positive component, and $-\theta_i \cdot \bot(c_{\theta_i} \cdot \frac{p^-(x)}{p(x)})$ for samples from the negative component. This compensates for the importance weighting used in the rest of the calculation.

The compensating weight can be implemented by simply taking the inverse of the importance weight.
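To make the inverse relationship concrete, the sketch below computes both quantities for a sample from the positive component. It assumes the Gaussian-mean case (where $p^+$ is $\mu + \sigma \cdot$ Rayleigh(1)) and assumes the importance weight is $p(x)/p^+(x)$; the function names are hypothetical and the stop-gradient and $\theta_i$ factors are left out because plain Python has no auto-diff.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """Density of the original distribution p = N(mu, sigma^2)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def positive_part_pdf(x, mu, sigma):
    """Density of p^+ for the Gaussian mean: mu + sigma * Rayleigh(1), x > mu."""
    if x <= mu:
        return 0.0
    z = (x - mu) / sigma
    return z * math.exp(-0.5 * z * z) / sigma


def importance_weight(x, mu, sigma):
    """Assumed weight p(x) / p^+(x) that maps p^+ samples back to p."""
    return gaussian_pdf(x, mu, sigma) / positive_part_pdf(x, mu, sigma)


def compensation_factor(x, mu, sigma):
    """c * p^+(x) / p(x), computed as c times the inverse importance weight."""
    c = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return c / importance_weight(x, mu, sigma)


mu, sigma, x = 1.5, 1.0, 2.3  # x lies in the support of p^+ (x > mu)
c = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
w = importance_weight(x, mu, sigma)
factor = compensation_factor(x, mu, sigma)
# The importance weight and the compensation cancel, leaving only c.
print(w * factor, c)
```

One caveat under these assumptions: for the Gaussian mean, $p(x)/p^+(x)$ grows without bound as $x$ approaches $\mu$, so the importance weights may need care in practice.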

@HEmile HEmile added this to To do in Gradient estimators via automation Jul 30, 2020
@HEmile HEmile added the enhancement New feature or request label Jul 30, 2020