RFC: how to handle implicit quantities associated with coordinates #94

Open
tpapp opened this issue Sep 5, 2022 · 0 comments

Motivation

Suppose that for a set of parameters $x$, the equation $F(x, y) = 0$ defines $y(x)$ implicitly. For example, $x$ could be the parameters of a problem that we approximate numerically, and $y$ the parameters of the approximation, obtained numerically (rootfinding, etc.). Given data $d$, the likelihood is defined as $\ell(d \mid x, y)$.

In theory, one could of course solve from scratch for the $y$ that belongs to each $x$. But this may be expensive and brittle, and if

$$ x_2 = x_1 + \Delta $$

then

$$ \hat{y}_2 = y_1 + \frac{\partial y}{\partial x} \Delta $$

with $y_1 = y(x_1)$ and the derivative evaluated at $x_1$, would be a good initial guess for $y_2 = y(x_2)$.
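
The derivative in this guess can be obtained without differentiating through the solver. As a sketch, assuming $F$ is continuously differentiable and $\partial F / \partial y$ is invertible at $(x_1, y_1)$ (assumptions not stated above), differentiating $F(x, y(x)) = 0$ with respect to $x$ gives

$$ \frac{\partial F}{\partial x} + \frac{\partial F}{\partial y} \frac{\partial y}{\partial x} = 0 \quad\Longrightarrow\quad \frac{\partial y}{\partial x} = -\left( \frac{\partial F}{\partial y} \right)^{-1} \frac{\partial F}{\partial x} $$

so the guess costs one linear solve with the Jacobian $\partial F / \partial y$ at the previous solution.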

Ideally, "users" like Turing.jl and DynamicHMC.jl should be able to ignore these details and just carry on doing HMC, NUTS, etc. with minimal changes.

Proposal: allow coordinates to be opaque

I propose adding three functions to the API, with the following fallbacks:

lift(ℓ, x::AbstractVector) = x
unlift(ℓ, x::AbstractVector) = x
translate(ℓ, x::AbstractVector, Δ::AbstractVector) = x .+ Δ
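
To illustrate how a problem could override these fallbacks, here is a minimal sketch for a toy implicit equation. Everything in it is hypothetical and chosen for illustration only: the types `ImplicitProblem` and `Lifted`, the helper `solve_y`, and the equation $F(x, y) = y^3 + y - x$ applied elementwise. The opaque coordinate carries $y$ along with $x$, so that `translate` can warm-start Newton's method from the first-order guess.

```julia
# Hypothetical toy problem, not part of any existing package.
# F(x, y) = y^3 + y - x elementwise, so y(x) is well defined for every real x.
struct ImplicitProblem end          # stands in for the log density ℓ

struct Lifted                       # opaque coordinate: x together with its solved y
    x::Vector{Float64}
    y::Vector{Float64}
end

F(x, y) = y^3 + y - x               # scalar residual
dFdy(y) = 3 * y^2 + 1               # derivative of F in y (∂F/∂x is -1)

# Newton's method for one component, started from the guess y0.
function solve_y(x, y0; tol = 1e-12, maxiter = 50)
    y = y0
    for _ in 1:maxiter
        step = F(x, y) / dFdy(y)
        y -= step
        abs(step) < tol && return y
    end
    error("Newton iteration did not converge")
end

# Cold start: no previous solution is available, so start each component from zero.
lift(::ImplicitProblem, x::AbstractVector) =
    Lifted(collect(float.(x)), solve_y.(x, 0.0))

# Recover the plain coordinates.
unlift(::ImplicitProblem, z::Lifted) = z.x

# Warm start: first-order guess y1 + (∂y/∂x) Δ, with ∂y/∂x = 1 / dFdy(y1), then refine.
function translate(::ImplicitProblem, z::Lifted, Δ::AbstractVector)
    x2 = z.x .+ Δ
    yhat2 = z.y .+ Δ ./ dFdy.(z.y)
    Lifted(x2, solve_y.(x2, yhat2))
end
```

With these methods, a caller only ever holds `Lifted` values between steps and never needs to know that a rootfinder runs inside `lift` and `translate`.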

Specifically,

  1. "users" would call lift when generating random starting points for Markov chains, and in similar situations. Otherwise they would use translate,
  2. similarly, unlift would be called when plain coordinates are needed (e.g. turn statistics),
  3. leapfrog and RWMH steps would use translate,
  4. otherwise, the result of lift and the x arguments of logdensity, logdensity_and_gradient, translate, and unlift are allowed to be opaque objects rather than an ::AbstractVector of real numbers. Nevertheless, logdensity_and_gradient should provide a valid gradient of x -> logdensity(ℓ, lift(ℓ, x)), but how that is done is up to the implementation.
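
The points above can be sketched from the sampler's side as a leapfrog step that moves the position only through translate. This is an illustration under assumptions, not how any existing sampler is written: the function `leapfrog`, the unit mass matrix, and the toy `StdNormal` density are hypothetical, and `logdensity_and_gradient` is defined locally as a stand-in that returns a `(value, gradient)` tuple.

```julia
# Fallback from the proposal: plain vectors are translated by addition.
translate(ℓ, x::AbstractVector, Δ::AbstractVector) = x .+ Δ

# Hypothetical toy density (standard normal) with a local stand-in for
# logdensity_and_gradient, returning the log density and its gradient.
struct StdNormal end
logdensity_and_gradient(::StdNormal, x::AbstractVector) = (-sum(abs2, x) / 2, -x)

# One leapfrog step with a unit mass matrix. The position `q` may be opaque:
# it is only passed to logdensity_and_gradient and translate, never indexed.
function leapfrog(ℓ, q, p, ϵ)
    _, g = logdensity_and_gradient(ℓ, q)
    p_half = p .+ (ϵ / 2) .* g                # half step for the momentum
    q_new = translate(ℓ, q, ϵ .* p_half)      # full step for the position
    _, g_new = logdensity_and_gradient(ℓ, q_new)
    p_new = p_half .+ (ϵ / 2) .* g_new        # second half step for the momentum
    q_new, p_new
end
```

Because the momentum and the gradient stay ordinary vectors, only the position update changes relative to a textbook leapfrog step, which is what keeps the required changes in samplers small.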

Bikeshedding of names is appreciated 😉, as are alternative API suggestions.

How this meshes with AD

This is a bit tricky and I don't yet have a good API in mind. Related work is in
