Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support other output data structures #28

Closed
cscherrer opened this issue Dec 12, 2018 · 12 comments
Closed

Support other output data structures #28

cscherrer opened this issue Dec 12, 2018 · 12 comments

Comments

@cscherrer
Copy link

The current DynamicHMC approach of storing NUTS output in an array is common, but is not always the most desirable. Two difficulties with this:

  1. It's opaque until it's done running. This could be addressed with the addition of callback functions, but this requires a bit awkward coding style.
  2. When it's done, it's done. There's no way to inform the sampler that we're not really done yet and would like a few more samples.

By representing output in the form of an iterator, both of these difficulties could be improved. An iterator can easily be processed "live" in order to interrupt (if we realize the geometry of the problem needs adjustment) or extended (for example if we'd like it to run until the effective number of samples reaches a certain point).

There may be some small overhead to an iterator approach compared to storing things in an array. But NUTS output we're interested in is almost always in the 100-10000 range, so this should be minimal. Also, it's easy to pipe an iterator into an array if array output is preferred.

An alternative to the iterator approach is to allow multiple output formats, maybe with something like an output=:array default keyword argument.

@tpapp
Copy link
Owner

tpapp commented Dec 13, 2018

I am sympathetic to the spirit this approach, it is just that I am not sure that I would provide a method for Base.iterate, since the random state makes it non-deterministic. But an analogous function could work well, eg

sample, new_state = mcmc_step(stage, nuts::NUTS, state)

where new_state would update during adaptation and tuning (governed by stage), otherwise not.

Bikeshedding the interface is welcome.

Also, I would not worry about overhead. For anything but trivial models, 99% of the time is spent in evaluating log densities (with gradients) for the models I use. This is why I haven't bothered rewriting the building blocks using a mutating approach with preallocated state.

I am now in the stage of collecting these suggestions for an upcoming API redesign, hopefully in 2019Q1. Please keep them coming.

@cscherrer
Copy link
Author

Great idea, your mcmc_step suggestion is much better! This can serve as an intermediary to flexibly populating any data structure. Very nice.

Currently I'm wrapping the result like so:

struct NUTS_result{T}
    chain :: Vector{NUTS_Transition{Vector{Float64},Float64}}
    transformation
    samples :: Vector{T}
    tuning
end

(haven't yet been too careful about the types)

Then I call your code like this:

function nuts(model; data=NamedTuple{}(), numSamples = 1000)
    result = NUTS_result{}
    t = getTransform(model)

    fpre = eval(logdensity(model))
    f(par) = Base.invokelatest(fpre,par,data)

    P = TransformedLogDensity(t,f)
    ∇P = ADgradient(:ForwardDiff,P)
    chain, tuning = NUTS_init_tune_mcmc(∇P, numSamples);
    samples = transform.(Ref(∇P.transformation), get_position.(chain));
    NUTS_result(chain, t, samples, tuning)
end

It would be good to better understand what's "idiomatic" use of this library, and to what extent I'm missing the mark. I'll be pushing on this over the next few weeks, but first I have some travel that will slow things down on this end for a few days.

@tpapp
Copy link
Owner

tpapp commented Dec 16, 2018

Thanks for the writeup of your use case. I am curious why you are using eval and invokelatest.

TBH, this library started out as a reusable set of building blocks for a higher-level API, with the focus of just doing HMC/NUTS. Some libraries are indeed wrapping it like that, as a building block.

But I and some others have ended up using it directly, so a rudimentary API would be in useful for simple cases.

@cscherrer
Copy link
Author

I am curious why you are using eval and invokelatest.

Sure. Here's a simple model:

@model (μ, x) begin
    σ ~ HalfCauchy()
    x  Normal(μ, σ) |> iid
end

Internally, this is represented as an array of symbols (the arguments) and an expression (the body):

julia> hello.args
2-element Array{Symbol,1}:
 
 :x

julia> hello.body
quote
    σ ~ HalfCauchy()
    x  Normal(μ, σ) |> iid
end

logdensity takes a model like hello and produces an expression:

julia> logdensity(hello)
:(function (par, data)
      ℓ = 0.0
      σ = par.σ
      ℓ += logpdf(HalfCauchy(), σ)
      x = data.x
      ℓ += logpdf(Normal(μ, σ) |> iid, x)
      ℓ
  end)

We need to programmatically evaluate this, which (if I understand correctly) requires eval. But the result of eval can't be called immediately - there's a "world age" issue that comes up if you try this. invokelatest solves this.

The approach is inherently insecure - I'd love to hear if you see a better approach for this.

@tpapp
Copy link
Owner

tpapp commented Aug 19, 2019

Regarding the original issue: the reworked API is now implemented in master. Most of it is in src/mcmc.jl and extensively documented with docstrings.

Please let me know if it does what you want or whether anything else is needed.

@tpapp
Copy link
Owner

tpapp commented Sep 10, 2019

Closing because of no activity, feel free to reopen.

@tpapp tpapp closed this as completed Sep 10, 2019
@cscherrer
Copy link
Author

That's funny, I was just looking into this yesterday:
https://github.com/cscherrer/DynamicHMC.jl

This is a quick MWE edit using ResumableFunctions.jl. It's clearly not yet where it needs to be, but I think an iterator-first approach has a lot of promise. Could you open an iterators or similar branch so I can PR to it to discuss more?

@tpapp tpapp reopened this Sep 10, 2019
@tpapp
Copy link
Owner

tpapp commented Sep 10, 2019

Can't you just branch from master of this package and make a PR?

@cscherrer
Copy link
Author

Oh sure, but then testing it a bit more awkward since my fork isn't registered. I'll do that and you can handle as you like

@tpapp
Copy link
Owner

tpapp commented Sep 10, 2019

I think that if you PR against this repo, the tests will just run on Travis like everything else.

@tpapp
Copy link
Owner

tpapp commented Sep 11, 2019

A lot of discussion happened in #92.

@tpapp
Copy link
Owner

tpapp commented Sep 16, 2019

Closed by #94.

@tpapp tpapp closed this as completed Sep 16, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants