Support other output data structures #28

cscherrer · 2018-12-12T18:20:29Z

The current DynamicHMC approach of storing NUTS output in an array is common, but is not always the most desirable. Two difficulties with this:

It's opaque until it's done running. This could be addressed by adding callback functions, but that requires a somewhat awkward coding style.
When it's done, it's done. There's no way to inform the sampler that we're not really done yet and would like a few more samples.

By representing output in the form of an iterator, both of these difficulties could be addressed. An iterator can easily be processed "live", so that sampling can be interrupted (if we realize the geometry of the problem needs adjustment) or extended (for example, if we'd like it to run until the effective number of samples reaches a certain point).

There may be some small overhead to an iterator approach compared to storing things in an array. But the number of NUTS samples we're interested in is almost always in the 100-10000 range, so this should be minimal. Also, it's easy to collect an iterator into an array if array output is preferred.

An alternative to the iterator approach is to allow multiple output formats, maybe with something like an output=:array default keyword argument.
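
To make the iterator idea concrete, here is a minimal sketch. It is not DynamicHMC's API: `ToySampler`, `transition`, and `run_until` are hypothetical names, and a random-walk Metropolis sampler for a standard normal stands in for NUTS. It only illustrates that an iterator can be consumed live or collected into an array.

```julia
using Random

# Hypothetical stand-in for a sampler: random-walk Metropolis targeting a
# standard normal. None of these names come from DynamicHMC.
struct ToySampler{R<:AbstractRNG}
    rng::R
    stepsize::Float64
end

logdensity_stdnormal(x) = -x^2 / 2   # log density up to an additive constant

# One Metropolis transition from position x.
function transition(s::ToySampler, x)
    proposal = x + s.stepsize * randn(s.rng)
    accept = log(rand(s.rng)) < logdensity_stdnormal(proposal) - logdensity_stdnormal(x)
    accept ? proposal : x
end

# The iterator never terminates by itself; the consumer decides when to stop.
Base.IteratorSize(::Type{<:ToySampler}) = Base.IsInfinite()
Base.eltype(::Type{<:ToySampler}) = Float64

function Base.iterate(s::ToySampler, x = 0.0)
    xnew = transition(s, x)
    (xnew, xnew)   # (sample, next iteration state)
end

sampler = ToySampler(MersenneTwister(1), 1.0)

# Array output is one `collect` away.
draws = collect(Iterators.take(sampler, 1000))

# "Live" processing: the loop body sees every draw and decides when to stop.
# The criterion here is just a draw count; an effective-sample-size estimate
# would slot into the same place.
function run_until(sampler, maxdraws)
    n, total = 0, 0.0
    for x in sampler
        n += 1
        total += x
        n >= maxdraws && break
    end
    n, total / n   # number of draws consumed and their running mean
end
```

Because the random number generator lives inside the sampler, iterating the same object twice gives different draws.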

tpapp · 2018-12-13T09:00:39Z

I am sympathetic to the spirit of this approach; it is just that I am not sure I would provide a method for Base.iterate, since the random state makes it non-deterministic. But an analogous function could work well, e.g.

sample, new_state = mcmc_step(stage, nuts::NUTS, state)

where new_state would update during adaptation and tuning (governed by stage), otherwise not.
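
A rough sketch of how such a step function might be used, under loud assumptions: `ToyState`, `toy_step`, and `run_chain` are hypothetical, the sampler is a toy random-walk Metropolis rather than NUTS, and the stage is a plain `Symbol`. The point is only the shape of the interface: the step returns `(sample, new_state)`, the state changes its tuning parameters only during the tuning stage, and the random number generator is passed in explicitly.

```julia
using Random

# Hypothetical state for a toy random-walk Metropolis sampler; a real NUTS
# state would be richer.
struct ToyState
    x::Float64
    stepsize::Float64
end

# Returns `(sample, new_state)`, mirroring the shape of the proposed
# `mcmc_step`. During `:tuning` the step size is adapted; in any other
# stage it is left unchanged.
function toy_step(rng::AbstractRNG, stage::Symbol, state::ToyState)
    proposal = state.x + state.stepsize * randn(rng)
    # Metropolis acceptance for a standard normal target
    accepted = log(rand(rng)) < (state.x^2 - proposal^2) / 2
    x = accepted ? proposal : state.x
    stepsize = state.stepsize
    if stage === :tuning
        # crude adaptation: grow after acceptance, shrink after rejection
        stepsize *= accepted ? 1.1 : 0.9
    end
    x, ToyState(x, stepsize)
end

# Drives the step function and fills a Vector; any other container would do.
function run_chain(rng, state, n_tune, n_draws)
    for _ in 1:n_tune
        _, state = toy_step(rng, :tuning, state)
    end
    draws = Vector{Float64}(undef, n_draws)
    for i in 1:n_draws
        draws[i], state = toy_step(rng, :inference, state)
    end
    draws, state   # returning the state lets the caller ask for more draws later
end

rng = MersenneTwister(42)
draws, state = run_chain(rng, ToyState(0.0, 1.0), 200, 1000)
more, state = run_chain(rng, state, 0, 500)   # extend the chain without retuning
```

Since the caller holds the state, "a few more samples" is just another call with the last state.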

Bikeshedding the interface is welcome.

Also, I would not worry about overhead. For anything but trivial models, 99% of the time is spent in evaluating log densities (with gradients) for the models I use. This is why I haven't bothered rewriting the building blocks using a mutating approach with preallocated state.

I am now in the stage of collecting these suggestions for an upcoming API redesign, hopefully in 2019Q1. Please keep them coming.

cscherrer · 2018-12-16T04:55:31Z

Great idea, your mcmc_step suggestion is much better! This can serve as an intermediary to flexibly populating any data structure. Very nice.

Currently I'm wrapping the result like so:

struct NUTS_result{T}
    chain :: Vector{NUTS_Transition{Vector{Float64},Float64}}
    transformation
    samples :: Vector{T}
    tuning
end

(haven't yet been too careful about the types)

Then I call your code like this:

function nuts(model; data=NamedTuple(), numSamples=1000)
    t = getTransform(model)

    # logdensity returns an expression; eval turns it into a callable function
    fpre = eval(logdensity(model))
    # invokelatest sidesteps the world-age error when calling the fresh function
    f(par) = Base.invokelatest(fpre, par, data)

    P = TransformedLogDensity(t, f)
    ∇P = ADgradient(:ForwardDiff, P)
    chain, tuning = NUTS_init_tune_mcmc(∇P, numSamples)
    # map each sampled position back to the model's parameter space
    samples = transform.(Ref(t), get_position.(chain))
    NUTS_result(chain, t, samples, tuning)
end

It would be good to better understand what's "idiomatic" use of this library, and to what extent I'm missing the mark. I'll be pushing on this over the next few weeks, but first I have some travel that will slow things down on this end for a few days.

tpapp · 2018-12-16T07:14:50Z

Thanks for the writeup of your use case. I am curious why you are using eval and invokelatest.

TBH, this library started out as a reusable set of building blocks for a higher-level API, with the focus of just doing HMC/NUTS. Some libraries are indeed wrapping it like that, as a building block.

But I and some others have ended up using it directly, so a rudimentary API would be useful for simple cases.

cscherrer · 2018-12-16T18:48:11Z

> I am curious why you are using eval and invokelatest.

Sure. Here's a simple model:

hello = @model (μ, x) begin
    σ ~ HalfCauchy()
    x ⩪ Normal(μ, σ) |> iid
end

Internally, this is represented as an array of symbols (the arguments) and an expression (the body):

julia> hello.args
2-element Array{Symbol,1}:
 :μ
 :x

julia> hello.body
quote
    σ ~ HalfCauchy()
    x ⩪ Normal(μ, σ) |> iid
end

logdensity takes a model like hello and produces an expression:

julia> logdensity(hello)
:(function (par, data)
      ℓ = 0.0
      σ = par.σ
      ℓ += logpdf(HalfCauchy(), σ)
      x = data.x
      ℓ += logpdf(Normal(μ, σ) |> iid, x)
      ℓ
  end)

We need to programmatically evaluate this, which (if I understand correctly) requires eval. But the result of eval can't be called immediately - there's a "world age" issue that comes up if you try this. invokelatest solves this.
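
The world-age issue can be reproduced without any modeling code. In this sketch the quoted function is a stand-in for the output of logdensity, and `call_directly` and `call_latest` are illustrative names:

```julia
# A quoted anonymous function, standing in for the output of `logdensity`.
ex = :(function (par, data)
           par.σ + sum(data.x)
       end)

function call_directly(ex, par, data)
    f = eval(ex)      # the new method is defined in a newer world
    f(par, data)      # but this caller is still running in the older world
end

function call_latest(ex, par, data)
    f = eval(ex)
    Base.invokelatest(f, par, data)   # dispatch in the latest world instead
end

par, data = (σ = 1.0,), (x = [1.0, 2.0],)
result = call_latest(ex, par, data)   # works
```

Calling `call_directly(ex, par, data)` typically fails with a MethodError saying the applicable method may be too new. The cost of `invokelatest` is that the call is dispatched dynamically every time, so the compiler cannot inline or infer through it.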

The approach is inherently insecure - I'd love to hear if you see a better approach for this.

tpapp · 2019-08-19T08:39:14Z

Regarding the original issue: the reworked API is now implemented in master. Most of it is in src/mcmc.jl and extensively documented with docstrings.

Please let me know if it does what you want or whether anything else is needed.

tpapp · 2019-09-10T05:50:25Z

Closing because of no activity, feel free to reopen.

cscherrer · 2019-09-10T14:28:57Z

That's funny, I was just looking into this yesterday:
https://github.com/cscherrer/DynamicHMC.jl

This is a quick MWE edit using ResumableFunctions.jl. It's clearly not yet where it needs to be, but I think an iterator-first approach has a lot of promise. Could you open an iterators or similar branch so I can PR to it to discuss more?

tpapp · 2019-09-10T14:32:04Z

Can't you just branch from master of this package and make a PR?

cscherrer · 2019-09-10T14:33:22Z

Oh sure, but then testing is a bit more awkward since my fork isn't registered. I'll do that and you can handle it as you like.

tpapp · 2019-09-10T14:40:55Z

I think that if you PR against this repo, the tests will just run on Travis like everything else.

tpapp · 2019-09-11T06:54:21Z

A lot of discussion happened in #92.

tpapp · 2019-09-16T13:24:06Z

Closed by #94.

tpapp mentioned this issue Dec 25, 2018

RFC: API reorganization #30

Closed

tpapp closed this as completed Sep 10, 2019

tpapp reopened this Sep 10, 2019

tpapp closed this as completed Sep 16, 2019
