Unifying trajectories #214

xukai92 · 2020-08-05T20:55:52Z

This PR replaces #213.

This PR solves #103 by explicitly separating Trajectory from HMCKernel,
without introducing any functionality change.
I left some comments on how to unify the notation as discussed in #48,
in which I use τ to refer to a (numerical) Hamiltonian trajectory and κ to refer to an MCMC kernel.
The remaining work will be done in a separate PR to make the review easier.

As a showcase, below is how to construct NUTS now:

τ = Trajectory(Leapfrog(1e-3), NoUTurn())
κ = HMCKernel(τ, MultinomialTS)

The old syntax is still supported, e.g. as below:

NUTS{TS, TC}(int::AbstractIntegrator) where {TS, TC} = HMCKernel(Trajectory(int, TC()), TS)
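
To make the layering concrete, here is a minimal, self-contained sketch of the two-level interface. The structs are toy stand-ins written for this illustration, not AdvancedHMC's actual definitions; only the type names and the constructor shapes follow the snippets above, and the field names (criterion, trajectory_sampler_type) are assumptions.

```julia
# Toy stand-ins (hypothetical); only the names mirror the PR description.
abstract type AbstractIntegrator end

struct Leapfrog{T<:Real} <: AbstractIntegrator
    ϵ::T                      # step size
end

struct NoUTurn end            # termination criterion (parameters omitted here)
struct MultinomialTS end      # trajectory sampler, used as a type tag

# τ: a numerical Hamiltonian trajectory = integrator + termination criterion
struct Trajectory{I<:AbstractIntegrator,C}
    integrator::I
    criterion::C
end

# κ: an MCMC kernel = trajectory + trajectory sampler type
struct HMCKernel{T<:Trajectory,TS}
    τ::T
    trajectory_sampler_type::Type{TS}
end

# Legacy-style constructor that forwards to the new interface
struct NUTS{TS,TC} end
NUTS{TS,TC}(int::AbstractIntegrator) where {TS,TC} = HMCKernel(Trajectory(int, TC()), TS)

τ = Trajectory(Leapfrog(1e-3), NoUTurn())
κ_new = HMCKernel(τ, MultinomialTS)
κ_old = NUTS{MultinomialTS,NoUTurn}(Leapfrog(1e-3))
```

In this sketch both construction paths produce the same immutable value, which is the sense in which the old syntax is only a thin wrapper over the new one.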

As suggested by #103 (comment),
fixed simulation steps or length are unified as termination criteria.
A nice benefit of this is that the parameters regarding trajectory simulation are clear:
e.g. FixedNSteps(10), NoUTurn(5, 1_000) or NoUTurn(max_depth=5).
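
As a rough illustration of "termination criteria as plain parameter holders", the sketch below defines two such structs with positional and keyword constructors. The field names, the default values, and the helper reached_limit are assumptions made for this example; the description above does not name the second parameter of NoUTurn.

```julia
# Hypothetical parameter-only termination criteria. Field names and defaults
# are illustrative assumptions, not AdvancedHMC's definitions.
abstract type AbstractTerminationCriterion end

struct FixedNSteps <: AbstractTerminationCriterion
    n_steps::Int
end

Base.@kwdef struct NoUTurn <: AbstractTerminationCriterion
    max_depth::Int = 10
    Δ_max::Float64 = 1000.0
end

# A toy query that depends only on the stored parameters (hypothetical helper).
reached_limit(tc::FixedNSteps, n_done::Int) = n_done >= tc.n_steps
reached_limit(tc::NoUTurn, depth::Int) = depth >= tc.max_depth

tc_static = FixedNSteps(10)
tc_positional = NoUTurn(5, 1_000)      # the Int 1_000 is converted to Float64
tc_keyword = NoUTurn(max_depth=5)      # the other field keeps its default
```

Because the criteria hold nothing but parameters, any per-trajectory state (such as the turn statistics) has to live elsewhere, which matches the decision to store it in the binary tree.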

As a result, the main transition function looks like this.

function transition(rng, h, κ::HMCKernel, z)
    @unpack τ, TS = κ
    τ = reconstruct(τ, integrator=jitter(rng, τ.integrator))
    z = refresh(rng, z, h)
    return transition(rng, h, τ, TS, z)
end

As a side product of this new interface,
it also unifies how integrators are jittered - all before simulating the trajectory in the same place where momentum variables are refreshed.
With this design, this is the only place where jitter is called.
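
The "jitter once, refresh once, then simulate" ordering can be sketched with a toy setup. Everything below is hypothetical: the types, the uniform jitter rule, and the standard-normal target are assumptions chosen so the example runs on its own, and the accept/reject or trajectory-sampling step is left out.

```julia
using Random

# Toy integrator carrying a nominal step size and the step size actually used.
struct JitteredLeapfrog
    ϵ0::Float64      # nominal step size
    scale::Float64   # relative jitter magnitude
    ϵ::Float64       # step size used for the current trajectory
end
JitteredLeapfrog(ϵ0, scale) = JitteredLeapfrog(ϵ0, scale, ϵ0)

# Draw a new step size uniformly in ϵ0 * (1 ± scale); the nominal value is kept.
jitter(rng, lf::JitteredLeapfrog) =
    JitteredLeapfrog(lf.ϵ0, lf.scale, lf.ϵ0 * (1 + lf.scale * (2 * rand(rng) - 1)))

struct Trajectory{I}
    integrator::I
    n_steps::Int
end

struct PhasePoint
    θ::Vector{Float64}
    r::Vector{Float64}
end

# Momentum refreshment for a unit-mass Gaussian kinetic energy.
refresh(rng, z::PhasePoint) = PhasePoint(z.θ, randn(rng, length(z.θ)))

# Leapfrog simulation for a standard normal target, where ∇U(θ) = θ.
function simulate(τ::Trajectory, z::PhasePoint)
    ϵ = τ.integrator.ϵ
    θ, r = copy(z.θ), copy(z.r)
    for _ in 1:τ.n_steps
        r .-= (ϵ / 2) .* θ
        θ .+= ϵ .* r
        r .-= (ϵ / 2) .* θ
    end
    return PhasePoint(θ, r)
end

function transition(rng, τ::Trajectory, z::PhasePoint)
    τ = Trajectory(jitter(rng, τ.integrator), τ.n_steps)  # the only call to jitter
    z = refresh(rng, z)                                    # momentum refreshment
    return simulate(τ, z)                                  # no accept/reject in this sketch
end
```

Keeping the jitter call in the kernel-level transition means the trajectory code itself never needs to know whether the integrator is jittered.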

A list of changes:

- The new interface
- GeneralisedNoUTurn -> NoUTurn and StrictGeneralisedNoUTurn -> StrictNoUTurn
- Store the termination statistics directly in the binary tree
  - Why? Because termination criteria like NoUTurn now play the same role as FixedNSteps. To unify the design, they are only intended to store the algorithm parameters, e.g. number of steps, maximum depth, etc.
- All tests are updated accordingly
- A regression test is added to make sure there is no functionality change compared to ef6de39
  - It can only run locally for now, using Julia 1.5 on macOS. This is because each Julia version or hardware would generate different chains and I only pre-generated those for my own machine.
  - Making this fully supported on CI is left as future work.

xukai92 · 2020-08-05T20:59:52Z

As mentioned in #213 (comment):

I added regression tests to compare the exact simulated chains and statistics in daade0a for the four variants.

Static HMC with MH

Static HMC with multinomial

NUTS with slice

NUTS with multinomial

These should be a good enough indicator that the functionality hasn't changed.

This ensures that this PR doesn't change the functionality at all.

The regression test in this PR is added in 31c6517.
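
For readers unfamiliar with this style of check, a seed-based chain regression test can be sketched as follows. The sampler here is a hypothetical random-walk Metropolis stand-in, and the reference chain is generated in-process; in the PR the reference chains were pre-generated on one machine and loaded from disk, which is why the real test is tied to a specific Julia version and platform.

```julia
using Random, Test

# Hypothetical stand-in for a sampler: random-walk Metropolis on N(0, 1).
function sample_chain(rng, n)
    θ = 0.0
    chain = Vector{Float64}(undef, n)
    for i in 1:n
        θ′ = θ + 0.5 * randn(rng)
        # log acceptance ratio for a standard normal target
        if log(rand(rng)) < (θ^2 - θ′^2) / 2
            θ = θ′
        end
        chain[i] = θ
    end
    return chain
end

# The "reference" would normally come from the old code version; it is
# produced by the same function here only to keep the sketch self-contained.
reference = sample_chain(MersenneTwister(1), 100)
current = sample_chain(MersenneTwister(1), 100)

@testset "chain regression" begin
    @test current == reference   # exact equality: same seed, same code path
end
```

Exact equality is a strict criterion: any change in how random numbers are consumed shows up as a failure even when the sampler is still statistically correct.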

codecov · 2020-08-05T21:37:04Z

Codecov Report

Merging #214 (242dc4e) into master (692e646) will decrease coverage by 1.26%.
The diff coverage is 86.11%.

@@            Coverage Diff             @@
##           master     #214      +/-   ##
==========================================
- Coverage   89.13%   87.86%   -1.27%     
==========================================
  Files          16       16              
  Lines         672      676       +4     
==========================================
- Hits          599      594       -5     
- Misses         73       82       +9

Impacted Files        Coverage Δ
src/AdvancedHMC.jl    75.00% <44.44%> (-25.00%) ⬇️
src/sampler.jl        84.48% <91.66%> (-4.81%) ⬇️
src/trajectory.jl     95.41% <92.15%> (-0.56%) ⬇️
src/integrator.jl     93.61% <0.00%> (-0.27%) ⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e2ce003...3c06f2d. Read the comment docs.

yebai

I reviewed roughly half of this PR tonight. I'll review the rest tomorrow.


yebai · 2020-08-08T21:44:33Z

test/regression.jl

@@ -0,0 +1,63 @@
+using Test, Random, BSON, AdvancedHMC, Distributions, ForwardDiff
+
+is_ef6de39 = isdefined(AdvancedHMC, :EndPointTS)


A better way of performing this regression test is to check out a pinned version of AHMC and run the tests on CI machines with both the pinned version and the current version. Then we can compare results without worrying about machine-specific details.

Yes, I agree. I don't know how to do that though. Can you point me to some example CI code for doing so?

BenchmarkTools allows users to write tests, and run regression tests automatically. Our use here is a bit special, but the principles are the same I think.

Can you point me to an example? I only found examples that look at regressions in run time.

In my view, no single version of the code should be considered the ground truth. Rather, we can only be confident that features covered by the test suite work correctly. Hence instead of checking out a previous version to test against, it's better to add any missing tests, ensuring they pass on both the old version and the new version.

For performance regressions, you can use PkgBenchmark (see e.g. https://github.com/JuliaFolds/FLoops.jl/blob/master/.github/workflows/benchmark.yml), but I don't think that checks the outputs.

Ideally, we can keep PRs small, i.e. one PR for a change/feature. For big changes like this PR, we should try to do it in multiple steps. It would make reviewing code much easier, which leads to much quicker merging too.

yebai · 2020-08-08T21:53:26Z

Thanks, @xukai92. I've done one full pass of this PR; it looks overall good to me. I left many comments above.

yebai

@xukai92 Pls ping me when this is ready for another review.

xukai92 · 2020-08-31T19:31:40Z

@yebai This is ready for another look.

Two things I didn't address are:

- Replacing rho by TurnStatistics
  - I feel it's better to do this in another PR as it involves changes in many places.
- Finding a better way to handle TS
  - I did rename it to trajectory_sampler_type, which is more descriptive.
  - Ideally, I feel we should be able to avoid storing types in HMCKernel once we introduce TurnStatistics.
    - As you suggested, we can parameterize TurnStatistics by termination criteria, and do the same for trajectory sampling methods.
    - If we do this, we don't need to store turn statistics (for trajectory sampling) inside XXXTS anymore.
    - Then MultinomialTS() (instance) can be used instead of MultinomialTS (type) during construction, which I think is better.
  - Open for discussion though. What's your opinion?

Also, do you think we should introduce a dev branch to AHMC?

sethaxen

This is an excellent refactor! I had a few minor questions/suggestions. Some of them are not necessarily on changes in this PR and don't need to be handled here, but they came up while I was reading.

sethaxen · 2020-10-23T23:20:31Z

src/trajectory.jl

 end

+const NoUTurn = GeneralisedNoUTurn


What's the purpose of this alias?

I want to move towards calling the old ones ClassicNoUTurn and the new ones just NoUTurn and StrictNoUTurn, i.e. by default we refer to the generalised ones unless specified otherwise.

Perhaps it would make sense to go the other way around, then? Define NoUTurn and then make GeneralisedNoUTurn the alias so that users' legacy code works?

There was some debate between Hong and me on whether we want to make this transition, and we ended up with this compromise. So with this PR, the recommended way is still the previous version, and we want to use the shorter version privately for a while before making the transition.
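
The practical difference between the two alias directions can be seen in a small sketch. The two modules below are hypothetical and exist only so both variants can be defined side by side; the max_depth field is likewise an assumption.

```julia
# Direction taken in this PR: the long name is the real type, the short one an alias.
module ShortNameIsAlias
    struct GeneralisedNoUTurn
        max_depth::Int
    end
    const NoUTurn = GeneralisedNoUTurn
end

# Direction suggested in the review: the short name is the real type,
# and the long name stays as an alias so legacy code keeps working.
module LongNameIsAlias
    struct NoUTurn
        max_depth::Int
    end
    const GeneralisedNoUTurn = NoUTurn
end

a = ShortNameIsAlias.NoUTurn(5)
b = LongNameIsAlias.GeneralisedNoUTurn(5)

# Both names construct the same type within each module; what differs is the
# name the type reports for itself (printing, error messages, docs).
name_a = nameof(typeof(a))
name_b = nameof(typeof(b))
```

In either direction user code written against both names keeps working; the choice mainly decides which name users see in printed output and stack traces.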

sethaxen · 2020-10-23T23:46:49Z

src/trajectory.jl


 Detect U turn for two phase points (`zleft` and `zright`) under given Hamiltonian `h`
 using the (original) no-U-turn cirterion.

 Ref: https://arxiv.org/abs/1111.4246, https://arxiv.org/abs/1701.02434
 """
-function isterminated(h::Hamiltonian, t::BinaryTree{<:ClassicNoUTurn})
+function isterminated(::ClassicNoUTurn, h::Hamiltonian, t::BinaryTree)


I agree that this should probably be in another PR.

sethaxen · 2020-10-23T23:48:45Z

src/trajectory.jl

@@ -587,79 +531,67 @@ function isterminated(h::Hamiltonian, t::BinaryTree{<:ClassicNoUTurn})
 end

 """
-    isterminated(h::Hamiltonian, t::BinaryTree{<:GeneralisedNoUTurn})
+    $(SIGNATURES)

 Detect U turn for two phase points (`zleft` and `zright`) under given Hamiltonian `h`


It seems like this docstring is outdated, since zleft and zright appear nowhere in the signature or body of either method.

sethaxen · 2020-10-23T23:52:59Z

src/trajectory.jl

+"""
+function isterminated(tc::StrictGeneralisedNoUTurn, h, t, tleft, tright)
+    # Step 0: original generalised U-turn check
+    s1 = isterminated(tc, h, t)


Does it make sense to return s1 if terminated, else s2 if terminated, else s3? It seems that if s1 already indicates termination, we potentially do three times the work.

That's a good idea! Can you do that in your NUTS PR?
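
The short-circuiting idea can be sketched independently of the tree code. Here the three checks are hypothetical zero-argument closures returning a Bool, and a counter records how many were actually evaluated; the real s1, s2, s3 may carry more information than a Bool, so this only illustrates the control flow.

```julia
# Evaluate the checks in order and stop at the first one that reports termination.
function first_terminated(checks...)
    for check in checks
        check() && return true
    end
    return false
end

# Hypothetical stand-ins for the whole-tree, left-subtree, and right-subtree checks.
calls = Ref(0)
s1() = (calls[] += 1; true)    # the whole-tree check already terminates
s2() = (calls[] += 1; false)
s3() = (calls[] += 1; false)

terminated = first_terminated(s1, s2, s3)
n_evaluated = calls[]
```

With plain Bool results the same effect is obtained by writing s1() || s2() || s3(), since || does not evaluate its right-hand side once the left-hand side is true.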

sethaxen · 2020-10-23T23:56:13Z

src/trajectory.jl

-) where {T<:BinaryTree{<:StrictGeneralisedNoUTurn}}
-    rho = tleft.c.rho + tright.zleft.r
+function check_left_subtree(h::Hamiltonian, t::T, tleft::T, tright::T) where {T<:BinaryTree}
+    rho = tleft.rho + tright.zleft.r


It'd be nice if we could avoid this allocation. Performing two extra dots in generalised_uturn_criterion should be cheaper than performing this allocation.

I'm happy to make the proposed change if some benchmark shows it improves the performance.

Here's a benchmark of one with the same time complexity:

julia> using LinearAlgebra, BenchmarkTools, Plots

julia> function foo(a, b, c, d)
           e = a + b
           f = c + d
           dot(e, f)
       end;

julia> function foo2(a, b, c, d)
           dot(a, c) + dot(a, d) + dot(b, c) + dot(b, d)
       end;

julia> ns = 10 .^ (0:6);

julia> times = map(ns) do n
           a, b, c, d = ntuple(_ -> randn(n), 4)
           (@belapsed($foo($a,$b,$c,$d)), @belapsed($foo2($a,$b,$c,$d)))
       end;

julia> plot(ns, [first.(times) last.(times)]; xscale=:log10, yscale=:log10, labels=["allocating" "non-allocating"])

I can include it in a more general allocation-reducing PR though.
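
The identity behind the benchmark is bilinearity of the dot product: dot(a + b, c + d) = dot(a, c) + dot(a, d) + dot(b, c) + dot(b, d). A dependency-light sketch (standard library only, no timing) that checks both the numerical agreement and the difference in allocated bytes is given below; the function names are made up for this example.

```julia
using LinearAlgebra

a, b, c, d = ntuple(_ -> randn(1_000), 4)

# Builds two temporary vectors before taking one dot product.
allocating(a, b, c, d) = dot(a + b, c + d)

# Four dot products on the original vectors, no temporaries.
non_allocating(a, b, c, d) = dot(a, c) + dot(a, d) + dot(b, c) + dot(b, d)

# Warm up so compilation is not counted by @allocated.
allocating(a, b, c, d)
non_allocating(a, b, c, d)

bytes_allocating = @allocated allocating(a, b, c, d)
bytes_non_allocating = @allocated non_allocating(a, b, c, d)
same_value = isapprox(allocating(a, b, c, d), non_allocating(a, b, c, d); atol=1e-8)
```

Whether the four-dot version is actually faster depends on the vector length and hardware, which is what the timing benchmark above is meant to show.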

sethaxen · 2020-10-23T23:56:40Z

src/trajectory.jl

-) where {T<:BinaryTree{<:StrictGeneralisedNoUTurn}}
-    rho = tleft.zright.r + tright.c.rho
+function check_right_subtree(h::Hamiltonian, t::T, tleft::T, tright::T) where {T<:BinaryTree}
+    rho = tleft.zright.r + tright.rho


same point here as above.

yebai

Hi @xukai92, I've done another careful pass of this PR and left some comments. There are now some merge conflicts against the master branch. Maybe consider fixing that too.

Overall, I find reviewing this PR a bit painful ; ) I hope we can keep PRs small and incremental in the future given that the stake of introducing bugs is high. For a concrete example, the refactoring of Trajectories, TerminationCriteria, and the introduction of HMCKernel each would deserve its own PR...

yebai · 2021-01-06T17:42:47Z

src/AdvancedHMC.jl

+
+struct HMC{TS} end
+HMC{TS}(int::AbstractIntegrator, L) where {TS} = HMCKernel(Trajectory(int, FixedNSteps(L)), TS)
+HMC(int::AbstractIntegrator, L) = HMC{MetropolisTS}(int, L)


Maybe consider the following for clarity and performance?

Suggested change

HMC(int::AbstractIntegrator, L) = HMC{MetropolisTS}(int, L)

HMC(int::AbstractIntegrator, L) = HMCKernel(Trajectory(int, FixedNSteps(L)), MetropolisTS)

yebai · 2021-01-06T17:44:16Z

src/AdvancedHMC.jl

+struct HMC{TS} end
+HMC{TS}(int::AbstractIntegrator, L) where {TS} = HMCKernel(Trajectory(int, FixedNSteps(L)), TS)
+HMC(int::AbstractIntegrator, L) = HMC{MetropolisTS}(int, L)
+HMC(ϵ::AbstractScalarOrVec{<:Real}, L) = HMC{MetropolisTS}(Leapfrog(ϵ), L)


Suggested change

HMC(ϵ::AbstractScalarOrVec{<:Real}, L) = HMC{MetropolisTS}(Leapfrog(ϵ), L)

HMC(ϵ::AbstractScalarOrVec{<:Real}, L) = HMCKernel(Trajectory(Leapfrog(ϵ), FixedNSteps(L)), MetropolisTS)

yebai · 2021-01-06T17:45:20Z

src/AdvancedHMC.jl

+
+struct StaticTrajectory{TS} end
+@deprecate StaticTrajectory{TS}(args...) where {TS} HMC{TS}(args...)
+@deprecate StaticTrajectory(args...) HMC(args...)


Similar here, consider calling HMCKernel for clarity.

yebai · 2021-01-06T17:57:27Z

src/trajectory.jl

@@ -30,12 +30,48 @@ stat(t::Transition) = t.stat
 """
 Abstract Markov chain Monte Carlo proposal.
 """
-abstract type AbstractProposal end
+abstract type AbstractKernel end


Maybe consider the following for clarity?

Suggested change

abstract type AbstractKernel end

abstract type AbstractMCMCKernel end

yebai · 2021-01-06T20:17:01Z

src/trajectory.jl

        H′ = energy(z′)
        ΔH = H′ - H0
        α′ = exp(min(0, -ΔH))
-        sampler′ = S(sampler, H0, z′)
-        return BinaryTree(z′, z′, C(z′), α′, 1, ΔH), sampler′, Termination(sampler′, nt, H0, H′)
+        sampler′ = TS(sampler, H0, z′)


Consider using reconstruct here for consistency and clarity.

yebai · 2021-01-06T20:20:01Z

src/trajectory.jl

    h::Hamiltonian,
+    τ::Trajectory{I, C},
+    ::Type{TS},


Ideally, we should avoid passing this ::Type{TS} around.

yebai · 2021-01-06T20:24:46Z

test/regression.jl

@@ -0,0 +1,63 @@
+using Test, Random, BSON, AdvancedHMC, Distributions, ForwardDiff
+
+is_ef6de39 = isdefined(AdvancedHMC, :EndPointTS)


Ideally, we can keep PRs small, i.e. one PR for a change/feature. For big changes like this PR, we should try to do it in multiple steps. It would make reviewing code much easier, which leads to much quicker merging too.

yebai · 2021-01-06T20:45:30Z

Ps. I now feel the introduction of HMCKernel may not be necessary and is potentially confusing. It is no longer necessary if we remove the TS field. It is inaccurate because, in HMC variants based on dynamic trajectories, the chosen trajectory sampler and the trajectory termination criteria (e.g. no-U-turn) are often coupled. That is, the step of simulating a trajectory and the step of picking a candidate phase point are coupled. We often iterate these two steps to get a final proposal. It is not easy to disentangle these two steps without introducing a significant performance regression and more memory usage.

xukai92 · 2021-01-06T22:27:33Z

Hi @xukai92, I've done another careful pass of this PR and left some comments. There are now some merge conflicts against the master branch. Maybe consider fixing that too.

Thanks for taking another pass. I will address the comments soon.

Overall, I find reviewing this PR a bit painful ; ) I hope we can keep PRs small and incremental in the future given that the stake of introducing bugs is high. For a concrete example, the refactoring of Trajectories, TerminationCriteria, and the introduction of HMCKernel each would deserve its own PR...

Sorry about this. I'm actually leaning towards separating it. I will work on extracting the first PR after incorporating your suggestions.

Ps. I now feel the introduction of HMCKernel may not be necessary and potentially confusing. It is no longer necessary if we remove the TS field.

I completely agree that if we remove the TS field, we can probably merge HMCKernel and Trajectory. But just as a reminder, keeping HMCKernel has a few benefits for the future:

- Compatibility: I don't think we will be happy if we were to compose an HMC Trajectory with an MH kernel.
- Modularity of momentum behaviour: partial momentum and coupling can be made modular, and I believe it's better to consider this step as part of the MCMC kernel.
- This last point is questionable: I kind of feel we should try to make the interface work on instances rather than types, if possible. For example, NUTS{MultinomialTS, GeneralisedNoUTurn}(Leapfrog(1e-3)) -> HMCKernel(Trajectory(Leapfrog(1e-3), NoUTurn()), MultinomialTS()). But anyway, these are just my very initial thoughts.

But maybe I could just introduce it later when introducing the modularity of momentum, as we are agreeing on making separate PRs now.
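
The "instances rather than types" point can be illustrated with a small dispatch sketch. The function name describe and the empty sampler structs are hypothetical; they only show the two calling conventions side by side.

```julia
# Hypothetical trajectory sampler tags.
struct MultinomialTS end
struct SliceTS end

# Convention used in this PR: pass the sampler *type* and dispatch on Type{...}.
describe(::Type{MultinomialTS}) = "multinomial"
describe(::Type{SliceTS}) = "slice"

# Alternative convention: pass an *instance* and dispatch on its type.
describe(::MultinomialTS) = "multinomial"
describe(::SliceTS) = "slice"

by_type = describe(MultinomialTS)
by_instance = describe(MultinomialTS())
```

Dispatching on instances leaves room for samplers that carry parameters later on, without changing call sites that already pass MultinomialTS().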

It is inaccurate because, in HMC variants based on dynamic trajectories, the chosen trajectory sampler and the trajectory termination criteria (e.g. no-U-turn) are often coupled. That is, the step of simulating a trajectory and the step of picking a candidate phase point are coupled. We often iterate these two steps to get a final proposal. It is not easy to disentangle these two steps without introducing a significant performance regression and more memory usage.

Yes, I see that these two steps are coupled for dynamic trajectories; that is exactly the reason we are dispatching on termination and trajectory sampler together, right? I'm not sure what you mean by "inaccurate" here. I still think they (the trajectory and its sampler) are conceptually two separate things; it's just that:

- For static trajectories, even the implementation can be (but is not necessarily) separated.
- For dynamic trajectories, we need to dispatch on them together because the implementation is coupled.
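
A toy version of "dispatching on termination and trajectory sampler together" might look like the following. The abstract type names, the plan function, and its return strings are assumptions made for this sketch; it only shows how one method can cover all static criteria while another covers all dynamic ones, each receiving the sampler type as a second argument.

```julia
# Hypothetical type hierarchy for termination criteria.
abstract type AbstractTerminationCriterion end
abstract type StaticTerminationCriterion <: AbstractTerminationCriterion end
abstract type DynamicTerminationCriterion <: AbstractTerminationCriterion end

struct FixedNSteps <: StaticTerminationCriterion
    n_steps::Int
end
struct NoUTurn <: DynamicTerminationCriterion
    max_depth::Int
end

# Hypothetical trajectory sampler tags.
struct MetropolisTS end
struct MultinomialTS end

# Static case: simulation and point selection can be two separate phases.
plan(::StaticTerminationCriterion, ::Type{TS}) where {TS} =
    "simulate the whole trajectory, then let $(TS) pick a point"

# Dynamic case: tree building and sampling are interleaved, so they are handled together.
plan(::DynamicTerminationCriterion, ::Type{TS}) where {TS} =
    "interleave tree doubling with $(TS) updates"

static_plan = plan(FixedNSteps(10), MetropolisTS)
dynamic_plan = plan(NoUTurn(5), MultinomialTS)
```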

xukai92 added 6 commits August 5, 2020 16:31

add regression test against master

31c6517

rename EndPointTS to MetropolisTS

097dc3f

rename GeneralisedNoUTurn to NoUTurn

27668bd

update static trajectories

72309f8

update dynamic trajectories

52827b9

Support a richer interface for NUTS

8878944

yebai requested changes Aug 7, 2020

xukai92 commented Aug 8, 2020

yebai requested changes Aug 8, 2020

xukai92 and others added 4 commits August 9, 2020 22:23

Apply suggestions from code review

04f621e

Co-authored-by: Hong Ge <hg344@cam.ac.uk>

add missing imports

9425d27

fix Hong's typo

c827dca

fix geweke

28887ca

yebai requested changes Aug 11, 2020

xukai92 and others added 12 commits August 28, 2020 10:25

name back no-U-turns

d17d79d

remove unnecessary interface for test

89f9b43

AbstractKernel -> AbstractKernel

cf0e5fc

FixedLength -> FixedIntegrationTime

ddc56f2

Update src/trajectory.jl

90b06cb

Co-authored-by: Hong Ge <hg344@cam.ac.uk>

make internal naming more descriptive

086590d

remove old comments

7c5d344

Revert "remove old comments"

cbd1234

This reverts commit 7c5d344.

rename TS to trajectory_sampler_type

237138f

improve internal namings

200b52f

push test toml

242dc4e

lower bound 1.3 on Travis

c59a739

Improve badge

e39aa93

xukai92 mentioned this pull request Oct 21, 2020

Fix adaptation of nominal step size for JitteredLeapfrog #220

Merged

sethaxen reviewed Oct 24, 2020

xukai92 and others added 2 commits October 27, 2020 12:08

Update src/trajectory.jl

68d1452

Co-authored-by: Seth Axen <seth.axen@gmail.com>

Update src/trajectory.jl

0e9200f

Co-authored-by: Seth Axen <seth.axen@gmail.com>

sethaxen mentioned this pull request Nov 5, 2020

Reduce number of allocations #224

Open

yebai requested changes Jan 6, 2021

xukai92 and others added 2 commits January 8, 2021 14:00

resolve conflicts

bde50d8

Apply suggestions from code review

3c06f2d

Co-authored-by: Hong Ge <hg344@cam.ac.uk>

xukai92 mentioned this pull request Jan 8, 2021

Refactoring termination criterion #240

Merged

xukai92 closed this Jan 12, 2021

yebai deleted the kx/unify-trajectory-1 branch March 11, 2022 14:18
