
Add more statistics and move jitter to transition #124

Merged
merged 22 commits into TuringLang:master from sethaxen/more_stats on Nov 7, 2019

Conversation

@sethaxen (Member) commented Nov 3, 2019

See #123 for discussion

Adds the following statistics:

  • hamiltonian_energy_error
  • max_hamiltonian_energy_error
  • is_adapt
  • nom_step_size

To-do:

  • test that stats are returned, in particular that is_adapt starts as true and switches to false after adaptation is over
  • refactor code around jitter so that the step size used is accessible for stats. I'd appreciate input on this.

I'll submit a PR to Turing so that the new stats fields show up in internals instead of parameters. Looks like Turing doesn't use AdvancedHMC's sample function, so I'll need to add an is_adapt over there.

@sethaxen (Member, Author) commented Nov 3, 2019

I'm not certain why Travis isn't detecting this PR

@xukai92 closed this Nov 3, 2019
@xukai92 reopened this Nov 3, 2019
@xukai92 (Member) commented Nov 3, 2019

Just seeing whether reopening it will trigger Travis.

@sethaxen (Member, Author) commented Nov 3, 2019

Thanks! Any advice on how to handle jitter?

@sethaxen (Member, Author) commented Nov 3, 2019

What if the integrator steps returned a Transition instead of a PhasePoint? That would enable other experimental integrators to also return stats we don't know about.

@xukai92 (Member) commented Nov 3, 2019

I'd prefer not to, because Transition is designed to store a transition from an MCMC kernel.

How about the following

  1. Define stats(::AbstractLeapfrog) = ... (similar to L10 and L11 for jitter and temper).
    • This can then be used in transition to grab info from the integrator.
  2. Add another field ϵ0 to JitteredLeapfrog and make it mutable.
  3. Change jitter to jitter! to "refresh" ϵ using ϵ0.

What do you think?
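The proposal above could look roughly like the following minimal Julia sketch. All type, field, and function names here are illustrative, mirroring the comment rather than AdvancedHMC's actual source:

```julia
abstract type AbstractIntegrator end
abstract type AbstractLeapfrog{T} <: AbstractIntegrator end

# Mutable so that jitter! can refresh ϵ in place.
mutable struct JitteredLeapfrog{T<:AbstractFloat} <: AbstractLeapfrog{T}
    ϵ0::T      # nominal (unjittered) step size
    jitter::T  # relative jitter magnitude
    ϵ::T       # current, jittered step size
end

# "Refresh" ϵ from ϵ0 with a fresh uniform perturbation.
function jitter!(lf::JitteredLeapfrog)
    lf.ϵ = lf.ϵ0 * (1 + lf.jitter * (2 * rand() - 1))
    return lf
end

# Hooks that transition can call to collect integrator-specific statistics.
stats(lf::AbstractLeapfrog) = (step_size = lf.ϵ,)
stats(lf::JitteredLeapfrog) = (nom_step_size = lf.ϵ0, step_size = lf.ϵ)
```

Dispatching stats on the integrator type keeps transition generic: each integrator decides which extra fields it exposes.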

@sethaxen mentioned this pull request Nov 3, 2019
@sethaxen (Member, Author) commented Nov 4, 2019

That's a convenient way to do it. I implemented it.

I misunderstood what PyMC3's step_size_bar is. It's an internal parameter during the adaptation phase. step_size in PyMC3 and Stan indicates the nominal (unjittered) step size, and they don't output the jittered one. Still, I think that would be useful to have in stats. I think I'll keep the nominal step size as step_size and for JitteredLeapfrog only output step_size_jittered.

Adaptation uses the nominal step size, so I'll also add a nominal_step_size method. It also seems that a common pattern is NesterovDualAveraging(0.8, integrator.ϵ). I'm also planning to add something like

NesterovDualAveraging(δ::AbstractFloat, i::AbstractIntegrator) = NesterovDualAveraging(δ, nominal_step_size(i))
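A hedged sketch of how nominal_step_size and that convenience constructor might fit together; the struct definitions here are stand-ins for illustration, not AdvancedHMC's real ones:

```julia
abstract type AbstractIntegrator end

struct Leapfrog{T<:AbstractFloat} <: AbstractIntegrator
    ϵ::T  # for a plain leapfrog, ϵ is the nominal step size
end

struct JitteredLeapfrog{T<:AbstractFloat} <: AbstractIntegrator
    ϵ0::T      # nominal step size
    jitter::T  # relative jitter magnitude
    ϵ::T       # current, jittered step size
end

nominal_step_size(i::AbstractIntegrator) = i.ϵ  # generic fallback
nominal_step_size(i::JitteredLeapfrog) = i.ϵ0   # jittered case: use ϵ0

struct NesterovDualAveraging{T<:AbstractFloat}
    δ::T  # target acceptance rate
    ϵ::T  # initial step size for adaptation
end

# Replaces the NesterovDualAveraging(0.8, integrator.ϵ) pattern, which would
# wrongly pick up the jittered ϵ for a JitteredLeapfrog.
NesterovDualAveraging(δ::AbstractFloat, i::AbstractIntegrator) =
    NesterovDualAveraging(δ, nominal_step_size(i))
```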

[Review threads on src/trajectory.jl — outdated, resolved]
@xukai92 (Member) commented Nov 5, 2019

> Adaptation uses the nominal step size, so I'll also add a nominal_step_size method.

Can you explain a bit more? I don't quite catch this point.

> It also seems that a common pattern is NesterovDualAveraging(0.8, integrator.ϵ). I'm also planning to add something like

Sounds good to me.

@sethaxen (Member, Author) commented Nov 5, 2019

> Adaptation uses the nominal step size, so I'll also add a nominal_step_size method.

> Can you explain a bit more? I don't quite catch this point.

So when one sets the step size, one is really setting a "nominal" step size, since if one jitters, for example, the actual step size can change from step to step. It's the nominal step size that should be adapted (this is how Stan does it). You'd also want to use the nominal step size for HMCDA. If there's a more descriptive term than "nominal", I'm happy to use it; that's just the term Stan uses.

@sethaxen (Member, Author) commented Nov 5, 2019

One more point tangential to this PR is that the way Stan and (I think) PyMC3 implement HMC, the step size is only jittered at the beginning of the transition, not at every step in the transition. As currently implemented, JitteredLeapfrog is going to jitter every step.

[Review threads on src/integrator.jl and src/trajectory.jl — resolved]
@xukai92 (Member) commented Nov 6, 2019

> One more point tangential to this PR is that the way Stan and (I think) PyMC3 implement HMC, the step size is only jittered at the beginning of the transition, not at every step in the transition. As currently implemented, JitteredLeapfrog is going to jitter every step.

You are right. It's the case for NUTS, but HMC and HMCDA are fine as there is only a single call to the integrator. But let's also do the jitter once per transition by moving it to the beginning of transition. That's fine for now as we only have the leapfrog integrator, and we can worry about generalising it later.

Also, I think one improvement is that we no longer need to make JitteredLeapfrog mutable, because jitter can just return a new leapfrog to use within the transition function. So it actually has a nice side effect :)
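The immutable variant described above might look like the following sketch (names illustrative, not the merged code):

```julia
# Immutable: no field is modified after construction.
struct JitteredLeapfrog{T<:AbstractFloat}
    ϵ0::T      # nominal step size
    jitter::T  # relative jitter magnitude
    ϵ::T       # step size used within the current transition
end

# jitter returns a new integrator rather than mutating in place; transition
# would call this once at its start and use the result throughout.
jitter(lf::JitteredLeapfrog) =
    JitteredLeapfrog(lf.ϵ0, lf.jitter, lf.ϵ0 * (1 + lf.jitter * (2 * rand() - 1)))
```

Since the original integrator is untouched, the caller keeps a clean copy carrying the nominal step size, which is the "nice side effect" mentioned above.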

[Review thread on src/trajectory.jl — outdated, resolved]
@sethaxen (Member, Author) commented Nov 6, 2019

> But let's also do jitter one time for a transition by moving it to the beginning of transition.

> Also I think what we could improve is that we don't need to make JitteredLeapfrog mutable anymore because jitter can just return a new leapfrog and use that within the transition function. So it actually has a nice side effect :)

So just to be clear, you're recommending 1) making JitteredLeapfrog immutable again and renaming jitter! back to jitter, 2) removing jitter from step, and 3) adding jitter to the beginning of transition, returning a new JitteredLeapfrog to be used within that transition. Did I get that right?

@xukai92 (Member) commented Nov 6, 2019

Yes exactly!

@sethaxen changed the title from "[WIP] Add more statistics" to "Add more statistics and move jitter to transition" on Nov 6, 2019
@sethaxen (Member, Author) commented Nov 6, 2019

Okay, unless there's anything else, I think this is ready to merge.

[Review threads on README.md, src/integrator.jl, and src/trajectory.jl — resolved]
@xukai92 (Member) commented Nov 7, 2019

PR looks good to me now!

@xukai92 (Member) commented Nov 7, 2019

I will merge it in 2 hours if no objection.

@xukai92 merged commit 4b9c12c into TuringLang:master Nov 7, 2019
@sethaxen mentioned this pull request Nov 7, 2019
@sethaxen deleted the sethaxen/more_stats branch November 7, 2019 17:07
sethaxen added a commit to arviz-devs/ArviZ.jl that referenced this pull request Nov 7, 2019
sethaxen added a commit to salilab/hmc that referenced this pull request Nov 7, 2019
Add AdvancedHMC's new stats in TuringLang/AdvancedHMC.jl#124 following new ArviZ schema in arviz-devs/arviz#854.
sethaxen added a commit to arviz-devs/ArviZ.jl that referenced this pull request Nov 11, 2019
* Reformat

* Export from_cmdstan

* Document why from_xyz are not always exported

* Forward dataset functions

* Explicitly call Pandas

* Remove unnecessary function

* Reimplement conversion from Chains

This brings more complete support for Stan-style chains parsing and forwards metadata in info to InferenceData

* Extract groups from parameters with keys

* Add/move utility functions

* Add core function to make inf data from dicts

* Reformat

* Refactor conversion from MCMCChains

* Remove defunct test

* Empty commit

* Add new HMC stat conversions

Add new HMC statistics in TuringLang/AdvancedHMC.jl#124, using conventions in  arviz-devs/arviz#854

* Convert Stan style names with indexes to Turing style

* Don't assume dict type

* Adjust dims for log likelihood

* Enforce types of ArviZ's stats entries

* Incrementally build InferenceData

* Keep column ordering in Pandas

* Maintain structure of summary info

* Reformat

* Add nothing case

* Add reshape rule for 3d array

* Rename chn to chns

* Reimplement section_dict

* Remove unused indexify commands

* Reformat

* Parse comma-separated index style

* Add initial MCMCChains tests

* Remove isnothing (incompatible with Julia 1.0)

* Use dims/coords in test

* Test multivariate case

* Add internals to test

* Test all groups in chain conversion

* Reimplement chain to inferencedata/dataset conversions

* Test conversion to inferencedata/datasets

* Add test for missing -> NaN

* Simplify and better document array reshaping

* Use existing method

* Remove unused function

* Add test for info conversion

* Add CmdStan tests

* Skip CmdStan test on Julia 1.0

* Document from_mcmcchains

* Add function for converting dataset to dict

* Reimplement from_mcmcchains to support combinations of types

* Forward to from_mcmcchains

* Document functions

* Test dict/dataset conversion