Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(inference): add nuts sampler #462

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

chentoast
Copy link
Contributor

@chentoast chentoast commented Apr 13, 2022

work in progress. implements a no u turn sampler but does not implement a dual averaging adaption process for setting epsilon, which can be done as follow up work. code generally needs to be cleaned up a little as well

todo:

  • docs
  • tests
  • calculate if a trajectory has diverged
  • add all sampler statistics necessary for dual averaging

@femtomc femtomc marked this pull request as draft April 14, 2022 05:08
@femtomc
Copy link
Contributor

femtomc commented Apr 14, 2022

Cool Tony! looks interesting -- I converted this to a draft PR, when you're ready to put it in the pipeline, you should be able to convert it back.

@chentoast chentoast marked this pull request as ready for review August 25, 2022 19:40
@chentoast chentoast changed the title [WIP] feat(inference): add nuts sampler feat(inference): add nuts sampler Sep 2, 2022
@chentoast
Copy link
Contributor Author

ping @femtomc :)

@femtomc
Copy link
Contributor

femtomc commented Sep 3, 2022

@chentoast Hi! I'll likely get to take a look at this sometime next week -- thank you for your work on this PR!

@femtomc femtomc self-requested a review September 19, 2022 00:15

for (name, val) in get_values_shallow(a)
out[name] = val + b[name]
end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how this function is intended to work if the choice maps have different shapes. Should we treat missing addresses in one choice map as "zero" for the value type of the choice map which does have the address?

It's unclear to me what the correct thing is here -- or whether you'd ever be adding choice maps with different address shapes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this invariant is enforced implicitly, it would be nice to document the code to indicate that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, that makes sense. I can't see a way for the invariant to be violated - we are always applying these functions to values and momentum choicemaps that are defined to have the same mistake at the start of the function. Violating that invariant would involve somehow changing the selection while in the middle of a leapfrog step, which doesn't seem to be possible.

return (new_trace, SamplerStats(depth, tree.n, tree.diverging, true))
else
return (trace, SamplerStats(depth, tree.n, tree.diverging, false))
end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One small comment -- here, the normal interface for kernels is slightly violated (but in a way which I'm not unhappy about). I think normally, primitive kernels have a return signature of (Trace, accept_or_reject).

One interesting idea (to improve the information we can extract from our kernels) would be to create a new abstract type for the second element of the returned tuple, and allow definition of new types to be returned from kernels.

This change is a bit more drastic -- so I recommend that (for now), we remove SamplerStats from this PR -- and setup a new issue to handle the bigger question of how to better profile the kernel statistics.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense - I'll do that in a follow-up PR.

end

return out
end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really like that this allocates a new choice map each time you call it -- but I need to examine and think about the code below further to suggest a better solution.

Even if we created a view-like structure, we'd still need to eat the allocations when we decide to modify (scale) the values.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually implemented a version of this function that operated in place, but I didn't see noticeable performance benefits (at least, on my tiny toy problem). But I don't see any downsides of making it work in place, at least for now.

src/inference/nuts.jl Outdated Show resolved Hide resolved
src/inference/nuts.jl Outdated Show resolved Hide resolved
@chentoast
Copy link
Contributor Author

This pr is getting closer, but I would hold off on merging until I've had a chance to test it out on some more non-trivial models

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants