-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor Approximate Inference #194
Refactor Approximate Inference #194
Conversation
Codecov Report
@@ Coverage Diff @@
## master #194 +/- ##
==========================================
+ Coverage 97.98% 98.05% +0.06%
==========================================
Files 10 10
Lines 348 359 +11
==========================================
+ Hits 341 352 +11
Misses 7 7
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Broadly looks good -- just a couple of comments.
Unfortunately I think this is going to have to be a breaking change because one can no longer construct VFE
without any arguments. I guess it's time for 0.4.
We can add
I'm happy with this. If we find that it's confusing / causes problems for some reasons, we can always revert at a later date. |
Co-authored-by: willtebbutt <wt0881@my.bristol.ac.uk>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy with this now.
Unless @st-- has any objections, I propose that we bump the version of 0.4.0-DEV, so that we can remove deprecations / make any other breaking changes we need to make over the next couple of days, and aim to release 0.4.0 on Thursday.
Ohhh I still want to rename a bunch of things... not sure we'll manage that by Thursday though! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm generally happy with this, I think it's a good step forward. I've left a few minor suggestions for clean-up.
One larger question I have (but I'd be happy to discuss this separately/change it in another PR if you think that makes more sense): why posterior(VFE(z), fx, y)
and not, say, posterior(VFE(f(z)), x, y)
? I'm mainly wondering if you had thought this through and if so what conclusions you came to, or if you hadn't thought this through what pros/cons either way you might see:)
Co-authored-by: st-- <st--@users.noreply.github.com>
Cheers for the review :) The reason I did it that way is because you need to somehow pass both the observation noise for Also, it's more consistent with the exact |
For non-conjugate inference, would we then have something like |
Edit: Sorry, that's nonsense |
Could do -- I think we need to have some more discussion about this. We've also got the |
Awww, that's a shame; I thought that looked really good actually 😄 |
As discussed in person, let's just release this as 0.4.0. Ideally, we would wait with that until #190 has been merged & released as 0.3.9. (This is blocked, in turn, by JuliaGaussianProcesses/KernelFunctions.jl#344 and JuliaDiff/ChainRules.jl#496) |
Looking at this again, there is a problem with using I'm actually somewhat leaning towards reverting to the original version of: |
Hmmm, yes, good point. Thinking about this, I feel like we might be missing something from the API, but I'm not entirely sure what . I think that, once an approximation has been built and an The question, as I understand it, is
The only issue I have with your proposal is that you wind up with a For example, maybe we should be making the VFE(z, f(x), y; jitter)
SVGP(f(z, jitter), q) ? But then what do we replace logpdf(VFE(f(z), f(x), y), f(x), y)
logpdf(SVGP(f(z), q), f(x), y; kwargs...) both yield the appropriate elbo? Sorry, that's a bit of a brain dump. Do you have any more thoughts about the 1 vs 3 arg problem @rossviljoen ? Or perhaps I'm just being overly fussy, and we shouldn't worry about it for now? |
I think that summarises the problem perfectly. I think that your proposed constructors make the most conceptual sense, since
but then the I also agree that the entire point of this PR was to get a consistent API across all approximations. To that end, I'm in favour of the API looking like:
because that is also the most consistent with the non-approx API.
and simply define
? You're right though, it still doesn't quite feel 100% |
At this point in time, I think it's more important to having something in, so that we can crack on with building the other functionality, even if we refactor later when we realise there's a better way to do stuff.
I suppse that one could just delay computing stuff with
Would it not simply be edit: sorry, I misread your conclusion as to which thing to do. Lets go with the one you actually proposed, meaning that my second comment above is redundant. |
Ok, I've changed it again to use the style
as well as added assertions that fz.f == fx.f |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy for this to go in, once a minor version bump has been added and my comment addressed.
Implements the idea in #193 (see also JuliaGaussianProcesses/ApproximateGPs.jl#13)
Essentially, this replaces the current method for constructing approximate posteriors:
approx_posterior(VFE(), fx, y, fz)
with
posterior(VFE(z), fx, y)
(n.b. dropping the
approx_
prefix - should we keep it?)The reasoning is that using this style of approximation generalises more easily to cases where you need more complex approximations such as in SparseGPs - e.g. using a variational distribution
q
:posterior(SVGP(z, q), fx, y)
.Furthermore, it only requires passing the inducing inputs
z
instead of a fullfz isa FiniteGP
which enforces the fact thatfz
andfx
are constructed from the same underlying processfx.f
.It also moves the elbo and dtc from
finite_gp.jl
toapprox_posterior_gp.jl
and redefines them from:elbo(fx, y, fz)
to
elbo(VFE(z), fx, y)
As @willtebbutt mentioned in #193 - it would be useful to define other functions like
logpdf
andrand
on the approximation: