Add Trace_MMD class, add tests that MMD correctly fits distributions #1818

varenick · 2019-04-10T17:22:45Z

Adds the feature requested in #1780.

CLAassistant · 2019-04-10T17:22:52Z

All committers have signed the CLA.

eb8680

@varenick thanks for the PR! I have a few questions/comments. Would you also be able to add a simple standalone example script, either here or in a followup PR?

pyro/infer/trace_mmd.py

tests/infer/test_inference.py

pyro/infer/trace_mmd.py

eb8680 · 2019-04-10T22:35:02Z

pyro/infer/trace_mmd.py

+        or a dict that maps latent variable names to instances of :class: `pyro.contrib.gp.kernels.kernel.Kernel`.
+        In the latter case, different kernels are used for different latent variables.
+
+    :param mmd_scale: A scaling factor for MMD terms.


Does this need to be separate? Shouldn't scaling be handled within the kernels (via e.g. pyro.contrib.gp.kernels.VerticalScaling) or in the learning rate during optimization?

It is fine to handle scaling at initialization (simply via the variance argument of a Kernel instance), but I am not sure it would be convenient to change the scale during optimization by accessing kernel arguments. Also, not every Kernel instance has a variance property.
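The trade-off discussed above can be illustrated with a small sketch. It is not the code from this PR: `rbf_kernel` and `mmd2` are hypothetical helpers written in NumPy, and the RBF form with a `variance` multiplier is an assumption made for illustration. Because a squared-MMD estimate is linear in the kernel, scaling the kernel's variance and multiplying the loss by a separate factor give the same value; a separate factor simply avoids reaching into kernel parameters.

```python
import numpy as np


def rbf_kernel(x, y, variance=1.0, lengthscale=1.0):
    """RBF kernel matrix: variance * exp(-0.5 * ||x - y||^2 / lengthscale^2)."""
    sq_dist = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    return variance * np.exp(-0.5 * sq_dist / lengthscale ** 2)


def mmd2(x, y, kernel):
    """Biased (V-statistic) estimate of squared MMD between two sample sets."""
    return kernel(x, x).mean() + kernel(y, y).mean() - 2.0 * kernel(x, y).mean()


rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, size=(200, 2))  # stand-in for samples from one distribution
y = rng.normal(loc=0.5, size=(200, 2))  # stand-in for samples from the other

base = mmd2(x, y, rbf_kernel)
# Option A: put the scale inside the kernel (only possible if it has a variance).
via_kernel = mmd2(x, y, lambda a, b: rbf_kernel(a, b, variance=3.0))
# Option B: keep the kernel fixed and scale the loss term separately.
via_loss_scale = 3.0 * base
print(np.isclose(via_kernel, via_loss_scale))
```

Option B also works for kernels that expose no variance-like parameter, which is the concern raised in the reply above.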

pyro/infer/trace_mmd.py


varenick · 2019-04-15T18:05:09Z

@eb8680 Here's an example Jupyter notebook with an MMD-VAE model implemented with the Trace_MMD class. Link to the cloud: https://yadi.sk/d/UHGUQLh834dX8A

pyro/infer/trace_mmd.py

eb8680 · 2019-04-19T00:11:15Z

pyro/infer/trace_mmd.py

+                        model_samples = independent_model_site['value']
+                        guide_samples = guide_site['value']
+                        model_samples = model_samples.view(
+                            -1, *[model_samples.size(j) for j in range(-independent_model_site['fn'].event_dim, 0)]


Hmm, I don't think what you've implemented here, where you're computing MMD across plate slices, is valid for models and guides with any nontrivial dependency structure. It happens to correspond to the objective in the InfoVAE paper for the InfoVAE graphical model because all the latent variables in that model are independent, but is not correct in general.

You'll need to do what you were doing before and treat only the particle dimension as a batch dimension, but compute it explicitly following my suggestion:

    particle_dim = -self.max_plate_nesting - independent_model_site["fn"].event_dim
    model_samples = independent_model_site['value']
    model_samples = model_samples.transpose(-model_samples.dim(), particle_dim)
    model_samples = model_samples.view(model_samples.shape[0], -1)
    # and similar for guide_samples
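A minimal, runnable sketch of the same reshaping idea follows. It uses NumPy rather than torch so that it stays self-contained; `flatten_over_particles` and the example shapes are hypothetical and are not part of the PR. `np.swapaxes` plays the role of `Tensor.transpose`, and `reshape` the role of `view`. Note that in PyTorch, `view` on a transposed tensor can fail when the result is not contiguous, so `reshape` (or `.contiguous().view(...)`) may be needed there.

```python
import numpy as np


def flatten_over_particles(value, max_plate_nesting, event_dim):
    """Move the particle dimension to the front and flatten everything else.

    Assumes (as in the suggestion above) that the particle dimension sits at
    -max_plate_nesting - event_dim, counted from the right of value.shape.
    """
    particle_dim = -max_plate_nesting - event_dim
    # Swap the leftmost dimension with the particle dimension.
    value = np.swapaxes(value, -value.ndim, particle_dim)
    # One row per particle; plate and event dimensions are merged into columns.
    return value.reshape(value.shape[0], -1)


# Hypothetical site value: 2 particles, a plate of size 3, event shape (4,).
value = np.arange(2 * 3 * 4, dtype=float).reshape(2, 3, 4)
flat = flatten_over_particles(value, max_plate_nesting=2, event_dim=1)
print(flat.shape)

# An extra leading singleton dimension is also handled by the swap.
padded = value[None]  # shape (1, 2, 3, 4)
flat_padded = flatten_over_particles(padded, max_plate_nesting=2, event_dim=1)
print(flat_padded.shape)
```

Each row of the result is one joint sample of the whole site, which is what lets the particle dimension alone act as the batch dimension for the kernel.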

Unfortunately, this correct objective (which treats a probabilistic program as a big black box with no internal structure) does not correspond to the one in the InfoVAE paper. I suspect a general algorithm for MMD computation between arbitrary graphical models and mean-field guides that exploits all available conditional independence structure in the model, which would recover the InfoVAE objective as a trivial special case, would require kernels that decompose over Markov blankets in the model, rather than over individual variables.

It would probably look somewhat similar to this message-passing algorithm for Stein variational gradient descent or this message-passing algorithm for Jensen-Shannon divergences with neural density ratio estimators.

That's way beyond the scope of this PR, though - in fact, if you work out the general case you could write a nice paper about it. Note also that @fritzo and I are working on a new backend for Pyro that should make implementing such algorithms significantly easier, though it won't be ready for a few months.

You are right, I see that my implementation doesn't work correctly for the general case. I see two possible ways to move on:

1. Explicitly indicate for which special cases my class does work correctly.

2. Elaborate the general case.

I would like to choose the 2nd one; however, my contribution is part of an educational project for my Master's degree, so it is better for me to set an endpoint before the deadline, which is the 7th of May (maybe plus several days). That said, if the PR isn't approved by this deadline, it won't be a catastrophe; I will just receive a lower mark.

Is it acceptable to choose the 1st alternative? If so, it seems to me that my class works correctly if every batch dimension is marked by a plate. Am I right?

> Is it acceptable to choose the 1st alternative? If so, it seems to me that my class works correctly if every batch dimension is marked by a plate. Am I right?

Sorry, I'm not sure I understand your disclaimer. It would probably be helpful for you to write out the math and describe exactly the computations you're performing in your current implementation, the assumptions you're implicitly making about the model and guide, and the precise circumstances under which your MMD estimators are unbiased.
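For reference when writing out the math, one standard unbiased estimator of squared MMD (a U-statistic) is sketched below. The symbols are assumptions introduced here, not notation from the PR: $x_1, \dots, x_m$ are i.i.d. samples from one distribution (say, the model), $y_1, \dots, y_n$ are i.i.d. samples from the other (say, the guide), drawn independently of the $x_i$, and $k$ is a positive-definite kernel. Whether Trace_MMD computes exactly this form is not stated in this thread.

```latex
\widehat{\mathrm{MMD}}^2_u
  = \frac{1}{m(m-1)} \sum_{i \neq j} k(x_i, x_j)
  + \frac{1}{n(n-1)} \sum_{i \neq j} k(y_i, y_j)
  - \frac{2}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} k(x_i, y_j)
```

The estimator is unbiased only under that i.i.d. assumption, which is why it matters which tensor dimension supplies the samples.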

Without something like that to refer to, I don't think we can accept the PR as is, given that it's probably not correct for any models with nontrivial plate structure. Compare that with the situation of TraceMeanField_ELBO, which only works with mean-field guides but is otherwise provably correct for all reparametrizable models with static control flow. I also don't think it's reasonable for us to expect you to work out (or even promise to work out) the general case for MMD between arbitrary graphical models just to get a first PR merged, though I would certainly encourage you to try outside this PR if you're interested.

Instead of having to do a lot of extra work, then, here are two simpler alternatives that would build on all the good work you've done already to get your code merged by May 7:

1. Follow my suggestion above and only compute MMD across the particle dimension, not all plate dimensions; that's guaranteed to be correct.

2. Put the code you wrote here into the nice example notebook you've already written and repurpose it as an advanced tutorial/example on implementing custom model-specific objectives in Pyro.

Either of these would be fine with me and would make great first contributions. I suspect other Pyro users would really appreciate a more thorough and well-motivated custom objective tutorial (specialized to the InfoVAE) that we could feature prominently on our example web site, since that's a very common problem faced by researchers and the only relevant tutorial we have now is not very detailed or thorough and does not contain a working end-to-end example.

It seems that I have already missed the deadline; however, I still want to get the job done. I have modified the class following your suggestion.

@varenick sorry for the delayed review, I'll try to get to this sometime this week

eb8680

@varenick sorry again for the delay. I'm going to go ahead and merge this. Would you mind (1) updating your example notebook to use this version of the loss, (2) confirming that it produces similar reconstructions/samples with sufficiently high num_particles, (3) converting it to a .py script and (4) opening another PR with the example script and a corresponding .rst stub in the examples directory? That way we can include it on the examples webpage.

varenick added 2 commits April 10, 2019 16:52

Add Trace_MMD class, add tests that MMD correctly fits distributions

e18a7ab

Merge branch 'dev' into mmd_elbo

995a69e

fix: remove unused variables in test_inference.py

dbf425c

eb8680 reviewed Apr 10, 2019


eb8680 added the awaiting response label Apr 12, 2019

varenick added 2 commits April 13, 2019 22:20

fix: bugs when vectorized_particles=True, minor bugs; add: explicit example of MMD-VAE loss

d97eaf1

fix: minor bugs

a1f9bc9

eb8680 added awaiting review and removed awaiting response labels Apr 16, 2019

eb8680 reviewed Apr 19, 2019


eb8680 added awaiting response and removed awaiting review labels Apr 19, 2019

varenick added 3 commits April 24, 2019 17:07

fix: incorrect behaviour of _compute_mmd when vectorize_particles=True

141fe18

small bug fix; add disclaimer to docstring

f162323

fix: treat only particle_dim as batch_dim

0054abe

eb8680 added awaiting review and removed awaiting response labels May 22, 2019

eb8680 approved these changes Jul 15, 2019


eb8680 merged commit daeb066 into pyro-ppl:dev Jul 15, 2019

eb8680 mentioned this pull request Aug 8, 2019

[Feature Request] Analogue to TraceELBO class, but with MMD instead of KL #1780

Open


varenick commented Apr 10, 2019

CLAassistant commented Apr 10, 2019

eb8680 left a comment

eb8680 Apr 10, 2019

varenick Apr 13, 2019

varenick commented Apr 15, 2019

eb8680 Apr 19, 2019

varenick May 1, 2019

eb8680 May 2, 2019

varenick May 10, 2019

varenick Jul 9, 2019

eb8680 Jul 9, 2019

eb8680 left a comment
