Rejection sampling variational inference #819

Open
wants to merge 43 commits into base: master

Conversation

@cavaunpeu (Contributor)

WIP. Addresses #379.

@dustinvtran (Member)

Thanks for working on this! Ping me whenever you'd like some feedback.

@cavaunpeu (Contributor, Author)

Will do! It's very WIP for now. I want to get a thing working, then clean it up considerably.

@cavaunpeu (Contributor, Author) commented Jan 21, 2018

@dustinvtran, @naesseth

Hi. A few updates/questions:

  • I implemented a KucukelbirOptimizer as given in Equation 9 of the paper. (I was too lazy to track down its real name, so I called it this.) I put it in edward/optimizers/sgd.py and conformed its API to that of the other optimizer types that VariationalInference.initialize expects; it's a simple interface with an apply_gradients method. Eventually, this is something to robustify and merge upstream into TensorFlow, I'd think (if it's not there already). I've added an integration test as well.

  • My next step is to implement a ReparameterizedRejectionSampler object. This doesn't need to know anything about VI; the motivation is to be able to test it separately. (A rough sketch of the Gamma case follows below.)

After this, everything should be on the right track.
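
For concreteness, here is a minimal sketch of what such a sampler could look like for the Gamma case, based on the Marsaglia-Tsang transform used in the RSVI paper. The method names and interface here are guesses for illustration, not necessarily the PR's actual API:

```python
import numpy as np


class GammaRejectionSampler(object):
  """Sketch of a reparameterized rejection sampler for Gamma(alpha, 1),
  alpha >= 1, using the Marsaglia-Tsang transform
  h(eps, alpha) = (alpha - 1/3) * (1 + eps / sqrt(9 * alpha - 3))**3."""

  def __init__(self, alpha):
    assert alpha >= 1, "for alpha < 1, sample Gamma(alpha + 1) and rescale"
    self.alpha = alpha

  def h(self, eps):
    """Map a standard normal proposal eps to a Gamma-distributed value."""
    return (self.alpha - 1. / 3) * (1. + eps / np.sqrt(9. * self.alpha - 3.)) ** 3

  def h_inverse(self, z):
    """Recover the eps that produced an accepted sample z."""
    return np.sqrt(9. * self.alpha - 3.) * ((z / (self.alpha - 1. / 3)) ** (1. / 3) - 1.)

  def sample(self, rng=np.random):
    """Rejection-sample, returning the accepted (z, eps) pair so that
    downstream code can differentiate through z = h(eps, alpha)."""
    d = self.alpha - 1. / 3
    c = 1. / np.sqrt(9. * d)
    while True:
      eps = rng.randn()
      v = (1. + c * eps) ** 3
      if v <= 0:  # outside the support of the transform; reject
        continue
      u = rng.rand()
      # Marsaglia-Tsang acceptance test.
      if np.log(u) < 0.5 * eps ** 2 + d * (1. - v + np.log(v)):
        return d * v, eps
```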

Thus far, I've worked on build_rejection_sampling_loss_and_gradients, as an analogue of build_reparam_loss_and_gradients. Would the gradients given in the paper extend straightforwardly to a

  • build_rejection_sampling_kl_loss_and_gradients
  • build_rejection_sampling_entropy_loss_and_gradients

as well? I haven't taken the time to think this through.

How does this all sound? Thanks!

self.s_n = s_n
self.n = n

def apply_gradients(self, grads_and_vars, global_step=None):
@cavaunpeu (Contributor, Author) commented on this diff:

@dustinvtran

I'd quite appreciate it if you could glance at this method as well: my integration test passes on some days and fails on others, with zero changes to my code. Promise 🤞.
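
For reference, here is a minimal, hypothetical sketch of what an apply_gradients implementing the Kucukelbir et al. adaptive step size might look like; the class name, the hyperparameter names (eta, tau, alpha, eps), and the slot handling are assumptions on my part, not the PR's actual code:

```python
import tensorflow as tf


class AdaptiveStepOptimizer(object):
  """Sketch of the adaptive step-size rule from Kucukelbir et al. (ADVI):
  s_k = alpha * g_k**2 + (1 - alpha) * s_{k-1},
  rho_k = eta * k**(-1/2 + eps) / (tau + sqrt(s_k))."""

  def __init__(self, eta=0.1, tau=1.0, alpha=0.1, eps=1e-16):
    self.eta, self.tau, self.alpha, self.eps = eta, tau, alpha, eps
    self.k = tf.Variable(0.0, trainable=False, name="k")

  def apply_gradients(self, grads_and_vars, global_step=None):
    k = tf.assign_add(self.k, 1.0)
    updates = []
    for grad, var in grads_and_vars:
      # Per-variable running average of squared gradients.
      s = tf.Variable(tf.zeros(var.shape, dtype=var.dtype), trainable=False)
      s_new = tf.assign(
          s, self.alpha * tf.square(grad) + (1.0 - self.alpha) * s)
      # Element-wise step size that decays with the iteration count k.
      rho = self.eta * tf.pow(k, -0.5 + self.eps) / (self.tau + tf.sqrt(s_new))
      updates.append(tf.assign_sub(var, rho * grad))
    if global_step is not None:
      updates.append(tf.assign_add(global_step, 1))
    return tf.group(*updates)
```

One place run-to-run differences can creep in with rules like this is the first few iterations: if I recall correctly, ADVI initializes the running average with the first squared gradient rather than with zeros, and small differences there change the early step sizes.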

@cavaunpeu (Contributor, Author) commented Jan 30, 2018

Hey @dustinvtran, @naesseth. I've:

  • Removed the "Alp" sampler, and specified 'rmsprop' with a decaying learning rate (see the snippet at the end of this comment).
  • Unit-tested a GammaRejectionSampler object.
  • Unit-tested build_rejection_sampling_loss_and_gradients.

In the latter two, I pinned results to those computed in this notebook and vetted them thoroughly.

My gradients are still exploding. Might you have a chance to give some 👀 this week?
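
For reference, a decaying-learning-rate RMSProp setup in TensorFlow looks roughly like the following; the constants are illustrative, and how it gets wired into the inference object is an assumption on my part:

```python
import tensorflow as tf

# Decaying learning rate for RMSProp (illustrative constants).
global_step = tf.Variable(0, trainable=False, name="global_step")
learning_rate = tf.train.exponential_decay(
    learning_rate=0.01, global_step=global_step,
    decay_steps=1000, decay_rate=0.9, staircase=True)
optimizer = tf.train.RMSPropOptimizer(learning_rate)
# The optimizer and step counter would then be handed to the inference,
# e.g. inference.initialize(optimizer=optimizer, global_step=global_step),
# assuming initialize accepts these arguments.
```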

@ghost commented Feb 7, 2018

Hey @dustinvtran. Following up :)

@dustinvtran (Member)

Apologies for the delay! Busy with ICML stuff due this Friday. Maybe ping me this weekend? :)

@ghost commented Feb 7, 2018

Ah, no worries! Will do.

The PR is in a good state, I think. I just need to close the gap between "unit tests for one iteration of the RSVI gradient calculation pass" and "it just works." My hunch is that there's some slight "magic" with learning rates, supports, or the like that I'm missing.
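
For context, the three gradient terms that appear later in this thread (g_rep, g_cor, g_entropy) correspond, on my reading of Naesseth et al. (2017), to the following decomposition of the ELBO gradient, where $f(z) = \log p(x, z)$, $z = h(\varepsilon, \theta)$ is the reparameterized rejection sampler, $\pi(\varepsilon; \theta)$ is the distribution of the accepted $\varepsilon$, and $r(z; \theta)$ is the density $z$ would have under the proposal alone:

$$
\nabla_\theta \mathcal{L}
= \underbrace{\mathbb{E}_{\pi(\varepsilon;\theta)}\left[\nabla_z f(z)\big|_{z = h(\varepsilon,\theta)}\,\nabla_\theta h(\varepsilon,\theta)\right]}_{g_{\mathrm{rep}}}
+ \underbrace{\mathbb{E}_{\pi(\varepsilon;\theta)}\left[f\big(h(\varepsilon,\theta)\big)\,\nabla_\theta \log \frac{q\big(h(\varepsilon,\theta);\theta\big)}{r\big(h(\varepsilon,\theta);\theta\big)}\right]}_{g_{\mathrm{cor}}}
+ \underbrace{\nabla_\theta\,\mathbb{H}\big[q(z;\theta)\big]}_{g_{\mathrm{entropy}}}
$$

If the sampler were an exact reparameterization (acceptance probability one), the correction term $g_{\mathrm{cor}}$ would vanish and this would reduce to the standard reparameterization gradient plus the entropy term.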

@dustinvtran (Member) left a review:

This is great work! I think the testing as-is is fine.

I like your API for rejection sampling for Gamma. In future work after merging this, it would be nice to move this upstream to TensorFlow Distributions. There, we can think a bit harder about how to incorporate various forms of reparameterized samplers into tf.contrib.distributions.Gamma.

Edward 2.0's ed.klqp will incorporate the loss function you wrote down here.

@@ -144,6 +143,7 @@ def run(self, variables=None, use_coordinator=True, *args, **kwargs):

for _ in range(self.n_iter):
info_dict = self.update()
print(info_dict)
@dustinvtran (Member):

rm?

@@ -123,7 +123,6 @@ def run(self, variables=None, use_coordinator=True, *args, **kwargs):
Passed into `initialize`.
"""
self.initialize(*args, **kwargs)
@dustinvtran (Member):

add back newline? unrelated to PR

@@ -32,7 +32,7 @@ class KLpq(VariationalInference):

with respect to $\\theta$.

In conditional inference, we infer $z` in $p(z, \\beta
In conditional inference, we infer $z$ in $p(z, \\beta
@dustinvtran (Member):

This is unrelated to this PR. Can you make a new PR to fix this?

tf.summary.scalar("loss/reg_penalty", reg_penalty,
collections=[inference._summary_key])

g_rep = tf.gradients(rep, var_list)
@dustinvtran (Member):

Can you explain why you need the multiple gradient calls and not just one? This seems inefficient.
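
For what it's worth, if the per-term breakdown isn't needed downstream, the three calls could in principle be collapsed into one backward pass over the summed objective; a sketch, assuming rep, cor, and q_entropy are the scalar tensors in scope here:

```python
# A single tf.gradients call over the sum yields, per variable, the same
# total gradient as summing the three separate calls.
grads = tf.gradients(rep + cor + q_entropy, var_list)
grads_and_vars = list(zip(grads, var_list))
```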

@cavaunpeu (Contributor, Author)

Hey @dustinvtran. Thanks for the feedback. I left this in a "debug" state (print statements, etc.), as it doesn't yet work. Specifically, the main integration test, _test_poisson_gamma, does not pass: the gradients explode to np.nan.

Any insight as to why this might be? I'm guessing I'm lacking some expertise in getting this to work. NB: the "single-pass gradient computation" integration test, _test_build_rejection_sampling_loss_and_gradients, does pass (and has been thoroughly cross-checked against the Blei-lab notebook).

Also, I think there are other code-organization considerations to address before readying this for merge. For instance, what happens when we have more than one latent variable, where some require a rejection sampler and others don't?

In short: I think I need some help :)

@cavaunpeu (Contributor, Author)

Ping, @dustinvtran :) What's the best way forward? Have a moment to review in the coming days? Anyone else I should reach out to?

Cheers :)

@dustinvtran (Member)

Not sure I know enough about the algorithm to help, unfortunately. What happens if you try 1000 samples per iteration? Maybe @naesseth can reply?

@ghost commented Feb 22, 2018

Cool! Will ping them here.

@naesseth @slinderman, do you have a moment to help get this merged? If you're still in NYC, I'm happy to come meet in person as well to get this one done!

@slinderman

Hi @williamabrwolf, I’m traveling now but I’ll be back in NYC mid March. Happy to meet then if you’d like. I’ll have limited cycles between now and then but I’ll try to take a look at the code here. Perhaps Christian can also help out in the meantime. Glad to see you’re integrating this into Edward!

@ghost commented Feb 22, 2018

@slinderman mid-March works great. Grateful for the help, and happy to meet then if it isn't resolved by that point. williamabrwolf@gmail.com is me. Cheers :).

@naesseth

I'll try to take a look early next week; I'm currently traveling.

@ghost commented Mar 10, 2018

@slinderman @naesseth Hey all, are you back in NYC? I'd love to meet for two hours and get this to a place where we can merge soon.

@ghost commented Mar 21, 2018

Hey @slinderman, @naesseth. Following up :). Back in the city? Happy to travel to you to make this happen.

Cheers!

@slinderman commented Mar 21, 2018 via email

g_cor = tf.gradients(cor, var_list)
g_entropy = tf.gradients(q_entropy, var_list)

grad_summands = zip(*[g_rep, g_cor, g_entropy])
A review comment on this diff:

Can we try dropping g_cor from this summand and see if tests still pass?

Expected behavior: pass at a higher tolerance, but not blow up.

This is a possible culprit re: why gradients are exploding in running _test_poisson_gamma.

With a reasonably small step size, maybe 100 epochs.

Worth keeping an eye on g_entropy:

  • First, try g_rep and g_entropy
  • Next, try just g_rep

Print all the gradient terms from notebook as well.

@cavaunpeu (Contributor, Author):

"With a reasonably small step size, maybe 100 epochs." --> i.e. it should pass "with a reasonably small step size, and run for maybe 100 epochs."
