Vectorized map_data #62

eb8680 · 2017-07-20T08:18:59Z

Please review, but this needs some additional tests; do not merge yet

Should close #13

jpchen

a quick first look on my end. as you pointed out, lacking tests.

jpchen · 2017-07-20T08:20:44Z

pyro/poutine/poutine.py

        """
        Default pyro.map_data Poutine behavior
        """
-        if self.transparent and prev_val is not None:
+        if self.transparent and not (prev_val is None):


is there a reason for this change?

jpchen · 2017-07-20T08:22:03Z

pyro/poutine/poutine.py

+        else:
+            if batch_size is None:
+                batch_size = 0
+            assert batch_size >= 0, "cannot have negative batch sizes"


should we have batch_size default to 0 if it's negative?

jpchen · 2017-07-20T08:24:59Z

pyro/poutine/poutine.py

+
+                def scaled_sample(_prev_val, _name, _fn, *args, **kwargs):
+                    return old_sample(_prev_val, _name,
+                                      pyro.util.rescale_dist(_fn, scale),


import rescale_dist from pyro.util so you don't have to write it out (as other classes do as well)

jpchen · 2017-07-20T08:32:30Z

pyro/poutine/poutine.py

+
+                self._pyro_sample = scaled_sample
+                ret = list(map(lambda ix: fn(*ix), [(i, data[i]) for i in ind]))
+                self._pyro_sample = old_sample


can you add a comment here explaining multiplying the scaling factor then resetting it
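
A minimal sketch of the patch-then-restore pattern this comment asks to have documented. `ToyPoutine` and its simplified `_pyro_sample(name, log_weight)` signature are hypothetical stand-ins, not the classes in this PR. The `try/finally` is an added assumption, so that the original method comes back even if the mapped function raises.

```python
class ToyPoutine:
    """Hypothetical stand-in for a poutine; not the class in this PR."""

    def _pyro_sample(self, name, log_weight):
        # Toy behavior: a sample site just reports its log weight.
        return log_weight

    def map_data(self, values, scale):
        # Keep a handle on the unscaled method so it can be restored.
        old_sample = self._pyro_sample

        def scaled_sample(name, log_weight):
            # Every site visited inside the map sees a rescaled weight.
            return old_sample(name, log_weight * scale)

        # Temporarily shadow the method while the mapped function runs.
        self._pyro_sample = scaled_sample
        try:
            return [self._pyro_sample("x_%d" % i, v)
                    for i, v in enumerate(values)]
        finally:
            # Reset so sites outside this map_data call are not rescaled.
            self._pyro_sample = old_sample
```

With this shape, `ToyPoutine().map_data([1.0, 2.0], 3.0)` rescales both values, and a later direct call to `_pyro_sample` is unscaled again.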

jpchen · 2017-07-20T08:37:10Z

pyro/poutine/trace.py

@@ -40,14 +40,19 @@ def add_observe(self, name, val, fn, obs, *args, **kwargs):
        self[name] = site
        return self

-    def add_map_data(self, name, data, fn):
+    def add_map_data(self, name, fn, batch_size, scale, ind, **kwargs):


is there a reason you added **kwargs but not *args?

jpchen · 2017-07-20T08:38:10Z

pyro/poutine/trace.py

+        site["scale"] = scale
+        site["fn"] = fn
+        # site["value"] = val  # XXX too large to store
+        # site["args"] = ((), kwargs)


what's this for?

jpchen · 2017-07-20T08:43:25Z

pyro/util.py

+        def new_log_pdf(*args, **kwargs):
+            return old_log_pdf(*args, **kwargs) * scale
+
+        new_fn = copy.copy(fn)  # XXX incorrect?


is shallow sufficient for this?

jpchen · 2017-07-20T08:44:34Z

pyro/util.py

+    if hasattr(fn, "log_pdf"):
+        old_log_pdf = fn.log_pdf
+
+        def new_log_pdf(*args, **kwargs):


scaled_log_pdf might be a more descriptive name

jpchen · 2017-07-20T08:45:56Z

pyro/util.py

+        new_fn.log_pdf = new_log_pdf
+        return new_fn
+    else:
+        # XXX should raise an error here?


or at the very least a warning since it might help catch bugs that fall here

martinjankowiak · 2017-07-20T16:14:06Z

pyro/poutine/poutine.py

+
+                old_sample = self._pyro_sample
+
+                def scaled_sample(_prev_val, _name, _fn, *args, **kwargs):


either here, or in rescale_dist, check for scale == 1.0 and avoid unnecessary wrapping?
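
For reference, a toy sketch that folds the pyro/util.py suggestions above together with this one: the `scaled_log_pdf` name, a warning in the fallback branch, and an early return when `scale == 1.0`. `ToyNormal` and `rescale_dist_sketch` are hypothetical names for illustration, and returning the object unscaled in the fallback branch is an assumption, not what the PR does.

```python
import copy
import math
import warnings


class ToyNormal:
    """Hypothetical stand-in for a distribution object; not a Pyro class."""

    def __init__(self, mu, sigma):
        self.mu = mu
        self.sigma = sigma

    def log_pdf(self, x):
        z = (x - self.mu) / self.sigma
        return -0.5 * z * z - math.log(self.sigma) - 0.5 * math.log(2 * math.pi)


def rescale_dist_sketch(fn, scale):
    # Nothing to do for a unit scale, so skip the wrapping entirely.
    if scale == 1.0:
        return fn
    if not hasattr(fn, "log_pdf"):
        # Assumption: warn and pass the object through unscaled.
        warnings.warn("object has no log_pdf; returning it unscaled")
        return fn

    old_log_pdf = fn.log_pdf  # bound to the original object

    def scaled_log_pdf(*args, **kwargs):
        return old_log_pdf(*args, **kwargs) * scale

    # Shallow copy: the new object shares parameters with the original,
    # but the log_pdf override lives only on the copy.
    new_fn = copy.copy(fn)
    new_fn.log_pdf = scaled_log_pdf
    return new_fn
```

On the shallow-copy question: the override is set as an instance attribute on the copy, so the original object's `log_pdf` is untouched. Because `old_log_pdf` stays bound to the original, later in-place changes to the original's parameters would show up in the scaled copy as well.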

martinjankowiak · 2017-07-20T16:19:37Z

pyro/poutine/poutine.py

+            if batch_size is None:
+                batch_size = 0
+            assert batch_size >= 0, "cannot have negative batch sizes"
+            if hasattr(fn, "__map_data_indices") and \


isn't it strange to store this information as attributes of fn?

Yes, very strange, but I'm not sure if there's a more elegant way to do it. Naively, we'd want to put the rescaling and subsampling into a new Poutine subclass, but because map_data is implemented in the base Poutine, you'd be using an instance of a child class inside one of the base class methods. I imagine there's a way to do this, since I'm pretty sure mutual recursion with functions is OK, but I actually tried this initially and got a bunch of mysterious ImportErrors related to that which I was unable to clear up, so here we are.

Suggestions welcome!

can't it be an additional return value from map_data?
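
One possible shape for keeping the indices off `fn`, sketched under assumptions: the subsample indices and scale are recorded in a trace site, and a replay reads them back. The dict-based trace and the names `traced_map_data` and `replayed_map_data` are hypothetical, and the sketch assumes non-empty data.

```python
import random


def traced_map_data(trace, name, data, fn, batch_size, rng):
    """Run fn over a subsample and record the indices in the trace."""
    n = len(data)
    if batch_size == 0:
        ind = list(range(n))  # assumption: 0 means "use all the data"
    else:
        ind = sorted(rng.sample(range(n), batch_size))
    # The site, not fn, carries the subsampling information.
    trace[name] = {"type": "map_data", "ind": ind, "scale": n / len(ind)}
    return [fn(i, data[i]) for i in ind]


def replayed_map_data(guide_trace, name, data, fn):
    """Reuse the indices that the guide run recorded."""
    ind = guide_trace[name]["ind"]
    return [fn(i, data[i]) for i in ind]
```

Because the second function only reads the site, the model and guide see the same minibatch without any state attached to the mapped function.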

martinjankowiak · 2017-08-03T23:10:21Z

@jpchen @eb8680 what do we need to finish up and merge this PR?

eb8680 · 2017-08-05T22:58:49Z

Mainly needs some tests; currently the map_data sites in the existing inference tests are (almost?) all maps over lists/tuples rather than tensors.

jpchen · 2017-08-08T06:16:16Z

@eb8680 since you're away this week, @OptimusLime or i can help with the tests, but the rest of the comments should probably be addressed by you

martinjankowiak · 2017-08-30T15:14:17Z

@eb8680 awesome! i look forward to going through this [ and proving it still doesn't work ;) ]

martinjankowiak · 2017-09-05T18:25:21Z

pyro/__init__.py

+        # default behavior
+        if isinstance(data, (torch.Tensor, Variable)):  # XXX and np.ndarray?
+            if batch_size > 0:
+                if not hasattr(fn, "__map_data_indices"):


remove the attribute stuff

martinjankowiak · 2017-09-05T18:26:33Z

pyro/poutine/poutine.py

+            ret = self._pyro_sample(msg, msg["name"],
+                                    msg["fn"],
+                                    *msg["args"], **msg["kwargs"])
+            new_msg = msg.copy()


remove the copy()s ?

martinjankowiak · 2017-09-05T18:37:54Z

tests/test_mapdata.py

+                              x: pyro.observe(
+                                  "obs_%d" % i, dist.diagnormal,
+                                  x, mu_latent, torch.pow(self.lam, -0.5)), batch_size=batch_size)
+                pyro.map_data("bbb", self.data, lambda i,


what's the purpose of the z samples in these tests? dummies to see that things still work with 2 'map_data's?

I guess so, I should scrutinize the test models/guides more carefully

martinjankowiak · 2017-09-05T18:40:49Z

LGTM once comments addressed by @eb8680 and @jpchen gives the go ahead.

though one possible concern is whether poutines outside of the ones used by map_data have adequate test coverage given the poutine rewrite?

jpchen

took a first look. should be good for merge after fixes

jpchen · 2017-09-01T20:50:11Z

pyro/__init__.py

+    if len(_PYRO_STACK) == 0:
+        # default behavior
+        if isinstance(data, (torch.Tensor, Variable)):  # XXX and np.ndarray?
+            if batch_size > 0:


need to check batch_size <= len(Tensor/list) or you'll index out of bounds here
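
A rough sketch of the bounds check being requested, written against plain Python rather than tensors. `subsample_indices` is a hypothetical helper, not a function in this PR. Treating `batch_size == 0` as "no subsampling" and using `data_size / batch_size` as the rescaling factor are assumptions based on the usual minibatch convention.

```python
import random


def subsample_indices(data_size, batch_size=None, rng=random):
    """Return (indices, scale) for a minibatch, validating batch_size."""
    if batch_size is None:
        batch_size = 0
    # Reject negative sizes and sizes larger than the data, which would
    # otherwise index out of bounds.
    if not 0 <= batch_size <= data_size:
        raise ValueError(
            "batch_size must be in [0, %d], got %d" % (data_size, batch_size))
    if batch_size == 0:
        # Assumption: 0 means "use every data point, no rescaling".
        return list(range(data_size)), 1.0
    ind = sorted(rng.sample(range(data_size), batch_size))
    # Scale so the minibatch log-probability estimates the full-data one.
    scale = float(data_size) / batch_size
    return ind, scale
```

Raising instead of silently clamping is a design choice here; clamping `batch_size` to `data_size` would also avoid the out-of-bounds index but could hide a caller bug.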

jpchen · 2017-09-01T20:51:21Z

pyro/__init__.py

+            "fn": fn,
+            "data": data,
+            "batch_size": batch_size,
+            # XXX should these be added here or during application


should we add args, kwargs in msg?

I don't think so, better to have map_data functions have a standard fixed interface

jpchen · 2017-09-01T21:07:16Z

tests/test_mapdata.py

+from pyro.infer.kl_qp import KL_QP
+
+
+class NormalNormalTests(TestCase):


TODO: nested mapdata

Will do in separate PR

jpchen · 2017-09-01T21:29:27Z

tests/test_mapdata.py

+    def test_elbo_reparameterized(self):
+        for batch_size in [8, 7, 6, 4, 3, 0]:
+            self.do_elbo_test(True, 5000, batch_size, map_type="list")
+            self.do_elbo_test(True, 5000, batch_size, map_type="tensor")


can you separate these two into two unit tests? i.e. (test_elbo_reparam_list and test_elbo_reparam_tensor)? because as it stands it's one giant unit test and when it fails you don't know which of the 13 tests it failed. at least this will split the list and tensor types, which take two different control flows.

Good point, will do
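
A sketch of what the split could look like. The test names follow the suggestion above. The body of `do_elbo_test` is a placeholder assumption (the real one runs inference), and the use of `subTest` to report the failing batch size is an addition, not something proposed in this thread.

```python
import unittest

BATCH_SIZES = [8, 7, 6, 4, 3, 0]


class NormalNormalSketch(unittest.TestCase):
    def do_elbo_test(self, reparameterized, n_steps, batch_size, map_type):
        # Placeholder: the real helper would run inference and compare
        # against the analytic posterior.
        self.assertIn(map_type, ("list", "tensor"))
        self.assertGreaterEqual(batch_size, 0)

    def test_elbo_reparam_list(self):
        for batch_size in BATCH_SIZES:
            # subTest labels a failure with the batch size that caused it.
            with self.subTest(batch_size=batch_size):
                self.do_elbo_test(True, 5000, batch_size, map_type="list")

    def test_elbo_reparam_tensor(self):
        for batch_size in BATCH_SIZES:
            with self.subTest(batch_size=batch_size):
                self.do_elbo_test(True, 5000, batch_size, map_type="tensor")
```

Saved to a file, this can be run with `python -m unittest`; the list and tensor paths then fail independently.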

jpchen · 2017-09-01T22:03:44Z

pyro/__init__.py

    """
-    :param name: named argument
-    :param data: data tp subsample


if you change the input parameters, change the comments, don't remove them. these are auto-generated into the docs, so the parameter descriptions will be missing

I think I'm going to do a separate PR with just poutine documentation

jpchen · 2017-09-05T19:02:39Z

pyro/poutine/poutine.py

+                                    msg["fn"],
+                                    *msg["args"], **msg["kwargs"])
+            new_msg = msg.copy()
+            new_msg.update({"ret": ret})


pull these two lines out of the switch statement

jpchen · 2017-09-05T19:13:48Z

pyro/poutine/poutine.py

            for i in range(0, loc + 1):
                pyro._PYRO_STACK.pop(0)

-    def _pyro_sample(self, prev_val, name, fn, *args, **kwargs):
+    def _get_scale(self, data, batch_size):


probably can combine most of this repeated code with map_data above.. or have that function call this one

jpchen · 2017-09-05T19:22:32Z

pyro/poutine/poutine.py

-
-    def _pyro_param(self, prev_val, name, *args, **kwargs):
+        else:
+            if batch_size is None:


upper bound batch_size

jpchen · 2017-09-05T19:22:49Z

pyro/poutine/replay_poutine.py

+        """
+        Use the batch indices from the guide trace
+        """
+        if batch_size is None:


upper bound batch_size

Should be addressed by util.get_scale

jpchen · 2017-09-05T19:24:17Z

pyro/poutine/trace.py

@@ -1,4 +1,5 @@
 import pyro
+import pdb


jpchen · 2017-09-06T04:26:06Z

looks good, approving but noting things that should be addressed in future PRs:

- nested mapData test
- moving msg struct to utils or trace
- documentation for up and down and other methods in docstrings
- write observe in terms of sample and eliminate repeated code

eb8680 added 7 commits July 20, 2017 00:04

added scale_poutine and wrote map_data for trace and replay poutines

ddca955

moved map_data from trace_poutine to poutine and removed duplicate code from replay_poutine

ddfb9a6

Merge branch 'dev' into eli-map_data-pr

4884342

fixed linting errors

db008ab

removed scale poutine because of mutually recursive inheritance, replaced with monkeypatching in base poutine

0624540

removed scale from __init__, added missing import to util

827dddf

fixed argument error in replay, tests run but fail with minibatches

1e10024

eb8680 requested review from jpchen and martinjankowiak July 20, 2017 08:18

eb8680 added the enhancement and high priority labels Jul 20, 2017

eb8680 self-assigned this Jul 20, 2017

fixed linting error

5c8a2a9

eb8680 mentioned this pull request Jul 20, 2017

Refactor map_data #13

Closed

2 tasks

jpchen requested changes Jul 20, 2017

martinjankowiak reviewed Jul 20, 2017

eb8680 added 10 commits August 16, 2017 14:53

Merge branch 'dev' into eli-map_data-pr

773afc5

map_data rescaling causing huge problems. Merged latest changes from dev in, then refactored scaling. Scaling now lives in its own Poutine hidden inside TracePoutine, since that's the only place rescaled log_pdf can ever get called from.

c582168

linting error

6c11a95

still not working, now failing to store indices and scale in attributes sometimes, but why???

59cc983

adding stack mechanism from sketches, not ready yet

504e3b2

new bidirectional dispatch mechanism, now passes all poutine tests with conservative downward behavior

5a833ab

reconfigured stack-blocking to look more symmetric

b1c47b9

fixed linting errors

4dfb9a9

moving messages up the stack instead of return values

d3d9b8b

removed global transparent flag in poutines

3b5b715

eb8680 added 2 commits September 1, 2017 20:10

removed dead code from default map_data behavior, cleaned up blocking, and updated call signatures of primitives in tracegraphpoutine

f67890a

removed old comment, fixed linting error

33f0bee

martinjankowiak reviewed Sep 5, 2017

jpchen requested changes Sep 5, 2017

eb8680 added 11 commits September 5, 2017 13:27

changed scalepoutine behavior to work with nested map_data and cleaned up default map_data code

31b2593

removed poutine message copies

c1d4734

moved batch_size asserts to common function

135a5b1

moved batch_size asserts to common function 2

0caef6f

batch_size check in replay and removed pdb import from trace

b23eb7d

added variable import to replay for size checking

d87e12b

split up map_data tests, updated get_scale in tracepoutine, removed duplicate code in poutine.up

9686dbf

isolated map_data from inference tests, things mostly passing now

60cbf35

removed dead poutine test code, updated scalepoutine to have more standard interface

9bc887f

slightly more descriptive comments

eafa16f

restored docstrings to top-level primitive defs

3868a06

jpchen previously approved these changes Sep 6, 2017

removed kwargs from map_data trace site

a34057a

eb8680 dismissed jpchen’s stale review via a34057a September 6, 2017 04:27

jpchen approved these changes Sep 6, 2017

jpchen merged commit e41b3de into dev Sep 6, 2017

jpchen mentioned this pull request Sep 6, 2017

Cleanup of mapdata PR #93

Closed

4 tasks

jpchen deleted the eli-map_data-pr branch September 6, 2017 04:58

This was referenced Sep 6, 2017

Poutine stack behavior #54

Closed

Implement AIR model #78

Closed

fritzo mentioned this pull request Aug 8, 2018

Support PyTorch JIT compilation #1063

Closed

8 tasks

