Conversation

ferrine
Member

@ferrine ferrine commented Jun 13, 2017

The architecture I created this winter for variational inference is supposed to be modular, so that implementing a new method takes only a few lines of code with the math for computing samples from the initial distribution ($z_0$) and the probability ($q(z)$) of the generated posterior ($z$). Not much is left to the developer: there is an abstract method $f_{\theta} : z_0 \rightarrow z$ that generates $z$ and is parametrized by $\theta$. The result is the approximate posterior. Note that the intermediate $z_t$ are not available in that case, so we can't compute $q(z)$ for the flow.

This PR will solve the problem using a new internal architecture. I'll rely on symbolic entry points and theano.clone to achieve the necessary modularity and control.
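A rough sketch of the modular pattern described above (illustrative names, not the PR's actual classes): a new method only has to supply how to draw $z_0$ and the map $f_{\theta}: z_0 \rightarrow z$.

```python
import random

class ApproximationSketch:
    """Base class: subclasses only provide f_theta."""

    def sample_z0(self):
        # initial distribution; a standard normal here
        return random.gauss(0.0, 1.0)

    def forward(self, z0):
        # f_theta: z0 -> z, parametrized by theta
        raise NotImplementedError

    def sample(self):
        # a posterior draw z = f_theta(z0)
        return self.forward(self.sample_z0())

class ShiftScale(ApproximationSketch):
    # a trivial f_theta with theta = (mu, sigma)
    def __init__(self, mu, sigma):
        self.mu, self.sigma = mu, sigma

    def forward(self, z0):
        return self.mu + self.sigma * z0
```

With this split, a new inference method is essentially just a new `forward` (plus the math for $q(z)$ where it is available).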

@twiecki twiecki added this to the 3.2 milestone Jun 21, 2017
@twiecki twiecki changed the base branch from master to 3.2_dev June 23, 2017 10:05
@twiecki
Member

twiecki commented Jun 23, 2017

I pointed this PR at the new 3.2_dev branch, but it seems like there are failing tests, e.g.:

../../../miniconda2/envs/testenv/lib/python2.7/site-packages/theano/tensor/var.py:821: Exception
________________________________ test_circular _________________________________
    def test_circular():
        trans = tr.circular
>       check_transform_identity(trans, Circ)
pymc3/tests/test_transforms.py:151: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
pymc3/tests/test_transforms.py:14: in check_transform_identity
    x = constructor('x')
../../../miniconda2/envs/testenv/lib/python2.7/site-packages/theano/gof/type.py:405: in __call__
    return utils.add_tag_trace(self.make_variable(name))
../../../miniconda2/envs/testenv/lib/python2.7/site-packages/theano/tensor/type.py:353: in make_variable
    return self.Variable(self, name=name)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
self = x, type = TensorType(float64, scalar), owner = None, index = None
name = 'x'
    def __init__(self, type, owner=None, index=None, name=None):
        super(TensorVariable, self).__init__(type, owner=owner,
                                             index=index, name=name)
        if (config.warn_float64 != 'ignore' and type.dtype == 'float64'):
            msg = ('You are creating a TensorVariable '
                   'with float64 dtype. You requested an action via '
                   'the Theano flag warn_float64={ignore,warn,raise,pdb}.')
            if config.warn_float64 == "warn":
                # Get the user stack. We don't want function inside the
                # tensor and gof directory to be shown to the user.
                x = tb.extract_stack()
                nb_rm = 0
                while x:
                    file_path = x[-1][0]
                    rm = False
                    for p in ["theano/tensor/", "theano\\tensor\\",
                              "theano/gof/", "theano\\tensor\\"]:
                        if p in file_path:
                            x = x[:-1]
                            nb_rm += 1
                            rm = True
                            break
                    if not rm:
                        break
                warnings.warn(msg, stacklevel=1 + nb_rm)
            elif config.warn_float64 == "raise":
>               raise Exception(msg)
E               Exception: You are creating a TensorVariable with float64 dtype. You requested an action via the Theano flag warn_float64={ignore,warn,raise,pdb}.
../../../miniconda2/envs/testenv/lib/python2.7/site-packages/theano/tensor/var.py:821: Exception

@ferrine
Member Author

ferrine commented Jun 23, 2017

The failing file is pymc3/tests/test_transforms.py. That is strange.

@ferrine
Member Author

ferrine commented Jun 23, 2017

@twiecki this error can be caused by the session-scoped fixture, which is itself a generator. strict_float32 seems not to have been turned off. I've refactored that.

@ferrine
Member Author

ferrine commented Jun 25, 2017

Happy to say "tests pass". I also achieved a 5% speedup on the convolutional MNIST example using the GPU.

Member Author

TODO: fix docs here

Member Author

TODO: use node property

Member Author

TODO: use dict for shared params

@ferrine ferrine changed the base branch from 3.2_dev to master June 26, 2017 17:51
@ferrine
Member Author

ferrine commented Jun 26, 2017

The LDA example runs at the same speed with the new refactoring, but SGD eventually gave poor results. ADAM solved the problem.

@taku-y
Contributor

taku-y commented Jun 27, 2017

@twiecki Let me confirm: can we merge this into master?

order = ArrayOrdering(vars)
if inputvar is None:
    inputvar = tt.vector('flat_view', dtype=theano.config.floatX)
    if vars:
Member

Shouldn't `if vars:` be one indent less here?

Member Author

It should not. I use the `if` there for the case of empty vars; otherwise I get an exception from flatten.

if delta.ndim > 1:
    result = tt.batched_dot(result, delta)
else:
    result = result.dot(delta.T)
Member

Shouldn't it be `result = result.dot(delta)` here?

Member Author

I'd better delete this function, as it is not used.

Member

Yep, I agree.



-@pytest.fixture(scope="session", autouse=True)
+@pytest.yield_fixture(scope="function", autouse=True)
Member

Isn't yield_fixture deprecated? https://docs.pytest.org/en/latest/fixture.html

Member

Also, why change the scope from "session" to "function"?

Member Author

To drop the context after each test.
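A minimal sketch of the pattern (using a stand-in config object, not theano's actual one): a function-scoped autouse yield fixture applies the override before each test and restores it afterwards, instead of holding it for the whole session.

```python
class _Cfg:                          # stand-in for theano.config
    floatX = 'float64'

config = _Cfg()

def strict_float32():
    # in conftest.py this generator would be registered with
    #   @pytest.fixture(scope="function", autouse=True)
    old = config.floatX
    config.floatX = 'float32'        # applied before the test body runs
    yield
    config.floatX = old              # dropped after each test

# simulate one test run
gen = strict_float32()
next(gen)
assert config.floatX == 'float32'    # inside the test
next(gen, None)
assert config.floatX == 'float64'    # restored afterwards
```

With session scope, the first part runs once and the override leaks into every later test, which matches the float64 failure seen above.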

-------
list
"""
if isinstance(params, list):
Contributor

cast_to_list() raises a TypeError if a list or tuple is given. That contradicts the docstring.

Member Author

I forgot to change it. Thanks

[self.view_local(l_posterior, name)
for name in self.local_names]
)
sample_fn = theano.function([theano.In(s, 'draws', 1)], sampled)
Contributor

Could you tell me why you use theano.In() here? I think theano.function([s], sampled) might work, but there might be some reason.

Member Author

There is no reason; I just decided to use it when I first implemented this.

@memoize
def histogram_logp(self):
"""Symbolic logp for every point in trace
def histogram(self):
Member

So histogram is still a property, but the user only interacts with it through Empirical, right?

Member Author

As I remember, we decided that Empirical says what the class is about, while Histogram says how it stores samples. I see no harm in such a property.

Member

Yep sounds good to me.

@junpenglao
Member

LGTM, I have no more comments.
Just a quick question @ferrine: what are the global and local variables in opvi.py?

@taku-y
Contributor

taku-y commented Jun 27, 2017

@junpenglao Those are used for autoencoding variational Bayes. Global variables are relevant to the whole model, while local variables are associated with observations; the latter are often referred to as latent variables.

@ferrine
Member Author

ferrine commented Jun 27, 2017

Local variables are conditioned on data and thus encoded. So if we deal with a generative model, these variables are encoded during inference when computing logp. Since the encoder maps to mu and sd, we do not need extra transformations there, and they need different treatment.

Global variables, in contrast, are sampled independently of the data and need to be passed through transforms.

@aseyboldt
Member

aseyboldt commented Jun 27, 2017

I'm not sure all those memoize calls are such a good idea. Do we really need them? I think they can only help if we call something often, or if we are sure to call it a couple of times and it takes a long time to compute. They don't come for free:

  • The stack traces get longer and harder to read
  • Runtime overhead for caching (not much if it is just a property)
  • Higher memory usage if we cache something large
  • They can lead to bugs if `self` has changed somehow and the property's value should be different. This can be hard to track down.

@ferrine
Member Author

ferrine commented Jun 27, 2017

I use node properties to get stable symbolic references to nodes after they are created.
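A minimal sketch of what such a node property might look like (names are illustrative, not the PR's actual implementation): the wrapped method builds its symbolic node on first access and caches it on the instance, so every later access returns the identical object.

```python
def node_property(fn):
    """Property that builds its value once and caches it on the
    instance, so repeated access returns the identical object."""
    attr = '_cached_' + fn.__name__

    @property
    def prop(self):
        if not hasattr(self, attr):
            setattr(self, attr, fn(self))
        return getattr(self, attr)

    return prop


class Approx:
    @node_property
    def symbolic_input(self):
        # stands in for building a fresh symbolic node
        return object()


a = Approx()
assert a.symbolic_input is a.symbolic_input   # stable identity across accesses
```

The cache is per instance, so two different approximations still get independent nodes.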

@aseyboldt
Member

aseyboldt commented Jun 27, 2017

I just don't see why there is a memoize in the definition of node_property. Is that really necessary? And if it is, I think it would be better to specify it separately with a second decorator; that way it is more transparent what actually happens when accessing the property.

@taku-y
Contributor

taku-y commented Jun 27, 2017

I'm not sure to what extent memoization makes debugging difficult. However, we can easily remove the memoize decorator if it is found not to be effective. So I'd say merging this PR is reasonable: the tests already pass, and we can postpone checking the effectiveness of memoization.

I think that if node_property doesn't memoize, different objects representing the same expression (equation, or node) get put into the expression graph. This might make graph optimization difficult. It would be better to do some performance evaluation.

@aseyboldt
Member

I agree that this is not critical for this PR.
As an example of the problems that can happen because of memoize:

with pm.Model() as model:
    a = pm.Normal('a')

with model:
    trace = pm.sample()

with model:
    b = pm.Normal('b', mu=a)
    trace2 = pm.sample()

This currently fails with a disconnected input error, because logpt is cached in model.py.

@twiecki
Member

twiecki commented Jun 27, 2017

I half-agree with @taku-y. The conservative thing to do, however, would be to remove memoization for now and merge it back in once we've proven its worth.

@ferrine
Member Author

ferrine commented Jun 27, 2017

Memoization is not for performance; the properties are called during the construction phase. Memoization is for creating things I can reuse and access while being confident the id stays the same. theano.clone assumes unique mappings by id(obj), so if a symbolic property returned nodes with different ids, it would not be possible to make things work. Moreover, I am able to do some custom optimizations by accessing nodes that are already created.
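The identity argument can be illustrated with a pure-Python stand-in (theano.clone keys its replace dict by the node object, i.e. effectively by id; everything below is a hypothetical sketch, not theano code):

```python
def substitute(graph, replace):
    """Swap nodes in a nested-tuple graph, matching by object identity,
    the way an id-keyed replace dict would."""
    if id(graph) in replace:
        return replace[id(graph)]
    if isinstance(graph, tuple):
        return tuple(substitute(g, replace) for g in graph)
    return graph

class Node:
    def __init__(self, name):
        self.name = name

x = Node('x')
graph = ('sum', ('sqr', x))

# A property that rebuilds its node returns a different object each time,
# so an identity-based replacement silently misses it:
fresh = Node('x')                              # same "meaning", different id
missed = substitute(graph, {id(fresh): Node('x_new')})
assert missed == graph                         # nothing was replaced

# A memoized property hands back the original object, and the swap works:
hit = substitute(graph, {id(x): Node('x_new')})
assert hit[1][1].name == 'x_new'
```

This is why a symbolic property has to return the same node object on every access: otherwise the replacement key can never be found in the graph.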

@ferrine
Member Author

ferrine commented Jun 27, 2017

The problem with the disconnected input is more about how we do hashing. We just need to make it dependent on vars.

@aseyboldt
Member

aseyboldt commented Jun 27, 2017

If it is about keeping the result unique, I think in most cases it is much better to just do that work in __init__. That way you get error messages earlier, and they are shorter and easier to read. I usually have a bad feeling when I see lots of memoization; it makes the code asynchronous and hard to follow.
Sorry for picking on you here @ferrine, especially since I didn't even properly read this code. I guess that's not entirely fair. 😊

@ferrine
Member Author

ferrine commented Jun 27, 2017

@aseyboldt the idea behind that is the following:

  1. compute the graph in a lazy manner
  2. create property-functions that construct a node, instead of overloading `__init__` with lots of operations
  3. be sure a property returns the same object every time
  4. not care about the property call order
  5. lightweight serialization, as these properties are not serialized (right?)

I see no alternative to memoization.

About serialization: I've moved to using dicts as shared_params, as they are easy to serialize (with get/set value). This mechanism can be discussed.
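A hedged sketch of the serialization idea (with a stand-in for theano shared variables and illustrative names): with a dict of shared parameters, saving and restoring state is one get_value/set_value per entry.

```python
class Shared:
    """Stand-in for a theano shared variable."""
    def __init__(self, value):
        self._value = list(value)
    def get_value(self):
        return list(self._value)
    def set_value(self, value):
        self._value = list(value)

shared_params = {'mu': Shared([0.0, 0.0]), 'rho': Shared([1.0, 1.0])}

# serialize: a plain dict of plain values, easy to pickle
state = {name: var.get_value() for name, var in shared_params.items()}

# ...training mutates the parameters...
shared_params['mu'].set_value([0.5, -0.5])

# restore from the saved state
for name, var in shared_params.items():
    var.set_value(state[name])

assert shared_params['mu'].get_value() == [0.0, 0.0]
```

Keying by name also keeps the saved state independent of object identity, unlike pickling the shared variables themselves.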

@ferrine
Member Author

ferrine commented Jun 27, 2017

Do you know a flexible way to put all that stuff into __init__? Only a metaclass comes to mind for me, but that will surely be much more complicated.

@taku-y
Contributor

taku-y commented Jun 28, 2017

Some tests related to SVGD failed. It seems related to the last two commits; I remember that the tests passed before them.

(cherry picked from commit cc4291e)
@aseyboldt
Member

I've wanted to take the time to read through the variational code properly for quite a while; I guess I'll do that this weekend. I'll let you know if I come up with an alternative.
But don't wait with merging on my account. :-)

@taku-y
Contributor

taku-y commented Jun 28, 2017

@ferrine Ready to merge?

@ferrine
Member Author

ferrine commented Jun 28, 2017

Yes

@taku-y taku-y merged commit 37b4f77 into pymc-devs:master Jun 28, 2017
@taku-y
Contributor

taku-y commented Jun 28, 2017

Thanks!

@ferrine ferrine deleted the refactor-opvi branch June 28, 2017 10:20