Conversation

ferrine
Member

@ferrine ferrine commented Jun 13, 2017

The architecture I created this winter for variational inference is supposed to be modular, so that implementing a new method takes only a few lines of code with the math for computing samples from the initial distribution ($z_0$) and the probability ($q(z)$) of the generated posterior ($z$). Not much is left to the developer: there is an abstract method $f_{\theta} : z_0 \rightarrow z$ that generates $z$ and is parametrized by $\theta$. The result is the approximate posterior. Note that the intermediate $z_t$ are not available in that case, so we can't compute $q(z)$ for the flow.

This PR will solve the problem using a new internal architecture. I'll rely on symbolic entry points and theano.clone to achieve the necessary modularity and control.
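A rough sketch of the modular pattern described above (illustrative names, not the PR's actual classes): a new method only has to supply how to draw $z_0$ and the map $f_{\theta}: z_0 \rightarrow z$.

```python
import random

class ApproximationSketch:
    """Base class: subclasses only provide f_theta."""

    def sample_z0(self):
        # initial distribution; a standard normal here
        return random.gauss(0.0, 1.0)

    def forward(self, z0):
        # f_theta: z0 -> z, parametrized by theta
        raise NotImplementedError

    def sample(self):
        # a posterior draw z = f_theta(z0)
        return self.forward(self.sample_z0())

class ShiftScale(ApproximationSketch):
    # a trivial f_theta with theta = (mu, sigma)
    def __init__(self, mu, sigma):
        self.mu, self.sigma = mu, sigma

    def forward(self, z0):
        return self.mu + self.sigma * z0
```

With this split, a new inference method is essentially just a new `forward` (plus the math for $q(z)$ where it is available).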

@twiecki twiecki added this to the 3.2 milestone Jun 21, 2017
@twiecki twiecki changed the base branch from master to 3.2_dev June 23, 2017 10:05
@twiecki
Member

twiecki commented Jun 23, 2017

I pointed this PR at the new 3.2_dev branch, but it seems like there are failing tests, e.g.:

../../../miniconda2/envs/testenv/lib/python2.7/site-packages/theano/tensor/var.py:821: Exception
________________________________ test_circular _________________________________
    def test_circular():
        trans = tr.circular
>       check_transform_identity(trans, Circ)
pymc3/tests/test_transforms.py:151: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
pymc3/tests/test_transforms.py:14: in check_transform_identity
    x = constructor('x')
../../../miniconda2/envs/testenv/lib/python2.7/site-packages/theano/gof/type.py:405: in __call__
    return utils.add_tag_trace(self.make_variable(name))
../../../miniconda2/envs/testenv/lib/python2.7/site-packages/theano/tensor/type.py:353: in make_variable
    return self.Variable(self, name=name)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
self = x, type = TensorType(float64, scalar), owner = None, index = None
name = 'x'
    def __init__(self, type, owner=None, index=None, name=None):
        super(TensorVariable, self).__init__(type, owner=owner,
                                             index=index, name=name)
        if (config.warn_float64 != 'ignore' and type.dtype == 'float64'):
            msg = ('You are creating a TensorVariable '
                   'with float64 dtype. You requested an action via '
                   'the Theano flag warn_float64={ignore,warn,raise,pdb}.')
            if config.warn_float64 == "warn":
                # Get the user stack. We don't want function inside the
                # tensor and gof directory to be shown to the user.
                x = tb.extract_stack()
                nb_rm = 0
                while x:
                    file_path = x[-1][0]
                    rm = False
                    for p in ["theano/tensor/", "theano\\tensor\\",
                              "theano/gof/", "theano\\tensor\\"]:
                        if p in file_path:
                            x = x[:-1]
                            nb_rm += 1
                            rm = True
                            break
                    if not rm:
                        break
                warnings.warn(msg, stacklevel=1 + nb_rm)
            elif config.warn_float64 == "raise":
>               raise Exception(msg)
E               Exception: You are creating a TensorVariable with float64 dtype. You requested an action via the Theano flag warn_float64={ignore,warn,raise,pdb}.
../../../miniconda2/envs/testenv/lib/python2.7/site-packages/theano/tensor/var.py:821: Exception

@ferrine
Member Author

ferrine commented Jun 23, 2017

The failing file is pymc3/tests/test_transforms.py. That is strange.

@ferrine
Member Author

ferrine commented Jun 23, 2017

@twiecki this error can be caused by the session-scoped fixture, which is itself a generator. strict_float32 seems not to have been turned off. I've refactored that.

@ferrine
Member Author

ferrine commented Jun 25, 2017

Happy to say "tests pass". I also achieved a 5% speedup on the convolutional MNIST example using the GPU.

Member Author

TODO: fix docs here

Member Author

TODO: use node property

Member Author

TODO: use dict for shared params

@ferrine ferrine changed the base branch from 3.2_dev to master June 26, 2017 17:51
@ferrine
Member Author

ferrine commented Jun 26, 2017

The LDA example runs at the same speed with the new refactoring, but SGD eventually gave poor results. ADAM solved the problem.

@taku-y
Contributor

taku-y commented Jun 27, 2017

@twiecki Let me confirm: can we merge this into master?

order = ArrayOrdering(vars)
if inputvar is None:
    inputvar = tt.vector('flat_view', dtype=theano.config.floatX)
    if vars:
Member

Shouldn't `if vars:` be one indent less here?

Member Author

It should not. I use the `if` there for the case of empty vars; otherwise I get an exception from flatten.

if delta.ndim > 1:
    result = tt.batched_dot(result, delta)
else:
    result = result.dot(delta.T)
Member

Shouldn't it be `result = result.dot(delta)` here?

Member Author

I'd better delete this function, as it is not used.

Member

Yep, I agree.



-@pytest.fixture(scope="session", autouse=True)
+@pytest.yield_fixture(scope="function", autouse=True)
Member

Isn't yield_fixture deprecated? https://docs.pytest.org/en/latest/fixture.html

Member

Also, why change the scope from "session" to "function"?

Member Author

To drop the context after each test.
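A minimal sketch of the pattern (using a stand-in config object, not theano's actual one): a function-scoped autouse yield fixture applies the override before each test and restores it afterwards, instead of holding it for the whole session.

```python
class _Cfg:                          # stand-in for theano.config
    floatX = 'float64'

config = _Cfg()

def strict_float32():
    # in conftest.py this generator would be registered with
    #   @pytest.fixture(scope="function", autouse=True)
    old = config.floatX
    config.floatX = 'float32'        # applied before the test body runs
    yield
    config.floatX = old              # dropped after each test

# simulate one test run
gen = strict_float32()
next(gen)
assert config.floatX == 'float32'    # inside the test
next(gen, None)
assert config.floatX == 'float64'    # restored afterwards
```

With session scope, the first part runs once and the override leaks into every later test, which matches the float64 failure seen above.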

-------
list
"""
if isinstance(params, list):
Contributor

cast_to_list() raises a TypeError if a list or tuple is given. That contradicts the docstring.

Member Author

I forgot to change it. Thanks

[self.view_local(l_posterior, name)
for name in self.local_names]
)
sample_fn = theano.function([theano.In(s, 'draws', 1)], sampled)
Contributor

Could you tell me why you use theano.In() here? I think theano.function([s], sampled) might work, but there might be some reason.

Member Author

There is no reason; I just decided to use it when I first implemented this.

@memoize
def histogram_logp(self):
"""Symbolic logp for every point in trace
def histogram(self):
Member

So histogram is still a property, but the user only interacts with it through Empirical, right?

Member Author

As I remember, we decided that Empirical says what the class is about, while Histogram says how it stores samples. I see no harm in such a property.

Member

Yep sounds good to me.

@junpenglao
Member

LGTM, I have no more comments.
Just a quick question @ferrine: what are the global and local variables in opvi.py?

@taku-y
Contributor

taku-y commented Jun 27, 2017

@junpenglao Those are used for autoencoding variational Bayes. Global variables are relevant to the whole model, while local variables are associated with observations; the latter are often referred to as latent variables.

@ferrine
Member Author

ferrine commented Jun 27, 2017

Local variables are conditioned on data and thus encoded. So if we deal with a generative model, these variables are encoded during inference when computing logp. Since the encoder maps to mu and sd, we do not need extra transformations there, and they need different treatment.

Global variables, in contrast, are sampled independently of the data and need to be passed through transforms.

@aseyboldt
Member

aseyboldt commented Jun 27, 2017

I'm not sure all those memoize calls are such a good idea. Do we really need them? I think they can only help if we call something often, or if we are sure to call it a couple of times and it takes a long time to compute. They don't come for free:

  • The stack traces get longer and harder to read
  • Runtime overhead for caching (not much if it is just a property)
  • Higher memory usage if we cache something large
  • They can lead to bugs if `self` has changed somehow and the property's value should be different. This can be hard to track down.

@ferrine
Member Author

ferrine commented Jun 27, 2017

I use node properties to get stable symbolic references to nodes after they are created.
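A minimal sketch of what such a node property might look like (names are illustrative, not the PR's actual implementation): the wrapped method builds its symbolic node on first access and caches it on the instance, so every later access returns the identical object.

```python
def node_property(fn):
    """Property that builds its value once and caches it on the
    instance, so repeated access returns the identical object."""
    attr = '_cached_' + fn.__name__

    @property
    def prop(self):
        if not hasattr(self, attr):
            setattr(self, attr, fn(self))
        return getattr(self, attr)

    return prop


class Approx:
    @node_property
    def symbolic_input(self):
        # stands in for building a fresh symbolic node
        return object()


a = Approx()
assert a.symbolic_input is a.symbolic_input   # stable identity across accesses
```

The cache is per instance, so two different approximations still get independent nodes.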

@aseyboldt
Member

aseyboldt commented Jun 27, 2017

I just don't see why there is a memoize in the definition of node_property. Is that really necessary? And if it is, I think it would be better to specify it separately with a second decorator; that way it is more transparent what actually happens when accessing the property.

@taku-y
Contributor

taku-y commented Jun 27, 2017

I'm not sure to what extent memoization makes debugging difficult. However, we can easily remove the memoize decorator if it is found not to be effective. So I'd say merging this PR is reasonable: the tests already pass, and we can postpone checking the effectiveness of memoization.

I think that if node_property doesn't memoize, different objects representing the same expression (equation, or node) get put into the expression graph. This might make graph optimization difficult. It would be better to do some performance evaluation.

@aseyboldt
Member

I agree that this is not critical for this PR.
As an example of the problems that can happen because of memoize:

with pm.Model() as model:
    a = pm.Normal('a')

with model:
    trace = pm.sample()

with model:
    b = pm.Normal('b', mu=a)
    trace2 = pm.sample()

This currently fails with a disconnected input error, because logpt is cached in model.py.

@twiecki
Member

twiecki commented Jun 27, 2017

I half-agree with @taku-y. The conservative thing to do, however, would be to remove memoization for now and merge it back in once we've proven its worth.

@ferrine
Member Author

ferrine commented Jun 27, 2017

Memoization is not for performance; the properties are called during the construction phase. Memoization is for creating things I can reuse and access while being confident the id stays the same. theano.clone assumes unique mappings by id(obj), so if a symbolic property returned nodes with different ids, it would not be possible to make things work. Moreover, I am able to do some custom optimizations by accessing nodes that are already created.
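The identity argument can be illustrated with a pure-Python stand-in (theano.clone keys its replace dict by the node object, i.e. effectively by id; everything below is a hypothetical sketch, not theano code):

```python
def substitute(graph, replace):
    """Swap nodes in a nested-tuple graph, matching by object identity,
    the way an id-keyed replace dict would."""
    if id(graph) in replace:
        return replace[id(graph)]
    if isinstance(graph, tuple):
        return tuple(substitute(g, replace) for g in graph)
    return graph

class Node:
    def __init__(self, name):
        self.name = name

x = Node('x')
graph = ('sum', ('sqr', x))

# A property that rebuilds its node returns a different object each time,
# so an identity-based replacement silently misses it:
fresh = Node('x')                              # same "meaning", different id
missed = substitute(graph, {id(fresh): Node('x_new')})
assert missed == graph                         # nothing was replaced

# A memoized property hands back the original object, and the swap works:
hit = substitute(graph, {id(x): Node('x_new')})
assert hit[1][1].name == 'x_new'
```

This is why a symbolic property has to return the same node object on every access: otherwise the replacement key can never be found in the graph.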

@ferrine
Member Author

ferrine commented Jun 27, 2017

The problem with the disconnected input is more about how we do hashing. We just need to make it dependent on vars.

@aseyboldt
Member

aseyboldt commented Jun 27, 2017

If it is about keeping the result unique, I think in most cases it is much better to just do that work in __init__. That way you get error messages earlier, and they are shorter and easier to read. I usually have a bad feeling when I see lots of memoization; it makes the code asynchronous and hard to follow.
Sorry for picking on you here @ferrine, especially since I didn't even properly read this code. I guess that's not entirely fair. 😊

@ferrine
Member Author

ferrine commented Jun 27, 2017

@aseyboldt the idea behind that is the following:

  1. compute the graph in a lazy manner
  2. create property-functions that construct a node, instead of overloading `__init__` with lots of operations
  3. be sure a property returns the same object every time
  4. not care about the property call order
  5. lightweight serialization, as these properties are not serialized (right?)

I see no alternative to memoization.

About serialization: I've moved to using dicts as shared_params, as they are easy to serialize (with get/set value). This mechanism can be discussed.
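A hedged sketch of the serialization idea (with a stand-in for theano shared variables and illustrative names): with a dict of shared parameters, saving and restoring state is one get_value/set_value per entry.

```python
class Shared:
    """Stand-in for a theano shared variable."""
    def __init__(self, value):
        self._value = list(value)
    def get_value(self):
        return list(self._value)
    def set_value(self, value):
        self._value = list(value)

shared_params = {'mu': Shared([0.0, 0.0]), 'rho': Shared([1.0, 1.0])}

# serialize: a plain dict of plain values, easy to pickle
state = {name: var.get_value() for name, var in shared_params.items()}

# ...training mutates the parameters...
shared_params['mu'].set_value([0.5, -0.5])

# restore from the saved state
for name, var in shared_params.items():
    var.set_value(state[name])

assert shared_params['mu'].get_value() == [0.0, 0.0]
```

Keying by name also keeps the saved state independent of object identity, unlike pickling the shared variables themselves.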

@ferrine
Member Author

ferrine commented Jun 27, 2017

Do you know a flexible way to put all that stuff into __init__? Only a metaclass comes to mind for me, but that will surely be much more complicated.

@taku-y
Contributor

taku-y commented Jun 28, 2017

Some tests related to SVGD failed. It seems related to the last two commits; I remember that the tests passed before them.

(cherry picked from commit cc4291e)
@aseyboldt
Member

I've wanted to take the time to read through the variational code properly for quite a while; I guess I'll do that this weekend. I'll let you know if I come up with an alternative.
But don't wait with merging on my account. :-)

@taku-y
Contributor

taku-y commented Jun 28, 2017

@ferrine Ready to merge?

@ferrine
Member Author

ferrine commented Jun 28, 2017

Yes

@taku-y taku-y merged commit 37b4f77 into pymc-devs:master Jun 28, 2017
@taku-y
Contributor

taku-y commented Jun 28, 2017

Thanks!

@ferrine ferrine deleted the refactor-opvi branch June 28, 2017 10:20