Leveraging SymPy for PyMC3 #178
Very interesting. Theano_Sympy doesn't provide examples that make it clear how one could take a calculation in sympy and then convert it to a theano calculation, so it's a little hard to evaluate. I'm not sure what he means when he says 'you have created some sort of symbolic algebra class structure (you add two pymc.Normal objects)'. Is that just x = Normal('x', 0,1) ? |
Yes, that is what I mean. |
Okay that makes sense. Right now, in pymc3, theano takes care of that stuff. The pretty printing etc does sound pretty nice. If I wanted to calculate the probability density of a Normal distribution given some value for the mean, variance and data value using the SymPy distribution but do that calculation in Theano, how would I do that? |
It looks like theano_sympy was designed to take Theano expressions to SymPy, simplify them, and then send them back again. This is not your application. The function that seems most useful in your case is the aptly named sympy_to_theano. Here is an example using sympy.stats and theano:

In [1]: from sympy.stats import *
In [2]: from sympy import Symbol, pprint, pi
In [3]: import theano
In [4]: X = Normal('x', 2, 3)
In [5]: x = Symbol('x')
In [6]: expr = density(X)(x)
In [7]: pprint(expr)
2
-(x - 2)
─────────
___ 18
╲╱ 2 ⋅ℯ
────────────────
___
6⋅╲╱ π
In [8]: from graph_translation import sympy_to_theano
In [9]: var_map = {'x': ('float64', (False, False))} # x is a rank 2 tensor of floats
In [10]: texpr = sympy_to_theano(expr, var_map) # oops, SymPy.pi doesn't have a Theano analog
KeyError: <class 'sympy.core.numbers.Pi'>
In [12]: expr = expr.subs(pi, pi.evalf()) # replace Pi with the float equivalent in SymPy
In [13]: texpr = sympy_to_theano(expr, var_map) # try this again
In [14]: theano.printing.debugprint(texpr)
Elemwise{mul,no_inplace} [@A] ''
|InplaceDimShuffle{x,x} [@B] ''
| |TensorConstant{0.0940315946937} [@C]
|InplaceDimShuffle{x,x} [@D] ''
| |Elemwise{pow,no_inplace} [@E] ''
| |TensorConstant{2} [@F]
| |TensorConstant{0} [@G]
|Elemwise{exp,no_inplace} [@H] ''
|Elemwise{mul,no_inplace} [@I] ''
|InplaceDimShuffle{x,x} [@J] ''
| |TensorConstant{-1} [@K]
|Elemwise{pow,no_inplace} [@L] ''
|Elemwise{add,no_inplace} [@M] ''
| |InplaceDimShuffle{x,x} [@N] ''
| | |TensorConstant{-2} [@O]
| |x [@P]
|InplaceDimShuffle{x,x} [@Q] ''
|TensorConstant{2} [@R] |
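The pi workaround in the session above is general. Here is a hedged sketch, reconstructing the same density expression by hand in plain SymPy (the expression is transcribed from the pretty-printed output, not pulled from the session):

```python
from sympy import Symbol, exp, sqrt, pi

x = Symbol('x')

# The Normal(2, 3) density from the session above, written out by hand.
expr = sqrt(2) * exp(-(x - 2)**2 / 18) / (6 * sqrt(pi))

# The workaround: replace symbolic constants that lack a Theano analog
# with their float equivalents before translating.
expr_numeric = expr.subs(pi, pi.evalf())
```

After the substitution the expression contains no symbolic pi, so the translation no longer hits the KeyError.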
It looks like I can just say:
I'm curious whether that interacts well with things like indexing. For example if I say
Whether this will work well. |
In your example above is In SymPy |
Notes on my example above. You would want to extend the |
The example above doesn't really use SymPy for anything other than a database of distributions. The random variable formalism allows the creation of some interesting problems. |
I meant |
For this I would use a collection of Normals. They can be manipulated as normal symbolic variables to form products, tuples, etc.

>>> from sympy import *
>>> from sympy.stats import *
>>> xs = symbols('x:10')
>>> mus = symbols('mu:10')
>>> Xs = [Normal(x, mu, 1) for x, mu in zip(xs, mus)]
>>> pprint(simplify(E(sum(X**2 for X in Xs))))
2 2 2 2 2 2 2 2 2 2
μ₀ + μ₁ + μ₂ + μ₃ + μ₄ + μ₅ + μ₆ + μ₇ + μ₈ + μ₉ + 10 |
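The trailing "+ 10" is just the ten unit variances. As a sanity check, here is a sketch using plain SymPy integration rather than sympy.stats (the symbol names are my own):

```python
from sympy import Symbol, integrate, exp, sqrt, pi, oo

z = Symbol('z', real=True)

# Standard normal density, written out by hand.
pdf = exp(-z**2 / 2) / sqrt(2 * pi)

# E[Z**2] = 1 for Z ~ N(0, 1); since X = mu + Z when X ~ N(mu, 1),
# E[X**2] = mu**2 + 1, which is where the "+ 10" above comes from.
second_moment = integrate(z**2 * pdf, (z, -oo, oo))
```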
Or if you wanted the full pdf of the joint probability space you could access it as follows:

In [1]: from sympy.stats import *
In [2]: from sympy import *
In [4]: mus = symbols('mu:10', real=True)
In [5]: Xs = [Normal('x_%d'%i, mu, 1) for i, mu in enumerate(mus)]
In [6]: pprint(pspace(Tuple(*Xs)).pdf)
2 2 2 2 2 2 2 2 2 2
-(-μ₀ + x₀) -(-μ₁ + x₁) -(-μ₂ + x₂) -(-μ₃ + x₃) -(-μ₄ + x₄) -(-μ₅ + x₅) -(-μ₆ + x₆) -(-μ₇ + x₇) -(-μ₈ + x₈) -(-μ₉ + x₉)
──────────── ──────────── ──────────── ──────────── ──────────── ──────────── ──────────── ──────────── ──────────── ────────────
2 2 2 2 2 2 2 2 2 2
ℯ ⋅ℯ ⋅ℯ ⋅ℯ ⋅ℯ ⋅ℯ ⋅ℯ ⋅ℯ ⋅ℯ ⋅ℯ
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
5
32⋅π |
Okay, this makes pretty good sense to me. It looks like SymPy could be good as a source of distribution math, but probably not a good replacement for the symbolic algebra part, since it's not geared to be numpy-like, which I think is an important feature. So I think we'd probably use SymPy by
What are other people's thoughts? |
@jsalvatier I don't see where we would need 1. Couldn't the model specification consist of sympy symbols which then get translated to theano for computation? Also, I think there is a non-trivial probability that this approach will lead to some unexpected benefits further down the line. |
What numpy-like features do you need? SymPy has a dense Matrix type and a set of purely symbolic MatrixExpr types. Each of these is strictly a rank 2 tensor. Support for higher rank tensors in SymPy exists but is poorly integrated. SymPy.stats had a multivariate branch a while back that generated matrix expressions; I could resurrect it if there was a definite need. If you're interested, I would look at this blogpost. I recommend skipping over the beginning and just scrolling down to where the code starts. |
Perhaps you're right. I think the part that makes me nervous is that sympy wasn't designed to do computation, so there can be unknown pitfalls. For example, does sympy have a notion of multidimensional arrays? Does it do broadcasting? How about indexing and advanced indexing into vectors? Those seem pretty important to me. It's also adding another interface between packages, and thus more possibility for problems. Sympy computations need to convert pretty well into theano graphs. |
Definitely you're correct that trying to represent computations with SymPy is a recipe for disappointment. Rather, SymPy is good for representing mathematical models. In my experience there is usually a clear distinction between the model and the computation. Does such a distinction exist in your application? You're definitely also correct that adding SymPy will bring a bunch of issues with it. SymPy.stats really isn't battle-tested. I'm confident that if you go this route you'll expose bugs. I'm around to support those but it adds inertia to the development cycle. |
You're right that there is such a distinction, and I think the separation is pretty clean in pymc3 too. Perhaps a better way to say it is that things like indexing and advanced indexing often come in at the model level here. For example, you will want to specify things like

group = [0, 0, 1, 1, 2, 0]

meaning that the mean of the prior distribution for y_i is the value of x_{group_i}. These things are often treated in a handwavy fashion in mathematics; for example, the notion of broadcasting is not a commonly discussed issue in math. So I would be somewhat surprised to find features like these to be well developed in sympy, or to allow for easy implementation. I would love to be surprised though. |
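A minimal pure-Python sketch of the indexing semantics being described (the names here are illustrative, not pymc3 API):

```python
# Each observation i draws its prior mean from the latent value of its group:
# the prior mean of y_i is x[group[i]].
group = [0, 0, 1, 1, 2, 0]
x = [1.5, -0.3, 0.7]             # one latent group-level value per group
y_means = [x[g] for g in group]  # broadcast group values out to observations
```

In numpy/Theano this is exactly the fancy-indexing expression x[group], evaluated in one step.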
Something along these lines does seem to exist in SymPy, but it doesn't look that well developed. |
Yeah, I would steer clear of that. Your intuition is correct that SymPy would not cleanly allow that syntax. You would have to resort to list comprehensions, map, for-loops, etc. Fancy indexing has been on our issues list for matrices for a while, though. It's certainly conceivable that we would support this in the future. If you spoke up on the mailing list saying that you needed it for your project, it's likely that someone would implement it relatively soon. We're in GSoC application mode right now, so we have a number of aspiring contributors looking to pounce on things like this. I wouldn't support adding a

In [1]: from sympy.stats import *
In [2]: groups = [0, 0, 1, 1, 2, 0]
In [3]: numgroups = len(set(groups))
In [4]: x = Matrix(1, numgroups, lambda i,j: Normal('x_%d'%j, 0, 1))
In [5]: x
Out[5]: [x₀ x₁ x₂ x₃ x₄]
In [6]: y = Normal(x[1, groups], 1) # This fails due to lack of fancy indexing
IndexError: Invalid index a[[0, 0, 1, 1, 2, 0]]

In general though, if you switch to SymPy as a modeling input language your interface will change. You'll get a huge amount of general mathematical formalism but will lose the specific features you've cooked into your own language. For some of those features you can add them to SymPy, or ask the SymPy community to add them for you. I'm sure that some language features would be difficult to support though. |
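The comprehension workaround suggested above would look something like this (a sketch; the symbol names are illustrative):

```python
from sympy import symbols

# Emulate fancy indexing with a list comprehension instead of x[1, groups],
# which SymPy's Matrix does not support.
xs = symbols('x:3')                 # (x0, x1, x2)
groups = [0, 0, 1, 1, 2, 0]
selected = [xs[g] for g in groups]  # one symbol per observation
```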
Hm, the array broadcasting is a real issue. One other idea: if we mainly just wanted to use sympy.stats for its distributions, would it be possible to borrow those and replace the sympy symbols with Theano ones (e.g. sympy.sin -> Theano.sin)? |
The |
If you just wanted the distributions you might want to skip the random variable and probability space formalism. Functions like |
With this approach we don't have to invest too much. We could even have some Theano-based and some SymPy-based distributions. |
@mrocklin I didn't find |
https://github.com/sympy/sympy/blob/master/sympy/stats/crv_types.py#L1632
If you want SymPy to do statistical simplifications for you then use Using

>>> X = Normal('x', 0, 1)
>>> Y = Normal('y', 0, 1)
>>> Z = Normal('z', 0, 1)
>>> density(X**2 + Y**2 + Z**2)
ChiSquaredDistribution(3)

But it would force you to use SymPy as your modeling language, which may not be appropriate. |
Oh, that's really neat! Although it doesn't seem to work on my install; the last call just returns a lambda with a nasty integral. Maybe a version issue (I'm running 0.7.2; this is the sympy master branch, I suppose?). Do you think this could figure out conjugate priors? For example:
Where the result is another Beta distribution with updated parameters. There is an explicit example at https://en.wikipedia.org/wiki/Conjugate_prior |
In principle, yes that should be solvable. In practice no, that is not yet solvable. I've never used sympy.stats with random parameters so this exposes some bugs which I'm now fixing. Obviously this is an important application though. In general sympy.stats transforms random expressions like what you have above into integrals. Sympy.integrate is generally surprisingly good at solving integrals. Of course, there are limits. There is nothing stopping us from generating the relevant integrals; I can't speak to what will happen after that. The statistical simplification is in a separate branch and not yet in master. |
OK, I see that it's not practical. But just to finish this line of thought: the trick with conjugate priors is that there is no integration required. Just by simplifying the numerator and denominator one can see that this is again a beta distribution, but with different parameters. Not sure if that changes anything. |
Mixing finite and continuous RVs is a bit buggy. Here is an example with strictly normal random variables.

>>> from sympy.stats import *
>>> from sympy import *
>>> mu = Normal('mu', 2, 3)
>>> x = Normal('x', mu, 1)
>>> y, z = Symbol('y'), Symbol('z')
>>> simplify(density(mu, Eq(x, y))(z)) # density of mu given x == y
2 2
9⋅y y 5⋅z 2⋅z 1
- ──── + y⋅z - ─ - ──── + ─── - ──
___ 20 5 9 9 45
╲╱ 5 ⋅ℯ
─────────────────────────────────────────
___
3⋅╲╱ π

Internally it sets up and solves the following integral:

In [17]: simplify(density(mu, Eq(x, y), evaluate=False)(z)) # density of mu given x == y
Out[17]:
oo
/
|
| 2 2
| (x - z) (z - 2)
| - -------- - --------
| 2 18
| e *DiracDelta(x - y)
| ----------------------------------------------------------- dx
| oo oo
| / /
| | |
| | | 2 2
| | | (x - z) (z - 2)
| | | - -------- - --------
| | | 2 18
| | | e *DiracDelta(x - y)
| 6*pi* | | ---------------------------------------- dx dz
| | | 6*pi
| | |
| / /
| -oo -oo
|
/
-oo

This is all in a branch. |
Is it correct to assume you meant to say "The denominator doesn't come into it"? Otherwise I'm more confused than I thought :) I suspect that the |
Heh, yep. Looks like I was confused and not you! Yes, simplifying either the log posterior (logp) or the derivative (dlogp)
|
This is obviously not my field but I've been playing with this a while. Here is a problem with a Poisson distribution whose rate parameter is distributed with a Beta distribution. In this particular case it looks like dlogp is cheaper than logp.

In [1]: from sympy import *
In [2]: from sympy.stats import *
In [3]:
In [3]: x = Symbol('x', positive=True)
In [4]: l = Symbol('lambda', positive=True)
In [5]:
In [5]: rate = Beta(l, 2, 3)
In [6]: X = Poisson(x, rate)
In [7]: numer = density(X, rate)(X.symbol) * density(rate)(rate.symbol)
In [8]: numer
Out[8]:
x 2 -λ
12⋅λ⋅λ ⋅(-λ + 1) ⋅ℯ
─────────────────────
x!
In [9]: log(numer)
Out[9]:
⎛ x 2 -λ⎞
⎜12⋅λ⋅λ ⋅(-λ + 1) ⋅ℯ ⎟
log⎜─────────────────────⎟
⎝ x! ⎠
In [10]: simplify(log(numer))
Out[10]:
⎛ 2 ⎞
⎜λ - 2⋅λ + 1⎟
-λ + x⋅log(λ) + log(λ) + log⎜────────────⎟ + log(12)
⎝ x! ⎠
In [11]: simplify(log(numer).diff(l))
Out[11]:
2
- λ + λ⋅x + 4⋅λ - x - 1
────────────────────────
λ⋅(λ - 1)

If this is interesting at all let me know. If you can provide a more likely example (that doesn't include array broadcasting) let me know. |
It looks like the fully general solution to Poisson parametrized by a Beta is actually pretty simple:

In [25]: rate = Beta(l, a, b)
In [26]: X = Poisson(x, rate)
In [27]: numer = density(X, rate)(X.symbol) * density(rate)(rate.symbol)
In [28]: simplify(log(numer).diff(l))
Out[28]:
2
a⋅λ - a + b⋅λ - λ + λ⋅x - λ - x + 1
────────────────────────────────────
λ⋅(λ - 1)
In [29]: simplify(log(numer).diff(l)).count_ops()
Out[29]: 14
In [30]: print ccode(simplify(log(numer).diff(l)))
(a*lambda - a + b*lambda - pow(lambda, 2) + lambda*x - lambda - x + 1)/(lambda*(lambda - 1)) |
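For what it's worth, the same dlogp can be reproduced without sympy.stats by writing the unnormalized joint density by hand (a sketch; the Beta normalizing constant is omitted since it drops out of the derivative):

```python
from sympy import symbols, exp, log, factorial, diff, simplify

lam, x, a, b = symbols('lambda x a b', positive=True)

# Unnormalized Beta(a, b) prior on lambda times the Poisson(lambda) pmf,
# written out by hand rather than via sympy.stats.
numer = lam**(a - 1) * (1 - lam)**(b - 1) * lam**x * exp(-lam) / factorial(x)

# d/d(lambda) of the log joint density; algebraically equivalent to the
# rational expression printed in the session above.
dlogp = simplify(diff(log(numer), lam))
```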
I suppose the question is which (implicit) solution Theano would come up with using autodiff. @jsalvatier is there any way to get the complexity of that for this case? |
I get the not very informative result of:

'Flatten{1}(Elemwise{Composite{[Composite{[Composite{[Composite{[Composite{[sub(add(i0,

I'm working on graphing this. |
Try |
Also make sure that this comes after you've run all of the optimizations. Theano does simplification too; it just tends not to be quite as algebraically focused. The syntactically easiest way to do this is to get the fgraph object after it has been made into a function:

from theano import *
x = tensor.matrix('x')
y = 3 + x**2 + x + 1
f = function([x], y)
theano.printing.debugprint(f.maker.fgraph) |
The awesomest solution would be to use SymPy to generate the algebraic result and simplify algebraically, then print to Theano and use Theano to optimize numerically (e.g. inplace operations, constant folding). They optimize in very different ways and it'd be great to have a practical example where they work together. This was the original motivation behind the |
That's actually what I was thinking too. I'm still having trouble with pydot :( |
You only need to call pydotprint and debugprint like this:
Also, to disable the fusion of elemwise ops so you can see the operations more clearly, you can use this Theano flag: optimizer_excluding=local_elemwise_fusion |
Any development on this? If you provide comparison pymc code I can handle the Theano analysis side. |
Hey, I've been super busy, so I haven't done work on this, but I do have an example.
Gives:

'Flatten{1}(Elemwise{Composite{[Composite{[Composite{[Composite{[Composite{[sub(add(i0, i1), add(i2, i3))]}(Switch(i0, i1, i2), Switch(i3, i4, i2), Switch(i0, i5, i6), Switch(i3, i7, i2))]}(i0, true_div(i1, i2), i3, i4, true_div(i5, i2), i6, i7, true_div(i8, i9))]}(i0, i1, i2, i3, i4, add(i5, i6), i7, i8, add(i5, i9), sub(i7, i2))]}(Composite{[AND(i0, GT(i1, i2))]}(i0, i1, i2), i3, i1, i2, Composite{[Composite{[Composite{[Composite{[AND(AND(i0, i1), GT(i2, i3))]}(AND(i0, i1), GT(i2, i3), i4, i3)]}(AND(i0, i1), LE(i2, i0), i3, i4, i5)]}(i0, GE(i1, i2), i1, i3, i2, i4)]}(i0, i1, i2, i4, i5), i6, i4, i7, i8, i5)]}}(TensorConstant{1}, l, TensorConstant{0}, x, a, b, TensorConstant{-1.0}, TensorConstant{1.0}, TensorConstant{0.0}))'

I can translate these into Theano if that's better. I'm thinking that this is complex enough that we probably wouldn't want to put this work into pymc3. If it's worth pursuing, I think it would be better to put it into Theano as better optimizations. That way other people can benefit anyway. |
I'm thinking again about the Theano-SymPy-Theano crossover, particularly about what's possible and at what level of generality it should be done. I think that there is value here, but negotiating the best crossover point is complex.

It's clear that, at least for this problem, a fair amount can be gained by involving SymPy. The Theano result isn't very intuitive and appears to be a more complex solution. I assume that this complexity affects runtime.

One option is to take the result you present above, translate it to SymPy, see what it can do, and then translate back. This is good because it's general, applies to all of Theano, and doesn't require any legwork from the pymc folks. It's bad though because the form presented here is fairly low-level, potentially limiting SymPy's effectiveness. In general Theano representations are mathematically slightly lower-level than SymPy's. It feels weird to travel up the abstraction stack rather than down.

For this particular project I'm curious whether I can start from the Normal, Beta, etc. layer of random variables. I wonder if it is reasonable to implement these as standard Theano Ops. That way there would be a clear high-level transition over to SymPy. It would also mean that there was a single point of truth for statistical information.
Ah, well, maybe it's just not installed properly
Any tips? |
I'm not sure why quadpotential is giving you a problem; I'll check that out. I don't think the second problems are related to the first. |
I think I've solved the second problem; can you update and try again? |
For the first problem, it's related to not having https://pypi.python.org/pypi/scikits.sparse . You could install that, or I'll fix it in an hour or two. |
|
I think I've fixed the problem with importing (it did something conditional on scikits.sparse being available). |
By the way I'm running on Ubuntu 12.04 with the Enthought Python Distribution 7.3.2 |
If I pip install numdifftools on my own I can import pymc cleanly. I rarely use setup.py. Does it handle dependencies? |
Yes it does, and it's already listed there. I'm not sure what would cause that.
(Maybe I should also list pandas since the examples that load external data require it) |
I had been using distutils to do the install, but I switched to using setuptools. Then if you tell pip to install from source, it seems to work better:
|
SymPy now has a fairly natural Theano printer which supports dimensionality.

In [1]: from sympy.stats.crv_types import ExponentialDistribution
In [2]: from sympy.printing.theanocode import theano_code, theano
In [3]: from sympy import *
In [4]: rate = Symbol('lambda', positive=True)
In [5]: x = Symbol('x', real=True)
In [6]: ExponentialDistribution(rate) # This is a SymPy object
Out[6]: ExponentialDistribution(lambda)
In [7]: ExponentialDistribution(rate)(x) # This is a SymPy expression
Out[7]:
-λ⋅x
λ⋅ℯ
In [8]: theano_code(ExponentialDistribution(rate)(x)) # This is a Theano var
Out[8]: Elemwise{mul,no_inplace}.0
In [10]: theano_code(ExponentialDistribution(rate)(x), broadcastables={x: (False,), rate: (True,)}) # This is a Theano tensor var
Out[10]: Elemwise{mul,no_inplace}.0
In [11]: theano.printing.debugprint(_)
Elemwise{mul,no_inplace} [@A] ''
|lambda [@B]
|Elemwise{exp,no_inplace} [@C] ''
|Elemwise{mul,no_inplace} [@D] ''
|InplaceDimShuffle{x} [@E] ''
| |TensorConstant{-1} [@F]
|x [@G]
|lambda [@B]

I'm not sure if this is of any use to you all (I suspect that you've moved beyond this idea). I'm not sure how this would be integrated into your system, but it does supply a clean transition and opens up the possibility of using SymPy's simplification (both stats-specific and general algebraic simplification).

In [20]: simplify(log(ExponentialDistribution(rate)(x)))
Out[20]: -λ⋅x + log(λ)
In [28]: theano.printing.debugprint(theano_code(_, broadcastables={x: (False,), rate: (True,)}))
Elemwise{add,no_inplace} [@A] ''
|Elemwise{mul,no_inplace} [@B] ''
| |InplaceDimShuffle{x} [@C] ''
| | |TensorConstant{-1} [@D]
| |x [@E]
| |lambda [@F]
|Elemwise{log,no_inplace} [@G] ''
|lambda [@F]

I recently wrote about the benefits of SymPy and Theano integration here. If we can find a motivating use case I'd be happy to support it from the SymPy and Theano ends. |
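The simplification shown above can also be checked without sympy.stats by writing the exponential density directly (a sketch; the symbol assumptions matter for the log expansion):

```python
from sympy import symbols, exp, log, expand_log, simplify

lam, x = symbols('lambda x', positive=True)

# Exponential(lambda) density, written out by hand.
pdf = lam * exp(-lam * x)

# log-density; with positive symbols this expands to log(lambda) - lambda*x,
# matching the -λ⋅x + log(λ) result above.
logp = expand_log(log(pdf))
```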
This is pretty cool Matthew. I think we won't use this right now, but it might come in handy in the near future. |
I agree. It also seems like Theano is working on the Theano -> SymPy conversion, which would be easier to use in pymc3 for e.g. better simplification (as your blog post clearly shows). It would be easy enough to tie in SymPy on a per-need basis now that one can do the SymPy -> Theano conversion if, for example, someone requires a distribution that's only present there. Maybe I can cook up an example. |
I don't know where you got the idea that we are working on this. I just made a ticket so we don't forget it. I haven't heard anyone say they will work on it, and I have other things to do first. |
My bad. Let me restate that there is a chance this will be possible in Theano in the unspecified future. |
SymPy (http://sympy.org/en/index.html) is a Python library for symbolic mathematics.
My initial motivation for looking at SymPy resulted from #172 and #173. Instead of recoding all probability distributions, samplers etc in Theano, maybe we could just use the ones provided by sympy.stats (http://docs.sympy.org/dev/modules/stats.html).
For this to work we would need to convert the sympy computation graph to a theano one. It seems that there is some work showing that this is possible (https://github.com/nouiz/theano_sympy)
Looking at sympy (and sympy.stats) more closely it seems that there are potentially more areas where integrating this could help. Maybe this would give the best of both worlds: "Theano focuses more on tensor expressions than Sympy, and has more machinery for compilation. Sympy has more sophisticated algebra rules and can handle a wider variety of mathematical operations (such as series, limits, and integrals)."
There is additional discussion here: nouiz/theano_sympy#1.
Copy pasting some chunks from @mrocklin response to move the discussion over here:
Overlap
There are some obvious points of overlap between the various projects
What is the relationship with statsmodels? They also have a home-grown internal algebraic system. My guess is that if everyone were to unite under one algebraic system there would be some pleasant efficiencies. I obviously have a bias about what that algebraic system should be :)
Derivatives
Both Theano and SymPy provide derivatives which, apparently, you need. SymPy provides analytic ones; Theano provides automatic ones. My suggestion would be to use SymPy if it works and fall back on Theano if it doesn't. You don't need SymPy.stats for this (in case you didn't want to offload your distributions work); SymPy.core would be just fine.
Other benefits
In general the benefits to using symbolic systems tend to be unexpected. SymPy can provide lots of general aesthetic fluff like awesome pretty printing, symbolic simplification, C/Fortran code snippet generation, etc....