Leveraging SymPy for PyMC3 #178

Closed

twiecki opened this issue Feb 19, 2013 · 60 comments

@twiecki
Member

twiecki commented Feb 19, 2013

SymPy (http://sympy.org/en/index.html) is a Python library for symbolic mathematics.

My initial motivation for looking at SymPy resulted from #172 and #173. Instead of recoding all probability distributions, samplers etc in Theano, maybe we could just use the ones provided by sympy.stats (http://docs.sympy.org/dev/modules/stats.html).

For this to work we would need to convert the SymPy compute graph to a Theano one. It seems there is some work showing that this is possible (https://github.com/nouiz/theano_sympy).

Looking at sympy (and sympy.stats) more closely it seems that there are potentially more areas where integrating this could help. Maybe this would give the best of both worlds: "Theano focuses more on tensor expressions than Sympy, and has more machinery for compilation. Sympy has more sophisticated algebra rules and can handle a wider variety of mathematical operations (such as series, limits, and integrals)."

There is additional discussion here: nouiz/theano_sympy#1.

Copy-pasting some chunks from @mrocklin's response to move the discussion over here:

Overlap

There are some obvious points of overlap between the various projects

  • PyMC has distributions and SymPy has distributions (https://github.com/sympy/sympy/blob/master/sympy/stats/crv_types.py). SymPy doesn't currently have infinite discrete random variables like Poisson though. This could be fixed but is a current failing. SymPy's support for analytic solution of infinite sums is poor so this was a low priority. It seems like you're not really looking for that though.
  • PyMC has implemented some special functions that could be in Theano. I would encourage you to push these upstream. They'll probably get some useful attention from the Theano crowd.
  • Looking at the pymc2 readme it appears that you have created some sort of symbolic algebra class structure (you add two pymc.Normal objects). Presumably SymPy.core might be of use here.

What is the relationship with statsmodels? They also have a home-grown internal algebraic system. My guess is that if everyone were to unite under one algebraic system there would be some pleasant efficiencies. I obviously have a bias about what that algebraic system should be :)

Derivatives

Both Theano and SymPy provide derivatives which, apparently, you need. SymPy provides analytic ones, Theano provides automatic ones. My suggestion would be to use SymPy if it works and fall back on Theano if it doesn't work. You don't need SymPy.stats for this (in case you didn't want to offload your distributions work.) SymPy.core would be just fine.
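
For concreteness, a minimal sketch (mine, not from the discussion) of the analytic route for a Normal log-density; the Theano fallback would instead call theano.tensor.grad on the equivalent compiled graph.

from sympy import Symbol, exp, log, sqrt, pi, diff, simplify

x, mu, sigma = Symbol('x'), Symbol('mu'), Symbol('sigma', positive=True)
logp = log(exp(-(x - mu)**2 / (2*sigma**2)) / (sqrt(2*pi)*sigma))

# Analytic derivative of the log-density with respect to the mean.
dlogp_dmu = simplify(diff(logp, mu))
print(dlogp_dmu)   # (x - mu)/sigma**2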

Other benefits

In general the benefits to using symbolic systems tend to be unexpected. SymPy can provide lots of general aesthetic fluff like awesome pretty printing, symbolic simplification, C/Fortran code snippet generation, etc....
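
A tiny illustration of that "fluff" (my own example, not from the thread): pretty printing, simplification, and C code generation from a single SymPy expression.

from sympy import symbols, sin, cos, simplify, ccode, pprint

x = symbols('x')
expr = (sin(x)**2 + cos(x)**2) * (x**2 + 2*x + 1) / (x + 1)

pprint(expr)                  # 2-D pretty printing
print(simplify(expr))         # x + 1
print(ccode(simplify(expr)))  # C snippet: x + 1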

@jsalvatier
Member

Very interesting. Theano_Sympy doesn't provide examples that make it clear how one could take a calculation in sympy and then convert it to a theano calculation, so it's a little hard to evaluate.

I'm not sure what he means when he says 'you have created some sort of symbolic algebra class structure (you add two pymc.Normal objects)'. Is that just

x = Normal('x', 0, 1)
y = Normal('y', 0, 1)
z = x + y

?

@mrocklin

Yes, that is what I mean.

@jsalvatier
Member

Okay that makes sense. Right now, in pymc3, theano takes care of that stuff.

The pretty printing etc does sound pretty nice.

If I wanted to calculate the probability density of a Normal distribution given some value for the mean, variance and data value using the SymPy distribution but do that calculation in Theano, how would I do that?

@mrocklin

It looks like theano_sympy was designed to take Theano expressions to SymPy, simplify them, and then send them back again. This is not your application.

The function that seems most useful in your case is the aptly named sympy_to_theano. It requires a sympy expression (easy to obtain) and information about the resulting theano variables (simple). Theano variables carry more information than SymPy variables do; in particular they have a dtype and a broadcastable pattern. You're being asked to provide this.

Here is an example using sympy.stats and theano

In [1]: from sympy.stats import *
In [2]: from sympy import Symbol, pprint, pi
In [3]: import theano
In [4]: X = Normal('x', 2, 3)
In [5]: x = Symbol('x')
In [6]: expr = density(X)(x)
In [7]: pprint(expr)
√2⋅exp(-(x - 2)²/18)
────────────────────
        6⋅√π



In [8]: from graph_translation import sympy_to_theano

In [9]: var_map = {'x': ('float64', (False, False))}  # x is a rank 2 tensor of floats
In [10]: texpr = sympy_to_theano(expr, var_map)  # oops, SymPy.pi doesn't have a Theano analog
KeyError: <class 'sympy.core.numbers.Pi'>

In [12]: expr = expr.subs(pi, pi.evalf())  # replace Pi with the float equivalent in SymPy
In [13]: texpr = sympy_to_theano(expr, var_map)  # try this again

In [14]: theano.printing.debugprint(texpr)
Elemwise{mul,no_inplace} [@A] ''   
 |InplaceDimShuffle{x,x} [@B] ''   
 | |TensorConstant{0.0940315946937} [@C]
 |InplaceDimShuffle{x,x} [@D] ''   
 | |Elemwise{pow,no_inplace} [@E] ''   
 |   |TensorConstant{2} [@F]
 |   |TensorConstant{0} [@G]
 |Elemwise{exp,no_inplace} [@H] ''   
   |Elemwise{mul,no_inplace} [@I] ''   
     |InplaceDimShuffle{x,x} [@J] ''   
     | |TensorConstant{-1} [@K]
     |Elemwise{pow,no_inplace} [@L] ''   
       |Elemwise{add,no_inplace} [@M] ''   
       | |InplaceDimShuffle{x,x} [@N] ''   
       | | |TensorConstant{-2} [@O]
       | |x [@P]
       |InplaceDimShuffle{x,x} [@Q] ''   
         |TensorConstant{2} [@R]

@jsalvatier
Member

It looks like I can just say:

distribution = Normal(0,1)
sympy_to_theano(distribution.pdf(x))

I'm curious whether that interacts well with things like indexing. For example, if I say

mu = Symbol('mu')
distribution = Normal(mu[1:10], 1)
sympy_to_theano(distribution.pdf(x))

will this work well?

@mrocklin

In your example above is mu intended to be a scalar or a vector?

In SymPy Normal returns a scalar normal random variable. There was work on multivariate normal random variables a while ago but this isn't in the main branch.

@mrocklin

Notes on my example above: you would want to extend the mapping variable in theano_to_sympy to handle the sympy.core.numbers.Pi class. This is likely to be needed for other classes too. For example, Theano may not have a Gamma class or other such mathematical functions.

@mrocklin

The example above doesn't really use SymPy for anything other than a database of distributions. The random variable formalism allows the creation of some interesting problems.

@jsalvatier
Member

I meant mu to be a vector. It's okay that Normal is the univariate distribution in the sense that it doesn't deal with covariances. Can it describe many random variables that are each normally distributed with different means and variances (specified by a vector of means and variances)?

@mrocklin

For this I would use a collection of Normals. They can be manipulated as normal symbolic variables to form products, tuples, etc....

>>> from sympy import *
>>> from sympy.stats import *
>>> xs = symbols('x:10')
>>> mus = symbols('mu:10')
>>> Xs = [Normal(x, mu, 1) for x, mu in zip(xs, mus)]
>>> pprint(simplify(E(sum(X**2 for X in Xs))))
μ₀² + μ₁² + μ₂² + μ₃² + μ₄² + μ₅² + μ₆² + μ₇² + μ₈² + μ₉² + 10

@mrocklin

Or if you wanted the full pdf of the joint probability space you could access it as follows

In [1]: from sympy.stats import *

In [2]: from sympy import *

In [4]: mus = symbols('mu:10', real=True)

In [5]: Xs = [Normal('x_%d'%i, mu, 1) for i, mu in enumerate(mus)]

In [6]: pprint(pspace(Tuple(*Xs)).pdf)
exp(-(x₀ - μ₀)²/2 - (x₁ - μ₁)²/2 - ⋯ - (x₉ - μ₉)²/2)
─────────────────────────────────────────────────────
                        32⋅π⁵

@jsalvatier
Member

Okay, this makes pretty good sense to me.

Looks like SymPy could be good as a source of distribution math, but probably not a good replacement for the symbolic algebra part, since it's not geared to be NumPy-like, which I think is an important feature.

So I think we'd probably use SymPy by

  1. converting theano variables to sympy symbols
  2. doing sympy calculations for distributions
  3. converting back to theano variables

What are other people's thoughts?

@twiecki
Member Author

twiecki commented Feb 19, 2013

@jsalvatier I don't see where we would need 1.

Couldn't the model specification consist of sympy symbols which then get translated to theano for computation?

Also, I think there is a non-trivial probability that this approach will lead to some unexpected benefits further down the line.

@mrocklin

What numpy-like features do you need? SymPy has a dense Matrix type and a set of purely symbolic MatrixExpr types. Each of these is strictly a rank-2 tensor. Support for higher-rank tensors in SymPy exists but is poorly integrated.

SymPy.stats had a multivariate branch a while back that generated matrix expressions. I could resurrect it if there were a definite need. If you're interested I would look at this blog post; I recommend skipping over the beginning and just scrolling down to where the code starts.

@jsalvatier
Member

Perhaps you're right.

I think the part that makes me nervous is that sympy wasn't designed to do computation, so there can be unknown pitfalls.

For example, does sympy have a notion of multidimensional arrays? Does it do broadcasting? How about indexing and advanced indexing into vectors? Those seem pretty important to me.

It's also adding another interface between packages, and thus more possibility for problems. Sympy computations need to convert pretty well into theano graphs.

@mrocklin

Definitely you're correct that trying to represent computations with SymPy is a recipe for disappointment. Rather, SymPy is good for representing mathematical models. In my experience there is usually a clear distinction between the model and the computation. Does such a distinction exist in your application?

You're definitely also correct that adding SymPy will bring a bunch of issues with it. SymPy.stats really isn't battle-tested. I'm confident that if you go this route you'll expose bugs. I'm around to support those but it adds inertia to the development cycle.

@jsalvatier
Member

You're right that there is such a distinction, and I think the separation is pretty clean in pymc3 too.

Perhaps a better way to say it is that often things like indexing and advanced indexing come in at the model level here. For example, you will want to specify things like

group = [0,0,1,1,2, 0]
x ~ Normal(0, 1, shape = (num_of_groups,) )
y ~ Normal(x[group], 1)

Meaning that the mean of the prior distribution for y_i is the value of x_{group_i}.

Often these are treated in a handwavy fashion in mathematics; for example, broadcasting is not a commonly discussed notion in math. So I would be somewhat surprised to find features like these well developed in sympy, or easy to implement there. I would love to be surprised though.
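
For reference, the indexing itself is straightforward on the Theano side; a minimal sketch of the x[group] broadcasting pattern (hypothetical names, no pymc involved):

import numpy as np
import theano
import theano.tensor as tt

group = np.array([0, 0, 1, 1, 2, 0])
x = tt.vector('x')                    # one latent value per group
mu_y = x[group]                       # mean of y_i is x[group_i]
f = theano.function([x], mu_y)
print(f(np.array([1.0, 2.0, 3.0])))   # [ 1.  1.  2.  2.  3.  1.]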

@jsalvatier
Member

Something along these lines does seem to exist in SymPy, but it doesn't look that well developed.

@mrocklin

Yeah, I would steer clear of Indexed. I would look at Matrix if you don't need ranks greater than 2.

Your intuition is correct that SymPy would not cleanly allow that syntax. You would have to resort to list comprehensions, map, for-loops, etc....

Fancy indexing has been on our issues list for matrices for a while though. It's certainly conceivable that we would support this in the future. If you spoke up on the mailing list saying that you needed it for your project it's likely that someone would implement it relatively soon. We're in GSoC application mode right now so we have a number of aspiring contributors looking to pounce on things like this.

I wouldn't support adding a shape keyword to sympy.stats but you could wrap something fairly easily. The following SymPy code would serve as the basis for a decent recipe

In [1]: from sympy.stats import *
In [2]: groups = [0, 0, 1, 1, 2, 0]
In [3]: numgroups = len(set(groups))
In [4]: x = Matrix(1, numgroups, lambda i,j: Normal('x_%d'%j, 0, 1))
In [5]: x
Out[5]: [x₀  x₁  x₂  x₃  x₄]
In [6]: y = Normal(x[1, groups], 1)  # This fails due to lack of fancy indexing
IndexError: Invalid index a[[0, 0, 1, 1, 2, 0]]

In general though, if you switch to SymPy as a modeling input language your interface will change. You'll get a huge amount of general mathematical formalism but will lose the specific features you've cooked into your own language. For some of those features you can add them to SymPy or ask the SymPy community to add them for you. I'm sure that some language features would be difficult to support though.
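
A sketch of that list-comprehension workaround (hypothetical wrapper code, not a sympy.stats API; note that random-variable parameters were reported as buggy at the time, as discussed below):

from sympy.stats import Normal

groups = [0, 0, 1, 1, 2, 0]
numgroups = len(set(groups))
x = [Normal('x_%d' % j, 0, 1) for j in range(numgroups)]

# The per-observation means are just the group variables indexed by a plain Python list.
means = [x[g] for g in groups]
# Building y_i ~ Normal(x[group_i], 1) on top of this relies on random-variable
# parameters, which sympy.stats accepted but did not yet handle robustly.
y = [Normal('y_%d' % i, m, 1) for i, m in enumerate(means)]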

@twiecki
Member Author

twiecki commented Feb 19, 2013

Hm, the array broadcasting is a real issue.

One other idea: if we mainly just wanted to use sympy.stats for its distributions, would it be possible to borrow those and replace the sympy symbols with Theano ones (e.g. sympy.sin -> Theano.sin)?

@mrocklin

The theano_sympy project and in particular the sympy_to_theano function was designed to handle that transformation. You may want to rewrite it but the core pieces for how to break down a sympy graph and how to construct a theano graph are there.

@mrocklin

If you just wanted the distributions you might want to skip the random variable and probability space formalism. Functions like Normal provide a random variable. You probably would want the classes like NormalDistribution. They're less powerful but far simpler.
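
A minimal sketch of that distribution-only route, assuming the distribution objects are callable on a value to give the pdf (as the ExponentialDistribution example later in this thread shows):

from sympy import Symbol, log, simplify
from sympy.stats.crv_types import NormalDistribution

x, mu, sigma = Symbol('x'), Symbol('mu'), Symbol('sigma', positive=True)
pdf = NormalDistribution(mu, sigma)(x)   # the pdf as a plain SymPy expression
logp = simplify(log(pdf))                # a "database of distributions" entry, no RV formalism
print(logp)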

@jsalvatier
Member

With this approach we don't have to invest too much. We could even have some theano, some Sympy based distributions.

@twiecki
Member Author

twiecki commented Feb 20, 2013

@mrocklin I didn't find NormalDistribution, can you link me to it? Also, does Normal use NormalDistribution internally?

@mrocklin

https://github.com/sympy/sympy/blob/master/sympy/stats/crv_types.py#L1632

NormalDistribution is a distribution and contains information like the pdf. It has the ability to compute expectations of expressions, the cdf, etc....

Normal is a random variable. It acts in every way like a SymPy variable and handles interaction with the rest of the SymPy language. Whenever you query a SymPy expression with random variables, e.g. P(X > 2*Y), all of the distributions of all the random variables within that expression are collected and considered collectively.

If you want SymPy to do statistical simplifications for you then use Normal. If you just want a database of distributions then use NormalDistribution.

Using Normal and SymPy gives you simplifications like the following

>>> X = Normal('x', 0, 1)
>>> Y = Normal('y', 0, 1)
>>> Z = Normal('z', 0, 1)
>>> density(X**2 + Y**2 + Z**2)
ChiSquaredDistribution(3)

But it would force you to use SymPy as your modeling language, which may not be appropriate.

@twiecki
Member Author

twiecki commented Feb 20, 2013

Oh, that's really neat! Although it doesn't seem to work on my install; the last call just returns a lambda with a nasty integral. Maybe a version issue (I'm running 0.7.2; this is the sympy master branch, I suppose?).

Do you think this could figure out conjugate priors? For example:

Bern(x | \theta) * Beta(\theta) / Integral(Bern(x | \lambda) * Beta(\lambda) d\lambda) = Beta()

Where the result is another Beta distribution with updated parameters.

There is an explicit example at https://en.wikipedia.org/wiki/Conjugate_prior

@mrocklin

In principle, yes that should be solvable. In practice no, that is not yet solvable. I've never used sympy.stats with random parameters so this exposes some bugs which I'm now fixing. Obviously this is an important application though.

In general sympy.stats transforms random expressions like what you have above into integrals. Sympy.integrate is generally surprisingly good at solving integrals. Of course, there are limits. There is nothing stopping us from generating the relevant integrals; I can't speak to what will happen after that.

The statistical simplification is in a separate branch and not yet in master.

@twiecki
Member Author

twiecki commented Feb 20, 2013

OK, I see that it's not practical. But just to finish this line of thought: the trick with conjugate priors is that no integration is actually required. Just by simplifying the numerator and denominator one can see that the result is again a beta distribution, but with different parameters. Not sure if that changes anything.
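
For a concrete instance of that simplification (standard Beta-Bernoulli algebra, not SymPy output): with theta ~ Beta(a, b) and a single observation x in {0, 1},

Bern(x | theta) * Beta(theta | a, b) ∝ theta^x * (1 - theta)^(1 - x) * theta^(a - 1) * (1 - theta)^(b - 1)
                                     = theta^(a + x - 1) * (1 - theta)^(b - x)

which is the kernel of Beta(a + x, b + 1 - x); the denominator is fixed by recognizing the Beta form rather than by actually integrating.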

@mrocklin

Mixing finite and continuous RVs is a bit buggy. Here is an example with strictly normal random variables.

>>> from sympy.stats import *
>>> from sympy import *
>>> mu = Normal('mu', 2, 3)
>>> x = Normal('x', mu, 1)
>>> y, z = Symbol('y'), Symbol('z')
>>> simplify(density(mu, Eq(x, y))(z))  # density of mu given x == y
√5⋅exp(-9⋅y²/20 + y⋅z - y/5 - 5⋅z²/9 + 2⋅z/9 - 1/45)
─────────────────────────────────────────────────────
                        3⋅√π

Internally it sets up and solves the following integral

In [17]: simplify(density(mu, Eq(x, y), evaluate=False)(z))  # density of mu given x == y
Out[17]: 
 oo                                                               
  /                                                               
 |                                                                
 |                      2          2                              
 |               (x - z)    (z - 2)                               
 |             - -------- - --------                              
 |                  2          18                                 
 |            e                     *DiracDelta(x - y)            
 |  ----------------------------------------------------------- dx
 |        oo  oo                                                  
 |         /   /                                                  
 |        |   |                                                   
 |        |   |            2          2                           
 |        |   |     (x - z)    (z - 2)                            
 |        |   |   - -------- - --------                           
 |        |   |        2          18                              
 |        |   |  e                     *DiracDelta(x - y)         
 |  6*pi* |   |  ---------------------------------------- dx dz   
 |        |   |                    6*pi                           
 |        |   |                                                   
 |       /   /                                                    
 |       -oo -oo                                                  
 |                                                                
/                                                                 
-oo                                                               

This is all in a branch.

@mrocklin

Is it correct to assume you meant to say "The denominator doesn't come into it" ? Otherwise I'm more confused than I thought :)

I suspect that the d/dtheta log(p(x | theta) * p(theta)) term might be algebraically simplifiable. Is the cost of your computation sensitive to the complexity of this term?

@jsalvatier
Member

Heh, yep. Looks like I was confused and not you!

Yes, simplifying either the log posterior (logp) or the derivative (dlogp) would be useful. My impression is that right now dlogp takes longer to compute than logp, but not by a lot.

@mrocklin

This is obviously not my field but I've been playing with this a while. Here is a problem with a Poisson distribution whose rate parameter is distributed with a Beta distribution.

In this particular case it looks like dlogp is cheaper than logp.

In [1]: from sympy import *
In [2]: from sympy.stats import *
In [3]: 
In [3]: x = Symbol('x', positive=True)
In [4]: l = Symbol('lambda', positive=True)
In [5]: 
In [5]: rate = Beta(l, 2, 3)
In [6]: X = Poisson(x, rate)
In [7]: numer = density(X, rate)(X.symbol) * density(rate)(rate.symbol)
In [8]: numer
Out[8]: 12⋅λ⋅λ**x⋅(-λ + 1)**2⋅exp(-λ)/x!

In [9]: log(numer)
Out[9]: log(12⋅λ⋅λ**x⋅(-λ + 1)**2⋅exp(-λ)/x!)

In [10]: simplify(log(numer))
Out[10]: -λ + x⋅log(λ) + log(λ) + log((λ**2 - 2⋅λ + 1)/x!) + log(12)

In [11]: simplify(log(numer).diff(l))
Out[11]: 
   2                    
- λ  + λx + 4λ - x - 1
────────────────────────
       λ⋅(λ - 1)        

If this is interesting at all let me know. If you can provide a more likely example (that doesn't include array broadcasting) let me know.

@mrocklin

It looks like the fully general solution to Poisson parametrized by a Beta is actually pretty simple

In [25]: rate = Beta(l, a, b)

In [26]: X = Poisson(x, rate)

In [27]: numer = density(X, rate)(X.symbol) * density(rate)(rate.symbol)

In [28]: simplify(log(numer).diff(l))
Out[28]: 
                 2                  
aλ - a + bλ - λ  + λx - λ - x + 1
────────────────────────────────────
             λ⋅(λ - 1)              
In [29]: simplify(log(numer).diff(l)).count_ops()
Out[29]: 14

In [30]: print ccode(simplify(log(numer).diff(l)))
(a*lambda - a + b*lambda - pow(lambda, 2) + lambda*x - lambda - x + 1)/(lambda*(lambda - 1))

@twiecki
Member Author

twiecki commented Feb 22, 2013

I suppose the question is which (implicit) solution Theano would come up with using autodiff. @jsalvatier is there any way to get the complexity of that for this case?

@jsalvatier
Member

I get the not very informative result of:

'Flatten{1}(Elemwise{Composite{[Composite{[Composite{[Composite{[Composite{[sub(add(i0,
i1), add(i2, i3))]}(Switch(i0, i1, i2), Switch(i3, i4, i2), Switch(i0,
i5, i2), Switch(i3, i6, i7))]}(i0, true_div(i1, i2), i3, i4,
true_div(i5, i2), true_div(i6, i7), i8, i9)]}(i0, add(i1, i2), i3, i4,
i5, i6, add(i1, i7), sub(i8, i3), i8,
i9)]}(Composite{[Composite{[Composite{[Composite{[AND(AND(i0, i1),
GT(i2, i3))]}(AND(i0, i1), GT(i2, i3), i4, i3)]}(AND(i0, i1), LE(i2,
i0), i3, i4, i5)]}(i0, GE(i1, i2), i1, i3, i2, i4)]}(i0, i1, i2, i3,
i4), i5, i3, i1, i2, Composite{[AND(i0, GT(i1, i2))]}(i0, i1, i2), i6,
i4, i7, i8)]}}(TensorConstant{1}, l, TensorConstant{0}, a, b,
TensorConstant{-1.0}, x, TensorConstant{1.0}, TensorConstant{0.0}))'

I'm working on graphing this

@mrocklin

Try theano.printing.pydotprint

@mrocklin

Also make sure that this comes after you've run all of the optimizations. Theano does simplification too, it just tends not to be quite as algebraically focused. The syntactically easiest way to do this is to get the fgraph object after it has been made into a function.

import theano
from theano import tensor, function

x = tensor.matrix('x')
y = 3 + x**2 + x + 1
f = function([x], y)
theano.printing.debugprint(f.maker.fgraph)  # the graph after optimization

@mrocklin

The awesomest solution would be to use SymPy to generate the algebraic result and simplify algebraically, then print to Theano and use Theano to optimize numerically (e.g. inplace operations, constant folding). They optimize in very different ways and it'd be great to have a practical example where they work together. This was the original motivation behind the theano_sympy code sprint.

@jsalvatier
Member

That's actually what I was thinking too. I'm still having trouble with pydot :(

@nouiz
Contributor

nouiz commented Feb 23, 2013

You only need to call pydotprint and debugprint like this:

f = theano.function(...)
theano.printing.debugprint(f)
theano.printing.pydotprint(f)

Also, to disable the fusion of elemwise ops so you can see the operations more clearly, you can use this Theano flag:

optimizer_excluding=local_elemwise_fusion
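
For example, one way to pass the flag (a sketch; it assumes the environment variable is set before Theano is imported):

import os
os.environ['THEANO_FLAGS'] = 'optimizer_excluding=local_elemwise_fusion'
import theano  # Theano reads THEANO_FLAGS at import time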

@mrocklin

Any development on this? If you provide comparison pymc code I can handle the Theano analysis side.

@jsalvatier
Member

Hey, I've been super busy, so I haven't done work on this, but I do have an example.

from pymc import * 
from pymc.model import Variable 

m = Model()
a = Variable('a', (), 'float64', 1)
b = Variable('b', (), 'float64', 1) 
m.vars.append(a)
m.vars.append(b)

l = m.Var('l', Beta(a,b), ())
x = m.Var('x', Poisson(l), (), testval = 1)

lp0 = logp(m)
dlp0 = dlogp([l])(m)

lp1 = m.logp().f
dlp1 = m.dlogp([l]).f

from theano.printing import pydotprint 
from theano import * 
pydotprint(dlp1, 'dlogp.png')

pp(dlp1.maker.fgraph.outputs[0])

Gives

[image: dlogp.png — the pydotprint graph of dlogp]

and

'Flatten{1}(Elemwise{Composite{[Composite{[Composite{[Composite{[Composite{[sub(add(i0, i1), add(i2, i3))]}(Switch(i0, i1, i2), Switch(i3, i4, i2), Switch(i0, i5, i6), Switch(i3, i7, i2))]}(i0, true_div(i1, i2), i3, i4, true_div(i5, i2), i6, i7, true_div(i8, i9))]}(i0, i1, i2, i3, i4, add(i5, i6), i7, i8, add(i5, i9), sub(i7, i2))]}(Composite{[AND(i0, GT(i1, i2))]}(i0, i1, i2), i3, i1, i2, Composite{[Composite{[Composite{[Composite{[AND(AND(i0, i1), GT(i2, i3))]}(AND(i0, i1), GT(i2, i3), i4, i3)]}(AND(i0, i1), LE(i2, i0), i3, i4, i5)]}(i0, GE(i1, i2), i1, i3, i2, i4)]}(i0, i1, i2, i4, i5), i6, i4, i7, i8, i5)]}}(TensorConstant{1}, l, TensorConstant{0}, x, a, b, TensorConstant{-1.0}, TensorConstant{1.0}, TensorConstant{0.0}))'

I can translate these into Theano if that's better.

I'm thinking that this is complex enough that we probably wouldn't want to put this work into pymc3. If it's worth pursuing, I think it would be better to put it into Theano as better optimizations. That way other people can benefit anyway.

@mrocklin

I'm thinking again about the Theano-SymPy-Theano crossover, particularly about what's possible and at what level of generality it should be done. I think that there is value here but negotiating the best crossover point is complex.

It's clear that, at least for this problem, a fair amount can be gained by involving sympy. The Theano result isn't very intuitive and appears to be a more complex solution. I assume that this complexity affects runtime.

One option is to take the result you present above, translate it to SymPy, see what it can do, and then translate back. This is good because it's general, applies to all of Theano, and doesn't require any legwork by the pymc folks. It's bad though because the form presented here is fairly low-level, potentially limiting SymPy effectiveness. In general Theano representations are mathematically slightly lower-level than SymPy. It feels weird to travel up the abstraction stack rather than down.

For this particular project I'm curious if I can start from the Normal, Beta, etc... layer of random variables. I wonder if it is reasonable to implement these as standard Theano Ops. That way there would be a clear high-level transition over to SymPy. It would also mean that there was a single point of truth for statistical information.

pymc presents a good example problem for me; it has clear issues and clear potential benefits. I was looking for a more complex example that involved array indexing (which is where I expect SymPy to break down). I tried downloading your repository and running your example but ran into a few issues. Probably I'm just doing something silly.

pymc$ ipython
In [1]: run examples/ARM5-4.py
/home/mrocklin/workspace/pymc/pymc/step_methods/hmc.py in <module>()
      5 ''' 
      6 from numpy import floor
----> 7 from quadpotential import *
      8 from utils import *
      9 from ..core import *

AttributeError: 'module' object has no attribute 'QuadPotential_SparseInv'

Ah, well, maybe it's just not installed properly

mrocklin@notebook:~/workspace/pymc$ python setup.py install
/home/mrocklin/Software/epd-7.3-2-rh5-x86_64/lib/python2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'install_requires'
  warnings.warn(msg)
running install
running build
running build_py
error: package directory 'pymc/history' does not exist

Any tips?

@jsalvatier
Member

I'm not sure why quadpotential is giving you a problem. I'll check that out.

The second problem isn't related to the first, I don't think.

@jsalvatier
Member

I've solved the second problem I think, can you update and try again?

@jsalvatier
Member

For the first problem, it's related to not having https://pypi.python.org/pypi/scikits.sparse . You could install that, or I'll fix it in an hour or two.

@mrocklin

mrocklin@notebook:~/workspace/pymc$ git pull 
 setup.py |    4 ++--

pymc$ ipython
In [2]: import pymc
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/home/mrocklin/workspace/pymc/<ipython-input-2-5f262cfcb99b> in <module>()
----> 1 import pymc

/home/mrocklin/workspace/pymc/pymc/__init__.py in <module>()
      5 from trace import *
      6 from sample import *
----> 7 from step_methods import *
      8 from tuning import *
      9 

/home/mrocklin/workspace/pymc/pymc/step_methods/__init__.py in <module>()
      1 from compound import *
----> 2 from hmc import *
      3 from metropolis import *
      4 from gibbs import *

/home/mrocklin/workspace/pymc/pymc/step_methods/hmc.py in <module>()
      5 ''' 
      6 from numpy import floor
----> 7 from quadpotential import *
      8 from utils import *
      9 from ..core import *

AttributeError: 'module' object has no attribute 'QuadPotential_SparseInv'

sudo easy_install scikits.sparse gives a gcc compilation error.

scikits/sparse/cholmod.c:278:33: fatal error: suitesparse/cholmod.h: No such file or directory

@jsalvatier
Member

I think I've fixed the problem with importing (it did something conditional on scikits.sparse being available).

@mrocklin

In [1]: import pymc
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
/home/mrocklin/workspace/pymc/<ipython-input-1-5f262cfcb99b> in <module>()
----> 1 import pymc

/home/mrocklin/workspace/pymc/pymc/__init__.py in <module>()
      6 from sample import *
      7 from step_methods import *
----> 8 from tuning import *
      9 
     10 from debug import *

/home/mrocklin/workspace/pymc/pymc/tuning/__init__.py in <module>()
      1 from starting import find_MAP
----> 2 from scaling import approx_hess

/home/mrocklin/workspace/pymc/pymc/tuning/scaling.py in <module>()
      4 @author: johnsalvatier
      5 '''
----> 6 import numdifftools as nd
      7 import numpy as np
      8 from ..core import *

ImportError: No module named numdifftools

By the way I'm running on Ubuntu 12.04 with the Enthought Python Distribution 7.3.2

@mrocklin

If I pip install numdifftools on my own I can import pymc cleanly. I rarely use setup.py. Does it handle dependencies?

@jsalvatier
Member

Yes it does, and it's already listed there. I'm not sure what would cause that.

requires=['theano','numpy','scipy','numdifftools']

(Maybe I should also list pandas since the examples that load external data require it)

@jsalvatier
Member

I had been using distutils to do the install, but I switched to using setuptools. Then if you tell pip to install from source, it seems to work better:

sudo pip install pymc3/

@mrocklin

SymPy now has a fairly natural Theano printer which supports dimensionality.

In [1]: from sympy.stats.crv_types import ExponentialDistribution
In [2]: from sympy.printing.theanocode import theano_code, theano
In [3]: from sympy import *
In [4]: rate = Symbol('lambda', positive=True)
In [5]: x = Symbol('x', real=True)
In [6]: ExponentialDistribution(rate) # This is a SymPy object
Out[6]: ExponentialDistribution(lambda)
In [7]: ExponentialDistribution(rate)(x) # This is a SymPy expression
Out[7]: λ⋅exp(-λ⋅x)
In [8]: theano_code(ExponentialDistribution(rate)(x))  # This is a Theano var
Out[8]: Elemwise{mul,no_inplace}.0

In [10]: theano_code(ExponentialDistribution(rate)(x), broadcastables={x: (False,), rate: (True,)})  # This is a Theano tensor var
Out[10]: Elemwise{mul,no_inplace}.0

In [11]: theano.printing.debugprint(_)
Elemwise{mul,no_inplace} [@A] ''   
 |lambda [@B]
 |Elemwise{exp,no_inplace} [@C] ''   
   |Elemwise{mul,no_inplace} [@D] ''   
     |InplaceDimShuffle{x} [@E] ''   
     | |TensorConstant{-1} [@F]
     |x [@G]
     |lambda [@B]

I'm not sure if this is of any use to you all (I suspect that you've moved beyond this idea). I'm not sure how this would be integrated into your system, but it does supply a clean transition and opens up the possibility of using SymPy's simplification (both stats-specific and general algebraic simplification).

In [20]: simplify(log(ExponentialDistribution(rate)(x)))
Out[20]: -λx + log(λ)

In [28]: theano.printing.debugprint(theano_code(_, broadcastables={x: (False,), rate: (True,)}))
Elemwise{add,no_inplace} [@A] ''   
 |Elemwise{mul,no_inplace} [@B] ''   
 | |InplaceDimShuffle{x} [@C] ''   
 | | |TensorConstant{-1} [@D]
 | |x [@E]
 | |lambda [@F]
 |Elemwise{log,no_inplace} [@G] ''   
   |lambda [@F]

I recently wrote about the benefits of SymPy and Theano integration here
http://matthewrocklin.com/blog/work/2013/03/19/SymPy-Theano-part-1/
http://matthewrocklin.com/blog/work/2013/03/28/SymPy-Theano-part-2/

If we can find a motivating use case I'd be happy to support it from the SymPy and Theano ends.
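
As a hedged end-to-end sketch of that handoff (the theano_code call mirrors the transcript above; collecting the graph inputs via theano.gof is my assumption about the wiring, and the input order should be inspected before use):

from sympy import Symbol, log, simplify
from sympy.stats.crv_types import ExponentialDistribution
from sympy.printing.theanocode import theano_code
import theano

rate = Symbol('lambda', positive=True)
x = Symbol('x', real=True)

logp = simplify(log(ExponentialDistribution(rate)(x)))      # -lambda*x + log(lambda)
t_logp = theano_code(logp, broadcastables={x: (False,), rate: (True,)})

# Gather the non-constant inputs of the translated graph and compile it; Theano
# then applies its own numeric optimizations (inplace ops, fusion, constant folding).
inputs = [v for v in theano.gof.graph.inputs([t_logp])
          if not isinstance(v, theano.gof.Constant)]
f = theano.function(inputs, t_logp)
theano.printing.debugprint(f.maker.fgraph)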

@jsalvatier
Member

This is pretty cool Matthew. I think we won't use this right now, but it might come in handy in the near future.

@twiecki
Member Author

twiecki commented Apr 1, 2013

I agree. It also seems like Theano is working on the Theano -> SymPy conversion, which would be easier to use in pymc3 for, e.g., better simplification (as your blog post clearly shows).

It would be easy enough to tie in SymPy on an as-needed basis now that one can do the SymPy -> Theano conversion, if, for example, someone requires a distribution that's only present there. Maybe I can cook up an example.

@nouiz
Contributor

nouiz commented Apr 1, 2013

I don't know where you got the idea that we are working on this. I just made a ticket so we don't forget it. I haven't heard anyone say they will work on it, and I have other things to do before that.

@twiecki
Member Author

twiecki commented Apr 1, 2013

My bad. Let me restate: there is a chance this will be possible in Theano at some unspecified point in the future.

@twiecki twiecki closed this as completed May 26, 2013