New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Theano computation output #1905
Conversation
Recreated Theano variables with the same inputs are not equal. Recreated SymPy variables with the same inputs are. We cache the transformation to preserve SymPy behavior
What is the best way to shield this code in case Theano isn't available on the system? For example, I fully expect travis to choke on this PR. |
This work derives from a project I did with @nouiz |
Checkout how the tests in external are done. |
So, um, now that you know how the printers work... |
Ha! Maybe sometime, dot-printing isn't anywhere near the top of my stack though. In general though I still don't think that the printer system handles the dot situation well though. The output format isn't strictly hierarchical like all of our other printing systems. I don't think that they're a good fit. |
SymPy Bot Summary: 🔴 Failed after merging mrocklin/theano-print (6d6c6ca) into master (b498c0f). |
SymPy Bot Summary: ✳️ Passed after merging mrocklin/theano-print (7ddd3f9) into master (592024a). |
+1. I would be extremely happy to see a working and cleanly implemented lambdify. For the moment the original one is neither, while the experimental one might be working but is certainly not cleanly implemented. However this PR will fail on stuff like The way this was "solved" in |
We could create a SymPyOp in Theano that calls wraps up SymPy calls. They might also have a generic Python function wrapper; I'm not sure. I wouldn't mind failing on these cases though. This encourages users to simplify their expressions as much as possible in SymPy before moving down to a computational layer. E.g. "You're using Bras and Kets in numeric code? Maybe you should think about how those can be represented with vectors...." A numeric approach to quantum mechanics needs to think about this at some point; I don't mind forcing the issue sooner rather than later during a profiling stage. I'm afraid that wrapping up SymPy code will leave people wondering "why isn't this working faster?" I remember a conversation with you when you espoused the "garbage in, garbage out" philosophy. I think I usually take the opposite approach, requiring clean inputs. |
I do agree about that. I like "garbage in, garbage out" in cases when it simplifies the live of the developer and forces the user to understand the workflow (not always a selling point). I like the aforementioned "forcing of the issue" for similar reasons. However, as we already started talking about that, I would like to mention a tangential issue. For use in the prototyping stage (e.g. heavy use of the plotting module) one needs a:
Issue 2. is trivial if we use Anyway, this is not an issue in this PR. Again, I am +1 for merging. |
Hrm, automated and robust use by things like the plotting module is a good point. I hadn't though of non-human use. Maybe nouiz's suggestion about raising a warning when using slow SymPy operations is a good idea. In the particular case of plotting, do we want SymPy to depend on Theano? It's substantially rarer than numpy and I suspect that the compilation time wouldn't be sensible for rapid prototyping. |
It would be amusing to have the option to play with, but from practical/architectural/philosophical point of view I am against it. My reasons are slow compilation and a lot of added complexity. As I said, I actually prefer to see even |
By the way, some time ago you employed
|
Ever tried |
I have forgotten about |
Yes, I think my approach was
|
I'd not stress the "always" too much. For example it stops I'd have to search one but I remember that it happened. |
Yes, the code generation/translation from SymPy to numerical has grown into several modules. This definitely needs to be all documented nicely, especially since this is such a big use-case for SymPy.
Maybe not a case of exactly that, but issue 3706 is an example of the printer breaking because evalf breaks.
The example that @Krastanov gave of an Integral is in general very slow (on the order of half a minute to compute a single value). I think we would be better off shipping off to a compiled numerical integrator in that case. @mrocklin's argument that the user should try to simplify things isn't really helpful here, though. Yes, obviously the user should call |
But they should be aware that their call to integrate produced an unevaluated result. They should go and look for something. Maybe they convert it into a Sum and send it to Theano.sum or maybe we implement a QuadratureOp in Theano. Seems like a reasonable place for it. Mainly the opinion that I'm voicing is that when we run into a not-very-well-handled case the user should be alerted. I think in the long run this will provide valuable feedback to us, the developers, so that we can implement quality solutions. When working with sympy.stats I think I ran across the SciPy numerical quadrature routines. They call down to some battle-hardened Fortran code. Actually I think it'd be a pretty cool project to use SymPy to select good numerical quadrature routines. Many times when we can't compute integrals we can still compute derivatives which can be very useful for efficient adaptivity. I'll bet we could produce some pretty awesome results here. |
Wait, couldn't we describe numeric quadrature well with |
Here is a start of the wiki page https://github.com/sympy/sympy/wiki/Philosophy-of-Numerics-and-Code-Generation-in-SymPy
That is what I meant. |
Yes (with some caveats), but this will still fail if the expression contains special functions not supported by SciPy/Theano/Fortran/etc. My stance is that if the user (and we) should not expect magic (i.e. fast evaluation of all possible expressions). We have different tools for "all possible expression" and "fast evaluation". We will work to bridge the gap but given that we already cover the extremes I am rather happy. |
The quadrature bit is somewhat unrelated. I think it would make a cool project (and possibly small paper?) on its own. Computing integrals is harder than computing derivatives. Derivatives are useful for computing integrals, but just require bookkeeping. We're good at that; we can exploit this.
I'm not ready to give up on this as a goal yet. I think it can be rephrased in a more constructive and achievable manner. I think we need to create simple routes from specialized representations to general ones (e.g. Integral -> Sum via Quadrature). Ideally we have a few such routes and can select between them. Perhaps when people make new structures we ask them to provide routes to old structures (e.g. stats expressions -> Integral expressions). We rarely stress this. Perhaps we should. |
Let me show an own use case: I use lambdify for getting potential definitions Summarized this is the "fast evaluation of a (small) subset of expressions" |
Maybe this discussion reflects in some parts why Mathematica has |
@raoulb, I was unaware of these two different workflows in Mathematica. Could you mention them on the wiki when you have the time? https://github.com/sympy/sympy/wiki/Philosophy-of-Numerics-and-Code-Generation-in-SymPy |
SymPy Bot Summary: ✳️ Passed after merging mrocklin/theano-print (3539973) into master (5125961). |
As I recall Maple also splits things like this, at least for differential equations. |
Is there anything blocking this. It look good for merging. |
I'm out of contact for a few days. please wait until I can take another look
|
This isn't perfect but it's in a usable state. +1 |
OK, on the basis of Krastanov and Rocklin, I'll commit this. |
This transforms SymPy expressions into Theano computations. Theano can then be used to generate relatively high performance code. Perhaps this is an alternative to lambdify? Theano can use numpy or CUDA under the hood.