Philosophy of Numerics and Code Generation in SymPy

Krastanov edited this page Mar 19, 2013 · 4 revisions
Clone this wiki locally

The entirety of this document is base on the (correct) assumption that using result = [expr.subs(*zip(vars, p)).evalf() for p in points] is too slow for serious numerical work.

There is a number of different philosophies put forward by SymPy developers for dealing with code generation. These are the two extremes:

  • prototyping: The tools should be useful for users completely unaware of how SymPy works. The tools might be slightly slower due to their generality, however they will work on all possible SymPy expressions. For instance, the plotting module needs a way to evaluate any possible SymPy expression and the simplest solution is to loop evalf(subs) over the points to be evaluated. (see evalf)
  • tuned project: The tools should work only on simple SymPy expressions (algebraic, elementary functions), but should employ tested hight performance methods. For instance, rewriting in Fortran/C. The user must find for himself how to employ special functions. (see codegen)

Obviously there is a whole spectrum between these extremes including:

  • rewriting in python math, mpmath, NumPy or even SciPy (see lambdify and exprimental_lambdify) (this solution defers to evalf when translation is impossible)
  • employing Theano and similar projects (see ``)
  • compiling Cython or Fortran code and autowrapping it in a python function (see autowrap.ufuncify)

Addressing the issue of special functions, infinite sums, limits, derivatives or integrals without closed form is hard. For the moment:

  • python based methods defer to evalf which is mostly mpmath
  • sometimes one is lucky and a Fortran/SciPy library exists that can do the numerics (for the moment this is not the case with tools used by SymPy)

Tangential issues:

  • pypy might be a magic bullet. In this case only the Theano/codegen modules will continue to be necessary.
  • When using NumPy, much of SymPy does not respect dtypes or does not broadcast correctly.
  • The precision of the results. For evalf it is multiprecision using mpmath. For lambdify and experimental_lambdify it is not well defined. For the rest it is float or double.
  • lambdify is ugly and fails on a lot of the possible inputs. experimenta_lambdify works on all possible inputs (as far as tested), but has fewer options than lambdify and is ugly as well.

One point of view is:

My stance is that the user (and we) should not expect magic (i.e. fast evaluation of all possible expressions). We have different tools for "all possible expression" and "fast evaluation". We will work to bridge the gap but given that we already cover the extremes I am rather happy.