GSoC 2009 Ideas

Aaron Meurer edited this page Mar 9, 2011 · 1 revision
Clone this wiki locally

Table of Contents

Google Summer of Code 2009

The main GSoC SymPy page is here:


Please add new ideas here. And remember, that you can apply with something completely different if you like. This is for inspiration. In no particular order:

  • anything from our roadmap
  • integrate our experimental Cython core into SymPy
  • asymptotic series
  • port to Python 3.0 ()
  • improve the integration algorithm, so that SymPy can integrate anything that can be integrated.
    • implement recursive Risch algorithm (compare with heuristic version)
    • improve integration of rational functions (via subresultants)
    • integration of functions on domains of maximum extent, etc.
  • definite integration & integration on complex plane using residues
  • Groebner bases and their applications in geometry, simplification and integration
    • improve Buchberger's algorithm and implement Faugere F4 (compare their speed)
  • improve polynomial algorithms (gcd, factorization) by allowing coefficients in algebraic extensions of the ground domain
  • implement efficient multivariate polynomials (arithmetics, gcd, factorization)
    • choose a polynomial representation (e.g. recursive dense) or use task dependent representations
    • implement efficient arithmetics (e.g. using geobuckets (Yan) or heaps (Monagan & Pearce))
    • implement factorization algorithm (Musser's or Wang's EEZ (better)) and gcd (e.g. EEZ-GCD)
    • provide high-level OO abstraction over polynomial tools you've developed
  • improve SymPy's pattern matching abilities (efficiency and generality)
    • experiment with regular expressions over SymPy's algebraic expressions
    • implement similarity measure between expression trees
    • expression complexity measures (e.g. Kolmogorov's complexity)
    • implement expressions signatures and heuristic equivalence testing
    • implement semantic matching (e.g. expression: cos(x), pattern: sin(a*x) + b)
      • e.g by using power series for this purpose (improve series speed)
    • matching of Python's objects (lists, sets, tuples etc.)
    • use pure Python syntax for all this (no preparsing)
  • improve simplification and term rewriting algorithms
    • add (improve) verbatim and semi-verbatim modes (more control on expression rewriting)
    • fix annoying minus sign rewriting problem (separate internal representation and printing)
    • extend simplify() function to support something more than rational functions (see trim())
    • implement more expression rewrite functions (to an exact form that user specifies)
    • maybe put transformation rules in an external database (e.g. prolog), what about speed?
    • improve context (e.g. input) depended simplification steps in different algorithms
      • e.g. the integrator needs different sets of rules to return "better" output for different input
      • but there are more: recurrences, summations, solvers, polynomials with arbitrary coefficients
    • what about information carried by expressions?
      • what is simpler: chebyshevt(1, x) or x ?
      • what is simpler: chebyshevt(1000, x) or (...) ?
  • implement symbolic (formal) logics and set theory
    • implement predicate (e.g. first-order), modal, temporal, description logics
    • implement multivalued logics; fuzzy and uncertain logics and variables
    • implement rewriting, minimization, normalization (e.g. Skolem) of expressions
    • implement set theory, cardinal numbers, relations etc.
    • add nice unicode, latex and mathml pretty printing
      • maybe use user dependent sets of symbols e.g. for implication (=>), (->)
    • add drawing of Venn diagrams (e.g. for educational purpose)
    • improve SymPy's logic facilities to boost assumptions engine
  • implement symbolic global optimization (value, argument) with/without constraints, use assumptions
  • improve the series expansion (relevant issues)
    • formal power series (FPS)
    • improve limits - make sure all basic limits work
  • objects with indices (tensors)
  • improve the plotting module:
    • better matplotlib integration
    • extract the math TeX typesetting engine from matplotlib (it has some external dependencies on freetype and Agg that will need to be resolved), and integrate it in our plotting lib (maybe create a new project for this engine)
  • generalized functions -- Dirac delta, P(1/x), etc... Convolution, Fourier and Laplace transforms
  • vector calculus, differential fields, maybe Lie algebras & groups
  • parametric integrals asymptotic expansion (integral series)
  • differential equations (currently SymPy only supports some very simple ones)
  • increase image processing of PIL+SymPy functionality to match that of octave or matlab
  • improve SymPy's interoperatibility with other CAS software
    • implement general code parsing (Mathematica, Maxima, Axiom etc.) using e.g. pyparsing
    • implement Python + SymPy (structure + semantics) translation to your favorite CAS (other than SymPy :)

Detailed Ideas

Polynomials module

Efficient Groebner bases and their applications

  • Status:
Groebner bases computation is one of the most important tools in computer algebra, which can be used for computing multivariate polynomial LCM and GCD, solving systems of polynomial equations, symbolic integration, simplification of rational expressions, etc. Currently there is an efficient version of Buchberger algorithm implemented, along with naive multivariate polynomial arithmetics in monomial form.
  • Idea:
Improve efficiency of Groebner basis algorithm by using better selection strategy (e.g. sugar method) and implement Faugere F4 or F5 algorithm and analyze which approach is better in what contexts. Apply Groebner bases in integration of rational and transcendental functions and simplification of rational expressions modulo a polynomial ideal (e.g. trigonometric functions).
  • Rating: 5 (very hard)
Multivariate polynomials and factorization
  • Status:
Factorization of multivariate polynomials is an important tool in algebra systems, very useful by its own, also used in symbolic integration algorithms, simplification of expressions, partial fractions, etc. Currently multivariate factorization algorithm is based on Kronecker's method, which is impractical for real life problems. Undergo there is implementation of Wang's algorithm, the most widely used method for the task.
  • Idea:
Start with implementing efficient multivariate polynomial arithmetics and GCD algorithm. You do this by improving existing code, which is based on recursive dense representation or implement new methods based on your research in the field. There are many interesting methods, like Yan's geobuckets or heap based algorithms (Monagan & Pearce). Having this, implement efficient GCD algorithm over integers, which is not a heuristic, e.g. Zippel's SPMOD, Musser's EZ-GCD, Wang's EEZ-GCD. Help with implementing Wang's EEZ factorization algorithm or implement your favourite method, e.g. Gao's partial differential equations approach. You can go further and extend all this to polynomials with coefficients in algebraic domains or implement efficient multivariate factorization over finite fields.
  • Rating: 4-5 (quite hard)
Univariate polynomials over algebraic domains
  • Status:
Currently SymPy features efficient univariate polynomial arithmetics, GCD and factorization over Galois fields and integers (rationals). This is, however, insufficient in solving real life problems, and isn't very useful symbolic integration and simplification algorithms.
  • Idea:
Choose a univariate polynomial representation in which elements of algebraic domains will be efficiently encoded. By algebraic domains we mean algebraic numbers and algebraic function fields. Having a good representation, implement efficient arithmetics and GCD algorithm. You should refer to work due to Monagan, Pearce, van Hoeij et. al. Having this, implement your favourite algorithm for factorization over discussed domains. This will require algorithms for computing minimal polynomials (this can be done by using LLL or Groebner bases). You can also go ahead and do all this in multivariate case.
  • Rating: 4-5 (quite hard)

Integration module

Implement Risch algorithm, a decision procedure for symbolic integration

  • Status:
The main symbolic integration algorithm in SymPy is the heuristic Risch algorithm, which is based on Manuel Bronstein's Poor Man Integrator. Although, this heuristic algorithms is quite simple and fast, and handle sever classes of special functions, it is not a decision procedure so it may fail to find an antiderivative, even if one exists (and it happens quite often with very simple integrals). It also have very weak support for algebraic functions.
  • Idea:
A symbolic manipulation package must be equipped with a decision procedure for integration problem. The task is to implement Risch algorithm (also known as the recursive Risch algorithm) for integrating elementary transcendental functions. You should implement as much as possible of the algorithm, especially special cases. The algorithm is described in detail in Bronstein's book but you will need to put some effort to handle algebraic functions in your implementation.
  • Rating: 3-5 (quite hard)
Implement symbolic integration via Marichev-Adamchik Mellin transform
  • Status:
SymPy's integrator supports several classes of special function. This is, however, insufficient in solving advanced physics and engineering problems.
  • Idea:
Extend the integrator by implementing methods for integration of elementary and special functions via Meijer G-functions (also known as Marichev-Adamchik Mellin transform). This approach is useful in both indefinite and definite integration. The problems are how to convert arbitrary input expression to a G-function and how to rewrite the resulting expression in terms of more familiar functions (elementary, special, hypergeometric).
  • Rating: 3-5 (quite hard)
Implement definite integration algorithm using residues
  • Status:
Currently the integrator handles definite integration problem by applying Newton-Liebnitz theorem. This is, however, only useful when there are no poles on the path of integration.
  • Idea:
Improve the integrator by expressing the initial integral as an integral on the complex plane and applying Cauchy integral theorem.

Rating: 3-4 (hard)

Concrete module

Implement Karr algorithm, a decision procedure for symbolic summation

  • Status:
SymPy currently features Gosper algorithm and some heuristics for computing sums of expressions. Special preference is for summations of hypergeometric type. It would be very convenient to support more classes of expressions, like (generalized) harmonic numbers etc.
  • Idea:
Algorithm due to Karr is the most powerful tool in the field of symbolic summation, which you will implement in SymPy. There are strong similarities between this method and Risch algorithm for the integration problem. You will start with implementing the indefinite case and later can extend it to support definite summation (see work due to Schneider). Possibly you will also need to work on solving difference equations.
  • Rating: 3-5 (quite hard)

Potential Mentors

If you are willing to mentor, please add yourself here:

  • Ondrej Certik
  • David Joyner (for DEs or image processing)
  • Mateusz Paprocki