GSoC 2018 Ideas

Abdullah Javed Nesar edited this page Mar 28, 2018 · 31 revisions

Introduction

This is the list of ideas for students wishing to apply for Google Summer of Code 2018. For more information on how to apply, see the GSoC 2018 Student Instructions. This list is here for inspiration and to give students an idea of what directions may be good for SymPy.

If you want to pursue an idea listed here, you should contact us on our mailing list and discuss it. Be sure to always ask about these ideas to get the latest information about what is implemented and what exactly has to be done.

The list is organized as follows:

High Priority Projects
Projects that are considered important in our roadmap.
Mathematics Projects
Well developed ideas of interest for us, however they do not block the release of the next major version of SymPy. These require deep understanding of the mathematics in question.
Physics Projects
Well developed ideas of interest for us, however they do not block the release of the next major version of SymPy. A number of well developed modules dealing with classical and modern physics are being developed as part of SymPy.
Computer Science, Graphics, and Infrastructure Projects
These ideas enhance the core capabilities of SymPy.
User Application Projects
We support two web applications (SymPy Live and SymPy Gamma) for end users to interact with SymPy and there are many other external applications that utilize SymPy as a library. These project ideas focus on development and improvements of applications for end users.
SymEngine Projects
These ideas support the C++ based project SymEngine. SymEngine provides core symbolic capabilities at very fast speeds. It will eventually provide an optional core for SymPy to improve its computation speeds.
Idea Prompts
List in a "brainstorming" style that has many nice project ideas. Each of the above projects was born in this list. Read it carefully as the most interesting projects may come from there.

The order of ideas in this list has no bearing on the chances of an idea being accepted. All of them are equally good and your chances depend on the quality of your application. Also do not worry if there are no mentors assigned to a given idea. If the application is good, we will find a mentor. You can also submit your own idea that is not listed here.

Submitting Your Own Idea

You can apply with something completely different if you like. The best project for you is one you are interested in, and are knowledgeable about. That way, you will be the most successful in your project and have the most fun doing it, while we will be the most confident in your commitment and your ability to complete it.

If you do want to suggest your own idea, please discuss it with us first, so we can determine if it has already been implemented, if it is enough work to constitute a summer's worth of coding, if it is not too much work, and if it is within the scope of our project.

Please use the Idea Template below to describe ideas:

Title

Idea

(Specify your idea with proper explanation)

Status

(What is the current status of this idea in the SymPy community, including previous work and related issues)

Involved Software

(Any other Software Involved that would be required to implement your idea)

Difficulty

(Advanced, Intermediate, or Beginner and any specific comments on the difficulty)

Prerequisite Knowledge

(Any prerequisite knowledge or approach needed)

Potential Mentors

If you are willing to mentor, please add yourself here. Also please register at https://summerofcode.withgoogle.com and add your email that you registered with. Finally, list your name with any projects below that you would be willing to mentor.

High Priority

Assumptions

Idea

The project is to completely remove our old assumptions system, replacing it with the new one. The difference between the two systems is outlined in the first two sections of this blog post.
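
To make the difference concrete, here is a minimal sketch of how the same kind of question is asked in each system. It only uses the public `Symbol` keyword assumptions and `ask`/`Q`; the variable names are illustrative.

```python
from sympy import Symbol, ask, Q

# Old system: assumptions are attached to the symbol when it is created
# and queried through is_* attributes.
x_old = Symbol('x', positive=True)
y_old = Symbol('y', positive=True)
old_result = (x_old * y_old).is_positive

# New system: symbols carry no assumptions; facts are passed to ask().
x = Symbol('x')
y = Symbol('y')
new_result = ask(Q.even(x * y), Q.even(x) & Q.integer(y))

# With no facts supplied, ask() cannot decide and returns None.
unknown = ask(Q.even(x * y))

print(old_result, new_result, unknown)
```

Removing the old system means that queries like the first one would have to be answered by the machinery behind the second.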

This project is challenging. It requires deep understanding of the core of SymPy, basic logical inference, excellent code organization, and attention to performance. It is also very important and of high value to the SymPy community.

Numerous related tasks are mentioned in the "Ideas" section.

Status

There has been a significant amount of merged and unmerged work on this topic. A list of detailed issues can be found at this issue. You should take a look at the work started at https://github.com/sympy/sympy/pull/2508.

Involved Software

None

Difficulty

Advanced

Prerequisite Knowledge

Number theory, Boolean algebra, etc.

Speed: Improve SymEngine - SymPy Integration

SymPy can be slow. SymEngine provides a very fast implementation of core symbolic algorithms, and SymPy should use it to gain speed.

See here for more details:

https://github.com/sympy/sympy/wiki/GSoC-2018-Ideas#improve-sympy-integration

Mathematics Projects

Integration

Idea

Finish the implementation of the Rubi integrator. The issue related to the GSoC 2017 is https://github.com/sympy/sympy/issues/12233

Status

The current code is located in sympy/integrals/rubi. It only supports Python 3.6 and requires the external dependency MatchPy. Unfortunately there is still a lot of work to do, as the rules may raise exceptions or give wrong results. Furthermore, we need to support versions of Python other than 3.6.

Reports from the GSoC 2017:

  1. https://parsoyaarihant.github.io/blog/gsoc/2017/08/28/GSoC17-Final-Report.html
  2. https://github.com/sympy/sympy/wiki/GSoC-2017-Report-Abdullah-Javed-Nesar:-Rule-based-Integrator

Prerequisite knowledge

Some familiarity with Wolfram Mathematica can be helpful.

Pattern Matching with MatchPy

Idea

In order to use the RUBI ruleset for symbolic integration, we planned to use the efficient many-to-one pattern matching algorithms provided by MatchPy. To increase the performance of many-to-one matching, MatchPy can generate code instead of using native Python data structures, similar to how a parser generator generates code to parse a grammar.

There are different possible directions for this project:

  • MatchPy is a Python 3 project. At present, it only generates Python 3 code. Since SymPy also supports Python 2.7, it would be useful for MatchPy to also generate Python 2.7 code. At present, the generated code still has some dependencies on MatchPy. Some of those dependencies can be removed by extending the code generation. For those dependencies where code generation is not feasible, the algorithms have to be translated to Python 2.7. Furthermore, there are some basic data structures, for example the Expression class, which have to be made available in Python 2.7.
  • To improve the performance of MatchPy (possibly including the code generated by MatchPy), the core pattern matching algorithms could be implemented in a lower level language, for example C++ (possibly in the SymEngine) or C. While C is probably the most challenging language, it would make it easier to make MatchPy's functionality available in other languages.

Status

At the moment, MatchPy generates Python 3 code which has several dependencies on MatchPy itself. So far, nothing was implemented in a lower level language.

Involved Software

Python, possibly C or C++

Difficulty

Intermediate (code generation) to Advanced (C implementation).

Prerequisite Knowledge

  • Knowledge of theoretical computer science, especially graphs and combinatorics.
  • Knowledge of pattern matching as provided by MatchPy or Mathematica is beneficial.

Solvers

Idea

SymPy already has a pretty powerful solve function, but it has several major issues:

  1. It doesn't have a consistent output for various types of solutions. It needs to return many types of solutions consistently:

    • single solution : x == 1
    • Multiple solutions: x**2 == 1
    • No Solution: x**2 + 1 == 0; x is real
    • Interval of solution: floor(x) == 0
    • Infinitely many solutions: sin(x) == 0
    • Multivariate functions with point solutions x**2 + y**2 == 0
    • Multivariate functions with non point solution x**2 + y**2 == 1
    • System of equations x + y == 1 and x - y == 0
    • Relational x > 0
    • And the most important case "We don't Know"
  2. The input API is also a mess. There are a lot of parameters, many of which are not needed, and they make it hard for users and developers to work with the solvers.

  3. There are cases like finding the maxima and minima of function using critical points where it is important to know if it has returned all the solutions. solve does not guarantee this.
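
The first issue can be seen directly by calling solve on a few of the cases listed above. This is a small illustrative sketch; the variable names are ours and the exact printed form of each result may vary between SymPy versions.

```python
from sympy import symbols, solve

x, y = symbols('x y')
r = symbols('r', real=True)

# One equation in one variable: a list of values.
single_var = solve(x**2 - 1, x)

# A linear system: a dict mapping symbols to values.
system = solve([x + y - 1, x - y], [x, y])

# An inequality: a relational (Boolean) expression.
inequality = solve(x > 0, x)

# No real solution: an empty list.
no_solution = solve(r**2 + 1, r)

for result in (single_var, system, inequality, no_solution):
    print(type(result).__name__, result)
```

Code that consumes the result of solve therefore has to branch on the return type before it can do anything else.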

Salient Features of solveset

  • solveset has a cleaner input and output interface: solveset returns a set object, and a set object takes care of all the types of output. For cases where it doesn't "know" all the solutions, a NotImplementedError is raised. As input, it only takes the equation and the variables for which the equation has to be solved.

  • solveset can return infinitely many solutions. For example, solving sin(x) = 0 returns {2⋅n⋅π | n ∊ ℤ} ∪ {2⋅n⋅π + π | n ∊ ℤ}, whereas solve only returns [0, π].

  • There is a clear code level and interface level separation between solvers for equations in the complex domain and equations in the real domain. For example, solving exp(x) = 1 when x is complex returns the set of all solutions, that is {2⋅n⋅ⅈ⋅π | n ∊ ℤ}, whereas if x is a real symbol then only {0} is returned.

  • solveset returns a solution only when it can guarantee that it is returning all the solutions.
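
The points above can be tried out with a short sketch like the following. It passes the domain explicitly as the third argument; the variable names are illustrative, and the printed form of the infinite sets may differ between SymPy versions.

```python
from sympy import symbols, sin, exp, S, solveset

x = symbols('x')

# Every result is a set, whatever the number of solutions.
finite = solveset(x**2 - 1, x, S.Reals)
empty = solveset(x**2 + 1, x, S.Reals)
periodic = solveset(sin(x), x, S.Reals)

# The same equation solved over two different domains.
exp_real = solveset(exp(x) - 1, x, S.Reals)
exp_complex = solveset(exp(x) - 1, x, S.Complexes)

for result in (finite, empty, periodic, exp_real, exp_complex):
    print(result)
```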

Status

GSoC 2014 Project: Harsh Gupta

During the summer of 2014 Harsh Gupta worked to improve solvers as part of his GSoC project. Instead of making changes in the current solve function a new submodule named solveset was written.

GSoC 2015 Project: Amit Kumar

In the summer of 2015 Amit Kumar worked on this project to improve solveset, implement complex sets as a part of his GSoC project.

GSoC 2016 Project: Kshitij Saraogi | GSoC 2016 Project: Shekhar Rajak

In the summer of 2016, two projects were selected to participate in Google Summer of Code to work on the Solvers. New solver helper functions such as solve_decomposition and nonlinsolve were implemented to facilitate the porting from solve to solveset. Also, the inequality solver solve_univariate_inequality was refactored and added to solveset. Several methods related to functional analysis, such as periodicity, continuous_domain and function_range, were implemented.

TODOs

  • Transcendental equation solver: solve uses _tsolve and bivariate.py to handle transcendental equations. _tsolve does a very good job of solving a large space of transcendental equations. The problem lies in the fact that its codebase is very messy and not extensible. Also, adding a Set output interface to _tsolve is quite complicated. Hence, it's very important to write a more modular and extensible transcendental equation solver. We encourage you to leverage the power of the bivariate module to implement the new transolve.

  • Integrating helper solvers with solveset: Currently, solveset only solves a single equation for a single variable. In the future, we expect it to be capable of solving systems of equations in more than one variable. The following helper functions have been implemented in solveset during the past few years:

    • linsolve: solves a system of linear equations
    • nonlinsolve: solves a system of non-linear equations
    • solve_decomposition: solves a varied class of equations using the concept of rewriting and decomposition

    We would like all of these solvers (including transolve) to be integrated into solveset so as to increase its power.

  • Build the set infrastructure: This includes implementing functions to handle multidimensional ImageSet, etc. This part must go hand in hand with the improvements in the solvers, as the set module can be a universe in itself. Also, there can be fundamental limits on the things you can do.

  • nonlinsolve is not able to handle systems containing trigonometric/transcendental equations correctly all the time. Improve solveset's trigonometric solver and handle trigonometric systems of equations separately in nonlinsolve.
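
For orientation, the two system solvers named above are currently called on their own rather than through solveset. A minimal sketch, with illustrative variable names:

```python
from sympy import symbols, linsolve, nonlinsolve

x, y = symbols('x y')

# A linear system: x + y = 1, x - y = 0.
linear = linsolve([x + y - 1, x - y], (x, y))

# A polynomial system: x**2 = 1, y = x.
nonlinear = nonlinsolve([x**2 - 1, y - x], [x, y])

print(linear)
print(nonlinear)
```

Both return sets of tuples, which is the kind of output a unified solveset interface for systems would presumably have to produce as well.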

References

There was a lot of discussion during and before the project and you should know why we did what we did. Here are some links:

Involved Software

SymPy

Difficulty

This project is difficult because it requires a good deal of thought in the application period. You should have a clear plan of most of what you plan to do in your application: waiting until the Summer to do the designing will not work.

#10006 and #8711 can be good entry points.

Prerequisite Knowledge

Algebraic and differential equations

Group theory

Idea

Continue developing the group theory functionality of the combinatorics module. You should take a look at the GAP library, as this is the canonical group theory computation system right now.

Algorithms to think about implementing:

  • Computation of various subgroups of infinite finitely presented groups
  • Computation of Galois groups for a given polynomial
  • Deciding if two groups are isomorphic
  • Finding kernels of homomorphisms with infinite domains
  • Using an automaton for word reduction in rewriting systems
  • Polycyclic presentation of groups and related algorithms
  • Quotient groups
  • Automorphism groups

Status

Previous projects on the topic include:

Quite a lot of work has been done on permutation groups, but still some things remain (some of those mentioned in GSoC 2012 Report by Aleksandar Makelov are still relevant, e.g. subgroup intersection). Some work is already done on discrete groups. Nonetheless there is still much that can be done both for discrete groups and for Lie groups.

Some major algorithms for finitely presented groups include coset enumeration (there's been work on modified Todd-Coxeter but it wasn't finished: see this PR), low index subgroup search and Reidemeister-Schreier algorithm for subgroup presentation. Rewriting systems together with the Knuth-Bendix completion algorithm are available but could be made more efficient.

Additionally, the 2017 project implemented group homomorphisms.
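
As a quick orientation to what already exists, the sketch below builds the same group twice: once as a permutation group and once as a finitely presented group, whose order is found through coset enumeration. The presentation is the standard one for the symmetric group on four points; the variable names are illustrative.

```python
from sympy.combinatorics.named_groups import SymmetricGroup
from sympy.combinatorics.free_groups import free_group
from sympy.combinatorics.fp_groups import FpGroup

# Permutation group on 4 points.
S4 = SymmetricGroup(4)

# Finitely presented group <a, b | a^2, b^3, (a*b)^4>.
F, a, b = free_group("a, b")
G = FpGroup(F, [a**2, b**3, (a*b)**4])

print(S4.order(), G.order())
```

Both orders agree, but SymPy has no general way yet to decide that the two objects are isomorphic, which is one of the items listed above.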

See the 2016 and 2017 reports for suggestions on where the work could continue.

Difficulty: Medium/Difficult

Resources: Handbook of Computational Group Theory by Derek F. Holt, Bettina Eick and Eamonn A. O'Brien

Prerequisite Knowledge: Basic knowledge of Abstract Algebra

Risch algorithm for symbolic integration

Idea

The Risch algorithm is a complete algorithm to integrate any elementary function. Given an elementary function, it will either produce an antiderivative, or prove that none exists. The algorithm described in Bronstein's book deals with transcendental functions (functions that do not involve algebraic functions, so log(x) is transcendental, but sqrt(x) and sqrt(log(x)) are not).
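
The "produce an antiderivative or prove that none exists" behaviour can be seen in the partial implementation that is already in SymPy. A minimal sketch, assuming the `risch_integrate` entry point in `sympy.integrals.risch`; the variable names are illustrative.

```python
from sympy import symbols, exp, diff, simplify
from sympy.integrals.risch import risch_integrate, NonElementaryIntegral

x = symbols('x')

# An integrand with an elementary antiderivative.
elementary = risch_integrate(x*exp(x), x)

# An integrand with no elementary antiderivative: the result is an
# unevaluated integral of a special type that records this fact.
non_elementary = risch_integrate(exp(-x**2), x)

print(elementary)
print(type(non_elementary).__name__)
print(simplify(diff(elementary, x) - x*exp(x)))
```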

Status

The project is to continue where Aaron Meurer left off in his 2010 GSoC project, implementing the algorithm from Manuel Bronstein's book, Symbolic Integration I: Transcendental Functions. If you want to do this project, be sure to ask on the mailing list or our IRC channel to get the status of the current project.

The algorithm has already been partially implemented, but there is plenty of work remaining to do. Contact Aaron Meurer for more information. There was also work done in 2013, which hasn't been completely merged yet. A good place to start would be to look at finishing this work: https://github.com/sympy/sympy/pulls/cheatiiit. See https://groups.google.com/forum/#!msg/sympy/bYHtVOmKEFs/UZoyDX81eP4J for some more details on this project (nothing has changed since that email thread).

Involved Software

Difficulty

Prerequisite Knowledge

You should have at least a semester's worth of knowledge in abstract algebra. Knowing more, especially about differential algebra, will be beneficial, as you will be starting from the middle of a project. Take a look at the first chapter of Bronstein's book (you should be able to read it for free via Google Books) and see how much of that you already know. If you are unsure, discuss this with Aaron Meurer (asmeurer).

Ordinary Differential Equations

Idea

Currently, SymPy supports many basic types of differential equations, but there are plenty of methods that are not implemented. One possibility is support for using Lie groups to help solve ODEs. See the ODE docs and the current source for information on what methods are currently implemented.

There is limited support for solving systems of ODEs. Possible additions: linear systems with constant coefficients and nonconstant forcing term; general linear systems of more than two equations; determination and analysis of stability of equilibria of nonlinear autonomous systems.
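
To see which methods apply to a given equation today, classify_ode lists the matching solver hints, and dsolve and checkodesol solve and verify. A minimal sketch with illustrative names:

```python
from sympy import Function, symbols, dsolve, classify_ode, checkodesol

x = symbols('x')
f = Function('f')

# Second order linear ODE with constant coefficients.
ode = f(x).diff(x, 2) + f(x)
solution = dsolve(ode, f(x))

# Hints that match a simple first order ODE.
hints = classify_ode(f(x).diff(x) - x*f(x), f(x))

print(solution)
print(hints)
print(checkodesol(ode, solution))
```

A new solving method would typically show up as an additional hint in this tuple.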

You also might want to look at Manuel Bronstein's sumit.

  • Separation ansatz:

    • "A simple method to find out when an ordinary differential equation is separable" by José Ángel Cid
  • "Solving Differential Equations in Terms of Bessel Functions" by Ruben Debeerst.

  • Lie groups and symmetry related:

    • An implementation of these methods was done for first order ODEs during gsoc13. But we can do the same tricks for second order ODEs too.
    • "Computer Algebra Solving of First Order ODEs Using Symmetry Methods" by E.S. Cheb-Terrab, L.G.S. Duarte and L.A.C.P. da Mota. There is a short (15 pages) and an updated (24 pages) version of this paper.
    • "Computer Algebra Solving of Second Order ODEs Using Symmetry Methods" by E.S. Cheb-Terrab, L.G.S. Duarte, L.A.C.P. da Mota
    • "Integrating factors for second order ODEs" by E.S. Cheb-Terrab and A.D. Roche
    • "Symmetries and First Order ODE Patterns" by E.S. Cheb-Terrab and A.D. Roche
    • "Abel ODEs: Equivalence and Integrable Classes" by E.S. Cheb-Terrab and A.D. Roche Note: Original version (12 pages): July 1999. Revised version (31 pages): January 2000
    • "First order ODEs, Symmetries and Linear Transformations" by E.S. Cheb-Terrab and T. Kolokolnikov
    • "Non-Liouvillian solutions for second order linear ODEs" by L. Chan, E.S. Cheb-Terrab
    • And probably some more by these authors ...

Status

Involved Software

Difficulty

Medium

Prerequisite Knowledge

Differential equations

Series expansions

Idea

This includes numerous smaller subprojects.

  • improve series expansions
  • improve formal power series
  • improve limits - make sure all basic limits work
    • limit of series
  • asymptotic series
  • Better support for Order term arithmetic (for example, expression of the order term of the series around a point that is not 0, like O((x - a)**3)).
  • All other problems, which are described in wiki page about series and current situation

Status

There is already a fast implementation called rs_series in SymPy. This project would extend it to work for all functions and then make it the default series expansion in SymPy.

SymPy now has support for Formal Power Series (series.formal). The algorithm is more or less complete. The module should be made faster. There are also a lot of XFAIL tests that can be made to pass.

A new algorithm for computing limits of sequences has also been added (series.limitseq). There are still XFAIL tests that can be made to pass.
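
The pieces mentioned above can be exercised side by side. This is a small sketch using the public entry points as we understand them (`series`, `rs_series` from `sympy.polys.ring_series`, `fps`, and `limit_seq`); the variable names are illustrative.

```python
from sympy import symbols, sin, ln, series, fps
from sympy.polys.ring_series import rs_series
from sympy.series.limitseq import limit_seq

x, n = symbols('x n')

# Default series expansion, carrying an Order term.
taylor = series(sin(x), x, 0, 6)

# Ring-series based expansion; the result is a polynomial ring element.
fast = rs_series(sin(x), x, 5).as_expr()

# Formal power series, truncated for display.
formal = fps(ln(1 + x)).truncate(4)

# Limit of a sequence as n goes to infinity.
seq_limit = limit_seq((5*n**3 + 3*n**2 + 4) / (3*n**3 + 4*n - 5), n)

print(taylor, fast, formal, seq_limit, sep='\n')
```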

Some references

  1. "Formal Power Series" by Dominik Gruntz and Wolfram Koepf
  2. "A New Algorithm for Computing Asymptotic Series" by Dominik Gruntz
  3. "Computing limits of Sequences" by Manuel Kauers
  4. "Symbolic Asymptotics: Functions of Two Variables, Implicit Functions" by Bruno Salvy and John Shackell
  5. "Symbolic Asymptotics: Multiseries of Inverse Functions" by Bruno Salvy and John Shackell

Involved Software

SymPy

Difficulty

Medium

Prerequisite Knowledge

Calculus

Probability

The Probability/Statistics module supports many univariate distributions and basic operations such as computing probabilities and expected values. Things to consider adding:

  • Multivariate distributions
  • Compound distributions (e.g. make it possible to create a random variable in which some parameter is another random variable).
  • Ability to export expressions of random variables to external libraries (e.g. to PyStan),
  • Support assumptions of dependence between random variables.
  • Random processes: Markov chains, random walks...
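
As a point of reference for the univariate functionality that already exists, a minimal sketch using `sympy.stats` (variable names are illustrative):

```python
from sympy.stats import Die, Normal, P, E

# A discrete and a continuous univariate random variable.
D = Die('D', 6)
X = Normal('X', 0, 1)

die_mean = E(D)
die_tail = P(D > 4)
normal_mean = E(X)

print(die_mean, die_tail, normal_mean)
```

The items in the list above would extend this style of interface to several dependent variables and to processes indexed by time.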

References

  1. "An Introduction to Probability Theory and Its Applications" by William Feller
  2. "Principles of Random Walk" by Frank Spitzer

Involved Software

SymPy

Difficulty

Medium

Prerequisite Knowledge

Probability theory

Cylindrical algebraic decomposition

Idea

  • Implement the Cylindrical algebraic decomposition algorithm

  • Use CAD to do quantifier elimination

  • Provide an interface for solving systems of polynomial inequalities

  • Some references:

Status

Involved Software

Difficulty

Prerequisite Knowledge

Efficient Groebner bases and their applications

Idea

Groebner bases computation is one of the most important tools in computer algebra, which can be used for computing multivariate polynomial LCM and GCD, solving systems of polynomial equations, symbolic integration, simplification of rational expressions, etc. Currently there are efficient implementations of the Buchberger algorithm and of the F5B algorithm, along with naive multivariate polynomial arithmetic in monomial form. There is also the FGLM algorithm for converting reduced Groebner bases of zero-dimensional ideals from one ordering to another.

Improve the efficiency of the Groebner basis algorithm by using a better selection strategy (e.g. the sugar method), implement Faugere's F4 algorithm, and analyze which approach is better in which contexts. Implement the generic Groebner walk for converting between Groebner bases of finite-dimensional ideals; there are efficient algorithms for it, by Tran (2000) and Fukuda et al. (2005).
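
The existing algorithms can be compared on a small zero-dimensional ideal. A minimal sketch using the `groebner` function with its `method` and `order` options and the `fglm` conversion method; the generators and variable names are illustrative.

```python
from sympy import symbols, groebner

x, y = symbols('x y')
F = [x**2 - 3*y - x + 1, y**2 - 2*x + y - 1]

# The same lex basis computed by the two available algorithms.
G_buchberger = groebner(F, x, y, order='lex', method='buchberger')
G_f5b = groebner(F, x, y, order='lex', method='f5b')

# A grlex basis converted to lex with FGLM instead of recomputing.
G_grlex = groebner(F, x, y, order='grlex')
G_converted = G_grlex.fglm('lex')

print(list(G_buchberger))
print(list(G_converted))
```

Since the reduced basis is unique for a fixed ordering, all three routes should give the same polynomials; timing them on larger inputs is one way to compare strategies.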

Apply Groebner bases in integration of rational and transcendental functions and simplification of rational expressions modulo a polynomial ideal (e.g. trigonometric functions).

Status

There was a project last year relating to Groebner bases. Please take a look at the source and discuss things with us to see what remains to be done.

Some Groebner bases algorithms, in particular F4, require strong linear algebra. Thus, if you want to do that, you may have to first improve our matrices (see the ideas relating to this above).

Involved Software

Difficulty

Prerequisite Knowledge

Multivariate polynomials and factorization

Idea

Factorization of multivariate polynomials is an important tool in algebra systems. It is very useful on its own and is also used in symbolic integration algorithms, simplification of expressions, partial fractions, etc. Currently the multivariate factorization algorithm is based on Kronecker's method, which is impractical for real life problems. An implementation of Wang's algorithm, the most widely used method for the task, is under way.

Start with implementing efficient multivariate polynomial arithmetic and a GCD algorithm. You can do this by improving the existing code, which is based on a recursive dense representation, or by implementing new methods based on your research in the field. There are many interesting methods, like Yan's geobuckets or heap based algorithms (Monagan & Pearce). Having this, implement an efficient GCD algorithm over the integers which is not a heuristic, e.g. Zippel's SPMOD, Musser's EZ-GCD, or Wang's EEZ-GCD. Help with implementing Wang's EEZ factorization algorithm or implement your favorite method, e.g. Gao's partial differential equations approach. You can go further and extend all this to polynomials with coefficients in algebraic domains, or implement efficient multivariate factorization over finite fields.
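
The user-facing functions that such work would speed up are `factor`, `factor_list` and `gcd`. A small sketch of their current behaviour on multivariate input (the polynomial and names are illustrative):

```python
from sympy import symbols, factor, factor_list, gcd, expand

x, y = symbols('x y')
p = 2*x**5 + 2*x**4*y + 4*x**3 + 4*x**2*y + 2*x + 2*y

factored = factor(p)
factors = factor_list(p)
common = gcd(x**2 - y**2, x**2 + 2*x*y + y**2)

print(factored)
print(factors)
print(common)
```

Any replacement algorithm has to return the same factors, so examples like this also serve as regression tests.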

Status

Some work on this may already be done. Take a look at sympy/polys/factortools.py in the SymPy source code.

Involved Software

Difficulty

Advanced

Prerequisite Knowledge

Univariate polynomials over algebraic domains

Idea

Choose a univariate polynomial representation in which elements of algebraic domains will be efficiently encoded. By algebraic domains we mean algebraic numbers and algebraic function fields. Having a good representation, implement efficient arithmetic and a GCD algorithm. You should refer to work by Monagan, Pearce, van Hoeij et al. Having this, implement your favorite algorithm for factorization over the discussed domains. This will require algorithms for computing minimal polynomials (this can be done by using LLL or Groebner bases). You can also go ahead and do all this in the multivariate case.

Status

Currently SymPy features efficient univariate polynomial arithmetic, GCD and factorization over modular rings and integers (rationals). This is, however, insufficient in solving real life problems, and has limited use for symbolic integration and simplification algorithms. For example, the support for finite fields GF(p^n) is missing.
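
For reference, this is how the ground domain already affects factorization in the public API. A minimal sketch using the `extension` and `modulus` options of `factor` and the `minimal_polynomial` function; the names are illustrative.

```python
from sympy import symbols, sqrt, factor, minimal_polynomial

x = symbols('x')

# Irreducible over the rationals.
over_rationals = factor(x**2 - 2)

# Splits once sqrt(2) is adjoined to the ground field.
over_extension = factor(x**2 - 2, extension=sqrt(2))

# Factorization over the integers modulo 2.
over_gf2 = factor(x**2 + 1, modulus=2)

# Minimal polynomial of an algebraic number.
min_poly = minimal_polynomial(sqrt(2) + sqrt(3), x)

print(over_rationals, over_extension, over_gf2, min_poly, sep='\n')
```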

Involved Software

Difficulty

Advanced

Prerequisite Knowledge

Concrete module: Implement Karr algorithm, a decision procedure for symbolic summation

Idea

Karr's algorithm is the most powerful tool in the field of symbolic summation, and you will implement it in SymPy. There are strong similarities between this method and the Risch algorithm for the integration problem. You will start with implementing the indefinite case and can later extend it to support definite summation (see work by Schneider and Kauers). Possibly you will also need to work on solving difference equations.

  • Some references:
    • "A=B" by Marko Petkovsek, Herbert S. Wilf, Doron Zeilberger
    • "Symbolic Summation with Radical Expressions" by Manuel Kauers and Carsten Schneider
    • "An Implementation of Karr's Summation Algorithm in Mathematica" by Carsten Schneider
    • Manuel Kauers, webpage: http://www.risc.jku.at/home/mkauers
    • Carsten Schneider, webpage: http://www.risc.jku.at/people/cschneid
    • "Algorithmen für mehrfache Summen", by Torsten Sprenger

Status

SymPy currently features Gosper's algorithm and some heuristics for computing sums of expressions. Special preference is given to summations of hypergeometric type. It would be very convenient to support more classes of expressions, like (generalized) harmonic numbers etc. There is already a complete algorithm for rational expression summation.
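
A short sketch of what the existing summation code handles, using `gosper_sum` from `sympy.concrete.gosper` for a hypergeometric term and `summation` for a polynomial one. The summands and variable names are illustrative.

```python
from sympy import symbols, factorial, summation
from sympy.concrete.gosper import gosper_sum

k, n = symbols('k n')

# A hypergeometric term: Gosper's algorithm finds a closed form.
term = (4*k + 1)*factorial(k)/factorial(2*k + 1)
closed_form = gosper_sum(term, (k, 0, n))

# A polynomial summand handled by the general summation routine.
poly_sum = summation(k**2, (k, 1, n))

print(closed_form)
print(poly_sum)
```

Sums involving, for example, harmonic numbers fall outside the class Gosper's algorithm decides, which is where Karr's algorithm would come in.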

Involved Software

Difficulty

Advanced

Prerequisite Knowledge

Linear Algebra: Tensor core

Idea

Build a Tensor core that can serve as a base to connect these projects and others. This core should be able to seamlessly support a broad range of applications ranging from very abstract (vector spaces, geometry, multi-linear operators) to very numerical (explicit matrices, NumPy integration, code generation). This project should include both a general Tensor Expression class and a general refactoring of the existing code-base. See Linear-Algebra-Vision. This project requires experience both in abstract linear algebra and in good code organization.

Status

SymPy has a number of disconnected projects related to Tensor/Linear algebra. These include Matrices, Sparse Matrices, Matrix Expressions, Indexed (for code generation), Geometric Algebra, Differential Geometry, Tensor Canonicalization, and various projects in Physics.

Involved Software

Difficulty

Prerequisite Knowledge

Implementation of vector integration

Idea

The idea is to build proper helper functions and class structure to support vector integration over lines, surfaces and volumes.

Status

sympy.vector supports vector derivatives, but not proper functionality for vector integration - over lines, surfaces etc. For help, Prasoon's PR with his work in GSoC 2013 can be taken as a starting point.
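
The sketch below contrasts the two sides: the divergence operator that sympy.vector already provides, and a line integral that currently has to be assembled by hand. The `line_integral` function is a hypothetical helper written only for this illustration (it is not part of SymPy); it parametrizes the curve, substitutes into the field, and integrates the dot product with the tangent.

```python
from sympy import symbols, sin, cos, pi, integrate, simplify, sympify
from sympy.vector import CoordSys3D, divergence

N = CoordSys3D('N')

# Existing functionality: a vector derivative.
div_r = divergence(N.x*N.i + N.y*N.j + N.z*N.k)


def line_integral(field, curve, t, a, b, C):
    """Hypothetical helper: integrate field . dr along curve(t), a <= t <= b.

    curve is a tuple (x(t), y(t), z(t)) of scalar expressions in t.
    """
    comps = [sympify(c) for c in curve]
    on_curve = {C.x: comps[0], C.y: comps[1], C.z: comps[2]}
    integrand = sum(
        field.dot(e).subs(on_curve) * c.diff(t)
        for e, c in zip((C.i, C.j, C.k), comps)
    )
    return integrate(simplify(integrand), (t, a, b))


t = symbols('t')
F = -N.y*N.i + N.x*N.j
circulation = line_integral(F, (cos(t), sin(t), 0), t, 0, 2*pi, N)

print(div_r, circulation)
```

A proper implementation would presumably wrap the curve, surface, or volume in its own class instead of passing bare tuples.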

Involved Software

Python, Git

Difficulty

Intermediate

Prerequisite Knowledge

Physics Projects

Symbolic quantum mechanics (sympy.physics.quantum)

Idea

In the past, Brian Granger was the maintainer of the sympy.physics.quantum subpackage. He has stepped down from this position. Until someone takes over maintenance of this subpackage, we will not be able to mentor any GSoC projects in this area. If you have questions about this, please contact Ondřej Čertík.

Status

Involved Software

Difficulty

Prerequisite Knowledge

Continuum Mechanics: Create a Rich 2D Beam Solving System

Idea

Singularity functions are a popular tool for solving beam bending stress and deflection problems in mechanical design. This is traditionally done by hand calculations and can be very tedious and error prone. This process could be improved greatly by a CAS implementation of the functions and some high level abstractions for constructing beam loading profiles.

The deliverable would be a unit tested and documented sub-package for SymPy 2D beams that can solve many beam problems, add in arbitrary cross sections, plotting, be robust, and add any other relevant features.
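
For a sense of the interface produced by the first implementation, here is a minimal sketch of a simply supported beam using the `Beam` class in `sympy.physics.continuum_mechanics.beam`. The loading (a distributed load of 6 starting at x = 2 on a beam of length 4) and the symbol names are illustrative.

```python
from sympy import symbols
from sympy.physics.continuum_mechanics.beam import Beam

E, I = symbols('E I')
R1, R2 = symbols('R1 R2')

# Beam of length 4 with symbolic elastic modulus and second moment of area.
beam = Beam(4, E, I)

# Unknown point reactions at both ends (singularity order -1) and a
# constant distributed load starting at x = 2 (order 0).
beam.apply_load(R1, 0, -1)
beam.apply_load(6, 2, 0)
beam.apply_load(R2, 4, -1)

# Zero deflection at both supports.
beam.bc_deflection = [(0, 0), (4, 0)]
beam.solve_for_reaction_loads(R1, R2)

print(beam.reaction_loads)
print(beam.load)
```

The loads are stored as singularity functions, which is what makes the later integration steps for shear, moment, and deflection mechanical.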

Status

A 2016 GSoC project made the first implementations of this:

https://github.com/sympy/sympy/pulls?utf8=%E2%9C%93&q=is%3Apr+author%3Asampadsaha5+

Involved Software

Python, Git

Difficulty

Beginner

Prerequisite Knowledge

No specific prerequisite knowledge is necessary but it would help if the student had some knowledge of singularity functions and beam stress/strain analysis methods.

Classical Mechanics: Implement an O(N) Equation of Motions Method

Idea

Roy Featherstone, Abhi Jain, and others developed recursive methods of forming the right hand side of the differential equations for complex multibody systems that have an evaluation time of O(N) instead of O(N^3). This project would be dedicated to implementing a symbolic O(N) method to complement the LagrangesMethod and KanesMethod classes. This would give a significant speed boost in numerical evaluation for systems with more than 20 or so bodies.

Status

Sampad Saha's work in 20

Brandon Milam made significant headway in this project in 2016. See:

Additionally, Sahil Shekewat worked on implementing a joint based descriptor for systems that should be finished and utilized for this work.

Involved Software

Python, Git

Difficulty

Advanced

Prerequisite Knowledge

This project requires familiarity with multibody dynamics. At the least, one should know how to form the equations of motion of complex systems with one method. The ideal candidate will have experience forming the equations of motion with the aforementioned Featherstone or Jain methods.

Classical Mechanics: Efficient Equation of Motion Generation with Python

Idea

Currently we have basic equation of motion generation with automated Kane's and Lagrange's methods. These methods work well but can take many minutes to complete for hard problems. The algorithms that derive these equations of motion can be improved in both speed of computation and the resulting simplification of the equations of motion. This project would involve cleaning up the code base, profiling to find the slow functions, and digging into the SymPy codebase for trigonometric simplification and other relevant function calls to speed up the EoM generation. Utilization of SymEngine as a backend can also be explored. These modifications will help speed up both the entire SymPy codebase and the Mechanics package.

Status

There is no previous work on this topic.

Involved Software

Python, Git

Difficulty

Beginner

Prerequisite Knowledge

There are no prerequisites for this project.

Classical Mechanics: Efficient Equation of Motion Generation with C++

Idea

Recently, a C++ implementation of the SymPy core has matured (SymEngine). Use of this library could significantly increase the speed of derivation of the equations of motion of complex multibody systems. This project would be dedicated to ensuring SymEngine works with all operations typically used in the sympy.physics.vector and sympy.physics.mechanics packages. Work is needed to add core data types and algorithms to SymEngine to mimic their equivalents in SymPy, and work will need to be done to provide seamless wrappers for optional use in SymPy. Large benchmark multibody problems would be developed in Python to test the speed.

Status

There has been some work on this in the SymEngine project. There are some benchmarks that test matrix differentiation speed and the like. Please ask on the mailing list about the previous work.

Involved Software

C++, Python, CMake, Git

Difficulty

Intermediate

Prerequisite Knowledge

There are no prerequisites for this project.

Classical Mechanics: Generalize the Equation of Motion Generation Classes

Idea

We need to create an abstract base class for equations of motion methods (LagrangesMethod, KanesMethod) so that new methods are easy to add. This project would focus on the generalization and on creating at least one new method class, for example NewtonEulersMethod or HamiltonsMethod. This abstract base class would support a standard interface to access the system's states, constants, exogenous inputs, mass matrix, right hand side, etc.
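To make the idea of a standard interface concrete, here is a minimal sketch of what such a base class might look like. The names `EOMMethodBase` and `MassSpringNewton` are hypothetical and do not exist in SymPy; the property names are assumptions modeled loosely on the attributes the paragraph above lists, and the actual design should follow the discussion in the linked pull request.

```python
from abc import ABC, abstractmethod

from sympy import Matrix, symbols
from sympy.physics.mechanics import dynamicsymbols


class EOMMethodBase(ABC):
    """Hypothetical common interface for equation-of-motion generators."""

    @property
    @abstractmethod
    def states(self):
        """Column Matrix of the state variables."""

    @property
    @abstractmethod
    def mass_matrix(self):
        """Matrix M in M * d(states)/dt = forcing."""

    @property
    @abstractmethod
    def forcing(self):
        """Column Matrix f in M * d(states)/dt = f."""

    def rhs(self):
        """Solve M * d(states)/dt = f for the state derivatives."""
        return self.mass_matrix.LUsolve(self.forcing)


class MassSpringNewton(EOMMethodBase):
    """Toy subclass: m*x'' = -k*x written in first-order form."""

    def __init__(self, m, k):
        self.m, self.k = m, k
        self.x, self.v = dynamicsymbols('x v')

    @property
    def states(self):
        return Matrix([self.x, self.v])

    @property
    def mass_matrix(self):
        return Matrix([[1, 0], [0, self.m]])

    @property
    def forcing(self):
        return Matrix([self.v, -self.k * self.x])


m, k = symbols('m k', positive=True)
system = MassSpringNewton(m, k)
print(system.rhs())
```

Because `rhs` is implemented once in the base class, any new method class only has to supply the three abstract properties, which is the kind of reuse the generalization is meant to enable.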

Status

Initial work has been done here: https://github.com/sympy/sympy/pull/11431, but it needs to be integrated with the other classes and tied into PyDy.

Involved Software

Python, Git

Difficulty

Intermediate

Prerequisite Knowledge

This project requires a basic understanding of dynamical systems and an understanding of at least one method of generating the equations of motion for a multibody system.

Classical Mechanics: Autolev Parser

Idea

sympy.physics.mechanics can solve problems similar to those that the now defunct proprietary program Autolev can solve. To help people transition from Autolev code to SymPy, it would be nice to have an Autolev parser that could generate SymPy code. The results would be checked by evaluating the code in the two languages and ensuring that the numerical results are the same.

It may be worth looking into using existing parser technology, for example ANTLR may be a good option.

Status

There has been no work on this project yet.

Involved Software

Python, Git, Autolev, other

Difficulty

Beginner

Prerequisite Knowledge

None

Computer Science, Graphics, and Infrastructure Projects

Code Generation

Idea

There are quite a few potential projects for codegen.

The code generation system in SymPy has been overhauled to use AST nodes from sympy.codegen.ast. There are, however, many more nodes that could be added, e.g. for Fortran in sympy.codegen.fnodes. It could also be useful if the code printers could output parallel code using OpenMP directives (e.g. parallel for loops for C and Fortran, including use of reduction). Most printers do not yet support the new AST nodes; it would be useful if they were extended so that they can express ASTs created e.g. by functions in sympy.codegen.algorithms.
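For readers new to the AST-based approach, the following small sketch shows one node, `Assignment` from sympy.codegen.ast, being rendered by two different printers. It only illustrates the existing mechanism; the variable names are arbitrary, and the exact formatting of the output may vary between SymPy versions.

```python
from sympy import ccode, fcode, symbols
from sympy.codegen.ast import Assignment

x, y = symbols('x y')

# One language-neutral AST node describing "y = x**2 + 1".
stmt = Assignment(y, x**2 + 1)

# The same node rendered by the C and the Fortran printers.
c_line = ccode(stmt)
f_line = fcode(stmt, source_format='free')
print(c_line)
print(f_line)
```

A project on new nodes or new printer support would extend this pattern: each additional node type needs a corresponding print method in every printer that is supposed to handle it.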

Another idea for codegen is to add more support for directly working with matrices. For instance, matrix expressions (sympy.matrices.expressions objects) should print LAPACK calls.

Status

We have support for a number of backends and basic code gen classes in place. There is work on updating the system ongoing. Please ask on the mailing list.

Involved Software

Fortran, C, C++, Julia, Rust, Python, LLVM, Javascript, Octave, Matlab, etc.

Difficulty

Intermediate to Advanced

Prerequisite Knowledge

Parsing

Idea

Currently SymPy has the ability to generate Python, C, and Fortran code from SymPy expressions.

It would be very interesting to go the other way. Can we parse Python, C, and Fortran code and produce SymPy expressions? This would allow SymPy to easily read in, alter, and write out computational code. This project would enable many other projects in the future. As a first step take a look at the current code generation and autowrap functionality. Ideally this project would create a general framework for parsers and then use this system to implement parsers for a few of the languages listed above. See the other parsing ideas on this page, as well as Parsing.
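The read, alter, write cycle described above can already be sketched for the narrow case of expression strings in SymPy's own syntax, using the existing `parse_expr` function. This is only an illustration of the workflow, not a parser for Python, C, or Fortran source; the input string is an arbitrary example.

```python
from sympy import ccode, diff, symbols
from sympy.parsing.sympy_parser import parse_expr

x, y = symbols('x y')

# Read: turn source text into a SymPy expression.
expr = parse_expr("x**2 * sin(y) + 3*x")

# Alter: any symbolic transformation, here a derivative with respect to x.
derivative = diff(expr, x)

# Write: emit the transformed expression as C code.
print(ccode(derivative))
```

A general parsing framework would replace the first step with front ends for real programming languages while keeping the last two steps unchanged.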

Status

Involved Software

Difficulty

Prerequisite Knowledge

Benchmarks and performance

Idea

Speed is important for SymPy. One issue is that it's difficult to tell what is too slow, and, more importantly, if a given change makes things faster or slower.

SymPy needs more benchmarks. It also needs an automated system to run them. That way, when someone adds some code that slows things down in an unexpected way, we will know about it.

There are already some benchmarks at https://github.com/sympy/sympy_benchmarks, and some others in the main SymPy repo. But not all benchmarks are in the sympy_benchmarks repo. Also, the repo uses asv, but the results are run and hosted ad hoc, as we don't have a dedicated machine to run the benchmarks.

This project should do the following:

  • Move benchmarks from the sympy repo to the sympy_benchmarks repo.
  • Add new benchmarks as needed.
  • Work with the community to set up a dedicated machine that can continuously run asv and warn about benchmark regressions. It would also be nice if this could be set up to warn about performance regressions on PRs.
  • Make improvements to SymPy to improve performance issues found throughout the project.
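For orientation, a benchmark in the asv style is a plain class or function whose timed entry points have names starting with `time_`. The sketch below follows that convention; the class name `TimeExpand` and the expression being expanded are illustrative assumptions, not benchmarks taken from the sympy_benchmarks repo.

```python
from sympy import expand, symbols


class TimeExpand:
    """asv collects methods whose names start with ``time_``."""

    def setup(self):
        # Runs before the timed method and is not part of the measurement.
        x, y, z = symbols('x y z')
        self.expr = (x + y + z + 1)**10

    def time_expand(self):
        # Only the body of this method is timed.
        expand(self.expr)
```

Keeping expression construction in `setup` means the benchmark measures `expand` alone, so a regression in that one routine is easier to attribute.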

Some prior art:

Improve the plotting module

Idea

A very approximate difficulty guesstimate is given.

  • medium-hard: Manipulate parameters in graphs (needs some kind of GUI (matplotlib provides widgets)) (IPython's notebook?)
  • medium: Animations (in matplotlib and IPython's notebook)
  • easy-medium: Write/fix/extend/port backends: matplotlib, Google Chart API link, pyglet, asciart, d3.js
  • easy: The old pyglet module does not work with the new module. Simplify the old module and write a backend for it.
  • medium-hard: Write an openGL backend for matplotlib (that should be discussed with the matplotlib team) (you may start with our pyglet module)
  • hard: Implement an intelligent routine that decides on the sampling rate so that sharp edges are better plotted (it is done for 2D however it would be nice to have it for 3D). An "asymptote detector" would be nice.
  • easy / medium: Implement an intelligent routine that automatically determines the regions of interest for plotting.
  • easy: Plot:
    • objects from the geometry module
    • 2D and 3D linear operators (the effect of a matrix on a plane/3D space)
    • the effect of complex maps
    • vector fields
    • contours
  • easy / medium: Implement a backend for Mayavi for 3D plotting.
  • Fix related things/bugs in SymPy
  • Implement high level features, so that it works akin to Mathematica (http://reference.wolfram.com/mathematica/ref/Plot.html)
  • Improve textplot so that it can support all kinds of plots using ASCII/Unicode characters right in the terminal (with no dependencies).
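The item on choosing the sampling rate can be illustrated with a small stand-alone sketch of recursive adaptive sampling in one variable. This is not SymPy's implementation; the function name `adaptive_sample`, the midpoint-versus-chord test, and the default tolerances are all assumptions chosen for brevity.

```python
def adaptive_sample(f, a, b, depth=0, max_depth=8, tol=1e-3):
    """Return sorted (x, y) samples of f on [a, b], refined where f bends."""
    fa, fb = f(a), f(b)
    mid = 0.5 * (a + b)
    fm = f(mid)
    # Deviation of the true midpoint value from the chord through the ends.
    deviation = abs(fm - 0.5 * (fa + fb))
    # Always split a couple of times so a feature is not missed by a
    # coincidentally straight-looking first interval.
    if depth >= max_depth or (depth >= 2 and deviation <= tol):
        return [(a, fa), (mid, fm), (b, fb)]
    left = adaptive_sample(f, a, mid, depth + 1, max_depth, tol)
    right = adaptive_sample(f, mid, b, depth + 1, max_depth, tol)
    return left + right[1:]   # drop the duplicated midpoint


# A function with a sharp edge at x = 0.3.
points = adaptive_sample(lambda x: abs(x - 0.3), 0.0, 1.0)
print(len(points))
```

The samples cluster around the kink while the straight parts stay coarse. A 3D version would need a comparable flatness test on surface patches, which is where most of the design work for that item would lie.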

Status

Involved Software

HTML, Javascript, CSS, Python

Difficulty

Prerequisite Knowledge

User Application Projects

SymPy Live and SymPy Gamma (on Google App Engine)

Idea:

WolframAlpha recently released a big update. You can now pay them and get a bunch of features. They also do things like save your search history.

Right now, our competition to WolframAlpha is SymPy Live (http://live.sympy.org/), but this works a little differently. SymPy Live is an exact duplicate of the console version of SymPy, running on the App Engine, but WolframAlpha tries to be smart about what the user wants. A while back, Ondřej whipped up a thing called SymPy Gamma (http://sympygamma.com/), which is a little closer to WolframAlpha.

The GSoC project would be to improve one or both of these projects. SymPy Gamma could be improved a lot, by making it more intelligent about what output it produces for different inputs, making it parse expressions that aren't given in exact SymPy syntax (e.g. natural language queries like Wolfram|Alpha allows), making it produce plots, perhaps replacing the notebook with an IPython notebook. SymPy Live could use a lot of the same features. Furthermore, SymPy Live has bugs with pickling, which could be fixed or eliminated by converting Live to use dill instead of pickle and/or by improving the pickling support in SymPy itself.

Many of these things should be implemented in SymPy and simply called from the web applications, for example, improved parsing for SymPy Gamma. Look at Mathics (http://www.mathics.org/) for inspiration.

Status

Involved Software

HTML, Javascript, CSS, Python

Difficulty

Prerequisite Knowledge

SymEngine Projects

SymEngine is a standalone fast C++ symbolic manipulation library. Optional thin Python wrappers allow easy usage from Python and integration with SymPy.

Please contact the SymEngine list (or Ondřej Čertík) for questions about the SymEngine related topics. You can also ask on SymEngine's gitter: https://gitter.im/symengine/symengine and propose something that is not listed below.

Polynomials and the rest of SymEngine

Idea

Build on the already existing univariate/multivariate polynomial module and have seamless interop with the rest of SymEngine. Keeping in mind the eventual goal of being a fast core, this is of high importance for SymPy as well.

Status

  • Interop Proposal
  • Univariate class improvements:
  • Multivariate class improvements
    • Currently implemented as a hashmap from vector of ints (degrees of respective symbols) to the coefficient.
    • For any operations between two multivariate polynomials, the vectors of all of the entries in the map must be updated to a common format (representing the union of the two symbols sets)
    • This is slow
    • We can try and use a ordered_map/hashmap instead of a vector for storing the degree of each symbol (in a particular monomial)
    • Operations should become much faster
  • Multivariate bindings for Piranha
    • Write wrappers for the multivariate piranha class for easy use within SymEngine
  • Miscellaneous
    • Groebner basis
    • Square free decomposition
    • Factorization
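The trade-off behind the multivariate item can be shown with a small Python sketch (SymEngine itself is C++, so this is an illustration only). The first helper mimics the current layout, where monomials are exponent vectors that must be realigned to a common symbol list; the second stores each monomial as its own symbol-to-degree map, so no realignment is needed. The names `align` and `mul_sparse` are hypothetical.

```python
from collections import defaultdict


def align(poly, symbols, all_symbols):
    """Re-key a dense-exponent polynomial onto a larger symbol list."""
    index = [all_symbols.index(s) for s in symbols]
    aligned = {}
    for exps, coeff in poly.items():
        new_exps = [0] * len(all_symbols)
        for i, e in zip(index, exps):
            new_exps[i] = e
        aligned[tuple(new_exps)] = coeff
    return aligned


def mul_sparse(p, q):
    """Multiply polynomials whose monomials are frozensets of (symbol, degree)."""
    result = defaultdict(int)
    for mono_p, cp in p.items():
        for mono_q, cq in q.items():
            degrees = dict(mono_p)
            for sym, deg in mono_q:
                degrees[sym] = degrees.get(sym, 0) + deg
            result[frozenset(degrees.items())] += cp * cq
    return {m: c for m, c in result.items() if c != 0}


# Dense exponent vectors: x**2 + 1 over (x,) must be realigned to (x, y).
p_dense = {(2,): 1, (0,): 1}
p_aligned = align(p_dense, ['x'], ['x', 'y'])
print(p_aligned)

# Per-monomial symbol maps: (x**2 + 1) * (y + 3) with no realignment step.
p = {frozenset({('x', 2)}): 1, frozenset(): 1}
q = {frozenset({('y', 1)}): 1, frozenset(): 3}
product = mul_sparse(p, q)
print(product)
```

Whether the per-monomial map is actually faster in C++ depends on hashing and allocation costs, so any such change should be backed by benchmarks.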

Involved Software

C++, Cython, Python, Git

  • For implementing/testing as an interface one may need to work with SymEngine.py and SymPy as well.

Difficulty

Intermediate

Prerequisite Knowledge

One needs to go through the code of the currently existing Polynomial module in SymEngine.

Additional reading

Improve SymPy integration

Idea

SymEngine can be used as the symbolic backend for all the functions in sympy.physics.mechanics instead of SymPy's core by setting an environment variable. This project is to expand it to more modules in SymPy and also implement missing features from SymPy core in SymEngine.

A good proposal should briefly outline the changes that need to be implemented in order to get good results. Some prior work was done on this by @ShikharJ during GSoC 2017, and the tasks undertaken should build on that prior work.

Related Issues: https://github.com/symengine/symengine/issues/912 https://github.com/symengine/symengine/issues/1324

Related Pull Requests: https://github.com/symengine/symengine/pull/1332 https://github.com/sympy/sympy/pulls/ShikharJ

Involved Software

C++, Cython, Python, Git

Difficulty

Medium

Prerequisite Knowledge

Implement solvers for SymEngine

Idea

A lot of work is actively going on in SymPy on the design of solvers. In this project, the student will be expected to implement expression solvers for SymEngine along the lines of those in SymPy. At a minimum, the project should implement most of the robust solvers that are done right in SymPy.

Status

Solvers are not currently present in SymEngine.

Involved Software

C++, CMake, Git

Difficulty

Intermediate

Prerequisite Knowledge

Improve Ruby wrappers

Idea

The motivation for SymEngine is to develop the Computer Algebra System once in C++ and then use it from other languages, rather than doing the same thing all over again for each language in which it is required. Not all of the SymEngine classes are wrapped in Ruby yet, so the student can dive right in and see what else needs to be done. A few things that the project involves are:

  • Extending the C interface of SymEngine library.
  • Wrapping up the C interface for Ruby using Ruby C API, including error handling.
  • Designing the Ruby interface.
  • Integrating IRuby with symengine gem for better printing and preparing new IRuby notebooks.
  • Integrating the gem with existing gems like gmp, mpfr and mpc.
  • Making the installation of symengine gem easier.

The SciRuby organisation may also accept a student working on this project.

Status

The ruby wrappers, symengine.rb, are a result of Abinash's GSoC. Have a look at the blog post. Improve them by wrapping the rest of SymEngine.

Involved Software

Ruby, C++, CMake, Git

Difficulty

Beginner

Prerequisite Knowledge

The Beginner Contributor Guide - Ruby Extensions and the resources mentioned there contain everything the student needs to know to get started.

Idea Prompts

  • Linear algebra
    • Rewrite the Matrices module to be more like the polys module, i.e., allow Matrix to use the polys ground types, and separate the internal data (sparse vs. dense) from the Matrix interface. The goal is to make the matrices in SymPy much faster and more modular than they are now.
    • Refactor the matrices module to have a cleaner orthogonal organization
  • improve the integration algorithm
    • integration of functions on domains of maximum extent, etc.
    • Interesting idea: "SYMBOLIC COMPUTATION OF INTEGRALS BY RECURRENCE" by MICHAEL P. BARNETT
  • definite integration & integration on complex plane using residues. Note that we already have a strong algorithm that uses Meijer G-Functions implemented. So we need to first determine if such an algorithm would be worthwhile, or if it would be better to extend the current algorithm. Note that there are many integrals that are easy to compute using residues that cannot be computed by the current engine. Other possibilities: the ability to compute closed path integrals in the complex plane, which is not possible with the Meijer G algorithm.
  • Groebner bases and their applications in geometry, simplification and integration
    • improve Buchberger's algorithm and implement Faugere F4 (compare their speed) Note: This has already been implemented by a previous GSoC student. Please check with us to see the current state of Groebner bases in SymPy
  • improve polynomial algorithms (gcd, factorization) by allowing coefficients in algebraic extensions of the ground domain
  • implement efficient multivariate polynomials (arithmetic, gcd, factorization)
    • Implement a sparse representation for polynomials (see the dummy files in sympy/polys/ starting with "sparse" in the SymPy source code for a start to this project).
    • Figure out which representations to use where (sparse vs. dense).
    • implement efficient arithmetic (e.g. using geobuckets (Yan) or heaps (Monagan & Pearce))
  • improve SymPy's pattern matching abilities (efficiency and generality)
    • implement similarity measure between expression trees
    • expression complexity measures (e.g. Kolmogorov's complexity)
    • implement expressions signatures and heuristic equivalence testing
    • implement semantic matching (e.g. expression: cos(x), pattern: sin(a*x) + b)
      • e.g by using power series for this purpose (improve series speed)
    • Expand the capabilities of Wild() and match() to support regular expression-like quantifiers.
  • improve simplification and term rewriting algorithms
    • add (improve) verbatim and semi-verbatim modes (more control on expression rewriting)
    • implement more expression rewrite functions (to an exact form that user specifies). This may involve rewriting the rewrite framework to be more expressive. For example, should cos(x).rewrite(sin) return sqrt(1 - sin(x)**2) or sin(pi/2 - x)?
    • maybe put transformation rules in an external database (e.g. prolog), what about speed?
    • improve context (e.g. input) depended simplification steps in different algorithms
      • e.g. the integrator needs different sets of rules to return "better" output for different input
      • but there are more: recurrences, summations, solvers, polynomials with arbitrary coefficients
    • what about information carried by expressions?
      • what is simpler: chebyshevt(1, x) or x ?
      • what is simpler: chebyshevt(1000, x) or (...) ?
    • improve trigonometric simplification. See for example the paper by Fu et al.
  • implement symbolic (formal) logic and set theory
    • implement predicate (e.g. first-order), modal, temporal, description logic
    • implement multivalued logic; fuzzy and uncertain logic and variables
    • implement rewriting, minimization, normalization (e.g. Skolem) of expressions
    • implement set theory, cardinal numbers, relations etc.
    • This task is heavily tied to the assumptions system.
  • implement symbolic global optimization (value, argument) with/without constraints, use assumptions
  • continue work on objects with indices (tensors)
    • include the index simplification algorithms used in xAct and cadabra.
  • generalized functions - Dirac delta, P(1/x), etc... Convolution, Fourier and Laplace transforms
    • Fourier and Laplace transforms are implemented, but we cannot do many cases involving distributions. Is this enough alone for a project though? -Aaron
  • vector calculus, differential fields, maybe Lie algebras & groups
  • parametric integrals asymptotic expansion (integral series)
  • Integral equations. See for example the work started at http://code.google.com/p/sympy/issues/detail?id=2344. This could be part of a project on ODEs, for example.
  • partial differential equations. Currently, SymPy can't solve any PDEs, though a few tools related to separation of variables are implemented. The PDE module should be structured similarly to the ODE module (see the source code of sympy/solvers/ode.py).
  • improve SymPy's Common Subexpression Elimination (CSE) abilities.
  • Singular analysis and test continuous.
    • find singularities of the function and classify them.
    • test whether the function is continuous at a given point, and on a given interval. Note: Please discuss this idea with us if you are interested, as it is currently somewhat vague as presented.
  • Control theory. Systems for Maple and Mathematica might provide insight here. http://www.mcs.anl.gov/~wozniak/papers/wozniak_mmath.pdf might be useful.
  • Diophantine Equations: SymPy does have substantial support for solving these; nevertheless, there is more work possible to improve the solver.
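Two of the prompts above, pattern matching and common subexpression elimination, extend functionality that already exists. As a baseline for what a project would build on, here is a small sketch of the current `Wild`/`match` and `cse` interfaces; the expressions are arbitrary examples.

```python
from sympy import Wild, cse, sqrt, symbols

x, y = symbols('x y')

# Structural matching: wildcards that are not allowed to contain x.
a = Wild('a', exclude=[x])
b = Wild('b', exclude=[x])
match = (2*x + 3).match(a*x + b)
print(match)

# Common subexpression elimination on two expressions sharing sqrt(x + y).
exprs = [sqrt(x + y) + 1, 2*sqrt(x + y)]
replacements, reduced = cse(exprs)
print(replacements, reduced)
```

Matching here is purely structural, which is why the prompts ask for semantic matching and regular-expression-like quantifiers as possible extensions.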

Other Related Projects

Theano

There are 2 ideas in the Theano GSoC ideas list that are of interest to SymPy users: 1) lower Theano overhead (significant for scalars) and 2) generate dynamic libraries. They are interesting to SymPy users via the SymPy -> Theano bridge, which allows compiling a SymPy graph for faster execution. If you are interested in them, contact the Theano developers.

PyDy

The classical mechanics package is tightly coupled with the PyDy project which enhances the mechanics package with numeric and visualization capabilities. All of the ideas for sympy.physics.vector and sympy.physics.mechanics are listed on the PyDy wiki in addition to other projects that are not in the SymPy code base but related. See https://github.com/pydy/pydy/wiki/GSoC-2016-Ideas.

Non-Ideas

Every year, people ask about implementing various things that we have already decided do not belong in SymPy. Among these are:

  • Graph theory. The NetworkX package already does a great job of graph theory in Python. If you are interested in working in graph theory, you should contact them.
  • Numerical solvers. SymPy is a symbolic library, so the code should focus on solving things symbolically. There are already many libraries for solving problems numerically (NumPy, SciPy, ...).