GSoC 2017 Application Shikhar Jaiswal: Improving SymEngine's Python Wrappers and SymPy SymEngine Integration

Shikhar Jaiswal edited this page Apr 3, 2017 · 19 revisions

Personal Background

Details

Name : Shikhar Jaiswal

University : Indian Institute of Technology, Patna

Email : jaiswalshikhar87@gmail.com

GitHub : ShikharJ

Blog : shikharj.github.io

Time-zone : IST (UTC+5:30)

Age : 18

About Me

I am a first-year undergraduate student pursuing a Bachelor of Technology in Computer Science and Engineering at the Indian Institute of Technology, Patna. I was introduced to programming about three years ago. I have previously written a file-based organic and inorganic chemical analysis module in C++: the user selects from an initial set of characteristics, and the module returns the name of the test to carry out to determine the functional group of the compound.

I also had the opportunity to implement a steganography tool (in C++), using the algorithm employed in the original Enigma machine used by the Axis powers during World War II.

I am comfortable with the STL and algorithms, and I am currently improving my competitive programming skills alongside software development. I was introduced to Python through the book “Python Crash Course” by Eric Matthes, and have developed a 2-D space shooting game using the PyGame library and worked on data visualisation using Pygal. I am familiar with git for version control, and I am currently working on a project involving image processing and gesture recognition, which uses MATLAB/GNU Octave and OpenCV respectively. I am also trying my hand at Cython through the book “Cython: A Guide for Python Programmers” by Kurt W. Smith.

Platform Details

OS : Ubuntu 16.10

Hardware Configuration : i7 7700HQ/ 16GB

IDE : C/C++ - CLion
Python - PyCharm

Editor : SublimeText 3

Contributions to SymEngine/SymPy

SymEngine

  • Removed unimplemented constructor declaration in SymEngine::Min (Merged)
  • Added more functions to LLVMDoubleVisitor (Pending)
  • Worked on changing the test clause to CHECK() (Pending)
  • Worked on increasing code coverage (Merged)
  • Implemented the derivative of Dirichlet_eta function and added tests (Merged)
  • Improved test cases in test_infinity.cpp (Merged)
  • Updated functions.cpp and added tests (Merged)
  • Implemented NaN class and made subsequent changes in the code-base (Merged)
  • Implemented automatic evaluation of powers to and of constants (with @isuruf) (Merged)
  • Implemented class ComplexBase and is_a_complex() virtual function in class Number (Merged)
  • Improved upon Zeta function derivative and added tests (Merged)
  • Implemented Infty::pow and Infty::rpow functions (Merged)
  • Worked on replacing calls to rcp_static_cast() with calls to down_cast() (Merged)
  • Implemented the derivative of LeviCivita function (Merged)
  • Incorporated the derivative of Zeta, UpperGamma and LowerGamma functions and added tests (with @isuruf) (Merged)
  • Restructured the code-base to convert public and protected data members of select classes to private (Merged)
  • Applied -Wconversion flag and reported the errors (with @isuruf) (Pending)
  • Restructured the mul_dense_dense() function (Merged)
  • Added -ftrapv flag to clang builds for checking integer overflows (Merged)
  • Wrapped as_numer_denom() function in C (Merged)
  • Added a GCC 6 build to TravisCI (with @certik) (Merged)
  • Worked on changing the print format of exponential expressions and added tests (Merged)
  • Added support for down_cast() and made changes in the code-base (with @isuruf) (Merged)
  • Implemented complex_double_get() function (Merged)
  • Added clang sanitizer checks and worked on test consolidation (Pending)
  • Implemented symengine_have_component() function (Merged)
  • Implemented rational_get_mpq() function (Merged)
  • Minor change in UndefinedError class (Merged)

Issues

SymEngine.py

  • Added CodeCov check (Pending)
  • Wrapped sech, csch, acsch and asech functions (Merged)
  • Wrapped DenseMatrix::reshape() function (Merged)
  • Formatted .py files according to PEP8 (Merged)
  • Refactored DenseMatrix class and introduced MutableDenseMatrix and ImmutableDenseMatrix classes (Pending)
  • Refactored Subs and Derivative classes (Pending)
  • Wrapped SymEngine::Min and SymEngine::Max functions and added tests (Pending)
  • Minor improvement in atoms() function and added tests (Merged)

SymPy

  • Added a bunch of constants to sympy/physics/units (Merged)
  • Refactored printing/pretty/pretty.py to use pretty_symbology.py (Pending)
  • Worked on porting SymEngine to physics/optics (Pending)

Project Overview

Speed is of the utmost importance for any Computer Algebra System.

SymEngine was initially developed to serve as an optional core for the SymPy CAS in the future. Over the years, it has matured enough to be used as a symbolic backend. Using SymEngine can significantly speed up various symbolic operations, so giving SymPy users the option to switch over to SymEngine’s routines makes SymPy a better choice for projects that require fast manipulations.

On the other hand, this will also lead to the development of a number of features currently lacking in SymEngine and its Python wrapper, which would be ported over from SymPy in order to provide smooth wrappers for optional use.

An added advantage is that SymEngine can be used in SymPy with minimal programming effort (as clearly demonstrated here). Porting therefore takes less time, and more time can be dedicated to implementing additional functionality in SymEngine and SymEngine.py, which can in turn be integrated into SymPy.

Discussions

I initially wanted to implement my proposal on a module-by-module basis (i.e. improving the backend of a single module at a time). However, after talks with Isuru, it turned out that this would take longer. This proposal therefore takes a routine-by-routine approach, for which a number of routines were shortlisted through our discussions. The main idea is that the majority of the implementation and wrapping work should occur first, followed by introducing changes and tackling conflicts in the SymPy repository. Thanks to Isuru, this proposal now has a much improved layout and timeline breakup.

Project Details

Currently, roughly 16 modules (or specialised directories) out of a total of 37 in SymPy are within the present scope of improvement. Since other contributions during the GSoC period may add further modules and sub-modules, the exact figure may vary. I plan on executing this proposal in three intermixing phases (the list of specific functions is given later):

Phase I: Working with Existing Wrapped Functionality

In this phase, the idea is to refurbish (completely or partially) all the modules that import routines already implemented in SymEngine and available in the SymEngine.py wrapper. As such, no new functionality is expected to be implemented in either SymEngine or its Python wrapper, though minor changes may be required. Only those modules are worked upon in which every imported routine is either available in the SymEngine.py wrapper or beyond the scope of this project (for example, integration heuristics and assumption routines). This work should require only trivial changes, such as changing:

from sympy.core import ...

to

from sympy.core.backend import ...

Testing (for compatibility issues) and, if required, benchmarking of these modules will also occur during this period. This phase can be started before and during the Community Bonding period, and will serve as a warm-up for the next two phases, which are more coding intensive.
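As a rough illustration of what sits behind such an import, the sketch below shows one way a backend module could choose between the two libraries from an environment flag. It is only a sketch: the helper names use_symengine and load_backend are hypothetical, and the set of values treated as "true" is an assumption rather than a statement of what sympy.core.backend actually does.

```python
import importlib
import os


def use_symengine(environ=None):
    """Return True when the USE_SYMENGINE flag is set to a truthy value.

    The accepted spellings ("1", "t", "true") are an assumption for this sketch.
    """
    environ = os.environ if environ is None else environ
    return environ.get("USE_SYMENGINE", "0").lower() in ("1", "t", "true")


def load_backend(environ=None):
    """Import SymEngine when requested and installed, else fall back to SymPy.

    Returns the imported module, or None if neither library is installed.
    """
    candidates = ["symengine", "sympy"] if use_symengine(environ) else ["sympy"]
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    return None


backend = load_backend()
if backend is not None:
    # Both libraries expose Symbol and expand at the top level.
    x = backend.Symbol("x")
    print(backend.__name__, backend.expand((x + 1) ** 2))
```

Run with USE_SYMENGINE=1 set in the environment to exercise the SymEngine path; calling code stays the same either way, which is the point of routing imports through a single backend module.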

Phase II: Implementing Specific Functionalities and Wrapping

This phase will primarily focus on implementing specific functionalities that aren’t currently available in SymEngine, in SymEngine.py, or in both, but can be implemented within a stipulated amount of time. This includes implementing routines in SymEngine in a manner similar to SymPy, updating the Python wrapper with the latest developments, and testing all the implementations. Since I have worked extensively with SymEngine, this should be an extensive yet moderately challenging task. Since most of the work will be centered around implementation in SymEngine and SymEngine.py, no major change is expected in the SymPy repository. The USE_SYMENGINE clause requires the latest version of SymEngine’s Python wrapper, so a new SymEngine.py release will also be necessary.

Phase III: Augmenting Import Routines, Testing and Benchmarking

This phase would largely be a follow-up to the first two phases, especially the second. All the new functionality implemented in SymEngine and SymEngine.py will be ported over to SymPy. All of the modules left uncovered in the first phase will also be updated here, along with final testing and benchmarking of the changes made until then. Remaining compatibility issues, if they arise, will also be dealt with during this phase. The proposal would be finished off with a final update to the documentation and instructions wiki.
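The benchmarking mentioned across the phases could be done with a small timing harness like the one below. This is a minimal sketch built on the standard timeit module; the helper names (best_time, speedup, make_expand_task) and the choice of expanding (x + y + z)**15 as a workload are illustrative assumptions, not part of the proposal.

```python
import timeit


def best_time(func, repeat=5, number=10):
    """Return the best per-call time in seconds over `repeat` runs of `number` calls."""
    timings = timeit.repeat(func, repeat=repeat, number=number)
    return min(timings) / number


def speedup(baseline, candidate, **kwargs):
    """Return how many times faster `candidate` is than `baseline`."""
    return best_time(baseline, **kwargs) / best_time(candidate, **kwargs)


def make_expand_task(module, power=15):
    """Build a zero-argument callable expanding (x + y + z)**power with `module`."""
    x, y, z = module.symbols("x y z")
    expr = (x + y + z) ** power
    return lambda: module.expand(expr)


if __name__ == "__main__":
    try:
        import sympy
        import symengine
    except ImportError:
        print("sympy and symengine are both needed for this comparison")
    else:
        ratio = speedup(make_expand_task(sympy), make_expand_task(symengine))
        print("symengine speedup over sympy: %.1fx" % ratio)
```

Taking the minimum over several repeats reduces noise from other processes. Repeated calls on the same expression may also benefit from SymPy's internal caching, so the ratio should be read as indicative rather than exact.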

Functions and Classes

Currently Implemented in SymEngine and SymEngine.py

PHASE I: MIGHT THROW COMPATIBILITY ISSUES WITH SYMPY

The following functions are to be inspected once for conflicts with their SymPy counterparts and for any minor changes needed. Some of these are yet to be made available through the sympy_compat.py file:

  • Symbol
  • Integer
  • sympify
  • S
  • SympifyError
  • exp
  • log
  • gamma
  • sqrt
  • I
  • E
  • pi
  • Matrix
  • lambdify
  • symarray
  • diff
  • zeros
  • eye
  • symbols
  • diag
  • ones
  • expand
  • AppliedUndef
  • Function
  • var
  • Add
  • Mul
  • Derivative
  • Basic
  • Pow
  • Rational
  • Abs
  • Number
  • Float
  • Dict
  • factorial
  • sieve
  • gcd
  • lcm
  • factor
  • nextprime
  • mod_inverse
  • totient
  • primitive_root
  • atan2
  • MatrixBase
  • DenseMatrix
  • Trigonometric and Hyperbolic Functions (sin, cos, tan, cot, csc, sec, asin, acos, atan, acot, acsc, asec, sinh, cosh, tanh, coth, asinh, acosh, atanh, acoth)
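A first pass over the list above can be automated by checking which names a given module actually exposes. The sketch below is one hedged way to do that; missing_names is a hypothetical helper, PHASE_ONE_SAMPLE is only a subset of the list, and the module names passed in ("symengine", "sympy") are assumptions about where the names would be looked up.

```python
import importlib


def missing_names(module_name, names):
    """Return the names from `names` that `module_name` does not expose.

    Returns None when the module itself cannot be imported.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return [name for name in names if not hasattr(module, name)]


# A small sample of the names listed above; extend as needed.
PHASE_ONE_SAMPLE = ["Symbol", "Integer", "sympify", "exp", "log", "sqrt",
                    "Matrix", "lambdify", "diff", "expand", "atan2"]

for backend in ("symengine", "sympy"):
    print(backend, missing_names(backend, PHASE_ONE_SAMPLE))
```

A name being present only shows that it is exported; differences in behaviour or output format between the two libraries still have to be checked with tests.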

Currently Unimplemented in SymEngine or SymEngine.py or Both

PHASE II: TO BE IMPLEMENTED IN A MANNER SIMILAR TO SYMPY

These functions, after being implemented, will have to be selectively checked for compatibility (initially between SymEngine and SymEngine.py, and later between SymEngine.py and SymPy). Some of these are already implemented, but need to be refurbished:

SymEngine and SymEngine.py

  • Relational Operators (Rel, Eq, Ne, Lt, Le, Gt, Ge)
  • Nor
  • sign
  • NumberSymbol
  • isprime
  • Range
  • Intersection
  • Complement
  • Mod
  • _symbol
  • floor
  • igcdex
  • ceiling
  • igcd
  • ilcm
  • isqrt
  • Tuple
  • integer_nthroot
  • perfect_power
  • sqrt_mod
  • gcdex
  • divisors
  • ProductSet
  • conjugate

SymEngine.py Only

  • _sympify
  • And
  • Not
  • Or
  • Expr
  • Interval
  • FiniteSet
  • Union
  • EmptySet
  • Set
  • as_int
  • KroneckerDelta
  • Zeta
  • MutableDenseMatrix
  • MatrixSymbol
  • Error Functions (erf, erfc, erfi)
  • NaN
  • Infinity
  • NegativeInfinity
  • LambertW
  • Piecewise
  • expand_mul
  • BooleanAtom
  • nan
  • oo
  • zoo
  • Lambda
  • Min
  • Max
  • Contains
  • Xor
  • Nand
  • Nor
  • col
  • rowadd
  • rowmul
  • SparseMatrix
  • MutableMatrix
  • mgamma
  • diophantine
  • EulerGamma
  • lowergamma
  • uppergamma
  • ImmutableMatrix
  • ImmutableSparseMatrix

Modules Under Present Scope of Improvement

The list of modules below is included only to portray the potential benefit of implementing this proposal. What I intend to do is port over and wrap only the functions and classes mentioned above (which are universal in use). I do not plan on implementing anything else that is unique to these modules and not already implemented in SymEngine.

Phase I Modules

  • Parsing
  • Physics

Phase I and Phase II Modules

  • Categories
  • CodeGen
  • Combinatorics
  • Crypto
  • DiffGeom
  • Geometry
  • LieAlgebras
  • Ntheory
  • Polys
  • Sets
  • Simplify
  • Strategies
  • Tensor
  • Utilities

Additional Goals (Time Permitting)

Increasing Code Coverage in SymEngine

Code coverage is one of the most fundamental measures in software testing. A program with high code coverage has a lower chance of containing undetected bugs. Currently, the SymEngine master branch stands at ~82.75% coverage as reported by Codecov. Raising code coverage, though not especially challenging, is a very time-consuming task: it requires implementing proper test cases and conditions, and then debugging any errors they uncover. Though SymEngine currently runs the Codecov check as a pre-condition for incoming pull requests, some existing files in the code-base have coverage as low as 20-25%. Hence I would like to devote some time to implementing tests to increase SymEngine’s coverage.

Adding (Partial) Documentation

From my experience of contributing to SymEngine over the past 5 months, I have felt that the SymEngine library currently lacks proper documentation. As a newbie to SymEngine, it took me a considerable amount of time, along with frequent clarifications from Ondřej and Isuru, to completely understand some of the very core functionality of the library. Documentation would also help the many people who would like to integrate SymEngine into their projects but find its absence a hindrance. Though I won’t be able to document the entire library, as that would require a lot of time, I would certainly like to document, as a side task, the functionality I have worked on, both pre-GSoC and while implementing my proposal. Also, as suggested by Isuru, I would like to write a tutorial (possibly a Jupyter notebook) for using SymEngine in Python through SymEngine.py.

Implementing Global Interpreter Lock (GIL) Acquisition Routine

Thread safety is a current issue with SymEngine.py that needs to be worked upon. One probable way of implementing thread safety is to set up a GIL acquisition routine in pywrapper.cpp. Though I don’t yet know how this is likely to be implemented, I plan on helping in its development and working alongside my mentors to get it finished.
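The GIL acquisition itself would live in C++ and is not shown here. What can be sketched from the Python side is a small stress check that might help confirm whether wrapped routines behave consistently when called from several threads. The harness below uses only the standard library; agrees_across_threads and expand_power are hypothetical names, and the SymEngine workload is just one possible choice.

```python
from concurrent.futures import ThreadPoolExecutor


def agrees_across_threads(func, args_list, workers=8, rounds=50):
    """Check that `func` gives the same results from many threads as from one.

    `func` must return values that support ==. Returns True when every
    threaded result matches the single-threaded reference.
    """
    # Single-threaded reference results, computed once up front.
    expected = [func(*args) for args in args_list]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(rounds):
            futures = [pool.submit(func, *args) for args in args_list]
            # result() re-raises any exception raised inside a worker thread.
            results = [f.result() for f in futures]
            if results != expected:
                return False
    return True


def expand_power(n):
    """Hypothetical workload: expand (x + 1)**n with SymEngine, as a string."""
    import symengine  # imported lazily so the harness itself needs only the stdlib
    x = symengine.Symbol("x")
    return str(symengine.expand((x + 1) ** n))
```

With SymEngine.py installed, a call such as agrees_across_threads(expand_power, [(n,) for n in range(2, 12)]) exercises the wrapper concurrently. A hard crash in native code would terminate the process rather than return False, so a passing run is evidence of thread safety, not proof.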

Timeline

I have no major commitments for the coming summer, except for maybe a couple of days of family vacation during the first week of May. As such, I will be able to contribute a total of 50 hours per week, or more if required. My summer break starts on the 29th of April, and regular classes resume on the 31st of July. I also do not have any examinations before mid-September. Hence, the following timeline is planned to finish the major areas of work before my college semester kicks off. I will also maintain my GitHub blog to show my progress and get feedback from the SymPy community.

Pre-GSoC + Community Bonding Period (Present - May 30)

Specific Goals

  • Make all the pre-implemented functions available through sympy_compat.py.
  • Introduce SymEngine as a backend in Parsing and Physics modules (Phase I Modules).

Side Goals

  • Talk to Isuru Fernando, Ondřej Certik, Aaron Meurer and others regarding the feasibility (with respect to time) of porting functionalities (in addition to the ones mentioned above) from SymPy to SymEngine and back.
  • Finalise the complete set of classes and routines to be ported, so as to save time later on.

Week 1 and Week 2 (May 30 – June 13)

Specific Goals

  • Implement Relational Operations in SymEngine as an initiation to Phase II.
  • Import changes for pre-implemented functions/classes in the remaining modules (Phase I and II Modules).
  • Benchmark the results obtained, and make changes to the Phase II and Phase III work approaches if required.

Side Goal

  • Examine the SymPy source code for the implementation of the proposed functions in SymEngine and the SymEngine.py wrapper, in preparation for later phases.

Week 3 and Week 4 (June 13 – June 27)

Specific Goals

  • Implement the rest of the mentioned Phase II functionalities to be ported over to SymEngine, along with tests (classes NumberSymbol, Complement, Mod and others).
  • Finish up on all the conflicted (with respect to naming, internal representation or output types/formats) Phase II functionalities pre-implemented in SymEngine (such as oo, zoo, rowadd, rowmul and others).

My goal up to the Phase 1 evaluations is to finish Phase I and SymEngine’s side of the implementation work in Phase II.

Week 5 and Week 6 (June 27 – July 11)

Specific Goals

  • Work on the existing issues in the SymEngine.py repository related to the proposal such as #17, #76 and #91.
  • Wrap up functionalities (first 20 under Phase II) in SymEngine.py along with extensive testing.

Week 7 and Week 8 (July 11 – July 25)

Specific Goals

  • Finish up on the (28 remaining) shortlisted functionalities and classes, effectively finishing up on Phase II.
  • Exhaustively clear out compatibility issues between SymEngine repository and the wrapper, if they arise.

Side Goal

  • Set up a CodeCov check for SymEngine.py to maintain healthy coverage.

By the time of the Phase 2 evaluations, I plan to have finished SymEngine.py’s side of the implementation work, and with it Phase II.

Week 9 and Week 10 (July 25 – August 08)

Specific Goals

  • Update the first 9 mentioned Phase I and II modules in SymPy after finishing up on SymEngine and SymEngine.py work, as a part of Phase III.
  • Fix the compatibility issues thrown up.

Side Goal

  • Finish off miscellaneous implementations that might be required along the path of a future release.

Week 11 and Week 12 (August 08 – August 22)

Specific Goals

  • Update the remaining 5 mentioned Phase I and II modules in SymPy, bringing an end to Phase III.
  • Investigate SparseMatrix algorithms in SymPy, for a possible update on their usage with SymEngine.py objects.
  • Final check for any issues or conflicts between the wrapper and SymPy repository.
  • Final benchmarking and update to the SymPy and SymEngine wikis.

Week 13 (August 22 – August 29)

  • Buffer time for finishing up on documentation or any other piece of implementation that needs refactoring, or wrapping up any functionality left untouched due to delays.
  • Work on the additional goals planned (subject to the availability of time).

Post-GSoC

SymEngine is the first open-source project that I contributed to, and the journey has been simply amazing. Over time, I have realised that collaborating with the sharpest minds of the world is a pleasure beyond words. The experience I have gained so far is enriching in itself. I have the following plans post-GSoC:

  • I realise that the entire SymPy library cannot be covered presently due to various constraints. As such, I hope to continue my work on the modules and functionalities left untouched by the changes proposed above.
  • While thinking of a project idea, I had the opportunity to go through some amount of work done/required in SymPy and SymEngine (related to assumptions). Hence I wish to be a part of the team that develops the assumptions module in SymEngine.
  • Lastly, I also hope to represent team SymEngine at the upcoming conventions (PyCon India and SciPy India 2017), and talks organised at my college.

References

  • SymPy Core Upgrade to SymEngine

  • I would also like to mention that the structure and format of this proposal are inspired by a number of outstanding proposals from previous years' GSoCers, available on SymPy's wiki.