GSoC 2014 Application Sushant Hiray: Extending Elementary Functions CSymPy

Sushant Hiray edited this page Mar 20, 2014 · 14 revisions
Clone this wiki locally

Personal Details

Name: Sushant Hiray

University: Indian Institute of Technology, Bombay

Email: hiraysushant@gmail.com

Github/IRC: sushant-hiray

Website: sushant-hiray.in

I am a third year undergraduate pursuing B.Tech. in Computer Science and Engineering at Indian Institute of Technology, Bombay. My semester will complete in late April leaving me enough time to get ready for my GSoC project If I am selected. I will be able to work 40 hrs a week on the project.

Personal Background

I have been programming since over 3 years now. I have covered a wide range of subjects ranging from Data Structures and Algorithms, Object oriented programming, Operating systems, Computer Architecture as a part of my curriculum. I had taken an elementary Discrete Mathematics course module which covered most of the things in this proposal, particularly the various special functions targeted at the end. It would provide me a good foundation to successfully finish this project.

#Programming Details

##Platorm Details I use Ubuntu 12.04 LTS as my primary work machine (my Laptop has this and so does my Digital Ocean droplet) . I am quite familiar with the concept of version control. I primarily favor using Git, though I use hg(when I contribute to Mozilla). I use Windows 8 to play games! I use Vim as my primary text editor because Vim is a language in itself. It is keyboard based so I save a lot of time while typing plus I can define any number of macros! Also the best thing about Vim is that it runs everywhere. Yes, it’s nice to have my favorite editor available when I SSH into a machine.

##Programming Experience

I am quite proficient in C, C++, Java and Python for programming. I am currently working on my Operating Systems Project to add functionality of Virtual Memory to an inhouse made OS. We are using C++ to build that and the codebase is quite extensive (running over 10K+ LOC) I've as well worked on C to create a small file sharing network as a part of our Networks course. I usually use python for developing in my leisure time. I built a small Hacker News Scraper. I've also played around python in making a twitter sentiment analysis system. I used Django for my Database Systems Project. A project management system.

##Python and C++, Sympy and CSympy! I have been using Python since over 3 years now and I have become quite proficient at it. Particularly using it for scripting and in my free time solving problems from Project Euler. I am particularly fond of Functional Programming. I miss having the lamda function in native c++. I've extensively used python for the list comprehensions. Also the other thing I love about python is the ease of file handling. Which is why I prefer python for any scripting job over c++!

As far as C/C++ is concerned, I have been using it since over 3 years now! I started programming 3 years prior and I was lucky enough to be introduced to python and c++ almost at the same time.

When I'm going to performance, the first and last choice is C/C++. I guess that sort off sums up the belief of CSympy and when I prefer using C++ over python! Also C++ offers a decent level of abstraction that doesn't hamper its performance in general.

I was introduced to Sympy around the same time last year, when one of my senior was applying into GSoC. I was particularly new to open source then. I looked into Sympy and I was particularly drawn by the number theory module. During that time I looked into the primality testing function and had some good insights after discussing a particular proposal of improving it. The one thing which struck me the most was using mix of algorithms for solving the same problem my considering which algorithms performed better in which range.

This year when I looked into GSoC ideas I was particularly drawn by CSympy. I caught up with Ondrej in IRC and he helped me get started and since then we discussed quite a few ideas which could possibly improve the current performance.

Contributions to CSympy

I've started working on CSympy quite recently and have helped fix a couple of small issues:

  1. PR 119: Merged: Fixed a printing issue of symbols such as -1x by changing it to -x

  2. PR 120: Merged: A minor fixing of links in documentation

  3. Working on Implementing DSW to check its performance wrt unordered_map in current implementation

Contributions to SymPy

  1. PR 7300: Merged: Added taylor_term for sec(x) in the trignometric module.

Project

Project Abstract

Currently there are very few functions implemented in CSympy[11]. I aim to extend the module of the elementary functions further, on the similar lines of Sympy.

I expect to implement a certain subset of special functions which are already implemented in SymPy. Since CSymPy is particularly new, almost all the special functions have certain dependencies. To give this a more modular approach, I aim to develop the following modules:

  • Trignometric
  • Exponential
  • Hyperbolic
  • Complex
  • Special Numbers
  • Special Functions

These modules as such are quite big in nature and have been refined for a long time in SymPy. I've mentioned specific additions for each module which will be implemented as a part of the GSoC Project.

Why this Project?

Currently there are various libraries available which have been specifically designed to perform High Speed Number Theory calculations . Most of them focus on numerical calculations rather than the symbolic manipulations. This project is aimed towards building symbolic representation for some special functions which use various special numbers from combinatorial number theory as a part of their representation. So this project involves adding symbolic capabilities to the existing fast evaluation of special functions.

CSymPy has been primarily been designed to boost the performance of symbolic manipulation of long expressions for which, SymPy, in general takes more time to perform. This project will essentially act as a base platform from which adding new special functions will be quite easier. As by then most of the dependencies which occur would have been taken care of.

Also this project has a lot of scope to be extended further way beyond the GSoC time slot. Thus, this project will help to add quite a few capabilities to the current existing API of CSymPy.

The project details and the further implementation will give a more thorough idea of what will be implemented and what will not be implemented as a part of the project.

Project Details

Trignometric Module

Current Status:

  • Sin
  • Cos

Implemented Functions:

  • is_canonical
  • hash
  • eq
  • compare
  • str
  • diff
  • subs

Expected Additions (functions):

  • eval ( a basic eval using lookup tables[1][2])
  • rewrite as exponential, and other trignometric symbols.
  • expand ( will be implemented after discussing with mentor)

More definitions to be added:

  • Tan
  • Cot
  • Trignometric Inverses (asin, acos, atan, acot , atan2: takes two args)

Exponential Module

Current Status

Unimplemented. pow class is implemented.

###Expected Addition Currently pow class is implemented. Inherit the exp module from it.

Some more functions:

  • rewrite as cos, sin
  • There are quite a few more classes implemented such as log, MRVlog,LambertW in SymPy. These will not be implemented
  • Look into library for fast implementation of eval.

Hyperbolic Module

Current Status

Unimplemented.

Expected Addition

  • sinh,cosh,tanh,coth,asinh,acosh,atanh,acoth
  • Symbolic representation will be completed. Dependecy: Exponential Module
  • Fast evaluation. Use of some library or use cmath just for starters.

##Complex Module

Current Status

Unimplemented. After an elaborate discussion[14], it was decided that Complex Module should be built on top of the existing Numbers module and use the existing GMP functions[15] for various manipulation. The functions are similar to the current implementation of SymPy.

###Expected Addition

  • re
  • img
  • arg
  • sign
  • abs
  • eval
  • conjugate

Note: Handling of polar numbers will not be handled in current project

##Special Numbers Module After selecting the special functions the following special numbers lie in the dependency list and have to be implemented.

  • Factorial
  • Bernoulli

##Special Functions

There are quite a few interesting functions which have been implemented in SymPy[12], I've looked through the functions and decided which ones to implement and thus in turn I had to implement the above said modules, to complete the dependency list.

Zeta Function

Expected Addition:

  • Riemann Zeta[7] => eval,rewrite_as_dirichlet_eta
  • Dirichlet_eta[8] => eval , rewrite_as_zeta

###Gamma Function Expected Addition:

  • Gamma [13]
  • UpperGamma
  • LowerGamma

Tensor Functions

Expected Addition:

  • LeviCivita [4]
  • KroneckerDelta [5]

Spherical Harmonics

Expected Addition:

  • Ynm
  • Znm

Note: Spherical Harmonics will be opted only if there is some spare time left in the end. Relevant discussion about it[16] #Implementation Details

##Trignometric Module

  class TF : public Function {
private:
  RCP<const Basic> arg_; // The 'arg' in tf(arg)
public:
  TF(const RCP<const Basic> &arg);
  virtual std::size_t __hash__() const;
  virtual bool __eq__(const Basic &o) const;
  virtual int compare(const Basic &o) const;
  virtual std::string __str__() const;
  bool is_canonical(const RCP<const Basic> &arg);
  virtual RCP<const Basic> diff(const RCP<const Symbol> &x) const;
  virtual RCP<const Basic> subs(const map_basic_basic &subs_dict) const;
  --------------------------------------------------------------------
  RCP<const Basic> eval(); //eval will be a basic implementation
                                   //for instance sin using lookup tables
                                   //alternate libraries will be used for faster eval
  RCP<const Basic> rewrite_as_other_tf(); // for eg: rewrite_as_sin() rewrite_as_tan();
  
};

##Exponential Module

  class exp : public Basic {
  private:
      RCP<const Basic> exp_; // base^exp
  
  public:
      exp(const RCP<const Basic> &exp);
      virtual std::size_t __hash__() const;
      virtual bool __eq__(const Basic &o) const;
      virtual int compare(const Basic &o) const;
      virtual std::string __str__() const;
      bool is_canonical(const RCP<const Basic> &exp);
      inline RCP<const Basic> get_exp() const { return exp_; }
      virtual RCP<const Basic> diff(const RCP<const Symbol> &x) const;
      virtual RCP<const Basic> subs(const map_basic_basic &subs_dict) const;
      virtual RCP<const Basic> rewrite_as_other_tf(); // for eg: rewrite_as_sin() rewrite_as_cos();
      RCP<const Basic> eval(); //alternate libraries will be used for faster eval
      
  };

##Complex Module class Complex : public Number { private: mpz_class real; mpz_class imaginary;

    public:
        Complex(rational r, rational im);
        Complex(mpz_class r, mpz_class im);
        virtual std::size_t __hash__() const;
        virtual bool __eq__(const Basic &o) const;
        virtual int compare(const Basic &o) const;
        virtual std::string __str__() const;
        RCP<const Complex> addcomplex(const Complex &other);
        RCP<const Complex> subcomplex(const Complex &other);
        RCP<const Complex> mulcomplex(const Complex &other);
        RCP<const Basic> real();
        RCP<const Basic> imaginary();
        RCP<const Basic> arg();
        RCP<const Basic> abs();
        RCP<const Basic> eval();
        RCP<const Complex> conjugate();
    }

##Hyperbolic Module class HB : public Function { private: RCP arg_; // The 'arg' in tf(arg) public: HB(const RCP &arg); virtual std::size_t hash() const; virtual bool eq(const Basic &o) const; virtual int compare(const Basic &o) const; virtual std::string str() const; bool is_canonical(const RCP &arg); virtual RCP diff(const RCP &x) const; virtual RCP subs(const map_basic_basic &subs_dict) const; RCP eval(); //alternate libraries will be used for faster eval RCP rewrite_as_other_tf(); // for eg: rewrite_as_exp() rewrite_as_cosh(); RCP _eval_conjugate(): //needs a dependency of complex module };

##Special Numbers Module class Factorial: public Function { private: RCP arg_; // The 'arg' in tf(arg) public: Factorial(const RCP &arg); virtual std::size_t hash() const; virtual bool eq(const Basic &o) const; virtual std::string str() const; bool is_canonical(const RCP &arg); RCP eval(); //gmp itself provides a fast version of eval };

  //to be decided after having a discussion with mentor
  //since bernoulli is an infinite sum, we might just get away with 
  //implementing the eval for integers
  //so a function just like in ntheory.cpp would be sufficient rather than a class
  class Bernoulli: public function {
        private:
              RCP<const Basic> arg_; // The 'arg' in tf(arg)
        public:
              Bernoulli(const RCP<const Basic> &arg);
              virtual std::size_t __hash__() const;
              virtual bool __eq__(const Basic &o) const;
              virtual std::string __str__() const;
              bool is_canonical(const RCP<const Basic> &arg);
              RCP<const Basic> eval(); // arb[3] itself provides a fast version of eval
  };

##Special Functions

Zeta Function

  class Zeta: public Function {
        private:
              RCP<const Basic> arg_; // The 'arg' in tf(arg)
        public:
              Zeta(const RCP<const Basic> &arg);
              virtual std::size_t __hash__() const;
              virtual bool __eq__(const Basic &o) const;
              virtual std::string __str__() const;
              bool is_canonical(const RCP<const Basic> &arg);
              RCP<const Basic> eval(); //dependencies: bernoulli, factorial
              RCP <const Basic> rewrite_as_dirichlet_eta();
  };

Dirichlet_eta Function

  class Dirichlet_eta: public Function {
        private:
              RCP<const Basic> arg_; 
        public:
              Dirichlet_eta(const RCP<const Basic> &arg);
              virtual std::size_t __hash__() const;
              virtual bool __eq__(const Basic &o) const;
              virtual std::string __str__() const;
              bool is_canonical(const RCP<const Basic> &arg);
              RCP<const Basic> eval(); //dependencies: zeta
              RCP <const Basic> rewrite_as_zeta();
  };

LeviCivita

  class LeviCivita: public Function {
        private:
              umap_basic_int dict_;
        public:
              LeviCivita(const umap_basic_int& dict);
              virtual std::size_t __hash__() const;
              virtual bool __eq__(const Basic &o) const;
              virtual std::string __str__() const;
              RCP<const Basic> eval(); //dependencies: factorial
  };

KroneckerDelta

  class KroneckerDelta: public Function {
        private:
              RCP<const Basic> arg_1_ , arg_2_; 
        public:
              LeviCivita(RCP<const Basic> arg_1_ ,RCP<const Basic> arg_2_);
              virtual std::size_t __hash__() const;
              virtual bool __eq__(const Basic &o) const;
              virtual std::string __str__() const;
              RCP<const Basic> eval();
              bool  is_below_fermi();
              bool  is_above_fermi();
              bool  is_only_below_fermi();
              bool  is_only_above_fermi();
              RCP<const Basic> _get_preferred_index();
  };

Gamma

  class Gamma : public Function {
      private:
        RCP<const Basic> arg_; // The 'arg' in tf(arg)
      public:
        Gamma(const RCP<const Basic> &arg);
        virtual std::size_t __hash__() const;
        virtual bool __eq__(const Basic &o) const;
        virtual int compare(const Basic &o) const;
        virtual std::string __str__() const;
        bool is_canonical(const RCP<const Basic> &arg);
        RCP<const Basic> eval(); //dependency factorial
        RCP<const Basic> _eval_conjugate(): //needs a dependency of complex module
   };

Lower Gamma

  class LowerGamma : public Function {
      private:
        RCP<const Basic> arg_1_;
        RCP<const Basic> arg_2_;
      public:
        LowerGamma(const RCP<const Basic> &arg1, const RCP<const Basic> &arg2);
        virtual std::size_t __hash__() const;
        virtual bool __eq__(const Basic &o) const;
        virtual int compare(const Basic &o) const;
        virtual std::string __str__() const;
        bool is_canonical(const RCP<const Basic> &arg);
        RCP<const Basic> eval();
        RCP<const Basic> _eval_conjugate(): //needs a dependency of complex module
   };

Upper Gamma

  class UpperGamma : public Function {
      private:
        RCP<const Basic> arg_1_;
        RCP<const Basic> arg_2_;
      public:
        UpperGamma(const RCP<const Basic> &arg1, const RCP<const Basic> &arg2);
        virtual std::size_t __hash__() const;
        virtual bool __eq__(const Basic &o) const;
        virtual int compare(const Basic &o) const;
        virtual std::string __str__() const;
        bool is_canonical(const RCP<const Basic> &arg);
        RCP<const Basic> eval();
        RCP<const Basic> _eval_conjugate(): //needs a dependency of complex module
   };

Spherical Harmonics

  class Ynm: public Function {
        private:
              RCP<const Basic> n;
              RCP<const Basic> m;
              RCP<const Basic> theta;
              RCP<const Basic> phi;
        public:
              Ynm(RCP<Basic> n, RCP<Basic> m, RCP<Basic> theta, RCP<Basic> phi);
              virtual std::size_t __hash__() const;
              virtual bool __eq__(const Basic &o) const;
              virtual std::string __str__() const;
              RCP<const Basic> eval(); 
  }; 

  class Znm: public Function {
        private:
              RCP<const Basic> n;
              RCP<const Basic> m;
              RCP<const Basic> theta;
              RCP<const Basic> phi;
        public:
              Znm(RCP<Basic> n, RCP<Basic> m, RCP<Basic> theta, RCP<Basic> phi);
              virtual std::size_t __hash__() const;
              virtual bool __eq__(const Basic &o) const;
              virtual std::string __str__() const;
              RCP<const Basic> eval(); 
  }; 

Timeline

##Community Bonding Looking through the code base and understanding the code flow. I do have a brief idea now, though it needs to be more clear. Getting used to RCP Memory Management and GMP functions for Integer and Rationals. It would be used during the Complex Module. Also look into using some libraries for fast numeric evaluation of various functions. Getting the planned API's approved by the mentor. Re-iterate over the design if need be. Also discuss about the documentation for CSymPy. Currently there are no specific guidelines for a uniform documentation, using something like Doxygen might help.

##Week 1 Implementing the basic API for tan,cot,sec,cosec. Bringing them at par with the current implementation of sin,cos. Ideally this phase doesn't require 7 days, it is more of reusing the stuff. But I have accounted so time to get used to following the style guides particularly while writing API's. Send a PR to get the basic API up and running.

##Week 2-3 Port the t.py into a c++ version. This should be roughly completed in a week. Add unit tests the next week. Fix bugs if any. Send a PR after the unit tests are complete. Also with the help of mentor work on adding a python wrapper. Send a PR once that is done.

##Week 4 Fix some loose threads (if any) from the unit tests. Add basic API's for Inverse Trignometric functions asin, acos, atan, acot , atan2. They will be implemented at par with the current implementation of trignometric functions. Add basic unit tests. Send a PR to check through the basic API for these functions.

##Week 5 Implement the Exponential Module. Complete the basic API in a day or two. Mirror functions present in SymPy and implement them. Can be done in a week's time. Add basic unit tests, importantly for rewriting_as_cos etc. Also the eval can be implemented using a library. This will be looked into during the community bonding period. Send a PR.

##Week 6 Implement the Hyperbolic Module. Complete the basic API first. More focus on implementing functions like rewrite_as_x. Add basic unit tests. Also the eval can be implemented using a library. This will be looked into during the community bonding period. Send a PR for this module.

##Week 7-8 Week-7: Look into the basic wrapper module for Complex Module. {Should be implementing the basic structure along the lines of SymPy}. Extending the numbers module and building on top of GMP. Week-8: Reverse the couple of starting days (if need be) for the basic wrapper, if there are some issues in interfacing with GMP. Remaining of Week-8, look into various rewrite_as_x which can now be possible as a result of adding complex module.

##Week 9 Adding the basic dependencies from Numbers Module such as Factorial,Bernoulli which will be used in the upcoming special functions module. Factorial is inbuilt in GMP. So more functions around factorial such as RisingFactorial and FallingFactorial are not, so implement them. Primarily RF and FF are more useful if we consider symbolic manipulation. Add Bernoulli using the ARB library. Add basic unit tests. Send a PR.

##Week 10 Zeta Function Module. Implement the basic Zeta Functions Riemann_Zeta and Dirichlet_eta. If the implementation completes earlier than expected, look into some other functions in zeta module. There are a couple of more functions, which haven't been explicitly mentioned as a part of the proposal. Send a PR for reviewing.

##Week 11 Implement the Gamma Function Module. This module is the biggest amongst the 3 special functions. Should be completed within a week though. Add appropriate unit tests for basic checks. Send a PR post completion.

##Week 12 Implement the Tensor Functions Module. This is probably the smallest module amongst the 3 special functions. Should be over in less than a weeks time. If I'm able to squeeze in the 3 special modules pretty quick, it might give me some time to start working on basic taylor_term. There are a couple of libraries[9][10] which can be used as a base class. Or as per the suggestion of Ondrej, work on implementing some more special functions. In such a scenario I've looked into Spherical Harmonics module.

##Week 13 This is essentially a buffer week. In case there are some unimplemented definitions, or some TODO's, try to finish them off this week. Try and get the PR's merged in case they haven't. Polish the documentation.

##Post GSoC CSymPy is currently quite new and there are tons of features which can be added/refined to give even a better performance. I would certainly be more interested to look into some fast Data Structure implementations which could possibly help the speed up. Also prioritize and work towards features in which SymPy in general is lacking in speed. That being said, the possibilities are more than enough to keep any interested person occupied over a prolonged period of time.

References

[1] http://stackoverflow.com/questions/3688649/create-sine-lookup-table-in-c

[2] https://github.com/certik/sympy/blob/trig/sympy/functions/elementary/trigonometric.py

[3] http://fredrikj.net/arb/bernoulli.html

[4] http://en.wikipedia.org/wiki/Levi-Civita_symbol

[5] http://en.wikipedia.org/wiki/Kronecker_delta

[6] https://gmplib.org/manual/Number-Theoretic-Functions.html#Number-Theoretic-Functions

[7] http://en.wikipedia.org/wiki/Riemann_zeta_function

[8] http://en.wikipedia.org/wiki/Dirichlet_eta_function

[9] http://code.google.com/p/libtaylor/

[10] https://ctaylor.codeplex.com/

[11] https://github.com/certik/csympy/blob/master/src/functions.h

[12] https://github.com/sympy/sympy/tree/master/sympy/functions/special

[13] https://en.wikipedia.org/wiki/Gamma_function

[14] https://github.com/certik/csympy/issues/47

[15] https://gmplib.org/manual/Function-Index.html#Function-Index

[16] https://groups.google.com/forum/#!topic/sympy/V1zGL4YM_kE