GSoC 2016 Application Srajan Garg: Polynomial manipulation & Multivariate Polys in Cpp

Srajan Garg edited this page Mar 25, 2016 · 19 revisions

#Personal Background

###Details

Name : Srajan Garg

University : Indian Institute of Technology, Bombay

Email : srajangarg@gmail.com

GitHub : srajangarg

Blog : srajangarg.github.io

Time-zone : IST (UTC+5:30)

Age : 19

###About Me

I am a second year undergraduate student pursuing Bachelor of Technology in Computer Science and Engineering at Indian Institute of Technology, Bombay. I was introduced to computer programming in grade 10, and I've never stopped since. I have explored a lot of areas related to programming ranging from Django/Flask python web-frameworks to C++ physics libraries like Box2D. I've also actively participated in hacking style CTFs and competitive programming contests with my batch mates, apart from software development.

I was already aware of Sympy, and was introduced to SymEngine by Sumith. I was fascinated by the fact that SymEngine might be the (optional) core of Sympy in the near future. SymEngine was also the first open source project I started contributing to, and it's been a great experience to get to know open source inside out and has been a great learning experience thus far.

###Platform details

OS : Ubuntu 15.10

Hardware Configuration : i7 5600U/ 8GB

Editor : SublimeText 3

###Programming experience I have been programming in C++ for the past two years, and have greatly improved my knowledge from when I began. I've used C++ mainly for competitive programming contests. I had qualified for the regionals of the ACM ICPC, last year. I have also developed an 8 Ball Pool Game using the Box2D physics library. Also, as a beginner project I had made the solver (plus game) of the popular 2048 game, right on the console! (It used the Monte Carlo algorithm for solving)

I am also fairly experienced in Python having used it for about one and a half years. I have also written various python apps/ web-apps which I made mainly for personal use. One of these is a youtube downloader app, which is used to rip music off of YouTube along with all the metadata and album art (which are scraped off Google). I've also worked at at a startup, which had Django as their primary backend web-framework. I've also developed a small project for IIT Bombay, which helped freshmen find their room-mates before coming to college.

I am also familiar with git for version control, and also familiar with most of the workflow associated with it. Apart from this I am also experienced with makefiles, PHP, MATLAB/Octave and Latex.

###Contributions to SymEngine/Sympy

  • Implemented as_numerator_denominator function for SymEgnine

  • Implemented a fast parser for text to expression conversions within SymEngine, also made small changes afterwards

  • Added simplifications for trigonometric equations of the form sin(arcsin(x)) = x

  • Implemented the min and max functions for SymEngine

  • Implemented Horner's method for evaluating univariate polynomials

  • Redid the mul_poly function of the Univariate class, and also made smaller changes (pending)

  • Set up clang-format for SymEngine (on Travis), to keep consistent formatting throughout the code base (pending)

  • Worked on creating creating code quality checks through a python script

  • Fixed a minor bug in the LambertW function

  • Added few more test cases for hyperbolic functions (pending)

  • Fixed bug related to printing of sets and intervals in Sympy (pending)


#Project Overview

SymEngine currently has two univariate polynomial class, which lack much needed functionality. I intend to amend this by adding the missing functionality by fully furnishing the existing polynomial class. Also, I would like to implement a univariate class for rationals coefficients (discussed below) and also improve the existing general univariate class, while working alongside the UC Davis Team.

Also a good CAS must have fast implementations of multivariate polynomials. I would like to add a efficient multinomial class for SymEngine, following the same pattern as that of the univariate case. I would also like to play with the idea of multivariate polynomials being the super class of univariate polynomials. This would ensure that multivariate polynomials are as fast as our univariate polynomials, and univariate polynomials are just a special case with only one variable.

Also, discussions with the mentors led to possibly wrapping FLINT and Piranha for even faster methods for polynomials with integer and rational coefficients (discussed below). I also plan to complete this interfacing with the other two libraries.

Implementation of a fast and robust polynomial class forwards the progress in other features like Solvers, Integrals and Series Expansions too.


#Discussions

After discussions with Isuru, he explained to me that it is still not sure whether SymEngine should focus on polynomials with integer and rational coefficients. He feels that this should be left for a specialized polynomial library like flint or piranha. However, I had written most of this proposal before I got to know of this, hence most of my proposal focuses on integer and rational coefficient polynomials. I believe that SymEngine should have it's own implementations of the methods, with FLINT and Piranha being optional (for even faster computation). So, the proposal deals with the SymEngine implementation, as well as how it will be interfaced with Piranha and FLINT.

The main idea here is, if a user really wants fast computations they may optionally install Piranha and/or FLINT. Otherwise, they can use the (a little) slower SymEngine functions. For use with the other libraries, SymEngine's polynomials must be converted to the other libraries' structure before the algorithm can be used.


#FLINT

###Overview FLINT provides very good documentation of their whole library.

FLINT provide us with fmpz_poly which are polynomials over integer coefficients, and fmpq_poly which are polynomials over rational coefficients. As FLINT is a C library, it ships wrappers for C++ interfacing in the form of fmpq_polyxx and fmpz_polxx (which are currently being used by SymEngine). In general, if a FLINT C interface is called module and resides in module.h, then the C++ version is called modulexx and resides in modulexx.h. All flintxx classes live inside namespace flint.

Also FLINT uses expression templates for lazily evaluating expressions. This means that even if x and y are of type fmpzxx(say), then x + y will not be of type fmpzxx. Instead it will be an object which for most purposes behaves just like fmpzxx, but really only expresses the fact that it represents the sum of x and y. For the poly classes, this object is called Fmpz_poly_expr and Fmpq_poly_expr.

The wrapper deatils and function implementations are given in the FLINT documentation on page 574 for fmpz_polyxx and page 591 for fmpq_polyxx.

###Interconversion SymEgnine Polynomials can be converted into fmpz_polyxx by using strings, or by setting each and every coefficient.

fmpz_polyxx::fmpz_polyxx(const char * str)          // using strings

fmpz_polyxx::fmpz_polyxx(slong alloc_len)           // zero polynomial, allocated space
void Fmpz_poly_target::set_coeff(slong i, fmpzxx l) // sets ith coefficient to l

Interconversion will be done by iterating over the dictionary of the SymEngine polynomial, either converting into string or setting coefficients one by one. I will have to test to see which is faster.

fmpz_polyxx new_poly(dict.length());

for(const auto &it : a.get_dict())
    new_poly.set_coeff(it->first, it->second);

A similar strategy is to be used for conversion back to SymEngine polynomials.

fmpz_polyxx to_conv;        // the poly we want back in SymEngine

for(i = 0; i < to_conv.length(); i++)
    if (to_conv.get_coeff(i) != 0)
        dict[i] = to_conv.get_coeff(i);

The SymEngine polynomial can now be constructed from the dictionary created.

###Shortfalls FLINT polynomial representation is dense i.e. it uses an array type of structure internally. We use sparse polynomial representation in SymEngine. A dense representation will perform poorly for sparse polynomials. eg.

x^10000 + x^5000 + 1

Thus besides conversion from SymEngine::UnivariatePolynomial to FLINT:polynomial we also need to check the sparse-ness of the polynomial. We should only use FLINT if the sparseness is below a certain level.


#Piranha

###Overview Pirahna is currently used by SymEngine for SeriesBase class. It is an optional dependency. Piranha has recently added functionality for many polynomial operations like Division and GCD.

Piranha's main class is the piranha::polynomial class.

piranha::polynomial<Cf, Key>

  Cf  : The coefficient type. We may choose to use either
        piranha::integer, piranha::rational or SymEngine::integer_class
        depending on the polynomial to be constructed.

  Key : The monomial to be used
        piranha::monomial or pirhana::kronecker_monomial

###Interconversion Most of the conversion to polynomial can already be seen in the SeriesBase module. Anyhow, we'll need to make a dedicated function for the conversion of a SymEngine polynomial instance to a Piranha polynomial instance (and vice versa).

talk a LOT more about these functions

UnivariateIntPolynomial ppoly_to_symunipoly()

ppoly symunipoly_to_ppoly()

Here too, we can play around with the idea of using the rational piranha::polynomial, for integer coefficient SymEngine::UnivariatePolynomial. If the UnivariateIntPolynomial stays, we will have to make a decision, otherwise it will be mandatory to use rational piranha::polynomials, as will be discussed later.

###Shortfalls Piranha uses an unordered map for it's sparse representation of polynomials (hash map). While SymEngine uses ordered maps for the same. Both have their own pros and cons, but some algorithms require a monomial ordering to be efficient for eg. the current GCD algorithm in Piranha is slow, due to this reason. Also, conversion back to SymEngine Polynomial might be affected (due to random insertion into an ordered dictionary).

Apart from this, Piranha still lacks many of the functions required for polynomial manipulations.


#Univariate Polynomials

###Overview

Currently in SymEngine, a UnivariateIntPolynomial class exists, which captures integer coefficients. Also, another UnivariatePolynomial has been recently introduced by the UC Davis team. This class deals with polynomials with all other types of coefficients including rationals and expressions. The Int polynomial class will have many efficient algorithms for various functions, which are not possible to implement for a general coefficient class.

###Multiplication

Currently, the UnivariateIntPolynomial class, uses the Kronecker substitution trick to multiply two Univariate Polynomials. To be more precise, the algorithm evaluates the polynomials at a single sufficiently large power of 2. After multiplying the values obtained, it is able to interpolate the coefficients of the new polynomial. These slides call this the KS1 algorithm.

There exists another variant of this algorithm, called KS2. The basic idea remains the same, the only difference being that in KS2, polynomial evaluation happens at four carefully chosen points. The coefficients of the resulting polynomial are then interpolated using combinations of products from the previous evaluation. This paper also gives an explanation of multi-point Kronecker substitution.

We can see the relative performance of the KS2 algorithm with respect to KS1 below.

Thus, we see that KS2 starts performing significantly better for polynomials with larger length (> 80). Thus I propose that the mul_poly implementation will switch between KS1 and KS2 depending on the length of the polynomials. This will have to be carefully benchmarked, as to find a suitable switching point value.

The Fast Fourier Algorithm for polynomial multiplication can also be added made as an optional core for multiplying UnivariateIntPolynomial. This algorithm also has to be used for the class with rational coefficients. This a very famous algorithm and many implementations and resources can be easily found for it.

The actual function format will probably remain the same.

#####Piranha

After converting the SymEngine::UnivariatePolynomial to piranha::polynomial, we can directly use the * operator to compute the multiplication, as Piranha has overloaded it.

#####FLINT

We must first convert the SymEngine::UnivariatePolynomial to either fmpz_poly or fmpq_poly depending on the coefficients and how the rational class spans out. After that,

// do not exist for fmpq
Fmpz_poly_expr mul_karatsuba (Fmpz_poly_expr , Fmpz_poly_expr)  // Karatsuba algorithm
Fmpz_poly_expr mul_SS (Fmpz_poly_expr , Fmpz_poly_expr)         // Kronecker algorithm
Fmpz_poly_expr mul_SS (Fmpz_poly_expr , Fmpz_poly_expr)         // Schonhage-Strassen

// exists for both fmpq and fmpz
Use * operator, which has been overloaded                       // chooses best one auto

We then convert res back to a SymEngine::UnivariatePolynomial.

Note : If the KS1/KS2 algorithm is implemented, SymEngine will most likely perform better than FLINT & Piranha which do not use the multi point kronecker scheme at all.

###Division

Currently there is no support for univariate polynomial division in SymEngine. I propose to implement the standard remainder division, which outputs q and r when f is divided by g, such that f = q*g + r and deg(r) < deg(f). The prototype function will be called in this fashion.

(void) div_poly(f, g, outArg(q), outArg(r));

eg. div_poly(x^2 + x + 1, x + 1, q, r);
	q = x
	r = 1

As very succinctly explained by this paper, polynomial division takes order polynomial multiplication time. The paper also fully explains the algorithm, and even provides pseudo code for the same. The division is carried out by defining rev_k on every polynomial,

with rev_k(P) = x^k * P(1/x). When k is the degree of the polynomial, rev_k return the polynomial with reversed coefficients. The detailed algorithm is explained in the paper.

This algorithm assumes g to be a monic polynomial. What if an input g isn't monic? This gives rise to UnivariateRationalPolynomial class. I will discuss this later on in the project proposal.

#####Piranha

After converting to both the polynomials to piranha::polynomial, we can use

pair<ppoly, ppoly> udivrem(const ppoly &n, const ppoly &d)

to fast calculate the remainder and quotient.

#####FLINT

The division functions in flint are very similar.

// exists for both fmpq and fmpz
Ltuple <fmpz_polyxx , fmpz_polyxx> _expr divrem (Fmpq_poly_exprA , Fmpq_poly_expr B)        // auto best choice
Also, the / and the % operators are overloaded, to provide the same functionality

###GCD & LCM

The GCD of two polynomials is a polynomial, of the highest possible degree, that is a factor of both the two original polynomials. This paper (from page 41 onwards) explains how the simple euclidean algorithm for finding GCD (as learnt in high school) can be extended to polynomials in some field F. Thus the same algorithm also works Univariate integer polynomials. This algorithm simply put is :

GCD(f, g):
	if deg(f) < deg(g) : swap f, g
	if g = 0           : return f
	else               : return GCD(g, f mod g)

Where mod is calculated using our division algorithm. Since this program is tail recursive, we can also define a iterative function which produces the same result! This algorithm is however O(N^2).

The paper extends the algorithm to a 'Fast Extended Euclidean Algorithm' which is done through a divide and conquer strategy which reduces the time to almost linear. The complete algorithm has be specified on page 60.

The LCM of two polynomials is a polynomial, of the lowest possible degree, of which both the polynomials are a factor. The LCM can be easily calculated once the GCD function is implemented.

LCM(f, g):
	return (f * g)/GCD(f, g)

The function implementation will be of the form :

(UnivariateRatPoly) gcd_poly(f, g);

eg.	gcd_poly(x^2 - 1, x + 1)
	x + 1
	
(UnivariateRatPoly) lcm_poly(f, g);

eg.	lcm_poly(x^2 - 1, x + 2)
	x^3 + 2x^2 - x - 2

#####Piranha After converting to both the polynomials to piranha::polynomial, we can use

ppoly gcd(const ppoly &a, const ppoly &b, algorithm)

to calculate the GCD of the two polynomials. The algorithm parameter determines which GCD algorithm piranha will use. Currently it has two algorithms, "Subresultant PRS" and "Heuristic GCD". By default Piranha tries the heuristic algorithm first, and if that fails, it tries the subresultant algorithm. Of course, the LCM can be calculated as mentioned above.

#####FLINT The GCD & LCM functions in flint are also pretty straightforward.

// do not exist for fmpq
Fmpz_poly_expr gcd_subresultant(Fmpz_poly_expr ,Fmpz_poly_expr) // GCD Subresultant
Fmpz_poly_expr gcd_modular(Fmpz_poly_expr ,Fmpz_poly_expr)      // GCD Modular
Fmpz_poly_expr gcd_heuristic(Fmpz_poly_expr ,Fmpz_poly_expr)    // GCD Heuristic

// exists for both fmpq and fmpz
Fmpz_poly_expr gcd(Fmpz_poly_expr, Fmpz_poly_expr)              // auto best choice
Fmpz_poly_expr lcm(Fmpz_poly_expr, Fmpz_poly_expr)              // LCM

###Derivatives and Indefinite Integrals

Helper functions will be made which will return the derivate or indefinite integral of the polynomial with respect to constituent variable. These functions will be relatively easier to implement.

(UnivariateRatPoly) derivate(f)

eg.	derivative(2y^3 + y + 1)
	6y^2 + 1

(UnivariateRatPoly) iintegral(f)

eg.	iintegral(2y^3 + y + 1)
	(1/2)y^4 + (1/2)y^2 + y

#####Piranha & FLINT Piranha and FLINT are not required! This is a simple symbolic calculation which is best handled by SymEngine itself.

###Multipoint evaluation

This method deals with evaluating polynomials at more than one point at a time, and has wide number of applications in maths and physics. Standard evaluation of each of the given points gives rise to a O(N^2) algorithm. An almost linear time algorithm exists, as indicated by the paper here. Another good explanation is provided here. It uses a divide and conquer algorithm. This will speed up the time to O(MlogN) where M is the order of time of multiplication. The same algorithm can be used for both Univariate and Multivariate polynomials.

(vec_rat) multi_eval_poly(f, {p0, p1, ..})

eg.	f = x^2 - x
	multi_eval_poly(f, {0, 1, 3, 7})
	{0, 0, 6, 42}

#####Piranha & FLINT Piranha and FLINT do not provide fast multipoint evaluation of univariate integer/rational polynomials as of now! (Though FLINT has multipoint evaluations for polynomials over Zn)

###Interpolation

Polynomial interpolation stated simply is : given some points, find a polynomial which goes exactly through these points. The problem of interpolating polynomials is fundamental to many problems for numeric computation and manipulation. Interpolation of a points to produce a polynomial, will most likely have rational coefficients, which pushes the need for a separate class for Rational coefficients even more.

The problem of multipoint evaluation and interpolation are solved very similarly, using divide and conquer. The papers provided in multipoint evaluation also cover polynomial interpolation, and provides succinct explanation.

(UnivariateRatPoly) interpolate_poly({x0, x1, ..}, {y0, y1, ..})

eg.	interpolate_poly({0, 1, -1}, {0, 3, 1})
	2x^2 + x

The function will throw an error, when interpolating such a function isn't possible.

#####Piranha Piranha does not provide interpolation of polynomials as of now!

#####FLINT A simple call to this function can be made after appropriate conversions.

// works for both fmpz & fmpq
static Fmpz_poly_expr fmpz_polyxx::interpolate(Fmpz_vec_expr xs, Fmpz_vec_expr ys)

xs and ys are fmpz/fmpq vectors containing pairs of points. It is assumed none of the xs are equal. Also, in the fmpz_poly implementation, if the resulting polynomial is of non-integer type, the result will be undefined (we need to take care of this).

###Factorization Polynomial factorization is useful in all types of computations especially in a future Solver module. I do not plan to implement a factorization module for SymEngine. But I do plan to make wrappers for the other libraries which allow polynomial factorization.

#####Piranha Piranha does not currently support polynomial factorization at all.

#####FLINT FLINT has a special class for integer coefficients polynomials factorization. The fmpz_poly_factorxx class

fmpz_poly_factorxx :: fmpz_poly_factorxx ()          // initialising a factor class

Then, we decompose the polynomial in two steps

void fmpz_poly_factorxx::set_factor_squarefree(Fmpz_poly_expr)   // squarefree decomposition
void fmpz_poly_factorxx::set_factor_zassenhaus(Fmpz_poly_expr)   // Zassenhaus factoring

The factored objects are bing store in the factorxx object we have called the function on. After this we can use the following functions to extract the factorized polynomial.

ulong fmpz_poly_factorxx::size ()               // number of stored factors
slong fmpz_poly_factorxx::exp(slong i)          // exponent of the ith factor
fmpz_polyxx_ref fmpz_poly_factorxx::p(slong i)  // the ith factor (polynomial)

Also, a function exists to check whether a polynomial is square free or not.

// works for both fmpz/fmpq
bool Fmpz_poly_expr::is_squarefree()

#Multivariate Polynomials

###Overview

SymEngine currently has no class for Multivariate Polynomials. From the discussions on gitter, I found out that the UC Davis team has already started working on the multivariate class. I will work alongside them on this, and contribute to the design of the class. The following are the ideas I had about implementing the class.

I will make this class for either the rational coefficients or integer coefficients depending on how things have shaped up with the univariate counterpart. The class will have the following structure :

  • A dictionary : The keys of the dictionary can be thought of as the tuple formed by the powers of the variables in each term. The values will be the coefficients associated with said term. The representation of tuples will be discussed below.
  • Symbols : The number of variables, and the symbols associated with them. Also, these variables will be indexed, so there is some sort of ordering within them.
  • Constructor methods and other internal methods.

Canonicalization will have a very similar approach as the univariate counterpart. The tuples will be stored in increasing order of sum, ties being decided by which term has a smaller power on the variables in order.

So, 3*x^2*y^3 + 2*x^3*y^2 + x^2*y - x*y^2 + 3 will be stored in the order

3   ->  -x*y^2  ->  x^2*y    ->    3*x^2*y^3   ->    2*x^3*y^2

Also, after taking derivative, evaluating and addition/subtraction of multivariate polynomials, one or more variables may disappear from it, and this too needs to be updated. A helper method can also be made for this purpose, and can be used elsewhere.

FLINT & Piranha currently do not support any operations pertaining to multivariate polynomials yet (though Pirahna is capable of representing them). Thus, no interfacing need be done for these functions.

###Goal

My main objective is to implement a fast sparse multivariate polynomial class. Additionally, I believe that the only a single multivariate class should exist. All univariate polynomials are in fact multivariate polynomials with only one variable. This is also an ideal design structure for the class.

Thus, when only one variable will be declared, it itself uses the whole integer to store the powers (acting like the dictionary in the univariate case). Similarly, all functions in this class are defined for n number of variables, and should work when n=1. It may be difficult to achieve the speeds of just the univariate class in the multivariate parent class, but the results must be benchmarked, and a trade-off between design and speeds.

If speeds are severely affected by using the multivariate class as a univariate one, I will have to implement separate functions for both the classes, and some functions which combine the two classes together too (discussed later).

###Implementation of monomial tuples

The trivial way to implement the dictionary is to construct a map from a tuple of integers (corresponding to the respective powers of each variable in the monomial) to another integer, which corresponds to the coefficient of this monomial. This tuple can be implemented as a vector in C++.

For example, 3*x^2*y^4 - x^3 will be represented as {(2, 4): 3, (3, 0): -1}, where the tuples are being stored in a std::vector or std::array (faster).

As mentioned by the ideas page, we can substitute the tuple by packing it into a single integer (say a 128 bit machine int). This provides a faster map (int -> int), and also some operations become simpler (multiplication fof monomials just becomes addition of integers). But this ideas also has many limitations.

  • If the degree of a particular variable in the monomial exceeds the allocated amount of bits, in the integer
  • Uneven distribution of bits for variables (say we have 128 bits, and 5 variables)

There might be even more issues to address, before this scheme actually becomes a reality. Many logistics will have to be mapped before such a optimization can be done.

###Addition & Subtraction Because of this sparse approach, addition and subtraction of multivariate polynomials are trivial. We just need to take care of disappearing variables, as mentioned above.

(MultiRatPoly) add_poly(f, g)
(MultiRatPoly) sub_poly(f, g)

eg.	add_poly(x^2*y - 1, x^2*y + x)
	2*x^2*y + x - 1

eg.	sub_poly(x^2 - y, x^3 - y)	// decrease in symbols
	x^2 - x^3

The above methods may also be overloaded for differentiating between univariate and multivariate addition/subtraction.

###Partial derivatives and integrals As in the univariate case, helper functions will be made which will return the derivate or indefinite integral of the polynomial with respect to a given variable. These functions can then be called by the diff/integrate method externally.

(MultivariateRatPoly) derivate(f, wrt_symbol)

eg.	derivative(x^2*y - x*y, 'x')
	2*x*y - y
	
(MultivariateRatPoly) iintegral(f, wrt_symbol)

eg.	iintegral(x^2*y - x*y, 'y')
	(1/2)*x^2*y^2 - (1/2)*x*y^2

###Multiplication

The most common algorithm for multivariate polynomial multiplication is by using Kronecker's trick. On a high level, what it basically does is convert the two multivariate polynomials to their univariate counterpart (by using a suitable power of one of the variables) and then use a fast univariate multiplication technique, to get the resulting univariate polynomial. Now the Kronecker inverse of this new polynomial, gives us back the required polynomials. This paper and these slides cover up on this technique and provide algorithms for the same.

This will require a new method which converts multivariate polynomials to univariate polynomials efficiently (using the given encoding to the variables). Also, I would need evaluation methods ready before multiplication can be done.

(MultivariateRat) mul_poly(f, g)

eg.	mul_poly(x*y, x^2 - y + z)
	x^3*y - x*y^2 + x*y*z

Care must be taken while implementing these methods, to check all the different cases. A univariate polynomial with different variables x and y, should give rise to a multivariate polynomial. Also different multivariate polynomials (with mismatch of variables) getting multiplied should also be handled correctly.

Multivariate division requires Gröbner basis, which I'm not proposing to implement.

###Evaluation

Multivariate polynomial evaluation can be done using a generalization of Horner's method proposed by Pena & Sauer here (page 3). These keeps in memory more than one registers to compute the terms of the polynomial iteratively. The algorithm works like this :

Chooses a variable and apply the Horner's evaluation scheme in that variable. Thereby we will be treating the other variables as constants. Afterwards another variable is chosen and the same process is applied again. This is repeated until all variables are processed. The function returns an integer (or a pair for the rational case).

Another evaluation method should be partial evaluation. Which evaluates the multivariate polynomial at some given values of variables, i.e. not all symbols may have a value. This will return another multivariate/univariate polynomial.

(MultivariateRatPoly) eval(f, dict_values)

eg.	eval(x*y + 1, {x:1, y:2})
    3

	eval(x*y + 1, {x:1})
    y + 1

	eval(x*y + 1, {x:1, z:2})
    ERROR: Symbol not present

#Additional Goals

###Class for Rational coefficients (as a replacement)

According to the current polynomial class structure in SymEngine, polynomials with rational coefficients will be treated under the general class of coefficients UnivariatePolynomial. I propose, that a new class should be implemented for Rational coefficient polynomials. Rational coefficient polynomials are very common in all types of computations, and hence their function implementations must be as fast as integer coefficients polynomials. If they fall under the general class of polynomials (as they do now), they will use the general methods of the UnivariatePolynomial class and will not be able to benefit from the fast methods available to them.

The only difference between the UnivariateInt and the UnivariateRat class will be that the rational class, will store a map from degree to a pair of numbers.

typedef map<unsigned, pair<intclass, intclass> > map_uint_pair_mpz 

These numbers represent the numerator and denominator respectively. Maybe the UnivariateInt class, can altogether be removed, (as integers are just rationals with denominator 1). This will have to be decided by benchmarks of how the rational class performs as an exclusive integer class only.

All the above mentioned algorithms, and the already existing ones will work even for rational coefficients (except for the Kronecker substitution trick for multiplication). The Fast Fourier Transform algorithm can be used as a substitute (which works perfectly for rational coefficients). This can also be used as an optional substitute for multiplication in the integer class as well. (if the integer class stays). Also, slight modifications would have to be done to evaluation functions, which will return a pair of integers as a result.

The same idea can be used for the Multivariate class as well. Ideally, we should look forward to implementing only a single class which captures both Univariate and Multivariate polynomials, and both Integer and Rational coefficient. This superclass should be made as fast as the individual classes.

###Seamless operation with the rest of SymEngine

As in Sympy, the polynomial class and other symbolics must seamless operate with one another. Whenever basic operations occur between a Basic and a Polynomial, it must be checked whether the resulting expression can be represented using Polynomial. For eg. add(x, Poly(x**2 + x)) should give Poly(x**2 +2x) instead of creating an instance of the Add class. For this to happen, we need a helper function can_be_poly which will tell us if the other operand can be converted to a polynomial or not. If it can, the result should always be a polynomial for basic operations.

The can_be_poly function can be even given a form of convert_to_poly for constructing polynomial instances out of Basic instances consisting of Mul,Pow and Add. Another functionality that can be implemented is a poly_to_basic which does the reverse of the above mentioned function. It converts polynomial instances to Basic.

This will mainly induce changes in add, sub, mul, div, pow. These are basically all the operations that can be performed on polynomials, which may still result in a polynomial. Only after these seamless operations are implemented, the SymEngine polynomial module can be said to be complete. I will need guidance from mentors on how this integration has to be done, and also have to read up the Sympy code base to see how it's implemented there.

###Functionality for general coefficient class

The UnivariatePolynomial class deals with polynomials with symbolic coefficients. This class is still not complete and is missing functionality like division, interpolation (in terms of symbolic points) and GCD. Though these cannot be optimized like for integer and rational coefficients, a simple implementation is required. Also just like the other classes, this class should have seamless operations with the rest of the SymEngine symbolics. We also need to implement a similar class for the multivariate counterpart, too.

###SymEngine's own Integer Class

Couple of discussions over the polynomial module, and also the GSoC Idea page indicate the need for SymEngine's own Integer class. The need arises because, machine integers though fast, are limited to a max of 128 bits. While integer libraries like GMP don't have a restriction on size but are not as fast as machine ints. SymEngine's Integer class should ideally use machine ints till the numbers being dealt with are within limit, and switch to other libraries (like GMP) when the numbers exceed the limit, automatically.

Though I do not know how this class will be implemented, I plan to help in it's development and work alongside my mentors to get it finished, and ready to use.


#Timeline

I also plan to do an internship at a local startup here in Mumbai, along with the Summer of Code. I will still be able to devote 36 hours a week to this project. I plan to spend 4 hours per weekday and 16 hours on the weekend on getting the project complete. My summer break starts at the end of April. While I believe my internship starts in the first week of May. I can use this time to get a head start into the tasks at hand too. Also, I can efficiently use the time before the summers to gather all the necessary information and details on my project, so I can begin work without any delays. My internship will end by July second week, so I dedicatedly work from there onwards too.

I will also maintain a blog so that mentors can regulate my work, and also monitor my progress.

####Pre-GSoC

  • Examine the Sympy source code to look at how the above mentioned structures and algorithms are implemented
  • Discuss the choice and implementation of the structures and algorithms with mentors
  • Chalk out all design details of implementation and decide on undecided areas of implementation
  • Get all the pseudo code for the complex algorithms ready before summers starts, so there is no delay in implementing them

####Week 1

  • Implement Fast Fourier Transform for multiplication and the above mentioned KS2 algorithm
  • Add tests and benchmark against old mul_poly

####Week 2

  • Introduce the UnivariateRatPolynomial class and model the functions and methods
  • Benchmark the UnivariateRatPolynomial class against UnivariateIntPolynomial on processing only integer coefficient polynomials. This will determine how the class shapes up in the future. From this point on, the functions will be implemented for both the classes (i.e. if Int class still exists)
  • Implement the fast division algorithm for the class

####Week 3

  • Also start setting up basic wrappers to both the libraries
  • Add in wrappers for Multiplication and Division, using the two libraries

####Week 4

  • Discuss with @certik and @bluescarni about the GCD algorithm, and implement it for the class
  • Implement the derivative and integral functions

####Week 5

  • Add in wrappers for GCD, using the two libraries
  • Add in wrappers for Factorization of polynomials, using FLINT

####Week 6

  • Implement interpolation and multi-point evaluation for the class, also write tests and increase overall coverage of the whole module

####Week 7

  • Wind up all the leftover work for the the whole of the univariate class
  • Finish up basic algorithms and get all wrappers for the univariate class up and running
  • Benchmark the performance of other libraries (via wrappers) with SymEngine's own implementation for each method

####Week 8

  • Work on integrating the univariate polynomial class with the rest of the SymEngine symbolics
  • Start working on the multivariate polynomial class (or refactor it if already merged in by the UC Davis team)

####Week 9

  • Benchmark MultivariateUniPoly against UnivariateIntPoly extensively, to see if the former can be used as a superclass or not
  • Work on a MultivaraiteRatPoly. Again, the progress will depend on how the rational class fared in the univariate counterpart

####Week 10

  • Work on optimizing the internal structure of the multivariate polynomial
  • Add basic functionality like addition and subtraction
  • Improve test coverage

####Week 11

  • Write function for integrals and derivatives
  • Finish up all required helper functions, so the class is complete as is

####Week 12

  • Write up the remaining functions for the multivariate class including Evaluation, Multiplication and Division

####Week 13

  • Finish off the module, by adding functions to interface multivariate polynomials with the rest of SymEngine
  • Wrap up the whole module, and finish off any leftover work
  • Benchmark the performance of the module, and compare statistics.
  • Noting down where the shortcomings and scopes of improvement for the module

####Post GSoC

I will always stay a part of SymEngine, and keep contributing for years to come. It still excites me to know SymEngine is possibly the fastest CAS right now, and I would always want it to have as many functionalities as possible. Google Summer of Code has given me a great opportunity to get started with open source contributions, and be a part of the open source community.

I hope to stay part of SymEngine when (hopefully) it moves ahead of Sympy in the CAS competition and continues to be the fastest CAS ever!


Clone this wiki locally
You can’t perform that action at this time.
You signed in with another tab or window. Reload to refresh your session. You signed out in another tab or window. Reload to refresh your session.
Press h to open a hovercard with more details.