Skip to content

GSoC 2018 Application Ashish Kumar Gaurav: RUBI

Ashish Kumar Gaurav edited this page Mar 23, 2018 · 64 revisions

About Me

Username and Contact Information

Name : Ashish Kumar Gaurav

University : Indian Institute of Technology, Kharagpur

email : ashishkg0022@gmail.com

github username : ashishkg0022

Time Zone : IST (UTC +5:30)

Personal Background

Hello, I am Ashish Kumar Gaurav a second year undergraduate student at IIT Kharagpur, India. I'm pursuing a degree in Mathematics and Computing. I work on Ubuntu 16.04 LTS with Sublime Text 3 as my primary text editor.I like Sublime Text 3 because of its simplicity, user-friendly GUI. I'm proficient in C and Python. I was introduced to Object Oriented Programming at an early age and I have been coding for more than 6 years now. I started coding in Java and then followed it up with C, C++ and Python. Coding in Python makes thing more simpler and short. I am familiar with git and github. I have been using them since a year.

Contributions to Sympy

I started contributing to SymPy in early November 2017 by solving Easy-to-Fix bugs. After that in Decmber I implemented ordinal arithmetics. I learned a lot through this implementation. Here is a list of all my contributions so far in chronological order:

  • (Merged) added is_number property for meijerg#13558

  • (Merged) removes error from meijerint_definite()#13562

  • (Merged) Resolves RecursionError in latex representation#13564

  • (Merged) modifying accessibility of S.Reals and S.Complexes #13572

  • (Merged) Added two possibilities in Mod( )#13581

  • (Merged) passing a printer to codegen()#13587

  • (Merged) Fixed bug for polytool.poly( )#13588

  • (Merged) Added check of integration while finding Fourier Transform #13595

  • (Merged) Removed bug from Complement#13615

  • (Merged) Implementation of ordinal arithmetics#13682

  • (Merged) Minor change in RUBI#14090

  • (Merged) Fixed Bugs in Utility Function of RUBI #14173. This Pr added few utility functions like Dist, etc and fixed few bugs in SimpHelp , _FixSimplfy etc.

  • (Merged) Added few functions used in rules of RUBI#14279. This Pr added few functions that were used in rules of RUBI but definitions were missing like DerivativeDivides, LogIntegral.

  • (Open) Removed repeated definition of some CustomConstraints in RUBI#14391. This Pr reduces the loading time of rules. It is not complete . I need to update parsetools . It will soon be completed.

The Project

Brief Overview

Rubi is an entirely new inclusion into SymPy. It consists of more than 10,000 rules to cover a wide variety of indefinite integration. Currently SymPy uses algorithms for indefinite integration which are slow and presents results which are not simplified. Rubi utilizes a set of well defined rules which makes it smart to present the results in a more symmetric and simplified manner.

Last year, Abdullah Javed Nesar and Arihant Parsoya, worked on it as Google Summer of Code project. But there is still a lot of work to do, as the rules raise exceptions or give wrong results.

Problem and Motivation

Currently in rubi module of sympy, the structure has been ported. But the code has only been tested for Algebraic rules. There are many other test sets for different types of integrands. They need to be tested. While testing, it is expected to face a lot of errors and bugs. They need to be fixed. Also it takes a lot of time to load the rules. The loading time needs to be improved. Next important thing is version of original RUBI. Last year Arihant and Abdullah worked with was 4.10.8, but now it has been upgraded to 4.14.7. So the changes needs to be updated too.

Work Overview

1. Loading of all rules in rubi

Currently there are following rules file in rubi module :

  • miscellaneous_algebraic.py
  • binomial_products.py
  • miscellaneous_integration.py
  • exponential.py
  • miscellaneous_trig.py
  • hyperbolic.py
  • piecewise_linear.py
  • quadratic_products.py
  • integrand_simplification.py
  • secant.py
  • inverse_hyperbolic.py
  • sine.py
  • inverse_trig.py
  • tangent.py
  • linear_products.py
  • trinomial_products.py
  • logarithms.py

But all these rules can not be loaded due to large time that it requires. As of now algebraic rules, which consist of linear_products, quadratic_products and binomial_products are being loaded.

if only algebraic rules are being loaded :

%time from sympy.integrals.rubi.rubi import rubi_integrate
CPU times: user 2min 12s, sys: 421 ms, total: 2min 13s
Wall time: 2min 15s
  • Repeated definition of same CustomConstraint

One of the main reason for this is the repeated definitons of the same CustomConstraint , which currently uses the definition of lambda functions.

For example

CustomConstraint(lambda a, x: FreeQ(a, x))

This definition has been used a large number of times in the rules (~500 times only in quadratic products). Similarly there are large number of redefining of the same CustomConstraint.

One way to fix this is by defining a constraint only once in some constraints.py and importing instead of redefining. Like the following:

Instead of

pattern = Pattern(..., CustomConstraint(lambda a, x: FreeQ(a, x)))

we could write

constraint_freeq_a = CustomConstraint(lambda a, x: FreeQ(a, x))
...
pattern = Pattern(..., constraint_freeq_a)

Then we can reuse constraint_freeq_a for every occurence of that constraint. This could reduce the loading time sufficiently.

  • Dependencies on lambda functions

The above code can be extended to remove the dependence of rules on lambda functions like the following example:

def free_q(a, x):
    return FreeQ(a, x)

constraint_freeq_a = CustomConstraint(free_q)
pattern = Pattern(..., constraint_freeq_a)

def replacement(x, m):
    return x**(m + S(1))/(m + S(1))

rule = ReplacementRule(pattern, replacement)
rubi.add(rule)

So defining proper functions as above, can remove the dependence of rules on lambda functions. Hence loading time will be much less than current state. While defining customConstraints using proper function it should be taken care that the parameter names have to be the same as the the variable names in the expression

For the above changes, parsetools needs to be updated. As it is very tough to change whole rules manually. We need to make a parsetools that can take care of any change in original rubi . With few coomands, we shall be able to generate new rules in matchpy format.

Initially I am planning to work on RUBI 4.10.8 . Later it can be updated, that will not require much time. The rules of this version can be found here. I have saved the DownValues of the rules in different files here. For getting these, first a rule is loaded and then we can get it by Int // Downvalues. These DownValues are parsed using the parsetools

2. Identification and fixation errors and bugs

This is the major and important task that needs to be done. Rubi currently requires a lot of debugging. In many cases, these rules give wrong result or may raise exceptions. There are many undefined functions, which needs to be defined properly. Also rules give some times wrong results. Major part of my project's work will be to debug all errors and bugs. We need to have a correct set of rules.

for example

rubi_integrate(x**4*Sqrt(a + b*x**2),x)

gives

RecursionError: maximum recursion depth exceeded
  • Recursion errors due to errors in rules

Some constraints are giving wrong results or are wrongly written, which result in wrong matching and no simplification which results in recursion errors

For example, the above intgeral rubi_integrate(x**4*Sqrt(a + b*x**2),x) is reulting in recursion error because of a wrong constraint. It matches with rule 79 and does not result in simplification. currently rule(wrong) is as follows:

pattern79 = Pattern(Integral(x_**WC('m', S(1))*(a_ + x_**n_*WC('b', S(1)))**p_, x_), constraint_freeq_a, constraint_freeq_b, constraint_freeq_p, CustomConstraint(lambda n: PositiveIntegerQ(n)), CustomConstraint(lambda m: IntegerQ(m)), CustomConstraint(lambda k, a, p, n, b, m, x: Unequal(k, S(1))))
    def With79(b, n, a, p, m, x):
        k = GCD(m + S(1), n)
        return Subst(Int(x**(S(-1) + (m + S(1))/k)*(a + b*x**(n/k))**p, x), x, x**k)/k
    rule79 = ReplacementRule(pattern79, lambda b, n, a, p, m, x : With79(b, n, a, p, m, x))
    rubi.add(rule79)

This is wrong. It can be seen from this same rule written in original RUBI (mathematica).

Int[x_^m_.*(a_+b_.*x_^n_)^p_,x_Symbol] :=
  With[{k=GCD[m+1,n]},
  1/k*Subst[Int[x^((m+1)/k-1)*(a+b*x^(n/k))^p,x],x,x^k] /;
 k!=1] /;
FreeQ[{a,b,p},x] && PositiveIntegerQ[n] && IntegerQ[m]

As the condition k != 1 is checked within the With block. This is missing in our current implementation .

So the correct implementation will look like following.

def With79(b, n, a, p, m, x):
        k = GCD(m + S(1), n)
        if Unequal(k, S(1)):
            return Subst(Int(x**(S(-1) + (m + S(1))/k)*(a + b*x**(n/k))**p, x), x, x**k)/k
        return False
    pattern79 = Pattern(Integral(x_**WC('m', S(1))*(a_ + x_**n_*WC('b', S(1)))**p_, x_), constraint_freeq_a, constraint_freeq_b, constraint_freeq_p, CustomConstraint(lambda n: PositiveIntegerQ(n)), CustomConstraint(lambda m: IntegerQ(m)), CustomConstraint(With79))
rule79 = ReplacementRule(pattern79, lambda b, n, a, p, m, x : With79(b, n, a, p, m, x))
    rubi.add(rule79)

A function named get_matching_rule_definition(expr, var) has been defined in rubi, which returns the list or rules which match to expr . In cases of wrong results this function proves to be helpful. We can go to the rule and compare with Mathematica's output.

  • Order of matching in case of multiple rules

There are cases when a pattern matches to multiple rules. Mathematica has a way to sort the rules. The DownValues returns the sorted rules. As we are parsing the DownValues, we already have the sorted rules. If we want matchpy to work in that order, we need to use multiple matchers and specify the order to matchpy. But order currently seems to have less effect. If needed , multiple matchers will be used. There is an ongoing discussion regarding this in matchpy's developers.

3. Utility Functions

The performance of rubi has a significant dependence on utility functions. Last year Abdullah mainly worked on utility functions and its testing. But still it gives wrong results in some places. Many of them has wrong definitons and they are not tested properly.

  • Missing Utility Functions

Some utility functions like Dist , FixRhsIntRule , DerivativeDivides etc were not implemented last year. As rubi has a high dependence on them, the set of utility functions needs to be complete. I have implemented till now Dist and DerivativeDivides and few small utility function. Some of the missing utility functions are as:

  • FixIntRule
  • FixIntRules
  • FixRhsIntRule
  • HeldFormQ
  • CalculusFreeQ
  • DeleteCases
  • SubstForInverseFunctionOfQuotientOfLinears
  • ContentFactorAux

etc.

Also functions which are pre-defined in Mathematica but not in sympy needs to be defined. It includes functions such as SinIntegral, CosIntegral, Optional, Hold, HoldForm etc.

As an example, here is the implementation of Hold and HeldFormQ :

def Hold(expr):
    from sympy import UnevaluatedExpr
    return UnevaluatedExpr(expr)

def HeldFormQ(u):
    if AtomQ(Head(u)):
        MemberQ([Hold],Head(u))

    else:
        HeldFormQ(Head(u))
  • Fixing bugs in the utility functions

There are probably lots of bugs in the utility functions. This is also a reason for wrong results of integral expression. The set of test of the utility funtion is not so enhanced. In cases of wrong integrals, the output of the utility function has to be matched with Mathematica's output. Also in some cases the bug is due to impropoer definitions.

Like the following very small example

def EvenQ(u):
    # gives True if expr is an even integer, and False otherwise.
    return u.is_Integer and u%2 == 0

A proper definition will look like this:

def EvenQ(u):
    # gives True if expr is an even integer, and False otherwise.
    if isinstance(u, (int, float)):
        u = S(u)

    return u.is_Integer and u%2 == 0

The above is just a small example. There are more bugs in utility function , which could be a major reason for wrong results. The test cases regarding this also needs to be improved.

  • Replacing SymPy’s pattern matcher with matchpy

Also in utility functions the SymPy’s naive pattern matcher which is quite slow is to be replaced with matchpy to enhance the performance. There are functions like ExpandAlgebraicFunction, MatchQ, CollectReciprocals, MergeMonomials, RationalFunctionExpand` etc, which uses sympy's matcher. These should be replaced with mathcpy.

For example

def MatchQ(expr, pattern, *var):
    # returns the matched arguments after matching pattern with expression
    match = expr.match(pattern)
    if match:
        return tuple(match[i] for i in var)
    else:
        return None

This should be replaced with matchpy's match like this:

def MatchQ(expr, pattern, *var):
    from matchpy import match as ma
    from matchpy import is_match
    # returns the matched arguments after matching pattern with expression
    m = list(ma(expr, pattern))
    if m:
        a = []
        for substitution in m:
            a.append(tuple(substitution[i] for i in var)) 
        return a
    else:
        return None

There is also is_match function in matchpy which returns if a pattern matches. So by using more enhanced functions of matchpy, utility functions can be improved.

4. Test set for integral expressions

There are large test sets on website of RUBI. Sympy hardly contains the required amount of tests. These tests need to be tested and set of tests are to be created in sympy itself, that will check the performance of rubi in sympy.

The integraton test suite results on rubi's website shows that it dramatically out-performs Maple and Mathematica. Rubi is optimal in 99.8 % cases of the test cases, which is very high when compared to Mathmetica (81.9 %) and Maple (62.1 %) .

While testing the cases, major part of debugging will be done.

5. Update rules and utility functions to new version of rubi

Last year Arihant and Abdullah worked with was 4.10.8, but now it has been upgraded to 4.14.7. So the changes needs to be updated too. There are new utility functions and also many utility functions are renamed. So accordingly the rules has to be updated. Also new rules has been added.

Few changes in utility function

  • ZeroQ (removed)
  • PossibleZeroQ (added)
  • NonzeroQ (removed)
  • ILtQ (added)
  • IGtQ (added)
  • LtQ (added)
  • GtQ (added)
  • RealNumberQ (added)
  • NormalizeHyperbolic (added)
  • SubstNonbinomial(added)

etc.

There are a lot of changes in utility functions. It has been improved a lot. So this enhancement of the utility functions will result in less errors in finding integrals.

  • Parsetools also needs to be updated in accordance with the new rules. With few commands , we may be able to generate whole new rules in sympy.

6. Proper test cases for parsetools

I think the set of tests for parsetools needs to be enhanced. Parsetools depends upon sympify, Function, Set, Symbol etc. Any changes in these part or ay changes in parts which affect these may cause parsetools to fail. So a proper set of test keeps a check on the correctnes of parsetools.

Timeline (tentative)

My prior aim will be to get all rules get corrected. So in accordance with this here is my proposed timeline. It is not fixed, as it may change later on discussion with mentors.

Community Bonding period (23rd April - 14th May) and Week 1

  • Regenerate rules with removal of repeated definition of Constraints
  • Remove any dependencies on lambda

Week 2, 3 and 4

  • Load rules of integrand_simplification , trinomial_products, miscellaneous_algebraic, exponential, logarithms.
  • Write test cases for above rules and fixing the bugs. Fixing bugs will take a major partof the time.
  • Define missing functions and improve the functions if needed as required in these rules.

Week 5 , 6 and 7

  • Load rules of sine, tangent, secant, miscellaneous_trig, inverse_trig.
  • Write test cases for above rules and fixing the bugs. Fixing bugs will take a major partof the time.
  • Define missing functions and improve the functions if needed as required in these rules.

Week 8, 9 and 10

  • Load rules of hyperbolic, inverse_hyperbolic, piecewise_linear, miscellaneous_integration.
  • Write test cases for above rules and fixing the bugs. Fixing bugs will take a major partof the time.
  • Define missing functions and improve the functions if needed as required in these rules.

Week 11 and 12

Week 13

Any Plans/Commitment (During GSoC)

I have no major plans for summer and I will contribute full time for 40 - 50 hours a week. My college restart in mid July but I will still be able to contribute full time since there will be no exams or tests.

How do I fit in?

I have been working with sympy from quite some time. I am familiar with a significant section of the rubi's code and related Mathematica's part and I am well aware of the challenges of this project. Some of my contributions to sympy are directly relevant to the this project. For example I have implemented Dist, DerivativeDivides and some of the missing utility functions. I also tried to reduce the loading time, but I have to update parsetools for automatic generating of rules.

I am confident that I have the skills and the experience to succesfully complete the project.

Though there are some loose ends like I don't have a very clear idea of how to generate decision tree and I don't claim to have read all about it. But I can fill these gaps in the pre gsoc period, and many other details can be worked out during the coding period.

Relevant Issues/ Discussions and References