GSoC 2018 Application Ashish Kumar Gaurav: RUBI
Name : Ashish Kumar Gaurav
University : Indian Institute of Technology, Kharagpur
email : ashishkg0022@gmail.com
github username : ashishkg0022
Time Zone : IST (UTC +5:30)
Hello, I am Ashish Kumar Gaurav a second year undergraduate student at IIT Kharagpur, India. I'm pursuing a degree in Mathematics and Computing. I work on Ubuntu 16.04 LTS with Sublime Text 3 as my primary text editor.I like Sublime Text 3 because of its simplicity, user-friendly GUI. I'm proficient in C and Python. I was introduced to Object Oriented Programming at an early age and I have been coding for more than 6 years now. I started coding in Java and then followed it up with C, C++ and Python. Coding in Python makes thing more simpler and short. I am familiar with git and github. I have been using them since a year.
I started contributing to SymPy in early November 2017 by solving Easy-to-Fix bugs. After that in Decmber I implemented ordinal arithmetics. I learned a lot through this implementation. Here is a list of all my contributions so far in chronological order:
-
(Merged)
added is_number property for meijerg
#13558 -
(Merged)
removes error from meijerint_definite()
#13562 -
(Merged)
Resolves RecursionError in latex representation
#13564 -
(Merged)
modifying accessibility of S.Reals and S.Complexes
#13572 -
(Merged)
Added two possibilities in Mod( )
#13581 -
(Merged)
passing a printer to codegen()
#13587 -
(Merged)
Fixed bug for polytool.poly( )
#13588 -
(Merged)
Added check of integration while finding Fourier Transform
#13595 -
(Merged)
Removed bug from Complement
#13615 -
(Merged)
Implementation of ordinal arithmetics
#13682 -
(Merged)
Minor change in RUBI
#14090 -
(Merged)
Fixed Bugs in Utility Function of RUBI
#14173. This Pr added few utility functions likeDist
, etc and fixed few bugs inSimpHelp
,_FixSimplfy
etc. -
(Merged)
Added few functions used in rules of RUBI
#14279. This Pr added few functions that were used in rules of RUBI but definitions were missing likeDerivativeDivides
,LogIntegral
. -
(Open)
Removed repeated definition of some CustomConstraints in RUBI
#14391. This Pr reduces the loading time of rules. It is not complete . I need to updateparsetools
. It will soon be completed.
Rubi is an entirely new inclusion into SymPy. It consists of more than 10,000 rules to cover a wide variety of indefinite integration. Currently SymPy uses algorithms for indefinite integration which are slow and presents results which are not simplified. Rubi utilizes a set of well defined rules which makes it smart to present the results in a more symmetric and simplified manner.
Last year, Abdullah Javed Nesar and Arihant Parsoya, worked on it as Google Summer of Code project. But there is still a lot of work to do, as the rules raise exceptions or give wrong results.
Currently in rubi module of sympy, the structure has been ported. But the code has only been tested for Algebraic rules. There are many other test sets for different types of integrands. They need to be tested. While testing, it is expected to face a lot of errors and bugs. They need to be fixed. Also it takes a lot of time to load the rules. The loading time needs to be improved. Next important thing is version of original RUBI. Last year Arihant and Abdullah worked with was 4.10.8, but now it has been upgraded to 4.14.7. So the changes needs to be updated too.
Currently there are following rules file in rubi module :
- miscellaneous_algebraic.py
- binomial_products.py
- miscellaneous_integration.py
- exponential.py
- miscellaneous_trig.py
- hyperbolic.py
- piecewise_linear.py
- quadratic_products.py
- integrand_simplification.py
- secant.py
- inverse_hyperbolic.py
- sine.py
- inverse_trig.py
- tangent.py
- linear_products.py
- trinomial_products.py
- logarithms.py
But all these rules can not be loaded due to large time that it requires. As of now algebraic rules, which consist of linear_products
, quadratic_products
and binomial_products
are being loaded.
if only algebraic rules are being loaded :
%time from sympy.integrals.rubi.rubi import rubi_integrate
CPU times: user 2min 12s, sys: 421 ms, total: 2min 13s
Wall time: 2min 15s
One of the main reason for this is the repeated definitons of the same CustomConstraint
, which currently uses the definition of lambda functions.
For example
CustomConstraint(lambda a, x: FreeQ(a, x))
This definition has been used a large number of times in the rules (~500 times only in quadratic products
). Similarly there are large number of redefining of the same CustomConstraint
.
One way to fix this is by defining a constraint only once in some constraints.py
and importing instead of redefining. Like the following:
Instead of
pattern = Pattern(..., CustomConstraint(lambda a, x: FreeQ(a, x)))
we could write
constraint_freeq_a = CustomConstraint(lambda a, x: FreeQ(a, x))
...
pattern = Pattern(..., constraint_freeq_a)
Then we can reuse constraint_freeq_a
for every occurence of that constraint. This could reduce the loading time sufficiently.
The above code can be extended to remove the dependence of rules on lambda
functions like the following example:
def free_q(a, x):
return FreeQ(a, x)
constraint_freeq_a = CustomConstraint(free_q)
pattern = Pattern(..., constraint_freeq_a)
def replacement(x, m):
return x**(m + S(1))/(m + S(1))
rule = ReplacementRule(pattern, replacement)
rubi.add(rule)
So defining proper functions as above, can remove the dependence of rules on lambda functions. Hence loading time will be much less than current state. While defining customConstraints using proper function it should be taken care that the parameter names have to be the same as the the variable names in the expression
For the above changes, parsetools needs to be updated. As it is very tough to change whole rules manually. We need to make a parsetools that can take care of any change in original rubi . With few coomands, we shall be able to generate new rules in matchpy format.
Initially I am planning to work on RUBI 4.10.8 . Later it can be updated, that will not require much time. The rules of this version can be found here. I have saved the DownValues
of the rules in different files here. For getting these, first a rule is loaded and then we can get it by Int // Downvalues
. These DownValues
are parsed using the parsetools
This is the major and important task that needs to be done. Rubi currently requires a lot of debugging. In many cases, these rules give wrong result or may raise exceptions. There are many undefined functions, which needs to be defined properly. Also rules give some times wrong results. Major part of my project's work will be to debug all errors and bugs. We need to have a correct set of rules.
for example
rubi_integrate(x**4*Sqrt(a + b*x**2),x)
gives
RecursionError: maximum recursion depth exceeded
Some constraints are giving wrong results or are wrongly written, which result in wrong matching and no simplification which results in recursion errors
For example, the above intgeral rubi_integrate(x**4*Sqrt(a + b*x**2),x)
is reulting in recursion error because of a wrong constraint. It matches with rule 79 and does not result in simplification.
currently rule(wrong) is as follows:
pattern79 = Pattern(Integral(x_**WC('m', S(1))*(a_ + x_**n_*WC('b', S(1)))**p_, x_), constraint_freeq_a, constraint_freeq_b, constraint_freeq_p, CustomConstraint(lambda n: PositiveIntegerQ(n)), CustomConstraint(lambda m: IntegerQ(m)), CustomConstraint(lambda k, a, p, n, b, m, x: Unequal(k, S(1))))
def With79(b, n, a, p, m, x):
k = GCD(m + S(1), n)
return Subst(Int(x**(S(-1) + (m + S(1))/k)*(a + b*x**(n/k))**p, x), x, x**k)/k
rule79 = ReplacementRule(pattern79, lambda b, n, a, p, m, x : With79(b, n, a, p, m, x))
rubi.add(rule79)
This is wrong. It can be seen from this same rule written in original RUBI (mathematica).
Int[x_^m_.*(a_+b_.*x_^n_)^p_,x_Symbol] :=
With[{k=GCD[m+1,n]},
1/k*Subst[Int[x^((m+1)/k-1)*(a+b*x^(n/k))^p,x],x,x^k] /;
k!=1] /;
FreeQ[{a,b,p},x] && PositiveIntegerQ[n] && IntegerQ[m]
As the condition k != 1
is checked within the With
block. This is missing in our current implementation .
So the correct implementation will look like following.
def With79(b, n, a, p, m, x):
k = GCD(m + S(1), n)
if Unequal(k, S(1)):
return Subst(Int(x**(S(-1) + (m + S(1))/k)*(a + b*x**(n/k))**p, x), x, x**k)/k
return False
pattern79 = Pattern(Integral(x_**WC('m', S(1))*(a_ + x_**n_*WC('b', S(1)))**p_, x_), constraint_freeq_a, constraint_freeq_b, constraint_freeq_p, CustomConstraint(lambda n: PositiveIntegerQ(n)), CustomConstraint(lambda m: IntegerQ(m)), CustomConstraint(With79))
rule79 = ReplacementRule(pattern79, lambda b, n, a, p, m, x : With79(b, n, a, p, m, x))
rubi.add(rule79)
A function named get_matching_rule_definition(expr, var)
has been defined in rubi, which returns the list or rules which match to expr
. In cases of wrong results this function proves to be helpful. We can go to the rule and compare with Mathematica's output.
There are cases when a pattern matches to multiple rules. Mathematica has a way to sort the rules. The DownValues
returns the sorted rules. As we are parsing the DownValues
, we already have the sorted rules. If we want matchpy to work in that order, we need to use multiple matchers and specify the order to matchpy. But order currently seems to have less effect. If needed , multiple matchers will be used. There is an ongoing discussion regarding this in matchpy's developers.
The performance of rubi has a significant dependence on utility functions. Last year Abdullah mainly worked on utility functions and its testing. But still it gives wrong results in some places. Many of them has wrong definitons and they are not tested properly.
Some utility functions like Dist
, FixRhsIntRule
, DerivativeDivides
etc were not implemented last year. As rubi has a high dependence on them, the set of utility functions needs to be complete. I have implemented till now Dist
and DerivativeDivides
and few small utility function. Some of the missing utility functions are as:
- FixIntRule
- FixIntRules
- FixRhsIntRule
- HeldFormQ
- CalculusFreeQ
- DeleteCases
- SubstForInverseFunctionOfQuotientOfLinears
- ContentFactorAux
etc.
Also functions which are pre-defined in Mathematica but not in sympy needs to be defined. It includes functions such as SinIntegral
, CosIntegral
, Optional
, Hold
, HoldForm
etc.
As an example, here is the implementation of Hold
and HeldFormQ
:
def Hold(expr):
from sympy import UnevaluatedExpr
return UnevaluatedExpr(expr)
def HeldFormQ(u):
if AtomQ(Head(u)):
MemberQ([Hold],Head(u))
else:
HeldFormQ(Head(u))
There are probably lots of bugs in the utility functions. This is also a reason for wrong results of integral expression. The set of test of the utility funtion is not so enhanced. In cases of wrong integrals, the output of the utility function has to be matched with Mathematica's output. Also in some cases the bug is due to impropoer definitions.
Like the following very small example
def EvenQ(u):
# gives True if expr is an even integer, and False otherwise.
return u.is_Integer and u%2 == 0
A proper definition will look like this:
def EvenQ(u):
# gives True if expr is an even integer, and False otherwise.
if isinstance(u, (int, float)):
u = S(u)
return u.is_Integer and u%2 == 0
The above is just a small example. There are more bugs in utility function , which could be a major reason for wrong results. The test cases regarding this also needs to be improved.
Also in utility functions the SymPy’s naive pattern matcher which is quite slow is to be replaced with matchpy to enhance the performance. There are functions like ExpandAlgebraicFunction
, MatchQ
, CollectReciprocals
, MergeMonomials,
RationalFunctionExpand` etc, which uses sympy's matcher. These should be replaced with mathcpy.
For example
def MatchQ(expr, pattern, *var):
# returns the matched arguments after matching pattern with expression
match = expr.match(pattern)
if match:
return tuple(match[i] for i in var)
else:
return None
This should be replaced with matchpy's match
like this:
def MatchQ(expr, pattern, *var):
from matchpy import match as ma
from matchpy import is_match
# returns the matched arguments after matching pattern with expression
m = list(ma(expr, pattern))
if m:
a = []
for substitution in m:
a.append(tuple(substitution[i] for i in var))
return a
else:
return None
There is also is_match
function in matchpy which returns if a pattern matches. So by using more enhanced functions of matchpy, utility functions can be improved.
There are large test sets on website of RUBI. Sympy hardly contains the required amount of tests. These tests need to be tested and set of tests are to be created in sympy itself, that will check the performance of rubi in sympy.
The integraton test suite results on rubi's website shows that it dramatically out-performs Maple and Mathematica. Rubi is optimal in 99.8 % cases of the test cases, which is very high when compared to Mathmetica (81.9 %) and Maple (62.1 %) .
While testing the cases, major part of debugging will be done.
Last year Arihant and Abdullah worked with was 4.10.8, but now it has been upgraded to 4.14.7. So the changes needs to be updated too. There are new utility functions and also many utility functions are renamed. So accordingly the rules has to be updated. Also new rules has been added.
- ZeroQ (removed)
- PossibleZeroQ (added)
- NonzeroQ (removed)
- ILtQ (added)
- IGtQ (added)
- LtQ (added)
- GtQ (added)
- RealNumberQ (added)
- NormalizeHyperbolic (added)
- SubstNonbinomial(added)
etc.
There are a lot of changes in utility functions. It has been improved a lot. So this enhancement of the utility functions will result in less errors in finding integrals.
- Parsetools also needs to be updated in accordance with the new rules. With few commands , we may be able to generate whole new rules in sympy.
I think the set of tests for parsetools needs to be enhanced. Parsetools depends upon sympify
, Function
, Set
, Symbol
etc. Any changes in these part or ay changes in parts which affect these may cause parsetools to fail. So a proper set of test keeps a check on the correctnes of parsetools.
My prior aim will be to get all rules get corrected. So in accordance with this here is my proposed timeline. It is not fixed, as it may change later on discussion with mentors.
- Regenerate rules with removal of repeated definition of Constraints
- Remove any dependencies on
lambda
- Load rules of
integrand_simplification
,trinomial_products
,miscellaneous_algebraic
,exponential
,logarithms
. - Write test cases for above rules and fixing the bugs. Fixing bugs will take a major partof the time.
- Define missing functions and improve the functions if needed as required in these rules.
- Load rules of
sine
,tangent
,secant
,miscellaneous_trig
,inverse_trig
. - Write test cases for above rules and fixing the bugs. Fixing bugs will take a major partof the time.
- Define missing functions and improve the functions if needed as required in these rules.
- Load rules of
hyperbolic
,inverse_hyperbolic
,piecewise_linear
,miscellaneous_integration
. - Write test cases for above rules and fixing the bugs. Fixing bugs will take a major partof the time.
- Define missing functions and improve the functions if needed as required in these rules.
I have no major plans for summer and I will contribute full time for 40 - 50 hours a week. My college restart in mid July but I will still be able to contribute full time since there will be no exams or tests.
I have been working with sympy from quite some time. I am familiar with
a significant section of the rubi's code and related Mathematica's part
and I am well aware of the challenges of this project. Some of my contributions
to sympy are directly relevant to the this project. For example I have implemented Dist
, DerivativeDivides
and
some of the missing utility functions. I also tried to reduce the loading time, but
I have to update parsetools for automatic generating of rules.
I am confident that I have the skills and the experience to succesfully complete the project.
Though there are some loose ends like I don't have a very clear idea of how to generate decision tree and I don't claim to have read all about it. But I can fill these gaps in the pre gsoc period, and many other details can be worked out during the coding period.
- Arihant's GSoC final report
- Abdullah's GSoC final report
- Gitter discussion : https://gitter.im/sympy/Rubi-GSoC17
- RUBI website
- Matchpy's issue #27
- Mathematica's reference of rules and pattern
- #13243 Improvements to Rubi
- Discussion in Matchpy's issue #30