New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

count_roots is extremely slow with Python ground types #12602

Open
asmeurer opened this Issue Apr 29, 2017 · 18 comments

Comments

Projects
None yet
4 participants
@asmeurer
Member

asmeurer commented Apr 29, 2017

GMPY ground types:

IPython console for SymPy 1.0.1.dev (Python 3.5.3-64-bit) (ground types: gmpy)

These commands were executed:
>>> from __future__ import division
>>> from sympy import *
>>> x, y, z, t = symbols('x y z t')
>>> k, m, n = symbols('k m n', integer=True)
>>> f, g, h = symbols('f g h', cls=Function)
>>> init_printing()

Documentation can be found at http://docs.sympy.org/dev


In [1]: expr = 11355363812949950368319856364342755712460471081301053527133568171268803160551855579764764406412332136789657466300880824616465555590045220022768132246969281371700283178427904690172215428157788636395727*t**14/500000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 + 1182614101238502509994197939548011046110362591360244720959032955996959698293886871005468894084128139099293801809189908060595593758885614886473934547400040763077455747185622083724725964710198605960741*t**13/6250000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 + 12922822504467751142249299122933324184092020356108036157731097049497758652003692943810675925067800052327142015387959211427374009396705154181837176763552511140169473943304565171121276347837419884681487*t**12/1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 + 2247204646780185719430022273864876084706708953097666720876560045907791931848809022047384483255204086759310635258105261945382035979316693256857004274231432751741774866992749719943120236265693542959121*t**11/10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 + 48361843519832766388699493325560944345496391872452676523436328806727211606456243561884964568166128629309073817110912281835520854136140406763166099011063597874739148993632932049857510934073377073756943*t**10/10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 + 80164052804759260531194459240730834126540153423610579212661871973340813173703351959915044156338949310408408075892534630817446213642840221172696829016781427774802300251456296296549939454943896121381103*t**9/1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 + 11820298688854366697091577677719451524251323356734720588058286064886307168520540386661816641085576247246659191024148765432674755172844550760156289246049795015707597236348994207857339854655564489007679*t**8/10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 + 14908242237501135776503697900202844008307102016637953004464689811953233096672981556176039254271036473296837438230462049062156575391853236190707958824171141585643979204278060814079164562347710212783143*t**7/1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 + 4004804534801340363315980630036975745885352263060800540982728634021250371208042415913703863593076428445232409745085269553713246947392078483528154594010983406996758112107177357774230732091992176355051*t**6/25000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 + 14405532019175131023474809758591882760843355517617477976264800133366074549575009123545658907344444270748700666056555232135755778765629022007752913521423634118196613981546114590366304386999628027814879*t**5/10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 + 10574051936471402714931747549059296527795248964344182731742838740900131997560377847217760142735497446739729085272580576056569967115897852649538335744262984346103108139783500273358254849041434565692381*t**4/1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 + 7621024600438344015600198602526834713218087596509021419939455089811715884880919464619193508300267251654333701818067555976498078593210932512841643006106774171602557804624587725019604901643333157908251*t**3/125000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 + 810596792271328855063337004546095119237768466927680286629788036582515398849767422695474356109018097281604030779219064519000249092915587584571381512608327847454533913096943271752401133736953700148513*t**2/3125000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 + 7250440010579498280072969338541700246700434564503594127678121996192953098480655821398616200867454166215020508242889954318334847876038061179796990134727332078272146610064625333734530143125317393151907*t/10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 + 1

In [2]: %time count_roots(expr)
CPU times: user 662 ms, sys: 11.9 ms, total: 674 ms
Wall time: 698 ms
Out[2]: 0

Python ground types (use ./bin/isympy -t python):

IPython console for SymPy 1.0.1.dev (Python 3.5.3-64-bit) (ground types: python)

These commands were executed:
>>> from __future__ import division
>>> from sympy import *
>>> x, y, z, t = symbols('x y z t')
>>> k, m, n = symbols('k m n', integer=True)
>>> f, g, h = symbols('f g h', cls=Function)
>>> init_printing()

Documentation can be found at http://docs.sympy.org/dev


In [1]: expr = 11355363812949950368319856364342755712460471081301053527133568171268803160551855579764764406412332136789657466300880824616465555590045220022768132246969281371700283178427904690172215428157788636395727*t**14/500000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 + 1182614101238502509994197939548011046110362591360244720959032955996959698293886871005468894084128139099293801809189908060595593758885614886473934547400040763077455747185622083724725964710198605960741*t**13/6250000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 + 12922822504467751142249299122933324184092020356108036157731097049497758652003692943810675925067800052327142015387959211427374009396705154181837176763552511140169473943304565171121276347837419884681487*t**12/1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 + 2247204646780185719430022273864876084706708953097666720876560045907791931848809022047384483255204086759310635258105261945382035979316693256857004274231432751741774866992749719943120236265693542959121*t**11/10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 + 48361843519832766388699493325560944345496391872452676523436328806727211606456243561884964568166128629309073817110912281835520854136140406763166099011063597874739148993632932049857510934073377073756943*t**10/10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 + 80164052804759260531194459240730834126540153423610579212661871973340813173703351959915044156338949310408408075892534630817446213642840221172696829016781427774802300251456296296549939454943896121381103*t**9/1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 + 11820298688854366697091577677719451524251323356734720588058286064886307168520540386661816641085576247246659191024148765432674755172844550760156289246049795015707597236348994207857339854655564489007679*t**8/10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 + 14908242237501135776503697900202844008307102016637953004464689811953233096672981556176039254271036473296837438230462049062156575391853236190707958824171141585643979204278060814079164562347710212783143*t**7/1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 + 4004804534801340363315980630036975745885352263060800540982728634021250371208042415913703863593076428445232409745085269553713246947392078483528154594010983406996758112107177357774230732091992176355051*t**6/25000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 + 14405532019175131023474809758591882760843355517617477976264800133366074549575009123545658907344444270748700666056555232135755778765629022007752913521423634118196613981546114590366304386999628027814879*t**5/10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 + 10574051936471402714931747549059296527795248964344182731742838740900131997560377847217760142735497446739729085272580576056569967115897852649538335744262984346103108139783500273358254849041434565692381*t**4/1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 + 7621024600438344015600198602526834713218087596509021419939455089811715884880919464619193508300267251654333701818067555976498078593210932512841643006106774171602557804624587725019604901643333157908251*t**3/125000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 + 810596792271328855063337004546095119237768466927680286629788036582515398849767422695474356109018097281604030779219064519000249092915587584571381512608327847454533913096943271752401133736953700148513*t**2/3125000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 + 7250440010579498280072969338541700246700434564503594127678121996192953098480655821398616200867454166215020508242889954318334847876038061179796990134727332078272146610064625333734530143125317393151907*t/10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 + 1

In [2]: %time count_roots(expr)
CPU times: user 2min 52s, sys: 2.04 s, total: 2min 54s
Wall time: 2min 57s
Out[2]: 0

Obviously the Python ground types are known to be slower than gmpy2, but I would hope that they could be a little faster than this.

TODO list (based on discussion below):

  • Use math.gcd for Python 3.5 or greater (when gmpy is not installed)
  • Consolidate gcd functions in SymPy (PythonRational shouldn't reimplement the gcd algorithm in its constructor
  • See if Lehmer's algorithm can be implemented in pure Python for Python < 3.5 (might be more difficult)
  • Use "crosswise" gcd in rational multiplication and division (see patches below)
  • Apply these performance improvement to the core Rational as well.
  • Profile and see if any other performance improvements are possible.
@jksuom

This comment has been minimized.

Member

jksuom commented Apr 29, 2017

It seems that the gcd computation in pythonrational is the bottleneck. With big numbers like those above the gcd is seldom greater than one, so there is no cancellation, and the numbers grow in size to nearly 30000 digits. Almost the same number of loop repetitions is necessary in the Euclidean algorithm for each single arithmetic operation with PythonRationals. Moreover, each loop involves long division with big (python) numbers.

gmp applies Lehmer's algoritm that essentially works with one machine word (the most significant one) at a time. It appears to be much more efficient.

@asmeurer

This comment has been minimized.

Member

asmeurer commented Apr 29, 2017

Looks like math.gcd uses Lehmer's algorithm. Sadly, it is Python 3.5 only, but we should definitely be using it in that version.

@asmeurer

This comment has been minimized.

Member

asmeurer commented Apr 29, 2017

This simple patch:

diff --git a/sympy/polys/domains/pythonrational.py b/sympy/polys/domains/pythonrational.py
index fb455d829a..99ea5073bb 100644
--- a/sympy/polys/domains/pythonrational.py
+++ b/sympy/polys/domains/pythonrational.py
@@ -55,10 +55,8 @@ def __init__(self, p, q=1):
             self.p = p
             self.q = q
         else:
-            x, y = p, q
-
-            while y:
-                x, y = y, x % y
+            import math
+            x = math.gcd(p, q)

             if x != 1:
                 p //= x

results in

In [2]: %time count_roots(expr)
CPU times: user 5.19 s, sys: 15.2 ms, total: 5.2 s
Wall time: 5.21 s
Out[2]: 0

which is still slow, but much more reasonable. There are other places in the code where gcd is implemented too (why is there duplication?). They should always use gmpy.gcd if gmpy is available, math.gcd otherwise if in Python 3.5, and the naive algorithm otherwise. We might also want to try to implement Lehmer in pure Python for Python < 3.5.

@asmeurer

This comment has been minimized.

Member

asmeurer commented Apr 29, 2017

I did a profile (with pyinstrument) and it seems with that patch most of the time is still spent in PythonRational.__init__. So I don't know if we can get much faster. Perhaps in __mul__, where it also spends a lot of time, it can be faster by making use of the fact that the two rationals are already in cancelled form?

@asmeurer

This comment has been minimized.

Member

asmeurer commented Apr 29, 2017

I did a little testing, and it does indeed seem to be a little faster (about 1.4x for fractions that don't cancel and 2x for fractions that cancel a lot) to compute a*b as

x1 = math.gcd(a.p, b.q)
x2 = math.gcd(b.p, a.q)
Rational((ap//x1)*(bp//x2), (aq//x2)*(bq//x1))
@asmeurer

This comment has been minimized.

Member

asmeurer commented Apr 29, 2017

With this patch

diff --git a/sympy/polys/domains/pythonrational.py b/sympy/polys/domains/pythonrational.py
index fb455d829a..90aa5c2e1e 100644
--- a/sympy/polys/domains/pythonrational.py
+++ b/sympy/polys/domains/pythonrational.py
@@ -42,7 +42,7 @@ def parent(self):
         from sympy.polys.domains import PythonRationalField
         return PythonRationalField()
 
-    def __init__(self, p, q=1):
+    def __init__(self, p, q=1, _gcd=True):
         if not q:
             raise ZeroDivisionError('rational number')
         elif q < 0:
@@ -55,14 +55,13 @@ def __init__(self, p, q=1):
             self.p = p
             self.q = q
         else:
-            x, y = p, q
+            if _gcd:
+                import math
+                x = math.gcd(p, q)
 
-            while y:
-                x, y = y, x % y
-
-            if x != 1:
-                p //= x
-                q //= x
+                if x != 1:
+                    p //= x
+                    q //= x
 
             self.p = p
             self.q = q
@@ -141,9 +140,12 @@ def __rsub__(self, other):
         return self.__class__(p, q)
 
     def __mul__(self, other):
+        import math
         if isinstance(other, PythonRational):
-            p = self.p*other.p
-            q = self.q*other.q
+            ap, aq, bp, bq = self.p, self.q, other.p, other.q
+            x1 = math.gcd(ap, bq)
+            x2 = math.gcd(bp, aq)
+            p, q = ((ap//x1)*(bp//x2), (aq//x2)*(bq//x1))
         elif isinstance(other, integer_types):
             p = self.p*other
             q = self.q

I get

In [2]: %time count_roots(expr)
CPU times: user 3.68 s, sys: 8.69 ms, total: 3.69 s
Wall time: 3.69 s
Out[2]: 0

I don't know how to make it faster, unless there is a similar trick we can do for addition.

@jksuom

This comment has been minimized.

Member

jksuom commented Apr 29, 2017

It looks like gmp is also using this 'crosswise' gcd for multiplication.

@asmeurer

This comment has been minimized.

Member

asmeurer commented Apr 29, 2017

Oh I forgot to actually disable gcd in the constructor. And a little more speedup from doing ints, and __div__ too

diff --git a/sympy/polys/domains/pythonrational.py b/sympy/polys/domains/pythonrational.py
index fb455d829a..441aa3add3 100644
--- a/sympy/polys/domains/pythonrational.py
+++ b/sympy/polys/domains/pythonrational.py
@@ -42,7 +42,7 @@ def parent(self):
         from sympy.polys.domains import PythonRationalField
         return PythonRationalField()
 
-    def __init__(self, p, q=1):
+    def __init__(self, p, q=1, _gcd=True):
         if not q:
             raise ZeroDivisionError('rational number')
         elif q < 0:
@@ -55,14 +55,13 @@ def __init__(self, p, q=1):
             self.p = p
             self.q = q
         else:
-            x, y = p, q
+            if _gcd:
+                import math
+                x = math.gcd(p, q)
 
-            while y:
-                x, y = y, x % y
-
-            if x != 1:
-                p //= x
-                q //= x
+                if x != 1:
+                    p //= x
+                    q //= x
 
             self.p = p
             self.q = q
@@ -141,16 +140,20 @@ def __rsub__(self, other):
         return self.__class__(p, q)
 
     def __mul__(self, other):
+        import math
         if isinstance(other, PythonRational):
-            p = self.p*other.p
-            q = self.q*other.q
+            ap, aq, bp, bq = self.p, self.q, other.p, other.q
+            x1 = math.gcd(ap, bq)
+            x2 = math.gcd(bp, aq)
+            p, q = ((ap//x1)*(bp//x2), (aq//x2)*(bq//x1))
         elif isinstance(other, integer_types):
-            p = self.p*other
-            q = self.q
+            x = math.gcd(other, self.q)
+            p = self.p*(other//x)
+            q = self.q//x
         else:
             return NotImplemented
 
-        return self.__class__(p, q)
+        return self.__class__(p, q, _gcd=False)
 
     def __rmul__(self, other):
         if not isinstance(other, integer_types):
@@ -162,16 +165,26 @@ def __rmul__(self, other):
         return self.__class__(p, q)
 
     def __div__(self, other):
+        import math
         if isinstance(other, PythonRational):
-            p = self.p*other.q
-            q = self.q*other.p
+            ap, aq, bp, bq = self.p, self.q, other.p, other.q
+            x1 = math.gcd(ap, bp)
+            x2 = math.gcd(bq, aq)
+            p, q = ((ap//x1)*(bq//x2), (aq//x2)*(bp//x1))
         elif isinstance(other, integer_types):
-            p = self.p
-            q = self.q*other
+            x = math.gcd(other, self.p)
+            p = self.p//x
+            q = self.q*(other//x)
+        # if isinstance(other, PythonRational):
+        #     p = self.p*other.q
+        #     q = self.q*other.p
+        # elif isinstance(other, integer_types):
+        #     p = self.p
+        #     q = self.q*other
         else:
             return NotImplemented
 
-        return self.__class__(p, q)
+        return self.__class__(p, q, _gcd=False)
 
     __truediv__ = __div__
In [2]: %time count_roots(expr)
CPU times: user 3.14 s, sys: 8.02 ms, total: 3.15 s
Wall time: 3.15 s
Out[2]: 0
@asmeurer

This comment has been minimized.

Member

asmeurer commented Apr 29, 2017

@asmeurer

This comment has been minimized.

Member

asmeurer commented Apr 29, 2017

I added a TODO list to the top. Everything except for implementing Lehmer's algorithm should be quite easy, if someone else wants to pick up the work.

@asmeurer

This comment has been minimized.

Member

asmeurer commented Apr 29, 2017

I also just remembered that the gcd in the core has a cache, so using that might speed things up even more, supposing the same inputs are encountered multiple times.

@isuruf

This comment has been minimized.

Member

isuruf commented Apr 30, 2017

More performance improvement on top of @asmeurer's patch.

diff --git a/sympy/polys/domains/pythonrational.py b/sympy/polys/domains/pythonrational.py
index 441aa3a..0d81458 100644
--- a/sympy/polys/domains/pythonrational.py
+++ b/sympy/polys/domains/pythonrational.py
@@ -13,6 +13,8 @@
 from sympy.printing.defaults import DefaultPrinting
 from sympy.utilities import public
 
+import math
+
 @public
 class PythonRational(DefaultPrinting, PicklableWithSlots, DomainElement):
     """
@@ -99,15 +101,21 @@ def __neg__(self):
 
     def __add__(self, other):
         if isinstance(other, PythonRational):
-            p = self.p*other.q + self.q*other.p
-            q = self.q*other.q
+            g = math.gcd(self.q, other.q)
+            if g == 1:
+                p, q = self.p*other.q + other.p*self.q, self.q*other.q
+            else:
+                q1, q2 = self.q//g, other.q//g
+                p, q = self.p*q2 + other.p*q1, q1*q2
+                g2 = math.gcd(p, g)
+                p, q = (p // g2), q * (g // g2)
         elif isinstance(other, integer_types):
             p = self.p + self.q*other
             q = self.q
         else:
             return NotImplemented
 
-        return self.__class__(p, q)
+        return self.__class__(p, q, _gcd=False)
 
     def __radd__(self, other):
         if not isinstance(other, integer_types):
@@ -116,19 +124,25 @@ def __radd__(self, other):
         p = self.p + self.q*other
         q = self.q
 
-        return self.__class__(p, q)
+        return self.__class__(p, q, _gcd=False)
 
     def __sub__(self, other):
         if isinstance(other, PythonRational):
-            p = self.p*other.q - self.q*other.p
-            q = self.q*other.q
+            g = math.gcd(self.q, other.q)
+            if g == 1:
+                p, q = self.p*other.q - other.p*self.q, self.q*other.q
+            else:
+                q1, q2 = self.q//g, other.q//g
+                p, q = self.p*q2 - other.p*q1, q1*q2
+                g2 = math.gcd(p, g)
+                p, q = (p // g2), q * (g // g2)
         elif isinstance(other, integer_types):
             p = self.p - self.q*other
             q = self.q
         else:
             return NotImplemented
 
-        return self.__class__(p, q)
+        return self.__class__(p, q, _gcd=False)
 
     def __rsub__(self, other):
         if not isinstance(other, integer_types):
@@ -137,7 +151,7 @@ def __rsub__(self, other):
         p = self.q*other - self.p
         q = self.q
 
-        return self.__class__(p, q)
+        return self.__class__(p, q, _gcd=False)
 
     def __mul__(self, other):
         import math
@@ -159,10 +173,11 @@ def __rmul__(self, other):
         if not isinstance(other, integer_types):
             return NotImplemented
 
-        p = self.p*other
-        q = self.q
+        x = math.gcd(other, self.q)
+        p = self.p*(other//x)
+        q = self.q//x
 
-        return self.__class__(p, q)
+        return self.__class__(p, q, _gcd=False)
 
     def __div__(self, other):
         import math
@@ -192,10 +207,11 @@ def __rdiv__(self, other):
         if not isinstance(other, integer_types):
             return NotImplemented
 
-        p = self.q*other
-        q = self.p
+        x = math.gcd(other, self.p)
+        p = self.q*(other//x)
+        q = self.p//x
 
-        return self.__class__(p, q)
+        return self.__class__(p, q, _gcd=False)
 
     __rtruediv__ = __rdiv__
@asmeurer

This comment has been minimized.

Member

asmeurer commented Apr 30, 2017

@isuruf nice trick. I had to work out the math to see why it works. With that, I get

In [2]: %time count_roots(expr)
CPU times: user 2 s, sys: 16.5 ms, total: 2.02 s
Wall time: 2.02 s
Out[2]: 0

Still 6x slower than gmpy, but that's probably about what we should expect anyway.

By the way, the first Google result for "lehmer's algorithm Python" gives https://bugs.python.org/file9464/lehmer_gcd.py, but there seems to be a bug in it. I tried using it, but I got a mod 0 error from the a, b = int(b), int(a%b) line. I didn't investigate further.

@skirpichev

This comment has been minimized.

Contributor

skirpichev commented May 1, 2017

In the Diofant I get

In [2]: %time count_roots(expr)
CPU times: user 15.1 s, sys: 128 ms, total: 15.3 s
Wall time: 17.3 s
Out[2]: 0

for recent CPython (3.5+).

Perhaps, it also does make sense for you - just drop PythonRational class at all and use Fraction. __mul__-type optimizations could be submitted to upstream.

I did a little testing, and it does indeed seem to be a little faster (about 1.4x for fractions that don't cancel and 2x for fractions that cancel a lot) to compute a*b as

There are paper by G.Collins (google for "Rational Number Arithmetics") with such optimizations. I think, last patch for add/sub is also mentioned there.

BTW this testing was for 3.5+?

@asmeurer

This comment has been minimized.

Member

asmeurer commented May 1, 2017

If someone wants to upstream fixes to the standard library Fraction, go ahead. I likely won't have time to do it myself. We should fix SymPy regardless, as it will be too long to wait for an upstream fix (at best we would have to wait until 2020 when we drop Python 2 support). I remember Fraction used to be used for the Python ground types, but it was dropped, I believe for performance reasons.

Actually, what I'd like to see is for Rational to be made faster (use gmpy when possible; use the faster algorithms). If we can make Rational as fast as PythonRational, then there would be no need to have a separate class. Even if we can't, it would make the core faster. Maybe it's beyond the scope of this particular issue, though.

Yes, all testing was in 3.5. I would like to get a pure Python Lehmer implementation to test as well, but the one I found didn't work, and I haven't messed around with it.

There are paper by G.Collins (google for "Rational Number Arithmetics") with such optimizations. I think, last patch for add/sub is also mentioned there.

This was all I found. Is it what you were referring to?
https://link.springer.com/chapter/10.1007%2F978-3-7091-3406-1_13#page-1

@asmeurer

This comment has been minimized.

Member

asmeurer commented May 1, 2017

Some git history digging:

  • Use of a separate gcd algorithm was a performance improvement: aac27c8
  • PythonRational is (supposedly) faster than Fraction because it doesn't use ABCs: aac27c8
@skirpichev

This comment has been minimized.

Contributor

skirpichev commented May 2, 2017

If we can make Rational as fast as PythonRational

I doubt. Rational has additional overhead with assumptions machinery and so on. Probably that's the reason why it's slow. There is not difference with implementation of arithmetic (Maybe gmpy could help if large integers - are typical use case, as for polys tests).

Is it what you were referring to?

http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=A891FFCCDE46AB48D9B39B7A8C864A0A?doi=10.1.1.114.3437&rep=rep1&type=pdf

faster than Fraction because it doesn't use ABCs: aac27c8

Perhaps, you are about c20b17e. Unfortunately, from the commit message it's not clear if there was performance issue with ABC (and how bad things were) or just compatibility problems.

@asmeurer

This comment has been minimized.

Member

asmeurer commented May 2, 2017

Yep sorry, meant c20b17e.

asmeurer added a commit to asmeurer/sympy that referenced this issue May 19, 2017

asmeurer added a commit to asmeurer/sympy that referenced this issue May 19, 2017

asmeurer added a commit to asmeurer/sympy that referenced this issue May 19, 2017

skirpichev added a commit to skirpichev/diofant that referenced this issue Oct 6, 2018

Add regression test
  On my system
    In [2]: %time count_roots(expr)  # with GMPY
    CPU times: user 900 ms, sys: 8 ms, total: 908 ms
    Wall time: 968 ms
    In [2]: %time count_roots(expr)  # without GMPY
    CPU times: user 14.5 s, sys: 16 ms, total: 14.5 s
    Wall time: 14.8 s

Closes sympy/sympy#12602

skirpichev added a commit to skirpichev/diofant that referenced this issue Oct 7, 2018

Add regression test
  On my system
    In [2]: %time count_roots(expr)  # with GMPY
    CPU times: user 900 ms, sys: 8 ms, total: 908 ms
    Wall time: 968 ms
    In [2]: %time count_roots(expr)  # without GMPY
    CPU times: user 14.5 s, sys: 16 ms, total: 14.5 s
    Wall time: 14.8 s

Closes sympy/sympy#12602
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment