divisor_sigma becomes sluggish over a range of values #14666

Open
OisinMoran opened this issue Apr 27, 2018 · 4 comments

@OisinMoran

I was delighted to find that sympy has a built-in divisor_sigma function, as I had previously been using the following (which still uses sympy's factorint):

from sympy.ntheory import divisor_sigma
from sympy.ntheory import factorint

def sum_divisors(n):
    # sigma_1(n): multiply together the geometric sums 1 + p + ... + p**e
    # over each prime power p**e in the factorization of n.
    divisor_sum = 1
    factors = factorint(n)

    for factor, power in factors.items():
        temp = 0
        for ex in range(power + 1):
            temp += pow(factor, ex)
        divisor_sum *= temp
    return divisor_sum

So when I found out about the built-in function, I was eager to delete my code and get some speedup in my program. On first glance the built-in version seemed several orders of magnitude faster than my previous implementation, yet actually including it in my code made things much slower. I have narrowed it down to the simple results shown below:

[screenshot: timings comparing divisor_sigma with sum_divisors for a single value and over a range of values]
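Roughly, the comparison looks like the following sketch (a reconstruction; the exact numbers from the screenshot are not reproduced here, and timings will vary):

import timeit

setup = """
from sympy.ntheory import divisor_sigma, factorint

def sum_divisors(n):
    divisor_sum = 1
    for factor, power in factorint(n).items():
        temp = 0
        for ex in range(power + 1):
            temp += pow(factor, ex)
        divisor_sum *= temp
    return divisor_sum
"""

# One particular value: this is where the built-in looked much faster.
print(timeit.timeit("divisor_sigma(123456789)", setup=setup, number=100))
print(timeit.timeit("sum_divisors(123456789)", setup=setup, number=100))

# Over a range of values: this is where the built-in became much slower.
print(timeit.timeit("[divisor_sigma(n) for n in range(1, 10**4)]", setup=setup, number=1))
print(timeit.timeit("[sum_divisors(n) for n in range(1, 10**4)]", setup=setup, number=1))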

The built-in function is orders of magnitude faster for any particular value, but when called over a range of values it is orders of magnitude slower. For larger ranges it becomes completely unusable:

[screenshot: the same comparison for larger ranges]

What could possibly be causing this and how could it be fixed? Thanks :)

@sidhantnagpal
Member

What could possibly be causing this

Yes, that is expected: divisor_sigma uses factorint to compute the sum. See factor_.py.

When you factorize n = 10**5 - 1 numbers (corresponding to your range-of-values example), the required number of operations becomes significant. (factorint switches between several algorithms: trial division, Pollard's rho, and Pollard's p-1.)
In terms of big-O notation, the time complexity is roughly
sum_{1 <= x <= n} x**(1/4) = O(n**(5/4)), plus constant factors from factorint (where O(x**(1/4)) is the complexity of Pollard's rho for factoring x).

how could it be fixed

I think, if it has to be fixed (in terms of usefulness), the sums of divisors (k = 1) could be generated with a sieve, which is O(n log n) in time, but that cannot be used for large n because it has a memory complexity of O(n). A sketch of the sieve idea follows.
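For reference, a minimal sketch of such a sieve (sigma_sieve is a hypothetical helper, not part of sympy):

def sigma_sieve(n):
    # sigma[m] accumulates the sum of divisors of m for every m in 1..n
    # by adding each d to all of its multiples: O(n log n) time, O(n) memory.
    sigma = [0] * (n + 1)
    for d in range(1, n + 1):
        for multiple in range(d, n + 1, d):
            sigma[multiple] += d
    return sigma

# Example: sigma_sieve(12)[12] == 1 + 2 + 3 + 4 + 6 + 12 == 28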

@ethankward
Contributor

ethankward commented Apr 27, 2018

@sidhantnagpal, that is not correct: OisinMoran's custom method also uses factorint. Most of the time loss comes from the inheritance from Function:

from sympy import *
import time

def timeit(method):
    # Simple timing decorator: prints how long each decorated call takes.
    def timed(*args, **kw):
        ts = time.time()
        result = method(*args, **kw)
        te = time.time()
        if 'log_time' in kw:
            name = kw.get('log_name', method.__name__.upper())
            kw['log_time'][name] = int((te - ts) * 1000)
        else:
            print('%r  %2.2f ms' % (method.__name__, (te - ts) * 1000))
        return result
    return timed

def sum_divisors(n):
    divisor_sum = 1
    factors = factorint(n)

    for factor, power in factors.items():
        temp = 0
        for ex in range(power + 1):
            temp += pow(factor, ex)
        divisor_sum *= temp
    return divisor_sum

def custom_divisor_sigma(n, k=1):
    # Same formula as the built-in divisor_sigma, but as a plain function
    # rather than a Function subclass (no Expr instantiation overhead).
    n = sympify(n)
    k = sympify(k)
    if n.is_prime:
        return 1 + n ** k
    if n.is_Integer:
        if n <= 0:
            raise ValueError("n must be a positive integer")
        else:
            return Mul(*[(p ** (k * (e + 1)) - 1) / (p ** k - 1) if k != 0
                         else e + 1 for p, e in factorint(n).items()])

@timeit
def t1():
    for n in range(1,10000):
        sum_divisors(n)

@timeit
def t2():
    for n in range(1,10000):
        divisor_sigma(n)

@timeit
def t3():
    for n in range(1,10000):
        custom_divisor_sigma(n)

t1()
t2()
t3()

Output:

't1'  86.60 ms
't2'  2365.26 ms
't3'  727.48 ms

As you can see, despite being implemented essentially identically, the built-in divisor_sigma function (t2) is about three times slower than t3.
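One way to check where that overhead lives (a rough sketch, not part of the original comment) is to profile the loop and compare the cumulative time attributed to sympy's expression-construction machinery with the time spent in factorint:

import cProfile
from sympy import divisor_sigma

# Sorting by cumulative time makes it easy to compare the share spent
# constructing Function/Expr objects with the share spent in factorint.
cProfile.run("[divisor_sigma(n) for n in range(1, 5000)]", sort="cumulative")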

@sidhantnagpal
Member

I see your point; your explanation gives insight into the comparison of the two functions.
I was rather pointing out the sluggish behaviour caused by the approach itself, i.e. factoring each value in the range separately.

@ethankward
Contributor

ethankward commented Apr 27, 2018

I'm wondering why divisor_sigma is implemented as a class instead of a bare function: it would be a lot faster not to have to instantiate it.

Answering my own question: it's so that you can use divisor_sigma with symbolic values.
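For example (a small illustration, not from the original comment), with a symbolic argument the call stays unevaluated and can be substituted later:

from sympy import Symbol
from sympy.ntheory import divisor_sigma

n = Symbol('n', integer=True, positive=True)
expr = divisor_sigma(n)   # stays unevaluated because n is symbolic
print(expr)               # divisor_sigma(n)
print(expr.subs(n, 12))   # 28  (= 1 + 2 + 3 + 4 + 6 + 12)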
