divisor_sigma becomes sluggish over a range of values #14666

Open
OisinMoran opened this issue Apr 27, 2018 · 4 comments

@OisinMoran

I was delighted to find that sympy has a built-in divisor_sigma function, as I had previously been using the following (which still uses sympy's factorint):

from sympy.ntheory import divisor_sigma
from sympy.ntheory import factorint

def sum_divisors(n):
    # sigma_1(n): multiply together the geometric sums 1 + p + ... + p**e
    # over each prime power p**e in the factorization of n.
    divisor_sum = 1
    factors = factorint(n)

    for factor, power in factors.items():
        temp = 0
        for ex in range(power + 1):
            temp += pow(factor, ex)
        divisor_sum *= temp
    return divisor_sum

So when I found out about the built-in function, I was eager to delete my code and get some speedup in my program. On first glance the built-in version seemed several orders of magnitude faster than my previous implementation, yet actually including it in my code made things much slower. I have narrowed it down to the simple results shown below:

[screenshot: timings comparing divisor_sigma with sum_divisors for a single value and over a range of values]
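Roughly, the comparison looks like the following sketch (a reconstruction; the exact numbers from the screenshot are not reproduced here, and timings will vary):

import timeit

setup = """
from sympy.ntheory import divisor_sigma, factorint

def sum_divisors(n):
    divisor_sum = 1
    for factor, power in factorint(n).items():
        temp = 0
        for ex in range(power + 1):
            temp += pow(factor, ex)
        divisor_sum *= temp
    return divisor_sum
"""

# One particular value: this is where the built-in looked much faster.
print(timeit.timeit("divisor_sigma(123456789)", setup=setup, number=100))
print(timeit.timeit("sum_divisors(123456789)", setup=setup, number=100))

# Over a range of values: this is where the built-in became much slower.
print(timeit.timeit("[divisor_sigma(n) for n in range(1, 10**4)]", setup=setup, number=1))
print(timeit.timeit("[sum_divisors(n) for n in range(1, 10**4)]", setup=setup, number=1))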

The built-in function is orders of magnitude faster for any particular value, but when called over a range of values it is orders of magnitude slower. For larger ranges it becomes completely unusable:

[screenshot: the same comparison for larger ranges]

What could possibly be causing this and how could it be fixed? Thanks :)

@sidhantnagpal
Member

What could possibly be causing this

Yes, that is expected: divisor_sigma uses factorint to compute the sum. See factor_.py.

When you factorize n = 10**5 - 1 numbers (corresponding to your range-of-values example), the required number of operations becomes significant. (factorint switches between several algorithms: trial division, Pollard's rho, and Pollard's p-1.)
In terms of big-O notation, the time complexity is roughly
sum_{1 <= x <= n} x**(1/4) = O(n**(5/4)), plus constant factors from factorint (where O(x**(1/4)) is the complexity of Pollard's rho for factoring x).

how could it be fixed

I think, if it has to be fixed (in terms of usefulness), the sums of divisors (k = 1) could be generated with a sieve, which is O(n log n) in time, but that cannot be used for large n because it has a memory complexity of O(n). A sketch of the sieve idea follows.
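For reference, a minimal sketch of such a sieve (sigma_sieve is a hypothetical helper, not part of sympy):

def sigma_sieve(n):
    # sigma[m] accumulates the sum of divisors of m for every m in 1..n
    # by adding each d to all of its multiples: O(n log n) time, O(n) memory.
    sigma = [0] * (n + 1)
    for d in range(1, n + 1):
        for multiple in range(d, n + 1, d):
            sigma[multiple] += d
    return sigma

# Example: sigma_sieve(12)[12] == 1 + 2 + 3 + 4 + 6 + 12 == 28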

@ethankward
Contributor

ethankward commented Apr 27, 2018

@sidhantnagpal, that is not correct: OisinMoran's custom method also uses factorint. Most of the time loss comes from the inheritance from Function:

from sympy import *
import time

def timeit(method):
    # Simple timing decorator: prints how long each decorated call takes.
    def timed(*args, **kw):
        ts = time.time()
        result = method(*args, **kw)
        te = time.time()
        if 'log_time' in kw:
            name = kw.get('log_name', method.__name__.upper())
            kw['log_time'][name] = int((te - ts) * 1000)
        else:
            print('%r  %2.2f ms' % (method.__name__, (te - ts) * 1000))
        return result
    return timed

def sum_divisors(n):
    divisor_sum = 1
    factors = factorint(n)

    for factor, power in factors.items():
        temp = 0
        for ex in range(power + 1):
            temp += pow(factor, ex)
        divisor_sum *= temp
    return divisor_sum

def custom_divisor_sigma(n, k=1):
    # Same formula as the built-in divisor_sigma, but as a plain function
    # rather than a Function subclass (no Expr instantiation overhead).
    n = sympify(n)
    k = sympify(k)
    if n.is_prime:
        return 1 + n ** k
    if n.is_Integer:
        if n <= 0:
            raise ValueError("n must be a positive integer")
        else:
            return Mul(*[(p ** (k * (e + 1)) - 1) / (p ** k - 1) if k != 0
                         else e + 1 for p, e in factorint(n).items()])

@timeit
def t1():
    for n in range(1,10000):
        sum_divisors(n)

@timeit
def t2():
    for n in range(1,10000):
        divisor_sigma(n)

@timeit
def t3():
    for n in range(1,10000):
        custom_divisor_sigma(n)

t1()
t2()
t3()

Output:

't1'  86.60 ms
't2'  2365.26 ms
't3'  727.48 ms

As you can see, despite being implemented essentially identically, the built-in divisor_sigma function (t2) is about three times slower than t3.
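One way to check where that overhead lives (a rough sketch, not part of the original comment) is to profile the loop and compare the cumulative time attributed to sympy's expression-construction machinery with the time spent in factorint:

import cProfile
from sympy import divisor_sigma

# Sorting by cumulative time makes it easy to compare the share spent
# constructing Function/Expr objects with the share spent in factorint.
cProfile.run("[divisor_sigma(n) for n in range(1, 5000)]", sort="cumulative")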

@sidhantnagpal
Member

I see your point; your explanation gives insight into the comparison of the two functions.
I was rather pointing out the sluggish behaviour caused by the approach itself, i.e. factoring each value in the range separately.

@ethankward
Contributor

ethankward commented Apr 27, 2018

I'm wondering why divisor_sigma is implemented as a class instead of a bare function: it would be a lot faster not to have to instantiate it.

Answering my own question: it's so that you can use divisor_sigma with symbolic values.
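For example (a small illustration, not from the original comment), with a symbolic argument the call stays unevaluated and can be substituted later:

from sympy import Symbol
from sympy.ntheory import divisor_sigma

n = Symbol('n', integer=True, positive=True)
expr = divisor_sigma(n)   # stays unevaluated because n is symbolic
print(expr)               # divisor_sigma(n)
print(expr.subs(n, 12))   # 28  (= 1 + 2 + 3 + 4 + 6 + 12)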
