Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support negative exponents in pow() where a modulus is specified. #80208

Closed
rhettinger opened this issue Feb 18, 2019 · 24 comments
Closed

Support negative exponents in pow() where a modulus is specified. #80208

rhettinger opened this issue Feb 18, 2019 · 24 comments
Assignees
Labels
3.8 only security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@rhettinger
Copy link
Contributor

BPO 36027
Nosy @tim-one, @rhettinger, @mdickinson, @stevendaprano, @encukou, @skrah, @serhiy-storchaka, @pablogsal, @miss-islington, @lschoe
PRs
  • bpo-36027: Extend three-argument pow to negative second argument #13266
  • bpo-36027 bpo-36974: Fix "incompatible pointer type" compiler warnings #13758
  • bpo-36027: Really fix "incompatible pointer type" compiler warning #13761
  • bpo-36027: Fix a potential refcount bug in long_pow #26690
  • [3.10] bpo-36027: Fix a potential reference-counting bug in long_pow (GH-26690) #26703
  • [3.9] bpo-36027: Fix a potential reference-counting bug in long_pow (GH-26690) #26702
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/mdickinson'
    closed_at = <Date 2019-06-02.09:25:02.962>
    created_at = <Date 2019-02-18.20:10:59.398>
    labels = ['3.8', 'type-feature', 'library']
    title = 'Support negative exponents in pow() where a modulus is specified.'
    updated_at = <Date 2021-06-13.07:36:22.386>
    user = 'https://github.com/rhettinger'

    bugs.python.org fields:

    activity = <Date 2021-06-13.07:36:22.386>
    actor = 'miss-islington'
    assignee = 'mark.dickinson'
    closed = True
    closed_date = <Date 2019-06-02.09:25:02.962>
    closer = 'mark.dickinson'
    components = ['Library (Lib)']
    creation = <Date 2019-02-18.20:10:59.398>
    creator = 'rhettinger'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 36027
    keywords = ['patch']
    message_count = 24.0
    messages = ['335862', '335888', '335894', '335921', '335922', '335952', '335956', '335970', '335976', '335987', '335994', '336012', '336028', '336130', '344162', '344260', '344261', '344271', '344330', '344334', '344345', '344454', '362505', '368821']
    nosy_count = 11.0
    nosy_names = ['tim.peters', 'rhettinger', 'phr', 'mark.dickinson', 'steven.daprano', 'petr.viktorin', 'skrah', 'serhiy.storchaka', 'pablogsal', 'miss-islington', 'lschoe']
    pr_nums = ['13266', '13758', '13761', '26690', '26703', '26702']
    priority = 'low'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue36027'
    versions = ['Python 3.8']

    @rhettinger
    Copy link
    Contributor Author

    Having gcd() in the math module has been nice. Here is another number theory basic that I've needed every now and then:

        def multinv(modulus, value):
            '''Multiplicative inverse in a given modulus
                >>> multinv(191, 138)
                18
                >>> 18 * 138 % 191
                1
    
                >>> multinv(191, 38)
                186
                >>> 186 * 38 % 191
                1
    
                >>> multinv(120, 23)
                47
                >>> 47 * 23 % 120
                1
        '''
        # https://en.wikipedia.org/wiki/Modular_multiplicative_inverse
        # http://en.wikipedia.org/wiki/Extended_Euclidean_algorithm
        x, lastx = 0, 1
        a, b = modulus, value
        while b:
            a, q, b = b, a // b, a % b
            x, lastx = lastx - q * x, x
        result = (1 - lastx * modulus) // value
        if result < 0:
            result += modulus
        assert 0 <= result < modulus and value * result % modulus == 1
        return result
    

    @rhettinger rhettinger added 3.8 only security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement labels Feb 18, 2019
    @tim-one
    Copy link
    Member

    tim-one commented Feb 19, 2019

    • Some form of this would be most welcome!

    • If it's spelled this way, put the modulus argument last? "Everyone expects" the modulus to come last, whether in code:

        x = (a+b) % m
        x = a*b % m
        x = pow(a, b, m)

    or in math:

    a^(k*(p-1)) = (a^(p-1))^k = 1^k = 1 (mod p)
    
    • Years ago Guido signed off on spelling this
        pow(value, -1, modulus)

    which currently raises an exception. Presumably

        pow(value, -n, modulus)
        
    for int n > 1 would mean the same as pow(pow(value, -1, modulus), n, modulus), if it were accepted at all.  I'd be happy to stop with -1.
    • An alternative could be to supply egcd(a, b) returning (g, x, y) such that

      ax + by == g == gcd(a, b)

    But I'm not sure anyone would use that _except_ to compute modular inverse. So probably not.

    @rhettinger
    Copy link
    Contributor Author

    If it's spelled this way, put the modulus argument last?

    Yes, that makes sense.

    Years ago Guido signed off on spelling this

    pow(value, -1, modulus)

    +1 ;-)

    @mdickinson
    Copy link
    Member

    +1 for the pow(value, -1, modulus) spelling. It should raise ValueError if value and modulus are not relatively prime.

    It would feel odd to me not to extend this to pow(value, n, modulus) for all negative n, again valid only only if value is relatively prime to modulus.

    @mdickinson
    Copy link
    Member

    Here's an example of some code in the standard library that would have benefited from the availability of pow(x, n, m) for arbitrary negative n:

    cpython/Lib/_pydecimal.py

    Lines 957 to 960 in e7a4bb5

    if self._exp >= 0:
    exp_hash = pow(10, self._exp, _PyHASH_MODULUS)
    else:
    exp_hash = pow(_PyHASH_10INV, -self._exp, _PyHASH_MODULUS)

        if self._exp >= 0:
            exp_hash = pow(10, self._exp, _PyHASH_MODULUS)
        else:
            exp_hash = pow(_PyHASH_10INV, -self._exp, _PyHASH_MODULUS)

    where:

        _PyHASH_10INV = pow(10, _PyHASH_MODULUS - 2, _PyHASH_MODULUS)

    With the proposed addition, that just becomes pow(10, self._exp, _PyHASH_MODULUS), and the _PyHASH_10INV constant isn't needed any more.

    @lschoe
    Copy link
    Mannequin

    lschoe mannequin commented Feb 19, 2019

    Agreed, extending pow(value, n, modulus) to negative n would be a great addition!

    To have modinv(value, modulus) next to that also makes a lot of sense to me, as this would avoid lots of confusion among users who are not so experienced with modular arithmetic. I know from working with generations of students and programmers how easy it is to make mistakes here (including lots of mistakes that I made myself;)

    One would implement pow() for negative n, anyway, by first computing the modular inverse and then raising it to the power -n. So, to expose the modinv() function to the outside world won't cost much effort.

    Modular powers, in particular, are often very confusing. Like for a prime modulus p, all of pow(a, -1,p), pow(a, p-2, p), pow(a, -p, p) are equal to eachother, but a common mistake is to take pow(a, p-1, p) instead. For a composite modulus things get much trickier still, as the exponent is then reduced in terms of the Euler phi function.

    And, even if you are not confused by these things, it's still a bit subtle that you have to use pow(a, -1,p) instead of pow(a, p-2, p) to let the modular inverse be computed efficiently. With modinv() available separately, one would expect --and get-- an efficient implementation with minimal overhead (e.g., not implemented via a complete extended-gcd).

    @mdickinson
    Copy link
    Member

    it's still a bit subtle that you have to use pow(a, -1,p) instead of pow(a, p-2, p) to let the modular inverse be computed efficiently

    That's not 100% clear: the binary powering algorithm used to compute pow(a, p-2, p) is fairly efficient; the extended gcd algorithm used to compute the inverse may or may not end up being comparable. I certainly wouldn't be surprised to see pow(a, p-2, p) beat a pure Python xgcd for computing the inverse.

    @lschoe
    Copy link
    Mannequin

    lschoe mannequin commented Feb 19, 2019

    ... to see pow(a, p-2, p) beat a pure Python xgcd for computing the inverse.

    OK, I'm indeed assuming that modinv() is implemented efficiently, in CPython, like pow() is. Then, it should be considerably faster, maybe like this:

    >>> timeit.timeit("gmpy2.invert(1023,p)", "import gmpy2; p=2**61-1")
    0.18928535383349754
    >>> timeit.timeit("gmpy2.invert(1023,p)", "import gmpy2; p=2**127-1")
    0.290736872836419
    >>> timeit.timeit("gmpy2.invert(1023,p)", "import gmpy2; p=2**521-1")
    0.33174844290715555
    >>> timeit.timeit("gmpy2.powmod(1023,p-2,p)", "import gmpy2; p=2**61-1")
    0.8771009990597349
    >>> timeit.timeit("gmpy2.powmod(1023,p-2,p)", "import gmpy2; p=2**127-1")
    3.460449585430979
    >>> timeit.timeit("gmpy2.powmod(1023,p-2,p)", "import gmpy2; p=2**521-1")
    84.38728888797652

    @mdickinson
    Copy link
    Member

    Then, it should be considerably faster

    Why would you expect that? Both algorithms involve a number of (bigint) operations that's proportional to log(p), so it's going to be down to the constants involved and the running times of the individual operations. Is there a clear reason for your expectation that the xgcd-based algorithm should be faster?

    Remember that Python has a subquadratic multiplication (via Karatsuba), but its division algorithm has quadratic running time.

    @lschoe
    Copy link
    Mannequin

    lschoe mannequin commented Feb 19, 2019

    Is there a clear reason for your expectation that the xgcd-based algorithm should be faster?

    Yeah, good question. Maybe I'm assuming too much, like assuming that it should be faster;) It may depend a lot on the constants indeed, but ultimately the xgcd style should prevail.

    So the pow-based algorithm needs to do log(p) full-size muls, plus log(p) modular reductions. Karatsuba helps a bit to speed up the muls, but as far as I know it only kicks in for quite sizeable inputs. I forgot how Python is dealing with the modular reductions, but presumably that's done without divisions.

    The xgcd-based algorithm needs to do a division per iteration, but the numbers are getting smaller over the course of the algorithm. And, the worst-case behavior occurs for things involving Fibonacci numbers only. In any case, the overall bit complexity is quadratic, even if division is quadratic. There may be a few expensive divisions along the way, but these also reduce the numbers a lot in size, which leads to good amortized complexity for each iteration.

    @rhettinger
    Copy link
    Contributor Author

    +1 for the pow(value, -1, modulus) spelling. It should raise
    ValueError if value and modulus are not relatively prime.

    It would feel odd to me not to extend this to
    pow(value, n, modulus) for all negative n, again
    valid only only if value is relatively prime to modulus.

    I'll work up a PR using the simplest implementation. Once that's in with tests and docs, it's fair game for someone to propose algorithmic optimizations.

    @tim-one
    Copy link
    Member

    tim-one commented Feb 19, 2019

    Raymond, I doubt we can do better (faster) than straightforward egcd without heroic effort anyway. We can't even know whether a modular inverse exists without checking whether the gcd is 1, and egcd builds on what we have to do for the latter anyway. Even if we did know in advance that a modular inverse exists, using modular exponentiation to find it requires knowing the totient of the modulus, and computing the totient is believed to be no easier than factoring.

    The only "optimization" I'd be inclined to _try_ for Python's use is an extended binary gcd algorithm (which requires no bigint multiplies or divides, the latter of which is especially sluggish in Python):

    https://www.ucl.ac.uk/~ucahcjm/combopt/ext_gcd_python_programs.pdf

    For the rest:

    • I'd also prefer than negative exponents other than -1 be supported. It's just that -1 by itself gets 95% of the value.

    • It's fine by me if pow(a, -1, m) is THE way to spell modular inverse. Adding a distinct modinv() function too strikes me as redundnt clutter, but not damaging enough to be worth whining about. So -0 on that.

    @rhettinger
    Copy link
    Contributor Author

    Changing the title to reflect a focus on building-out pow() instead of a function in the math module.

    @rhettinger rhettinger changed the title Consider adding modular multiplicative inverse to the math module Support negative exponents in pow() where a modulus is specified. Feb 20, 2019
    @lschoe
    Copy link
    Mannequin

    lschoe mannequin commented Feb 20, 2019

    In pure Python this seems to be the better option to compute inverses:

    def modinv(a, m):  # assuming m > 0
        b = m
        s, s1 = 1, 0
        while b:
            a, (q, b) = b, divmod(a, b)
            s, s1 = s1, s - q * s1
        if a != 1:
            raise ValueError('inverse does not exist')
        return s if s >= 0 else s + m

    Binary xgcd algorithms coded in pure Python run much slower.

    @mdickinson mdickinson self-assigned this May 12, 2019
    @mdickinson
    Copy link
    Member

    I think #57475 is ready to go, but I'd appreciate a second pair of eyes on it if anyone has time.

    @mdickinson
    Copy link
    Member

    New changeset c529967 by Mark Dickinson in branch 'master':
    bpo-36027: Extend three-argument pow to negative second argument (GH-13266)
    c529967

    @mdickinson
    Copy link
    Member

    Done. Closing.

    @serhiy-storchaka
    Copy link
    Member

    PR 13266 introduced a compiler warning.

    Objects/longobject.c: In functionlong_invmod’:
    Objects/longobject.c:4246:25: warning: passing argument 2 oflong_comparefrom incompatible pointer type [-Wincompatible-pointer-types]
         if (long_compare(a, _PyLong_One)) {
                             ^~~~~~~~~~~
    Objects/longobject.c:3057:1: note: expectedPyLongObject * {aka struct _longobject *}’ but argument is of typePyObject * {aka struct _object *}’
     long_compare(PyLongObject *a, PyLongObject *b)
     ^~~~~~~~~~~~

    @encukou
    Copy link
    Member

    encukou commented Jun 2, 2019

    I will fix the compiler warning along with another one that I just introduced.

    @encukou
    Copy link
    Member

    encukou commented Jun 2, 2019

    New changeset e584cbf by Petr Viktorin in branch 'master':
    bpo-36027 bpo-36974: Fix "incompatible pointer type" compiler warnings (GH-13758)
    e584cbf

    @encukou
    Copy link
    Member

    encukou commented Jun 3, 2019

    New changeset 1e375c6 by Petr Viktorin in branch 'master':
    bpo-36027: Really fix "incompatible pointer type" compiler warning (GH-13761)
    1e375c6

    @mdickinson
    Copy link
    Member

    @petr: Thanks for the quick fix!

    @mdickinson
    Copy link
    Member

    For tracker historians: see also bpo-457066

    @phr
    Copy link
    Mannequin

    phr mannequin commented May 14, 2020

    https://bugs.python.org/issue457066 The old is new again ;-).

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.8 only security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants