Getprime range and keygen speedup #101

RichardThiessen · 2017-10-09T23:24:35Z

This is the minimal change set.

The speedups could have been put in a separate branch, but they are only ~20 LOC and they simplify some of the other logic.

It adds a getprimebyrange() function, a prime_sieve generator and the associated small_primes list to prime.py.
It adds a randrange() function to randnum.py.
It adds several functions to parallel.py that allow multiprocess execution of arbitrary functions.

parallel.py currently includes a reimplementation of getprime()
I added these functions rather than putting the new getprime code in both prime.py and parallel.py

In key.py there are several changes:
It changes find_p_q() to use the new getprimebyrange() range semantics, with range bounds that ensure the generated primes produce a modulus of the correct bit size without any guess-and-check reattempts.
It changes newkeys() to pass getprimebyrange() into the keygen process, and to convert it to a curried multiprocess version using the new functions in parallel.py when poolsize>1.

If these changes go through, parallel.py can be refactored to remove its dependencies on, and references to, other modules.

That change should be done regardless.

Speedup is roughly what I mentioned in pull request #99.

coveralls · 2017-10-09T23:49:44Z

Coverage decreased (-1.7%) to 90.153% when pulling 567e3f7 on RichardThiessen:getprime_range_and_keygen_speedup into 8affa13 on sybrenstuvel:master.

RichardThiessen · 2017-10-10T00:12:16Z

tests/test_key.py

@@ -52,20 +52,23 @@ def test_custom_getprime_func(self):
        # List of primes to test with, in order [p, q, p, q, ....]
        # By starting with two of the same primes, we test that this is
        # properly rejected.
-        primes = [64123, 64123, 64123, 50957, 39317, 33107]
+        primes = [64123, 64123,#discarded because identical


This unit test mostly worked with the new keygen logic but still needed a few minor changes.
The final accepted primes were changed, along with the nbits argument to gen_keys. This prevents a modulus bit length check from tripping in the new keygen logic.

sybrenstuvel

Thanks for the patch. Please address the notes I placed in-line. Also the library (and thus your patch) should adhere to PEP-8 and the docstrings according to PEP-257.

Can you also include the script you used to calculate the timings? I'd like to run that myself too, and see the speedups for myself.

sybrenstuvel · 2018-02-05T13:31:59Z

rsa/key.py


        *Introduced in Python-RSA 3.1*

-    :param accurate: whether to enable accurate mode or not.


Please keep this parameter description but include a note that it is no longer in use (and as of which version of the library).

sybrenstuvel · 2018-02-05T13:34:00Z

rsa/key.py

+
+    maximum=2**nbits#up to but not including
+    #multiply by fractional representation of 1/sqrt(2)(rounded up) for minimum
+    minimum=(maximum*0xb504f333f9de6484597d89b3754abea0)


This really needs more explanation of why it is done this way (speed? precision? both? something else?). Furthermore, the code that is used to compute the constant needs to be callable and unit tested.

Also, avoid unnecessary parentheses.
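As an illustration of what a callable, testable version of that constant could look like, here is a minimal sketch. It is not the code from the patch: the function names `inv_sqrt2_fixed` and `prime_bounds` are hypothetical, and it assumes Python 3.8+ for `math.isqrt`. The idea is that if both primes lie in [2**k / sqrt(2), 2**k), their product lies in [2**(2k-1), 2**(2k)) and therefore always has exactly 2k bits.

```python
from math import isqrt


def inv_sqrt2_fixed(precision_bits=128):
    """Return ceil(2**precision_bits / sqrt(2)) using integer arithmetic only.

    floor(2**b / sqrt(2)) equals isqrt(2**(2*b - 1)). Because 2**(2*b - 1)
    is never a perfect square, the exact value is irrational and the
    ceiling is simply the floor plus one.
    """
    return isqrt(1 << (2 * precision_bits - 1)) + 1


def prime_bounds(prime_bits):
    """Return (minimum, maximum) for primes of `prime_bits` bits.

    Hypothetical helper: any p, q with minimum <= p, q < maximum give a
    product with exactly 2 * prime_bits bits, so no retry is needed.
    """
    maximum = 1 << prime_bits  # exclusive upper bound
    minimum = isqrt(1 << (2 * prime_bits - 1)) + 1  # ceil(maximum / sqrt(2))
    return minimum, maximum


if __name__ == "__main__":
    # The 128-bit constant quoted in the diff above.
    print(hex(inv_sqrt2_fixed(128)))
    low, high = prime_bounds(512)
    print((low * low).bit_length(), ((high - 1) ** 2).bit_length())
```

Computing the bound with `isqrt` directly avoids the fixed-precision multiplication altogether; whether that is fast enough for the library's purposes would need to be measured.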

sybrenstuvel · 2018-02-05T13:35:26Z

rsa/key.py

+    # value+=1#round up
+    # print(hex(value))
+
+    while 1:#loop allows for restarting keygen process if primes do not meet required conditions


Use while True instead.

sybrenstuvel · 2018-02-05T13:40:32Z

rsa/key.py

-        change_p = not change_p
-
-    # We want p > q as described on
-    # http://www.di-mgt.com.au/rsa_alg.html#crt


Is this no longer true? Why remove the comment but not the code?

sybrenstuvel · 2018-02-05T13:42:44Z

rsa/key.py

@@ -752,13 +739,11 @@ def newkeys(nbits, accurate=True, poolsize=1, exponent=DEFAULT_EXPONENT):
        raise ValueError('Pool size (%i) should be >= 1' % poolsize)

    # Determine which getprime function to use
+    getprime_func = rsa.prime.getprimebyrange


Don't assign and then re-assign three lines later. Just keep the if/else construct that was there. Unless you have a good reason to do it this way (rather than personal preference), of course, but then include an explanation why it's done that way.

sybrenstuvel · 2018-02-05T13:52:54Z

rsa/prime.py

+        for p in small_primes:
+            if p<=start:yield p
+    start|=1#make start odd
+    #We use an offset when doing the trial divisions. It is much smaller than


It is much smaller than the full number.

This is an absolute statement, and as such it is not true. You handle the case of small primes, and that shows that start can actually be relatively small.

If you're writing about the typical usage scenario (when generating primes for a 2048-bit key, for example), make sure that this is clear, and don't write in absolutes.

sybrenstuvel · 2018-02-05T13:54:48Z

rsa/prime.py

+    """Returns a prime number randomly chosen from range(start,end)
+
+    randomly chooses an initial point within the range
+    This can be overriden with the optional initial argument


Why would anyone pass initial when they can also simply pass a higher start?

sybrenstuvel · 2018-02-05T13:55:34Z

rsa/prime.py

+        initial=rsa.randnum.randrange(start, end)
+    #check top part of range
+    for candidate in prime_sieve(initial, end):
+        # Test for primeness


Remove this comment (and similar ones below); it doesn't add anything that the code doesn't already clearly express.
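For readers following the discussion, the search strategy described in the docstring (pick a random point in the range, scan upward to the end, then wrap around to the start) can be sketched as below. This is an assumption-laden illustration, not the patch's implementation: `get_prime_by_range` and `is_probable_prime` are hypothetical names, `secrets` stands in for the library's own random source, and the Miller-Rabin test stands in for the library's primality test. Note that this approach does not pick primes uniformly, since primes that follow large gaps are more likely to be returned.

```python
import secrets


def is_probable_prime(n, rounds=20):
    """Miller-Rabin probabilistic primality test (stand-in for the library's test)."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2**s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = 2 + secrets.randbelow(n - 3)  # random base in [2, n - 2]
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True


def get_prime_by_range(start, end):
    """Return a probable prime from range(start, end), searching from a random point."""
    if end <= start:
        raise ValueError("empty range")
    initial = start + secrets.randbelow(end - start)
    # Scan [initial, end) first, then wrap around to [start, initial).
    for low, high in ((initial, end), (start, initial)):
        for candidate in range(low, high):
            if is_probable_prime(candidate):
                return candidate
    raise ValueError("no prime found in range(%d, %d)" % (start, end))
```

For example, `get_prime_by_range(90, 98)` can only return 97, because that is the only prime in the range.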

sybrenstuvel · 2018-02-05T13:56:46Z

rsa/prime.py

@@ -171,6 +171,86 @@ def getprime(nbits):

            # Retry if not prime

+small_primes=(3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61,
+              67, 71, 73, 79, 83, 89, 97)
+def prime_sieve(start,end):


The name suggests the function yields primes, but this isn't the case. Probably it's better to call it composite_sieve.

Also include in the docstring that this is a generator.
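A minimal sketch of such a generator, under the suggested name, might look like the following. It is not the code from the patch: the residue-tracking approach and the handling of small values are assumptions made for illustration. It yields odd candidates that have no factor among the small primes, so its output still needs a real primality test.

```python
SMALL_PRIMES = (3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61,
                67, 71, 73, 79, 83, 89, 97)


def composite_sieve(start, end):
    """Generator: yield odd candidates in range(start, end) that survive sieving.

    A candidate survives if no prime in SMALL_PRIMES divides it (a small
    prime itself also survives). Survivors are NOT guaranteed to be prime.
    Even numbers, including 2, are never yielded.
    """
    candidate = max(start, 3) | 1  # first odd number >= max(start, 3)
    # Track candidate % p for every small prime; updating these small
    # residues is cheaper than dividing the full candidate each step.
    residues = [candidate % p for p in SMALL_PRIMES]
    while candidate < end:
        if all(r != 0 or candidate == p for r, p in zip(residues, SMALL_PRIMES)):
            yield candidate
        candidate += 2
        residues = [(r + 2) % p for r, p in zip(residues, SMALL_PRIMES)]


if __name__ == "__main__":
    print(list(composite_sieve(100, 110)))
```

Tracking residues keeps the per-candidate arithmetic on small integers when `start` is large, which is the typical case when generating primes for a key; for small ranges the benefit is negligible.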

sybrenstuvel · 2018-02-05T13:58:05Z

rsa/randnum.py

+    span=end-start
+    #get an int with at least 64 extra bits
+    #because of the extra bits, value%span wraps around at least 2^64 times
+    #the non-uniformity of the resulting distribution is below 2**-64


This comment doesn't make much sense unless you already know what's being done here. To me it reads "add extra bits, then handle the trouble we caused by adding extra bits".
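To make the intent of that comment concrete, here is a hedged sketch of the technique it appears to describe: draw more random bits than the span needs, then reduce modulo the span. The function below is an illustration only; it uses `secrets.randbits` as a stand-in for the library's random source, and the `extra_bits` parameter is hypothetical.

```python
import secrets


def randrange(start, end, extra_bits=64):
    """Return a random integer from range(start, end) with negligible bias.

    A plain `value % span` is biased when the number of possible values
    is not a multiple of span. Drawing `extra_bits` more bits than span
    needs means every residue occurs at least 2**extra_bits times, and
    the counts differ by at most one, so the relative bias between any
    two outcomes is below 2**-extra_bits.
    """
    span = end - start
    if span <= 0:
        raise ValueError("empty range")
    value = secrets.randbits(span.bit_length() + extra_bits)
    return start + value % span


if __name__ == "__main__":
    print(randrange(10, 20))
```

The alternative is rejection sampling, which is exactly uniform but takes a variable number of draws; the extra-bits approach trades a bounded, negligible bias for a single draw.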

sybrenstuvel · 2022-01-11T12:59:18Z

Closing this PR due to the inactivity of the original author. The patch is still welcome, but the notes should be addressed.

RichardThiessen added 3 commits October 9, 2017 18:54

these are the nessesary changes

57ba8bf

typo

34d4004

fixed no longer working unit test

567e3f7

sybrenstuvel requested changes Feb 5, 2018

sybrenstuvel added the improvements needed label Sep 16, 2018

Base automatically changed from master to main February 15, 2021 20:06

sybrenstuvel closed this Jan 11, 2022
