Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Getprime range and keygen speedup #101

Closed
wants to merge 3 commits into from
Closed

Getprime range and keygen speedup #101

wants to merge 3 commits into from

Conversation

RichardThiessen
Copy link

@RichardThiessen RichardThiessen commented Oct 9, 2017

This is the minimal change set.

The speedups could have been put in a separate branch but that's only ~20LOC and they simplify some of the other logic.

It adds a getprimebyrange() function, prime_sieve generator and associated small_primes list to prime.py
It adds a randrange() function to randnum.py
It adds several functions to parallel.py allowing for multiprocess execution of arbitrary functions

  • parallel.py currently includes a reimplementation of getprime()
  • I added these functions rather than putting the new getprime code in both prime.py and parallel.py

In key.py there are several changes:
changed find_p_q() to use the new getprimebyrange() range semantics and to use new logic with range bounds that ensure the generated primes produce a modulus of the correct bit size without any guess and check style reattempts.
it changes newkeys() to pass the getprimebyrange() into the keygen process and to convert it to a curried multiprocess version using the new functions in parallel.py when poolsize>1

If these changes go through, parallel.py can be refractored to remove dependency and references to other modules.

That change should be done regardless.

Speedup is roughly what I mentioned in pull request #99.

@coveralls
Copy link

Coverage Status

Coverage decreased (-1.7%) to 90.153% when pulling 567e3f7 on RichardThiessen:getprime_range_and_keygen_speedup into 8affa13 on sybrenstuvel:master.

@@ -52,20 +52,23 @@ def test_custom_getprime_func(self):
# List of primes to test with, in order [p, q, p, q, ....]
# By starting with two of the same primes, we test that this is
# properly rejected.
primes = [64123, 64123, 64123, 50957, 39317, 33107]
primes = [64123, 64123,#discarded because identical
Copy link
Author

@RichardThiessen RichardThiessen Oct 10, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This unit test mostly worked with the new keygen logic but still needed a few minor changes.
The final accepted primes were changed along with the nbits argument to gen_keys. This prevents a modulo bit length check from tripping in the new keygen logic.

Copy link
Owner

@sybrenstuvel sybrenstuvel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the patch. Please address the notes I placed in-line. Also the library (and thus your patch) should adhere to PEP-8 and the docstrings according to PEP-257.

Can you also include the script you used to calculate the timings? I'd like to run that myself too, and see the speedups for myself.


*Introduced in Python-RSA 3.1*

:param accurate: whether to enable accurate mode or not.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please keep this parameter description but include a not that it is no longer in use (and as of which version of the library).


maximum=2**nbits#up to but not including
#multiply by fractional representation of 1/sqrt(2)(rounded up) for minimum
minimum=(maximum*0xb504f333f9de6484597d89b3754abea0)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This really needs more explanation to explain why it is done this way (speed? precision? both? something else?). Furthermore, the code that is used to compute the constant needs to be callable and unit tested.

Also, avoid unnecessary parentheses.

# value+=1#round up
# print(hex(value))

while 1:#loop allows for restarting keygen process if primes do not meet required conditions
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use while True instead.

change_p = not change_p

# We want p > q as described on
# http://www.di-mgt.com.au/rsa_alg.html#crt
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this no longer true? Why remove the comment but not the code?

@@ -752,13 +739,11 @@ def newkeys(nbits, accurate=True, poolsize=1, exponent=DEFAULT_EXPONENT):
raise ValueError('Pool size (%i) should be >= 1' % poolsize)

# Determine which getprime function to use
getprime_func = rsa.prime.getprimebyrange
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't assign and then re-assign three lines later. Just keep the if/else construct that was there. Unless you have a good reason to it this way (rather than personal preference), of course, but then include an explanation why it's done that way.

for p in small_primes:
if p<=start:yield p
start|=1#make start odd
#We use an offset when doing the trial divisions. It is much smaller than
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is much smaller than the full number.

This is an absolute statement, and as such it is not true. You handle the case of small primes, and that shows that start can actually be relatively small.

If you're writing about the typical usage scenario (when generating primes for a 2048-bit key, for example), make sure that this is clear, and don't write in absolutes.

"""Returns a prime number randomly chosen from range(start,end)

randomly chooses an initial point within the range
This can be overriden with the optional initial argument
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would anyone pass initial when they can also simply pass a higher start?

initial=rsa.randnum.randrange(start, end)
#check top part of range
for candidate in prime_sieve(initial, end):
# Test for primeness
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this comment (and similar ones below); it doesn't add anything that the code doesn't already clearly express.

@@ -171,6 +171,86 @@ def getprime(nbits):

# Retry if not prime

small_primes=(3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61,
67, 71, 73, 79, 83, 89, 97)
def prime_sieve(start,end):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name suggest the function yields primes, but this isn't the case. Probably it's better to call it composite_sieve.

Also include in the docstring that this is a generator.

span=end-start
#get an int with at least 64 extra bits
#because of the extra bits, value%span wraps around at least 2^64 times
#the non-uniformity of the resulting distribution is below 2**-64
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment doesn't make much sense unless you already know what's being done here. To me it reads "add extra bits, then handle the trouble we caused by adding extra bits".

Base automatically changed from master to main February 15, 2021 20:06
@sybrenstuvel
Copy link
Owner

Closing this PR due to the inactivity of the original author. The patch is still welcome, but the notes should be addressed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants