Use hash table instead of binary search to look up 32-bit pseudoprimes in n_is_prime #2509
Conversation
Does looking up the pseudoprimes perform favorably compared to looking up the witness to use, like Forisek and Jancina's 32-bit test? If you want to reduce the number of pseudoprimes to store, you can use a different initial witness. 2 is actually very average when used by itself; its strength is that it eliminates the Monier-Rabin semiprimes { (2x+1)(4x+1) where x is odd } that many other witnesses are weak to. Given a pair (witness, odd_strong_pseudoprimes_under_2^32) we have (15, 1883), (34, 2009) and (37, 1959), and given their proximity to a power of two (16-1, 32+2, 32+5) you might be able to exploit their form for faster Montgomery exponentiation, as is done with 2. I'm not aware of any witness that has only 1024 pseudoprimes under 2^32, as convenient as that might be.
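For context, counting "strong pseudoprimes to a witness a" refers to composites that pass a strong probable prime test to base a. Here is a minimal, generic sketch of such a test (plain 64-bit modular arithmetic, no Shoup or Montgomery tricks), not FLINT's actual implementation:

```c
#include <stdint.h>

/* Generic sketch of a strong probable prime test to base a, for odd n > 2.
   Uses the unsigned __int128 extension (GCC/Clang) for the modular multiply. */
static uint64_t mulmod64(uint64_t x, uint64_t y, uint64_t n)
{
    return (uint64_t)(((unsigned __int128) x * y) % n);
}

static int is_strong_probable_prime(uint64_t n, uint64_t a)
{
    uint64_t d = n - 1;
    int s = 0;

    while ((d & 1) == 0)          /* write n - 1 = d * 2^s with d odd */
    {
        d >>= 1;
        s++;
    }

    uint64_t x = 1, base = a % n, e = d;
    while (e)                     /* x = a^d mod n by square-and-multiply */
    {
        if (e & 1)
            x = mulmod64(x, base, n);
        base = mulmod64(base, base, n);
        e >>= 1;
    }

    if (x == 1 || x == n - 1)
        return 1;                 /* n is a (strong) probable prime to base a */

    for (int r = 1; r < s; r++)   /* check whether a^(d*2^r) == -1 mod n */
    {
        x = mulmod64(x, x, n);
        if (x == n - 1)
            return 1;
    }

    return 0;                     /* n is definitely composite */
}
```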
Yes, though this may differ depending on the CPU and minute implementation differences. Random input will be declared composite by the base-2 test with high probability, so in the average case we benefit from the fast powering of base 2 and from the fact that we don't need any hash table lookup at all. For primes (or, rarely, base-2 pseudoprimes), the additional hash table lookup after the strong probable prime test should have comparable cost to a hash table lookup before the test. BTW, note that we use Shoup modular reduction for the 32-bit powering and Montgomery reduction only above 32 bits. This was faster on my machine, but again, this may differ on other machines (and with any micro-optimizations I've overlooked).
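The control flow being described might be sketched as follows; the helper names are hypothetical, not FLINT's actual API:

```c
#include <stdint.h>

/* Assumed helpers (not FLINT's real functions): a base-2 strong probable
   prime test and a membership test against the table of the base-2
   strong pseudoprimes below 2^32. */
int sprp_base2_u32(uint32_t n);
int pseudoprime_table_contains(uint32_t n);

int is_prime_u32_sketch(uint32_t n)
{
    if (n < 2)
        return 0;
    if (n % 2 == 0)
        return n == 2;

    /* Fast path: almost all random composites fail the base-2 test,
       so the pseudoprime table is never touched for them. */
    if (!sprp_base2_u32(n))
        return 0;

    /* Slow path: n is either prime or one of the base-2 strong
       pseudoprimes below 2^32; the hash table lookup decides which. */
    return !pseudoprime_table_contains(n);
}
```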
Instead of a branchy binary search, do a branch-free O(1) hash table lookup. Inspired in part by a discussion with David Sparks.
Gives a 5-10% speedup for random integers and 15-40% speedup for primes:
I'm not an expert on designing perfect hash functions; these are the parameters I came up with after a bit of trial and error. The hash table stores the 2314 pseudoprimes in an array of 2560 entries, requiring 10% zero padding (plus a bit of extra space for hash keys).
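A minimal sketch of what such a branch-free lookup might look like is below; the table size matches the description above, but the multiplier and shift are placeholders, not the hash parameters actually used in the PR, and the table contents are omitted:

```c
#include <stdint.h>

#define TABLE_SIZE 2560   /* 2314 pseudoprimes plus roughly 10% zero padding */

/* Base-2 strong pseudoprimes below 2^32, each stored in the slot chosen by
   the hash; empty slots hold 0. Real contents omitted in this sketch. */
static const uint32_t pseudoprime_table[TABLE_SIZE] = { 0 /* ... */ };

/* Hypothetical branch-free membership test: with a true perfect hash each
   pseudoprime lands in a distinct slot, so one comparison suffices. */
static int pseudoprime_table_contains(uint32_t n)
{
    uint32_t h = (uint32_t)((n * UINT64_C(0x9E3779B97F4A7C15)) >> 32) % TABLE_SIZE;
    return pseudoprime_table[h] == n;
}
```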
Using an array of length 4096 instead would probably be a little faster by allowing a cheaper modulo operation, but it would waste 6 KB of precious L1 cache. Maybe one could instead split by the size of n into one small 512-entry table and one large 2048-entry table, if someone else wants to revisit this after me.
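For reference, the trade-off is that a power-of-two table length lets the slot index be computed with a single mask instead of a modulo; a toy comparison (sizes and hash value purely illustrative):

```c
#include <stdint.h>

/* Illustrative only: 2560 slots need a modulo (division or a
   reciprocal-multiply trick), while 4096 slots need a single AND. */
uint32_t index_2560(uint32_t hash) { return hash % 2560; }
uint32_t index_4096(uint32_t hash) { return hash & 4095; }
```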