Improve isprime performance for 32 bit integers #9

pabloferz · 2016-05-27T10:22:14Z

As I commented over there there is another possibility (show below) instead of the current PR. It has been argued that it uses more memory (which is true, see JuliaLang/julia#16349 (comment)), but is much less memory than the current implementation anyway and is faster than this PR.

const bases = UInt16[15591,2018,166,7429,8064,16045,10503,4399,1949,1295,2776,3620,560,3128,5212,
2657,2300,2021,4652,1471,9336,4018,2398,20462,10277,8028,2213,6219,620,3763,4852,5012,3185,
1333,6227,5298,1074,2391,5113,7061,803,1269,3875,422,751,580,4729,10239,746,2951,556,2206,
3778,481,1522,3476,481,2487,3266,5633,488,3373,6441,3344,17,15105,1490,4154,2036,1882,1813,
467,3307,14042,6371,658,1005,903,737,1887,7447,1888,2848,1784,7559,3400,951,13969,4304,177,41,
19875,3110,13221,8726,571,7043,6943,1199,352,6435,165,1169,3315,978,233,3003,2562,2994,10587,
10030,2377,1902,5354,4447,1555,263,27027,2283,305,669,1912,601,6186,429,1930,14873,1784,1661,
524,3577,236,2360,6146,2850,55637,1753,4178,8466,222,2579,2743,2031,2226,2276,374,2132,813,
23788,1610,4422,5159,1725,3597,3366,14336,579,165,1375,10018,12616,9816,1371,536,1867,10864,
857,2206,5788,434,8085,17618,727,3639,1595,4944,2129,2029,8195,8344,6232,9183,8126,1870,3296,
7455,8947,25017,541,19115,368,566,5674,411,522,1027,8215,2050,6544,10049,614,774,2333,3007,
35201,4706,1152,1785,1028,1540,3743,493,4474,2521,26845,8354,864,18915,5465,2447,42,4511,
1660,166,1249,6259,2553,304,272,7286,73,6554,899,2816,5197,13330,7054,2818,3199,811,922,350,
7514,4452,3449,2663,4708,418,1621,1171,3471,88,11345,412,1559,194]

function witnesses(n::Union{UInt8,Int8,UInt16,Int16,UInt32,Int32})
    i = UInt64(n)
    i = ((i >> 16) $ i) * 0x45d9f3b
    i = ((i >> 16) $ i) * 0x45d9f3b
    i = ((i >> 16) $ i) & 255 + 1
    bases[i]
end

coveralls · 2016-05-27T10:29:11Z

Coverage increased (+0.9%) to 96.97% when pulling aac32f1 on pz/isprime into 078a15c on master.

simonbyrne · 2016-05-27T10:44:54Z

Sorry, I haven't really followed the discussion on the other threads (I don't actually know all that much about prime number algorithms), but is the only downside of the alternative method that it needs a slightly larger lookup table?

If so, that's definitely worth it: 512 bytes is basically nothing (a processor these days typically has 64K L1 cache), special functions implementations often use way larger tables, e.g.
https://github.com/JuliaLang/julia/blob/e11ff359f44121d9b9bf6057769465b29aa28405/base/special/log.jl

Also, add a reference to the paper in the comments (this should also be included in the eventual documents as well).

mschauer · 2016-05-27T12:30:20Z

src/Primes.jl

+    for m in (3,5,7,11,13,17,19,23)
+        n % m == 0 && return false
+    end
+    n < 841 && return n > 1
    s = trailing_zeros(n-1)
    d = (n-1) >>> s
    for a in witnesses(n)


Copied comment:

I think you can give a type hint

for a in witnesses(n)::Tuple{Vararg{Int}}

making the witnesses stack allocated.

pabloferz · 2016-05-27T12:49:39Z

@simonbyrne I also think the larger lookup table is worth it. But the first time the improvement strategy was discussed there was someone that wasn't so happy about it.

So I'll change it to the other version and will address your comments about the reference.

simonbyrne · 2016-05-27T13:03:16Z

might also see some speedup by using @inbounds bases[i].

coveralls · 2016-05-27T14:11:01Z

Coverage increased (+3.2%) to 99.248% when pulling 77debe1 on pz/isprime into 078a15c on master.

hayd · 2016-05-27T17:27:40Z

src/Primes.jl

-    i = (((n >> 16) $ n) * 0x45d9f3b) & 0xffffffff
-    i = ((i >> 16) $ i) & 15 + 1
-    (2,(2249,483,194,199,15,369,499,945,419,735,33,471,946,615,497,702)[i])
+#     Forišek and Jančina, "Fast Primality Testing for Integers That Fit into a Machine Word", 2015


is this FJ32_256? that seem to be the only one described in the paper (though it's a little different?).

I just took the hash and bases from the FJ32_256 there, since we were already had a Miller-Rabin test implemented.

That was the only thing we really needed, since everything else was already here.

I see, thanks! Would be good to note that this is the algorithm used :) Is there any value is the 64-bit version (for 64 bit types)? the paper seems to suggest they weren't sure.

I'll mention the algorithm used. The 64-bit version needs a way larger look-up table and is probably better to try one of the approaches discussed here JuliaLang/julia#11594

mschauer · 2016-06-01T11:46:15Z

Looks good.

coveralls · 2016-06-01T12:23:58Z

Coverage increased (+0.9%) to 96.992% when pulling 4dc4fae on pz/isprime into 078a15c on master.

simonbyrne · 2016-06-07T09:44:11Z

So is this good to merge?

simonbyrne · 2016-06-07T09:49:25Z

Great, thanks!

Improve isprime performance for 32 bit integers

aac32f1

This was referenced May 27, 2016

Improve isprime performance for 32-bit integers JuliaLang/julia#16349

Closed

Move PRs from Base #8

Closed

mschauer reviewed May 27, 2016
View reviewed changes

pabloferz added 2 commits May 27, 2016 15:48

Forisek and Jancina algorithm

0f9c62f

Add missing return

77debe1

hayd reviewed May 27, 2016
View reviewed changes

Indicate algorithm used

4dc4fae

mschauer mentioned this pull request Jun 6, 2016

Performance regression in isprimes JuliaLang/julia#16780

Closed

simonbyrne merged commit 5407ca0 into master Jun 7, 2016

simonbyrne deleted the pz/isprime branch June 7, 2016 09:49

KristofferC mentioned this pull request May 24, 2017

Performance regression in isprimes #46

Closed

rfourquet mentioned this pull request Jun 25, 2017

Better algorithm for factoring + refactoring of isprime #53

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve isprime performance for 32 bit integers #9

Improve isprime performance for 32 bit integers #9

pabloferz commented May 27, 2016

coveralls commented May 27, 2016 •

edited

Loading

simonbyrne commented May 27, 2016

mschauer May 27, 2016

pabloferz commented May 27, 2016

simonbyrne commented May 27, 2016

coveralls commented May 27, 2016 •

edited

Loading

hayd May 27, 2016

pabloferz May 27, 2016

hayd May 27, 2016

pabloferz May 30, 2016

mschauer commented Jun 1, 2016

coveralls commented Jun 1, 2016 •

edited

Loading

simonbyrne commented Jun 7, 2016

simonbyrne commented Jun 7, 2016

Improve isprime performance for 32 bit integers #9

Improve isprime performance for 32 bit integers #9

Conversation

pabloferz commented May 27, 2016

coveralls commented May 27, 2016 • edited Loading

simonbyrne commented May 27, 2016

mschauer May 27, 2016

Choose a reason for hiding this comment

pabloferz commented May 27, 2016

simonbyrne commented May 27, 2016

coveralls commented May 27, 2016 • edited Loading

hayd May 27, 2016

Choose a reason for hiding this comment

pabloferz May 27, 2016

Choose a reason for hiding this comment

hayd May 27, 2016

Choose a reason for hiding this comment

pabloferz May 30, 2016

Choose a reason for hiding this comment

mschauer commented Jun 1, 2016

coveralls commented Jun 1, 2016 • edited Loading

simonbyrne commented Jun 7, 2016

simonbyrne commented Jun 7, 2016

coveralls commented May 27, 2016 •

edited

Loading

coveralls commented May 27, 2016 •

edited

Loading

coveralls commented Jun 1, 2016 •

edited

Loading