Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

next_prime for lib/prime.rb #1272

Closed
wants to merge 1 commit into from
Closed

Conversation

dankogai
Copy link

Rationale

To me, integer-without-limit is one of the greatest features of Ruby. I am currently working on my own implementation of arbitrary precision number system (https://github.com/dankogai/swift-pons) and ruby has been my sensei.

That is why I am disappointed to find prime.rb is pretty darn impractical, even for such "small" numbers below the 64-bit boundary. Consider this:

ruby -rprime -e 'p (2**61-1).prime?' # hello, anybody home?

M61 is well below an ordinary, fixed, 64-bit integer can hold.

This patch makes prime.rb a little more practical by:

  • making .prime? base upon Miller-Rabin primarity test
  • adding .next_prime and .prev_prime which returns the (next|previous) prime.
  • adding Prime::NextPrimeGenerator which makes use of .next_prime.

vs. OpenSSL::BN

Like current prime.rb, this patch is by no means to replace OpenSSL::BN.prime?. For very large numbers OpenSSL::BN is still faster. But for numbers below 32-bit limit this patch is faster. And for numbers between 32-bit limit and 64-bit limit, its performance is okay.

# coding: utf-8
require 'benchmark'
require 'prime'
require 'openssl'

count = 10000

[
  2147483647,                # int32 max == M31
  1<<61 - 1,                 # M61
  2147483647*2147483629,     # M31 * M31.prev_prime
  18446744073709551427,      # the largest prime uint64_t can handle
  318665857834031151167461,  # A014233_11
  # found at:
  # https://rosettacode.org/wiki/Miller–Rabin_primality_test
  4547337172376300111955330758342147474062293202868155909489, # prime
  4547337172376300111955330758342147474062293202868155909393  # composite
].each do |n|
  primerbsays   = n.prime?
  opensslbnsays = OpenSSL::BN.new(n.to_s).prime?
  puts "#{n}.prime? => #{primerbsays}"
  puts "OpenSSL::BN.new(#{n}.to_s).prime? => #{opensslbnsays}"
  puts "Do they agree? #{primerbsays == opensslbnsays}"
  Benchmark.bm do |x|
    x.report("OpenSSL::BN") {
      count.times { OpenSSL::BN.new(n.to_s).prime? }
    }
    x.report("prime.rb") {
      count.times { n.prime? }
    }
  end
end
2147483647.prime? => true
OpenSSL::BN.new(2147483647.to_s).prime? => true
Do they agree? true
       user     system      total        real
OpenSSL::BN  1.190000   0.010000   1.200000 (  1.216361)
prime.rb  0.180000   0.000000   0.180000 (  0.190299)
1152921504606846976.prime? => false
OpenSSL::BN.new(1152921504606846976.to_s).prime? => false
Do they agree? true
       user     system      total        real
OpenSSL::BN  0.010000   0.000000   0.010000 (  0.009166)
prime.rb  0.000000   0.000000   0.000000 (  0.000847)
4611685975477714963.prime? => false
OpenSSL::BN.new(4611685975477714963.to_s).prime? => false
Do they agree? true
       user     system      total        real
OpenSSL::BN  0.120000   0.010000   0.130000 (  0.123623)
prime.rb  0.300000   0.000000   0.300000 (  0.308675)
18446744073709551427.prime? => true
OpenSSL::BN.new(18446744073709551427.to_s).prime? => true
Do they agree? true
       user     system      total        real
OpenSSL::BN  1.980000   0.020000   2.000000 (  2.023935)
prime.rb  4.330000   0.020000   4.350000 (  4.390414)
318665857834031151167461.prime? => false
OpenSSL::BN.new(318665857834031151167461.to_s).prime? => false
Do they agree? true
       user     system      total        real
OpenSSL::BN  0.100000   0.010000   0.110000 (  0.105193)
prime.rb  4.320000   0.010000   4.330000 (  4.346966)
4547337172376300111955330758342147474062293202868155909489.prime? => true
OpenSSL::BN.new(4547337172376300111955330758342147474062293202868155909489.to_s).prime? => true
Do they agree? true
       user     system      total        real
OpenSSL::BN  1.790000   0.000000   1.790000 (  1.799558)
prime.rb 27.270000   0.260000  27.530000 ( 28.073170)
4547337172376300111955330758342147474062293202868155909393.prime? => false
OpenSSL::BN.new(4547337172376300111955330758342147474062293202868155909393.to_s).prime? => false
Do they agree? true
       user     system      total        real
OpenSSL::BN  0.170000   0.000000   0.170000 (  0.168822)
prime.rb  2.020000   0.010000   2.030000 (  2.054712)

Conclusion

IMHO the gap between prime.rb and OpenSSL::BN is now unacceptably too large. It was acceptable when native integers were only 32-bit wide. But it is 2016 and even your phone may be using 64-bit int.
.prime? should return instantly at least within 64-bit range.

Dan the Amateur Rubyist

@@ -132,7 +209,8 @@ def method_added(method) # :nodoc:
# Upper bound of prime numbers. The iterator stops after it
# yields all prime numbers p <= +ubound+.
#
def each(ubound = nil, generator = EratosthenesGenerator.new, &block)
#def each(ubound = nil, generator = EratosthenesGenerator.new, &block)
def each(ubound = nil, generator = NextPrimeGenerator.new, &block)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, interesting.

could I have benchmark for this part?

Copy link

@jzakiya jzakiya Jul 31, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The following changes makes the code shorter, more readable, and at lease twice as fast (for very large numbers), and is adapted from my primes-utils gem. See PRIMES-UTILS HANDBOOK for math and code details.

https://www.scribd.com/doc/266461408/Primes-Utils-Handbook

require 'openssl'

class Integer
  # Returns true if +self+ passes Miller-Rabin Test on +b+
  def miller_rabin_test(b)
    return self >= 2 if self <= 3 
    return false unless [1,5].include?(self % 6)
    n = d = self - 1
    d >>= 1 while d.even?
    y = b.to_bn.mod_exp(d, self)
    while d != n && y != n && y != 1
      y = y.to_bn.mod_exp(2, self)
      d <<= 1
    end
    return false if y != n && d.even?
    true
  end
  # Returns true if +self+ is a prime number, else returns false.
  def prime?
    return true if [2, 3, 5, 7].include? self
    return false unless self > 1 and 210.gcd(self) == 1
    Prime::A014233.each do |pair|
      ix, mx = pair
      return false unless miller_rabin_test(ix)
      break if self < mx
    end
    self == 41 ? true : miller_rabin_test(41)
  end
end

Here are some timing results on a 2.3 GHz Lenovo Intel I5 laptop with a 32-bit Linux distro.

2.3.1 :103 > def tm; s=Time.now; yield; Time.now-s end
 => :tm 
2.3.1 :104 > (10**1700 + 469).prime?
 => true 
2.3.1 :105 > (10**1700 + 699).prime?
 => true 
2.3.1 :106 > tm{ (10**1700 + 469).prime? }
 => 4.885875503 
2.3.1 :107 > tm{ (10**1700 + 699).prime? }
 => 4.808029537 

On a 3.5 GHz I7 System76 laptop with a 64-bit Linux distro the times for both were ~1.05secs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

jzakiya's miller_rabin_test method could use at least one tweak at the end, apply De Morgan's law.

  def miller_rabin_test(b)
    return self >= 2 if self <= 3 
    # I would avoid creating a temp Array to check only 2 cases
    return false unless [1,5].include?(self % 6)
    n = d = self - 1 # self - 1 is pred, perhaps less readable
    d >>= 1 while d.even?
    y = b.to_bn.mod_exp(d, self)
    while d != n && y != n && y != 1
      y = y.to_bn.mod_exp(2, self)
      d <<= 1
    end
    #return false if y != n && d.even?
    #true
    y == n || d.odd?
  end

Copy link

@jzakiya jzakiya Aug 1, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I like the simplification at the end.

I used the array because it's visually simpler to see what the test is, but it can be replaced with:

      return false unless [1, 5].include?(self % 6)
      return false unless 6.gcd(self) == 1

6 = 2*3 is the P3 Prime Generator (PG) modulus, so for any number > 3, the residue n mod 6 has to be either 1 or 5 (coprimes to 6), which is equivalent to directly testing if the residue of n mod 6 is coprime to 6. The math for all of this is explained in my Primes-Utils Handbook.

Actually you can eliminate the first 2 tests for small primes (since prime? tests for them too) if the method miller_rabin_test is strictly to be used in prime?, but since (as written) it's a public method of Integer it's probably best to keep them, as they don't hurt.

Copy link

@jzakiya jzakiya Aug 1, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code here modifies the original prime? implementation.
It first determines which is the smallest witness range for the number.
It then creates an array of all the witnesses for that range.
It then uses those witnesses to perform the miller_rabin_test on.

The method prime1? shows the form of the implementation.
Its performance is the same as the original prime? implementation.
However, now that the witnesses are found first we can do prime2?.
Using the parallel gem, it can now perform the miller-rabin tests in parallel,
with a significant speedup for very large numbers, and no degradation in
performance for small numbers.

Below are average typical times (in seconds) with Ruby 2.3.1.

2.3.1 > def tm; s = Time.now; yield; Time.now-s end
2.3.1 > n = 10**2500; tm{ (n + 11859).prime? }      # et al

system: 64-bit Linux
laptop: System76 Gazelle, 3.5GHz I7-6700HQ cpu, 4 cores, 8 threads

            (10**2500 + 11859)      (10**2500 + 13671)
prime?             3.16                    3.19        
prime1?            3.16                    3.19
prime2?            0.95                    0.96

system: 32-bit Linux
laptop: Lenovo V570, 2.3GHz I5-2410M cpu, 2 cores, 4 threads

            (10**2500 + 11859)      (10**2500 + 13671)
prime?             13.6                    13.6        
prime1?            13.6                    13.6
prime2?             7.5                     7.5

Below is the whole file.

require 'openssl'
require 'parallel'

class Integer
  # Returns true if +self+ passes Miller-Rabin Test on +b+
  def miller_rabin_test(b)             # b is a witness to test with
    return self >= 2 if self <= 3 
    return false unless 6.gcd(self) == 1
    n = d = self - 1
    d >>= 1 while d.even?
    y = b.to_bn.mod_exp(d, self)       # y = (b**d) mod self
    while d != n && y != n && y != 1
      y = y.to_bn.mod_exp(2, self)     # y = (y**2) mod self
      d <<= 1
    end
    y == n || d.odd?
  end

  # Returns true if +self+ is a prime number, else returns false.
  def prime?
    return true if [2, 3, 5, 7].include? self
    return false unless self > 1 and 210.gcd(self) == 1
    Prime::A014233.each do |pair|
      ix, mx = pair
      return false unless miller_rabin_test(ix)
      break if self < mx
    end
    self == 41 || miller_rabin_test(41)
  end

  def prime1?
    return true if [2, 3, 5, 7].include? self
    return false unless self > 1 and 210.gcd(self) == 1
    witnesses = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41]
    val = Prime::A014233.values.sort.detect {|v| v > self}
    witnesses.select! { |p| p <= Prime::A014233.key(val) } if val
    witnesses.each { |p| return false unless miller_rabin_test(p) }
    true
  end

  def prime2?    # perform miller-rabin tests in parallel
    return true if [2, 3, 5, 7].include? self
    return false unless self > 1 and 210.gcd(self) == 1
    witnesses = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41]
    val = Prime::A014233.values.sort.detect {|v| v > self}
    witnesses.select! { |p| p <= Prime::A014233.key(val) } if val
    Parallel.each(witnesses) { |p| return false unless miller_rabin_test(p) }
    true
  end
end

class Prime
  # https://oeis.org/A014233
  #
  # Smallest odd number for which Miller-Rabin primality test
  # on bases <= n-th prime does not reveal compositeness.
  #
  A014233 = {
    2  => 2_047,
    3  => 1_373_653,
    5  => 25_326_001,
    7  => 3_215_031_751,
    11 => 2_152_302_898_747,
    13 => 3_474_749_660_383,
    17 => 341_550_071_728_321,
    19 => 341_550_071_728_321,
    23 => 3_825_123_056_546_413_051,
    29 => 3_825_123_056_546_413_051,
    31 => 3_825_123_056_546_413_051,
    37 => 318_665_857_834_031_151_167_461
  }
end

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not that familiar with prime numbers, so I can only hope to help with the implementation.

Your default prime? could be simplified, but wouldn't make such a great impact.

def prime?
  return true if [2, 3, 5, 7].include? self
  return false unless self > 1 and 210.gcd(self % 210) == 1
  Prime::A014233.each do |ix, mx| # Remove pair, use (ix, mx) directly
    return false unless miller_rabin_test(ix)
    break if self < mx
  end
  self == 41 || miller_rabin_test(41) # Simplify return value
end

With your prime1? and prime2? there is more room to make a significant change.
There is a typo in prime2, comma instead of dot for val = Prime::A014233.values.sort,detect {|v| v > self}.

def prime1?
  return true if [2, 3, 5, 7].include? self
  return false unless self > 1 and 210.gcd(self % 210) == 1
  witnesses = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41]
  # Isn't Prime::A014233.values already returning in the defined order in recent Ruby versions?
  # Otherwise Prime::A014233.values.sort! could be stored in a constant
  # array.sort! can be used here
  # Some values are tested twice as they repeat in Prime::A014233,
  # perhaps a constant could help here
  val = Prime::A014233.values.sort!.detect {|v| v > self}
  # Prime::A014233.key(val) is a constant inside the iterator, calculate it once outside
  if val
    # Since hash.key is used we could flip key/values to take more advantage of the Hash,
    # but then keys would collide, are repeated values needed?
    # Prime::A014233.key(3_825_123_056_546_413_051) should return one of [23,29,31] or 23 is enough for example?
    val = Prime::A014233.key(val)
    witnesses.select! { |p| p <= val }
  end
  # Simplify return value for the serial version, return false if any miller rabin test fails, otherwise true
  witnesses.all? { |p| miller_rabin_test(p) }
end

Edit
After some time I noticed that val and its respective key are used.
If hash.each always iterates the key value pairs in the same order they were defined (1.9+ behavior, not sure if this will hold for future versions) I believe the following code is equivalent.
Otherwise hash.sort_by could be used to force a sort by value.

def prime1?
  return true if [2, 3, 5, 7].include? self
  return false unless self > 1 and 210.gcd(self % 210) == 1
  witnesses = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41]
  # Prime::A014233.sort_by {|k,v| v}.each do |k,v|
  Prime::A014233.each do |k,v|
    if v > self
      witnesses.select! { |p| p <= k }
      break
    end
  end
  witnesses.all? { |p| miller_rabin_test(p) }
end

@jzakiya
Copy link

jzakiya commented Aug 2, 2016

When I was in bed I also realized I could do self == 41 || miller_rabin_test(41),
so yeh, good catch.

Actually, your suggestions for prime1? undoes the reasons for doing it that way.

First, my initial modifications of miller_rabin_test and prime? was to just
improve the coding implementation of the pull requester's design. I followed what
he proposed but just coded the same functionality faster and more efficiently.

With prime1? I wanted to redesign the implementation to a more functional programming
(FP) style, with the deliberate goal to allow it to be performed efficiently in parallel.
The ultimate purpose for doing this is to create a design that can perform optimally.

So I separated out the distinct functional steps so they can be performed independently.
Finding the appropriate range for the input number is a separate process.
Then finding the witnesses for the range is a separate process.
Finally, performing the miller-rabin tests with the proper witnesses is a separate process.

The goal of prime1? was to get to prime2?. The miller-rabin are independent tests,
and don't share any memory, so they can inherently be done in parallel. So the only change
to prime1? to allow the miller-rabin tests be done in parallel is to use the
Parallel.each(witnesses).. method from the parallel gem. The timing results tell all.
When CRuby (3.0 ?) supports parallelization, this can be replaced by its native method(s).

Ah, but we can do better.

The original design has some unnecessary dependencies we can eliminate to make the code
easier to update and mathematically modify.

The design uses a specific set of ranges/witnesses to provide deterministic results over those
ranges. See such as https://en.wikipedia.org/wiki/Miller%E2%80%93Rabin_primality_test for the
mathematical details.

The way the Prime::A014233 hash is constructed, each larger range is dependent on all the
smaller ranges keys to determine its witnesses. This forces us to find (code) a way to
determine the correct set of witnesses for each range. For this particular design, it's
pretty straight forward to do this. But what would have to happen if the mathemeticians
find a better set of ranges/witnesses that don't follow this particular scheme?

The fix is to create the hash so that the keys are the ranges whose values are arrays of
all their witnesses. Now, the ranges are independent, and new ranges can be added/modified
without worry of affecting another range. Prime::A014233NEW below shows this. Now, finding
the appropriate range automatically provides the appropriate witnesses.

In the code for prime3? one line now finds the appropriate witnesses for a range.
wits = Prime::A014233NEW.sort.detect {|range, wits| range > self}.

Let's break this down.

wits = Prime::A014233NEW.sort.              # make sure for old/new Rubys' the keys are sorted
       detect {|range, wits| range > self}` # find first [range, [wits]] pair where range > self
                                            # if no range > self then wits =  nil

Then we do wits && wits[1] || witnesses to determine the array of witnesses for testing with.
If wits is nil then wits && wits is false and we use the default witnesses array.
Otherwise wits && wits[1] returns wits[1] to be the witnesses array for the identified range.

In prime3? every line is an independent functional entity, and new range => [wits] pairs can be
added to Prime::A014233NEW independently (without regard to their position in the hash).

But I can even make prime3? a little bit more (mathematically) efficient, as shown in prime4?.

Since I'm creating an array of consecutive witnesses primes, I might as well use it to check
for small primes, instead of creating a smaller array of small primes. And I also might as
well use this modulus for these Prime Generator primes as my non-primality test.

So m = witnesses.reduce(:*) is 304,250,263,527,210,
and the non-primality test m.gcd(self) == 1 will eliminate 85.5% of all integers
from consideration to being prime, instead of the 77% eliminated by 210.gcd(self) == 1.
(see Primes-Utils Handbook for details).

Thus, I think prime4? represents the most mathematically efficient/optimally performant
design/code for doing this.

class Integer
  ...
  ...

  def prime3?    # perform miller-rabin tests in parallel
    return true if [2, 3, 5, 7].include? self
    return false unless self > 1 and 210.gcd(self) == 1
    witnesses = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41]
    wits = Prime::A014233NEW.sort.detect {|range, wits| range > self} # [range, [wits]] or nil
    Parallel.each(wits && wits[1] || witnesses) { |p| return false unless miller_rabin_test(p) }
    true
  end

  def prime4?    # perform miller-rabin tests in parallel
    witnesses = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41]
    return true if witnesses.include? self
    m = witnesses.reduce(:*)                           # modulus for P41 Prime Generator
    return false unless self > 1 and m.gcd(self) == 1  # non-primality test for P41
    wits = Prime::A014233NEW.sort.detect {|range, wits| range > self} # [range, [wit_prms]] or nil
    Parallel.each(wits && wits[1] || witnesses) { |p| return false unless miller_rabin_test(p) }
    true
  end
end

class Prime
  # https://oeis.org/A014233
  #
  # Smallest odd number for which Miller-Rabin primality test
  # on bases <= n-th prime does not reveal compositeness.
  #
  A014233 = {
    2  => 2_047,
    3  => 1_373_653,
    5  => 25_326_001,
    7  => 3_215_031_751,
    11 => 2_152_302_898_747,
    13 => 3_474_749_660_383,
    17 => 341_550_071_728_321,
    19 => 341_550_071_728_321,
    23 => 3_825_123_056_546_413_051,
    29 => 3_825_123_056_546_413_051,
    31 => 3_825_123_056_546_413_051,
    37 => 318_665_857_834_031_151_167_461
  }

  A014233NEW = {
    2_047 => [2],
    1_373_653 => [2, 3],
    25_326_001 => [2, 3, 5],
    3_215_031_751 => [2, 3, 5, 7],
    2_152_302_898_747 => [2, 3, 5, 7, 11],
    3_474_749_660_383 => [2, 3, 5, 7, 11, 13],
    341_550_071_728_321 => [2, 3, 5, 7, 11, 13, 17],
    3_825_123_056_546_413_051 => [2, 3, 5, 7, 11, 13, 17, 19, 23],
    318_665_857_834_031_151_167_461 => [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37]
  }
end

@Maumagnaguagno
Copy link
Contributor

I am glad that I could help in minor details and eventually motivated this discussion.
And you really flipped the key values, I thought this was just a crazy idea but wrote anyway.
Now I understand that your goal is to fully parallelize the process.
I would still use sort! instead of sort to avoid GC calls, but that would make no difference in a small benchmark.
It seems like you achieved the best solution, have you seen the tests?
5 tests were failing in the original request, 4 of them because 41 was missing and 1 because of timeout.

@jzakiya
Copy link

jzakiya commented Aug 10, 2016

Been busy, couldn't respond immediately.

Regarding test passing, all my versions of primex? pass the tests I run to
verify their accuracy. As you noted, the original pull request code gave a
false negative to 41 because of the architecural structure, as I elaborated on.

But the crux of the pull request is to improve the performance of the prime.rb
std lib methods. That goal, and also to provide more enhanced methods, is why
I create the primes-utils gem.

Fortunately, the current method for prime? in prime.rb uses the math from
primes-utils (see https://www.scribd.com/doc/266461408/Primes-Utils-Handbook)
making it significatntly faster than its previous implementation, i.e. before its
inclusion in Ruby 2.3.0 (released Christmas 2015, 2015/12/25).

However, this version, which uses the P3 Strictly-Prime Prime Generator (SP PG),
can be made much faster by using the next higher SP PG, i.e. P5, as with primep5?.
This allows it to be used with bigger numbers with much better performance.

class Integer

  def prime?     # version now used in prime.rb since MRI 2.3.0
    return self >= 2 if self <= 3
    return false if self % 2 == 0 or self % 3 == 0
    # return false unless 6.gcd(self) == 1              # simplification
    (5..(self**0.5).floor).step(6).each do |i|
      if self % i == 0 || self % (i + 2) == 0
        return false
      end
    end
    true
  end

  def primep5?   # Uses P5 Strictly-Prime PG
    return false unless self > 1 and 30.gcd(self) == 1 or [2,3,5].include? self
    (7..Math.sqrt(self).to_i).step(30) do |p|
      return false if 
        self%(p)    == 0 or self%(p+4)  == 0 or self%(p+6)  == 0 or self%(p+10) == 0 or
        self%(p+12) == 0 or self%(p+16) == 0 or self%(p+22) == 0 or self%(p+24) == 0
    end
    true
  end

end

The P7 SP PG is even faster (for increasing numbers) but the code is not as compact.
In fact, I just submitted using primep5? to the Ruby issues tracker as an easy/simple
improvement to prime? in prime.rb. https://bugs.ruby-lang.org/issues/12665

Below are some timing comparisions for 3 progressively larger primes for 2.3.1.
System: System76 3.5GHz I7 cpu laptop, Linux 64-bit OS

n1 =   100_000_000_000_000_003
n2 =   200_000_000_000_000_003
n3 = 1_000_000_000_000_000_003

               n1         n2         n3
prime?        4.1        5.7        12.9
primep5?      2.5        3.6         7.9

def tm; s = Time.now; yield; Time.now - s end

irb(main):028:0> n = 100_000_000_000_000_003; tm{ n.prime? }
=> 4.127392644
irb(main):029:0> n = 100_000_000_000_000_003; tm{ n.primep5? }
=> 2.539485672
irb(main):030:0> n = 200_000_000_000_000_003; tm{ n.prime? }
=> 5.721895509
irb(main):031:0> n = 200_000_000_000_000_003; tm{ n.primep5? }
=> 3.56925564
irb(main):032:0> n = 1_000_000_000_000_000_003; tm{ n.prime? }
=> 12.940908142
irb(main):033:0> n = 1_000_000_000_000_000_003; tm{ n.primep5? }
=> 7.920408959

So, just making this minor change to prime? creates major performance benefits,
while using a mathematical technique that is totally deterministic.

This pull request uses the Miller-Rabin (MR) test for prime?, with the minimum set of
prime witnesses to supposedly make the MR tests deterministic over specific number ranges.
However, for 'small' numbers, I see the code producing random false negatives results.

Repeatedly run this test below (for small numbers) with prime? using MR, and then with prime?
from prime.rb. The answer to s.size should be 78498.

2.3.1 :029 > s = []; 1000000.times{|i| s << i if i.prime?}
 => 1000000 
2.3.1 :030 > s.size    # should be 78498

From my testing, using SP PGs are faster than MR for 'small' numbers, and 100% accurate,
while MR shines for larger numbers (for speed). In primes-utils I provide and use a
randomized MR version that is probablistic. I can incorporate the code here for MR to make
it determinsitic over its (larger) deterministic ranges. I think the key for the devs to
consider is first accuracy, and then speed. Getting wrong results fast is not the goal.

This suggests that if you want the best of both worlds (100% accuracy and speed) in one
method combining the use of PGs and MR may be the way to go. Something to consider.

Below are additional simplifications to the methods next_prime and prev_prime in
the pull request code.

class Integer

def next_prime                           def next_prime
  return 2 if self < 2                     return 2 if self < 2
  n = self
  n += if n & 1 == 0 then 1 else 2 end     n = self + 1 | 1          # 1st odd number > self
  while !n.prime?                          n += 2 until n.prime?     # do until n is prime
    n += 2
  end
  return n                                 n
end                                      end

def prev_prime                           def prev_prime
  return nil if self <= 2                  return nil if self <= 2
  return   2 if self == 3                  return   2 if self == 3
  n = self
  n -= if n & 1 == 0 then 1 else 2 end     n = self - 2 | 1          # 1st odd number < self
  while !n.prime?                          n -= 2 until n.prime?     # do until n is prime
    n -= 2
  end
  return n                                 n
end                                      end

Here are all my modifications to the pull requester's code.
The code is now much shorter, more readable, and (hopefully) should pass every test.

require 'openssl'

class Integer

  # Returns true if +self+ passes Miller-Rabin Test on +b+
  def miller_rabin_test(b)             # b is a witness to test with
    return self >= 2 if self <= 3 
    return false unless 6.gcd(self) == 1
    n = d = self - 1
    d >>= 1 while d.even?
    y = b.to_bn.mod_exp(d, self)       # y = (b**d) mod self
    while d != n && y != n && y != 1
      y = y.to_bn.mod_exp(2, self)     # y = (y**2) mod self
      d <<= 1
    end
    y == n || d.odd?
  end

  # Returns true if +self+ is a prime number, else returns false.
  def prime?
    witnesses = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41]
    return true if witnesses.include? self
    return false unless self > 1 and witnesses.reduce(:*).gcd(self) == 1
    wits = Prime::A014233NEW.sort.detect {|range, wits| range > self} # [range, [wit_prms]] or nil
    witnesses = wits && wits[1] || witnesses
    witnesses.each { |p| return false unless miller_rabin_test(p) }
    true
  end

  # Returns the smallest prime number which is greater than +self+
  def next_prime
    return 2 if self < 2
    n = self + 1 | 1          # first odd number > self
    n += 2 until n.prime?     # find first prime > self
    n
  end

  # Returns the largest prime number which is smaller than +self+
  # or +nil+ if +self+ <= 2
  def prev_prime
    return nil if self <= 2
    return   2 if self == 3
    n = self - 2 | 1          # first odd number < self
    n -= 2 until n.prime?     # find first prime < self
    n
  end

end

class Prime

  # https://oeis.org/A014233
  #
  # Smallest odd number for which Miller-Rabin primality test
  # on bases <= n-th prime does not reveal compositeness.
  #
  A014233NEW = {
    2_047 => [2],
    1_373_653 => [2, 3],
    25_326_001 => [2, 3, 5],
    3_215_031_751 => [2, 3, 5, 7],
    2_152_302_898_747 => [2, 3, 5, 7, 11],
    3_474_749_660_383 => [2, 3, 5, 7, 11, 13],
    341_550_071_728_321 => [2, 3, 5, 7, 11, 13, 17],
    3_825_123_056_546_413_051 => [2, 3, 5, 7, 11, 13, 17, 19, 23],
    318_665_857_834_031_151_167_461 => [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37]
  }
end

Now you can do, very fast, and for very large numbers, things like:

> 11111111111118349843942143.prime?     => false
> 11111111111118349843942143.next_prime => 11111111111118349843942147
> 11111111111118349843942143.prev_prime => 11111111111118349843942109
> (2**64).prev_prime   => 18446744073709551557
> (2**128).preve_prime => 340282366920938463463374607431768211297

There are still code and performance improvements that can be made to the other
methods in prime.rb, especially prime_division, which I replaced with factors
in prime-utils. Actually, I would love to see the devs take everything from
primes-utils and incorporate its techniques into the std lib, which would make
Ruby much more useful, flexible, and performant, for all things prime.

@Matsuyanagi
Copy link

This page shows numbers up to 71.
http://mathworld.wolfram.com/StrongPseudoprime.html

ψ20 meaning [ 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71 ] base numbers.

Prime::A014233 can extend

  41 => 3_317_044_064_679_887_385_961_981,
  43 => 6_003_094_289_670_105_800_312_596_501,
  47 => 59_276_361_075_595_573_263_446_330_101,
  53 => 564_132_928_021_909_221_014_087_501_701,
  59 => 564_132_928_021_909_221_014_087_501_701,
  61 => 1_543_267_864_443_420_616_877_677_640_751_301,
  67 => 1_543_267_864_443_420_616_877_677_640_751_301,
  71 => 10 ** 36,

@k0kubun k0kubun changed the base branch from trunk to master August 15, 2019 17:58
@k0kubun
Copy link
Member

k0kubun commented Aug 17, 2019

It seems to have a conflict now. Could you rebase this from master?

@k0kubun
Copy link
Member

k0kubun commented Aug 19, 2019

Let me close this as it has not been updated for a while. Please reopen this after resolving conflicts. Thanks.

@k0kubun k0kubun closed this Aug 19, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
6 participants