Binary Search for Ruby’s Arrays

One incredibly handy algorithm that is missing from Ruby’s Array class is the binary search. If we know for absolute certain that the array we’re working with is sorted you can use a binary search to search through the array much much more quickly than a linear search, which would be performed with index or detect/find.

Usage

There are two methods defined by this gem. binary_search and binary_index. There are two versions of both of those methods. You can use the native version by requiring ‘binary_search/native’ or use the pure Ruby version with ‘binary_search/pure’.

  
    require 'binary_search/native'

    x = [5,1,6,7,2,6,4,2,6,1,6,1,1,8,3,5,2].sort
    puts x.binary_index(5)
    #=> 10

    target = 4
    y = [[1,:a], [2,:b], [3,:c], [4,:d]]
    puts y.binary_search { |v| target <=> v[0] }
    #=> [4,:d]

So the method ‘binary_index’ does the same thing that ‘index’ does: returns the index of a matching element. It should be noted that ‘index’ returns the first instance of a matching element. ‘binary_index’ is not guaranteed to return the first. It should also be noted, again, that this will only work if the array is sorted correctly. If it’s not weird crap will happen.

‘binary_search’ is similar to ‘find’ or ‘detect’ from Ruby’s normal arsenal. The difference is that rather than the block needing to return a boolean, it needs to return the usual output from the ‘<=>’ operator. (1 for >, -1 for <, and 0 for ==). This is obviously because we need to know whether the element being evaluated is greater than, less than, or equal to the value we’re actually looking for.

Benchmarks

Need proof? Howsabout some benchmarks:

== Benchmark Ruby's builtin :index method vs. a pure Ruby binary search method

Benchmark for 2000000 iterations searching through an array of 5 elements.
                  user     system      total        real
Index:        1.320000   0.010000   1.330000 (  1.320691)
Ruby BI:      8.700000   0.010000   8.710000 (  8.774825)

Benchmark for 1000000 iterations searching through an array of 10 elements.
                  user     system      total        real
Index:        0.990000   0.000000   0.990000 (  0.996690)
Ruby BI:      6.000000   0.010000   6.010000 (  6.043375)

Benchmark for 1000000 iterations searching through an array of 100 elements.
                  user     system      total        real
Index:        6.050000   0.020000   6.070000 (  6.091887)
Ruby BI:     10.250000   0.010000  10.260000 ( 10.318961)

Benchmark for 100000 iterations searching through an array of 1000 elements.
                  user     system      total        real
Index:        6.120000   0.010000   6.130000 (  6.155703)
Ruby BI:      1.480000   0.010000   1.490000 (  1.493227)

Benchmark for 10000 iterations searching through an array of 10000 elements.
                  user     system      total        real
Index:        5.880000   0.010000   5.890000 (  5.916098)
Ruby BI:      0.200000   0.000000   0.200000 (  0.206104)

Benchmark for 1000 iterations searching through an array of 100000 elements.
                  user     system      total        real
Index:        5.950000   0.010000   5.960000 (  6.005916)
Ruby BI:      0.030000   0.000000   0.030000 (  0.027823)



== Benchmark Ruby's builtin :index method vs. a native binary search method

Benchmark for 2000000 iterations searching through an array of 5 elements.
                  user     system      total        real
Index:        1.360000   0.000000   1.360000 (  1.369248)
Native BI:    1.140000   0.010000   1.150000 (  1.144739)

Benchmark for 1000000 iterations searching through an array of 10 elements.
                  user     system      total        real
Index:        0.960000   0.000000   0.960000 (  0.971568)
Native BI:    0.630000   0.000000   0.630000 (  0.637069)

Benchmark for 1000000 iterations searching through an array of 100 elements.
                  user     system      total        real
Index:        6.150000   0.010000   6.160000 (  6.192804)
Native BI:    0.810000   0.000000   0.810000 (  0.816337)

Benchmark for 100000 iterations searching through an array of 1000 elements.
                  user     system      total        real
Index:        6.170000   0.020000   6.190000 (  6.216637)
Native BI:    0.110000   0.000000   0.110000 (  0.111025)

Benchmark for 10000 iterations searching through an array of 10000 elements.
                  user     system      total        real
Index:        5.980000   0.010000   5.990000 (  6.033161)
Native BI:    0.010000   0.000000   0.010000 (  0.013183)

Benchmark for 1000 iterations searching through an array of 100000 elements.
                  user     system      total        real
Index:        5.920000   0.020000   5.940000 (  5.972206)
Native BI:    0.000000   0.000000   0.000000 (  0.001602)

So, your array must be fairly large (between 100 and 1000 elements) for the Ruby version of binary_index to be faster than Ruby’s builtin index method. However, even for arrays as small as 5 elements, the native version of the binary_index method is faster than Ruby’s index. However, for very large sized Arrays, both the the pure and the native version are much much much faster than the builtin method.

Name		Name	Last commit message	Last commit date
Latest commit History 25 Commits
ext		ext
lib		lib
test		test
.gitignore		.gitignore
LICENSE		LICENSE
README.textile		README.textile
Rakefile		Rakefile
VERSION.yml		VERSION.yml
benchmark.rb		benchmark.rb
binary_search.gemspec		binary_search.gemspec

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Binary Search for Ruby’s Arrays

Usage

Benchmarks

About

Uh oh!

Releases

Packages

Uh oh!

Contributors 3

Uh oh!

Languages

License

tyler/binary_search

Folders and files

Latest commit

History

Repository files navigation

Binary Search for Ruby’s Arrays

Usage

Benchmarks

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors 3

Uh oh!

Languages

Packages