boyer-moore string search algorithm in ruby (supports tokens and regexps)

boyermoore

The Boyer-Moore string matching algorithm.

Ported to Ruby from Wikipedia's C code, but geared toward searching sequences of tokens rather than only characters.

Usage:

BoyerMoore.search(haystack, needle)   # returns index of needle or nil

Examples:

Basic search in string:

BoyerMoore.search("ANPANMAN", "ANP")   # => 0
BoyerMoore.search("ANPANMAN", "ANPXX") # => nil 
BoyerMoore.search("foobar", "bar")     # => 3

You can also search an array of tokens:

BoyerMoore.search(["<b>", "hi", "</b>"], ["hi"])         # => 1 
BoyerMoore.search(["bam", "foo", "bar"], ["foo", "bar"]) # => 1 
BoyerMoore.search(["bam", "bar", "baz"], ["foo"])        # => nil 
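To illustrate how a skip-table search generalizes from characters to tokens, here is a minimal sketch. It is not this library's code: it uses the simpler Boyer-Moore-Horspool variant (bad-character rule only), and the name `horspool_token_search` is hypothetical. The shift table is a Hash, so any object that works as a hash key and supports `==` can serve as a token.

```ruby
# Hypothetical helper, NOT part of the boyermoore library.
# Simplified Boyer-Moore-Horspool search (bad-character rule only)
# over a String or an Array of tokens.
def horspool_token_search(haystack, needle)
  # Treat a String as an Array of single-character tokens.
  haystack = haystack.chars if haystack.is_a?(String)
  needle   = needle.chars   if needle.is_a?(String)

  m = needle.length
  n = haystack.length
  return 0 if m.zero?
  return nil if m > n

  # For every needle token except the last, record the distance from
  # its last occurrence to the end of the needle. Tokens that do not
  # appear in the table shift by the full needle length.
  shift = Hash.new(m)
  needle[0...-1].each_with_index { |token, i| shift[token] = m - 1 - i }

  pos = 0
  while pos <= n - m
    # Compare right to left within the current window.
    j = m - 1
    j -= 1 while j >= 0 && haystack[pos + j] == needle[j]
    return pos if j < 0

    # Shift according to the token aligned with the needle's last slot.
    pos += shift[haystack[pos + m - 1]]
  end
  nil
end

horspool_token_search("foobar", "bar")                        # => 3
horspool_token_search(["bam", "foo", "bar"], ["foo", "bar"])  # => 1
horspool_token_search(["bam", "bar", "baz"], ["foo"])         # => nil
```

This sketch compares tokens with plain `==`, so it does not handle regular-expression tokens.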

A token can be a regular expression:

BoyerMoore.search(["Sing", "99", "Luftballon"], [/\d+/]) # => 1
BoyerMoore.search(["Nate Murray", "5 Pine Street", "Los Angeles", "CA", "90210"], [/^\w{2}$/, /^\d{5}$/]) # => 3

Notes

The regular-expression token matching is a bit of a hack and will be fairly slow: whenever a token is not found in the hash directly, it is compared against every regular-expression key. You probably shouldn't use the regular-expression token search for anything more than a toy.
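The cost described above can be pictured with a small sketch. This is an assumption about the general shape of such a lookup, not the library's actual code; `lookup_shift` and the sample table are hypothetical. An exact key is found with one hash lookup, while a miss falls back to a linear scan over all Regexp keys.

```ruby
# Hypothetical illustration, NOT the boyermoore library's actual code.
def lookup_shift(table, token, default)
  # Fast path: an exact key match is a single hash lookup.
  return table[token] if table.key?(token)

  # Slow path: on a miss, test the token against every Regexp key.
  table.each do |key, value|
    return value if key.is_a?(Regexp) && token.is_a?(String) && key.match?(token)
  end

  # No literal or Regexp key matched.
  default
end

table = { "foo" => 2, /\A\d+\z/ => 1 }

lookup_shift(table, "foo", 3) # => 2 (exact hit)
lookup_shift(table, "99", 3)  # => 1 (miss, then matched by the Regexp key)
lookup_shift(table, "bar", 3) # => 3 (miss, every Regexp key tried, default used)
```

With k regular-expression keys, each miss costs up to k regex matches, which is why haystacks with many non-matching tokens make this approach slow.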

Credits
