Match things based on string similarity (using the Pair Distance algorithm) and regular expressions.
Ruby
Pull request Compare This branch is 85 commits behind seamusabshere:master.
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
examples
lib
test
.document
.gitignore
LICENSE
README.rdoc
Rakefile
VERSION
loose_tight_dictionary.gemspec

README.rdoc

loose_tight_dictionary

Match things based on string similarity (using the Pair Distance algorithm) and regular expressions.

Quickstart

>> d = LooseTightDictionary.new %w(seamus andy ben)
=> [...]
>> puts d.find 'Shamus Heaney'
=> 'seamus'

Try running the included example file:

$ ruby examples/first_name_matching.rb 
Left side (input)
====================
Mr. Seamus
Sr. Andy
Master BenT

Right side (output)
====================
seamus
andy
ben

Results
====================
Left record (input)           Right record (output)         Prefix used (if any)          Score                         
Mr. Seamus                    seamus                        NULL                          0.666666666666667             
Sr. Andy                      andy                          NULL                          0.5                           
Master BenT                   ben                           NULL                          0.2

Improving dictionaries

Similarity matching will only get you so far.

TODO: regex usage

Note on Patches/Pull Requests

  • Fork the project.

  • Make your feature addition or bug fix.

  • Add tests for it. This is important so I don't break it in a future version unintentionally.

  • Commit, do not mess with rakefile, version, or history. (if you want to have your own version, that is fine but bump version in a commit by itself I can ignore when I pull)

  • Send me a pull request. Bonus points for topic branches.

Copyright

Copyright © 2010 Seamus Abshere. See LICENSE for details.