VSS – Vector Space Search

A simple vector space search engine with tf*idf ranking.

More info, and details of how it works.
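
As a rough illustration of what tf*idf ranking in a vector space involves, here is a minimal Ruby sketch. It is not the gem's source: the method names (`tokenize`, `inverse_document_frequencies`, `tfidf_vector`, `cosine`, `rank`), the plain `log(N / df)` weighting, and the tie-breaking by input order are all assumptions made for the example, and stemming is omitted.

```ruby
# Illustrative tf*idf + cosine-similarity ranking (not the gem's internals).

# Split text into lowercase word tokens; a real engine might also stem them.
def tokenize(text)
  text.downcase.scan(/[a-z0-9]+/)
end

# Inverse document frequency for every term in the corpus: log(N / df).
def inverse_document_frequencies(token_lists)
  doc_freq = Hash.new(0)
  token_lists.each { |tokens| tokens.uniq.each { |term| doc_freq[term] += 1 } }
  total = token_lists.size.to_f
  doc_freq.each_with_object({}) { |(term, df), idf| idf[term] = Math.log(total / df) }
end

# Sparse tf*idf vector (term => weight) for one token list.
def tfidf_vector(tokens, idf)
  counts = Hash.new(0)
  tokens.each { |term| counts[term] += 1 }
  counts.each_with_object({}) do |(term, count), vector|
    vector[term] = (count / tokens.size.to_f) * idf.fetch(term, 0.0)
  end
end

# Cosine of the angle between two sparse vectors; 0.0 if either is all zeros.
def cosine(a, b)
  dot = a.sum { |term, weight| weight * b.fetch(term, 0.0) }
  norm_a = Math.sqrt(a.values.sum { |w| w * w })
  norm_b = Math.sqrt(b.values.sum { |w| w * w })
  return 0.0 if norm_a.zero? || norm_b.zero?
  dot / (norm_a * norm_b)
end

# Rank docs against a query, best match first, dropping zero-score docs.
# Ties are broken by original position so the order is deterministic.
def rank(docs, query)
  token_lists = docs.map { |doc| tokenize(doc) }
  idf = inverse_document_frequencies(token_lists)
  query_vector = tfidf_vector(tokenize(query), idf)
  scored = docs.each_with_index.map do |doc, index|
    [doc, cosine(query_vector, tfidf_vector(token_lists[index], idf)), index]
  end
  scored.select { |_, score, _| score > 0 }
        .sort_by { |_, score, index| [-score, index] }
        .map(&:first)
end
```

Each document and the query become a vector of term weights, and documents are ordered by the cosine of the angle between their vector and the query's. Terms that appear in every document get an idf of zero under this weighting, so they contribute nothing to the score.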

Installation

Just install the gem:

gem install vss

Or add to your Gemfile, if you're using Bundler:

gem 'vss'

Usage

To perform a search on a collection of documents:

require "vss"
docs = ["hello", "goodbye", "hello and goodbye", "hello, hello!"]
engine = VSS::Engine.new(docs)
engine.search("hello") #=> ["hello", "hello, hello!", "hello and goodbye"]

Rails/ActiveRecord

To search a collection of ActiveRecord objects, pass a documentizer Proc when initializing VSS::Engine. The Proc converts each object into a document, which is simply a string. For example:

class Page < ActiveRecord::Base
  # attrs: title, content
end

docs = Page.all
documentizer = lambda { |record| record.title + " " + record.content }
engine = VSS::Engine.new(docs, documentizer)
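
One caveat with a documentizer built on `String#+`: it raises a `TypeError` if an attribute such as `content` is `nil`. The sketch below shows a nil-tolerant variant using string interpolation. The `page_class` Struct is a hypothetical stand-in for the ActiveRecord model so the example runs without Rails, and the sample records are invented for illustration.

```ruby
# Hypothetical stand-in for the Page model (attrs: title, content).
page_class = Struct.new(:title, :content)

pages = [
  page_class.new("Hello", "hello and goodbye"),
  page_class.new("Untitled", nil) # a record with an empty content column
]

# Interpolation calls to_s on each attribute, so nil becomes "".
# strip removes the stray space left behind when an attribute is empty.
documentizer = lambda { |record| "#{record.title} #{record.content}".strip }

# The documentizer is an ordinary callable, so it can be checked in isolation
# before handing it to the engine.
documents = pages.map { |page| documentizer.call(page) }
```

Here `documents` is `["Hello hello and goodbye", "Untitled"]`; the same lambda can then be passed as the second argument to `VSS::Engine.new`, as in the example above.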

Notes

This isn't designed for huge collections of records. The original use case was ranking a smallish set of ActiveRecord results obtained via a query (using SearchLogic). So the search had two stages: fetching the corpus with a SQL query, then running VSS on that corpus.
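
The two-stage pattern can be sketched in plain Ruby. Everything here is hypothetical: the records are invented in-memory hashes, the `select` call stands in for the SQL query, and the `score` lambda is a deliberately crude placeholder (share of words equal to the query term) marking the spot where the VSS ranking would run.

```ruby
# Hypothetical in-memory records standing in for database rows.
records = [
  { title: "Hello world", content: "hello hello",       published: true },
  { title: "Draft",       content: "hello",             published: false },
  { title: "Farewell",    content: "goodbye",           published: true },
  { title: "Greetings",   content: "hello and goodbye", published: true }
]

# Stage 1: narrow the corpus. In the setup described above this is a SQL
# query; here a plain select plays that role.
corpus = records.select { |record| record[:published] }

# Stage 2: rank only the narrowed corpus. Placeholder scorer: the fraction
# of a record's words that equal the query term.
score = lambda do |record, query|
  words = "#{record[:title]} #{record[:content]}".downcase.scan(/[a-z0-9]+/)
  words.count(query.downcase) / words.size.to_f
end

query = "hello"
ranked = corpus
  .map { |record| [record, score.call(record, query)] }
  .select { |_, s| s > 0 }
  .sort_by { |record, s| [-s, record[:title]] }
  .map { |record, _| record[:title] }
```

Because the ranking step only ever sees the filtered corpus, its cost scales with the size of the query result rather than the size of the table.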

Ruby

Tested with the following Ruby versions:

  • MRI 1.9.2
  • MRI 1.8.7

Probably works on JRuby ~> 1.6 too, but not actively tested.

Credits

Heavily inspired by Joseph Wilk's article on building a vector space search engine in Python.

Written by Mark Dodwell (@madeofcode)
