Skip to content
Ruby
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Failed to load latest commit information.
lib
test
.gitignore
.travis.yml
Gemfile
LICENSE
README.md
Rakefile
vss.gemspec

README.md

VSS – Vector Space Search  Build Status

A simple vector space search engine with tf*idf ranking.

More info, and details of how it works.

Installation

Just install the gem:

gem install vss

Or add to your Gemfile, if you're using Bundler:

gem 'vss'

Usage

To perform a search on a collection of documents:

require "vss"
docs = ["hello", "goodbye", "hello and goodbye", "hello, hello!"]
engine = VSS::Engine.new(docs)
engine.search("hello") #=> ["hello", "hello, hello!", "hello and goodbye"]

Rails/ActiveRecord

If you want to search a collection of ActiveRecord objects, you need to pass a documentizer Proc when initializing VSS::Engine which will convert the objects into documents (which are simply strings). For example:

class Page < ActiveRecord::Base
    #attrs: title, content
end

docs = Page.all
documentizer = lambda { |record| record.title + " " + record.content }
engine = VSS::Engine.new(docs, documentizer)

Notes

This isn't designed to be used on huge collections of records. The original use case was for ranking a smallish set of ActiveRecord results obtained via a query (using SearchLogic). So, essentially, the search consisted of 2 stages; getting the corpus via a SQL query, then doing the VSS on that.

Ruby

Tested with the following Ruby versions:

  • MRI 1.9.2
  • MRI 1.8.7

Probably works on JRuby ~> 1.6 too, but not actively tested.

Credits

Heavily inspired by Joesph Wilk's article on building a vector space search engine in Python.

Written by Mark Dodwell (@madeofcode)

Bitdeli Badge

Something went wrong with that request. Please try again.