Litesearch Guide

Mohammad A. Ali edited this page Oct 12, 2023 · 3 revisions

Litesearch is a simple, flexible and extremely fast full text search engine for Ruby applications.

This guide focuses on direct usage of Litesearch. If you are interested in ActiveRecord integration, check here; for Sequel integration, check here.

Requirements

Litesearch requires SQLite 3.43 or later, because it relies on critical functionality that was not available before that release. Please upgrade your SQLite library to 3.43 or later if you want to use Litesearch.
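The version requirement above can be checked before an index is created. The following is a minimal sketch; `sqlite_supported?` is a hypothetical helper, not part of Litesearch, and it only compares version strings.

```ruby
require 'rubygems' # provides Gem::Version (already loaded in most Ruby setups)

# Hypothetical helper, not part of Litesearch: returns true when the given
# SQLite version string meets the minimum release named in this guide.
def sqlite_supported?(version_string, minimum = '3.43')
  Gem::Version.new(version_string) >= Gem::Version.new(minimum)
end

puts sqlite_supported?('3.42.0')
puts sqlite_supported?('3.43.0')
```

The version string of the library actually in use can be obtained with the SQL function `sqlite_version()`, for example via `SELECT sqlite_version()`.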

Creating your index

A new index is defined on a Litedb instance, as follows:

require 'litestack/litedb'
db = Litedb.new(":memory:")
idx = db.search_index('email') do |schema|
  schema.fields [:sender, :receiver, :body] # all these have weight = 1
  schema.field :subject, weight: 10 # higher weight for the subject field
  schema.tokenizer :porter # options are :ascii, :unicode, :porter (default) and :trigram
end

Adding, updating and removing documents

# you can supply an id field, but if you don't, one will be created automatically for you
id = idx.add(sender: 'Kamal', receiver: 'Layla', subject: 'Miss you all', body: 'you and the girls')

To update a document, call add and supply the document id in the document hash, along with ALL the other fields (any missing fields will be nullified). Removing a document only needs the id.

# update document with id = 1
idx.add(id: 1, sender: 'Kamal', receiver: 'Layla', subject: 'Miss you all very much', body: 'you and the girls')
# remove document with id = 1
idx.remove(1)
# remove all documents
idx.clear!
# drop the index
idx.drop!
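Because an update must carry every field, one way to avoid nullifying fields by accident is to merge the changes into a full copy of the document before calling add. This is a minimal sketch: `full_document` is a hypothetical helper, and it assumes the application keeps its own copy of the document (this guide does not describe reading a document back from the index).

```ruby
# Hypothetical helper, not part of Litesearch: merge the changed fields into
# a complete copy of the document so that no field is missing on update.
def full_document(existing, changes)
  existing.merge(changes)
end

# Assumed to be the application's own copy of the indexed document.
stored = { id: 1, sender: 'Kamal', receiver: 'Layla',
           subject: 'Miss you all', body: 'you and the girls' }

updated = full_document(stored, { subject: 'Miss you all very much' })
# updated still carries :sender, :receiver and :body, so passing it to
# idx.add(updated) would not nullify them.
```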

Searching the index

# single term search
idx.search('Kamal')
# multi term search (can use AND or OR)
idx.search('kamal OR layla')
# multi field search, note that girl would match girls thanks to the porter tokenizer
idx.search('(subject: miss) AND body:("girl")')
# you can pass a limit and an offset to search
idx.search('kamal', limit: 10, offset: 1000)
# you can also query the count of hits of a certain search
idx.count('kamal')
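The limit and offset options shown above map naturally onto page-based navigation. A minimal sketch, assuming 1-based page numbers; `page_options` is a hypothetical helper, not part of Litesearch.

```ruby
# Hypothetical helper, not part of Litesearch: convert a 1-based page number
# into the limit/offset options that search accepts.
def page_options(page, per_page = 10)
  raise ArgumentError, 'page must be >= 1' if page < 1

  { limit: per_page, offset: (page - 1) * per_page }
end

opts = page_options(3, 10) # third page, ten results per page
# idx.search('kamal', **opts)
```

The total number of pages can be derived from `idx.count` with the same query string.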

For a detailed description of the query language, please refer to the relevant section in SQLite's FTS5 guide.
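When search terms come from user input, it can help to quote them so that characters with special meaning in the query language are treated as plain text. In FTS5 a string is written in double quotes, and an embedded double quote is escaped by doubling it. The helpers below are hypothetical, not part of Litesearch, and assume the field name is a trusted schema name.

```ruby
# Hypothetical helpers, not part of Litesearch.

# Wrap a term in double quotes, doubling any embedded double quote,
# which is how FTS5 escapes quotes inside a string.
def fts5_quote(term)
  '"' + term.to_s.gsub('"', '""') + '"'
end

# Build a column filter such as: subject: "miss"
# The field name is assumed to be trusted and is not escaped.
def field_query(field, term)
  "#{field}: #{fts5_quote(term)}"
end

query = [field_query(:subject, 'miss'), field_query(:body, 'girl')].join(' AND ')
# idx.search(query)
```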

Retrieving an index instance from a Litedb object

Just call search_index without supplying a block:

idx = db.search_index('email')

Modifying the index schema

You can supply a new schema either by calling #search_index again with a block or by calling #modify on the index object with a block:

idx = db.search_index('email') do |schema|
  # subject now has the default weight of 1 (it was 10), and a new field :urgent was added
  schema.fields [:sender, :body, :subject, :urgent]
  # you cannot simply remove a field, but setting its weight to 0 ignores it, and the next rebuild removes it
  schema.field :receiver, weight: 0
  # changing the tokenizer requires a rebuild; an error is raised if rebuild_on_modify is set to false
  schema.tokenizer :trigram
  # rebuild to match the new schema, keeping the data intact and removing fields with weight = 0
  schema.rebuild_on_modify true
end

# another option is to call #modify on the idx object
idx.modify do |schema|
  # ...
end

If you set a field's weight to zero without rebuilding the index, the field's data is no longer visible to your queries. You can still query the field, but it will never return matches; you can also add documents with the field populated, but that data will not contribute to search results. After a rebuild, the field disappears completely, and searching for it using the FTS5 query syntax is an error.