alecbz/scripsi

Scripsi

A flexible text-search library built on top of Redis.

Sorted suffix indexing

Sorted suffix indexing lets you search for any substring within a set of documents. First, index a collection of documents along with their ids:

require 'scripsi'
Scripsi.connect  # connect to a running Redis server

ssi = Scripsi::SortedSuffixIndexer.new "myindexer"
ssi.index('1',"Epistulam ad te scripsi.")
ssi.index('2',"I've written you a letter.")
ssi.index('3',"Quisnam Tusculo epistulam me misit?")
ssi.index('4',"Who in Tusculum would've sent me a letter?")

You can then search for any substring, and the indexer will return the ids of the documents where that substring appears.

ssi = Scripsi.indexer "myindexer"
ssi.search("te")        # => ["1","2","4"]
ssi.search("Tuscul")    # => ["3","4"]
ssi.search("Tusculu")   # => ["4"]
ssi.search("you a le")  # => ["2"]
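
To make the idea behind these results concrete, here is a minimal in-memory sketch of sorted suffix indexing. It is not Scripsi's implementation and does not use Redis; the class name `TinySuffixIndex` and its methods are hypothetical and exist only for illustration. It stores every suffix of every document in one sorted list, then binary-searches for the first suffix that is not smaller than the search term. Every suffix that starts with the term sits in one contiguous run from that point.

```ruby
# Hypothetical illustration only: an in-memory suffix index.
# This is NOT how Scripsi stores data, and it does not talk to Redis.
class TinySuffixIndex
  def initialize
    @docs = {}       # id => full document text
    @suffixes = []   # [suffix, doc_id, offset] triples, kept sorted by suffix
  end

  # Record every suffix of the text together with its document id and offset.
  def index(id, text)
    @docs[id] = text
    text.length.times { |i| @suffixes << [text[i..-1], id, i] }
    @suffixes.sort!  # re-sorting on each call is fine for a sketch
    self
  end

  # Return [doc_id, start, end] for every occurrence of the term.
  def matches(term)
    # Binary search for the first suffix that is >= term.
    first = @suffixes.bsearch_index { |suffix, _id, _offset| suffix >= term }
    return [] if first.nil?

    # All suffixes beginning with the term are adjacent in sorted order.
    @suffixes[first..-1]
      .take_while { |suffix, _id, _offset| suffix.start_with?(term) }
      .map { |_suffix, id, offset| [id, offset, offset + term.length] }
  end

  # Return the sorted, de-duplicated ids of documents containing the term.
  def search(term)
    matches(term).map(&:first).uniq.sort
  end
end

idx = TinySuffixIndex.new
idx.index('1', "Epistulam ad te scripsi.")
idx.index('2', "I've written you a letter.")
idx.index('3', "Quisnam Tusculo epistulam me misit?")
idx.index('4', "Who in Tusculum would've sent me a letter?")

idx.search("te")         # => ["1", "2", "4"]
idx.search("Tuscul")     # => ["3", "4"]
idx.matches("you a le")  # => [["2", 13, 21]]
```

This sketch keeps a full copy of every suffix, so memory grows quadratically with document length. That is acceptable for a demonstration, but a real index would store offsets or use a more compact layout.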

To get more information about a match, such as where it occurs within the document, use the matches method:

match = ssi.matches("you a le").first
match.doc    # => "2"
match.start  # => 13
match.end    # => 21

ssi.documents[match.doc][match.start...match.end]  # => "you a le"
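
Judging from the example above, the offsets follow Ruby's half-open range convention: `start` is the index of the first matched character and `end` is one past the last. This can be checked with plain Ruby string methods and no Scripsi at all. The variable names below are made up for the check.

```ruby
# Plain Ruby check of the offsets shown above; no Scripsi or Redis involved.
doc  = "I've written you a letter."
term = "you a le"

start_offset = doc.index(term)             # index of the first matched character
end_offset   = start_offset + term.length  # one past the last matched character

start_offset                     # => 13
end_offset                       # => 21
doc[start_offset...end_offset]   # => "you a le"
```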

You can also retrieve the stored documents efficiently:

ssi.documents  # lazy list of documents
ssi.documents['3']  # document with id '3'
