C Ruby C++
Fetching latest commit…
Cannot retrieve the latest commit at this time.
= Whistlepig Whistlepig is a minimalist realtime full-text search index. Its goal is to be as small and feature-free as possible, while still remaining useful, performant and scalable to large corpora. If you want realtime full-text search without the frills, Whistlepig may be for you. Whistlepig is written in ANSI C99. It currently provides a C API and Ruby bindings. Latest version: 0.9.1, released 2012-03-14. Status: beta News: http://all-thing.net/label/whistlepig/ Homepage: http://masanjin.net/whistlepig/ Bug reports: http://github.com/wmorgan/whistlepig/issues = Getting it Tarball: http://masanjin.net/whistlepig/whistlepig-0.7.tar.gz Rubygem: gem install whistlepig Git: git clone git://github.com/wmorgan/whistlepig.git = Realtime search Roughly speaking, realtime search means: - documents are available to to queries immediately after indexing, without any reindexing or index merging; - later documents are more important than earlier documents. Whistlepig takes these principles to an extreme. - It only returns documents in the reverse (LIFO) order to which they were added, and performs no ranking, reordering, or scoring. - It only supports incremental indexing. There is no notion of batch indexing or index merging. - It does not support document deletion or modification (except in the special case of labels; see below). - It only supports in-memory indexes. Features that Whistlepig does provide: - Incremental indexing. Updates to the index are immediately available to readers. - Fielded terms with arbitrary fields. - A full query language and parser with conjunctions, disjunctions, phrases, negations, grouping, and nesting. - Labels: arbitrary tokens which can be added to and removed from documents at any point, and incorporated into search queries. - Early query termination and resumable queries. - A tiny, < 3 KLOC ANSI C99 implementation. == Synopsis (using Ruby bindings) require 'rubygems' require 'whistlepig' include Whistlepig index = Index.new "index" entry1 = Entry.new entry1.add_string "body", "hello there bob" docid1 = index.add_entry entry1 # => 1 entry2 = Entry.new entry2.add_string "body", "goodbye bob" docid2 = index.add_entry entry2 # => 2 q1 = Query.new "body", "bob" results1 = index.search q1 # => [2, 1] q2 = q1.and Query.new("body", "hello") results2 = index.search q2 # =>  index.add_label docid2, "funny" q3 = Query.new "body", "bob ~funny" results3 = index.search q3 # =>  entry3 = Entry.new entry3.add_string "body", "hello joe" entry3.add_string "subject", "what do you know?" docid3 = index.add_entry entry3 # => 3 q4 = Query.new "body", "subject:know hello" results4 = index.search q4 # =>  == A note on concurrency: Whistlepig is currently single-process and single-thread only. However, it is built with multi-process access in mind. Per-segment single-writer, multi-reader support is planned in the near future. Multi-writer support can be accomplished via index striping and may be attempted in the distant future. Please send bug reports and comments to: firstname.lastname@example.org.