A minimalist realtime full-text search index http://masanjin.net/whistlepig
C Ruby C++
build
integration-tests
ruby
www
.gitignore
COPYING
Makefile
README
RELEASE-SCRIPT
batch-run-queries.c
defaults.h
dump.c
entry.c
entry.h
error.c
error.h
file-indexer.c
index.c
index.h
interactive.c
khash.h
make-queries.c
mbox-indexer.c
mmap-obj.c
mmap-obj.h
query-parser.c
query-parser.h
query-parser.lex
query-parser.y
query.c
query.h
search.c
search.h
segment.c
segment.h
stringmap.c
stringmap.h
stringpool.c
stringpool.h
termhash.c
termhash.h
test-labels.c
test-queries.c
test-search.c
test-segment.c
test-stringmap.c
test-stringpool.c
test-termhash.c
test-tokenizer.c
test.h
timer.h
tokenizer.lex
whistlepig.h

README

= Whistlepig

Whistlepig is a minimalist realtime full-text search index. Its goal is to be
as small and feature-free as possible, while still remaining useful, performant
and scalable to large corpora. If you want realtime full-text search without
the frills, Whistlepig may be for you.

Whistlepig is written in ANSI C99. It currently provides a C API and Ruby
bindings.

Latest version: 0.9.1, released 2012-03-14.
        Status: beta
          News: http://all-thing.net/label/whistlepig/
      Homepage: http://masanjin.net/whistlepig/
   Bug reports: http://github.com/wmorgan/whistlepig/issues

= Getting it

       Tarball:  http://masanjin.net/whistlepig/whistlepig-0.7.tar.gz
       Rubygem:  gem install whistlepig
           Git:  git clone git://github.com/wmorgan/whistlepig.git

= Realtime search

Roughly speaking, realtime search means:
- documents are available to queries immediately after indexing, without any
  reindexing or index merging;
- later documents are more important than earlier documents.

Whistlepig takes these principles to an extreme.
- It only returns documents in the reverse (LIFO) of the order in which they
  were added, and performs no ranking, reordering, or scoring.
- It only supports incremental indexing. There is no notion of batch indexing
  or index merging.
- It does not support document deletion or modification (except in the
  special case of labels; see below).
- It only supports in-memory indexes.
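
To make the LIFO, no-scoring behaviour concrete, here is a minimal pure-Ruby
sketch. It is a toy and makes no claim about how Whistlepig is implemented
internally: the ToyIndex class and its add and search methods are hypothetical
names invented for this illustration, and the only tokenization is splitting
on whitespace.

```ruby
# Toy illustration only: a hypothetical in-memory index, NOT Whistlepig's code.
class ToyIndex
  def initialize
    # term => list of doc ids, in the order the documents were added
    @postings = Hash.new { |hash, term| hash[term] = [] }
    @next_id = 0
  end

  # Incremental indexing: the document is searchable as soon as add returns.
  def add(text)
    @next_id += 1
    text.downcase.split.uniq.each { |term| @postings[term] << @next_id }
    @next_id
  end

  # No ranking or scoring: results are the posting list, newest first.
  def search(term)
    @postings.fetch(term.downcase, []).reverse
  end
end

index = ToyIndex.new
index.add "hello there bob"        # doc id 1
index.add "goodbye bob"            # doc id 2
bob_hits   = index.search("bob")   # newest document first
hello_hits = index.search("hello")
none_hits  = index.search("alice")
```

Because doc ids only ever grow, each posting list is already sorted by
insertion time, so "newest first" needs nothing more than a reverse.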

Features that Whistlepig does provide:
- Incremental indexing. Updates to the index are immediately available to
  readers.
- Fielded terms with arbitrary fields.
- A full query language and parser with conjunctions, disjunctions, phrases,
  negations, grouping, and nesting.
- Labels: arbitrary tokens which can be added to and removed from documents
  at any point, and incorporated into search queries.
- Early query termination and resumable queries.
- A tiny, < 3 KLOC ANSI C99 implementation.
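
Early termination and resumable queries fit naturally with LIFO results: a
reader can stop after the newest few matches and pick up later where it left
off. The pure-Ruby sketch below shows the general idea only. ToyCursor,
next_results and done? are hypothetical names for this illustration and are
not part of the Whistlepig API.

```ruby
# Hypothetical sketch: a resumable cursor over matching doc ids.
class ToyCursor
  def initialize(doc_ids)
    @doc_ids = doc_ids.sort.reverse  # newest (highest id) first
    @pos = 0                         # where the next call resumes
  end

  # Return at most `limit` further results and remember where we stopped.
  def next_results(limit)
    chunk = @doc_ids[@pos, limit] || []
    @pos += chunk.size
    chunk
  end

  # True once every matching document has been returned.
  def done?
    @pos >= @doc_ids.size
  end
end

cursor = ToyCursor.new([1, 2, 3, 4, 5])
first_page  = cursor.next_results(2)  # stop early after two hits
second_page = cursor.next_results(2)  # resume from the saved position
third_page  = cursor.next_results(2)  # fewer than two hits remain
```

The caller never pays for results it does not ask for, which is the point of
terminating a query early.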

== Synopsis (using Ruby bindings)

  require 'rubygems'
  require 'whistlepig'

  include Whistlepig

  index = Index.new "index"

  entry1 = Entry.new
  entry1.add_string "body", "hello there bob"
  docid1 = index.add_entry entry1              # => 1

  entry2 = Entry.new
  entry2.add_string "body", "goodbye bob"
  docid2 = index.add_entry entry2              # => 2

  q1 = Query.new "body", "bob"
  results1 = index.search q1                   # => [2, 1]

  q2 = q1.and Query.new("body", "hello")
  results2 = index.search q2                   # => [1]

  index.add_label docid2, "funny"

  q3 = Query.new "body", "bob ~funny"
  results3 = index.search q3                   # => [2]

  entry3 = Entry.new
  entry3.add_string "body", "hello joe"
  entry3.add_string "subject", "what do you know?"
  docid3 = index.add_entry entry3              # => 3

  q4 = Query.new "body", "subject:know hello"
  results4 = index.search q4                   # => [3]

== A note on concurrency:

Whistlepig is currently single-process and single-threaded only. However, it is
built with multi-process access in mind. Per-segment single-writer,
multi-reader support is planned in the near future. Multi-writer support can be
accomplished via index striping and may be attempted in the distant future.
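
As a rough picture of what index striping could look like, the pure-Ruby
sketch below spreads documents across several independent sub-indexes and
merges their results at query time. It is speculative and single-threaded:
ToyStripedIndex is a hypothetical name, the round-robin placement is an
assumption, and a real multi-writer design would also have to coordinate doc
id assignment between writers, which this toy sidesteps with a single counter.

```ruby
# Hypothetical sketch of index striping, NOT Whistlepig's design.
class ToyStripedIndex
  def initialize(num_stripes)
    # Each stripe is its own small term => doc ids map.
    @stripes = Array.new(num_stripes) do
      Hash.new { |hash, term| hash[term] = [] }
    end
    @next_id = 0
  end

  # Assign a global doc id, then write to exactly one stripe (round robin).
  def add(text)
    @next_id += 1
    stripe = @stripes[@next_id % @stripes.size]
    text.downcase.split.uniq.each { |term| stripe[term] << @next_id }
    @next_id
  end

  # Query every stripe and merge so the newest documents still come first.
  def search(term)
    key = term.downcase
    @stripes.flat_map { |stripe| stripe.fetch(key, []) }.sort.reverse
  end
end

striped = ToyStripedIndex.new(2)
striped.add "hello there bob"   # doc id 1
striped.add "goodbye bob"       # doc id 2
striped.add "hello joe"         # doc id 3
striped.add "bob again"         # doc id 4
striped_hits = striped.search("bob")
```

Since each write touches only one stripe, separate writers could in principle
own separate stripes, at the cost of a merge step on every query.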

Please send bug reports and comments to: wmorgan-whistlepig-readme@masanjin.net.