Full-text Search

Walrus comes with a standalone full-text search index that supports:

  • Storing documents along with arbitrary metadata.
  • Complex search using boolean/set operations and parentheses.
  • Stop-word removal.
  • Porter stemming.
  • Optional double-metaphone for phonetic search.
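To make the stop-word removal listed above more concrete, here is a minimal pure-Python sketch of an inverted index. It is not how Walrus is implemented: the `STOP_WORDS` set and the `tokenize` and `build_index` helpers are hypothetical names chosen for illustration, and stemming is left out entirely.

```python
import re
from collections import defaultdict

# Hypothetical stop-word list; not the list Walrus ships with.
STOP_WORDS = {'a', 'an', 'and', 'is', 'of', 'the', 'this', 'to'}


def tokenize(text):
    """Lowercase the text, split it into words, and drop stop words."""
    words = re.findall(r'[a-z0-9]+', text.lower())
    return [word for word in words if word not in STOP_WORDS]


def build_index(documents):
    """Map each remaining token to the set of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, content in documents.items():
        for token in tokenize(content):
            index[token].add(doc_id)
    return index


documents = {
    'doc-1': 'this is the content of document 1',
    'doc-2': 'another document',
}
index = build_index(documents)

print(tokenize(documents['doc-1']))  # stop words such as "this" and "the" are gone
print(sorted(index['document']))     # IDs of the documents containing "document"
```

Because stop words never reach the index, a very common word such as "the" cannot match anything in this sketch.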

To create a full-text index, use:

  • Database.Index
  • Index

Example:

from walrus import Database

db = Database()
search_index = db.Index('app-search')

# Phonetic search.
phonetic_index = db.Index('phonetic-search', metaphone=True)

Storing data

Use the Index.add method to add documents to the search index:

# Specify the document's unique ID and the content to be indexed.
search_index.add('doc-1', 'this is the content of document 1')

# Besides the document ID and content, we can also store metadata, which is
# not searchable, but is returned along with the document content when a
# search is performed.
search_index.add('doc-2', 'another document', title='Another', status='1')

To update a document, use either the Index.update or the Index.replace method. The former updates the existing metadata with the values you pass, while the latter clears any pre-existing metadata before saving.

# Update doc-1's content and metadata.
search_index.update('doc-1', 'this is the new content', title='Doc 1')

# Overwrite doc-2...the "status" metadata value set earlier will be lost.
search_index.replace('doc-2', 'another document', title='Another doc')
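The difference between updating and replacing metadata can be modelled with plain dictionaries. The sketch below is only an analogy for the behaviour described above: `MetadataStore` is a hypothetical in-memory stand-in, not part of Walrus, and it ignores document content altogether.

```python
class MetadataStore:
    """Hypothetical in-memory stand-in for per-document metadata."""

    def __init__(self):
        self.metadata = {}

    def add(self, doc_id, **metadata):
        self.metadata[doc_id] = dict(metadata)

    def update(self, doc_id, **metadata):
        # Merge: existing keys are kept unless they are overwritten.
        self.metadata.setdefault(doc_id, {}).update(metadata)

    def replace(self, doc_id, **metadata):
        # Discard whatever was stored and keep only the new values.
        self.metadata[doc_id] = dict(metadata)


store = MetadataStore()
store.add('doc-2', title='Another', status='1')

store.update('doc-2', title='Another doc')
after_update = dict(store.metadata['doc-2'])   # "status" is still present

store.replace('doc-2', title='Another doc')
after_replace = dict(store.metadata['doc-2'])  # "status" has been dropped

print(after_update)
print(after_replace)
```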

To remove a document, use Index.remove:

search_index.remove('doc-1')  # Removes the document and its metadata from the index.

Searching

Use the Index.search method to perform searches. The search query can include set operations (e.g. AND, OR) and use parentheses to indicate operation precedence.

for document in search_index.search('python AND flask'):
    # Print the "title" that was stored as metadata. The "content" field
    # contains the original content of the document as it was indexed.
    print(document['title'], document['content'])
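Conceptually, AND and OR behave like set intersection and union over the documents that contain each term, which is why parentheses can change the result. The following sketch illustrates this with ordinary Python sets. The `postings` mapping and its document IDs are made up for the example, and nothing here reflects how Walrus parses or evaluates queries internally.

```python
# Hypothetical postings: term -> IDs of the documents containing that term.
postings = {
    'python': {'doc-1', 'doc-2', 'doc-3'},
    'flask': {'doc-2', 'doc-3'},
    'django': {'doc-1', 'doc-4'},
}

# python AND flask: documents containing both terms.
both = postings['python'] & postings['flask']

# (python AND flask) OR django
left_grouped = (postings['python'] & postings['flask']) | postings['django']

# python AND (flask OR django)
right_grouped = postings['python'] & (postings['flask'] | postings['django'])

print(sorted(both))
print(sorted(left_grouped))
print(sorted(right_grouped))
```

The two grouped queries return different documents: only the first one includes 'doc-4', which mentions django but not python.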

Phonetic search, using metaphone, is tolerant of typos:

for document in phonetic_index.search('flasck AND pythonn'):
    print(document['title'], document['content'])
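Phonetic search tolerates typos because both the indexed words and the query terms are reduced to a key that approximates how they sound, and misspellings often reduce to the same key. The toy function below shows the idea only. It is not double metaphone, and `toy_phonetic_key` is a hypothetical helper with deliberately crude rules.

```python
import re


def toy_phonetic_key(word):
    """Crude phonetic key for illustration only; this is NOT double metaphone."""
    word = word.lower()
    word = word.replace('ck', 'k')          # treat "ck" as a plain "k" sound
    word = re.sub(r'(.)\1+', r'\1', word)   # collapse runs of a repeated letter
    head, tail = word[:1], word[1:]
    # Keep the first letter, then drop the vowels a, e, i, o, u from the rest.
    return head + re.sub(r'[aeiou]', '', tail)


for correct, typo in [('flask', 'flasck'), ('python', 'pythonn')]:
    print(correct, typo, toy_phonetic_key(correct) == toy_phonetic_key(typo))
```

Real phonetic algorithms use far richer rules, but the matching step is the same: compare keys rather than spellings.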

For more information, see the Index API documentation.