FlexSearch v0.8: Overview and Migration Guide

FlexSearch.js: Next-Generation full-text search library for Browser and Node.js

Basic Start  •  API Reference  •  Encoder  •  Document Search  •  Persistent Indexes  •  Using Worker  •  Tag Search  •  Resolver  •  Changelog

Please Support this Project

FlexSearch has been helping developers around the world build powerful, efficient search functionalities for years. Maintaining and improving the library requires significant time and resources. If you’ve found this project valuable and you're interested in supporting the project, please consider donating. Thanks a lot for your continued support!

Donate using Open Collective Donate using Github Sponsors Donate using Liberapay Donate using Patreon Donate using Bountysource Donate using PayPal

FlexSearch Sponsors

Donate using Open Collective
Antithesis Operations LLC

FlexSearch performs queries up to 1,000,000 times faster than other libraries while also providing powerful search capabilities such as multi-field search (document search), phonetic transformations, partial matching, tag search, and suggestions.

Larger workloads scale through workers, which perform updates and queries on the index in parallel on dedicated, balanced threads.

The latest generation, v0.8, introduces Persistent Indexes, which are optimized for scaling to large datasets and for running in parallel. All available features were ported natively to the database engine of your choice.

FlexSearch was nominated by the GitNation for the "Best Technology of the Year".

Supported Platforms:

  • Browser
  • Node.js

Supported Databases:

  • IndexedDB (Browser)
  • Postgres
  • SQLite
  • MongoDB
  • Redis
  • Clickhouse

Supported Charsets:

  • Latin
  • Chinese, Korean, Japanese (CJK)
  • Hindi
  • Arabic
  • Cyrillic
  • Greek and Coptic
  • Hebrew

Common Code Examples:

Demos:

Benchmarks:

Latest Benchmark Results

The benchmark was measured in terms per second; higher values are better (except for the "Memory" test). The memory value refers to the amount of memory that was additionally allocated during search.

| Library | Memory | Query: Single | Query: Multi | Query: Large | Query: Not Found |
| --- | --- | --- | --- | --- | --- |
| flexsearch | 16 | 50955718 | 11912730 | 13981110 | 51706499 |
| jsii | 2188 | 13847 | 949559 | 1635959 | 3730307 |
| wade | 980 | 60473 | 443214 | 419152 | 1239372 |
| js-search | 237 | 22982 | 383775 | 426609 | 994803 |
| minisearch | 4777 | 30589 | 191657 | 5849 | 304233 |
| orama | 5355 | 29445 | 170231 | 4454 | 225491 |
| elasticlunr | 3073 | 14326 | 48558 | 101206 | 95840 |
| lunr | 2443 | 11527 | 51476 | 88858 | 103386 |
| ufuzzy | 13754 | 2799 | 7788 | 58544 | 9557 |
| bm25 | 33963 | 3903 | 4777 | 12657 | 12471 |
| fuzzysearch | 300147 | 148 | 229 | 455 | 276 |
| fuse | 247107 | 422 | 321 | 337 | 329 |

Run Comparison: Performance Benchmark "Gulliver's Travels"

External Projects & Plugins:

Table of contents

Tip

Spending just 5 minutes to understand these 3 elementary things about FlexSearch will improve your results significantly: Tokenizer, Encoder and Suggestions.

Load Library (Node.js, ESM, Legacy Browser)

npm install flexsearch

The dist folder is located in: node_modules/flexsearch/dist/

Download Builds
| Build | File | CDN |
| --- | --- | --- |
| flexsearch.bundle.debug.js | Download | https://cdn.jsdelivr.net/gh/nextapps-de/flexsearch@0.8.1/dist/flexsearch.bundle.debug.js |
| flexsearch.bundle.min.js | Download | https://cdn.jsdelivr.net/gh/nextapps-de/flexsearch@0.8.1/dist/flexsearch.bundle.min.js |
| flexsearch.bundle.module.debug.js | Download | https://cdn.jsdelivr.net/gh/nextapps-de/flexsearch@0.8.1/dist/flexsearch.bundle.module.debug.js |
| flexsearch.bundle.module.min.js | Download | https://cdn.jsdelivr.net/gh/nextapps-de/flexsearch@0.8.1/dist/flexsearch.bundle.module.min.js |
| flexsearch.es5.debug.js | Download | https://cdn.jsdelivr.net/gh/nextapps-de/flexsearch@0.8.1/dist/flexsearch.es5.debug.js |
| flexsearch.es5.min.js | Download | https://cdn.jsdelivr.net/gh/nextapps-de/flexsearch@0.8.1/dist/flexsearch.es5.min.js |
| flexsearch.light.debug.js | Download | https://cdn.jsdelivr.net/gh/nextapps-de/flexsearch@0.8.1/dist/flexsearch.light.debug.js |
| flexsearch.light.min.js | Download | https://cdn.jsdelivr.net/gh/nextapps-de/flexsearch@0.8.1/dist/flexsearch.light.min.js |
| flexsearch.light.module.debug.js | Download | https://cdn.jsdelivr.net/gh/nextapps-de/flexsearch@0.8.1/dist/flexsearch.light.module.debug.js |
| flexsearch.light.module.min.js | Download | https://cdn.jsdelivr.net/gh/nextapps-de/flexsearch@0.8.1/dist/flexsearch.light.module.min.js |
| Javascript Modules (ESM) | Download | https://github.com/nextapps-de/flexsearch/tree/0.8.1/dist/module |
| Javascript Modules Minified (ESM) | Download | https://github.com/nextapps-de/flexsearch/tree/0.8.1/dist/module-min |
| Javascript Modules Debug (ESM) | Download | https://github.com/nextapps-de/flexsearch/tree/0.8.1/dist/module-debug |
| flexsearch.custom.js | Read more about "Custom Build" | |
Compare Bundles: Light, Compact, Bundle

The Node.js package includes all features from flexsearch.bundle.js.

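| Feature | flexsearch.bundle.js | flexsearch.compact.js | flexsearch.light.js |
| --- | --- | --- | --- |
| Presets | ✓ | ✓ | ✓ |
| Async Processing | ✓ | ✓ | - |
| Workers (Web + Node.js) | ✓ | - | - |
| Context Search | ✓ | ✓ | ✓ |
| Document Search | ✓ | ✓ | - |
| Document Store | ✓ | ✓ | - |
| Partial Matching | ✓ | ✓ | ✓ |
| Relevance Scoring | ✓ | ✓ | ✓ |
| Auto-Balanced Cache by Popularity/Last Queries | ✓ | ✓ | - |
| Tag Search | ✓ | ✓ | - |
| Suggestions | ✓ | ✓ | ✓ |
| Phonetic Search (Fuzzy Search) | ✓ | ✓ | - |
| Encoder | ✓ | ✓ | ✓ |
| Export / Import Indexes | ✓ | ✓ | - |
| Resolver | ✓ | - | - |
| Persistent Index (IndexedDB) | ✓ | - | - |
| File Size (gzip) | 14.0 kb | 9.0 kb | 4.4 kb |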

Tip

All debug versions provide debug information through the console and give you helpful advice in certain situations. Do not use them in production, since they are special builds containing extra debugging processes which noticeably reduce performance.

The abbreviations used in the filenames indicate:

  • bundle: all features are included, FlexSearch is available on window.FlexSearch
  • light: only basic features are included, FlexSearch is available on window.FlexSearch
  • es5: the bundle supports EcmaScript5, FlexSearch is available on window.FlexSearch
  • module: the bundle is a Javascript module (ESM), FlexSearch members are available by import { Index, Document, Worker, Encoder, Charset } from "./flexsearch.bundle.module.min.js" or alternatively using the default export import FlexSearch from "./flexsearch.bundle.module.min.js"
  • min: the bundle is minified
  • debug: the bundle has debug mode enabled and contains additional code just for debugging purposes (do not use in production)

Load Library

Non-Module Bundles (ES5 Legacy)

Non-Module Bundles export all their features to the public namespace "FlexSearch" e.g. window.FlexSearch.Index or window.FlexSearch.Document.

Load the bundle by a script tag:

<script src="dist/flexsearch.bundle.min.js"></script>
<script>
  // ... access FlexSearch
  var Index = window.FlexSearch.Index;
  var index = new Index(/* ... */);
</script>

FlexSearch Members are accessible on:

var Index = window.FlexSearch.Index;
var Document = window.FlexSearch.Document;
var Encoder = window.FlexSearch.Encoder;
var Charset = window.FlexSearch.Charset;
var Resolver = window.FlexSearch.Resolver;
var Worker = window.FlexSearch.Worker;
var IdxDB = window.FlexSearch.IndexedDB;
// only exported by non-module builds:
var Language = window.FlexSearch.Language;

Load language packs:

<!-- English: -->
<script src="dist/lang/en.min.js"></script>
<!-- German: -->
<script src="dist/lang/de.min.js"></script>
<!-- French: -->
<script src="dist/lang/fr.min.js"></script>
<script>
  var EnglishEncoderPreset = window.FlexSearch.Language.en;
  var GermanEncoderPreset = window.FlexSearch.Language.de;
  var FrenchEncoderPreset = window.FlexSearch.Language.fr;
</script>

Module (ESM)

When using modules you can choose from 2 variants: flexsearch.xxx.module.min.js has all features bundled and ready for production, whereas the folder /dist/module/ exports all the features in the same structure as the source code, but with the compiler flags already resolved.

In addition, each variant comes in two versions:

  1. A debug version for development
  2. A pre-compiled minified version for production

Use the bundled version exported as a module (default export):

<script type="module">
    import FlexSearch from "./dist/flexsearch.bundle.module.min.js";
    const index = new FlexSearch.Index(/* ... */);
</script>

Or import FlexSearch members separately by:

<script type="module">
    import { Index, Document, Encoder, Charset, Resolver, Worker, IndexedDB } 
        from "./dist/flexsearch.bundle.module.min.js";
    const index = new Index(/* ... */);
</script>

Use bundled style on non-bundled modules:

<script type="module">
    import { Index, Document, Encoder, Charset, Resolver, Worker, IndexedDB }
        from "./dist/module/bundle.js";
    const index = new Index(/* ... */);
</script>

Use non-bundled modules by file default exports:

<script type="module">
    import Index from "./dist/module/index.js";
    import Document from "./dist/module/document.js";
    import Encoder from "./dist/module/encoder.js";
    import Charset from "./dist/module/charset.js";
    import Resolver from "./dist/module/resolver.js";
    import Worker from "./dist/module/worker.js";
    import IndexedDB from "./dist/module/db/indexeddb/db.js";
    const index = new Index(/* ... */);
</script>

Language packs are accessible via:

import EnglishEncoderPreset from "./dist/module/lang/en.js";
import GermanEncoderPreset from "./dist/module/lang/de.js";
import FrenchEncoderPreset from "./dist/module/lang/fr.js";

Also, pre-compiled non-bundled production-ready modules are located in dist/module-min/, whereas the debug version is located at dist/module-debug/.

You can also load modules via CDN:

<script type="module">
    import Index from "https://unpkg.com/flexsearch@0.8.1/dist/module/index.js";
    const index = new Index(/* ... */);
</script>

Node.js

Install FlexSearch via NPM:

npm install flexsearch

Use the default export:

const FlexSearch = require("flexsearch");
const index = new FlexSearch.Index(/* ... */);

Or require FlexSearch members separately by:

const { Index, Document, Encoder, Charset, Resolver, Worker, IndexedDB } = require("flexsearch");
const index = new Index(/* ... */);

When using ESM instead of CommonJS:

import { Index, Document, Encoder, Charset, Resolver, Worker, IndexedDB } from "flexsearch";
const index = new Index(/* ... */);

Language packs are accessible via:

const EnglishEncoderPreset = require("flexsearch/lang/en");
const GermanEncoderPreset = require("flexsearch/lang/de");
const FrenchEncoderPreset = require("flexsearch/lang/fr");

Persistent Connectors are accessible via:

const Postgres = require("flexsearch/db/postgres");
const Sqlite = require("flexsearch/db/sqlite");
const MongoDB = require("flexsearch/db/mongodb");
const Redis = require("flexsearch/db/redis");
const Clickhouse = require("flexsearch/db/clickhouse");

Basic Usage and Variants

There are 3 types of indexes:

  1. Index is a flat high-performance index which stores id-content pairs.
  2. Worker / WorkerIndex is also a flat index which stores id-content pairs, but it runs in the background as a dedicated worker thread.
  3. Document is a multi-field index which can store complex JSON documents (it can also consist of worker indexes).

Most of you probably need just one of them, depending on your scenario. Any of these 3 index types can be upgraded to a persistent index.

The worker instance inherits from type Index and basically works like a standard FlexSearch Index. A document index is a complex register automatically operating on several of those standard indexes in parallel. Worker-Support in documents needs to be enabled by just passing the appropriate option during creation e.g. { worker: true }.

// Index
index.add(id, text);
const result = index.search(text, options);

// Worker (every method is async and returns a Promise)
await worker.add(id, text);
const result = await worker.search(text, options);

// Document
document.add(doc);
const result = document.search(text, options);

Each of these index types optionally has a persistent model. So a persistent index isn't a new 4th index type; instead it extends the existing ones.

Every method called on a Worker index is treated as async. You will get back a Promise, or you can additionally provide a callback function as the last parameter.

Common Code Examples

The documentation will refer to several examples. A list of all examples:

Examples Node.js (CommonJS)
Examples Node.js (ESM/Module)

Examples Browser (Legacy)
Examples Browser (ESM/Module)

API Overview

Constructors:

  • new Index(<options>) : index
  • new Document(options) : document
  • new Worker(<options>) : worker
  • new Encoder(<options>, <options>, ...) : encoder
  • new Resolver(<options>) : resolver
  • new IndexedDB(<options>) : indexeddb

Global Members:


Index / Worker-Index Methods:


Document Methods:

Document Properties:


Async Equivalents (Non-Blocking Balanced):

Async methods return a Promise. Additionally, you can pass a callback function as the last parameter.

The methods export and import are always async, as is every method you call on a Worker-based or Persistent Index.


Encoder Methods:


Resolver Methods:

  • resolver.and(options)
  • resolver.or(options)
  • resolver.xor(options)
  • resolver.not(options)
  • resolver.boost(number)
  • resolver.limit(number)
  • resolver.offset(number)
  • resolver.resolve(<options>)

Resolver Properties:


StorageInterface Methods:


Charset Encoder Preset:


Language Encoder Preset:

Options

Tokenizer (Partial Match)

The tokenizer is one of the most important options and heavily influences:

  1. required memory / storage
  2. capabilities of partial matches

Tip

If you want to get back results for an indexed term "flexsearch" when just typing "flex" or "search", you achieve this by choosing a tokenizer.

Choose the topmost tokenizer in the following table that covers your requirements:

| Option | Description | Example | Memory Factor (n = length of term) |
| --- | --- | --- | --- |
| "strict", "exact", "default" | index the full term | foobar | * 1 |
| "forward" | index term in forward direction (supports right-to-left by Index option rtl: true) | foobar | * n |
| "reverse", "bidirectional" | index term in both directions | foobar | * 2n - 1 |
| "full" | index every consecutive partial | foobar | * n * (n - 1) |
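
To make the growth of the memory factor more tangible, the following sketch enumerates the partials of a single term that each tokenizer mode would need to cover. It is a conceptual illustration only, not FlexSearch's internal implementation: the helper name `partials` is hypothetical, and the resulting counts show the order of growth rather than reproducing the library's own factor estimates from the table.

```javascript
// Conceptual sketch: which partials of a term become searchable per tokenizer mode.
// `partials` is a hypothetical helper and not part of the FlexSearch API.
function partials(term, mode) {
  const out = new Set();
  const n = term.length;
  switch (mode) {
    case "strict":
      // only the full term
      out.add(term);
      break;
    case "forward":
      // every prefix: f, fo, foo, ...
      for (let end = 1; end <= n; end++) out.add(term.slice(0, end));
      break;
    case "reverse":
      // every prefix plus every suffix
      for (let end = 1; end <= n; end++) out.add(term.slice(0, end));
      for (let start = 1; start < n; start++) out.add(term.slice(start));
      break;
    case "full":
      // every consecutive substring
      for (let start = 0; start < n; start++) {
        for (let end = start + 1; end <= n; end++) out.add(term.slice(start, end));
      }
      break;
    default:
      throw new Error("unknown mode: " + mode);
  }
  return [...out];
}

console.log(partials("foobar", "forward").length); // 6  (n)
console.log(partials("foobar", "reverse").length); // 11 (2n - 1)
```

The jump from "forward" to "full" is the reason to pick the topmost tokenizer that still satisfies your matching requirements.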

Charset Collection

Encoding is one of the most important tasks and heavily influences:

  1. required memory / storage
  2. capabilities of phonetic matches (Fuzzy-Search)

| Option | Description | Compression Ratio |
| --- | --- | --- |
| Exact | Bypass encoding and take exact input | 0% |
| Default | Case in-sensitive encoding | 3% |
| Normalize | Case in-sensitive encoding, Charset normalization | ~ 7% |
| LatinBalance | Case in-sensitive encoding, Charset normalization, Phonetic basic transformation | ~ 30% |
| LatinAdvanced | Case in-sensitive encoding, Charset normalization, Phonetic advanced transformation | ~ 45% |
| LatinExtra | Case in-sensitive encoding, Charset normalization, Soundex-like transformation | ~ 60% |
| LatinSoundex | Full Soundex transformation | ~ 70% |
| function(str) => [str] | Pass a custom encoding function to the Encoder | |
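
As an illustration of the last row, here is a minimal sketch of a custom encoding function with the shape function(str) => [str]. It combines case-insensitive encoding with a simple charset normalization using standard JavaScript string methods. The function name `encodeNormalized` is hypothetical, and the built-in presets may well normalize differently; how the function is registered with the Encoder is not shown here.

```javascript
// Hypothetical custom encoding function of the shape function(str) => [str].
// It only uses standard JavaScript and is not FlexSearch's own normalization.
function encodeNormalized(str) {
  return str
    .normalize("NFD")          // split accented letters into base letter + combining mark
    .replace(/\p{M}/gu, "")    // drop the combining marks (diacritics)
    .toLowerCase()             // case-insensitive encoding
    .split(/[^\p{L}\p{N}]+/u)  // split on everything that is not a letter or digit
    .filter(Boolean);          // remove empty strings left by the split
}

console.log(encodeNormalized("Crème Brûlée!")); // [ 'creme', 'brulee' ]
```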

Basic Usage

Create a new index

var index = new Index();

Create a new index and choose one of the presets:

var index = new Index("performance");

Create a new index with custom options:

var index = new Index({
    charset: "latin:extra",
    tokenize: "reverse",
    resolution: 9
});

Create a new index and extend a preset with custom options:

var index = new Index({
    preset: "memory",
    tokenize: "forward",
    resolution: 5
});

The resolution refers to the maximum number of scoring slots into which the content is divided.

A formula to determine a well-balanced value for the resolution is $2 \cdot \lfloor \sqrt{\text{content.length}} \rfloor$, where content is the value passed to index.add(). Use the maximum length of all contents here.
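
The formula can be evaluated over a dataset with a few lines of plain JavaScript. This is a small sketch; the helper name `recommendedResolution` is hypothetical, and the result is a starting point for tuning rather than a required value.

```javascript
// Sketch: apply 2 * floor(sqrt(maxLength)) to the longest content in a dataset.
// `recommendedResolution` is a hypothetical helper, not part of the FlexSearch API.
function recommendedResolution(contents) {
  // length of the longest content that will be passed to index.add()
  const maxLength = contents.reduce((max, text) => Math.max(max, text.length), 0);
  return 2 * Math.floor(Math.sqrt(maxLength));
}

// longest content has 81 characters: 2 * floor(sqrt(81)) = 18
console.log(recommendedResolution(["John Doe", "x".repeat(81)])); // 18
```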

See all available custom options.

Add text item to an index

Every piece of content added to the index needs an ID. If your content has no ID, create one by passing an index, a counter, or something similar as the ID (a value of type number is highly recommended). These IDs are unique references to a given piece of content. This matters when you update content or add content under an existing ID. When referencing is not a concern, you can use something simple like count++.

Index.add(id, string)

index.add(0, "John Doe");

Search items

Index.search(string | options, <limit>, <options>)

index.search("John");

Limit the result:

index.search("John", 10);

Check existence of already indexed IDs

You can check if an ID was already indexed by:

if(index.contain(1)){
    console.log("ID is already in index");
}

Update item from an index

Index.update(id, string)

index.update(0, "Max Miller");

Remove item from an index

Index.remove(id)

index.remove(0);

Document Search (Field-Search)

Read here

Chaining

Simply chain methods like:

var index = Index.create().addMatcher({'â': 'a'}).add(0, 'foo').add(1, 'bar');
index.remove(0).update(1, 'foo').add(2, 'foobar');

Context Search

The basic idea of this concept is to limit relevance to a term's context instead of calculating relevance across the whole length of the corresponding document. The context acts like a bidirectional moving window of 2 pointers (terms). Their initial maximum distance is the value passed via the option depth, and the window grows dynamically during search when the query does not match any results.
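
One way to picture the window is to list the term pairs that fall within a given depth. The sketch below is purely conceptual and does not reproduce FlexSearch's internal data structures: the helper `contextPairs` is hypothetical, and treating a bidirectional pair as a single order-independent entry is an assumption made here to show why fewer entries may be needed in that mode.

```javascript
// Conceptual sketch only; `contextPairs` is a hypothetical helper.
// It lists the pairs of terms whose distance is at most `depth`.
function contextPairs(terms, depth, bidirectional) {
  const pairs = new Set();
  for (let i = 0; i < terms.length; i++) {
    for (let d = 1; d <= depth && i + d < terms.length; d++) {
      let a = terms[i];
      let b = terms[i + d];
      // assumption: in bidirectional mode (a, b) and (b, a) share one entry
      if (bidirectional && a > b) [a, b] = [b, a];
      pairs.add(a + " " + b);
    }
  }
  return [...pairs];
}

console.log(contextPairs(["red", "car", "red"], 1, false)); // [ 'red car', 'car red' ]
console.log(contextPairs(["red", "car", "red"], 1, true));  // [ 'car red' ]
```

A larger depth produces more pairs per term, which is in line with the additional memory a contextual index requires.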

Enable Context-Search

Create an index and use the default context:

var index = new Index({
    tokenize: "strict",
    context: true
});

Create an index and apply custom options for the context:

var index = new Index({
    tokenize: "strict",
    context: { 
        resolution: 5,
        depth: 3,
        bidirectional: true
    }
});

Only the tokenizer "strict" is currently supported by the contextual index.

The contextual index requires an additional amount of memory, depending on depth.

Index Memory Allocation

The book "Gulliver's Travels" (Jonathan Swift, 1726) was indexed for this test.

by default a lexical index is very small:
depth: 0, bidirectional: 0, resolution: 3, minlength: 0 => 2.1 Mb

a higher resolution will increase the memory allocation:
depth: 0, bidirectional: 0, resolution: 9, minlength: 0 => 2.9 Mb

using the contextual index will increase the memory allocation:
depth: 1, bidirectional: 0, resolution: 9, minlength: 0 => 12.5 Mb

a higher contextual depth will increase the memory allocation:
depth: 2, bidirectional: 0, resolution: 9, minlength: 0 => 21.5 Mb

a higher minlength will decrease memory allocation:
depth: 2, bidirectional: 0, resolution: 9, minlength: 3 => 19.0 Mb

using bidirectional will decrease memory allocation:
depth: 2, bidirectional: 1, resolution: 9, minlength: 3 => 17.9 Mb

enabling the option "fastupdate" will increase memory allocation:
depth: 2, bidirectional: 1, resolution: 9, minlength: 3 => 6.3 Mb

Presets

  1. memory: primarily optimized for a small memory footprint
  2. performance: primarily optimized for high performance
  3. match: primarily optimized for matching capabilities
  4. score: primarily optimized for scoring capabilities (order of results)
  5. default: the default balanced profile

These profiles cover standard use cases. To get the best results, it is recommended to apply a custom configuration instead of using a profile. Every profile can be optimized further for its specific task, e.g. a configuration tuned for extreme performance or for extremely low memory usage.

You can pass a preset during creation/initialization of the index.

Best Practices

Use numeric IDs

It is recommended to use numeric id values as references when adding content to the index. The byte length of the passed ids influences memory consumption significantly. If this is not possible, consider using an index table that maps your ids to numeric indexes. This becomes especially important when using contextual indexes on a large amount of content.
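
Such an index table can be as small as a Map plus an array. The following is a minimal sketch under the assumption that your original ids are strings (for example UUIDs); the class `IdTable` and its method names are hypothetical and not part of FlexSearch.

```javascript
// Hypothetical lookup table that maps arbitrary keys to compact numeric ids.
class IdTable {
  constructor() {
    this.keyToId = new Map(); // original key -> numeric id
    this.idToKey = [];        // numeric id -> original key
  }
  // Returns the numeric id for a key and assigns the next free number if the key is new.
  idFor(key) {
    let id = this.keyToId.get(key);
    if (id === undefined) {
      id = this.idToKey.length;
      this.keyToId.set(key, id);
      this.idToKey.push(key);
    }
    return id;
  }
  // Translates a numeric id (e.g. from a search result) back to the original key.
  keyFor(id) {
    return this.idToKey[id];
  }
}

const ids = new IdTable();
// index.add(ids.idFor("user-8f3a"), "John Doe");
// const keys = index.search("John").map(id => ids.keyFor(id));
```

The table has to be kept alongside the index, since the numeric ids are only meaningful together with this mapping.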


Copyright 2018-2025 Thomas Wilkerling, Hosted by Nextapps GmbH
Released under the Apache 2.0 License