FlexSearch v0.8: Overview and Migration Guide
Basic Start • API Reference • Encoder • Document Search • Persistent Indexes • Using Worker • Tag Search • Resolver • Changelog
FlexSearch has been helping developers around the world build powerful, efficient search functionalities for years. Maintaining and improving the library requires significant time and resources. If you’ve found this project valuable and would like to support it, please consider donating. Thanks a lot for your continued support!
Antithesis Operations LLC
FlexSearch performs queries up to 1,000,000 times faster compared to other libraries, while also providing powerful search capabilities like multi-field search (document search), phonetic transformations, partial matching, tag search and suggestions.
Bigger workloads are scalable through Workers, which perform updates or queries on the index in parallel through dedicated balanced threads.
The latest generation v0.8 introduces Persistent Indexes, well optimized for scaling large datasets and running in parallel. All available features were natively ported right into the database engine of your choice.
FlexSearch was nominated by GitNation for the "Best Technology of the Year".
Supported Platforms:
- Browser
- Node.js
Supported Databases:
- InMemory (Default)
- IndexedDB (Browser)
- Redis
- SQLite
- Postgres
- MongoDB
- Clickhouse
Supported Charsets:
- Latin
- Chinese, Korean, Japanese (CJK)
- Hindi
- Arabic
- Cyrillic
- Greek and Coptic
- Hebrew
Common Code Examples:
- Node.js: Module (ESM)
- Node.js: CommonJS
- Browser: Module (ESM)
- Browser: Legacy Script
Demos:
Latest Benchmark Results
The benchmark was measured in terms per second; higher values are better (except for the test "Memory"). The memory value refers to the amount of memory which was additionally allocated during the search.
Library | Memory | Query: Single | Query: Multi | Query: Large | Query: Not Found |
---|---|---|---|---|---|
flexsearch | 16 | 50955718 | 11912730 | 13981110 | 51706499 |
jsii | 2188 | 13847 | 949559 | 1635959 | 3730307 |
wade | 980 | 60473 | 443214 | 419152 | 1239372 |
js-search | 237 | 22982 | 383775 | 426609 | 994803 |
minisearch | 4777 | 30589 | 191657 | 5849 | 304233 |
orama | 5355 | 29445 | 170231 | 4454 | 225491 |
elasticlunr | 3073 | 14326 | 48558 | 101206 | 95840 |
lunr | 2443 | 11527 | 51476 | 88858 | 103386 |
ufuzzy | 13754 | 2799 | 7788 | 58544 | 9557 |
bm25 | 33963 | 3903 | 4777 | 12657 | 12471 |
fuzzysearch | 300147 | 148 | 229 | 455 | 276 |
fuse | 247107 | 422 | 321 | 337 | 329 |
Run Comparison: Performance Benchmark "Gulliver's Travels"
External Projects & Plugins:
- React: https://github.com/angeloashmore/react-use-flexsearch
- Vue: https://github.com/Noction/vue-use-flexsearch
- Gatsby: https://www.gatsbyjs.org/packages/gatsby-plugin-flexsearch/
Tip
Spending just 5 minutes to understand these 3 elementary concepts of FlexSearch will significantly improve your results: Tokenizer, Encoder and Suggestions.
- Load Library (Node.js, ESM, Legacy Browser)
- Basic Usage and Variants
- API Overview
- Options
- Context Search
- Document Search (Multi-Field Search)
- Multi-Tag Search
- Phonetic Search (Fuzzy Search)
- Tokenizer (Partial Search)
- Encoder
- Universal Charset Collection
- Latin Charset Encoder Presets
- Language Specific Preset
- Custom Encoder
- Non-Blocking Runtime Balancer (Async)
- Worker Indexes
- Resolver (Complex Queries)
- Boolean Operations (and, or, xor, not)
- Boost
- Limit / Offset
- Resolve
- Export / Import Indexes
- Persistent Indexes
- Result Highlighting
- Common Code Examples (Browser, Node.js)
npm install flexsearch
The dist folder is located in: node_modules/flexsearch/dist/
Download Builds
Compare Bundles: Light, Compact, Bundle
The Node.js package includes all features from flexsearch.bundle.js.
Feature | flexsearch.bundle.js | flexsearch.compact.js | flexsearch.light.js |
---|---|---|---|
Presets | ✓ | ✓ | ✓ |
Async Processing | ✓ | ✓ | - |
Workers (Web + Node.js) | ✓ | - | - |
Context Search | ✓ | ✓ | ✓ |
Document Search | ✓ | ✓ | - |
Document Store | ✓ | ✓ | - |
Partial Matching | ✓ | ✓ | ✓ |
Relevance Scoring | ✓ | ✓ | ✓ |
Auto-Balanced Cache by Popularity/Last Queries | ✓ | ✓ | - |
Tag Search | ✓ | ✓ | - |
Suggestions | ✓ | ✓ | ✓ |
Phonetic Search (Fuzzy Search) | ✓ | ✓ | - |
Encoder | ✓ | ✓ | ✓ |
Export / Import Indexes | ✓ | ✓ | - |
Resolver | ✓ | - | - |
Persistent Index (IndexedDB) | ✓ | - | - |
File Size (gzip) | 14.0 kb | 9.0 kb | 4.4 kb |
Tip
All debug versions provide debug information through the console and give you helpful advice in certain situations. Do not use them in production, since they are special builds containing extra debugging processes which noticeably reduce performance.
The abbreviations used at the end of the filenames indicate:
- bundle: All features included, FlexSearch is available on window.FlexSearch
- light: Only basic features are included, FlexSearch is available on window.FlexSearch
- es5: The bundle has support for EcmaScript5, FlexSearch is available on window.FlexSearch
- module: Indicates that this bundle is a JavaScript module (ESM), FlexSearch members are available by import { Index, Document, Worker, Encoder, Charset } from "./flexsearch.bundle.module.min.js" or alternatively using the default export import FlexSearch from "./flexsearch.bundle.module.min.js"
- min: The bundle is minified
- debug: The bundle has debug mode enabled and contains additional code just for debugging purposes (do not use for production)
Non-Module Bundles export all their features to the public namespace "FlexSearch", e.g. window.FlexSearch.Index or window.FlexSearch.Document.
Load the bundle by a script tag:
<script src="dist/flexsearch.bundle.min.js"></script>
<script>
// ... access FlexSearch
var Index = window.FlexSearch.Index;
var index = new Index(/* ... */);
</script>
FlexSearch Members are accessible on:
var Index = window.FlexSearch.Index;
var Document = window.FlexSearch.Document;
var Encoder = window.FlexSearch.Encoder;
var Charset = window.FlexSearch.Charset;
var Resolver = window.FlexSearch.Resolver;
var Worker = window.FlexSearch.Worker;
var IdxDB = window.FlexSearch.IndexedDB;
// only exported by non-module builds:
var Language = window.FlexSearch.Language;
Load language packs:
<!-- English: -->
<script src="dist/lang/en.min.js"></script>
<!-- German: -->
<script src="dist/lang/de.min.js"></script>
<!-- French: -->
<script src="dist/lang/fr.min.js"></script>
<script>
var EnglishEncoderPreset = window.FlexSearch.Language.en;
var GermanEncoderPreset = window.FlexSearch.Language.de;
var FrenchEncoderPreset = window.FlexSearch.Language.fr;
</script>
When using modules you can choose from 2 variants: flexsearch.xxx.module.min.js has all features bundled and ready for production, whereas the folder /dist/module/ exports all the features in the same structure as the source code, but with compiler flags already resolved.
Also, for each variant there exist:
- A debug version for the development
- A pre-compiled minified version for production
Use the bundled version exported as a module (default export):
<script type="module">
import FlexSearch from "./dist/flexsearch.bundle.module.min.js";
const index = new FlexSearch.Index(/* ... */);
</script>
Or import FlexSearch members separately by:
<script type="module">
import { Index, Document, Encoder, Charset, Resolver, Worker, IndexedDB }
from "./dist/flexsearch.bundle.module.min.js";
const index = new Index(/* ... */);
</script>
Use bundled style on non-bundled modules:
<script type="module">
import { Index, Document, Encoder, Charset, Resolver, Worker, IndexedDB }
from "./dist/module/bundle.js";
const index = new Index(/* ... */);
</script>
Use non-bundled modules by file default exports:
<script type="module">
import Index from "./dist/module/index.js";
import Document from "./dist/module/document.js";
import Encoder from "./dist/module/encoder.js";
import Charset from "./dist/module/charset.js";
import Resolver from "./dist/module/resolver.js";
import Worker from "./dist/module/worker.js";
import IndexedDB from "./dist/module/db/indexeddb/db.js";
const index = new Index(/* ... */);
</script>
Language packs are accessible via:
import EnglishEncoderPreset from "./dist/module/lang/en.js";
import GermanEncoderPreset from "./dist/module/lang/de.js";
import FrenchEncoderPreset from "./dist/module/lang/fr.js";
Also, pre-compiled non-bundled production-ready modules are located in dist/module-min/, whereas the debug version is located in dist/module-debug/.
You can also load modules via CDN:
<script type="module">
import Index from "https://unpkg.com/flexsearch@0.8.1/dist/module/index.js";
const index = new Index(/* ... */);
</script>
Install FlexSearch via NPM:
npm install flexsearch
Use the default export:
const FlexSearch = require("flexsearch");
const index = new FlexSearch.Index(/* ... */);
Or require FlexSearch members separately by:
const { Index, Document, Encoder, Charset, Resolver, Worker, IndexedDB } = require("flexsearch");
const index = new Index(/* ... */);
When using ESM instead of CommonJS:
import { Index, Document, Encoder, Charset, Resolver, Worker, IndexedDB } from "flexsearch";
const index = new Index(/* ... */);
Language packs are accessible via:
const EnglishEncoderPreset = require("flexsearch/lang/en");
const GermanEncoderPreset = require("flexsearch/lang/de");
const FrenchEncoderPreset = require("flexsearch/lang/fr");
Persistent Connectors are accessible via:
const Postgres = require("flexsearch/db/postgres");
const Sqlite = require("flexsearch/db/sqlite");
const MongoDB = require("flexsearch/db/mongodb");
const Redis = require("flexsearch/db/redis");
const Clickhouse = require("flexsearch/db/clickhouse");
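As a rough sketch of how such a connector might be wired up (the connector's constructor arguments used here are assumptions; mount and commit are the documented async methods, see Persistent Indexes):
// minimal persistent sketch (connection details are placeholders)
const { Index } = require("flexsearch");
const Postgres = require("flexsearch/db/postgres");
(async function() {
  const index = new Index();
  const db = new Postgres("my-store"); // name/config: assumption
  await index.mount(db);               // attach the persistent storage
  index.add(1, "John Doe");
  await index.commit();                // persist pending changes
})();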
There are 3 types of indexes:
- Index is a flat high-performance index which stores id-content-pairs.
- Worker / WorkerIndex is also a flat index which stores id-content-pairs, but runs in the background as a dedicated worker thread.
- Document is a multi-field index which can store complex JSON documents (and could also consist of worker indexes).
Most of you will probably need just one of them, depending on your scenario. Any of these 3 index types can be upgraded to a persistent index.
The Worker instance inherits from the type Index and basically works like a standard FlexSearch Index. A document index is a complex register which automatically operates on several of those standard indexes in parallel. Worker support in documents is enabled by passing the appropriate option during creation, e.g. { worker: true }.
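For example, a minimal sketch of a document index with worker support enabled (the field configuration shown is just an illustrative assumption):
const document = new Document({
  worker: true,                                // run the field indexes in dedicated worker threads
  document: { index: ["title", "content"] }    // illustrative field setup
});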
index.add(id, text);
const result = index.search(text, options);
worker.add(id, text);
const result = worker.search(text, options);
document.add(doc);
const result = document.search(text, options);
Each of these index types optionally has a persistent model. So a persistent index isn't a new 4th index type; instead it extends the existing ones.
Every method called on a Worker index is treated as async. You will get back a Promise, or you can alternatively provide a callback function as the last parameter.
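A minimal sketch of both styles, assuming an existing worker index (see the snippet above); the awaited calls belong inside an async function:
// inside an async function:
await worker.add(1, "some text to index");      // resolves once the thread has indexed it
const results = await worker.search("text");    // Promise style
// alternatively pass a callback as the last parameter:
worker.search("text", function(results) { /* ... */ });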
The documentation will refer to several examples. A list of all examples:
Examples Node.js (CommonJS)
Examples Node.js (ESM/Module)
Examples Browser (Legacy)
Examples Browser (ESM/Module)
Constructors:
- new Index(<options>) : index
- new Document(options) : document
- new Worker(<options>) : worker
- new Encoder(<options>, <options>, ...) : encoder
- new Resolver(<options>) : resolver
- new IndexedDB(<options>) : indexeddb
Global Members:
Index / Worker-Index Methods:
- index.add(id, string)
- index.append(id, string)
- index.update(id, string)
- index.remove(id)
- index.search(string, <limit>, <options>)
- index.search(options)
- index.searchCache(...)
- index.contain(id)
- index.clear()
- index.cleanup()
- async index.export(handler)
- async index.import(key, data)
- async index.serialize(boolean)
- async index.mount(db)
- async index.commit(boolean)
- async index.destroy()
Document Methods:
- document.add(<id>, document)
- document.append(<id>, document)
- document.update(<id>, document)
- document.remove(id)
- document.remove(document)
- document.search(string, <limit>, <options>)
- document.search(options)
- document.searchCache(...)
- document.contain(id)
- document.clear()
- document.cleanup()
- document.get(id)
- document.set(<id>, document)
- async document.export(handler)
- async document.import(key, data)
- async document.mount(db)
- async document.commit(boolean)
- async document.destroy()
Document Properties:
- document.store
Async Equivalents (Non-Blocking Balanced):
- async .addAsync( ... , <callback>)
- async .appendAsync( ... , <callback>)
- async .updateAsync( ... , <callback>)
- async .removeAsync( ... , <callback>)
- async .searchAsync( ... , <callback>)
Async methods will return a Promise; additionally, you can pass a callback function as the last parameter.
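A minimal sketch of both styles, assuming an existing index; the awaited calls belong inside an async function:
// inside an async function (Promise style):
await index.addAsync(1, "John Doe");
const results = await index.searchAsync("John");
// callback style (callback passed as the last parameter):
index.searchAsync("John", function(results) {
  console.log(results);
});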
The methods export and import are always async, as is every method you call on a Worker-based or Persistent Index.
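A minimal export/import sketch, assuming the export handler receives key/data pairs (mirroring the import signature listed above) and using a plain in-memory object as the storage target:
// inside an async function:
const storage = {};
await index.export(function(key, data) {
  storage[key] = data;                  // persist each exported chunk somewhere
});
// later: feed every stored key/data pair back into a fresh index
const restored = new Index(/* same options as the exported index */);
for (const key of Object.keys(storage)) {
  await restored.import(key, storage[key]);
}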
Encoder Methods:
- encoder.encode(string)
- encoder.assign(options, <options>, ...)
- encoder.addFilter(string)
- encoder.addStemmer(string => boolean)
- encoder.addMapper(char, char)
- encoder.addMatcher(string, string)
- encoder.addReplacer(regex, string)
Resolver Methods:
- resolver.and(options)
- resolver.or(options)
- resolver.xor(options)
- resolver.not(options)
- resolver.boost(number)
- resolver.limit(number)
- resolver.offset(number)
- resolver.resolve(<options>)
Resolver Properties:
- resolver.result
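A hedged sketch of chaining the Resolver methods listed above; the exact shape of the options object, here { index, query }, is an assumption (see the Resolver docs):
const resolver = new Resolver({ index: index, query: "cat" })
  .and({ index: index, query: "dog" })   // intersect with a second query
  .boost(2)                              // boost the combined result
  .limit(10)
  .offset(0);
const result = resolver.resolve();       // resolve into the final result set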
StorageInterface Methods:
- async db.mount(index, <options>)
- async db.open()
- async db.close()
- async db.destroy()
- async db.clear()
Charset Encoder Preset:
- Charset.Exact
- Charset.Default
- Charset.Normalize
- Charset.LatinBalance
- Charset.LatinAdvanced
- Charset.LatinExtra
- Charset.LatinSoundex
Language Encoder Preset:
Options:
- Index Options
- Context Options
- Document Options
- Encoder Options
- Resolver Options
- Search Options
- Document Search Options
- Worker Options
- Persistent Options
The tokenizer is one of the most important options and heavily influences:
- required memory / storage
- capabilities of partial matches
Tip
If you want an indexed term like "flexsearch" to be found when just typing "flex" or "search", then this is achieved by choosing an appropriate tokenizer.
Try to choose the uppermost of these tokenizers which covers your requirements:
Option | Description | Example | Memory Factor (n = length of term) |
---|---|---|---|
"strict" "exact" "default" | index the full term | foobar | * 1 |
"forward" | index term in forward direction (supports right-to-left by Index option rtl: true) | `fo`obar `foob`ar | * n |
"reverse" "bidirectional" | index term in both directions | `fo`obar `foob`ar foob`ar` fo`obar` | * 2n - 1 |
"full" | index every consecutive partial | `fooba`r f`oob`ar | * n * (n - 1) |
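For example, a minimal sketch of the partial matching described in the tip above:
// "forward" indexes term prefixes, so a query for "flex"
// will match the indexed term "flexsearch"
var index = new Index({ tokenize: "forward" });
index.add(1, "flexsearch");
index.search("flex"); // -> [1]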
Encoding is one of the most important tasks and heavily influences:
- required memory / storage
- capabilities of phonetic matches (Fuzzy-Search)
Option | Description | Compression Ratio |
---|---|---|
Exact | Bypass encoding and take the exact input | 0% |
Default | Case-insensitive encoding | 3% |
Normalize | Case-insensitive encoding + charset normalization | ~ 7% |
LatinBalance | Case-insensitive encoding + charset normalization + basic phonetic transformation | ~ 30% |
LatinAdvanced | Case-insensitive encoding + charset normalization + advanced phonetic transformation | ~ 45% |
LatinExtra | Case-insensitive encoding + charset normalization + Soundex-like transformation | ~ 60% |
LatinSoundex | Full Soundex transformation | ~ 70% |
function(str) => [str] | Pass a custom encoding function to the Encoder | |
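For example, a minimal sketch of applying one of these presets; passing the preset via an encoder option and the custom encode option name are assumptions based on the table above (Index, Encoder and Charset imported as shown earlier):
// apply a built-in charset preset (option name "encoder" is an assumption):
var index = new Index({ encoder: Charset.LatinBalance });
// or pass a custom encoding function of the form function(str) => [str]:
var encoder = new Encoder({
  encode: function(str) { return str.toLowerCase().split(/\s+/); }
});
var index2 = new Index({ encoder: encoder });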
Create a new index:
var index = new Index();
Create a new index and choose one of the presets:
var index = new Index("performance");
Create a new index with custom options:
var index = new Index({
charset: "latin:extra",
tokenize: "reverse",
resolution: 9
});
Create a new index and extend a preset with custom options:
var index = new Index({
preset: "memory",
tokenize: "forward",
resolution: 5
});
The resolution refers to the maximum count of scoring slots into which the content is divided.
A formula to determine a well-balanced value for the resolution is: $2 \cdot \lfloor\sqrt{content.length}\rfloor$, where content is the value pushed by index.add(). Here the maximum length of all contents should be used.
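As a quick worked sketch (the helper below is hypothetical, not part of the API):
// derive a balanced resolution from the longest content to be indexed
var contents = ["John Doe", "Max Miller", "Jane van der Berg"];
var maxLength = Math.max.apply(null, contents.map(function(str){ return str.length; }));
var resolution = 2 * Math.floor(Math.sqrt(maxLength));
// maxLength = 17 -> resolution = 2 * floor(4.12) = 8
var index = new Index({ resolution: resolution });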
See all available custom options.
Every content which should be added to the index needs an ID. When your content has no ID, you need to create one by passing an index, a count or something else as an ID (a value of type number is highly recommended). Those IDs are unique references to a given content. This is important when you update or overwrite content through existing IDs. When referencing is not a concern, you can simply use something like count++.
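A minimal sketch, assuming an array of texts to index:
var count = 0;
for (var i = 0; i < texts.length; i++) {
  // numeric, incrementing ids keep the memory footprint low
  index.add(count++, texts[i]);
}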
Index.add(id, string)
index.add(0, "John Doe");
Index.search(string | options, <limit>, <options>)
index.search("John");
Limit the result:
index.search("John", 10);
You can check if an ID was already indexed by:
if(index.contain(1)){
console.log("ID is already in index");
}
Index.update(id, string)
index.update(0, "Max Miller");
Index.remove(id)
index.remove(0);
Simply chain methods like:
var index = Index.create().addMatcher({'â': 'a'}).add(0, 'foo').add(1, 'bar');
index.remove(0).update(1, 'foo').add(2, 'foobar');
The basic idea of this concept is to limit relevance by its context instead of calculating relevance across the whole distance of its corresponding document. The context acts like a bidirectional moving window of 2 pointers (terms) which can initially have a maximum distance of the value passed via the option depth and grows dynamically during search when the query did not match any results.
Create an index and use the default context:
var index = new Index({
tokenize: "strict",
context: true
});
Create an index and apply custom options for the context:
var index = new Index({
tokenize: "strict",
context: {
resolution: 5,
depth: 3,
bidirectional: true
}
});
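For example, a minimal sketch of how the context affects matching (the indexed sentence and the scoring comment are purely illustrative):
var index = new Index({ context: { depth: 2, bidirectional: true } });
index.add(1, "zero tolerance is the strictest policy");
// "zero" and "tolerance" occur within the configured depth of each other,
// so this combined query scores the entry highly:
index.search("zero tolerance");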
Only the tokenizer "strict" is currently supported by the contextual index.
The contextual index requires an additional amount of memory, depending on the depth.
The book "Gulliver's Travels" (Swift Jonathan 1726) was indexed for this test.
by default a lexical index is very small:
depth: 0, bidirectional: 0, resolution: 3, minlength: 0
=> 2.1 Mb
a higher resolution will increase the memory allocation:
depth: 0, bidirectional: 0, resolution: 9, minlength: 0
=> 2.9 Mb
using the contextual index will increase the memory allocation:
depth: 1, bidirectional: 0, resolution: 9, minlength: 0
=> 12.5 Mb
a higher contextual depth will increase the memory allocation:
depth: 2, bidirectional: 0, resolution: 9, minlength: 0
=> 21.5 Mb
a higher minlength will decrease memory allocation:
depth: 2, bidirectional: 0, resolution: 9, minlength: 3
=> 19.0 Mb
using bidirectional will decrease memory allocation:
depth: 2, bidirectional: 1, resolution: 9, minlength: 3
=> 17.9 Mb
enabling the option "fastupdate" will increase memory allocation:
depth: 2, bidirectional: 1, resolution: 9, minlength: 3
=> 6.3 Mb
- memory: primarily optimized for a small memory footprint
- performance: primarily optimized for high performance
- match: primarily optimized for matching capabilities
- score: primarily optimized for scoring capabilities (order of results)
- default: the default balanced profile
These profiles cover standard use cases. To get the best out of FlexSearch, it is recommended to apply a custom configuration instead of using profiles. Every profile could be further optimized for its specific task, e.g. an extremely performance-optimized or an extremely memory-optimized configuration.
You can pass a preset during creation/initialization of the index.
It is recommended to use numeric id values as references when adding content to the index. The byte length of the passed ids influences the memory consumption significantly. If this is not possible, you should consider using an index table and mapping the ids to numeric indexes; this becomes especially important when using contextual indexes on a large amount of content.
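A minimal sketch of such an index table (the helper addWithTable is hypothetical), mapping arbitrary external ids to compact numeric indexes:
var ids = [];                                   // numeric index -> original id
function addWithTable(index, originalId, text) {
  index.add(ids.push(originalId) - 1, text);    // store under the compact numeric id
}
addWithTable(index, "user:48f2c", "John Doe");
// map numeric results back to the original ids:
var result = index.search("John").map(function(num){ return ids[num]; });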
Copyright 2018-2025 Thomas Wilkerling, Hosted by Nextapps GmbH
Released under the Apache 2.0 License