A powerful and flexible text search library for JavaScript that enables you to build a simple text search engine. It provides a set of classes to tokenize, parse, and interpret queries using a binary AST (Abstract Syntax Tree). The library supports various grouping operators (and/or/&/|) and any degree of parenthesis nesting.
- Tokenization of search queries
- Parsing to Abstract Syntax Trees (AST)
- Interpretation to evaluate search queries against text
- Normalization of text and query strings
- Abstract factory for easy extension
Install the package with:
npm install @basd/search
First, import the Search library:
import Search from '@basd/search'
or
const Search = require('@basd/search')
Here's how to create a simple search evaluator and use it.
const Search = require('@basd/search')
const search = new Search()
const evaluator = search.evaluator('apple AND orange')
const result = evaluator('I have an apple and an orange.')
// Returns true
Here's a basic example of how you can use @basd/search to perform a text search with the individual classes:
const { Tokenizer, Parser, Interpreter } = require('@basd/search')
const query = 'apple AND orange OR pear'
const tokenizer = new Tokenizer()
const tokens = tokenizer.tokenize(query)
const parser = new Parser(tokens)
const ast = parser.parse()
const interpreter = new Interpreter(ast)
const result = interpreter.interpret('apple orange') // true
Factory class to produce instances of Tokenizer, Parser, and Interpreter.
const factory = new SearchFactory(registry)
- createTokenizer(...args): Creates a SearchTokenizer instance.
- createParser(...args): Creates a SearchParser instance.
- createInterpreter(...args): Creates a SearchInterpreter instance.
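The factory idea can be illustrated with a small stand-in. This is only a sketch of the abstract factory pattern, not the library's SearchFactory: the Map-based registry, the registry keys, and every class name below (FactorySketch, TokenizerStub, and so on) are hypothetical and chosen for illustration.

```javascript
// Hypothetical stand-ins; the real SearchFactory and its registry may differ.
class TokenizerStub { constructor(options = {}) { this.options = options } }
class ParserStub { constructor(tokens = []) { this.tokens = tokens } }
class InterpreterStub { constructor(ast = null) { this.ast = ast } }

class FactorySketch {
  // Assumption: the registry maps a role name to a constructor.
  constructor(registry) {
    this.registry = registry
  }

  create(name, ...args) {
    const Ctor = this.registry.get(name)
    if (!Ctor) throw new Error(`No class registered for "${name}"`)
    return new Ctor(...args)
  }

  createTokenizer(...args) { return this.create('tokenizer', ...args) }
  createParser(...args) { return this.create('parser', ...args) }
  createInterpreter(...args) { return this.create('interpreter', ...args) }
}

// Swapping in a subclass only requires changing the registry entry.
class LoggingTokenizerStub extends TokenizerStub {}

const registry = new Map([
  ['tokenizer', LoggingTokenizerStub],
  ['parser', ParserStub],
  ['interpreter', InterpreterStub]
])
const sketchFactory = new FactorySketch(registry)
const sketchTokenizer = sketchFactory.createTokenizer({ verbose: true })
```

Because callers only ask the factory for "a tokenizer", replacing the registered class changes behavior everywhere without touching the calling code.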
Normalizes text to be used in tokenization and interpretation.
const normalizedText = SearchNormalizer.normalize('some text')
Tokenizes the normalized query.
const tokenizer = new SearchTokenizer()
const tokens = tokenizer.tokenize('apple AND orange')
Parses the tokens into an AST.
const parser = new SearchParser(tokens)
const ast = parser.parse()
Interprets the AST against a given text.
const interpreter = new SearchInterpreter(ast)
const result = interpreter.interpret('I have an apple.')
The main class, which combines tokenization, parsing, and interpretation behind a single interface.
const search = new Search()
- evaluator(needle): Returns an evaluator function for a given search query.
- evaluate(needle, haystack): Evaluates a search query against a given text.
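The difference between the two methods is easiest to see with a stand-in. The sketch below is not the library's Search class: SearchSketch is a hypothetical name, it handles only AND/OR without parentheses, and it assumes whole-word, case-insensitive matching. It shows why an evaluator function is handy: the query is processed once and the returned function is reused for many texts.

```javascript
// Hypothetical, simplified stand-in for illustration only.
class SearchSketch {
  evaluator(needle) {
    // Process the query once: OR-separated groups of AND-separated words.
    const groups = needle
      .toUpperCase()
      .split(/\s+OR\s+/)
      .map((group) => group.split(/\s+AND\s+/).map((w) => w.trim()).filter(Boolean))

    // The returned closure can be applied to any number of texts.
    return (haystack) => {
      const words = new Set(
        haystack
          .toUpperCase()
          .replace(/[^\p{L}\p{N}\s]/gu, '')
          .split(/\s+/)
          .filter(Boolean)
      )
      return groups.some((group) => group.every((word) => words.has(word)))
    }
  }

  // One-off convenience: build the evaluator and apply it immediately.
  evaluate(needle, haystack) {
    return this.evaluator(needle)(haystack)
  }
}

const sketchSearch = new SearchSketch()
const matchesFruit = sketchSearch.evaluator('apple AND orange')
const documents = [
  'I have an apple and an orange.',
  'Just an apple.',
  'Orange and apple pie'
]
const hits = documents.filter(matchesFruit)
```

An evaluator function fits naturally into Array.prototype.filter, while evaluate(needle, haystack) suits a single check.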
The library is designed to be easily extendable. You can extend SearchTokenizer, SearchParser, and SearchInterpreter to add functionality.
Normalizes text by removing punctuation, converting to uppercase, and replacing multiple spaces with a single space.
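As a rough illustration of those three steps, here is a minimal sketch. It is not the library's SearchNormalizer: normalizeSketch is a hypothetical name, "punctuation" is assumed to mean anything that is not a letter, digit, or whitespace, and the final trim is an addition.

```javascript
// Hypothetical helper, not the library's implementation.
function normalizeSketch(text) {
  return String(text)
    .replace(/[^\p{L}\p{N}\s]/gu, '') // drop punctuation (assumed definition)
    .toUpperCase()                    // case-insensitive comparison
    .replace(/\s+/g, ' ')             // collapse runs of whitespace
    .trim()                           // assumption: also strip the ends
}

console.log(normalizeSketch('I have an apple, and   an orange.'))
// prints "I HAVE AN APPLE AND AN ORANGE"
```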
Tokenizes a query into distinct elements such as words, operators, and parentheses.
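A tokenizer of this kind can be sketched with a single regular expression. The token shape ({ type, value }) and the name tokenizeSketch are assumptions made for this example; the library's own token format may differ.

```javascript
// Hypothetical sketch; token objects here are an assumed format.
function tokenizeSketch(query) {
  const tokens = []
  // Match a parenthesis, a symbolic operator, or a run of other characters.
  const pattern = /\(|\)|&|\||[^\s()&|]+/g
  for (const [lexeme] of query.matchAll(pattern)) {
    const upper = lexeme.toUpperCase()
    if (lexeme === '(') tokens.push({ type: 'LPAREN' })
    else if (lexeme === ')') tokens.push({ type: 'RPAREN' })
    else if (upper === 'AND' || lexeme === '&') tokens.push({ type: 'AND' })
    else if (upper === 'OR' || lexeme === '|') tokens.push({ type: 'OR' })
    else tokens.push({ type: 'WORD', value: upper })
  }
  return tokens
}

const sketchTokens = tokenizeSketch('(apple & orange) or pear')
console.log(sketchTokens.map((t) => t.value || t.type).join(' '))
// prints "LPAREN APPLE AND ORANGE RPAREN OR PEAR"
```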
Takes the tokens and turns them into a binary AST.
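One common way to build such a tree is recursive descent. The sketch below is not the library's SearchParser: parseSketch, the token and node shapes, and the precedence rule (AND binding tighter than OR) are all assumptions made for illustration.

```javascript
// Hypothetical sketch. Assumed grammar:
//   or      := and ('OR' and)*
//   and     := primary ('AND' primary)*
//   primary := WORD | '(' or ')'
function parseSketch(tokens) {
  let pos = 0
  const peek = () => tokens[pos]
  const next = () => tokens[pos++]

  function parseOr() {
    let left = parseAnd()
    while (peek() && peek().type === 'OR') {
      next()
      left = { type: 'OR', left, right: parseAnd() }
    }
    return left
  }

  function parseAnd() {
    let left = parsePrimary()
    while (peek() && peek().type === 'AND') {
      next()
      left = { type: 'AND', left, right: parsePrimary() }
    }
    return left
  }

  function parsePrimary() {
    const token = next()
    if (!token) throw new SyntaxError('Unexpected end of query')
    if (token.type === 'WORD') return { type: 'WORD', value: token.value }
    if (token.type === 'LPAREN') {
      const inner = parseOr()
      const close = next()
      if (!close || close.type !== 'RPAREN') throw new SyntaxError('Expected ")"')
      return inner
    }
    throw new SyntaxError(`Unexpected token ${token.type}`)
  }

  const ast = parseOr()
  if (pos < tokens.length) throw new SyntaxError(`Unexpected token ${tokens[pos].type}`)
  return ast
}

// Tokens for "apple AND orange OR pear" in the assumed format.
const sketchAst = parseSketch([
  { type: 'WORD', value: 'APPLE' },
  { type: 'AND' },
  { type: 'WORD', value: 'ORANGE' },
  { type: 'OR' },
  { type: 'WORD', value: 'PEAR' }
])
```

Every operator node has exactly two children (left and right), which is what makes the tree binary; parentheses leave no node of their own and only affect how the tree is shaped.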
Takes the AST and matches a given text string against it.
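Matching a text against a binary AST amounts to a recursive walk over the tree. The following sketch is not the library's SearchInterpreter: interpretSketch and the node shape are hypothetical, and it assumes whole-word matching after punctuation is stripped, which may not be how the library compares words.

```javascript
// Hypothetical sketch; node shape { type, left, right, value } is assumed.
function interpretSketch(ast, text) {
  // Assumption: a WORD node matches only a whole word in the text.
  const words = new Set(
    text
      .toUpperCase()
      .replace(/[^\p{L}\p{N}\s]/gu, '')
      .split(/\s+/)
      .filter(Boolean)
  )

  const visit = (node) => {
    switch (node.type) {
      case 'WORD': return words.has(node.value)
      case 'AND': return visit(node.left) && visit(node.right)
      case 'OR': return visit(node.left) || visit(node.right)
      default: throw new TypeError(`Unknown node type: ${node.type}`)
    }
  }
  return visit(ast)
}

// AST for "(apple AND orange) OR pear" in the assumed format.
const fruitAst = {
  type: 'OR',
  left: {
    type: 'AND',
    left: { type: 'WORD', value: 'APPLE' },
    right: { type: 'WORD', value: 'ORANGE' }
  },
  right: { type: 'WORD', value: 'PEAR' }
}

console.log(interpretSketch(fruitAst, 'apple orange'))     // prints true
console.log(interpretSketch(fruitAst, 'I have an apple.')) // prints false
```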
Takes a query string and returns an array of tokens.
Takes an array of tokens and returns a binary AST.
Takes a string of text and returns a boolean indicating whether it matches the AST.
In order to run the test suite, simply clone the repository and install its dependencies:
git clone https://gitlab.com/frenware/framework/plaindb/search.git
cd search
npm install
To run the tests:
npm test
Thank you! Please see our contributing guidelines for details.
If you find this project useful and want to help support further development, please send us some coin. We greatly appreciate any and all contributions. Thank you!
Bitcoin (BTC):
1JUb1yNFH6wjGekRUW6Dfgyg4J4h6wKKdF
Monero (XMR):
46uV2fMZT3EWkBrGUgszJCcbqFqEvqrB4bZBJwsbx7yA8e2WBakXzJSUK8aqT4GoqERzbg4oKT2SiPeCgjzVH6VpSQ5y7KQ
@basd/search is MIT licensed.