FurkanToprak/OkapiBM25

okapibm25

A strongly typed, well-tested implementation of the Okapi BM25 algorithm. Just provide your documents to search, query keywords, and (optionally) your weights (b and k1).

Installation

Check out the NPM package.

npm install okapibm25 --save

Usage

import { BM25 } from "okapibm25";

const documents = [
  "place",
  "documents",
  "here",
  "Each test document will be searched with the keywords specified below.",
];
const query = ["keywords", "of", "your", "query."];
// A numerical scoring will be returned.
const result = BM25(documents, query, { k1: 1.3, b: 0.9 }) as number[];
console.log(result);
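
If you only need a quick ranking from the plain number[] result, you can pair the scores with the documents yourself. The sketch below assumes the scores line up index-for-index with the input documents; rankByScore and the sample scores are hypothetical and not part of okapibm25.

```typescript
// Hypothetical helper (not part of okapibm25): pair each document with its
// score and return the pairs ordered from highest to lowest score.
type Ranked = { document: string; score: number };

function rankByScore(documents: string[], scores: number[]): Ranked[] {
  if (documents.length !== scores.length) {
    throw new Error("documents and scores must have the same length");
  }
  return documents
    .map((document, i) => ({ document, score: scores[i] }))
    .sort((a, b) => b.score - a.score);
}

// Made-up scores for illustration; in practice pass the number[] from BM25.
const exampleDocs = ["alpha", "beta", "gamma"];
const exampleScores = [0.2, 1.5, 0];
const ranked = rankByScore(exampleDocs, exampleScores);
console.log(ranked.map((r) => r.document)); // ["beta", "alpha", "gamma"]
```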

Sorting

A recent update allows you to sort your documents. This works much like JavaScript's Array.prototype.sort() function: you pass a comparator as the fourth argument.

Here is an example of how to sort in descending order (by score).

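// `corpuses` is your array of document strings.
const results = BM25(
  corpuses,
  ["relevant"],
  undefined, // no custom weights passed
  (firstEl, secondEl) => {
    // A positive return value places secondEl before firstEl.
    return secondEl.score - firstEl.score;
  }
) as BMDocument[];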

I've purposely given a schema that lets you sort results by more than just score; you could also sort alphabetically (or by how many times the word 'unicorn' is mentioned, for all I care!) by comparing the documents as well. You can even ignore scores while sorting!
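
To make that concrete, here is a rough sketch of two such comparators. ScoredDocument is a local stand-in for the { document, score } shape, and countWord, alphabetical, and byUnicorns are hypothetical names made up for this example; none of them come from the package.

```typescript
// Local stand-in for the { document, score } shape described in this README.
type ScoredDocument = { document: string; score: number };

// Count occurrences of a substring, case-insensitively.
function countWord(text: string, word: string): number {
  return text.toLowerCase().split(word.toLowerCase()).length - 1;
}

// Alphabetical by document text, ignoring the score entirely.
const alphabetical = (a: ScoredDocument, b: ScoredDocument): number =>
  a.document.localeCompare(b.document);

// Most mentions of "unicorn" first; ties broken by descending score.
const byUnicorns = (a: ScoredDocument, b: ScoredDocument): number =>
  countWord(b.document, "unicorn") - countWord(a.document, "unicorn") ||
  b.score - a.score;

const sample: ScoredDocument[] = [
  { document: "zebra facts", score: 0.9 },
  { document: "unicorn unicorn", score: 0.1 },
  { document: "a unicorn", score: 0.5 },
];
const alphaOrder = sample.slice().sort(alphabetical).map((d) => d.document);
const unicornOrder = sample.slice().sort(byUnicorns).map((d) => d.document);
console.log(alphaOrder); // ["a unicorn", "unicorn unicorn", "zebra facts"]
console.log(unicornOrder); // ["unicorn unicorn", "a unicorn", "zebra facts"]
```

Either comparator could be passed in the same position as the score comparator in the example above.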

Important: Note that enabling sorting changes the return type from number[] to { document: string; score: number; }[]
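
If you'd rather not cast with `as`, a type guard is one way to handle both shapes. This is only a sketch: BM25Result, isScoredDocuments, and topScore are hypothetical names defined locally, not exports of okapibm25.

```typescript
// Local stand-ins for the two return shapes described above.
type ScoredDocument = { document: string; score: number };
type BM25Result = number[] | ScoredDocument[];

// True when the result holds { document, score } objects.
// An empty array is ambiguous, so it is treated as "not scored documents".
function isScoredDocuments(result: BM25Result): result is ScoredDocument[] {
  return result.length > 0 && typeof result[0] !== "number";
}

// Highest score in either shape, or undefined for an empty result.
function topScore(result: BM25Result): number | undefined {
  if (result.length === 0) return undefined;
  if (isScoredDocuments(result)) {
    return Math.max(...result.map((r) => r.score));
  }
  return Math.max(...result);
}

console.log(topScore([1, 3, 2])); // 3
console.log(topScore([{ document: "a", score: 0.5 }])); // 0.5
```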

What's this?

An implementation of OkapiBM25 (AKA BM25), a bag-of-words information retrieval algorithm. Read up on it here.
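
For a feel of what the algorithm computes, here is a minimal textbook-style sketch of BM25 scoring. It is not this package's source: the tokenizer, the bm25Sketch name, and the default values k1 = 1.2 and b = 0.75 (commonly cited defaults) are assumptions made for illustration. Each query term contributes its inverse document frequency times a term-frequency factor that is dampened by k1 and normalized by document length through b.

```typescript
// Textbook BM25 sketch; NOT the okapibm25 package's source. The tokenizer
// and function names are illustrative assumptions.
const tokenize = (text: string): string[] =>
  text.toLowerCase().match(/\w+/g) ?? [];

function bm25Sketch(
  documents: string[],
  query: string[],
  k1 = 1.2,
  b = 0.75
): number[] {
  const docs = documents.map(tokenize);
  const N = docs.length;
  // Average document length in tokens.
  const avgdl = docs.reduce((sum, d) => sum + d.length, 0) / (N || 1);

  return docs.map((doc) => {
    let score = 0;
    for (const rawTerm of query) {
      const term = rawTerm.toLowerCase();
      // Term frequency in this document.
      const tf = doc.filter((t) => t === term).length;
      if (tf === 0) continue;
      // Number of documents containing the term.
      const n = docs.filter((d) => d.indexOf(term) !== -1).length;
      const idf = Math.log((N - n + 0.5) / (n + 0.5) + 1);
      const lengthNorm = 1 - b + b * (doc.length / avgdl);
      score += (idf * tf * (k1 + 1)) / (tf + k1 * lengthNorm);
    }
    return score;
  });
}

const sketchScores = bm25Sketch(
  ["the quick brown fox", "the lazy dog", "fox fox fox"],
  ["fox"]
);
console.log(sketchScores); // third document scores highest, second scores 0
```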

License

Under license.md

Contributing

Submit a Pull Request if you have a useful feature that you'd like to add. If you're too lazy or this isn't your area of expertise, open an issue and I'll get to it.

About

Tested and profiled implementation of the OkapiBM25 algorithm. Install the npm package. Now at 21K downloads per year!
