# BM25S

>[BM25S (blog post)](https://huggingface.co/blog/xhluca/bm25s) is a fast implementation of the classic BM25 ranking algorithm. It gets its speed from sparse matrix operations rather than dense computations.
>
>The `BM25SRetriever` uses the [`bm25s`](https://github.com/xhluca/bm25s) package.

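To make the "sparse instead of dense" idea concrete, here is a minimal pure-Python sketch, not the `bm25s` implementation. It assumes one common BM25 variant with `k1=1.5` and `b=0.75`, and a hypothetical `tokenize` helper. Scores are computed once at indexing time and only nonzero (token, document) entries are stored, so a query reduces to a few lookups and additions.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Hypothetical tokenizer: split on whitespace, strip basic punctuation, lowercase.
    tokens = (t.strip(".,?!").lower() for t in text.split())
    return [t for t in tokens if t]


def build_index(corpus, k1=1.5, b=0.75):
    """Precompute BM25 scores, storing only nonzero (token, doc) entries."""
    docs = [tokenize(text) for text in corpus]
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Number of documents that contain each token.
    doc_freq = Counter(token for d in docs for token in set(d))

    index = defaultdict(dict)  # token -> {doc_id: score}
    for doc_id, tokens in enumerate(docs):
        for token, tf in Counter(tokens).items():
            df = doc_freq[token]
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            norm = tf + k1 * (1 - b + b * len(tokens) / avg_len)
            index[token][doc_id] = idf * tf * (k1 + 1) / norm
    return index


def search(index, query, n_docs):
    """Sum the precomputed scores of every query token for each document."""
    scores = [0.0] * n_docs
    for token in tokenize(query):
        for doc_id, score in index.get(token, {}).items():
            scores[doc_id] += score
    return scores


corpus = ["I have a pen.", "Do you have a pen?", "I have a bag."]
index = build_index(corpus)
print(search(index, "pen", len(corpus)))
```

The third text never mentions "pen", so it has no entry for that token and keeps a score of zero. The first text outranks the second because it is shorter.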

In [None]:
%pip install --upgrade --quiet bm25s

# If you want to use stemming for better results, you can install a stemmer
%pip install --upgrade --quiet PyStemmer

# To speed up the top-k selection process, you can install `jax`
%pip install --upgrade --quiet "jax[cpu]"

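The top-k selection mentioned in the install cell is the step that picks the `k` highest-scoring documents once every document has a score. The sketch below only illustrates the idea with the standard library; it is not how `bm25s` or `jax` implement it, and `top_k` is a hypothetical helper name.

```python
import heapq


def top_k(scores, k):
    """Return (doc_id, score) pairs for the k highest scores, best first."""
    return heapq.nlargest(k, enumerate(scores), key=lambda pair: pair[1])


scores = [0.1, 0.9, 0.4, 0.7]
print(top_k(scores, 2))  # [(1, 0.9), (3, 0.7)]
```

`heapq.nlargest` avoids sorting the full score list, which matters when the corpus is large and `k` is small.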
In [None]:
from langchain_community.retrievers import BM25SRetriever

## Create a New Retriever with Texts

In [None]:
retriever = BM25SRetriever.from_texts(
    texts=["I have a pen.", "Do you have a pen?", "I have a bag."]
)

## Create a New Retriever with Documents

You can also create a retriever from a list of `Document` objects, which lets you attach metadata to each text.

In [None]:
from langchain_core.documents import Document

retriever = BM25SRetriever.from_documents(
    [
        Document(page_content="foo", metadata={"id": 1}),
        Document(page_content="bar", metadata={"id": 2}),
        Document(page_content="world", metadata={"id": 3}),
        Document(page_content="hello", metadata={"id": 4}),
        Document(page_content="foo bar", metadata={"id": 5}),
    ]
)

## Create a New Retriever That Is Persisted

You can also create a retriever that is persisted to disk, then reload it later without re-indexing the documents.

In [None]:
retriever = BM25SRetriever.from_documents(
    [
        Document(page_content="foo", metadata={"id": 1}),
        Document(page_content="bar", metadata={"id": 2}),
        Document(page_content="world", metadata={"id": 3}),
        Document(page_content="hello", metadata={"id": 4}),
        Document(page_content="foo bar"),
    ],
    persist_directory="bm25s_retriever",
)

retriever = BM25SRetriever.load("bm25s_retriever", mmap=True)

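Passing `mmap=True` asks for the saved index to be memory-mapped, so data is read from disk on demand instead of being loaded into RAM all at once. The standard-library sketch below shows the general technique only. The file layout and the `read_score` helper are hypothetical and unrelated to the format `bm25s` writes.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical stand-in for a persisted index: 1,000 float64 scores in a flat file.
scores = [i * 0.5 for i in range(1000)]
path = os.path.join(tempfile.mkdtemp(), "scores.bin")
with open(path, "wb") as f:
    f.write(struct.pack(f"{len(scores)}d", *scores))


def read_score(path, position):
    """Read one float64 through a memory map instead of reading the whole file."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            (value,) = struct.unpack_from("d", mm, position * 8)
            return value


print(read_score(path, 10))  # 5.0
```

Only the pages that are actually touched get read from disk, which keeps memory usage low for large indexes at the cost of slower first access.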
## Use Retriever

We can now use the retriever!

In [None]:
result = retriever.invoke("foo bar")

In [None]:
result