<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/docs/examples/vector_stores/MilvusFullTextSearchDemo.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Using Full-Text Search with LlamaIndex and Milvus

[Full-text search](https://milvus.io/docs/full-text-search.md#Full-Text-Search) retrieves documents by exact keyword matching. Unlike semantic search, which matches on meaning and context, full-text search ranks documents by the query terms they contain using the BM25 algorithm, a ranking method that is particularly useful in Retrieval-Augmented Generation (RAG) applications.

With [Milvus 2.5](https://milvus.io/blog/introduce-milvus-2-5-full-text-search-powerful-metadata-filtering-and-more.md)'s Sparse-BM25 approach, raw text is automatically transformed into sparse vectors, eliminating manual embedding generation and enabling a hybrid search strategy that combines semantic understanding with keyword relevance.

This tutorial will demonstrate how to implement full-text search using LlamaIndex and Milvus, creating a sophisticated search system that bridges semantic understanding with keyword precision.

> Before proceeding with this tutorial, ensure you have a basic understanding of [full-text search](https://milvus.io/docs/full-text-search.md#Full-Text-Search) and the [basic usage](https://docs.llamaindex.ai/en/stable/examples/vector_stores/MilvusIndexDemo/) of Milvus vector store in LlamaIndex.

## Prerequisites

**Install dependencies**

To get ready for this tutorial, make sure you have the following dependencies installed:

In [None]:
! pip install llama-index-vector-stores-milvus
! pip install llama-index-embeddings-openai
! pip install llama-index-llms-openai

> If you're using Google Colab, you may need to **restart the runtime** (Navigate to the "Runtime" menu at the top of the interface, and select "Restart session" from the dropdown menu.)

**Set up accounts**

We will use OpenAI services to generate text embeddings and answers, so you will need an [OpenAI API key](https://platform.openai.com/api-keys).

In [None]:
import openai

openai.api_key = "sk-"

To use the Milvus vector store, specify your Milvus server `URI` (and, optionally, a `TOKEN`). To start a Milvus server, follow the [Milvus documentation](https://milvus.io/docs/install-overview.md) for installation, or simply [register with Zilliz Cloud](https://docs.zilliz.com/docs/register-with-zilliz-cloud).

> Full-text search is currently available in Milvus Standalone, Milvus Distributed, and Zilliz Cloud. It is not yet supported in Milvus Lite, where the feature is planned for a future release. Reach out to support@zilliz.com for more information.

In [None]:
URI = "http://localhost:19530"
# TOKEN = ""

**Download example data**

The following commands will download example documents to the relative directory "data/paul_graham":

In [None]:
! mkdir -p 'data/paul_graham/'
! wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'

--2025-03-26 08:19:15--  https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.111.133, 185.199.110.133, 185.199.109.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.111.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 75042 (73K) [text/plain]
Saving to: ‘data/paul_graham/paul_graham_essay.txt’


2025-03-26 08:19:16 (1.09 MB/s) - ‘data/paul_graham/paul_graham_essay.txt’ saved [75042/75042]



## RAG with Full-Text Search

Full-text search enhances RAG systems by enabling precise keyword matching across large document collections. This approach helps to find the most relevant information quickly, leading to more accurate and contextually grounded responses.

Use `SimpleDirectoryReader` to load Paul Graham's essay "What I Worked On":

In [None]:
from llama_index.core import SimpleDirectoryReader

documents = SimpleDirectoryReader("./data/paul_graham/").load_data()

# Let's take a look at the first document
print("Example document:\n", documents[0])

Example document:
 Doc ID: 195a1e8a-7c19-4425-92d8-074db2859fc2
Text: What I Worked On  February 2021  Before college the two main
things I worked on, outside of school, were writing and programming. I
didn't write essays. I wrote what beginning writers were supposed to
write then, and probably still are: short stories. My stories were
awful. They had hardly any plot, just characters with strong feelings,
which I ...


### BM25 only

LlamaIndex's `MilvusVectorStore` introduces a powerful full-text search capability through sparse field indexing. By integrating a built-in function as the `sparse_embedding_function`, the system can rank text fields using the BM25 algorithm.
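
To make the ranking step more concrete, here is a minimal plain-Python sketch of Okapi BM25 scoring over pre-tokenized documents. It is an illustration only, not the code Milvus runs: the function name `bm25_scores`, the IDF variant, and the values `k1=1.2` and `b=0.75` are assumptions chosen because they are common in textbook presentations.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document against a tokenized query (hypothetical helper)."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: how many documents contain each term
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Rare terms get a higher inverse document frequency
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates with k1 and is normalized by document length with b
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "the author wrote essays about lisp".split(),
    "the author painted still lifes".split(),
    "startups need funding".split(),
]
scores = bm25_scores("lisp essays".split(), docs)
best = max(range(len(docs)), key=scores.__getitem__)
print(best, scores)
```

Only the first document contains the query terms, so it is the only one with a non-zero score. Documents that share no terms with the query score zero, which is why BM25 behaves like precise keyword matching.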

In this section, we will demonstrate how to implement RAG with full-text search using BM25.

In [None]:
from llama_index.core import VectorStoreIndex, StorageContext
from llama_index.vector_stores.milvus import MilvusVectorStore
from llama_index.vector_stores.milvus.utils import BM25BuiltInFunction
from llama_index.core import Settings

# Skip dense embedding model
Settings.embed_model = None

# Build Milvus vector store creating a new collection
vector_store = MilvusVectorStore(
    uri=URI,
    # token=TOKEN,
    enable_dense=False,
    enable_sparse=True,
    sparse_embedding_function=BM25BuiltInFunction(),
    overwrite=True,
)

# Store documents in Milvus
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(
    documents, storage_context=storage_context
)

2025-03-26 08:19:21,240 [DEBUG][_create_connection]: Created new connection using: d1d2f906f03c45589092809db07c8ac4 (async_milvus_client.py:547)


Embeddings have been explicitly disabled. Using MockEmbedding.


The above code inserts the example documents into Milvus and builds an index to enable BM25 ranking for full-text search. It disables dense embedding and uses `BM25BuiltInFunction` with default arguments in the Milvus vector store.

You can specify the input and output fields for this function in the parameters of the `BM25BuiltInFunction`:

- `input_field_names (str)`: The name of the input field, defaults to "text". It indicates which text field the BM25 algorithm is applied to. Change this if you use your own collection with a differently named text field.
- `output_field_names (str)`: The name of the output field, defaults to "sparse_embedding". It indicates which field the function writes its computed result to.

The vector store is now ready for retrieval. To query with full-text search through the Milvus vector store, set the query mode to "sparse" or "text_search". Let's test with a sample question:

In [None]:
import textwrap

query_engine = index.as_query_engine(
    vector_store_query_mode="sparse", similarity_top_k=5  # or "text_search"
)
answer = query_engine.query("What did the author learn?")
print(textwrap.fill(str(answer), 100))

The author learned that the traditional approach to artificial intelligence, which involved explicit
data structures representing concepts, was not effective in truly understanding natural language.
This realization led the author to shift his focus to Lisp and eventually write a book about Lisp
hacking. Additionally, the author learned the importance of focusing on projects that align with
one's strengths and interests, as attention is a zero-sum game and choosing the right project is
crucial to maximizing productivity and success.


#### Customize text analyzer

Analyzers are essential to full-text search: they break sentences into tokens and perform lexical analysis such as stemming and stop-word removal. They are typically language-specific. To learn more about Milvus analyzers, refer to the [guide](https://milvus.io/docs/analyzer-overview.md#Analyzer-Overview).

Milvus supports two types of analyzers: built-in analyzers and custom analyzers. By default, `BM25BuiltInFunction` uses the standard built-in analyzer, the most basic one, which tokenizes text based on punctuation.

To use a different analyzer or customize the current one, pass a value to the `analyzer_params` argument:

In [None]:
bm25_function = BM25BuiltInFunction(
    analyzer_params={
        "tokenizer": "standard",
        "filter": [
            "lowercase",  # Built-in filter
            {"type": "length", "max": 40},  # Custom filter
            {"type": "stop", "stop_words": ["of", "to"]},  # Custom filter
        ],
    },
    enable_match=True,
)
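
To illustrate what a filter chain like the one above does to a piece of text, here is a plain-Python sketch that imitates the same steps: tokenize, lowercase, drop over-long tokens, and drop stop words. The `analyze` function and the regex-based tokenizer are assumptions for illustration; the actual Milvus tokenizer may split text differently.

```python
import re


def analyze(text, max_len=40, stop_words=("of", "to")):
    """Imitate a tokenizer followed by lowercase, length, and stop filters (hypothetical)."""
    # Tokenize: keep runs of letters, digits, and underscores
    tokens = re.findall(r"\w+", text)
    # "lowercase" filter
    tokens = [t.lower() for t in tokens]
    # "length" filter: drop tokens longer than max_len characters
    tokens = [t for t in tokens if len(t) <= max_len]
    # "stop" filter: drop the configured stop words
    tokens = [t for t in tokens if t not in stop_words]
    return tokens


tokens = analyze("The Art of Learning to Program")
print(tokens)  # ['the', 'art', 'learning', 'program']
```

The filters run in order, so the stop-word check sees lowercased tokens. That ordering is why "Of" or "To" at the start of a sentence would also be removed in this sketch.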

### Hybrid Search with reranker

We can also build an optimized RAG system with hybrid search, which combines semantic search and full-text search.

The following example uses OpenAI embedding for semantic search and BM25 for full-text search:

In [None]:
# Create an index over the documents
vector_store = MilvusVectorStore(
    uri=URI,
    # token=TOKEN,
    enable_dense=True,  # by default
    dim=1536,
    enable_sparse=True,
    sparse_embedding_function=BM25BuiltInFunction(),
    overwrite=True,
    hybrid_ranker="RRFRanker",  # by default
    hybrid_ranker_params={},  # by default
)

storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(
    documents,
    storage_context=storage_context,
    embed_model="default",  # "default" will use OpenAI embedding
)

2025-03-26 08:19:56,999 [DEBUG][_create_connection]: Created new connection using: b44cc978f84e4b7799832a81ffb7f99d (async_milvus_client.py:547)


The above code stores the documents in a Milvus collection with both dense and sparse embedding fields. The dense field holds the OpenAI embedding outputs, and the sparse field holds the outputs of the sparse embedding function. Here, `BM25BuiltInFunction` serves as the sparse embedding function to enable full-text search.

This example uses the "RRFRanker" reranking strategy with its default parameters. To customize the reranker, configure `hybrid_ranker` and `hybrid_ranker_params`. For more details, refer to [Milvus Reranking](https://milvus.io/docs/reranking.md).
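
To give a sense of how reciprocal rank fusion (RRF) merges the dense and sparse result lists, here is a minimal plain-Python sketch. It is not the Milvus implementation: the `rrf_fuse` helper, the document IDs, and the smoothing constant `k=60` (a commonly cited value) are assumptions for illustration.

```python
def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of document IDs with reciprocal rank fusion (hypothetical)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank); higher ranks contribute more
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort document IDs by fused score, best first
    return sorted(scores, key=scores.get, reverse=True)


dense_hits = ["doc_a", "doc_b", "doc_c"]  # e.g. from semantic search
sparse_hits = ["doc_b", "doc_d", "doc_a"]  # e.g. from BM25 full-text search
fused = rrf_fuse([dense_hits, sparse_hits])
print(fused)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

RRF uses only rank positions, not raw scores, so it can combine BM25 scores and embedding similarities even though they are on different scales. Documents that appear in both lists, such as `doc_a` and `doc_b` here, rise to the top.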

Now let's test the RAG system with a sample question:

In [None]:
# Query
query_engine = index.as_query_engine(
    vector_store_query_mode="hybrid", similarity_top_k=5
)
answer = query_engine.query("What did the author learn?")
print(textwrap.fill(str(answer), 100))

The author learned about programming on early computers like the IBM 1401 and later on
microcomputers like the TRS-80. They also learned about Lisp programming and its association with
Artificial Intelligence. Additionally, the author learned about the limitations of traditional AI
approaches and the importance of focusing on practical applications and helping startups as an angel
investor through their experiences with Y Combinator.
