# Milvus Hybrid Search

>[Milvus](https://milvus.io/docs/overview.md) is a database that stores, indexes, and manages massive embedding vectors generated by deep neural networks and other machine learning (ML) models.

This notebook shows how to use the [hybrid_search](https://milvus.io/api-reference/pymilvus/v2.4.x/ORM/Collection/hybrid_search.md) functionality of the Milvus vector database (version 2.4 or later). Besides `hybrid_search`, `MilvusHybridSearchRetriever` also supports ordinary single-vector (dense or sparse) search and Milvus's [scalar filtering query](https://milvus.io/docs/get-and-scalar-query.md). Note that you can also use LangChain's [Milvus](https://python.langchain.com/docs/integrations/vectorstores/milvus/) vector store for ordinary single-vector search.

To run this notebook, you need a [Milvus instance up and running](https://milvus.io/docs/install_standalone-docker.md). Please make sure your Milvus instance is version 2.4 or later.

## Installation

First, install the `pymilvus` Python package.

In [None]:
%pip install --upgrade --quiet "pymilvus>=2.4.0"

## Examples

### Basic Usage

The Milvus vector database supports both dense vector search and [sparse vector search](https://milvus.io/docs/sparse_vector.md). Sparse vector search requires a sparse embedding model, which embeds text into sparse vectors. A sparse vector is a high-dimensional vector in which almost all elements are `0.0`, so it is represented here as a dictionary that maps the index of each non-zero element to its value. Let's first define a simple fake sparse embedding model; we will use a real-world [BGE-M3](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/BGE_M3) sparse embedding model later.

In [1]:
from typing import Dict, List
import random

from langchain_core.embeddings import Embeddings
from langchain_community.retrievers.milvus_hybrid_search import SparseEmbeddings

class FakeSparseEmbeddings(SparseEmbeddings):
    def embed_documents(self, texts: List[str]) -> List[Dict[int, float]]:
        random.seed(0)
        n = 100  # largest index; random.randint is inclusive, so indices span 0..n
        sparse_vectors = []
        for text in texts:
            vector_dict = {}
            k = random.randint(0, 4)
            for i in range(k):
                vector_dict[random.randint(0, n)] = random.random()
            # Workaround: Milvus appears to reject an all-zero sparse vector (possibly a bug).
            if not vector_dict:
                vector_dict = {0: 0.000001}
            sparse_vectors.append(vector_dict)
        return sparse_vectors

    def embed_query(self, text: str) -> Dict[int, float]:
        return self.embed_documents([text])[0]

In [2]:
sparse_embeddings = FakeSparseEmbeddings()

In [3]:
sparse_embeddings.embed_documents(["text-a", "text-b"])

[{53: 0.7579544029403025, 65: 0.04048437818077755, 100: 0.48592769656281265},
 {45: 0.9677999949201714, 27: 0.5833820394550312}]
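To make the dictionary format concrete, here is a minimal sketch in plain Python, independent of Milvus. The helper names `sparse_to_dense` and `sparse_inner_product` and the sample vectors are made up for illustration. The first helper expands a sparse dictionary into an ordinary list, and the second computes the inner product of two sparse vectors by visiting only the indices they share.

```python
from typing import Dict, List


def sparse_to_dense(vector: Dict[int, float], dim: int) -> List[float]:
    """Expand a {index: value} sparse vector into a dense list of length dim."""
    dense = [0.0] * dim
    for index, value in vector.items():
        dense[index] = value
    return dense


def sparse_inner_product(a: Dict[int, float], b: Dict[int, float]) -> float:
    """Inner product of two sparse vectors; only shared indices contribute."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the shorter dictionary
    return sum(value * b[index] for index, value in a.items() if index in b)


# Hypothetical sample vectors, not produced by any embedding model.
doc_vector = {1: 0.5, 4: 2.0}
query_vector = {4: 3.0, 7: 1.0}

dense_doc = sparse_to_dense(doc_vector, dim=8)
score = sparse_inner_product(doc_vector, query_vector)
print(dense_doc)
print(score)
```

Only index `4` appears in both vectors, so it is the only term that contributes to the score.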

Hybrid search needs at least one more embedding model, so we define an ordinary dense embedding model as follows.

Note: hybrid search is not limited to two embedding models. You can use more, in any mix of sparse and dense models.

In [4]:
class FakeDenseEmbeddings(Embeddings):
    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        random.seed(42)
        return [[random.random() for i in range(3)] for text in texts]
    
    def embed_query(self, text: str) -> List[float]:
        return self.embed_documents([text])[0]

In [5]:
dense_embeddings = FakeDenseEmbeddings()

In [6]:
dense_embeddings.embed_documents(["text-a", "text-b"])

[[0.6394267984578837, 0.025010755222666936, 0.27502931836911926],
 [0.22321073814882275, 0.7364712141640124, 0.6766994874229113]]

Now we can run hybrid search with `MilvusHybridSearchRetriever`. Under the hood, hybrid search performs an independent vector search for each embedding model, then a rerank model combines these results into the final result. At the time of writing, Milvus supports the RRF (Reciprocal Rank Fusion) and Weighted rerank models.

In [7]:
from langchain_community.retrievers.milvus_hybrid_search import MilvusHybridSearchRetriever

In [8]:
retriever = MilvusHybridSearchRetriever(
    embedding_functions={
        "dense": FakeDenseEmbeddings()
    },
    sparse_embedding_functions={
        "sparse": FakeSparseEmbeddings()
    },
    drop_old=True,  # drop the existing collection named "LangChainCollection"
)

In [9]:
retriever.add_texts(
    ["a", "b", "c", "d", "e"],
    ids=["id_a", "id_b", "id_c", "id_d", "id_e"]
)

['id_a', 'id_b', 'id_c', 'id_d', 'id_e']

In [10]:
docs = retriever.invoke(input="d", k=3)

In [11]:
docs

[Document(page_content='d'),
 Document(page_content='c'),
 Document(page_content='a')]

To inspect the details of the search and rerank parameters, you can enable debug logging like this:

In [12]:
import logging
from langchain_community.retrievers.milvus_hybrid_search import logger
logger.setLevel(logging.DEBUG)
handler = logging.StreamHandler()
handler.setLevel(logging.DEBUG)
logger.addHandler(handler)

In [13]:
docs = retriever.invoke(input="d", k=3)

vector search reqs:
anns_field: dense, param: {'param': {'metric_type': 'L2', 'params': {'ef': 10}}}, limit: 3, expr: None
anns_field: sparse, param: {'param': {'metric_type': 'IP', 'params': {'drop_ratio_search': 0.0}}}, limit: 3, expr: None
rerank: {'strategy': 'rrf', 'params': {'k': 60.0}}
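The `rrf` strategy with `k: 60.0` in the log above refers to Reciprocal Rank Fusion. The following is a minimal sketch of the textbook formula, in which each document receives `1 / (k + rank)` from every result list it appears in. It is not Milvus's implementation, and details such as tie-breaking may differ. The function name `rrf_fuse` and the two hit lists are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def rrf_fuse(rankings: List[List[str]], k: float = 60.0) -> List[str]:
    """Fuse several ranked id lists with Reciprocal Rank Fusion.

    Each list contributes 1 / (k + rank) per document, with rank starting at 1.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical per-field results, best hit first.
dense_hits = ["id_d", "id_c", "id_a"]
sparse_hits = ["id_d", "id_e", "id_c"]

fused = rrf_fuse([dense_hits, sparse_hits])
print(fused)
```

Because the formula uses only ranks, the raw scores of the dense (`L2`) and sparse (`IP`) searches never need to be put on a common scale.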


### Try different index and search type

Milvus provides several [index types](https://milvus.io/docs/index.md) for vector fields, but note that `search_params` must be compatible with `index_params`. When using `MilvusHybridSearchRetriever`, you can try different index types via `index_params`; the corresponding compatible default `search_params` should work out of the box. Here are some examples:

In [14]:
retriever = MilvusHybridSearchRetriever(
    embedding_functions={
        "dense": FakeDenseEmbeddings()
    },
    sparse_embedding_functions={
        "sparse": FakeSparseEmbeddings()
    },
    drop_old=True,  # drop the existing collection named "LangChainCollection"
    index_params={
        "dense": {
            "metric_type": "COSINE",
            "index_type": "IVF_FLAT",
            "params": {"nlist": 32}
        }
    },
)

Using previous connection: f04753763e6b4fcb8dc783d1af7763aa


In [15]:
retriever.add_texts(
    ["a", "b", "c", "d", "e"],
    ids=["id_a", "id_b", "id_c", "id_d", "id_e"]
)

Successfully created an index on collection: LangChainCollection, field_name: dense
Successfully created an index on collection: LangChainCollection, field_name: sparse


['id_a', 'id_b', 'id_c', 'id_d', 'id_e']

In [16]:
retriever.search_params

{'dense': {'param': {'metric_type': 'COSINE', 'params': {'nprobe': 10}}},
 'sparse': {'param': {'metric_type': 'IP',
   'params': {'drop_ratio_search': 0.0}}}}
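The pairing between an index type and its search parameters can be pictured as a lookup table. The sketch below is illustrative only: `DEFAULT_SEARCH_PARAMS` and `compatible_search_params` are hypothetical names, not part of the retriever, and the values are copied from the outputs shown in this notebook. It assumes that the `ef` default seen earlier belongs to an `HNSW` index and that the sparse field uses a `SPARSE_INVERTED_INDEX` index.

```python
from typing import Any, Dict

# Hypothetical table: search-time parameters that match each index type.
DEFAULT_SEARCH_PARAMS: Dict[str, Dict[str, Any]] = {
    "HNSW": {"ef": 10},
    "IVF_FLAT": {"nprobe": 10},
    "SPARSE_INVERTED_INDEX": {"drop_ratio_search": 0.0},
}


def compatible_search_params(index_params: Dict[str, Any]) -> Dict[str, Any]:
    """Derive search params that reuse the index's metric and fit its type."""
    index_type = index_params["index_type"]
    if index_type not in DEFAULT_SEARCH_PARAMS:
        raise ValueError(f"no default search params known for {index_type!r}")
    return {
        "metric_type": index_params["metric_type"],
        "params": dict(DEFAULT_SEARCH_PARAMS[index_type]),
    }


dense_index = {
    "metric_type": "COSINE",
    "index_type": "IVF_FLAT",
    "params": {"nlist": 32},
}
print(compatible_search_params(dense_index))
```

The key point is that the metric type is carried over from the index unchanged, while the tunable parameter (`nprobe`, `ef`, and so on) depends on the index type.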

You can also manually specify a compatible `search_params` like this:

In [17]:
retriever = MilvusHybridSearchRetriever(
    embedding_functions={
        "dense": FakeDenseEmbeddings()
    },
    sparse_embedding_functions={
        "sparse": FakeSparseEmbeddings()
    },
    drop_old=True,  # drop the existing collection named "LangChainCollection"
    index_params={
        "dense": {
            "metric_type": "COSINE",
            "index_type": "IVF_FLAT",
            "params": {"nlist": 32}
        }
    },
    search_params={
        "dense": {
            "metric_type": "COSINE",
            "params": {"nprobe": 5},
        }
    }
)

Using previous connection: f04753763e6b4fcb8dc783d1af7763aa


In [18]:
retriever.add_texts(
    ["a", "b", "c", "d", "e"],
    ids=["id_a", "id_b", "id_c", "id_d", "id_e"]
)

Successfully created an index on collection: LangChainCollection, field_name: dense
Successfully created an index on collection: LangChainCollection, field_name: sparse


['id_a', 'id_b', 'id_c', 'id_d', 'id_e']

Lastly, you can also manually specify compatible `ann_search_params` for a single search call like this:

In [19]:
docs = retriever.invoke(
    "d",
    k=3,
    ann_search_params={
        "dense": {
            "param": {"metric_type": "COSINE", "params": {"nprobe": 3}},  # should be compatible with index params
            "limit": 2,
        },
        "sparse": {
            "limit": 4,
        }
    }
)

vector search reqs:
anns_field: dense, param: {'metric_type': 'COSINE', 'params': {'nprobe': 3}}, limit: 2, expr: None
anns_field: sparse, param: {'param': {'metric_type': 'IP', 'params': {'drop_ratio_search': 0.0}}}, limit: 4, expr: None
rerank: {'strategy': 'rrf', 'params': {'k': 60.0}}


### Using a subset of vector fields to retrieve

#### Single vector search

In [21]:
docs = retriever.invoke(
    "d",
    k=2,
    ann_search_params={"dense": {}},
    include_other_fields=False  # important: exclude vector fields that are not present in ann_search_params
)

using single vector search on dense


In [22]:
docs

[Document(page_content='a'), Document(page_content='c')]

#### Scalar filtering query

In [23]:
docs = retriever.invoke(
    "d",
    k=2,
    include_other_fields=False,
    expr="pk in ['id_a', 'id_b', 'id_c']",
)

using query, expr=pk in ['id_a', 'id_b', 'id_c']


In [24]:
docs

[Document(page_content='a'), Document(page_content='b')]
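When the ids come from a Python list, the `expr` string used above can be assembled programmatically. This is a small sketch with a hypothetical helper, `build_pk_filter`, which is not part of the retriever. It assumes the primary key field is named `pk`, as in the expression above, and it rejects ids containing quotes or backslashes rather than guessing at escaping rules.

```python
from typing import Sequence


def build_pk_filter(ids: Sequence[str], field: str = "pk") -> str:
    """Build an `<field> in [...]` filter expression from string ids."""
    for value in ids:
        if "'" in value or "\\" in value:
            raise ValueError(f"unsupported character in id: {value!r}")
    quoted = ", ".join(f"'{value}'" for value in ids)
    return f"{field} in [{quoted}]"


expr = build_pk_filter(["id_a", "id_b", "id_c"])
print(expr)
```

The resulting string can then be passed as the `expr` argument in the same way as the hand-written expression.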

### A real-world example: BGE-M3 dense and sparse hybrid search

This section shows how to run BGE-M3 hybrid search via `MilvusHybridSearchRetriever`, following this [example](https://github.com/milvus-io/pymilvus/blob/master/examples/hello_hybrid_sparse_dense.py).

In [24]:
%pip install FlagEmbedding --quiet

Note: you may need to restart the kernel to use updated packages.


In [None]:
from FlagEmbedding import BGEM3FlagModel

model = BGEM3FlagModel('BAAI/bge-m3', use_fp16=True)

In [26]:
class BGEM3SparseEmbeddings(SparseEmbeddings):
    def __init__(self, model):
        self.model = model
    
    def embed_documents(self, texts: List[str]) -> List[Dict[int, float]]:
        weights = self.model.encode(
            texts,
            batch_size=4,
            max_length=8192,
            return_dense=False,
            return_sparse=True
        )['lexical_weights']
        sparse_vectors = [
            {int(k): float(v) for k, v in weight_dict.items()}
            for weight_dict in weights
        ]
        return sparse_vectors
    
    def embed_query(self, text: str) -> Dict[int, float]:
        return self.embed_documents([text])[0]

In [27]:
class BGEM3DenseEmbeddings(Embeddings):
    def __init__(self, model):
        self.model = model
    
    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        dense_vectors = self.model.encode(
            texts,
            batch_size=4,
            max_length=8192,
            return_dense=True,
        )['dense_vecs'].tolist()
        return dense_vectors
    
    def embed_query(self, text: str) -> List[float]:
        return self.embed_documents([text])[0]

In [28]:
texts = ["foo", "bar", "food", "bare", "hi", "hello"]

In [31]:
retriever = MilvusHybridSearchRetriever(
    embedding_functions={
        "dense": BGEM3DenseEmbeddings(model)
    },
    sparse_embedding_functions={
        "sparse": BGEM3SparseEmbeddings(model)
    },
    drop_old=True,
    auto_id=True,  # with `auto_id=True`, you don't need to pass `ids` to `add_texts`
    rerank_params={"type": "Weighted", "param": {"weights": {"dense": 0.7, "sparse": 0.3}}}
)

Using previous connection: f04753763e6b4fcb8dc783d1af7763aa


In [32]:
retriever.add_texts(texts)

Successfully created an index on collection: LangChainCollection, field_name: dense
Successfully created an index on collection: LangChainCollection, field_name: sparse


[449456496514633207,
 449456496514633208,
 449456496514633209,
 449456496514633210,
 449456496514633211,
 449456496514633212]

In [33]:
retriever.invoke(
    "fruit",
    k=3,
)

vector search reqs:
anns_field: dense, param: {'param': {'metric_type': 'L2', 'params': {'ef': 10}}}, limit: 3, expr: None
anns_field: sparse, param: {'param': {'metric_type': 'IP', 'params': {'drop_ratio_search': 0.0}}}, limit: 3, expr: None
rerank: {'strategy': 'weighted', 'params': {'weights': [0.7, 0.3]}}


[Document(page_content='food'),
 Document(page_content='hi'),
 Document(page_content='hello')]
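The `weighted` strategy in the log above combines per-field scores instead of ranks. As a simplified sketch, the fused score of a document can be thought of as the weighted sum of its scores in each field. Milvus applies its own score normalization before weighting, which is not reproduced here, so the example assumes scores already lie in `[0, 1]`. The function name `weighted_fuse`, the document ids, and the scores are all made up for illustration.

```python
from typing import Dict, List, Tuple


def weighted_fuse(
    scores_by_field: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
) -> List[Tuple[str, float]]:
    """Combine per-field scores into one ranking by weighted sum.

    A document missing from a field contributes 0 for that field.
    """
    fused: Dict[str, float] = {}
    for field, scores in scores_by_field.items():
        weight = weights[field]
        for doc_id, score in scores.items():
            fused[doc_id] = fused.get(doc_id, 0.0) + weight * score
    # Highest fused score first; ties are broken by id for determinism.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical, already normalized scores per vector field.
scores_by_field = {
    "dense": {"doc_1": 0.9, "doc_2": 0.4},
    "sparse": {"doc_1": 0.2, "doc_3": 0.8},
}
ranked = weighted_fuse(scores_by_field, {"dense": 0.7, "sparse": 0.3})
print([doc_id for doc_id, _ in ranked])
```

With a dense weight of `0.7`, a document that scores well only in the sparse field (`doc_3`) still ranks below one with a moderate dense score (`doc_2`).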