# Milvus Hybrid Search

>[Milvus](https://milvus.io/docs/overview.md) is a database that stores, indexes, and manages massive embedding vectors generated by deep neural networks and other machine learning (ML) models.

This notebook shows how to use the [hybrid_search](https://milvus.io/api-reference/pymilvus/v2.4.x/ORM/Collection/hybrid_search.md) functionality of the Milvus vector database (version 2.4 or later). Besides `hybrid_search`, `MilvusHybridSearchRetriever` also supports ordinary single-vector (dense or sparse) search and Milvus's [scalar filtering query](https://milvus.io/docs/get-and-scalar-query.md). Note that you can also use LangChain's [Milvus](https://python.langchain.com/docs/integrations/vectorstores/milvus/) vector store for ordinary single-vector search.

To run this notebook, you need a [Milvus instance up and running](https://milvus.io/docs/install_standalone-docker.md). Make sure the instance version is 2.4 or later.

## Installation

First, install the `pymilvus` Python package.

In [1]:
%pip install --upgrade --quiet "pymilvus>=2.4.0"

Note: you may need to restart the kernel to use updated packages.


## Examples

### Basic Usage

The Milvus vector database supports both dense vector search and [sparse vector search](https://milvus.io/docs/sparse_vector.md). Sparse vector search requires a sparse embedding model, which embeds text into sparse vectors. A sparse vector is a high-dimensional vector in which almost all elements are `0.0`. Let's first define a simple fake sparse embedding model; a real-world [BGE-M3](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/BGE_M3) sparse embedding model is used later.

In [1]:
from typing import Dict, List
import random

from langchain_core.embeddings import Embeddings
from langchain_community.retrievers.milvus_hybrid_search import SparseEmbeddings

class FakeSparseEmbeddings(SparseEmbeddings):
    def embed_documents(self, texts: List[str]) -> List[Dict[int, float]]:
        n = 100
        sparse_vectors = []
        for text in texts:
            vector_dict = {}
            k = random.randint(0, 4)
            for i in range(k):
                vector_dict[random.randint(0, n)] = random.random()
            # Workaround: Milvus appears to reject an all-zero sparse vector
            # (possibly a bug), so fall back to a single tiny non-zero entry.
            if not vector_dict:
                vector_dict = {0: 0.000001}
            sparse_vectors.append(vector_dict)
        return sparse_vectors

    def embed_query(self, text: str) -> Dict[int, float]:
        return self.embed_documents([text])[0]

In [2]:
sparse_embeddings = FakeSparseEmbeddings()

In [3]:
sparse_embeddings.embed_documents(["text-a", "text-b"])

[{0: 1e-06}, {0: 1e-06}]
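Each sparse vector above is a `{dimension index: value}` dictionary that stores only the non-zero entries. As a minimal sketch of why this representation is convenient, the inner product of two such vectors only needs to visit the indices they share. The helper name `sparse_inner_product` and the sample vectors are hypothetical and written for illustration; they are not part of Milvus or LangChain.

```python
from typing import Dict


def sparse_inner_product(a: Dict[int, float], b: Dict[int, float]) -> float:
    """Inner product of two sparse vectors stored as {index: value} dicts."""
    # Iterate over the smaller dict; indices missing from the other
    # vector are implicitly 0.0 and contribute nothing to the sum.
    if len(a) > len(b):
        a, b = b, a
    return sum(value * b[index] for index, value in a.items() if index in b)


# Hypothetical sample vectors: only indices 17 and 42 overlap.
doc = {3: 0.5, 17: 2.0, 42: 1.0}
query = {17: 0.25, 42: 4.0, 99: 1.0}
print(sparse_inner_product(doc, query))  # 2.0 * 0.25 + 1.0 * 4.0 = 4.5
```

The cost depends on the number of non-zero entries rather than on the nominal dimension of the vector, which is what makes very high-dimensional sparse vectors practical.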

To use hybrid search, we need one more embedding model, so we define an ordinary dense embedding model as follows.

Note: hybrid search can use more than two embedding models, regardless of whether they are sparse or dense.

In [4]:
class FakeDenseEmbeddings(Embeddings):
    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        return [[random.random() for i in range(3)] for text in texts]
    
    def embed_query(self, text: str) -> List[float]:
        return self.embed_documents([text])[0]

In [5]:
dense_embeddings = FakeDenseEmbeddings()

In [6]:
dense_embeddings.embed_documents(["text-a", "text-b"])

[[0.8816866061850618, 0.5076061045442858, 0.42096944725175933],
 [0.59377710604117, 0.5369374928633824, 0.1301035265644469]]

Now we can run a hybrid search with `MilvusHybridSearchRetriever`. Under the hood, hybrid search performs an independent vector search for each embedding model, and a rerank model then combines these results into the final result. At the time of writing, Milvus supports two rerank models: RRF (Reciprocal Rank Fusion) and Weighted.
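To make the rerank step more concrete, here is a minimal pure-Python sketch of Reciprocal Rank Fusion. It is an illustration of the general technique, not Milvus's implementation: the function name `rrf_fuse`, the 1-based rank convention, the tie-breaking rule, and the sample hit lists are all assumptions. Each document scores `1 / (k + rank)` in every result list it appears in, and the scores are summed.

```python
from collections import defaultdict
from typing import Dict, List


def rrf_fuse(rankings: List[List[str]], k: float = 60.0) -> List[str]:
    """Fuse several ranked id lists with Reciprocal Rank Fusion."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Assumption: ranks are 1-based, so a top hit contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical per-field results, best hit first.
dense_hits = ["id_b", "id_d", "id_a"]
sparse_hits = ["id_d", "id_c", "id_b"]
print(rrf_fuse([dense_hits, sparse_hits]))
# ['id_d', 'id_b', 'id_c', 'id_a']
```

Because only ranks are used, the dense and sparse searches can use different metrics with incomparable score scales and still be fused.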

In [7]:
from langchain_community.retrievers.milvus_hybrid_search import MilvusHybridSearchRetriever

In [8]:
retriever = MilvusHybridSearchRetriever(
    embedding_functions={
        "dense": FakeDenseEmbeddings()
    },
    sparse_embedding_functions={
        "sparse": FakeSparseEmbeddings()
    },
    drop_old=True  # drop the existing collection named "LangChainCollection", if any
)

In [9]:
retriever.add_texts(
    ["a", "b", "c", "d", "e"],
    ids=["id_a", "id_b", "id_c", "id_d", "id_e"]
)

['id_a', 'id_b', 'id_c', 'id_d', 'id_e']

In [10]:
docs = retriever.get_relevant_documents(query="d", k=3)

In [11]:
docs

[Document(page_content='b'),
 Document(page_content='d'),
 Document(page_content='a')]

To inspect the search and rerank parameters in detail, enable debug logging like this:

In [12]:
import logging
from langchain_community.retrievers.milvus_hybrid_search import logger
logger.setLevel(logging.DEBUG)
handler = logging.StreamHandler()
handler.setLevel(logging.DEBUG)
logger.addHandler(handler)

In [13]:
docs = retriever.get_relevant_documents(query="d", k=3)

vector search reqs:
anns_field: dense, param: {'param': {'metric_type': 'L2', 'params': {'ef': 10}}}, limit: 3, expr: None
anns_field: sparse, param: {'param': {'metric_type': 'IP', 'params': {'drop_ratio_search': 0.0}}}, limit: 3, expr: None
rerank: {'strategy': 'rrf', 'params': {'k': 60.0}}


### Try different index and search types

TODO: the format of `MilvusHybridSearchRetriever.search_params` (set in the constructor) does not match the format of the `search_params` argument of `MilvusHybridSearchRetriever.get_relevant_documents`, which is confusing.

In [14]:
docs = retriever.get_relevant_documents(
    query="d",
    k=3,
    search_params={
        "dense": {
            "param": {"metric_type": "L2", "params": {'ef': 2}},
            "limit": 2,
        },
    }
)

vector search reqs:
anns_field: dense, param: {'metric_type': 'L2', 'params': {'ef': 2}}, limit: 2, expr: None
anns_field: sparse, param: {'param': {'metric_type': 'IP', 'params': {'drop_ratio_search': 0.0}}}, limit: 3, expr: None
rerank: {'strategy': 'rrf', 'params': {'k': 60.0}}


In [15]:
retriever = MilvusHybridSearchRetriever(
    embedding_functions={
        "dense": FakeDenseEmbeddings()
    },
    sparse_embedding_functions={
        "sparse": FakeSparseEmbeddings()
    },
    drop_old=True,  # drop the existing collection named "LangChainCollection", if any
    index_params={
        "dense": {
            "metric_type": "COSINE",
            "index_type": "FLAT",
            "params": {}
        }
    },
    search_params={
        "dense": {
            "metric_type": "COSINE",
            "params": {},
        }
    }
)

Using previous connection: 51880f1947ce401eab1b8ef93726d97d


In [16]:
retriever.add_texts(
    ["a", "b", "c", "d", "e"],
    ids=["id_a", "id_b", "id_c", "id_d", "id_e"]
)

Successfully created an index on collection: LangChainCollection, field_name: dense
Successfully created an index on collection: LangChainCollection, field_name: sparse


['id_a', 'id_b', 'id_c', 'id_d', 'id_e']

In [17]:
docs = retriever.get_relevant_documents(
    query="d",
    k=2
)

vector search reqs:
anns_field: dense, param: {'metric_type': 'COSINE', 'params': {}}, limit: 2, expr: None
anns_field: sparse, param: {'param': {'metric_type': 'IP', 'params': {'drop_ratio_search': 0.0}}}, limit: 2, expr: None
rerank: {'strategy': 'rrf', 'params': {'k': 60.0}}
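The retriever configured above uses `COSINE` for the dense field instead of `L2`. As a rough illustration of how the two measures can rank the same candidates differently, the sketch below computes both in plain Python using their textbook definitions. The helper names and sample vectors are hypothetical; Milvus computes these metrics internally, and its exact conventions may differ.

```python
import math
from typing import List


def l2_distance(a: List[float], b: List[float]) -> float:
    """Euclidean distance: smaller means closer."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between a and b: larger means closer."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


query = [1.0, 0.0, 0.0]
same_direction = [3.0, 0.0, 0.0]  # parallel to the query, but longer
nearby = [1.0, 1.0, 0.0]          # closer in space, different direction

print(l2_distance(query, same_direction), l2_distance(query, nearby))
print(cosine_similarity(query, same_direction), cosine_similarity(query, nearby))
```

In this example, L2 prefers `nearby` (distance 1.0 versus 2.0), while cosine prefers `same_direction` (similarity 1.0 versus roughly 0.71), because cosine ignores vector length and compares direction only.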


### Using a subset of vector fields for retrieval

TODO

### A real-world example: BGE-M3 dense and sparse hybrid search

TODO