![Top <](./images/watsonxdata.png "watsonxdata")

# Lab 5: Hybrid Multi-vector queries

Milvus allows you to search for objects using multiple types of information, such as text, images, and audio. This is called hybrid or multi-vector search. It combines searches across different vector fields to enhance the search experience. This lab shows how this can be done.

The first steps for creating and loading a database are similar to labs 1, 2, 3, and 4. Nevertheless, you should execute them carefully since we are creating an additional vector this time. In addition, we are using slightly different functions for most operations by using the MilvusClient interface. The querying starts in the section **Hybrid Multi-Vector Queries with Milvus**.

The first step is to make sure that the Milvus extensions are loaded into the notebook.

In [None]:
!pip install pymilvus

### We check the version since most of the rest of the notebook requires a current pymilvus version

In [None]:
import pymilvus
print(pymilvus.__version__)
print (dir(pymilvus))

## Local Connection

A local connection assumes that you are running your Jupyter notebook inside the same server that is running watsonx.data and the Milvus server. The connection user is the default watsonx.data userid (ibmlhadmin). You need to generate the certificate that will be used by the connection.

### Generate the Connection Certificate

In [None]:
!rm -f /tmp/presto.crt
!echo QUIT | openssl s_client -showcerts -connect localhost:8443 | awk '/-----BEGIN CERTIFICATE-----/ {p=1}; p; /-----END CERTIFICATE-----/ {p=0}' > /tmp/presto.crt

In [None]:
rc = %system echo QUIT | openssl s_client -showcerts -connect watsonxdata:8443 | \
        awk '/-----BEGIN CERTIFICATE-----/ {p=1}; p; /-----END CERTIFICATE-----/ {p=0}' > /tmp/presto.crt 

### Local Connection Parameters

Please change the values for apiuser and apikey to the values provided in the lab guide.

In [None]:
host            = 'watsonxdata'
port            = 19530
apiuser         = 'xxxxxxxxxx'
apikey          = 'xxxxxxxx'
server_pem_path = '/tmp/presto.crt'

## Milvus Connection

### We use MilvusClient instead of the connection function of the ORM API this time

In [None]:
from pymilvus import MilvusClient
print (dir(MilvusClient))

In [None]:
from pymilvus import MilvusClient

client = MilvusClient(
    uri=f"http://{host}:{port}",
    token=f"{apiuser}:{apikey}",
    server_pem_path=server_pem_path,
    secure=True
)

### Check Connection Status

In [None]:
from pymilvus import connections

print(f"\nList connections:")
print(connections.list_connections())

## Create a Collection in Milvus
This code will drop the wiki_articles collection if it exists, and then recreate it. This script should return the following text.
```
Status(code=0, message=)
```

#### Make various utility commands available

In [None]:
from pymilvus import utility

#### Clean up previous collection if one already exists

In [None]:
client.drop_collection("wiki_articles")

### Create a sample collection

#### Define the schema for our collection 

Because we want to perform a hybrid query, that is, a query that involves several vectors, we define two vector fields in addition to the scalar fields in our collection. The two vectors we define are of the same type, but in general they can be very different: some can be dense vectors, others sparse vectors. They can have different dimensions and can represent different kinds of data such as text, audio, video, or images.

In our case the field "vector" represents a chunk of a Wikipedia article, as in the previous labs. We assume that we have a textual representation of emotions (review comments) for each chunk of data. We represent these emotions with a vector field named "vectoremotion".

In [None]:
from pymilvus import DataType

schema = MilvusClient.create_schema(
    auto_id=False,
    enable_dynamic_field=False,
)

schema.add_field(field_name="id", datatype=DataType.INT64, is_primary=True) # Primary key
schema.add_field(field_name="article_text", datatype=DataType.VARCHAR, max_length=2500)
schema.add_field(field_name="article_title", datatype=DataType.VARCHAR, max_length=200)
schema.add_field(field_name="article_subtopic", datatype=DataType.VARCHAR, max_length=10)
schema.add_field(field_name="emotion", datatype=DataType.VARCHAR, max_length=30)
schema.add_field(field_name="vector", datatype=DataType.FLOAT_VECTOR, dim=384)
schema.add_field(field_name="vectoremotion", datatype=DataType.FLOAT_VECTOR, dim=384)

#### Check which collections already exist

In [None]:
res = client.list_collections()

print(res)

#### Create indexes for the two vector fields of our collection

Since we want to query two vector fields at the same time, we have to index both of them. In general, a hybrid query can involve many vector fields of different dimensions with different kinds of indexes. For simplicity, we choose the same kind of vector index (IVF_FLAT) for both fields.

- metric_type specifies the distance metric used in the vector space. L2 is the Euclidean distance.
- index_type specifies the type of vector index to use. IVF stands for inverted file index, which means clustering the vector space and representing each cluster by its centroid. FLAT means that vectors are stored directly without any compression or quantization, so precise distance calculations are possible.
- params specifies several parameters relevant for our index. For instance, nlist defines the number of clusters to use for the inverted file index.
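
To make the roles of the L2 metric, the clusters (nlist), and the number of probed clusters (nprobe, used later at query time) more concrete, here is a minimal pure-Python sketch of the idea behind an inverted file index. It is not how Milvus implements IVF_FLAT internally. The helper names (`l2`, `build_ivf`, `ivf_search`), the toy vectors, and the hand-picked centroids are all assumptions made for illustration; a real index learns its centroids from the data.

```python
import math

def l2(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_ivf(vectors, centroids):
    """Assign every vector id to the bucket of its nearest centroid."""
    buckets = {c: [] for c in range(len(centroids))}
    for vid, vec in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: l2(vec, centroids[c]))
        buckets[nearest].append(vid)
    return buckets

def ivf_search(query, vectors, centroids, buckets, nprobe=1, limit=2):
    """Scan only the nprobe buckets whose centroids are closest to the query."""
    probe = sorted(range(len(centroids)), key=lambda c: l2(query, centroids[c]))[:nprobe]
    candidates = [vid for c in probe for vid in buckets[c]]
    # Inside the probed buckets the distances are exact (the "FLAT" part).
    ranked = sorted(candidates, key=lambda vid: l2(query, vectors[vid]))
    return [(vid, l2(query, vectors[vid])) for vid in ranked[:limit]]

# Toy data with two obvious clusters, so the number of buckets (nlist) is 2.
toy_vectors = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
toy_centroids = [[0.0, 0.5], [10.0, 10.5]]  # hand-picked, not learned
toy_buckets = build_ivf(toy_vectors, toy_centroids)

toy_hits = ivf_search([0.0, 0.9], toy_vectors, toy_centroids, toy_buckets, nprobe=1, limit=2)
print(toy_hits)
```

With nprobe=1 only one bucket is scanned, which is fast but can miss neighbors that fall into another bucket. Raising nprobe scans more buckets and trades speed for recall.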

In [None]:
from pymilvus import MilvusClient

index_params = client.prepare_index_params()

index_params.add_index(
    field_name="vector",
    index_name="vector_index",
    index_type="IVF_FLAT",
    metric_type="L2",
    params={"nlist":2048}
)

index_params.add_index(
    field_name="vectoremotion",
    index_name="vectoremotion_index",
    index_type="IVF_FLAT",
    metric_type="L2",
    params={"nlist":2048}
)

#### After having prepared the schema and the index definitions for our collection we can create the collection.

In [None]:
client.create_collection(collection_name="wiki_articles", schema=schema, index_params=index_params)

#### Double Check that the Collection Exists

In [None]:
res = client.list_collections()

print(res)

## Get data from Wikipedia for loading into our collection

This is done in the same way as for labs 1 to 4.

In [None]:
import wikipedia
import pandas as pd
import warnings
warnings.filterwarnings('ignore')

# search
search_results = wikipedia.search("Climate")

articles = []
for i in range (0,len(search_results)):
    try:
        summary = wikipedia.summary(search_results[i],auto_suggest=False)
    except Exception as err:
        print(f"Skipped article '{search_results[i]}' because of ambiguity.")
        continue
    try:
        page = wikipedia.page(search_results[i],auto_suggest=False).content
    except Exception as err:
        print(f"Skipped article '{search_results[i]}' because of ambiguity.")
        continue

    
    articles.append({
        "title"   : search_results[i],
        "summary" : summary,
        "page"    : page
    })

#print(display_articles)

df = pd.DataFrame.from_dict(articles)
df.style.set_properties(**{'text-align': 'left'})
print(df)

In [None]:
print(articles)

## Emotions for chunks

In this lab we assume that a short emotional comment exists for each chunk of a Wikipedia article. We store these comments as a second vector so that we can later perform a search involving two different vectors.

### Prepare some emotion data

In [None]:
emotionlist=["Very good!", "Highly recommended!", "Great quality!", "Excellent!", "Will not read again!", 
             "Meets my needs", "Good product", "As expected","Average quality", "Does the job", "Just alright", 
             "Nothing special","Disappointed!", "Expected more", "OK", "Mediocre","Terrible!", 
             "Not recommended!", "Never again!", "Worst ever"]
el_len=len(emotionlist)
print(el_len)

## Split Articles into chunks

### Define function for splitting article into chunks

In [None]:
# Chunk data
def split_into_chunks(text, chunk_size):
    words = text.split()
    return [' '.join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]
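
As a quick sanity check of the chunking behavior, the following sketch repeats the function from the cell above so that it runs on its own and applies it to a five-word sample string. The sample text and the chunk size of 2 are arbitrary values chosen for illustration.

```python
def split_into_chunks(text, chunk_size):
    # Split on whitespace and regroup the words into pieces of chunk_size words.
    words = text.split()
    return [' '.join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]

sample_chunks = split_into_chunks("one two three four five", 2)
print(sample_chunks)  # ['one two', 'three four', 'five']
```

The last chunk simply holds the leftover words, so it can be shorter than chunk_size.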

### Create a list of chunks for all articles and parallel lists for the metadata of each chunk (title, subtopic)

We also create emotions for the chunks by randomly assigning one of the emotions to each chunk.

In [None]:
from random import seed, randrange

seed(0)

chunk_size=255
passages=[]
passages_titles=[]
passages_subtopic=[]
passages_emotion=[]

for a in articles:
    print('title',a['title'])
    if a['title'] == "Climate":
        subtopic="false"
    else:
        subtopic="true"
    
    p = a['page']
    cl = split_into_chunks(p,chunk_size)
    
    print("number of chunks=",len(cl))
    for c in cl:
        passages.append(c)
        passages_titles.append(a['title'])
        passages_subtopic.append(subtopic)
        r=randrange(el_len)  # pick a random index into emotionlist
        passages_emotion.append(emotionlist[r])

### Create the embeddings for the chunks

In [None]:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2') # 384 dim

passages_embeddings = model.encode(passages)
passages_emotion_embeddings = model.encode(passages_emotion)

### Insert all data into the collection created above

In [None]:
data = []

# create a list of dictionaries, each dictionary corresponding to a row to insert into the collection
for i in range(0,len(passages)):
   data.append({
       "id": i, 
       "article_text": passages[i], 
       "article_title": passages_titles[i], 
       "article_subtopic": passages_subtopic[i],
       "emotion": passages_emotion[i],
       "vector": passages_embeddings[i],
       "vectoremotion": passages_emotion_embeddings[i]
   })

# insert the data from the above list into our collection
client.insert(collection_name="wiki_articles",data=data)

# make sure that the data is written to external storage 
client.flush(
    collection_name="wiki_articles"
)

print("Done")

## Hybrid Multi-Vector Queries with Milvus 

The following code shows how you can perform a hybrid multi-vector search.

### Load the Collection into memory and check that the Collection has been Loaded

In [None]:
# load the collection
client.load_collection(
    collection_name="wiki_articles"
)

### Check whether the collection is actually loaded

In [None]:
res = client.get_load_state(
    collection_name="wiki_articles"
)

print(res)

### Check how many rows got loaded

In [None]:
client.get_collection_stats(collection_name="wiki_articles")

## Query Milvus & Prompt LLM
After gathering the data from Wikipedia and then vectorizing it and inserting into Milvus, we are now ready to perform queries against the vector database. We will use the `sentence-transformers/all-MiniLM-L6-v2` model to generate the query vector and then use Milvus to find the most similar vectors in the database.

### Create a Query Function
The following function will be used to query the collection with a hybrid query. A hybrid query includes more than one vector column (in our example, two vector columns). This can, for instance, be used to combine a search on image data with a search on text data. In our case we combine a search on the article text with a search on the emotions saved for the different parts of the article. The results of the searches on the individual vectors have to be merged. This is done by reranking the results of the individual vector searches. Several rerankers are available, and they can place different weights on the results of the individual queries.

We use the Reciprocal Rank Fusion (RRF) Ranker. This is a reranking strategy for Milvus hybrid search that balances results from multiple vector searches based on their ranking positions rather than their raw similarity scores. 
RRF Ranker combines search results based on how highly each item ranks in each individual query, creating a fair and balanced final ranking.
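
The rank-based fusion described above can be sketched in a few lines of plain Python. This is an illustration of the general RRF formula (each result contributes 1 / (k + rank) for every list it appears in), not the Milvus implementation. The function name `rrf_fuse`, the sample ids, the default k=60, and the choice to start ranks at 1 are assumptions made for this example.

```python
def rrf_fuse(rankings, k=60, limit=3):
    """Fuse several ranked id lists with Reciprocal Rank Fusion."""
    scores = {}
    for ranking in rankings:
        # rank 1 is the best hit of an individual search
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # A higher fused score means a better overall position.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:limit]

text_ranking = [7, 3, 9]      # hypothetical ids returned by the search on "vector"
emotion_ranking = [3, 5, 7]   # hypothetical ids returned by the search on "vectoremotion"

fused = rrf_fuse([text_ranking, emotion_ranking], k=60)
print(fused)
```

Ids that appear in both lists (3 and 7 here) accumulate two contributions and therefore move ahead of ids found by only one search. A larger k flattens the differences between rank positions.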

#### Get the ranker first

In [None]:
from pymilvus import RRFRanker
# The parameter of RRFRanker is a smoothing parameter. It must be in the range 0 to 16384. A value between 10 and 100 is recommended.
ranker = RRFRanker(100)

In [None]:
from sentence_transformers import SentenceTransformer
from pymilvus import AnnSearchRequest

def hybrid_query_milvus(query, query_emotion, num_results=5):
    #print("query=<",query,"> query_emotion=<",query_emotion,"> num_results=<",num_results,">")
    # Vectorize query and query_emotion
    model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2') # 384 dim
    query_embeddings = model.encode([query])
    query_emotion_embeddings = model.encode([query_emotion])  # embed the emotion text, not the question
    #print("query_embeddings",query_embeddings)
    #print("query_emotion_embeddings",query_emotion_embeddings)

    # Prepare the searches for the two vectors in our collection
    # data is the search vector (created via embedding of the original query text) 
    # anns_field specifies the name of the vector field we are searching in
    # param holds individual parameters relevant for the particular index used
    # limit determines how many results should be returned
    
    search_param_1 = {
        "data": query_embeddings,
        "anns_field": "vector",
        "param": {"nprobe": 10},
        "limit": num_results
    }
    request_1 = AnnSearchRequest(**search_param_1)
    
    search_param_2 = {
        "data": query_emotion_embeddings,
        "anns_field": "vectoremotion",
        "param": {"nprobe": 10},
        "limit": num_results
    }
    request_2 = AnnSearchRequest(**search_param_2)

    reqs = [request_1, request_2]
    # print(reqs)    

    results = client.hybrid_search(
        collection_name="wiki_articles",
        reqs=reqs,
        ranker=ranker,
        limit=num_results,
        output_fields=['article_text','emotion']
    )

    return results

### Suggestions for queries

For the hybrid query we need a query on the Wikipedia articles (question_text) and a query on the emotions (emotion_text), which will be combined. Uncomment one of the examples in each of the two categories.

In [None]:
question_text = "What can my company do to help fight climate change?"
#question_text = "How do businesses negatively affect climate change?"
#question_text = "What can a business do to have a positive effect on climate change?"
#question_text = "How can a business reduce their carbon footprint?"

emotion_text = "very good"
#emotion_text = "this is ok"
#emotion_text = "bad"

### Search a Question in Milvus

We want to use the above question_text and emotion_text to perform an approximate nearest neighbor search in Milvus. The top 3 related chunks are retrieved below and can be used in a prompt for a large language model.



In [None]:
num_results = 3

results = hybrid_query_milvus(question_text, emotion_text, num_results)

## Display result

The documents that best match the question are now displayed in the list below.

In [None]:
for hits in results:
    print("TopK results:")
    for hit in hits:
        print(hit)

In [None]:
import re
display_articles = []
relevant_chunks  = []
for i in range(num_results):
    display_articles.append({
        "ID"      : results[0].ids[i],
        "Distance": results[0].distances[i],
        "Emotion": results[0][i].entity.get('emotion'),
        "Article" : re.sub(r"^.*?\. (.*\.).*$",r"\1",results[0][i].entity.get('article_text'))        
    })
    relevant_chunks.append(re.sub(r"^.*?\. (.*\.).*$",r"\1",results[0][i].entity.get('article_text')))

df = pd.DataFrame.from_dict(display_articles).sort_values("Distance",ascending=False)
df.style.set_properties(**{'text-align': 'left'}).set_caption(question_text + " / " + emotion_text).set_table_styles([{
    'selector': 'caption',
    'props': [
        ('color', 'blue'),
        ('font-size', '20px')
    ]
}])
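
The retrieved chunks can then be placed into a prompt for a large language model. The following is a minimal sketch of that step under stated assumptions: the function name `build_prompt`, the prompt wording, and the sample chunks are hypothetical and not part of the original lab, and no model is actually called.

```python
def build_prompt(question, chunks):
    """Combine retrieved chunks and the question into one prompt string."""
    context = "\n\n".join(chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

# Stand-in values; in the notebook these would be relevant_chunks and question_text.
example_prompt = build_prompt("What is climate?", ["Chunk A.", "Chunk B."])
print(example_prompt)
```

In the notebook, a call such as `build_prompt(question_text, relevant_chunks)` would produce the text to send to the model of your choice.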

#### Credits: IBM 2025, Wilfried Hoge [hoge@de.ibm.com] and Andreas Weininger [andreas.weininger@de.ibm.com] based on a notebook by George Baklarz [baklarz@ca.ibm.com]