# Perform Sample Searches on Vector Database 

## Introduction

This notebook provides an optional exercise for exploring document search within a vector database. Experiment with different search techniques using **Elasticsearch, Milvus, or Datastax** as your chosen database.

This notebook demonstrates how to perform sample searches using various techniques and how to analyze the search results to understand the effectiveness of each method.
Although this notebook is optional, it is valuable for gaining hands-on experience with vector databases and document search techniques. If you prefer, you can skip this exercise and proceed directly to **`Create and Deploy QnA AI Service`**.

In this notebook, we will cover the following actions:

- Establishing a connection to the chosen vector database (Elasticsearch, Milvus, or Datastax).
- If your connection type is Elasticsearch:
    - Using an Elastic Learned Sparse Encoder (ELSER) or a dense model (such as E5 multilingual) with LangChain to search and retrieve relevant documents for specific queries in Elasticsearch.
- If your connection type is Milvus or Datastax:
    - Employing an embedding model and LangChain to search and retrieve relevant documents for specific queries in Milvus or Datastax.
- Using hybrid search strategies for search and retrieval, which are supported for both Milvus and Elasticsearch vector databases.

**NOTE**: 
- Hybrid search is not supported for bulk indexing in Milvus. If the data was bulk ingested into Milvus, disable the `milvus_hybrid_search` parameter in `RAG_advanced_parameter_set`.
- Datastax is not supported in the cloud version.

## Contents

This notebook contains the following parts:
- [Setup](#setup)
- [Connect to Vector Database](#connect)
- [Q&A using Vectorstore and query templates](#QnATest)




<a id="setup"></a>
### Pre-Requisite Libraries and Dependencies
Download and import mandatory libraries and dependencies. 

Note: Some library versions may produce warnings after installation. These specific versions are required for the accelerator to run successfully, so ignore the warnings or errors and proceed with execution.

In [None]:

!pip install elasticsearch==8.18.1 | tail -n 1
!pip install langchain | tail -n 1
!pip install ibm_watsonx_ai==1.3.26 | tail -n 1
!pip install langchain_elasticsearch==0.3.2 | tail -n 1
!pip install langchain_milvus==0.2.0 | tail -n 1
!pip install pymilvus==2.5.11 | tail -n 1
!pip install langchain_community | tail -n 1
!pip install cassio==0.1.10 | tail -n 1
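
After installation, it can help to confirm which versions the current kernel actually sees before deciding whether a restart is needed. The following is a minimal sketch using only the standard library; the `PINNED` dictionary, `installed_version`, and `check_pins` are hypothetical names introduced here, with the pins copied from the install cell above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

# Pins copied from the install cell; None means the package is installed unpinned.
PINNED = {
    "elasticsearch": "8.18.1",
    "langchain": None,
    "ibm_watsonx_ai": "1.3.26",
    "langchain_elasticsearch": "0.3.2",
    "langchain_milvus": "0.2.0",
    "pymilvus": "2.5.11",
    "langchain_community": None,
    "cassio": "0.1.10",
}


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def check_pins(pins: dict) -> dict:
    """Map each package name to 'ok', 'missing', or 'mismatch (found X)'."""
    report = {}
    for package, wanted in pins.items():
        found = installed_version(package)
        if found is None:
            report[package] = "missing"
        elif wanted is not None and found != wanted:
            report[package] = f"mismatch (found {found})"
        else:
            report[package] = "ok"
    return report


for package, status in check_pins(PINNED).items():
    print(f"{package:28s} {status}")
```

A package reported as `missing` or `mismatch` right after a successful install usually means the kernel has not picked up the new environment yet.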


Restart the kernel after the pip install if the import cell below fails to import all the libraries.

In [None]:
from elasticsearch import Elasticsearch, helpers
from ibm_watsonx_ai import Credentials
from ibm_watsonx_ai.foundation_models import Embeddings
from ibm_watsonx_ai.metanames import EmbedTextParamsMetaNames as EmbedParams
from ibm_watsonx_ai.metanames import GenTextParamsMetaNames as GenParams
from ibm_watsonx_ai import APIClient
import json
import os
import shutil 
import warnings
warnings.filterwarnings("ignore")
from pymilvus import (IndexType, Status, connections, FieldSchema, DataType, Collection, CollectionSchema, utility)

In [None]:
project_id=os.environ['PROJECT_ID']
# Environment and host url
hostname = os.environ['RUNTIME_ENV_APSX_URL']

if hostname.endswith("cloud.ibm.com"):
    environment = "cloud"
    project_id = os.environ['PROJECT_ID']
    runtime_region = os.environ["RUNTIME_ENV_REGION"] 
else:
    environment = "on-prem"
    from ibm_watson_studio_lib import access_project_or_space
    wslib = access_project_or_space()
   




<a id="parameterimport"></a>
### Import Parameter Sets, Credentials and Helper functions script.

The cells below import the parameter set values, the credentials, and the helper functions script.


In [None]:
try:
    filename = 'rag_helper_functions.py'
    wslib.download_file(filename)
    import rag_helper_functions
    print("rag_helper_functions imported from the project assets")
except NameError as e:
    print(str(e))
    print("If running watsonx.ai aaS on IBM Cloud, check that the first cell in the notebook contains a project token. If not, select the vertical ellipsis button from the notebook toolbar and `insert project token`. Also check that you have specified your ibm_api_key in the second code cell of the notebook")


In [None]:
parameter_sets = ["RAG_parameter_set","RAG_advanced_parameter_set"]

parameters=rag_helper_functions.get_parameter_sets(wslib, parameter_sets)

In [None]:
ibm_api_key=parameters['watsonx_ai_api_key']
if environment == "cloud":
    WML_SERVICE_URL = f"https://{runtime_region}.ml.cloud.ibm.com"
    wml_credentials = {"apikey": ibm_api_key, "url": WML_SERVICE_URL}
else:
    token = os.environ['USER_ACCESS_TOKEN']
    wml_credentials = {"token": token,"instance_id" : "openshift","url": hostname}

### Set Watsonx.ai client
The cell below uses the Watson Machine Learning credentials to create an API client for interacting with the project and deployment space.

In [None]:
client = APIClient(wml_credentials)
client.set.default_project(project_id=project_id)


<a id="connect"></a>
### Connecting to a vector database

#### Connecting using Project Connection Asset (default)
By default, the notebook looks for a connection asset in the project named `milvus_connect`, `elasticsearch_connect`, or `datastax_connect`. You can set this up by following the instructions in the project readme.
This code checks whether the specified connection exists in the project. If it does, the code retrieves the connection details, identifies the connection type, and establishes a connection to the matching database. If the connection is not found, it raises an error indicating that the connection is missing from the project.
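
The lookup-then-dispatch pattern described above can be tried in isolation, without a live project. The sketch below is illustrative only: `find_connection`, `backend_family`, and the sample listing (including the `asset_id` values) are hypothetical, and the listing is assumed to be a list of dictionaries with a `name` key.

```python
from typing import Optional

# Connection types handled in this notebook, grouped by backend family.
BACKEND_FAMILIES = {
    "elasticsearch": "elasticsearch",
    "milvus": "milvus",
    "milvuswxd": "milvus",
    "datastax": "datastax",
}


def find_connection(connections: list, name: str) -> Optional[dict]:
    """Return the first connection whose 'name' matches, or None if absent."""
    return next((conn for conn in connections if conn.get("name") == name), None)


def backend_family(connection_type: str) -> str:
    """Translate a connection type into the backend family that handles it."""
    try:
        return BACKEND_FAMILIES[connection_type]
    except KeyError:
        raise ValueError(f"Unsupported connection_type: {connection_type}") from None


# Hypothetical listing standing in for the project's connection assets.
sample_connections = [
    {"name": "elasticsearch_connect", "asset_id": "aaa-111"},
    {"name": "milvus_connect", "asset_id": "bbb-222"},
]

found = find_connection(sample_connections, "milvus_connect")
print("found:", found)
print("family:", backend_family("milvuswxd"))
```

Returning `None` for a missing name keeps the "not found" decision with the caller, which can then raise a descriptive error as the notebook cell does.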



In [None]:
connection_name=parameters["connection_asset"]
if(next((conn for conn in wslib.list_connections() if conn['name'] == connection_name), None)):
    print(connection_name, "Connection found in the project")
    db_connection = wslib.get_connection(connection_name)
    
    connection_datatypesource_id=client.connections.get_details(db_connection['.']['asset_id'])['entity']['datasource_type']
    connection_type = client.connections.get_datasource_type_details_by_id(connection_datatypesource_id)['entity']['name']
    
    print("Successfully retrieved the connection details")
    print("Connection type is identified as:",connection_type)

    if connection_type=="elasticsearch":
        es_client=rag_helper_functions.create_and_check_elastic_client(db_connection, parameters['elastic_search_model_id'])
    elif connection_type=="milvus" or connection_type=="milvuswxd":
        milvus_credentials = rag_helper_functions.connect_to_milvus_database(db_connection, parameters)

    elif connection_type=="datastax" :
        if environment == "cloud":
            raise ValueError("ERROR! Datastax connections are not supported on Cloud at this time.")
        import cassio
        datastax_session,datastax_cluster = rag_helper_functions.connect_to_datastax(db_connection, parameters)
        cassio.init(session=datastax_session, keyspace=db_connection.get('keyspace'))

else:
    db_connection=""
    raise ValueError(f"No connection named {connection_name} found in the project.")
    



<a id="QnATest"></a>
## Q&A on the Vector Database Index/Collection

The following sections test a sample Question and Answer (QnA) interaction on the vector store. The next cell defines the sample question, and the sections after it run the search and return a response that includes several key pieces of information.



In [None]:
question ="How can I create a project in watsonx.ai?"

### 1. Using Elastic Search Query Templates on Elastic Search Vector Database

The following section tests a sample Question and Answer (QnA) interaction using the sample query template for the ELSER model or the multilingual model, if one is in use.
The response comprises:

* `Relevance Score`: A numerical value indicating the relevance or confidence level of the answer provided by the model.
* `Title`: The title of the document from which the answer is derived.
* `Document ID`: A unique identifier for the document within the database or index.
* `Document URL`: The location where the original document can be accessed or referenced.
* `Document Content`: The actual content or text from the document that is relevant to the queried question.
* `Source`: The source of the content or text for the relevant document queried. 
* `Page Number`: Page Number of the document (if applicable).
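
As a rough illustration of how these fields map onto a single search hit, the sketch below unpacks a hand-written hit. The `summarize_hit` helper and the sample values are hypothetical; the `_source` layout follows the keys read by the query cell in this section, and mapping `Document ID` to the hit's `_id` key is an assumption.

```python
def summarize_hit(hit: dict) -> dict:
    """Flatten one Elasticsearch hit into the fields listed above.

    Missing keys become None instead of raising KeyError, which is convenient
    when some documents have no page number or URL.
    """
    source = hit.get("_source", {})
    metadata = source.get("metadata", {})
    return {
        "score": hit.get("_score"),
        "document_id": hit.get("_id"),  # assumption: the hit's own _id
        "title": metadata.get("title"),
        "source": metadata.get("source"),
        "url": metadata.get("document_url"),
        "page_number": metadata.get("page_number"),
        "content": source.get("text"),
    }


# Hand-written hit for illustration only; not real search output.
sample_hit = {
    "_id": "doc-001",
    "_score": 12.5,
    "_source": {
        "text": "To create a project, open the Projects page.",
        "metadata": {
            "title": "Creating a project",
            "source": "docs.html",
            "document_url": "https://example.com/docs.html",
        },
    },
}

for field, value in summarize_hit(sample_hit).items():
    print(f"{field:12s}: {value}")
```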

This setup gives a practical demonstration of how the model retrieves and presents information in response to a specific query.
There are three ways to perform this step, depending on the `elastic_search_template_file` parameter set by the user in the parameter set.
1. **ELSER**: An ELSER-only search query is invoked.
2. **ELSER + BM25**: A hybrid search query that combines traditional BM25 keyword search with ELSER is invoked.
3. **Multilingual**: A dense vector search query is invoked.
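
Each template carries placeholders such as `{{model_id}}`, `{{model_text}}`, or `{{query_vector}}` that are filled in at query time. The sketch below shows one way to fill them on the parsed JSON rather than on the raw string, which avoids producing invalid JSON when a question contains double quotes. The `fill_template` helper, the template text, the `ml.tokens` field name, and the model ID value are all hypothetical stand-ins, not the accelerator's actual template.

```python
import json


def fill_template(node, values: dict):
    """Recursively replace '{{name}}' placeholders in a parsed JSON template."""
    if isinstance(node, dict):
        return {key: fill_template(child, values) for key, child in node.items()}
    if isinstance(node, list):
        return [fill_template(child, values) for child in node]
    if isinstance(node, str):
        # A string that is exactly one placeholder is swapped for the raw value,
        # so a list of floats stays a list instead of becoming text.
        for name, value in values.items():
            if node == "{{" + name + "}}":
                return value
        # Otherwise substitute string values inside the larger string.
        for name, value in values.items():
            if isinstance(value, str):
                node = node.replace("{{" + name + "}}", value)
        return node
    return node


# Hypothetical ELSER-style template; the field name is an assumption.
template_text = """
{"query": {"text_expansion": {"ml.tokens": {
    "model_id": "{{model_id}}",
    "model_text": "{{model_text}}"}}}}
"""

filled = fill_template(
    json.loads(template_text),
    {"model_id": "example-model-id", "model_text": 'What does "project" mean?'},
)
print(json.dumps(filled, indent=2))
```

Because the substitution works on Python objects, the result can be serialized with `json.dumps` regardless of which characters the question contains.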


In [None]:

if connection_type=="elasticsearch":
    wslib.download_file(parameters['elastic_search_template_file'])
    with open(parameters['elastic_search_template_file']) as f:
        es_query_json = json.load(f)

    es_query_str = json.dumps(es_query_json)
    if 'dense' in parameters['elastic_search_vector_type']:
        from langchain_elasticsearch import ElasticsearchEmbeddings
        embeddings = ElasticsearchEmbeddings.from_es_connection(
                    model_id=parameters['elastic_search_model_id'],
                    es_connection=es_client,
                )
        query_vector = embeddings.embed_documents([question])[0]
        es_query_str = es_query_str.replace('"{{query_vector}}"', str(query_vector))
    else:
        es_query_str = es_query_str.replace("{{model_id}}", parameters['elastic_search_model_id'])
        es_query_str = es_query_str.replace("{{model_text}}", question)
    
    # Convert back to dictionary
    es_query_template = json.loads(es_query_str)
    es_query=es_query_template.get("query",es_query_template)
    print(es_query)

    query_temp_args = {'query': es_query}
    if 'sub_searches' in es_query:
        query_temp_args = {'body': es_query}

    try:
        response = es_client.search(
            index=parameters["vector_store_index_name"], 
            size=parameters['vectorsearch_top_n_results'],
            **query_temp_args
        )
        print("\nResponse:")
        for hit in response['hits']['hits']:

            score = hit['_score']
            title = hit['_source']['metadata']['title']
            page_content=hit['_source']['text']
            source = hit['_source']['metadata']['source']
            url = hit['_source']['metadata']['document_url']
            page_number = hit['_source']['metadata']['page_number']


            print(f"\nRelevance Score  : {score}\nTitle            : {title}\nSource     : {source}\nDocument Content : {page_content}\nDocument URL : {url}\nPage Number : {page_number}")

    except Exception as e:
        print("\nAn error occurred while querying Elasticsearch, please retry after some time:", e)




### 2. Using the LangChain Retrievers
The code below uses LangChain's vector store extensions to retrieve documents. <br>

The code sets up a vector store based on the connection type: "elasticsearch", "milvus", or "datastax". <br>

* If `connection_type` is `"elasticsearch"`, it imports `ElasticsearchRetriever` from the `langchain_elasticsearch` library and initializes it with an Elasticsearch client and the specified search parameters.

* If `connection_type` is `"milvus"`, the code imports `Milvus` from `langchain_milvus` and creates the vector store from the embedding function, connection parameters, and index settings. The store uses either **dense embeddings** or **hybrid search** (dense + BM25 sparse embeddings), depending on the `milvus_hybrid_search` parameter. If hybrid search is enabled, it performs a **weighted similarity search**. Otherwise, it uses only dense embeddings and retrieves the top `k` results. Finally, it prints the search results.

* If `connection_type` is `"datastax"`, it creates a `Cassandra` vector store with the specified embedding function and table name, performs a vector-based similarity search for the top `k` results, and prints the results with their scores.

**NOTE**: Hybrid search is not supported for Bulk indexing in Milvus.
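
To build intuition for the weighted similarity search mentioned above, the following simplified sketch fuses dense and sparse scores with weights of 0.6 and 0.4, the same weights passed in the Milvus cell. It assumes both score sets are already normalized to the range 0 to 1 with higher meaning more relevant; Milvus applies its own normalization internally, so real scores will differ. The `weighted_fusion` function and the sample scores are hypothetical.

```python
def weighted_fusion(dense: dict, sparse: dict, weights=(0.6, 0.4), k: int = 3) -> list:
    """Combine per-document scores from two retrievers into one ranked list.

    A document missing from one retriever contributes 0.0 for that retriever.
    Returns up to k (doc_id, fused_score) pairs, best first.
    """
    w_dense, w_sparse = weights
    doc_ids = set(dense) | set(sparse)
    fused = {
        doc_id: w_dense * dense.get(doc_id, 0.0) + w_sparse * sparse.get(doc_id, 0.0)
        for doc_id in doc_ids
    }
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)[:k]


# Made-up normalized scores for illustration.
dense_scores = {"doc_a": 0.9, "doc_b": 0.5, "doc_c": 0.2}
sparse_scores = {"doc_b": 1.0, "doc_c": 0.5, "doc_d": 0.75}

for doc_id, score in weighted_fusion(dense_scores, sparse_scores):
    print(f"{doc_id}: {score:.2f}")
```

In this example `doc_b` ranks first even though `doc_a` has the highest dense score, because `doc_b` scores well with both retrievers.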

The cell below should run successfully regardless of which vector database is used.


In [None]:
def get_embedding(environment, parameters, project_id, wml_credentials, WML_SERVICE_URL):
    if environment == "cloud":
        credentials = Credentials(
            api_key=parameters['watsonx_ai_api_key'],
            url=WML_SERVICE_URL
        )
        embedding = Embeddings(
            model_id=parameters['embedding_model_id'],
            credentials=credentials,
            project_id=project_id,
            verify=True
        )
    elif environment == "on-prem":
        try:
            if client.foundation_models.EmbeddingModels.__members__:
                if client.foundation_models.EmbeddingModels(parameters["embedding_model_id"]).name:
                    embedding = Embeddings(
                        model_id=parameters['embedding_model_id'],
                        credentials=wml_credentials,
                        project_id=project_id,
                        verify=True
                    )
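                else:
                    raise Exception(parameters["embedding_model_id"] + " model is missing. Please check and update the embedding_model_id advanced parameter")
            else:
                # No local embedding models are registered: fall back to IBM Cloud
                print("Local on-prem embedding models not found, using models from IBM Cloud API")
                credentials = Credentials(
                    api_key=parameters['watsonx_ai_api_key'],
                    url=parameters['watsonx_ai_url']
                )
                embedding = Embeddings(
                    model_id=parameters['embedding_model_id'],
                    credentials=credentials,
                    space_id=parameters["wx_ai_inference_space_id"],
                    verify=True
                )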
        except Exception as e:
            print(f"Exception in loading Embedding Models: {str(e)}")
            raise
    else:
        raise ValueError(f"Invalid environment: {environment}. Must be 'cloud' or 'on-prem'.")
    
    return embedding

### Vector Search Query to obtain most relevant result using the Langchain retrievers

Based on the connection type (Elasticsearch, Milvus, or Datastax), the cell below runs the search against the vector index and returns the most relevant results for the question above.

In [None]:
match connection_type:
    case "elasticsearch":
        search_kwargs = {
        "k": parameters['vectorsearch_top_n_results'],
        "score_threshold": float(parameters['rag_es_min_score']),
        "include_scores": True,
        "verbose": True
        }

        def custom_body_func(query: str) -> dict:
            print(f"Reading from the template {parameters['elastic_search_template_file']}")
            return es_query_template
        
        from langchain_elasticsearch import ElasticsearchRetriever
        retriever = ElasticsearchRetriever(
                        es_client=es_client,
                        index_name=parameters["vector_store_index_name"],
                        body_func=custom_body_func,
                        content_field="text",
                        # document_mapper = document_mapper,
                        search_kwargs=search_kwargs
                    )
        
        print("ElasticsearchRetriever Created with",parameters['elastic_search_model_id'])
        results = retriever.invoke(question)
        print(f"Question: {question}")
        print("Response: ")
        print([{"page_content": doc.page_content, "metadata":doc.metadata['_source']['metadata'], "score": doc.metadata['_score'] or doc.metadata['_rank']} for doc in results])
        
    case "milvus" | "milvuswxd":
        from langchain_milvus import Milvus, BM25BuiltInFunction
        if environment=="cloud":
            credentials=Credentials(
                api_key = parameters['watsonx_ai_api_key'],
                url =WML_SERVICE_URL)
            embedding = Embeddings(
            model_id=parameters['embedding_model_id'],
            credentials=credentials,
            project_id=project_id,
            verify=True
            )
            
        elif environment=="on-prem":
            try:
                if client.foundation_models.EmbeddingModels.__members__:
                    if client.foundation_models.EmbeddingModels(parameters["embedding_model_id"]).name:
                        embedding = Embeddings(
                            model_id=parameters['embedding_model_id'],
                            credentials=wml_credentials,
                            project_id=project_id,
                            verify=True
                        )
                    else:
                        raise Exception(parameters["embedding_model_id"] + " model is missing. Please check and update the embedding_model_id advanced parameter")
                else:
                    print("Local on-prem embedding models not found, using models from IBM Cloud API")
                    credentials=Credentials(
                        api_key = parameters['watsonx_ai_api_key'],
                        url =parameters['watsonx_ai_url'])
                    embedding = Embeddings(
                        model_id=parameters['embedding_model_id'],
                        credentials=credentials,
                        space_id=parameters["wx_ai_inference_space_id"],
                        verify=True
                    )
            except Exception as e:
                print("Exception in loading Embedding Models: " + str(e))
                # Re-raise so the cell stops here instead of failing later on an undefined `embedding`
                raise
            
        hybrid_search = parameters['milvus_hybrid_search'].lower() == "true"
        dense_index_param = {"metric_type": "L2", "index_type": "IVF_FLAT","params": {"nlist": 1024},}
        print(f"using the embedding model {parameters['embedding_model_id']} for dense embeddings.")
        if hybrid_search:
            sparse_index_param = {"metric_type": "BM25","index_type": "SPARSE_INVERTED_INDEX", "params": {"drop_ratio_build": 0.2}}
            print("using BM25 sparse embeddings.")
            vector_store = Milvus(
            embedding_function=embedding,
            builtin_function=BM25BuiltInFunction(output_field_names="sparse"), 
            index_params=[dense_index_param, sparse_index_param],
            vector_field=["dense", "sparse"],
            connection_args=milvus_credentials,
            primary_field='id',
            consistency_level="Strong",
            collection_name=parameters["vector_store_index_name"] 
            )
            search_result = vector_store.similarity_search_with_score(question,  ranker_type="weighted", ranker_params={"weights": [0.6, 0.4]})
        else:
            vector_store = Milvus(
                embedding_function=embedding,
                index_params=dense_index_param,
                connection_args=milvus_credentials,
                primary_field='id',
                consistency_level="Strong",
                collection_name=parameters["vector_store_index_name"] 
            )
            search_result = vector_store.similarity_search_with_score_by_vector(embedding.embed_query(question), k=parameters['vectorsearch_top_n_results'])
        print(search_result)

    case "datastax":
        if environment == "cloud":
            raise ValueError("ERROR! Datastax connections are not supported on Cloud at this time.")
        print("using the model",parameters['embedding_model_id'], "to create embeddings")
        # Cloud is rejected above, so WML_SERVICE_URL is not needed for this call
        embedding = get_embedding(environment, parameters, project_id, wml_credentials, None)
        from langchain_community.vectorstores import Cassandra
        vector_store = Cassandra(
            embedding=embedding,
            table_name=parameters["vector_store_index_name"] 
        )
        print("Datastax vector store Created on the index",parameters["vector_store_index_name"] )
        
        search_result= vector_store.similarity_search_with_score_by_vector(embedding.embed_query(question), k=parameters['vectorsearch_top_n_results'])
        print("\nQuestion:",question, "\nSearch Results:", search_result)

    case _:
        raise ValueError(f"Unsupported connection_type: {connection_type}")

**Note**: For optimal performance, it is recommended to close the Datastax session once you are done with this notebook. Running the cell below closes the existing Datastax connections. If you need to rerun the code cells above, create a new Datastax connection by rerunning the cells from `Connect to Vector Database` onward.

In [None]:
if connection_type=="datastax" and environment != "cloud":
    if not datastax_session.is_shutdown:
        datastax_session.shutdown()
        print(f"datastax_session got shutdown : {datastax_session.is_shutdown}")
    if not datastax_cluster.is_shutdown:
        datastax_cluster.shutdown()
        print(f"datastax_cluster got shutdown : {datastax_cluster.is_shutdown}")

Optionally, proceed to the **Ingest Expert Profile data to vector DB** notebook to ingest expert profiles into the vector database.<br>
Otherwise, proceed to the **Create and Deploy QnA AI Service** notebook to create and deploy the RAG AI Service Python function.



**Sample Materials, provided under license. <br>
Licensed Materials - Property of IBM. <br>
© Copyright IBM Corp. 2024, 2025. All Rights Reserved. <br>
US Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp. <br>**
