# Hybrid Search with LlamaIndex & KDB.AI

Note: This example requires a KDB.AI endpoint and API key. Sign up for a free [KDB.AI account](https://kdb.ai/offerings/).

Hybrid search in KDB.AI is a similarity search method that improves the relevance of results retrieved from the vector database. It combines two search methods: sparse vector search and dense vector search.

Sparse vector search uses the BM25 algorithm to find the most relevant keyword matches, while dense vector search finds the most semantically relevant matches.
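
To make the keyword side more concrete, here is a minimal sketch of BM25 scoring in plain Python. It assumes whitespace tokenization and a common Okapi-style formula; KDB.AI's internal variant may differ. The defaults `k=1.25` and `b=0.75` mirror the `sparseIndex` values used in the table schema later in this notebook, and `toy_docs` is made-up illustrative data.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, docs_tokens, k=1.25, b=0.75):
    """Score each tokenized document against the query with an Okapi-style BM25."""
    n_docs = len(docs_tokens)
    avg_len = sum(len(doc) for doc in docs_tokens) / n_docs

    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter()
    for doc in docs_tokens:
        doc_freq.update(set(doc))

    scores = []
    for doc in docs_tokens:
        term_freq = Counter(doc)
        score = 0.0
        for term in query_tokens:
            if term not in term_freq:
                continue
            # Rare terms get a larger inverse document frequency weight.
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            # k controls term-frequency saturation, b controls length normalization.
            denom = term_freq[term] + k * (1 - b + b * len(doc) / avg_len)
            score += idf * term_freq[term] * (k + 1) / denom
        scores.append(score)
    return scores


# Hypothetical toy corpus, tokenized on whitespace.
toy_docs = [
    "on a 12-month basis core inflation peaked".split(),
    "nominal wage growth must slow".split(),
    "housing sector activity fell".split(),
]
toy_scores = bm25_scores("12-month basis".split(), toy_docs)
print(toy_scores)
```

Only the first toy document contains the query terms, so it is the only one with a non-zero score. This is the exact-keyword behaviour that sparse search contributes to a hybrid query.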

In KDB.AI, you can run sparse or dense search independently, or run a hybrid search that performs both and then re-ranks the combined results according to a user-defined "alpha" value. An alpha value closer to 0 gives more weight to sparse search, while a value closer to 1 gives more weight to dense search.
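
The effect of alpha can be illustrated with a simple weighted blend of two per-document scores. This is a sketch under the assumption that the re-ranking behaves like a convex combination; the function name, the document labels, and the score values are all hypothetical and are not taken from the KDB.AI implementation.

```python
def hybrid_score(dense_score, sparse_score, alpha):
    """Blend two scores: alpha=1 is dense-only, alpha=0 is sparse-only."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return alpha * dense_score + (1.0 - alpha) * sparse_score


# Hypothetical, already-normalized scores for two documents (higher is better).
candidates = {
    "keyword_match": {"dense": 0.30, "sparse": 0.90},
    "paraphrase": {"dense": 0.85, "sparse": 0.10},
}


def rank(alpha):
    """Return document labels ordered by blended score, best first."""
    blended = {
        name: hybrid_score(s["dense"], s["sparse"], alpha)
        for name, s in candidates.items()
    }
    return sorted(blended, key=blended.get, reverse=True)


for alpha in (0.1, 0.5, 0.9):
    print(alpha, rank(alpha))
```

With a low alpha the exact keyword match ranks first; with a high alpha the semantically similar paraphrase overtakes it.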

## Install dependencies

In [None]:
%pip install llama-index llama-index-embeddings-huggingface llama-index-llms-openai llama-index-readers-file llama-index-vector-stores-kdbai
%pip install kdbai_client langchain-text-splitters pandas

## Downloading data

**Libraries**

In [2]:
import os
import urllib.request

In [3]:
import nest_asyncio

nest_asyncio.apply()

**Data directories and paths**

In [4]:
# Root path
root_path = os.path.abspath(os.getcwd())

# Data directory and path
data_dir = "data"
data_path = os.path.join(root_path, data_dir)
if not os.path.exists(data_path):
    os.mkdir(data_path)

**Downloading text**

In [5]:
text_url = "https://raw.githubusercontent.com/KxSystems/kdbai-samples/main/hybrid_search/data/inflation.txt"
with urllib.request.urlopen(text_url) as response:
    text_content = response.read().decode("utf-8")

text_file_name = text_url.split('/')[-1]
text_path = os.path.join(data_path, text_file_name)
if not os.path.exists(text_path):
    with open(text_path, 'w') as text_file:
        text_file.write(text_content)

metadata = {
    f"{data_dir}/{text_file_name}": {
        "title": text_file_name,
        "file_path": text_path
    }
}

**Show text data**

In [6]:
def show_text(text_path):
    with open(text_path, 'r') as text_file:
        contents = text_file.read()
    print(contents[:500])
    print("="*80)

In [7]:
show_text(text_path)

 At last year's Jackson Hole symposium, I delivered a brief, direct message. My remarks this year will be a bit longer, but the message is the same: It is the Fed's job to bring inflation down to our 2 percent goal, and we will do so. We have tightened policy significantly over the past year. Although inflation has moved down from its peak—a welcome development—it remains too high. We are prepared to raise rates further if appropriate, and intend to hold policy at a restrictive level until we ar


## KDB.AI Vector Database - session and tables

**Libraries**

In [8]:
import kdbai_client as kdbai
from getpass import getpass

**KDB.AI session**

The embeddings created later in this notebook need to be stored in a vector database to enable efficient searching, so we first set up a KDB.AI session.

### Define KDB.AI Session

KDB.AI comes in two offerings:

1. [KDB.AI Cloud](https://trykdb.kx.com/kdbai/signup/) - For experimenting with smaller generative AI projects with a vector database in our cloud.
2. [KDB.AI Server](https://trykdb.kx.com/kdbaiserver/signup/) - For evaluating large scale generative AI applications on-premises or on your own cloud provider.

Depending on which you use there will be different setup steps and connection details required.

##### Option 1. KDB.AI Cloud

To use KDB.AI Cloud, you will need two session details - a URL endpoint and an API key.
To get these you can sign up for free [here](https://trykdb.kx.com/kdbai/signup).

You can connect to a KDB.AI Cloud session using `kdbai.Session` and passing the session URL endpoint and API key details from your KDB.AI Cloud portal.

If the environment variables `KDBAI_ENDPOINT` and `KDBAI_API_KEY` are set to your KDB.AI Cloud portal details, they are used automatically to connect.
If they are not set, you are prompted to enter your KDB.AI Cloud portal session URL endpoint and API key.

In [None]:
KDBAI_ENDPOINT = (
    os.environ["KDBAI_ENDPOINT"]
    if "KDBAI_ENDPOINT" in os.environ
    else input("KDB.AI endpoint: ")
)
KDBAI_API_KEY = (
    os.environ["KDBAI_API_KEY"]
    if "KDBAI_API_KEY" in os.environ
    else getpass("KDB.AI API key: ")
)

session = kdbai.Session(api_key=KDBAI_API_KEY, endpoint=KDBAI_ENDPOINT)

##### Option 2. KDB.AI Server

To use KDB.AI Server, you will need to download and run your own container.
To do this, you will first need to sign up for free [here](https://trykdb.kx.com/kdbaiserver/signup/).

You will receive an email with the required license file and the bearer token needed to download your instance.
Follow the instructions in the signup email to get your session up and running.

Once the [setup steps](https://code.kx.com/kdbai/gettingStarted/kdb-ai-server-setup.html) are complete you can then connect to your KDB.AI Server session using `kdbai.Session` and passing your local endpoint.

In [None]:
# session = kdbai.Session(endpoint="http://localhost:8082")

**KDB.AI table**

In [11]:
# Table - name & schema
table_name = "hs_docs"
table_schema = {
    "columns": [
        dict(name="document_id", pytype="bytes"),
        dict(name="text", pytype="bytes"),
        dict(
            name="embedding",
            vectorIndex=dict(type="flat", metric="L2", dims=768)
        ),
        dict(
            name="sparseVectors",
            pytype="dict",
            sparseIndex=dict(k=1.25, b=0.75)
        ),
        dict(name="title", pytype="str"),
        dict(name="file_path", pytype="str")
    ]
}

In [12]:
# Drop table if exists
if table_name in session.list():
    session.table(table_name).drop()

In [13]:
# texts table
table = session.create_table(table_name, table_schema)

## Loading data

In [14]:
from llama_index.vector_stores.kdbai import KDBAIVectorStore
from llama_index.core import StorageContext
from llama_index.core import Settings
from llama_index.core.indices import VectorStoreIndex
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.callbacks import CallbackManager
from llama_index.core import SimpleDirectoryReader

In [15]:
# Helper function - for getting metadata
def get_metadata(file_path):
    return metadata[file_path]

In [16]:
%%time

local_files = [fpath for fpath in metadata]
documents = SimpleDirectoryReader(input_files=local_files, file_metadata=get_metadata)

docs = documents.load_data()
len(docs)

CPU times: user 14.4 ms, sys: 1.94 ms, total: 16.3 ms
Wall time: 16.4 ms


1

In [19]:
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
EMBEDDING = "sentence-transformers/all-mpnet-base-v2"
embeddings_model = HuggingFaceEmbedding(model_name=EMBEDDING)

### Create the vector store, storage context and index for retrieval and querying

In [20]:
%%time

# Vector Store
text_store = KDBAIVectorStore(table=table, hybrid_search=True)

# Storage context
storage_context = StorageContext.from_defaults(vector_store=text_store)

# Settings
#Settings.callback_manager = callback_manager
Settings.transformations = [SentenceSplitter(chunk_size=500, chunk_overlap=0)]
Settings.embed_model = embeddings_model
Settings.llm = None

# Vector Store Index
index = VectorStoreIndex.from_documents(
    docs,
    use_async=True,
    storage_context=storage_context,
)

LLM is explicitly disabled. Using MockLLM.



CPU times: user 7.82 s, sys: 780 ms, total: 8.6 s
Wall time: 11.3 s


In [21]:
table.query()

| | document_id | text | embedding | sparseVectors | title | file_path |
|---|---|---|---|---|---|---|
| 0 | `b'c26bcfc2-951e-40bd-959a-ae2b8edd2467'` | `b'At last year\'s Jackson Hole symposium, I de...` | `[-0.035284244, 0.0753799, -0.022666411, -0.017...` | `{101: 1, 2012: 4, 2197: 1, 2095: 3, 1005: 3, 1...` | inflation.txt | /content/data/inflation.txt |
| 1 | `b'e4d97506-7118-49ae-87bf-47c41abe670c'` | `b"On a 12-month basis, core PCE inflation peak...` | `[-0.04378559, 0.046354603, -0.030167095, 0.013...` | `{101: 1, 2006: 7, 1037: 5, 2260: 2, 1011: 6, 3...` | inflation.txt | /content/data/inflation.txt |
| 2 | `b'0014da23-8348-48af-ab56-c64ec48c47cc'` | `b'In the highly interest-sensitive housing sec...` | `[-0.07940253, 0.008506958, -0.035946056, -0.00...` | `{101: 1, 1999: 7, 1996: 21, 3811: 1, 3037: 2, ...` | inflation.txt | /content/data/inflation.txt |
| 3 | `b'1c00e107-b816-40d2-8445-a0ae707c2564'` | `b"Getting inflation sustainably back down to 2...` | `[-0.046816133, 0.052543037, -0.038334284, -0.0...` | `{101: 1, 2893: 1, 14200: 2, 15770: 1, 8231: 1,...` | inflation.txt | /content/data/inflation.txt |
| 4 | `b'73ac6d92-a93f-4b5f-a8c6-8bba94068e3f'` | `b'While nominal wage growth must ultimately sl...` | `[-0.033225708, 0.037619803, -0.030979052, -0.0...` | `{101: 1, 2096: 1, 15087: 2, 11897: 4, 3930: 4,...` | inflation.txt | /content/data/inflation.txt |
| 5 | `b'0decb50d-966f-448f-a2a4-88dacc50a375'` | `b'Doing too little could allow above-target in...` | `[-0.042863447, 0.02854309, -0.030805789, -0.03...` | `{101: 1, 2725: 2, 2205: 2, 2210: 1, 2071: 2, 3...` | inflation.txt | /content/data/inflation.txt |
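
The `sparseVectors` column appears to hold a mapping from token id to the number of times that token occurs in the chunk. The sketch below shows how such a mapping could be built from a list of token ids. It is an illustration only: the helper name is hypothetical, the ids are made up, and in practice the ids would come from whichever tokenizer the vector store uses.

```python
from collections import Counter


def to_sparse_vector(token_ids):
    """Count occurrences of each token id, giving a {token_id: count} dict."""
    return dict(Counter(token_ids))


# Hypothetical token ids for a short chunk (not produced by a real tokenizer here).
example_token_ids = [101, 2012, 2197, 2012, 2095, 102]
example_sparse = to_sparse_vector(example_token_ids)
print(example_sparse)
```

Because the mapping stores only the tokens that actually occur, most of the vocabulary is implicitly zero, which is what makes the vector sparse.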


## Retrieval from query using Hybrid Search

**Query**

In [22]:
query = '12-month basis'

**Helper function: To display search results**

In [23]:
import pandas as pd

In [24]:
def display_search_results(nodes):
    nodes_df = pd.DataFrame(columns=['score', 'text'])
    for node in nodes:
        nodes_df.loc[len(nodes_df.index)] = (node.score, node.text)
    return nodes_df

**Hybrid Search: Giving equal priority to both sparse and dense vector search ($\alpha=0.5$)**

In [25]:
%%time

retriever = index.as_retriever(similarity_top_k=5, vector_store_query_mode="hybrid")

CPU times: user 65 µs, sys: 12 µs, total: 77 µs
Wall time: 81.8 µs


In [26]:
equal_priority_nodes = retriever.retrieve(query)
display_search_results(equal_priority_nodes)



| | score | text |
|---|---|---|
| 0 | 0.416667 | On a 12-month basis, core PCE inflation peaked... |
| 1 | 0.333333 | While nominal wage growth must ultimately slow... |
| 2 | 0.266667 | At last year's Jackson Hole symposium, I deliv... |
| 3 | 0.225 | Getting inflation sustainably back down to 2 p... |
| 4 | 0.208333 | In the highly interest-sensitive housing secto... |


**Hybrid Search: Giving more priority to sparse vector search ($\alpha=0.1$)**

In [27]:
%%time

retriever = index.as_retriever(similarity_top_k=5, vector_store_query_mode="hybrid", alpha=0.1)

CPU times: user 55 µs, sys: 11 µs, total: 66 µs
Wall time: 71.3 µs


In [28]:
sparse_priority_nodes = retriever.retrieve(query)
display_search_results(sparse_priority_nodes)



| | score | text |
|---|---|---|
| 0 | 0.483333 | On a 12-month basis, core PCE inflation peaked... |
| 1 | 0.32 | At last year's Jackson Hole symposium, I deliv... |
| 2 | 0.241667 | In the highly interest-sensitive housing secto... |
| 3 | 0.205 | Getting inflation sustainably back down to 2 p... |
| 4 | 0.2 | While nominal wage growth must ultimately slow... |


**Hybrid Search: Giving more priority to dense vector search ($\alpha=0.9$)**

In [29]:
%%time

retriever = index.as_retriever(similarity_top_k=5, vector_store_query_mode="hybrid", alpha=0.90)

CPU times: user 81 µs, sys: 16 µs, total: 97 µs
Wall time: 102 µs


In [30]:
dense_priority_nodes = retriever.retrieve(query)
display_search_results(dense_priority_nodes)



| | score | text |
|---|---|---|
| 0 | 0.466667 | While nominal wage growth must ultimately slow... |
| 1 | 0.35 | On a 12-month basis, core PCE inflation peaked... |
| 2 | 0.245 | Getting inflation sustainably back down to 2 p... |
| 3 | 0.213333 | At last year's Jackson Hole symposium, I deliv... |
| 4 | 0.175 | In the highly interest-sensitive housing secto... |


**Conclusion**
- When sparse search is given priority, the top result contains the exact terms we are interested in, i.e. "12-month basis", rather than terms with similar meanings.
- When dense search is given priority, the top results are the texts most semantically related or similar to the query.
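
As a rough sanity check on how alpha shapes these rankings, the sketch below assumes each reported score is a linear blend, `alpha * dense + (1 - alpha) * sparse`. It solves for the two components of each chunk from the alpha=0.5 and alpha=0.1 tables, then predicts the alpha=0.9 score and compares it with the reported value. The linear model and the helper name are assumptions made for illustration, not a documented KDB.AI formula; the numbers are copied from the three result tables above.

```python
# Scores copied from the three result tables above, keyed by alpha.
reported = {
    "On a 12-month basis": {0.5: 0.416667, 0.1: 0.483333, 0.9: 0.35},
    "While nominal wage growth": {0.5: 0.333333, 0.1: 0.2, 0.9: 0.466667},
    "At last year's Jackson Hole": {0.5: 0.266667, 0.1: 0.32, 0.9: 0.213333},
    "Getting inflation sustainably": {0.5: 0.225, 0.1: 0.205, 0.9: 0.245},
    "In the highly interest-sensitive": {0.5: 0.208333, 0.1: 0.241667, 0.9: 0.175},
}


def solve_components(scores, alpha_a=0.5, alpha_b=0.1):
    """Solve score = alpha * dense + (1 - alpha) * sparse for (dense, sparse)."""
    s_a, s_b = scores[alpha_a], scores[alpha_b]
    det = alpha_a - alpha_b  # determinant of the 2x2 linear system
    dense = (s_a * (1 - alpha_b) - s_b * (1 - alpha_a)) / det
    sparse = (alpha_a * s_b - alpha_b * s_a) / det
    return dense, sparse


errors = {}
for label, scores in reported.items():
    dense, sparse = solve_components(scores)
    predicted = 0.9 * dense + 0.1 * sparse
    errors[label] = abs(predicted - scores[0.9])
    print(f"{label[:24]:<24} dense={dense:.4f} sparse={sparse:.4f} "
          f"predicted@0.9={predicted:.6f} reported@0.9={scores[0.9]}")
```

Under this assumption the predictions agree with the reported alpha=0.9 scores to within rounding, and the solved components come out close to 1/2, 1/3, 1/4, 1/5 and 1/6. That pattern suggests the blend may operate on rank-based scores rather than raw similarities, although this is an inference from the outputs rather than something the notebook states.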

## Delete the KDB.AI Table
Once finished with the table, it is best practice to drop it.

In [None]:
table.drop()