## RAG Hybrid search

This notebook shows how to perform a hybrid search (vector search + keyword search) using RAG components. The following framework and database libraries are used to build it:

1. LlamaIndex - for hybrid search
2. ChromaDB

### Import required libraries

In [1]:
import os, json
from dotenv import load_dotenv

import chromadb

import nest_asyncio
nest_asyncio.apply()

In [2]:
load_dotenv()

True

In [3]:
from llama_index.llms.azure_openai import AzureOpenAI
from llama_index.embeddings.azure_openai import AzureOpenAIEmbedding

from llama_index.core import (
    SimpleDirectoryReader, 
    VectorStoreIndex, 
    Settings
)

from llama_index.core.node_parser import SentenceSplitter
from llama_index.retrievers.bm25 import BM25Retriever
from llama_index.core.retrievers import QueryFusionRetriever
from llama_index.core.query_engine import RetrieverQueryEngine

In [4]:
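# load_dotenv() has already copied the .env values into os.environ; fail early if any are missing.
required_vars = ["OPENAI_API_VERSION", "OPENAI_API_BASE", "OPENAI_API_KEY"]
missing = [name for name in required_vars if not os.getenv(name)]
if missing:
    raise EnvironmentError(f"Missing environment variables: {', '.join(missing)}")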

### Define LLM and embedding models

These are required for RAG operations.

In [5]:
llm = AzureOpenAI(
    engine="<engine name>",  # name of the Azure OpenAI deployment
    model="gpt-4o",
    temperature=0.0,
    azure_endpoint=os.environ["OPENAI_API_BASE"],
    api_key=os.environ["OPENAI_API_KEY"],
    api_version=os.environ["OPENAI_API_VERSION"],
)

embed_model = AzureOpenAIEmbedding(
    model="text-embedding-ada-002",
    deployment_name="<deployment name>",
    azure_endpoint=os.environ["OPENAI_API_BASE"],
    api_key=os.environ["OPENAI_API_KEY"],
    api_version=os.environ["OPENAI_API_VERSION"],
)

In [6]:
Settings.llm = llm
Settings.embed_model = embed_model

### Load external data and index it

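We use a small chunk size of 256 tokens, compared with LlamaIndex's default of 1024. Smaller chunks typically make retrieval more precise, because each embedding covers a narrower span of text, although the best value depends on the documents and the queries.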

In [7]:
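data = SimpleDirectoryReader(input_files=["gs_MA_report.pdf"]).load_data()
splitter = SentenceSplitter(chunk_size=256)
# These nodes are reused later by the BM25 (keyword) retriever.
nodes = splitter.get_nodes_from_documents(data)

# The vector index applies the same splitter, so both retrievers work on the same chunks.
index = VectorStoreIndex.from_documents(
    data,
    transformations=[splitter],
    show_progress=True,
)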

Parsing nodes:   0%|          | 0/14 [00:00<?, ?it/s]

Generating embeddings:   0%|          | 0/68 [00:00<?, ?it/s]
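
To make the effect of the chunk size concrete, here is a minimal sketch of a word-based chunker. It is a toy illustration only: `chunk_words` and its `overlap` parameter are hypothetical names, and `SentenceSplitter` itself counts tokens and respects sentence boundaries, which this sketch does not.

```python
def chunk_words(text: str, chunk_size: int, overlap: int = 0) -> list[str]:
    """Split text into chunks of at most `chunk_size` words.

    Consecutive chunks share `overlap` words. Toy stand-in for a real splitter.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current chunk already reaches the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


# A synthetic 1000-word document (assumption: stands in for the parsed PDF text).
sample = " ".join(f"w{i}" for i in range(1000))
small = chunk_words(sample, chunk_size=256, overlap=20)
large = chunk_words(sample, chunk_size=1024, overlap=20)
print(len(small), len(large))
```

With the smaller size the same text yields more, narrower chunks, so each retrieved chunk carries less unrelated context.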

### Define vector and BM25 retrievers

In [8]:
vector_retriever = index.as_retriever(similarity_top_k=5)

bm25_retriever = BM25Retriever.from_defaults(
    nodes=nodes,
    similarity_top_k=10
)
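
For intuition about what the keyword side scores, the following is a minimal sketch of Okapi BM25 in plain Python. It assumes whitespace tokenization and common default values for `k1` and `b`; `bm25_scores` is a hypothetical helper, and the tokenization and BM25 variant used inside `BM25Retriever` may differ.

```python
import math
from collections import Counter


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Return one Okapi BM25 score per document for the given query."""
    if not docs:
        return []
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs
    # Document frequency: in how many documents each term appears.
    df = Counter(term for d in tokenized for term in set(d))

    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term frequency saturates with k1 and is normalized by document length via b.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Made-up example chunks (assumption: not taken from the report).
docs = [
    "generative ai reshapes software valuations",
    "interest rates and deal financing",
    "ai driven deals in software",
]
scores = bm25_scores("ai software", docs)
print(scores)
```

Documents that share no term with the query score zero, which is why keyword retrieval complements embedding-based retrieval rather than replacing it.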

### Define Hybrid Retriever

In [9]:
retriever = QueryFusionRetriever(
    [vector_retriever, bm25_retriever],
    retriever_weights=[0.6, 0.4],
    similarity_top_k=10,
    num_queries=1,  # to disable query generation
    mode="relative_score",
    use_async=True,
    verbose=True,
)
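
The idea behind the `relative_score` mode and the retriever weights can be sketched as follows: each retriever's scores are min-max scaled to the range 0 to 1, multiplied by that retriever's weight, and summed for chunks returned by more than one retriever. This is a simplified illustration, not the library's code; `relative_score_fusion`, the node ids, and the handling of identical scores are assumptions.

```python
def relative_score_fusion(
    results: list[dict[str, float]],
    weights: list[float],
    top_k: int,
) -> list[tuple[str, float]]:
    """Fuse per-retriever {node_id: score} maps into one ranked list."""
    fused: dict[str, float] = {}
    for scores, weight in zip(results, weights):
        if not scores:
            continue
        lo, hi = min(scores.values()), max(scores.values())
        for node_id, score in scores.items():
            # Min-max scale; assumption: if all scores are equal, treat them as 1.0.
            scaled = 1.0 if hi == lo else (score - lo) / (hi - lo)
            fused[node_id] = fused.get(node_id, 0.0) + weight * scaled
    ranked = sorted(fused.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_k]


# Made-up scores: cosine similarities for the vector side, raw BM25 scores for the keyword side.
vector_hits = {"a": 0.90, "b": 0.80, "c": 0.70}
bm25_hits = {"b": 12.0, "d": 6.0, "a": 3.0}
fused = relative_score_fusion([vector_hits, bm25_hits], weights=[0.6, 0.4], top_k=3)
print(fused)
```

Scaling first matters because cosine similarities and BM25 scores live on different ranges; without it, the weights 0.6 and 0.4 would not express the intended balance.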

### Define Query Engine and run RAG hybrid queries

In [10]:
query_engine = RetrieverQueryEngine.from_args(retriever)

In [11]:
%%time
query_ai = "what is the effect of GenAI in M&A in the year 2025"
response = query_engine.query(query_ai)
print(response)

Generative AI is expected to unlock efficiencies that could be deflationary and potentially disruptive for some SaaS companies, pushing down valuations and, in certain cases, driving them to go private. Conversely, software companies with entrenched customer relationships and proprietary datasets are likely to become key targets for AI transformation, leading to solid M&A outcomes.
CPU times: total: 219 ms
Wall time: 1.69 s


### END