# Hybrid and rerank
https://github.com/milvus-io/bootcamp/blob/master/bootcamp/RAG/advanced_rag/hybrid_and_rerank_with_langchain.ipynb

In [2]:
from rag_utils.vanilla import llm, vectorstore

AIMessage(content='Hello! It looks like you\'ve typed "test." If you have any questions or need assistance with something, feel free to ask!', response_metadata={'token_usage': {'completion_tokens': 28, 'prompt_tokens': 8, 'total_tokens': 36}, 'model_name': 'deepseek-chat', 'system_fingerprint': 'fp_592007501c', 'finish_reason': 'stop', 'logprobs': None}, id='run-19ce8fec-da49-4b84-af87-02826d10318d-0', usage_metadata={'input_tokens': 8, 'output_tokens': 28, 'total_tokens': 36})

## Prepare the data

In [2]:
import bs4
from langchain_community.document_loaders import WebBaseLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

from rag_utils.vanilla import vectorstore

# Create a WebBaseLoader instance to load documents from web sources
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)
# Load documents from web sources using the loader
documents = loader.load()
# Initialize a RecursiveCharacterTextSplitter for splitting text into chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=0)

# Split the documents into chunks using the text_splitter
docs = text_splitter.split_documents(documents)
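
To make the effect of `chunk_size=1000` and `chunk_overlap=0` concrete, here is a simplified, standard-library-only sketch. It is not LangChain's implementation: `RecursiveCharacterTextSplitter` falls back through a list of separators, whereas the hypothetical `split_on_paragraphs` helper below only packs paragraphs greedily and hard-cuts any paragraph that is too long. The sample text is made up for illustration.

```python
def split_on_paragraphs(text: str, chunk_size: int = 1000) -> list[str]:
    """Greedily pack paragraphs into chunks of at most chunk_size characters.

    Hypothetical helper for illustration only; chunks never overlap.
    """
    chunks: list[str] = []
    current = ""
    for para in text.split("\n\n"):
        para = para.strip()
        if not para:
            continue  # skip empty paragraphs
        # Hard-cut paragraphs that are longer than one chunk.
        for start in range(0, len(para), chunk_size):
            piece = para[start:start + chunk_size]
            candidate = f"{current}\n\n{piece}" if current else piece
            if len(candidate) <= chunk_size:
                current = candidate  # the piece still fits in the open chunk
            else:
                chunks.append(current)  # close the open chunk, start a new one
                current = piece
    if current:
        chunks.append(current)
    return chunks


# Made-up sample: three paragraphs of 600, 600 and 300 characters.
sample = "\n\n".join(["a" * 600, "b" * 600, "c" * 300])
chunks = split_on_paragraphs(sample, chunk_size=1000)
print([len(chunk) for chunk in chunks])
```

With no overlap, every character of the input lands in exactly one chunk, so neighbouring chunks share no context. A non-zero `chunk_overlap` trades some duplication for continuity across chunk boundaries.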

## Build the chain

Load the docs into the Milvus vector store and build a retriever on top of it.

In [4]:
vectorstore.add_documents(docs)
retriever = vectorstore.as_retriever()

Build a vanilla RAG chain:

In [5]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from rag_utils.vanilla import format_docs, rag_prompt, llm


vanilla_rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | rag_prompt
    | llm
    | StrOutputParser()
)
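
The data flow of this chain can be sketched in plain Python. The dict at the head of the chain sends the same input to both keys: `"context"` runs retrieval and formatting, while `"question"` passes the input through unchanged. Everything below is a stand-in for illustration: `run_rag` and the `toy_*` functions are hypothetical names, and the formatting step assumes `format_docs` simply joins document texts, which the notebook does not show.

```python
def run_rag(question, retrieve, format_docs, build_prompt, generate):
    """Mimic the chain: fan out the input, fill the prompt, call the model."""
    # Step 1: both keys receive the same question.
    inputs = {
        "context": format_docs(retrieve(question)),  # retriever | format_docs
        "question": question,                        # RunnablePassthrough()
    }
    # Step 2: fill the prompt template with context and question.
    prompt = build_prompt(**inputs)
    # Step 3: call the model; this stand-in returns a plain string directly.
    return generate(prompt)


# Toy stand-ins (assumptions, not the notebook's real components).
corpus = [
    "FAISS applies vector quantization.",
    "ScaNN uses anisotropic vector quantization.",
]


def toy_retrieve(question):
    return corpus  # a real retriever would return only the top-k matches


def toy_format_docs(docs):
    return "\n\n".join(docs)


def toy_build_prompt(context, question):
    return f"Context:\n{context}\n\nQuestion: {question}"


def toy_generate(prompt):
    return f"(echo) {prompt}"  # a real LLM would write an answer here


question = "which algorithms work in a vector store"
answer = run_rag(question, toy_retrieve, toy_format_docs, toy_build_prompt, toy_generate)
print(answer)
```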

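Build a HyDE (Hypothetical Document Embeddings) chain, which retrieves with the embedding of an LLM-generated hypothetical answer rather than the embedding of the raw query.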

In [6]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from rag_utils.hyde import HydeRetriever

hyde_retriever = HydeRetriever.from_vectorstore(vectorstore)

hyde_chain = (
    {"context": hyde_retriever | format_docs, "question": RunnablePassthrough()}
    | rag_prompt
    | llm
    | StrOutputParser()
)
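
As a rough illustration of the HyDE idea, the sketch below embeds a hypothetical answer instead of the query and ranks documents against it. It makes no claim about how `HydeRetriever` is implemented. The bag-of-words `embed` function, the canned `fake_llm` answer, and the three-sentence corpus are all assumptions chosen so that the example runs with the standard library only; a real setup would use a dense embedding model and an actual LLM call.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words 'embedding' (stand-in for a dense embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def hyde_retrieve(query, generate_hypothetical, corpus, k=2):
    """Retrieve with the embedding of a hypothetical answer, not of the query."""
    hypothetical = generate_hypothetical(query)
    query_vector = embed(hypothetical)
    ranked = sorted(corpus, key=lambda doc: cosine(query_vector, embed(doc)), reverse=True)
    return ranked[:k]


def fake_llm(query):
    # Canned text standing in for an LLM-written hypothetical answer.
    return (
        "Approximate nearest neighbor algorithms such as LSH use hashing "
        "and ANNOY uses random projection trees"
    )


corpus = [
    "LSH introduces a hashing function so similar items map to the same buckets",
    "ANNOY uses random projection trees to split the input space",
    "The agent uses a planning module to decompose tasks",
]

results = hyde_retrieve("which algorithms work in a vector store", fake_llm, corpus, k=2)
print(results)
```

The hypothetical answer shares vocabulary with the relevant passages even when the query itself does not, which is why the off-topic sentence about planning falls out of the top results here.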

## Test the chains

In [7]:
query = "which vector approximate searching algorithms work in a vector store"

vanilla_result = vanilla_rag_chain.invoke(query)
hyde_result = hyde_chain.invoke(query)
print(f"\n[vanilla_result]:\n{vanilla_result}\n\n[hyde_result]:\n{hyde_result}")

[vanilla_result]:
The vector approximate searching algorithms that work in a vector store include FAISS (Facebook AI Similarity Search) and ScaNN (Scalable Nearest Neighbors). These algorithms are commonly used for fast Maximum Inner Product Search (MIPS) in vector stores, with FAISS applying vector quantization by partitioning the vector space into clusters and refining quantization within clusters, and ScaNN introducing anisotropic vector quantization to maintain similarity in inner products.

[hyde_result]:
FAISS, ScaNN, LSH, and ANNOY are vector approximate searching algorithms that work in a vector store. FAISS uses vector quantization and clustering, ScaNN employs anisotropic vector quantization, LSH utilizes locality-sensitive hashing, and ANNOY leverages random projection trees.


In [8]:
retriever.invoke(query)

[Document(page_content='FAISS (Facebook AI Similarity Search): It operates on the assumption that in high dimensional space, distances between nodes follow a Gaussian distribution and thus there should exist clustering of data points. FAISS applies vector quantization by partitioning the vector space into clusters and then refining the quantization within clusters. Search first looks for cluster candidates with coarse quantization and then further looks into each cluster with finer quantization.\nScaNN (Scalable Nearest Neighbors): The main innovation in ScaNN is anisotropic vector quantization. It quantizes a data point $x_i$ to $\\tilde{x}_i$ such that the inner product $\\langle q, x_i \\rangle$ is as similar to the original distance of $\\angle q, \\tilde{x}_i$ as possible, instead of picking the closet quantization centroid points.', metadata={'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/', 'pk': 450749248914849811}),
 Document(page_content='FAISS (Facebook AI Si

In [9]:
hyde_retriever.invoke(query)

[Document(page_content='FAISS (Facebook AI Similarity Search): It operates on the assumption that in high dimensional space, distances between nodes follow a Gaussian distribution and thus there should exist clustering of data points. FAISS applies vector quantization by partitioning the vector space into clusters and then refining the quantization within clusters. Search first looks for cluster candidates with coarse quantization and then further looks into each cluster with finer quantization.\nScaNN (Scalable Nearest Neighbors): The main innovation in ScaNN is anisotropic vector quantization. It quantizes a data point $x_i$ to $\\tilde{x}_i$ such that the inner product $\\langle q, x_i \\rangle$ is as similar to the original distance of $\\angle q, \\tilde{x}_i$ as possible, instead of picking the closet quantization centroid points.', metadata={'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/', 'pk': 450749248914849811}),
 Document(page_content='FAISS (Facebook AI Si
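
Long `Document` reprs are hard to compare by eye. One option is to compare the two result lists by the `pk` value that appears in each document's metadata. The `compare_by_pk` helper below is hypothetical, and the stand-in documents use placeholder `pk` values rather than real Milvus keys.

```python
from types import SimpleNamespace


def compare_by_pk(docs_a, docs_b):
    """Report which primary keys two result lists share, keeping input order."""
    pks_a = [doc.metadata["pk"] for doc in docs_a]
    pks_b = [doc.metadata["pk"] for doc in docs_b]
    set_a, set_b = set(pks_a), set(pks_b)
    return {
        "shared": [pk for pk in pks_a if pk in set_b],
        "only_first": [pk for pk in pks_a if pk not in set_b],
        "only_second": [pk for pk in pks_b if pk not in set_a],
    }


# Stand-in documents: only the metadata attribute is needed for the comparison.
vanilla_docs = [SimpleNamespace(metadata={"pk": pk}) for pk in (1, 2, 3, 4)]
hyde_docs = [SimpleNamespace(metadata={"pk": pk}) for pk in (1, 5, 2, 6)]

report = compare_by_pk(vanilla_docs, hyde_docs)
print(report)
```

In the notebook, the equivalent call would be `compare_by_pk(retriever.invoke(query), hyde_retriever.invoke(query))`, assuming every returned document carries a `pk` entry in its metadata as in the output above.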