# Hybrid and rerank
https://github.com/milvus-io/bootcamp/blob/master/bootcamp/RAG/advanced_rag/hybrid_and_rerank_with_langchain.ipynb

In [2]:
from rag_utils.vanilla import llm, vectorstore

# Sanity check that the LLM responds (call inferred from the output below)
llm.invoke("test")

AIMessage(content='Hello! It looks like you\'ve typed "test." If you have any questions or need assistance with something, feel free to ask!', response_metadata={'token_usage': {'completion_tokens': 28, 'prompt_tokens': 8, 'total_tokens': 36}, 'model_name': 'deepseek-chat', 'system_fingerprint': 'fp_592007501c', 'finish_reason': 'stop', 'logprobs': None}, id='run-19ce8fec-da49-4b84-af87-02826d10318d-0', usage_metadata={'input_tokens': 8, 'output_tokens': 28, 'total_tokens': 36})

## Prepare the data

In [4]:
import bs4
from langchain_community.document_loaders import WebBaseLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

from rag_utils.vanilla import vectorstore

# Create a WebBaseLoader instance to load documents from web sources
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)
# Load documents from web sources using the loader
documents = loader.load()
# Initialize a RecursiveCharacterTextSplitter for splitting text into chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=0)

# Split the documents into chunks using the text_splitter
docs = text_splitter.split_documents(documents)

## Build the chain

We load the docs into the Milvus vector store and build a Milvus retriever from it.

In [6]:
vectorstore.add_documents(docs)
milvus_retriever = vectorstore.as_retriever()

Next, build a BM25 retriever from the same docs.

In [8]:
from langchain_community.retrievers import BM25Retriever

bm25_retriever = BM25Retriever.from_documents(docs)
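
BM25 ranks documents by keyword overlap rather than by embedding similarity, which is why it complements the Milvus retriever. The sketch below is a minimal, self-contained illustration of Okapi BM25 scoring. It is not the code `BM25Retriever` runs internally: the whitespace tokenizer, the smoothed IDF variant, the `k1`/`b` defaults, and the toy corpus are all assumptions made for the example.

```python
import math
from collections import Counter


def bm25_scores(query, corpus, k1=1.5, b=0.75):
    """Score every document in `corpus` against `query` with Okapi BM25.

    Assumption: tokens are lowercased, whitespace-separated words.
    """
    tokenized = [doc.lower().split() for doc in corpus]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs

    # Document frequency: number of documents containing each term.
    doc_freq = Counter()
    for toks in tokenized:
        doc_freq.update(set(toks))

    scores = []
    for toks in tokenized:
        term_freq = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            tf = term_freq[term]
            if tf == 0:
                continue
            # Smoothed IDF; the "+ 1" inside the log keeps it non-negative.
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1)
            # Term-frequency saturation with document-length normalization.
            norm = tf + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


# Toy corpus (hypothetical, not taken from the blog post).
toy_corpus = [
    "the agent breaks large tasks into subgoals",
    "vector search finds semantically similar chunks",
    "the agent can call external tools",
]
toy_scores = bm25_scores("agent tools", toy_corpus)
best = max(range(len(toy_corpus)), key=toy_scores.__getitem__)
print(best, toy_scores)
```

The third document matches both query terms, so it scores highest. The second matches neither and scores zero, even though a dense retriever might still consider it related.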

Build a vanilla RAG chain that uses only the Milvus retriever.

In [9]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from rag_utils.vanilla import format_docs, rag_prompt, llm


vanilla_rag_chain = (
    {"context": milvus_retriever | format_docs, "question": RunnablePassthrough()}
    | rag_prompt
    | llm
    | StrOutputParser()
)
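
The `format_docs` helper comes from `rag_utils`, which is not shown here. A common pattern for this step is to join the text of the retrieved chunks into one context string. The sketch below assumes that behavior, and `Doc` and `format_docs_sketch` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for a LangChain Document."""
    page_content: str


def format_docs_sketch(docs):
    """Join chunk texts with blank lines so the prompt receives one context string."""
    return "\n\n".join(doc.page_content for doc in docs)


context = format_docs_sketch([Doc("first chunk"), Doc("second chunk")])
print(context)
```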

Prepare the `hybrid_and_rerank_retriever`.

LangChain's **Cross Encoder Reranker** adds a reranking step to a retriever. It works with your own cross encoder, which can be any Hugging Face cross-encoder model or any Hugging Face model that implements the cross-encoder function (for example, BAAI/bge-reranker-base). The cell below uses BAAI/bge-reranker-v2-m3.

In [13]:
from rag_utils.hybrid_and_rerank import RerankerRunnable
from langchain.retrievers.document_compressors import CrossEncoderReranker
from langchain_community.cross_encoders import HuggingFaceCrossEncoder

model = HuggingFaceCrossEncoder(model_name="BAAI/bge-reranker-v2-m3")
compressor = CrossEncoderReranker(model=model, top_n=4)
reranker = RerankerRunnable(compressor=compressor, top_k=4)
hybrid_and_rerank_retriever = {
    "milvus_retrieved_doc": milvus_retriever,
    "bm25_retrieved_doc": bm25_retriever,
    "query": RunnablePassthrough(),
} | reranker
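
`RerankerRunnable` also comes from `rag_utils` and is not shown. Judging by the `len(unique_documents)` lines printed in the test output, it appears to merge the two candidate lists, drop duplicates, and pass the rest to the reranker. The sketch below illustrates that flow under those assumptions. `Doc`, `merge_and_rerank`, and `overlap_score` are hypothetical names, and the token-overlap score is a toy stand-in for the cross encoder, not what the notebook uses.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class Doc:
    """Hypothetical stand-in for a LangChain Document."""
    page_content: str


def merge_and_rerank(
    inputs: Dict[str, Any],
    score_fn: Callable[[str, str], float],
    top_k: int = 4,
) -> List[Doc]:
    """Union both candidate lists, drop duplicates, keep the top_k by score."""
    candidates = inputs["milvus_retrieved_doc"] + inputs["bm25_retrieved_doc"]
    # Deduplicate on chunk text while preserving first-seen order.
    unique = list({doc.page_content: doc for doc in candidates}.values())
    ranked = sorted(
        unique,
        key=lambda doc: score_fn(inputs["query"], doc.page_content),
        reverse=True,
    )
    return ranked[:top_k]


def overlap_score(query: str, text: str) -> float:
    """Toy relevance score: number of shared lowercase tokens (not a cross encoder)."""
    return float(len(set(query.lower().split()) & set(text.lower().split())))


dense_hits = [Doc("agents plan tasks"), Doc("agents use tools")]
sparse_hits = [Doc("agents use tools"), Doc("tools extend what agents can do")]
top = merge_and_rerank(
    {
        "milvus_retrieved_doc": dense_hits,
        "bm25_retrieved_doc": sparse_hits,
        "query": "what can agents do with tools",
    },
    overlap_score,
    top_k=2,
)
print([doc.page_content for doc in top])
```

The duplicate chunk returned by both retrievers is counted once, so three unique candidates are scored and the best two are kept.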

In [14]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

hybrid_and_rerank_chain = (
    {
        "context": hybrid_and_rerank_retriever | format_docs,
        "question": RunnablePassthrough(),
    }
    | rag_prompt
    | llm
    | StrOutputParser()
)

## Test the chain

In [26]:
# query = "Which model use tools?"
query = "What does AI agent can do?"

vanilla_result = vanilla_rag_chain.invoke(query)
hybrid_and_rerank_result = hybrid_and_rerank_chain.invoke(query)
print(
    f"\n[vanilla_result]:\n{vanilla_result}\n\n[hybrid_and_rerank_result]:\n{hybrid_and_rerank_result}"
)

len(milvus_retrieved_doc) = 4
len(bm25_retrieved_doc) = 4
len(unique_documents) = 6

[vanilla_result]:
An AI agent powered by a large language model (LLM) can perform several tasks, including:

1. Planning: The agent can break down large tasks into smaller, manageable subgoals, enabling efficient handling of complex tasks. It can also engage in self-criticism and self-reflection over past actions, learn from mistakes, and refine future steps to improve the quality of final results.

2. Memory: The agent can simulate emergent social behavior, such as information diffusion, relationship memory (e.g., two agents continuing the conversation topic), and coordination of social events (e.g., host a party and invite many others).

3. Parsing user input: The AI assistant can parse user input into several tasks, with a logical relationship between them, and select tasks from an available task list. It can also record chat history and use it for task planning.

However, the reliability of the nat

In [None]:
hybrid_and_rerank_retriever.invoke(query)

len(milvus_retrieved_doc) = 4
len(bm25_retrieved_doc) = 4
len(unique_documents) = 5


[Document(page_content='LLM Powered Autonomous Agents\n    \nDate: June 23, 2023  |  Estimated Reading Time: 31 min  |  Author: Lilian Weng\n\n\nBuilding agents with LLM (large language model) as its core controller is a cool concept. Several proof-of-concepts demos, such as AutoGPT, GPT-Engineer and BabyAGI, serve as inspiring examples. The potentiality of LLM extends beyond generating well-written copies, stories, essays and programs; it can be framed as a powerful general problem solver.\nAgent System Overview#\nIn a LLM-powered autonomous agent system, LLM functions as the agent’s brain, complemented by several key components:\n\nPlanning\n\nSubgoal and decomposition: The agent breaks down large tasks into smaller, manageable subgoals, enabling efficient handling of complex tasks.\nReflection and refinement: The agent can do self-criticism and self-reflection over past actions, learn from mistakes and refine them for future steps, thereby improving the quality of final results.\n\n

In [25]:
milvus_retriever.invoke(query)

[Document(page_content='LLM Powered Autonomous Agents\n    \nDate: June 23, 2023  |  Estimated Reading Time: 31 min  |  Author: Lilian Weng\n\n\nBuilding agents with LLM (large language model) as its core controller is a cool concept. Several proof-of-concepts demos, such as AutoGPT, GPT-Engineer and BabyAGI, serve as inspiring examples. The potentiality of LLM extends beyond generating well-written copies, stories, essays and programs; it can be framed as a powerful general problem solver.\nAgent System Overview#\nIn a LLM-powered autonomous agent system, LLM functions as the agent’s brain, complemented by several key components:\n\nPlanning\n\nSubgoal and decomposition: The agent breaks down large tasks into smaller, manageable subgoals, enabling efficient handling of complex tasks.\nReflection and refinement: The agent can do self-criticism and self-reflection over past actions, learn from mistakes and refine them for future steps, thereby improving the quality of final results.\n\n