# bRAG: Routing and Query Construction

![route_query_construction](https://github.com/tivon-x/bRAG-langchain/blob/main/notebooks/image/route_query_construction.png?raw=1)

## Dependencies

In [None]:
! pip3 install --quiet langchain_community tiktoken langchain-openai langchainhub chromadb langchain youtube-transcript-api pytube yt_dlp langgraph
! pip install --upgrade --quiet  dashscope

In [2]:
! pip3 install --quiet python-dotenv

## Environment

`(1) Packages`

In [None]:
# Non-Colab environment
import os
from dotenv import load_dotenv

# Load all environment variables from the .env file
load_dotenv()

# LangSmith
langsmith_tracing = os.getenv('LANGSMITH_TRACING')
langsmith_endpoint = os.getenv('LANGSMITH_ENDPOINT')
langsmith_api_key = os.getenv('LANGSMITH_API_KEY')

## LLM
dashscope_api_key = os.getenv('DASHSCOPE_API_KEY')


In [3]:
# Colab environment
import os
from google.colab import userdata

langsmith_tracing = userdata.get('LANGSMITH_TRACING')
langsmith_endpoint = userdata.get('LANGSMITH_ENDPOINT')
langsmith_api_key = userdata.get('LANGSMITH_API_KEY')

dashscope_api_key = userdata.get("DASHSCOPE_API_KEY")

`(2) LangSmith`

https://docs.smith.langchain.com/

In [4]:
os.environ['LANGSMITH_TRACING'] = langsmith_tracing
os.environ['LANGSMITH_ENDPOINT'] = langsmith_endpoint
os.environ['LANGSMITH_API_KEY'] = langsmith_api_key

`(3) API Keys`

In [5]:
# Use the Alibaba Cloud Bailian platform
os.environ['DASHSCOPE_API_KEY'] = dashscope_api_key
dashscope_model = "qwen-plus-latest"

## bRAG: Logical and Semantic routing

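Routing lets you create non-deterministic chains in which the output of the previous step defines the next step. It helps provide structure and consistency around interactions with models by letting you define states and use information related to those states as context for model calls.

Ways to perform routing:

- Conditionally return runnables from a [`RunnableLambda`](https://python.langchain.ac.cn/docs/how_to/functions/), with the decision driven by structured output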
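
The idea can be sketched without any LLM: a classifier picks a label, and a dispatch table maps that label to the next step. In the sketch below, `classify` is a hypothetical keyword-based stand-in for the model call, and the handler names are illustrative assumptions rather than part of LangChain.

```python
from typing import Callable, Dict


# Hypothetical handlers standing in for downstream chains.
def answer_from_python_docs(question: str) -> str:
    return f"[python_docs] {question}"


def answer_from_js_docs(question: str) -> str:
    return f"[js_docs] {question}"


# Dispatch table: route label -> next step.
ROUTES: Dict[str, Callable[[str], str]] = {
    "python_docs": answer_from_python_docs,
    "js_docs": answer_from_js_docs,
}


def classify(question: str) -> str:
    """Toy stand-in for the LLM classifier: a simple keyword check."""
    lowered = question.lower()
    if "javascript" in lowered or "node" in lowered:
        return "js_docs"
    return "python_docs"


def route(question: str) -> str:
    """Run the classifier, then let its output choose the next step."""
    label = classify(question)
    return ROUTES[label](question)


print(route("How do I read a file in Python?"))
```

The chain is non-deterministic in the sense that the step after `classify` is not fixed in advance; it depends on the label produced at run time.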

Flow:

![routing](https://github.com/tivon-x/bRAG-langchain/blob/main/notebooks/image/routing.png?raw=1)

Docs:

https://python.langchain.com/docs/how_to/routing/

In [7]:
from typing import Literal

from langchain_core.prompts import ChatPromptTemplate
from pydantic import BaseModel, Field
from langchain_community.chat_models.tongyi import ChatTongyi

# Data model
class RouteQuery(BaseModel):
    """Route a user query to the most relevant datasource."""

    datasource: Literal["python_docs", "js_docs", "golang_docs"] = Field(
        ...,
        description="Given a user question choose which datasource would be most relevant for answering their question",
    )

# LLM with function call
llm = ChatTongyi(model=dashscope_model, temperature=0.1)
structured_llm = llm.with_structured_output(RouteQuery)

# Prompt
system = """You are an expert at routing a user question to the appropriate data source.

Based on the programming language the question is referring to, route it to the relevant data source."""

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system),
        ("human", "{question}"),
    ]
)

# Define router
router = prompt | structured_llm

Note: structured output is implemented with a function call under the hood.

![structured_output](https://github.com/tivon-x/bRAG-langchain/blob/main/notebooks/image/structured_output.png?raw=1)
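
To make the note above concrete: with function calling, the model receives a schema describing the arguments and replies with a JSON payload, which is then validated and turned into an object. The sketch below imitates only that last step, using the standard library. The payload string is a made-up example of what a tool call's arguments might look like, and `parse_route` is a hypothetical helper, not a LangChain or DashScope API.

```python
import json
from dataclasses import dataclass

# The allowed values mirror the Literal type used for RouteQuery.datasource.
ALLOWED_DATASOURCES = ("python_docs", "js_docs", "golang_docs")


@dataclass
class Route:
    datasource: str


def parse_route(arguments_json: str) -> Route:
    """Validate tool-call arguments (a JSON string) and build a typed result."""
    args = json.loads(arguments_json)
    datasource = args.get("datasource")
    if datasource not in ALLOWED_DATASOURCES:
        raise ValueError(f"unexpected datasource: {datasource!r}")
    return Route(datasource=datasource)


# Made-up payload, for illustration only.
raw_arguments = '{"datasource": "python_docs"}'
print(parse_route(raw_arguments))
```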

In [8]:
question = """Why doesn't the following code work:

from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages(["human", "speak in {language}"])
prompt.invoke("french")
"""

result = router.invoke({"question": question})

In [9]:
result

RouteQuery(datasource='python_docs')

In [10]:
result.datasource

'python_docs'

Based on `result.datasource`, a custom function can route between different outputs. The function needs to be wrapped in a `RunnableLambda`.

In [11]:
def choose_route(result):
    # Each branch returns a placeholder; replace it with the real chain.
    if "python_docs" in result.datasource.lower():
        ### Logic here
        return "chain for python_docs"
    elif "js_docs" in result.datasource.lower():
        ### Logic here
        return "chain for js_docs"
    else:
        ### Logic here
        return "chain for golang_docs"

from langchain_core.runnables import RunnableLambda

full_chain = router | RunnableLambda(choose_route)

In [12]:
full_chain.invoke({"question": question})

'chain for python_docs'

Trace:

https://smith.langchain.com/public

### Semantic routing

Use embeddings to route a query to the most relevant prompt.

Flow:

![semantic_routing](https://github.com/tivon-x/bRAG-langchain/blob/main/notebooks/image/semantic_routing.png?raw=1)

Docs:

https://python.langchain.com/docs/how_to/routing/
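
Before running this against a real embedding model, the selection step can be illustrated with plain Python. The sketch below uses invented 3-dimensional vectors in place of real embeddings; `cosine` and `pick_prompt` are hypothetical helper names, and only the standard library is used.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors invented for illustration; a real model returns far more dimensions.
prompt_vectors: Dict[str, List[float]] = {
    "physics": [0.9, 0.1, 0.0],
    "math": [0.1, 0.9, 0.0],
}


def pick_prompt(query_vector: Sequence[float]) -> str:
    """Return the name of the prompt whose vector is most similar to the query."""
    return max(
        prompt_vectors,
        key=lambda name: cosine(query_vector, prompt_vectors[name]),
    )


print(pick_prompt([0.8, 0.3, 0.1]))
```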

In [13]:
from langchain.utils.math import cosine_similarity
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnableLambda, RunnablePassthrough
from langchain_community.embeddings import DashScopeEmbeddings

# Two prompts
physics_template = """You are a very smart physics professor. \
You are great at answering questions about physics in a concise and easy to understand manner. \
When you don't know the answer to a question you admit that you don't know.

Here is a question:
{query}"""

math_template = """You are a very good mathematician. You are great at answering math questions. \
You are so good because you are able to break down hard problems into their component parts, \
answer the component parts, and then put them together to answer the broader question.

Here is a question:
{query}"""

# Embed prompts
embeddings = DashScopeEmbeddings(model="text-embedding-v4")
prompt_templates = [physics_template, math_template]
prompt_embeddings = embeddings.embed_documents(prompt_templates)

# Route question to prompt
def prompt_router(input):
    # Embed question
    query_embedding = embeddings.embed_query(input["query"])
    # Compute similarity
    similarity = cosine_similarity([query_embedding], prompt_embeddings)[0]
    most_similar = prompt_templates[similarity.argmax()]
    # Chosen prompt
    print("Using MATH" if most_similar == math_template else "Using PHYSICS")
    return PromptTemplate.from_template(most_similar)


chain = (
    {"query": RunnablePassthrough()}
    | RunnableLambda(prompt_router)
    | llm
    | StrOutputParser()
)

print(chain.invoke("What's a black hole"))

Using PHYSICS
A black hole is a region in space where the gravitational pull is so strong that nothing—not even light—can escape from it. This extreme gravity occurs because a large amount of mass is compressed into a very small area, creating a powerful gravitational field.

The boundary around a black hole, called the **event horizon**, marks the point of no return: once anything crosses this boundary, it cannot come back. Inside the event horizon, matter and energy are pulled toward a central point called the **singularity**, where density becomes infinite and our current laws of physics break down.

Black holes form when extremely massive stars collapse under their own gravity at the end of their life cycles. There are different types of black holes, including **stellar-mass black holes** (formed from collapsing stars), **supermassive black holes** (found at the centers of galaxies, including our Milky Way), and possibly **intermediate-mass black holes**.

They were first predicted

Trace:

https://smith.langchain.com/public

# Query Construction

![query_construction](https://github.com/tivon-x/bRAG-langchain/blob/main/notebooks/image/query_construction.png?raw=1)

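In typical retrieval-augmented generation (RAG), the user query is converted into a vector representation. This vector is then compared with the vector representations of the source documents to find the most similar ones. This works quite well for unstructured data, but not necessarily for structured data.

For many user queries, the best answer comes not only from finding similar documents in embedding space, but also from exploiting the structure that is inherent in the data and expressed in the user query.

For example, consider the query "What are some alien movies from 1980?". One part (aliens) calls for a semantic lookup, while another part (`year == 1980`) calls for an exact lookup.

Query construction is the process of converting a natural language query into the query language of the database you are interacting with.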
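
As a rough illustration of that split, the sketch below separates a question into a semantic search string and an exact-match filter. The regular expression is a deliberately naive stand-in for the LLM that does this work later in the notebook, and `construct_query` is a hypothetical name.

```python
import re
from typing import Dict, Tuple

# Naive pattern for phrases such as "in 1980" or "from 1980".
YEAR_PATTERN = re.compile(r"\b(?:in|from)\s+(\d{4})\b")


def construct_query(text: str) -> Tuple[str, Dict[str, int]]:
    """Split a question into (semantic search string, exact-match filters)."""
    filters: Dict[str, int] = {}
    match = YEAR_PATTERN.search(text)
    if match:
        filters["year"] = int(match.group(1))
        # Remove the year phrase so it does not pollute the semantic part.
        text = text[: match.start()] + text[match.end():]
    semantic = " ".join(text.split())  # normalize whitespace
    return semantic, filters


print(construct_query("movies about aliens in 1980"))
```

The semantic part would go to the vector search, while the filter would be applied as an exact condition on the `year` field.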

| **Example**                 | **Data source**            | **References**                                               |
| --------------------------- | -------------------------- | ------------------------------------------------------------ |
| **Text-to-metadata-filter** | Vectorstores               | [**Docs**](https://python.langchain.com/docs/how_to/self_query/) |
| **Text-to-SQL**             | SQL DB                     | [**Docs**](https://python.langchain.com/docs/tutorials/sql_qa/)**,** [**blog**](https://blog.langchain.dev/llms-and-sql/)**,** [**blog**](https://blog.langchain.dev/incorporating-domain-specific-knowledge-in-sql-llm-solutions/) |
| **Text-to-SQL + Semantic**  | PGVector-supported SQL DB  | [**Cookbook**](https://github.com/langchain-ai/langchain/blob/master/cookbook/retrieval_in_sql.ipynb?ref=blog.langchain.dev) |
| **Text-to-Cypher**          | Graph databases            | [**Blog**](https://blog.langchain.dev/using-a-knowledge-graph-to-implement-a-devops-rag-application/)**,** [**Blog**](https://blog.langchain.dev/implementing-advanced-retrieval-rag-strategies-with-neo4j/)**,** [**Docs**](https://python.langchain.com/docs/tutorials/graph/) |



Related reading:

https://blog.langchain.dev/query-construction/

https://blog.langchain.dev/enhancing-rag-based-applications-accuracy-by-constructing-and-leveraging-knowledge-graphs/

## Query construction for metadata filters

Flow:

![metadata](https://github.com/tivon-x/bRAG-langchain/blob/main/notebooks/image/metadata.png?raw=1)

Many vector stores include metadata fields.

This makes it possible to filter for specific chunks based on their metadata.
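
A minimal sketch of what such filtering means, independent of any particular vector store: each chunk carries a metadata dictionary, and a filter keeps only the chunks whose metadata matches. The sample chunks and the `filter_chunks` helper below are invented for illustration.

```python
from typing import Any, Dict, List

# Invented sample chunks; the texts and metadata values are placeholders.
chunks: List[Dict[str, Any]] = [
    {"text": "chunk about routing", "metadata": {"uploader": "channel_a", "upload_date": "20240101"}},
    {"text": "chunk about agents", "metadata": {"uploader": "channel_b", "upload_date": "20240215"}},
    {"text": "chunk about retrieval", "metadata": {"uploader": "channel_a", "upload_date": "20240301"}},
]


def filter_chunks(items: List[Dict[str, Any]], **expected: Any) -> List[Dict[str, Any]]:
    """Keep the chunks whose metadata equals every expected key/value pair."""
    return [
        item
        for item in items
        if all(item["metadata"].get(key) == value for key, value in expected.items())
    ]


selected = filter_chunks(chunks, uploader="channel_a")
print([chunk["text"] for chunk in selected])
```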

Docs on metadata-based query analysis:

https://python.langchain.com/v0.1/docs/use_cases/query_analysis/


Let's look at some examples of the metadata we might see in a database of YouTube transcripts.

In [14]:
from langchain_community.document_loaders import YoutubeLoader
import yt_dlp

def fetch_video_info(url: str):
    """Fetch metadata of a YouTube video using yt-dlp."""
    ydl_opts = {
        "quiet": True,
        "format": "best",
    }
    with yt_dlp.YoutubeDL(ydl_opts) as ydl:
        info = ydl.extract_info(url, download=False)
    return info

# Fetch the metadata with yt-dlp
video_url = "https://www.youtube.com/watch?v=pbAd8O1Lvm4"
video_info = fetch_video_info(video_url)

print(video_info)

# Build the transcript loader; the video metadata is added manually below
loader = YoutubeLoader.from_youtube_url(
    video_url, add_video_info=False
)

# Add the metadata manually
docs = loader.load()
for doc in docs:
    doc.metadata.update({
        "title": video_info.get("title", "Unknown"),
        "description": video_info.get("description", "No description available"),
        "uploader": video_info.get("uploader", "Unknown uploader"),
        "upload_date": video_info.get("upload_date", "Unknown date"),
    })

# Print the metadata
print(docs[0].metadata)




{'id': 'pbAd8O1Lvm4', 'title': 'Self-reflective RAG with LangGraph: Self-RAG and CRAG', 'formats': [{'format_id': 'sb3', 'format_note': 'storyboard', 'ext': 'mhtml', 'protocol': 'mhtml', 'acodec': 'none', 'vcodec': 'none', 'url': 'https://i.ytimg.com/sb/pbAd8O1Lvm4/storyboard3_L0/default.jpg?sqp=-oaymwENSDfyq4qpAwVwAcABBqLzl_8DBgjymIyuBg==&sigh=rs$AOn4CLCZSRMgkSsr7R3mQgKlyKscgQgrFg', 'width': 48, 'height': 27, 'fps': 0.0945179584120983, 'rows': 10, 'columns': 10, 'fragments': [{'url': 'https://i.ytimg.com/sb/pbAd8O1Lvm4/storyboard3_L0/default.jpg?sqp=-oaymwENSDfyq4qpAwVwAcABBqLzl_8DBgjymIyuBg==&sigh=rs$AOn4CLCZSRMgkSsr7R3mQgKlyKscgQgrFg', 'duration': 1058.0}], 'audio_ext': 'none', 'video_ext': 'none', 'vbr': 0, 'abr': 0, 'tbr': None, 'resolution': '48x27', 'aspect_ratio': 1.78, 'filesize_approx': None, 'http_headers': {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.70 Safari/537.36', 'Accept': 'text/html,application/xhtm

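### 1. Manual construction
Manually build a structured query that includes operations such as filtering.

Suppose we have built an index that:

1. Allows unstructured search over the `contents` and `title` of each document

2. Supports range filtering on `view count`, `publication date`, and `length`.

We want to convert natural language into structured search queries.

We can define a schema for structured search queries.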

In [15]:
import datetime
from typing import Literal, Optional, Tuple
from pydantic import BaseModel, Field

class TutorialSearch(BaseModel):
    """Search over a database of tutorial videos about a software library."""

    content_search: str = Field(
        ...,
        description="Similarity search query applied to video transcripts.",
    )
    title_search: str = Field(
        ...,
        description=(
            "Alternate version of the content search query to apply to video titles. "
            "Should be succinct and only include key words that could be in a video "
            "title."
        ),
    )
    min_view_count: Optional[int] = Field(
        None,
        description="Minimum view count filter, inclusive. Only use if explicitly specified.",
    )
    max_view_count: Optional[int] = Field(
        None,
        description="Maximum view count filter, exclusive. Only use if explicitly specified.",
    )
    earliest_publish_date: Optional[datetime.date] = Field(
        None,
        description="Earliest publish date filter, inclusive. Only use if explicitly specified.",
    )
    latest_publish_date: Optional[datetime.date] = Field(
        None,
        description="Latest publish date filter, exclusive. Only use if explicitly specified.",
    )
    min_length_sec: Optional[int] = Field(
        None,
        description="Minimum video length in seconds, inclusive. Only use if explicitly specified.",
    )
    max_length_sec: Optional[int] = Field(
        None,
        description="Maximum video length in seconds, exclusive. Only use if explicitly specified.",
    )

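    def pretty_print(self) -> None:
        # Print only the fields that are set and differ from their default.
        # Assumes Pydantic v2, where `model_fields` replaces the deprecated `__fields__`.
        for name, field_info in type(self).model_fields.items():
            value = getattr(self, name)
            if value is not None and value != field_info.default:
                print(f"{name}: {value}")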

Now we prompt the LLM to generate queries.

In [16]:
from langchain_core.prompts import ChatPromptTemplate

system = """You are an expert at converting user questions into database queries. \
You have access to a database of tutorial videos about a software library for building LLM-powered applications. \
Given a question, return a database query optimized to retrieve the most relevant results.

If there are acronyms or words you are not familiar with, do not try to rephrase them."""
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system),
        ("human", "{question}"),
    ]
)

structured_llm = llm.with_structured_output(TutorialSearch)
query_analyzer = prompt | structured_llm

In [17]:
query_analyzer.invoke({"question": "bRAGAI - Generative AI Platform coming soon"}).pretty_print()

content_search: bRAGAI - Generative AI Platform coming soon
title_search: bRAGAI Generative AI Platform


In [18]:
query_analyzer.invoke(
    {"question": "videos on chat langchain published in 2023"}
).pretty_print()

content_search: chat langchain
title_search: chat langchain
earliest_publish_date: 2023-01-01
latest_publish_date: 2024-01-01


In [19]:
query_analyzer.invoke(
    {"question": "videos that are focused on the topic of chat langchain that are published before 2024"}
).pretty_print()

content_search: chat langchain
title_search: chat langchain
latest_publish_date: 2024-01-01


In [20]:
query_analyzer.invoke(
    {
        "question": "how to use multi-modal models in an agent, only videos under 5 minutes"
    }
).pretty_print()

content_search: how to use multi-modal models in an agent
title_search: multi-modal models agent
max_length_sec: 300
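
The structured query only becomes useful once its bounds are applied to the data. The sketch below shows one way the length bounds could be enforced, following the inclusive-minimum and exclusive-maximum convention stated in the field descriptions. `LengthFilter`, `matches`, and the sample video lengths are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class LengthFilter:
    min_length_sec: Optional[int] = None  # inclusive lower bound
    max_length_sec: Optional[int] = None  # exclusive upper bound


def matches(length_sec: int, length_filter: LengthFilter) -> bool:
    """Return True if the video length satisfies every bound that is set."""
    if length_filter.min_length_sec is not None and length_sec < length_filter.min_length_sec:
        return False
    if length_filter.max_length_sec is not None and length_sec >= length_filter.max_length_sec:
        return False
    return True


# "only videos under 5 minutes" -> max_length_sec=300
under_five_minutes = LengthFilter(max_length_sec=300)

# Invented video lengths in seconds.
videos: Dict[str, int] = {"short clip": 120, "exactly five minutes": 300, "deep dive": 1500}
print([title for title, seconds in videos.items() if matches(seconds, under_five_minutes)])
```

Because the maximum is exclusive, a video of exactly 300 seconds is not returned for "under 5 minutes".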


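### 2. Self-querying retrieval
As the name suggests, the self-querying [retriever](https://python.langchain.ac.cn/docs/concepts/retrievers/) `SelfQueryRetriever` is able to query itself. Specifically, given any natural language query, the retriever uses a query-constructing LLM chain to write a structured query and then applies that structured query to its underlying [vector store](https://python.langchain.ac.cn/docs/concepts/vectorstores/). This lets the retriever not only compare the user's query with the contents of stored documents by semantic similarity, but also extract filters on the stored documents' metadata from the user query and execute them.

Docs: [self-querying retrieval](https://python.langchain.com/docs/how_to/self_query).

For demonstration purposes we use a Chroma vector store, with a small demo set of documents containing movie summaries.

Note: the self-query retriever requires the `lark` package to be installed.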

In [21]:
%pip install --upgrade --quiet  lark langchain-chroma

In [22]:
from langchain_chroma import Chroma
from langchain_core.documents import Document

docs = [
    Document(
        page_content="A bunch of scientists bring back dinosaurs and mayhem breaks loose",
        metadata={"year": 1993, "rating": 7.7, "genre": "science fiction"},
    ),
    Document(
        page_content="Leo DiCaprio gets lost in a dream within a dream within a dream within a ...",
        metadata={"year": 2010, "director": "Christopher Nolan", "rating": 8.2},
    ),
    Document(
        page_content="A psychologist / detective gets lost in a series of dreams within dreams within dreams and Inception reused the idea",
        metadata={"year": 2006, "director": "Satoshi Kon", "rating": 8.6},
    ),
    Document(
        page_content="A bunch of normal-sized women are supremely wholesome and some men pine after them",
        metadata={"year": 2019, "director": "Greta Gerwig", "rating": 8.3},
    ),
    Document(
        page_content="Toys come alive and have a blast doing so",
        metadata={"year": 1995, "genre": "animated"},
    ),
    Document(
        page_content="Three men walk into the Zone, three men walk out of the Zone",
        metadata={
            "year": 1979,
            "director": "Andrei Tarkovsky",
            "genre": "thriller",
            "rating": 9.9,
        },
    ),
]
vectorstore = Chroma.from_documents(docs, DashScopeEmbeddings(model="text-embedding-v4"))

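Now we can instantiate our `SelfQueryRetriever`. To do so, we need to provide some information up front about the metadata fields our documents support, along with a short description of the document contents.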

In [23]:
from langchain.chains.query_constructor.schema import AttributeInfo
from langchain.retrievers.self_query.base import SelfQueryRetriever

metadata_field_info = [
    AttributeInfo(
        name="genre",
        description="The genre of the movie. One of ['science fiction', 'comedy', 'drama', 'thriller', 'romance', 'action', 'animated']",
        type="string",
    ),
    AttributeInfo(
        name="year",
        description="The year the movie was released",
        type="integer",
    ),
    AttributeInfo(
        name="director",
        description="The name of the movie director",
        type="string",
    ),
    AttributeInfo(
        name="rating", description="A 1-10 rating for the movie", type="float"
    ),
]
document_content_description = "Brief summary of a movie"
retriever = SelfQueryRetriever.from_llm(
    llm,
    vectorstore,
    document_content_description,
    metadata_field_info,
)

Let's try it out!

In [24]:
# This example only specifies a filter
retriever.invoke("I want to watch a movie rated higher than 8.5")

[Document(id='c0f1a94d-4743-4ce9-9d62-1e2df83943dc', metadata={'year': 1979, 'rating': 9.9, 'genre': 'thriller', 'director': 'Andrei Tarkovsky'}, page_content='Three men walk into the Zone, three men walk out of the Zone'),
 Document(id='dec09de0-02b2-4cbe-a01e-8ee3e01f99ef', metadata={'rating': 8.6, 'year': 2006, 'director': 'Satoshi Kon'}, page_content='A psychologist / detective gets lost in a series of dreams within dreams within dreams and Inception reused the idea')]

In [25]:
# This example specifies a query and a filter
retriever.invoke("Has Greta Gerwig directed any movies about women")

[Document(id='04948275-17b1-43df-b53b-a4b71e93a2ba', metadata={'director': 'Greta Gerwig', 'rating': 8.3, 'year': 2019}, page_content='A bunch of normal-sized women are supremely wholesome and some men pine after them')]

In [27]:
# This example specifies a composite filter
retriever.invoke("What's a highly rated (above 8.5) science fiction film?")

[Document(id='dec09de0-02b2-4cbe-a01e-8ee3e01f99ef', metadata={'director': 'Satoshi Kon', 'rating': 8.6, 'year': 2006}, page_content='A psychologist / detective gets lost in a series of dreams within dreams within dreams and Inception reused the idea'),
 Document(id='c0f1a94d-4743-4ce9-9d62-1e2df83943dc', metadata={'genre': 'thriller', 'year': 1979, 'director': 'Andrei Tarkovsky', 'rating': 9.9}, page_content='Three men walk into the Zone, three men walk out of the Zone')]

In [28]:
# This example specifies a query and composite filter
retriever.invoke(
    "What's a movie after 1990 but before 2005 that's all about toys, and preferably is animated"
)

[Document(id='540e8618-566f-4322-8819-234ef2bc47bc', metadata={'year': 1995, 'genre': 'animated'}, page_content='Toys come alive and have a blast doing so')]

##### Filter k
We can also use the self-query retriever to specify `k`, the number of documents to fetch.

We do this by passing `enable_limit=True` to the constructor.

In [29]:
retriever = SelfQueryRetriever.from_llm(
    llm,
    vectorstore,
    document_content_description,
    metadata_field_info,
    enable_limit=True,
)

# This example only specifies a relevant query
retriever.invoke("What are two movies about dinosaurs")

[Document(id='fd4c6821-ff6d-4a2a-879d-05d2302280da', metadata={'genre': 'science fiction', 'rating': 7.7, 'year': 1993}, page_content='A bunch of scientists bring back dinosaurs and mayhem breaks loose'),
 Document(id='540e8618-566f-4322-8819-234ef2bc47bc', metadata={'year': 1995, 'genre': 'animated'}, page_content='Toys come alive and have a blast doing so')]

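Using `SelfQueryRetriever` requires making sure your query constructor works well. This usually means tuning the prompt, the examples in the prompt, the attribute descriptions, and so on. For an example that walks through refining a query constructor on hotel inventory data, [see this cookbook](https://github.com/langchain-ai/langchain/blob/master/cookbook/self_query_hotel_search.ipynb).

The next key element is the structured query translator. This is the object responsible for translating the structured query into a metadata filter in the syntax of the vector store you are using. LangChain ships with a number of built-in translators. To see them all, head to the [integrations section](https://python.langchain.ac.cn/docs/integrations/retrievers/self_query/).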
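
To give a feel for what a translator does, here is a minimal sketch that walks a small filter tree and emits a nested dictionary. `Comparison`, `Operation`, and `to_where` are simplified, hypothetical stand-ins written for this example, not LangChain's own classes. The output uses `$and`, `$gt`, and `$eq` operator keys in the style of Chroma's `where` filters, which is an assumption about the target store.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Union


@dataclass
class Comparison:
    comparator: str  # e.g. "eq", "gt", "gte", "lt", "lte"
    attribute: str   # metadata field name
    value: Any


@dataclass
class Operation:
    operator: str  # "and" or "or"
    arguments: List[Union["Comparison", "Operation"]]


def to_where(node: Union[Comparison, Operation]) -> Dict[str, Any]:
    """Recursively translate a filter tree into a nested filter dictionary."""
    if isinstance(node, Comparison):
        return {node.attribute: {f"${node.comparator}": node.value}}
    return {f"${node.operator}": [to_where(argument) for argument in node.arguments]}


# "a highly rated (above 8.5) science fiction film"
structured_filter = Operation(
    "and",
    [
        Comparison("gt", "rating", 8.5),
        Comparison("eq", "genre", "science fiction"),
    ],
)
print(to_where(structured_filter))
```

Keeping the filter tree separate from the store-specific syntax is what lets the same structured query target different vector stores by swapping the translator.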

# Conclusion

This notebook explored two key components of building advanced RAG systems:

1. **Routing Strategies**:
   - **Logical Routing**: Using function calling to classify and route queries to appropriate data sources
   - **Semantic Routing**: Leveraging embeddings and cosine similarity to match queries with relevant prompt templates

2. **Query Construction**:
   - **Metadata Filtering**: Building structured queries that combine semantic search with metadata filters
   - **Schema Definition**: Using Pydantic models to define structured search parameters
   - **Query Analysis**: Converting natural language questions into structured database queries

These techniques enable:
- More precise and relevant document retrieval
- Better handling of diverse data sources
- Structured filtering based on metadata
- Natural language interface for complex queries

The combination of intelligent routing and structured query construction forms the foundation for building more sophisticated and accurate RAG systems.