In [22]:
!pip install -qU langchain langchain-community langchain-openai youtube-transcript-api pytube langchain-chroma langchain-text-splitters

# Build a Query Analysis System

In this session, we will build an end-to-end query analysis system.

## Setup

In [None]:
import os

langchain_api_key = 'your_langchain_api_key_here'  # Replace with your actual LangChain API key
os.environ['LANGCHAIN_TRACING_V2'] = 'true'
os.environ['LANGCHAIN_API_KEY'] = langchain_api_key

openai_api_key = 'your_openai_api_key_here'  # Replace with your actual OpenAI API key
os.environ['OPENAI_API_KEY'] = openai_api_key

### Load documents

We can use the `YoutubeLoader` to load transcripts of a few LangChain videos:

In [15]:
from langchain_community.document_loaders import YoutubeLoader

urls = [
    "https://www.youtube.com/watch?v=HAn9vnJy6S4",
    "https://www.youtube.com/watch?v=dA1cHGACXCo",
    "https://www.youtube.com/watch?v=ZcEMLz27sL4",
    "https://www.youtube.com/watch?v=hvAPnpSfSGo",
    "https://www.youtube.com/watch?v=EhlPDL4QrWY",
    "https://www.youtube.com/watch?v=mmBo8nlu2j0",
    "https://www.youtube.com/watch?v=rQdibOsL1ps",
    "https://www.youtube.com/watch?v=28lC4fqukoc",
    "https://www.youtube.com/watch?v=es-9MgxB-uc",
    "https://www.youtube.com/watch?v=wLRHwKuKvOE",
    "https://www.youtube.com/watch?v=ObIltMaRJvY",
    "https://www.youtube.com/watch?v=DjuXACWYkkU",
    "https://www.youtube.com/watch?v=o7C9ld6Ln-M",
]

docs = []
for url in urls:
    docs.extend(
        YoutubeLoader.from_youtube_url(
            url,
            add_video_info=False,  # setting this to True raised an error when this notebook was run
        ).load()
    )

In [16]:
docs[0]

Document(metadata={'source': 'HAn9vnJy6S4'}, page_content="hello today I want to talk about open gpts open gpts is a project that we built here at linkchain uh that replicates the GPT store in a few ways so it creates uh end user-facing friendly interface to create different Bots and these Bots can have access to different tools and they can uh be given files to retrieve things over and basically it's a way to create a variety of bots and expose the configuration of these Bots to end users it's all open source um it can be used with open AI it can be used with other models as as we'll see um and it's an exciting way to create a a GPT store like experience if you're building a more focused platform an internal platform or any of that so we launched this a few months ago actually right when uh open AI released their GPT store and but we haven't really dove into what's going on or how to use it um and so there's several things that I want to cover in this video there's maybe two main area

Since `add_video_info=True` causes an error, the only metadata available is `source` (the video ID).

In [17]:
docs[0].metadata

{'source': 'HAn9vnJy6S4'}
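
Because `source` holds only the video ID, a link back to the original video has to be rebuilt by hand. The following is a minimal sketch, assuming the same `watch?v=` URL form used in the `urls` list above; `video_url` is a hypothetical helper and not part of LangChain:

```python
def video_url(metadata: dict) -> str:
    """Rebuild a watch URL from the loader's `source` metadata (the video ID).

    Assumes the standard https://www.youtube.com/watch?v=<id> form.
    """
    return f"https://www.youtube.com/watch?v={metadata['source']}"


# Example with the metadata shown above.
print(video_url({"source": "HAn9vnJy6S4"}))
```

This could be useful later for showing users where a retrieved chunk came from.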

A sample from a document's contents:

In [19]:
docs[0].page_content[:500]

"hello today I want to talk about open gpts open gpts is a project that we built here at linkchain uh that replicates the GPT store in a few ways so it creates uh end user-facing friendly interface to create different Bots and these Bots can have access to different tools and they can uh be given files to retrieve things over and basically it's a way to create a variety of bots and expose the configuration of these Bots to end users it's all open source um it can be used with open AI it can be us"

### Indexing documents

Whenever we perform retrieval we need to create an index of documents that we can query. We will use a vector store to index our documents, and we will chunk them first to make our retrievals more concise and precise:

In [25]:
from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=2000)
chunked_docs = text_splitter.split_documents(docs)

embeddings = OpenAIEmbeddings(model='text-embedding-3-small')

vectorstore = Chroma.from_documents(
    chunked_docs,
    embeddings,
)
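
To make the effect of `chunk_size` concrete, here is a deliberately simplified sketch of character-based chunking. It is not how `RecursiveCharacterTextSplitter` works internally: the real splitter tries to break on separators such as paragraphs, lines, and spaces, while this hypothetical `chunk_text` helper cuts at fixed offsets. It only illustrates the size cap and the optional overlap between neighbouring chunks:

```python
def chunk_text(text: str, chunk_size: int = 2000, overlap: int = 0) -> list[str]:
    """Split text into fixed-size character windows, optionally overlapping.

    Illustration only: a real splitter would prefer natural boundaries.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far the window advances each time
    return [text[i : i + chunk_size] for i in range(0, len(text), step)]


sample = "a" * 4500  # stand-in for a long transcript
chunks = chunk_text(sample, chunk_size=2000)
print([len(c) for c in chunks])  # [2000, 2000, 500]
```

Smaller chunks give more precise matches but less surrounding context per result, so the right size depends on the content.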

## Retrieval without query analysis

We can perform similarity search on a user question directly to find chunks relevant to the question:

In [26]:
query = "How do i build a RAG agent?"

search_results = vectorstore.similarity_search(query)

print(search_results[0].metadata)
print(search_results[0].page_content[:500])

{'source': 'HAn9vnJy6S4'}
hardcoded that it will always do a retrieval step here the assistant decides whether to do a retrieval step or not sometimes this is good sometimes this is bad sometimes it you don't need to do a retrieval step when I said hi it didn't need to call it tool um but other times you know the the llm might mess up and not realize that it needs to do a retrieval step and so the rag bot will always do a retrieval step so it's more focused there because this is also a simpler architecture so it's always
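
Conceptually, a similarity search embeds the question and returns the chunks whose embeddings are closest to it. The sketch below illustrates that idea with toy three-dimensional vectors and cosine similarity. The vectors, chunk names, and the `top_k` helper are all made up for illustration, and the metric and index structure actually used depend on how the vector store is configured:

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], indexed: list[tuple[str, list[float]]], k: int = 4) -> list[str]:
    """Return the ids of the k indexed vectors most similar to the query."""
    ranked = sorted(
        indexed,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [chunk_id for chunk_id, _ in ranked[:k]]


# Toy "index": (chunk id, embedding) pairs with hypothetical values.
toy_index = [
    ("chunk-a", [1.0, 0.0, 0.0]),
    ("chunk-b", [0.7, 0.7, 0.0]),
    ("chunk-c", [0.0, 0.0, 1.0]),
]
print(top_k([0.9, 0.1, 0.0], toy_index, k=2))  # ['chunk-a', 'chunk-b']
```

Because the ranking depends entirely on the embedded text, a conversational question can match less well than a short, focused query, which is the motivation for the query analysis step that follows.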


## Query analysis

### Query schema

In [27]:
from pydantic import BaseModel, Field

class Search(BaseModel):
    """Search over a database of tutorial videos about a software library."""

    query: str = Field(
        ...,
        description="Similarity search query applied to video transcripts.",
    )

### Query generation

To convert user questions to structured queries we will make use of OpenAI's tool-calling API. Specifically, we will use the chat model's `with_structured_output()` method, which handles passing the schema to the model and parsing the output.

In [28]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI


system = """You are an expert at converting user questions into database queries. \
You have access to a database of tutorial videos about a software library for building LLM-powered applications. \
Given a question, return a list of database queries optimized to retrieve the most relevant results.

If there are acronyms or words you are not familiar with, do not try to rephrase them."""

prompt = ChatPromptTemplate.from_messages(
    [
        ('system', system),
        ('human', '{question}'),
    ]
)

llm = ChatOpenAI(model='gpt-3.5-turbo', temperature=0)

structured_llm = llm.with_structured_output(Search)

query_analyzer = {'question': RunnablePassthrough()} | prompt | structured_llm

If we pass in our previous question, the analyzer returns a shorter, more focused search query:

In [29]:
query_analyzer.invoke("How do I build a RAG agent?")

Search(query='build RAG agent')

## Retrieval with query analysis

Because `query_analyzer` is built with `with_structured_output(Search)`, the model's response is parsed into a single `Search` object, so we always have exactly one optimized query to look up.

In [30]:
from typing import List
from langchain_core.documents import Document

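def retrieval(search: Search) -> List[Document]:
    """Run a similarity search with the analyzed query; return no documents if it is empty."""
    if search.query:
        return vectorstore.similarity_search(search.query)
    return []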


retrieval_chain = query_analyzer | retrieval

In [31]:
results = retrieval_chain.invoke("How do I build a RAG agent?")

[doc.metadata['source'] for doc in results]

['HAn9vnJy6S4', 'HAn9vnJy6S4', 'ZcEMLz27sL4', 'hvAPnpSfSGo']
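
The output above lists the same video twice because two of the retrieved chunks come from one transcript. If only the distinct videos matter, the sources can be de-duplicated while keeping the ranking order. This is a small sketch; `unique_sources` is a hypothetical helper:

```python
def unique_sources(sources: list[str]) -> list[str]:
    """Drop repeated video IDs while preserving first-seen (ranking) order."""
    # dict keys are unique and keep insertion order.
    return list(dict.fromkeys(sources))


retrieved = ['HAn9vnJy6S4', 'HAn9vnJy6S4', 'ZcEMLz27sL4', 'hvAPnpSfSGo']
print(unique_sources(retrieved))  # ['HAn9vnJy6S4', 'ZcEMLz27sL4', 'hvAPnpSfSGo']
```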