### What is a Query Builder (Query Constructor) in RAG?
- A Query Builder is a preprocessing step in the RAG (Retrieval-Augmented Generation) pipeline that transforms the user question into a more effective and retrieval-friendly query. Instead of sending the raw question to the retriever, it enriches or modifies the question (sometimes by adding task context, metadata, or rephrasing) to retrieve better documents from the vector store.
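
As a rough, library-free illustration of the idea, a query builder can be as small as a function that merges the raw question with extra context. The name `build_query` and its `task_context` and `metadata` parameters are hypothetical and exist only for this sketch.

```python
from typing import Optional


def build_query(question: str, task_context: str = "", metadata: Optional[dict] = None) -> str:
    """Return an enriched, retrieval-friendly query (illustrative only)."""
    # Start from the cleaned question, without the trailing question mark.
    parts = [question.strip().rstrip("?").strip()]
    if task_context:
        parts.append(task_context.strip())
    # Append each metadata pair as a plain "key: value" hint.
    for key, value in (metadata or {}).items():
        parts.append(f"{key}: {value}")
    return " | ".join(parts)


raw_question = "Explain solid?"
enriched = build_query(
    raw_question,
    task_context="object-oriented design principles in Python",
    metadata={"source": "solid-python.pdf"},
)
print(enriched)
```

The enriched string, not the raw question, is what would be sent to the retriever.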

### Why Do We Use a Query Builder?
In RAG, the quality of document retrieval is crucial. A vague or overly specific question can lead to poor retrieval. A Query Builder helps by:

- Improving retrieval accuracy.

- Making vague queries more specific.

- Adding context from the conversation or prompt.

- Aligning with domain-specific language (e.g., legal, medical).
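
Two of the points above, adding conversation context and aligning with domain language, can be sketched without any LLM. The glossary entries and both function names below are hypothetical; a real system would likely draw the glossary from the target corpus.

```python
import re

# Hypothetical glossary mapping informal terms to the wording a corpus might use.
GLOSSARY = {
    "heart attack": "myocardial infarction",
    "srp": "Single Responsibility Principle",
}


def align_with_domain(question: str, glossary: dict) -> str:
    """Append the domain term after each informal term found in the question."""
    result = question
    for informal, formal in glossary.items():
        pattern = re.compile(rf"\b{re.escape(informal)}\b", flags=re.IGNORECASE)
        result = pattern.sub(lambda m: f"{m.group(0)} ({formal})", result)
    return result


def add_conversation_context(question: str, history: list) -> str:
    """Prefix the last user turn so a follow-up question stands on its own."""
    if not history:
        return question
    return f"Previous question: {history[-1]}\nFollow-up: {question}"


print(align_with_domain("What does SRP mean?", GLOSSARY))
print(add_conversation_context("Give an example in Python.", ["What is the SRP?"]))
```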

### How to Use Query Builder in LangChain
- The cells below implement a RAG pipeline with a Query Builder in Colab. An LLM-based query constructor rephrases the question before retrieval.

Chainable builders are built with LangChain's Runnable interfaces, so a dynamic builder can be combined with other Runnables in a single chain.

In [1]:
# ===================== INSTALL DEPENDENCIES =====================
!pip install -q langchain sentence-transformers faiss-cpu pypdf groq langchain-community langchain-groq

In [2]:
# ================== IMPORTS ==================
from langchain_community.document_loaders import PyPDFLoader
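# Import from the split-out packages; the old `langchain.*` paths for these classes are deprecated.
from langchain_text_splitters import CharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS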
from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnableMap, RunnableLambda, RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser
from langchain_groq import ChatGroq

In [5]:
# ================== LOAD & SPLIT PDF ==================
loader = PyPDFLoader("/content/solid-python.pdf")
# Load one Document per page; chunking is done once, by the splitter below.
documents = loader.load()

splitter = CharacterTextSplitter(chunk_size=500, chunk_overlap=50)
docs = splitter.split_documents(documents)
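
To show what `chunk_size=500` and `chunk_overlap=50` mean, here is a simplified sliding-window sketch. It is not how `CharacterTextSplitter` works internally (that class splits on a separator and then merges pieces), and `chunk_text` is a hypothetical helper used only to show the overlap arithmetic.

```python
def chunk_text(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> list:
    """Split text into fixed-size windows; each window repeats the last
    `chunk_overlap` characters of the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reached the end of the text
    return chunks


sample = "".join(str(i % 10) for i in range(1200))  # 1200 characters of dummy text
chunks = chunk_text(sample)
print([len(c) for c in chunks])
```

With 1200 characters, the windows start at offsets 0, 450 and 900, so the chunk lengths are 500, 500 and 300, and each chunk begins with the last 50 characters of the previous one.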

In [6]:
# ================== EMBEDDINGS + VECTORSTORE ==================
embedding_model = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vectorstore = FAISS.from_documents(docs, embedding_model)
retriever = vectorstore.as_retriever(search_type="mmr", search_kwargs={"k": 5})
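
The retriever above uses `search_type="mmr"` (maximal marginal relevance), which balances relevance to the query against similarity to documents already selected. The pure-Python sketch below illustrates that selection rule on toy 2-D vectors. It is a simplified illustration, not the FAISS or LangChain implementation, and `mmr_select` and the toy vectors are assumptions for this example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(query_vec, doc_vecs, k=2, lambda_mult=0.5):
    """Return indices of k documents chosen by maximal marginal relevance."""
    selected = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(query_vec, doc_vecs[i])
            # Penalize similarity to the closest already-selected document.
            redundancy = max((cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0)
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


query_vec = [1.0, 1.0]
toy_docs = [[1.0, 0.9], [1.0, 0.8], [0.2, 1.0]]  # docs 0 and 1 are near-duplicates

print(mmr_select(query_vec, toy_docs, k=2, lambda_mult=0.5))
print(mmr_select(query_vec, toy_docs, k=2, lambda_mult=1.0))
```

With `lambda_mult=0.5` the near-duplicate is skipped in favor of the more diverse document (indices `[0, 2]`), while `lambda_mult=1.0` reduces to plain similarity ranking (`[0, 1]`).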

In [14]:
# ================== DEFINE LLM ==================
from google.colab import userdata
llm = ChatGroq(
    model_name="llama-3.3-70b-versatile",
    api_key=userdata.get("GROQ_API_KEY")
)
llm

ChatGroq(client=<groq.resources.chat.completions.Completions object at 0x7d393853c210>, async_client=<groq.resources.chat.completions.AsyncCompletions object at 0x7d393853ccd0>, model_name='llama-3.3-70b-versatile', model_kwargs={}, groq_api_key=SecretStr('**********'))

In [15]:
# ================== QUERY BUILDER ==================

# Prompt to improve the question for better retrieval
query_builder_prompt = PromptTemplate.from_template(
    "You are a helpful assistant. Rewrite the following user question to be more clear and retrieval-friendly:\n\nOriginal Question: {question}\n\nImproved Question:"
)

# Use the same LLM to rewrite the query
query_builder_chain = query_builder_prompt | llm | StrOutputParser()
query_builder_chain

PromptTemplate(input_variables=['question'], input_types={}, partial_variables={}, template='You are a helpful assistant. Rewrite the following user question to be more clear and retrieval-friendly:\n\nOriginal Question: {question}\n\nImproved Question:')
| ChatGroq(client=<groq.resources.chat.completions.Completions object at 0x7d393853c210>, async_client=<groq.resources.chat.completions.AsyncCompletions object at 0x7d393853ccd0>, model_name='llama-3.3-70b-versatile', model_kwargs={}, groq_api_key=SecretStr('**********'))
| StrOutputParser()
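
An LLM rewrite is free text, so it may come back with a label, quotes, or extra explanation around the query. One possible safeguard, sketched here under the assumption that the first non-empty line holds the rewritten question, is a small clean-up step with a fallback to the original question. `clean_rewrite` is a hypothetical helper, not part of LangChain.

```python
def clean_rewrite(raw_output: str, original_question: str) -> str:
    """Pick a usable query from the LLM output, falling back to the original."""
    label = "improved question:"
    for line in raw_output.splitlines():
        candidate = line.strip().strip('"').strip()
        # Drop a leading label such as "Improved Question:" if the model echoed it.
        if candidate.lower().startswith(label):
            candidate = candidate[len(label):].strip().strip('"')
        if candidate:
            return candidate
    # Nothing usable came back, so retrieve with the user's own wording.
    return original_question


noisy = 'Improved Question: "What are the SOLID principles?"\n\nThis version is clearer.'
print(clean_rewrite(noisy, "Explain solid?"))
print(clean_rewrite("   \n", "Explain solid?"))
```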

In [16]:
# ================== RETRIEVAL USING QUERY BUILDER ==================

# Step 1: Rewrite user question
query_rewrite_runnable = RunnableLambda(lambda x: {"question": x["question"]}) | query_builder_chain

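# Step 2: Get relevant docs using rewritten query
# Note: get_relevant_documents() is deprecated in recent LangChain releases; retriever.invoke(q) is the replacement.
retriever_runnable = query_rewrite_runnable | RunnableLambda(lambda q: retriever.get_relevant_documents(q))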
retriever_runnable

RunnableLambda(lambda x: {'question': x['question']})
| PromptTemplate(input_variables=['question'], input_types={}, partial_variables={}, template='You are a helpful assistant. Rewrite the following user question to be more clear and retrieval-friendly:\n\nOriginal Question: {question}\n\nImproved Question:')
| ChatGroq(client=<groq.resources.chat.completions.Completions object at 0x7d393853c210>, async_client=<groq.resources.chat.completions.AsyncCompletions object at 0x7d393853ccd0>, model_name='llama-3.3-70b-versatile', model_kwargs={}, groq_api_key=SecretStr('**********'))
| StrOutputParser()
| RunnableLambda(lambda q: retriever.get_relevant_documents(q))

In [17]:
# ================== FINAL PROMPT TEMPLATE ==================
prompt = PromptTemplate.from_template(
    "Use the following context to answer the question:\n\n{context}\n\nQuestion: {question}"
)
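
The `{context}` slot is filled with whatever the retrieval step returns. If that is a list of document objects, the prompt receives the list's string form, metadata included. A small formatting helper can flatten the chunks into readable text first. In this sketch, `Doc` is a hypothetical stand-in that mirrors the `page_content` and `metadata` attributes of LangChain documents, and `format_docs` and the sample texts are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a retrieved document with text and metadata."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_docs(docs: list, max_chars: int = 2000) -> str:
    """Join chunk texts into one numbered context string, capped at max_chars."""
    blocks = []
    for i, doc in enumerate(docs, start=1):
        page = doc.metadata.get("page", "?")
        blocks.append(f"[{i}] (page {page}) {doc.page_content.strip()}")
    return "\n\n".join(blocks)[:max_chars]


retrieved = [
    Doc("A class should have one reason to change.", {"page": 3}),
    Doc("Entities should be open for extension.", {"page": 7}),
]
context = format_docs(retrieved)
print(context)
```

A helper like this could be appended to the retrieval branch with `RunnableLambda(format_docs)` so that the prompt sees plain text rather than object representations.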

In [18]:
# ================== FINAL RAG CHAIN ==================
from operator import itemgetter

rag_chain = (
    RunnableMap({
        # Retrieval branch: rewrite the question, then fetch matching chunks.
        "context": retriever_runnable,
        # Pass only the question string to the prompt, not the whole input dict.
        "question": RunnableLambda(itemgetter("question"))
    })
    | prompt
    | llm
    | StrOutputParser()
)
rag_chain

{
  context: RunnableLambda(lambda x: {'question': x['question']})
           | PromptTemplate(input_variables=['question'], input_types={}, partial_variables={}, template='You are a helpful assistant. Rewrite the following user question to be more clear and retrieval-friendly:\n\nOriginal Question: {question}\n\nImproved Question:')
           | ChatGroq(client=<groq.resources.chat.completions.Completions object at 0x7d393853c210>, async_client=<groq.resources.chat.completions.AsyncCompletions object at 0x7d393853ccd0>, model_name='llama-3.3-70b-versatile', model_kwargs={}, groq_api_key=SecretStr('**********'))
           | StrOutputParser()
           | RunnableLambda(lambda q: retriever.get_relevant_documents(q)),
  question: RunnableLambda(itemgetter('question'))
}
| PromptTemplate(input_variables=['context', 'question'], input_types={}, partial_variables={}, template='Use the following context to answer the question:\n\n{context}\n\nQuestion: {question}')
| ChatGroq(client=<groq.resources.chat.com

In [19]:
# ================== RUN THE CHAIN ==================
question = "Explain solid?"
response = rag_chain.invoke({"question": question})

print("Final Response:\n")
print(response)

Final Response:

SOLID is an acronym that stands for five design principles of object-oriented programming (OOP) that aim to promote cleaner, more robust, and updatable code for software development in object-oriented languages. Each letter in SOLID represents a principle for development:

- **S** - Single Responsibility Principle (SRP): This principle asserts that a class should have only one reason to change, meaning that a class should have only one job or responsibility. This makes the class more focused and easier to maintain.

- **O** - Open/Closed Principle (OCP): This principle states that software entities (classes, modules, functions, etc.) should be open for extension but closed for modification. In other words, you should be able to add new functionality without changing the existing code.

- **L** - Liskov Substitution Principle (LSP): This principle says that subtypes should be substitutable for their base types. This means that any code that uses a base type should be ab