# RAG From Scratch: Query Transformations

Query transformations are a set of approaches that rewrite or modify the user's question to improve retrieval.

## Environment

`(1) Packages`

In [2]:
! pip install langchain_community tiktoken langchain-google-genai langchainhub chromadb langchain

`(2) LangSmith`

https://docs.smith.langchain.com/

In [3]:
import os
from google.colab import userdata
os.environ['LANGCHAIN_TRACING_V2'] = 'true'
os.environ['LANGCHAIN_ENDPOINT'] = 'https://api.smith.langchain.com'
LANGCHAIN_API_KEY = userdata.get('LANGCHAIN_API_KEY')
os.environ['LANGCHAIN_API_KEY'] = LANGCHAIN_API_KEY

`(3) API Keys`

In [4]:
os.environ['GOOGLE_API_KEY'] = userdata.get('GOOGLE_API_KEY')
os.environ['PINECONE_API_KEY'] = userdata.get('PINECONE_API_KEY')


## Part 5: Multi Query

Docs:

* https://python.langchain.com/docs/modules/data_connection/retrievers/MultiQueryRetriever

### Index

In [6]:
#### INDEXING ####

# Load blog
import re

import bs4
import requests
from bs4 import BeautifulSoup
from langchain_core.documents import Document

class CustomWebLoader:
    def __init__(self):
        self.session = requests.Session()

    def load_html(self, url):
        """Enhanced HTML loader with better error handling"""
        try:
            response = self.session.get(url, timeout=30)
            response.raise_for_status()

            # PDF responses are not handled by this loader, so skip them explicitly
            content_type = response.headers.get('Content-Type', '')
            if 'application/pdf' in content_type:
                raise ValueError("PDF content is not supported by this loader")

            soup = BeautifulSoup(response.content, 'html.parser')

            # Remove unwanted elements
            for element in soup(['script', 'style', 'nav', 'footer', 'iframe', 'noscript']):
                element.decompose()

            # Try to find the main content area, falling back to the whole page
            article = (soup.find('article') or
                      soup.find('main') or
                      soup.find(class_=re.compile('content|main|body|post|post-content|post-title|post-header')) or
                      soup.find('div', role='main') or
                      soup)

            # Extract all text with structure
            content = self._extract_structured_content(article)
            if not content:
                raise ValueError("No content extracted from HTML")

            return [Document(page_content=content, metadata={'source': url, 'type': 'html'})]

        except Exception as e:
            print(f"Error loading {url}: {str(e)}")
            return []

    def _extract_structured_content(self, element):
        """Extract content while preserving document structure"""
        content = []

        def process_element(elem):
            if isinstance(elem, bs4.NavigableString):
                text = elem.strip()
                if text and len(text) > 10:
                    content.append(text)
                return

            tag = elem.name
            if not tag:
                return

            text = elem.get_text(' ', strip=True)
            if not text or len(text) <= 10:
                return

            # Handle headings
            if tag.startswith('h') and tag[1:].isdigit():
                level = int(tag[1:])
                content.append(f"\n{'#'*level} {text}\n")
            # Handle list items
            elif tag == 'li':
                content.append(f"- {text}")
            # Handle table cells
            elif tag in ['td', 'th']:
                content.append(f"[TABLE CELL] {text}")
            # Handle regular paragraphs
            elif tag == 'p':
                content.append(text)
            # Recursively process containers
            else:
                for child in elem.children:
                    process_element(child)

        process_element(element)
        full_text = '\n'.join(content)
        full_text = re.sub(r'\n{3,}', '\n\n', full_text)
        full_text = re.sub(r'[ \t]{2,}', ' ', full_text)
        return full_text.strip()

    def load(self, web_paths):
        """Load documents from multiple URLs"""
        docs = []
        for url in web_paths:
            docs.extend(self.load_html(url))
        return docs

# Usage
loader = CustomWebLoader()
blog_docs = loader.load([
    "https://lilianweng.github.io/posts/2023-06-23-agent/",
    "https://www.analyticsvidhya.com/blog/2023/05/regression-vs-classification/",
    "https://www.simplilearn.com/regression-vs-classification-in-machine-learning-article",
    "https://www.springboard.com/blog/data-science/regression-vs-classification/"
])
# Split
from langchain.text_splitter import RecursiveCharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=300,
    chunk_overlap=50)

# Make splits
splits = text_splitter.split_documents(blog_docs)

# Index
from langchain_google_genai import GoogleGenerativeAIEmbeddings
from langchain_community.vectorstores import Chroma
vectorstore = Chroma.from_documents(documents=splits,
                                    embedding=GoogleGenerativeAIEmbeddings(model="models/embedding-001"))

retriever = vectorstore.as_retriever()



Error loading https://www.springboard.com/blog/data-science/regression-vs-classification/: 403 Client Error: Forbidden for url: https://www.springboard.com/blog/data-science/regression-vs-classification/
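
The splitter above is configured with `chunk_size=300` and `chunk_overlap=50` (measured in tokens). As a rough illustration of what those two parameters mean, the sketch below slides a fixed window over a toy token list. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter first breaks text on separators and then merges pieces); the helper name `split_with_overlap` and the toy sizes are assumptions made for the example.

```python
def split_with_overlap(tokens, chunk_size, chunk_overlap):
    """Slide a fixed-size window over a token list, repeating the last
    `chunk_overlap` tokens of each chunk at the start of the next one."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the window already reached the end of the input
    return chunks

# Toy input: ten integer "tokens", window of 4 with 1 token of overlap
toy_tokens = list(range(10))
chunks = split_with_overlap(toy_tokens, chunk_size=4, chunk_overlap=1)
print(chunks)
```

Each chunk shares its first token with the end of the previous one, which is the property the overlap is meant to provide: text near a chunk boundary stays retrievable with some surrounding context.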




In [7]:
print(vectorstore._collection.count())
vectorstore._collection.get(include=["documents"])


70


{'ids': ['8c44c5e7-4c58-47ff-8ddd-d42d2f0d6e85',
  '699bfe9b-a94d-4d9f-aebf-671cc8f55103',
  '3849c904-e0ad-41b1-a9ab-be8c38fc0070',
  'ca7a2869-499f-4aa8-b2bd-dfac02ce47cc',
  '297b84d6-bdbb-4748-a4a5-c3a818e69293',
  '449d453f-1131-4e6f-9666-2d7e77d5b1d8',
  '6c324ebb-c5db-4c98-a494-9dff15039650',
  '70889c7d-2c88-45ea-af9e-d2120b2eae8d',
  'a209ed80-3eaa-4249-89e3-e3ab0a6644a3',
  '8393a80c-0daf-48ee-8473-3cdb40674dcd',
  '330a5aeb-ea3f-4a0e-af4f-052133248a23',
  '356bad39-39e8-4968-bd7e-7cc87404df71',
  '3f1ee4ec-5441-4299-83b1-0053b12ae22b',
  'ea34f3d8-baf1-45df-801d-2f29b4022075',
  'c74fe19c-876d-40b7-93d0-43d2a9017831',
  'f098db4e-7d6a-47c7-9c78-c3dbbca58d79',
  '8659b0b8-3c9f-4ea0-ae32-02291b214dd3',
  '006f8483-305c-402d-b49a-a124cdfc13d9',
  '8a6d6c26-56a5-43c2-b99e-d406c90694a8',
  '3f6ffe21-52cf-4c04-bdb7-88427c2246ea',
  'b3b56b3f-3d5b-4159-9187-e03482941d6c',
  '90dea652-146d-40cc-a310-1cd4694040da',
  '2c63df0c-50a4-45f3-9034-5acd1923efa9',
  '78199e42-5b2b-4947-ac8f-

### Prompt

In [8]:
from langchain.prompts import ChatPromptTemplate

# Multi Query: Different Perspectives
template = """You are an AI language model assistant. Your task is to generate five
different versions of the given user question to retrieve relevant documents from a vector
database. By generating multiple perspectives on the user question, your goal is to help
the user overcome some of the limitations of the distance-based similarity search.
Provide these alternative questions separated by newlines. Original question: {question}"""
prompt_perspectives = ChatPromptTemplate.from_template(template)

from langchain_core.output_parsers import StrOutputParser
from langchain_google_genai import ChatGoogleGenerativeAI

generate_queries = (
    prompt_perspectives
    | ChatGoogleGenerativeAI(model="gemini-2.0-flash",temperature=0)
    | StrOutputParser()
    | (lambda x: x.split("\n"))
)

In [9]:
from langchain.load import dumps, loads

def get_unique_union(documents: list[list]):
    """ Unique union of retrieved docs """
    # Flatten list of lists, and convert each Document to string
    flattened_docs = [dumps(doc) for sublist in documents for doc in sublist]
    # Get unique documents
    unique_docs = list(set(flattened_docs))
    # Return
    return [loads(doc) for doc in unique_docs]

# Retrieve
question = "5 differences between regression and classification"
retrieval_chain = generate_queries | retriever.map() | get_unique_union
docs = retrieval_chain.invoke({"question":question})
len(docs)



20
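
`get_unique_union` deduplicates through a `set`, so the order of the returned documents is arbitrary. If the retrieval order should be kept, an order-preserving variant could look like the sketch below. It uses plain strings in place of serialized `Document` objects, and the helper name `unique_union_ordered` is hypothetical.

```python
def unique_union_ordered(ranked_lists):
    """Flatten several ranked lists and drop duplicates, keeping first-seen order."""
    flattened = [item for sublist in ranked_lists for item in sublist]
    # dict keys are unique and keep insertion order (Python 3.7+)
    return list(dict.fromkeys(flattened))

# Toy stand-ins for the documents retrieved by three generated queries
results = [
    ["doc_a", "doc_b", "doc_c"],
    ["doc_b", "doc_d"],
    ["doc_a", "doc_e"],
]
unique_docs = unique_union_ordered(results)
print(unique_docs)
```

With real documents, the same idea applies to the strings produced by `dumps(doc)` before they are passed back through `loads`.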

In [10]:
from operator import itemgetter
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_core.runnables import RunnablePassthrough

# RAG
template = """Answer the following question based on this context:

{context}

Question: {question}
"""

prompt = ChatPromptTemplate.from_template(template)

llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash",temperature=0)

final_rag_chain = (
    {"context": retrieval_chain,
     "question": itemgetter("question")}
    | prompt
    | llm
    | StrOutputParser()
)

final_rag_chain.invoke({"question":question})

'Here are 5 differences between Regression and Classification algorithms, based on the provided documents:\n\n1.  **Output Variable Type:** Regression algorithms require the output variable to be continuous or a real value, while classification algorithms require a discrete output value.\n2.  **Task:** Regression algorithms map an input value (x) to a continuous output variable (y). Classification algorithms map an input value (x) to a discrete output variable (y).\n3.  **Data Type:** Regression algorithms are used with continuous data, while classification algorithms are used with discrete data.\n4.  **Goal:** Regression attempts to find the best-fit line to predict the output accurately. Classification tries to find the decision boundary, which divides the dataset into different classes.\n5.  **Problem Solving:** Regression algorithms solve regression problems like house price prediction and weather prediction. Classification algorithms solve classification problems like identifying 

## Part 6: RAG-Fusion

Docs:

* https://github.com/langchain-ai/langchain/blob/master/cookbook/rag_fusion.ipynb?ref=blog.langchain.dev

Blog / repo:

* https://towardsdatascience.com/forget-rag-the-future-is-rag-fusion-1147298d8ad1

### Prompt

In [11]:
from langchain.prompts import ChatPromptTemplate

# RAG-Fusion: Related
template = """You are a helpful assistant that generates multiple search queries based on a single input query. \n
Generate multiple search queries related to: {question} \n
Output (4 queries):"""
prompt_rag_fusion = ChatPromptTemplate.from_template(template)

In [12]:
from langchain_core.output_parsers import StrOutputParser
from langchain_google_genai import ChatGoogleGenerativeAI

generate_queries = (
    prompt_rag_fusion
    | ChatGoogleGenerativeAI(model="gemini-2.0-flash",temperature=0)
    | StrOutputParser()
    | (lambda x: x.split("\n"))
)

In [13]:
from langchain.load import dumps, loads

def reciprocal_rank_fusion(results: list[list], k=60):
    """ Reciprocal_rank_fusion that takes multiple lists of ranked documents
        and an optional parameter k used in the RRF formula """

    # Initialize a dictionary to hold fused scores for each unique document
    fused_scores = {}

    # Iterate through each list of ranked documents
    for docs in results:
        # Iterate through each document in the list, with its rank (position in the list)
        for rank, doc in enumerate(docs):
            # Convert the document to a string format to use as a key (assumes documents can be serialized to JSON)
            doc_str = dumps(doc)
            # If the document is not yet in the fused_scores dictionary, add it with an initial score of 0
            if doc_str not in fused_scores:
                fused_scores[doc_str] = 0
            # Add this list's contribution using the RRF formula: 1 / (rank + k),
            # where rank is the zero-based position of the document in the list
            fused_scores[doc_str] += 1 / (rank + k)

    # Sort the documents based on their fused scores in descending order to get the final reranked results
    reranked_results = [
        (loads(doc), score)
        for doc, score in sorted(fused_scores.items(), key=lambda x: x[1], reverse=True)
    ]

    # Return the reranked results as a list of tuples, each containing the document and its fused score
    return reranked_results

retrieval_chain_rag_fusion = generate_queries | retriever.map() | reciprocal_rank_fusion
docs = retrieval_chain_rag_fusion.invoke({"question": question})
docs

[(Document(metadata={'type': 'html', 'source': 'https://www.simplilearn.com/regression-vs-classification-in-machine-learning-article'}, page_content='## Types of Regression\n\nHere are the types of Regression algorithms commonly found in the Machine Learning field:\n- Decision Tree Regression :\xa0The primary purpose of this regression is to divide the dataset into smaller subsets. These subsets are created to plot the value of any data point connecting to the problem statement.\n- Principal Components Regression:\xa0This regression technique is widely used. There are many independent variables, or multicollinearity exists in your data.\n- Polynomial Regression:\xa0This type fits a non-linear equation by using the polynomial functions of an independent variable.\n- Random Forest Regression: Random Forest regression is heavily used in Machine Learning. It uses multiple decision trees to predict the output. Random data points are chosen from the given dataset and used to build a decision
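
To make the scoring concrete, here is a small worked example of the same formula on plain strings. Each document receives `1 / (rank + k)` from every list it appears in, with `rank` being the zero-based position, exactly as in `reciprocal_rank_fusion` above. The function name `rrf_scores` and the toy rankings are assumptions for illustration.

```python
def rrf_scores(ranked_lists, k=60):
    """Toy reciprocal rank fusion over lists of hashable items (zero-based ranks)."""
    scores = {}
    for ranking in ranked_lists:
        for rank, item in enumerate(ranking):
            scores[item] = scores.get(item, 0.0) + 1.0 / (rank + k)
    # Highest fused score first
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)

# Three ranked result lists, one per generated query
rankings = [
    ["doc_a", "doc_b", "doc_c"],
    ["doc_b", "doc_a"],
    ["doc_c", "doc_b"],
]
fused = rrf_scores(rankings)
for item, score in fused:
    print(f"{item}: {score:.4f}")
```

`doc_b` appears in all three lists and therefore ends up first, even though it is ranked first in only one of them. With `k=60` the differences between adjacent ranks are small, so appearing in many lists matters more than a single top position.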

In [14]:
from langchain_core.runnables import RunnablePassthrough

# RAG
template = """Answer the following question based on this context:

{context}

Question: {question}
"""

prompt = ChatPromptTemplate.from_template(template)

final_rag_chain = (
    {"context": retrieval_chain_rag_fusion,
     "question": itemgetter("question")}
    | prompt
    | llm
    | StrOutputParser()
)

final_rag_chain.invoke({"question":question})

'Based on the provided documents, here are five differences between Regression and Classification algorithms:\n\n1.  **Output Variable Type:** Regression algorithms predict a continuous output variable (real value), while classification algorithms predict a discrete output variable.\n2.  **Task:** Regression algorithms map an input value (x) to a continuous output variable (y), while classification algorithms map an input value (x) to a discrete output variable (y).\n3.  **Data Type:** Regression algorithms are used with continuous data, while classification algorithms are used with discrete data.\n4.  **Goal:** Regression attempts to find the best-fit line to predict the output accurately, while classification tries to find the decision boundary that divides the dataset into different classes.\n5.  **Problem Solving:** Regression algorithms solve regression problems like house price prediction and weather prediction, while classification algorithms solve classification problems like i

Trace:

https://smith.langchain.com/public/071202c9-9f4d-41b1-bf9d-86b7c5a7525b/r

## Part 7: Decomposition

In [15]:
from langchain.prompts import ChatPromptTemplate

# Decomposition
template = """You are a helpful assistant that generates multiple sub-questions related to an input question. \n
The goal is to break down the input into a set of sub-problems / sub-questions that can be answered in isolation. \n
Generate multiple search queries related to: {question} \n
Output (3 queries):"""
prompt_decomposition = ChatPromptTemplate.from_template(template)

In [16]:
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_core.output_parsers import StrOutputParser

# LLM
llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash",temperature=0)

# Chain
generate_queries_decomposition = ( prompt_decomposition | llm | StrOutputParser() | (lambda x: x.split("\n")))

# Run
question = "What is Regression in machine learning"
questions = generate_queries_decomposition.invoke({"question":question})

In [17]:
questions

['1. **What is the mathematical definition of regression in machine learning?** (Focuses on the underlying mathematical principles)',
 '2. **What are the different types of regression algorithms in machine learning?** (Explores the various techniques used)',
 '3. **What are the common applications of regression in real-world scenarios?** (Focuses on practical uses)']
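
The generated sub-questions come back with list numbering, markdown bold markers, and a trailing parenthetical note, all of which are then sent to the retriever as part of the query. A minimal cleanup step, assuming the output keeps the shape shown above, might look like this. The helper name `clean_sub_question` is hypothetical, and the regular expressions only cover this particular format.

```python
import re

def clean_sub_question(line):
    """Strip list numbering, bold markers, and a trailing parenthetical note."""
    text = re.sub(r"^\s*\d+[.)]\s*", "", line)      # leading "1. " or "1) "
    text = text.replace("**", "")                   # markdown bold markers
    text = re.sub(r"\s*\([^()]*\)\s*$", "", text)   # trailing "(...)" commentary
    return text.strip()

raw = [
    "1. **What is the mathematical definition of regression in machine learning?** (Focuses on the underlying mathematical principles)",
    "",
    "2. **What are the different types of regression algorithms in machine learning?** (Explores the various techniques used)",
]
# Skip blank lines left over from splitting on newlines
cleaned = [clean_sub_question(line) for line in raw if line.strip()]
print(cleaned)
```

A function like this could be appended to the chain in place of the bare `split("\n")` lambda, so that only the question text reaches the retriever.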


Papers:

* https://arxiv.org/pdf/2205.10625.pdf
* https://arxiv.org/pdf/2212.10509.pdf

In [18]:
# Prompt
template = """Here is the question you need to answer:

\n --- \n {question} \n --- \n

Here is any available background question + answer pairs:

\n --- \n {q_a_pairs} \n --- \n

Here is additional context relevant to the question:

\n --- \n {context} \n --- \n

Use the above context and any background question + answer pairs to answer the question: \n {question}
"""

decomposition_prompt = ChatPromptTemplate.from_template(template)

In [19]:
from operator import itemgetter
from langchain_core.output_parsers import StrOutputParser

def format_qa_pair(question, answer):
    """Format Q and A pair"""

    formatted_string = ""
    formatted_string += f"Question: {question}\nAnswer: {answer}\n\n"
    return formatted_string.strip()

# llm
llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash", temperature=0)

q_a_pairs = ""
for q in questions:

    rag_chain = (
    {"context": itemgetter("question") | retriever,
     "question": itemgetter("question"),
     "q_a_pairs": itemgetter("q_a_pairs")}
    | decomposition_prompt
    | llm
    | StrOutputParser())

    answer = rag_chain.invoke({"question":q,"q_a_pairs":q_a_pairs})
    q_a_pair = format_qa_pair(q,answer)
    q_a_pairs = q_a_pairs + "\n---\n"+  q_a_pair
    print(q_a_pair)

Question: 1. **What is the mathematical definition of regression in machine learning?** (Focuses on the underlying mathematical principles)
Answer: In machine learning, regression is a supervised learning technique that aims to model the relationship between a dependent variable (or target variable), denoted as *y*, and one or more independent variables (or features), denoted as *x*. Mathematically, the goal of regression is to find a function *f(x)* that maps the input variables *x* to a continuous output variable *y*. This function *f(x)* is often expressed as:

*y* = *f(x)* + ε

where:

*   *y* is the dependent variable (the value we want to predict).
*   *x* is the independent variable (or a vector of independent variables).
*   *f(x)* is the regression function, which represents the relationship between *x* and *y*.  This function can take various forms (linear, polynomial, etc.) depending on the chosen regression model.
*   ε is the error term (also known as noise or residual), r
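
The loop above answers the sub-questions in order and passes every earlier question and answer pair into the next call as background. The control flow can be sketched without any model or retriever, as below. `answer_recursively` and `fake_answer_fn` are hypothetical names; the fake answer function stands in for the RAG chain and only reports how many earlier pairs it received.

```python
def format_qa_pair(question, answer):
    """Format one Q and A pair."""
    return f"Question: {question}\nAnswer: {answer}"

def answer_recursively(sub_questions, answer_fn):
    """Answer sub-questions in order, feeding earlier Q+A pairs into later calls."""
    q_a_pairs = ""
    answer = ""
    for sub_question in sub_questions:
        answer = answer_fn(sub_question, q_a_pairs)
        q_a_pairs = q_a_pairs + "\n---\n" + format_qa_pair(sub_question, answer)
    return answer, q_a_pairs

# Hypothetical stand-in for the RAG chain: reports how many earlier pairs it saw
def fake_answer_fn(sub_question, background):
    return f"answer using {background.count('Question:')} earlier pair(s)"

final_answer, history = answer_recursively(["q1", "q2", "q3"], fake_answer_fn)
print(final_answer)
```

Only the answer to the last sub-question is returned as the final result, so the ordering of the sub-questions matters: the last one should be the question that benefits most from the accumulated background.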

In [20]:
answer

"Regression analysis is a versatile tool with numerous applications across various fields. Here are some common real-world scenarios where regression is used:\n\n*   **Economics and Finance:**\n    *   **Predicting stock prices:** Regression models can be used to analyze historical stock data and other economic indicators to forecast future stock prices.\n    *   **Forecasting economic growth:** Governments and financial institutions use regression to predict GDP growth, inflation rates, and unemployment rates.\n    *   **Credit risk assessment:** Banks use regression models to assess the creditworthiness of loan applicants by analyzing their financial history and other relevant factors.\n    *   **Real estate appraisal:** Estimating property values based on location, size, features, and comparable sales data.\n\n*   **Marketing and Sales:**\n    *   **Sales forecasting:** Predicting future sales based on past sales data, marketing campaigns, and economic conditions.\n    *   **Custome

Trace:

Question 1: https://smith.langchain.com/public/faefde73-0ecb-4328-8fee-a237904115c0/r

Question 2: https://smith.langchain.com/public/6142cad3-b314-454e-b2c9-15146cfcce78/r

Question 3: https://smith.langchain.com/public/84bdca0f-0fa4-46d4-9f89-a7f25bd857fe/r

### Answer individually


In [21]:
# Answer each sub-question individually

from langchain import hub
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough, RunnableLambda
from langchain_core.output_parsers import StrOutputParser
from langchain_google_genai import ChatGoogleGenerativeAI

# RAG prompt
prompt_rag = hub.pull("rlm/rag-prompt")

def retrieve_and_rag(question,prompt_rag,sub_question_generator_chain):
    """RAG on each sub-question"""

    # Generate sub-questions with the decomposition chain
    sub_questions = sub_question_generator_chain.invoke({"question":question})

    # Initialize a list to hold RAG chain results
    rag_results = []

    for sub_question in sub_questions:

        # Retrieve documents for each sub-question
        retrieved_docs = retriever.invoke(sub_question)

        # Use retrieved documents and sub-question in RAG chain
        answer = (prompt_rag | llm | StrOutputParser()).invoke({"context": retrieved_docs,
                                                                "question": sub_question})
        rag_results.append(answer)
        print(sub_question)
        print(answer)
    return rag_results,sub_questions

# Run retrieval and RAG for each sub-question
answers, questions = retrieve_and_rag(question, prompt_rag, generate_queries_decomposition)


1. **What is the mathematical definition of regression in machine learning?** (Focuses on the underlying mathematical principles)
Regression algorithms find correlations between dependent and independent variables. The algorithm's task is to find the mapping function so we can map the input variable "x" to the continuous output variable "y." It attempts to find the best fit line, which predicts the output more accurately.
2. **What are the different types of regression algorithms in machine learning?** (Explores the various techniques used)
The different types of regression algorithms include Decision Tree Regression, Principal Components Regression, Polynomial Regression, and Random Forest Regression. Other types are Simple Linear Regression and Support Vector Regression. These algorithms vary in complexity and approach, suitable for different types of data and prediction tasks.
3. **What are the common applications of regression in real-world scenarios?** (Focuses on practical uses)


In [22]:
def format_qa_pairs(questions, answers):
    """Format Q and A pairs"""

    formatted_string = ""
    for i, (question, answer) in enumerate(zip(questions, answers), start=1):
        formatted_string += f"Question {i}: {question}\nAnswer {i}: {answer}\n\n"
    return formatted_string.strip()

context = format_qa_pairs(questions, answers)
print(context)
# Prompt
template = """Here is a set of Q+A pairs:

{context}

Use these to synthesize an answer to the question: {question}
"""

prompt = ChatPromptTemplate.from_template(template)

final_rag_chain = (
    prompt
    | llm
    | StrOutputParser()
)

final_rag_chain.invoke({"context":context,"question":question})

Question 1: 1. **What is the mathematical definition of regression in machine learning?** (Focuses on the underlying mathematical principles)
Answer 1: Regression algorithms find correlations between dependent and independent variables. The algorithm's task is to find the mapping function so we can map the input variable "x" to the continuous output variable "y." It attempts to find the best fit line, which predicts the output more accurately.

Question 2: 2. **What are the different types of regression algorithms in machine learning?** (Explores the various techniques used)
Answer 2: The different types of regression algorithms include Decision Tree Regression, Principal Components Regression, Polynomial Regression, and Random Forest Regression. Other types are Simple Linear Regression and Support Vector Regression. These algorithms vary in complexity and approach, suitable for different types of data and prediction tasks.

Question 3: 3. **What are the common applications of regressi

'Regression in machine learning is a process of finding correlations between dependent and independent variables, aiming to map input variables (x) to a continuous output variable (y). The core task is to find a mapping function, often represented as a "best fit line," that accurately predicts the output. There are various regression algorithms, including Simple Linear Regression, Polynomial Regression, Decision Tree Regression, Random Forest Regression, Principal Components Regression, and Support Vector Regression, each suited for different data types and prediction complexities. Regression finds practical application in real-world scenarios like predicting house prices and weather patterns by identifying the best-fit line to estimate continuous output values. These algorithms can be broadly categorized into linear and non-linear types.'

Trace:

https://smith.langchain.com/public/d8f26f75-3fb8-498a-a3a2-6532aa77f56b/r

## Part 8: Step Back


Paper:

* https://arxiv.org/pdf/2310.06117.pdf

In [23]:
# Few Shot Examples
from langchain_core.prompts import ChatPromptTemplate, FewShotChatMessagePromptTemplate
examples = [
    {
        "input": "Could the members of The Police perform lawful arrests?",
        "output": "what can the members of The Police do?",
    },
    {
        "input": "Jan Sindel’s was born in what country?",
        "output": "what is Jan Sindel’s personal history?",
    },
]
# We now transform these to example messages
example_prompt = ChatPromptTemplate.from_messages(
    [
        ("human", "{input}"),
        ("ai", "{output}"),
    ]
)
few_shot_prompt = FewShotChatMessagePromptTemplate(
    example_prompt=example_prompt,
    examples=examples,
)
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            """You are an expert at world knowledge. Your task is to step back and paraphrase a question to a more generic step-back question, which is easier to answer. Here are a few examples:""",
        ),
        # Few shot examples
        few_shot_prompt,
        # New question
        ("user", "{question}"),
    ]
)

In [25]:
generate_queries_step_back = prompt | ChatGoogleGenerativeAI(model="gemini-2.0-flash",temperature=0) | StrOutputParser()
question = "explain decision tree regression model and how it works"
generate_queries_step_back.invoke({"question": question})

'What is decision tree?'

In [26]:
# Response prompt
response_prompt_template = """You are an expert of world knowledge. I am going to ask you a question. Your response should be comprehensive and not contradicted with the following context if they are relevant. Otherwise, ignore them if they are not relevant.

# {normal_context}
# {step_back_context}

# Original Question: {question}
# Answer:"""
response_prompt = ChatPromptTemplate.from_template(response_prompt_template)

chain = (
    {
        # Retrieve context using the normal question
        "normal_context": RunnableLambda(lambda x: x["question"]) | retriever,
        # Retrieve context using the step-back question
        "step_back_context": generate_queries_step_back | retriever,
        # Pass on the question
        "question": lambda x: x["question"],
    }
    | response_prompt
    | ChatGoogleGenerativeAI(model="gemini-2.0-flash",temperature=0)
    | StrOutputParser()
)

chain.invoke({"question": question})

"A decision tree regression model is a type of supervised machine learning algorithm used for predicting continuous values. It works by partitioning the data into smaller subsets based on a series of if-else conditions, ultimately creating a tree-like structure that can be used to predict the value of a target variable.\n\nHere's a breakdown of how it works:\n\n1.  **Data Preparation:** The model is trained on a dataset containing independent variables (features) and a continuous dependent variable (target).\n2.  **Tree Building:**\n    *   The algorithm starts by considering all independent variables and potential split points for each variable.\n    *   For each split point, it divides the data into two subsets.\n    *   It then calculates the Sum of Squared Errors (SSE) for each split. The SSE measures the difference between the predicted values and the actual values in each subset.\n    *   The variable and split point that result in the lowest SSE are chosen as the split for the c
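
The dictionary at the head of the chain runs three branches on the same input: retrieval with the original question, retrieval with the step-back question, and a pass-through of the question itself. A plain-Python sketch of that fan-out is shown below. The names `step_back_inputs`, `toy_retrieve`, and `toy_step_back` are hypothetical stand-ins for the retriever and the `generate_queries_step_back` chain.

```python
def step_back_inputs(question, retrieve, make_step_back):
    """Build the three variables that the response prompt expects."""
    step_back_question = make_step_back(question)
    return {
        "normal_context": retrieve(question),                # specific question
        "step_back_context": retrieve(step_back_question),   # more generic question
        "question": question,
    }

# Hypothetical stand-ins so the sketch runs without a model or vector store
def toy_retrieve(query):
    return [f"passage about: {query}"]

def toy_step_back(question):
    return "What is decision tree?"

inputs = step_back_inputs(
    "explain decision tree regression model and how it works",
    toy_retrieve,
    toy_step_back,
)
print(inputs["step_back_context"])
```

The final answer is therefore grounded in two retrieval passes: one close to the wording of the question and one aimed at the broader concept behind it.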

## Part 9: HyDE

Docs:

* https://github.com/langchain-ai/langchain/blob/master/cookbook/hypothetical_document_embeddings.ipynb

Paper:

* https://arxiv.org/abs/2212.10496

In [27]:
from langchain.prompts import ChatPromptTemplate

# HyDE document generation
template = """Please write a scientific paper passage to answer the question
Question: {question}
Passage:"""
prompt_hyde = ChatPromptTemplate.from_template(template)

from langchain_core.output_parsers import StrOutputParser
from langchain_google_genai import ChatGoogleGenerativeAI

generate_docs_for_retrieval = (
    prompt_hyde | ChatGoogleGenerativeAI(model="gemini-2.0-flash",temperature=0) | StrOutputParser()
)

# Run
question = "What is task decomposition for LLM agents?"
generate_docs_for_retrieval.invoke({"question":question})

"## Task Decomposition in Large Language Model Agents: A Critical Overview\n\nTask decomposition is a fundamental cognitive process that involves breaking down a complex, high-level goal into a series of smaller, more manageable sub-tasks. In the context of Large Language Model (LLM) agents, task decomposition refers to the process of partitioning a user's initial request or objective into a sequence of actionable steps that the agent can execute, often leveraging external tools or APIs. This decomposition is crucial for enabling LLM agents to tackle intricate problems that require reasoning, planning, and interaction with the environment.\n\nThe effectiveness of task decomposition directly impacts the overall performance of the LLM agent. A well-decomposed task sequence allows the agent to: (1) systematically explore the problem space, (2) leverage specialized tools for specific sub-tasks, (3) maintain a coherent plan towards achieving the final goal, and (4) recover from errors or un

In [28]:
# Retrieve
retrieval_chain = generate_docs_for_retrieval | retriever
retireved_docs = retrieval_chain.invoke({"question":question})
retireved_docs

[Document(metadata={'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/', 'type': 'html'}, page_content='Chain of thought (CoT; Wei et al. 2022 ) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.\nTree of Thoughts ( Yao et al. 2023 ) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.\nTask decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.
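
The idea behind HyDE is that a generated passage tends to sit closer to the stored chunks in embedding space than the short question does. The toy sketch below illustrates this with a bag-of-words cosine similarity, which is only a crude stand-in for the real embedding model; the three example strings and the helper names `bow` and `cosine` are assumptions made for the illustration.

```python
import math
from collections import Counter

def bow(text):
    """Bag-of-words vector: word -> count."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

question_text = "what is task decomposition"
hypothetical_passage = "task decomposition breaks a complex task into smaller steps"
stored_chunk = "the agent breaks a complex task into smaller manageable steps"

sim_question = cosine(bow(question_text), bow(stored_chunk))
sim_hypothetical = cosine(bow(hypothetical_passage), bow(stored_chunk))
print(f"question vs chunk:     {sim_question:.3f}")
print(f"hypothetical vs chunk: {sim_hypothetical:.3f}")
```

In this toy setup the hypothetical passage scores higher against the stored chunk than the question does, because it is written in the vocabulary of an answer rather than in the vocabulary of a question.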

In [29]:
# RAG
template = """Answer the following question based on this context:

{context}

Question: {question}
"""

prompt = ChatPromptTemplate.from_template(template)

final_rag_chain = (
    prompt
    | llm
    | StrOutputParser()
)

final_rag_chain.invoke({"context":retireved_docs,"question":question})

'Task decomposition for LLM agents is the process of breaking down large tasks into smaller, more manageable subgoals. This enables the agent to handle complex tasks more efficiently. Task decomposition can be done by:\n\n1.  LLM with simple prompting (e.g., "Steps for XYZ. 1.", "What are the subgoals for achieving XYZ?")\n2.  Using task-specific instructions (e.g., "Write a story outline." for writing a novel)\n3.  With human inputs.'