https://github.com/langchain-ai/rag-from-scratch/blob/main/rag_from_scratch_5_to_9.ipynb

This notebook deviates from the source code in a few places because I don't want to pay for OpenAI embeddings or call OpenAI models. All OpenAI integration is replaced with Ollama.

I also removed the LangSmith integration since I don't think it's needed here. It's mostly a frontend for LLM debugging, and I can get similar visibility with `langchain.debug = True`.

HyDE (Hypothetical Document Embeddings) - In RAG, we normally retrieve documents by comparing their embeddings with the embedding of the question. But a question is written in each user's own style and is typically quite different from the documents it's compared against.

To resolve this, HyDE asks an LLM to write a hypothetical document based on the input question, then runs the retrieval search with that hypothetical document instead of the question.
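
To make the intuition concrete, here is a toy sketch that uses only the standard library. The bag-of-words `embed` function, the two sample chunks, and the hand-written `hypothetical_doc` are all made up for illustration. A real pipeline would use an embedding model and an LLM-generated document, and the scores there can look quite different.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy 'embedding': lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Two made-up chunks standing in for the vector store contents.
chunks = [
    "Planning: the agent breaks a complicated task into smaller subgoals and simpler steps.",
    "Memory: the agent stores information in a vector store for fast retrieval.",
]

question = "what is task decomposition for LLM agents"

# Hand-written stand-in for what the LLM would generate in the HyDE step.
hypothetical_doc = (
    "Task decomposition means the agent breaks a complicated task "
    "into smaller steps and subgoals."
)

def rank(query):
    """Return the chunks sorted by similarity to the query, best first."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)

# Compare how well each query matches the relevant (planning) chunk.
question_score = cosine(embed(question), embed(chunks[0]))
hyde_score = cosine(embed(hypothetical_doc), embed(chunks[0]))

print(f"question vs planning chunk:         {question_score:.3f}")
print(f"hypothetical doc vs planning chunk: {hyde_score:.3f}")
print("top chunk for hypothetical doc:", rank(hypothetical_doc)[0])
```

The hypothetical document shares far more vocabulary with the relevant chunk than the short question does. That is the effect HyDE relies on, here exaggerated by the word-count embedding.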

In [1]:
from langchain_community.vectorstores import Chroma
# Load documents
import bs4
from langchain_community.document_loaders import WebBaseLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Setting debug to True lets us see what LangChain actually builds and sends at each step
import langchain
langchain.debug = True

# Get embedding model
from langchain_ollama import OllamaEmbeddings

# Get chat model
from langchain_ollama.chat_models import ChatOllama

from langchain.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

from operator import itemgetter

USER_AGENT environment variable not set, consider setting it to identify your requests.


In [2]:
# Everything in this cell is from previous notebooks
# Load docs from bs4
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)
blog_docs = loader.load()

# Split docs
text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=300, 
    chunk_overlap=50)

splits = text_splitter.split_documents(blog_docs)

# Get embedding ollama model
embed = OllamaEmbeddings(
    model="nomic-embed-text"
)

# Embed
vectorstore = Chroma.from_documents(
    documents=splits, 
    embedding=embed)

# Set up a retriever
retriever = vectorstore.as_retriever(
    search_kwargs={"k": 5}, # How many to retrieve
    search_type='mmr'       # 'similarity' by default
)

# Get llm
llm = ChatOllama(model="llama3.2:3b-instruct-q5_K_M", temperature=0)

In [5]:
hyde_prompt_template = """Please write a paper/document to answer the question:
Question: {question}"""

hyde_prompt = ChatPromptTemplate.from_template(hyde_prompt_template)

question = 'what is task decomposition for LLM agents'

hyde_chain = (hyde_prompt | llm | StrOutputParser())
hyde_chain.invoke({'question': question})

[chain/start] [chain:RunnableSequence] Entering Chain run with input:
{
  "question": "what is task decomposition for LLM agents"
}
[chain/start] [chain:RunnableSequence > prompt:ChatPromptTemplate] Entering Prompt run with input:
{
  "question": "what is task decomposition for LLM agents"
}
[chain/end] [chain:RunnableSequence > prompt:ChatPromptTemplate] [1ms] Exiting Prompt run with output:
[outputs]
[llm/start] [chain:RunnableSequence > llm:ChatOllama] Entering LLM run with input:
{
  "prompts": [
    "Human: Please write a paper/document to answer the question:\nQuestion: what is task decomposition for LLM agents"
  ]
}
[llm/end] [chain:RunnableSequence > llm:ChatOllama] [59.28s] Exiting LLM run with output:
{
  "generations": [
    [
      {
        "text": "**Task Decomposition for Large Language Model (LLM) Agents**\n\n**Abstract**\n\nTask decomposition is

"**Task Decomposition for Large Language Model (LLM) Agents**\n\n**Abstract**\n\nTask decomposition is a crucial aspect of artificial intelligence that involves breaking down complex tasks into smaller, manageable sub-tasks. In the context of Large Language Model (LLM) agents, task decomposition plays a vital role in enabling these agents to perform a wide range of cognitive and decision-making tasks. This paper provides an overview of task decomposition for LLM agents, including its importance, benefits, and challenges.\n\n**Introduction**\n\nLarge Language Models (LLMs) have revolutionized the field of artificial intelligence by achieving state-of-the-art performance in various natural language processing (NLP) tasks. However, these models are typically trained on large datasets and lack the ability to generalize to new, unseen tasks. Task decomposition is a technique that addresses this limitation by breaking down complex tasks into smaller sub-tasks, allowing LLM agents to learn an

In [6]:
actual_prompt_template = """Answer the following question based on this context:

{context}

Question: {question}
"""
actual_prompt = ChatPromptTemplate.from_template(actual_prompt_template)

hyde_rag = (
    {
        'context': itemgetter('question') | hyde_chain | retriever,
        'question': itemgetter('question')
    }
    | actual_prompt
    | llm
)

hyde_rag.invoke({'question': question})

[chain/start] [chain:RunnableSequence] Entering Chain run with input:
{
  "question": "what is task decomposition for LLM agents"
}
[chain/start] [chain:RunnableSequence > chain:RunnableParallel<context,question>] Entering Chain run with input:
{
  "question": "what is task decomposition for LLM agents"
}
[chain/start] [chain:RunnableSequence > chain:RunnableParallel<context,question> > chain:RunnableSequence] Entering Chain run with input:
{
  "question": "what is task decomposition for LLM agents"
}
[chain/start] [chain:RunnableSequence > chain:RunnableParallel<context,question> > chain:RunnableSequence > chain:RunnableLambda] Entering Chain run with input:
{
  "question": "what is task decomposition for LLM agents"
}
[chain/end] [chain:RunnableSequence > chain:RunnableParallel<context,question> > chain:RunnableSequence > chain:RunnableLambda] [0ms] Exiting Chain r

AIMessage(content='According to the text, task decomposition for LLM (Large Language Model) agents involves breaking down a complicated task into smaller and simpler steps. This can be done in three ways:\n\n1. Using simple prompting, such as "Steps for XYZ." or "What are the subgoals for achieving XYZ."\n2. Using task-specific instructions, such as "Write a story outline" for writing a novel.\n3. With human inputs.\n\nTask decomposition is used to help LLM agents plan ahead and effectively explore the solution space, which remains challenging due to their limited context length and struggle with adjusting plans when faced with unexpected errors.', additional_kwargs={}, response_metadata={'model': 'llama3.2:3b-instruct-q5_K_M', 'created_at': '2025-02-06T15:19:02.372361651Z', 'done': True, 'done_reason': 'stop', 'total_duration': 49955587987, 'load_duration': 23160991, 'prompt_eval_count': 1539, 'prompt_eval_duration': 40850000000, 'eval_count': 125, 'eval_duration': 9081000000, 'messag
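
Read as plain Python, the `hyde_rag` chain does roughly the following. This is only a sketch of the data flow: `fake_hyde`, `fake_retrieve`, and `build_prompt_inputs` are hypothetical stand-ins I made up, not LangChain APIs, and they return canned strings instead of calling a model or a vector store.

```python
def fake_hyde(question):
    # Stand-in for hyde_chain: an LLM would write a hypothetical document here.
    return f"A short paper answering: {question}"

def fake_retrieve(text):
    # Stand-in for the retriever: would return the chunks most similar to `text`.
    return [f"chunk matched against: {text}"]

def build_prompt_inputs(inputs):
    # Mirrors the dict step of the chain: both keys are derived from the same input.
    question = inputs["question"]
    return {
        "context": fake_retrieve(fake_hyde(question)),  # search with the hypothetical doc
        "question": question,                           # original question passes through
    }

result = build_prompt_inputs({"question": "what is task decomposition for LLM agents"})
print(result["context"])
print(result["question"])
```

The point to notice is that only the retrieval step sees the hypothetical document. The final prompt still contains the user's original question, with the retrieved chunks as context.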