https://github.com/langchain-ai/rag-from-scratch/blob/main/rag_from_scratch_5_to_9.ipynb

Some deviations from the source code, because I don't want to pay for OpenAI embeddings or call OpenAI models: all OpenAI integration is replaced with Ollama.

I also removed the LangSmith integration. I don't think it's needed here: it's a frontend for LLM debugging, and I can get enough of that with `langchain.debug = True`.

Step-back: make the LLM generate a more abstract version of the question, then retrieve for both the original and the abstract question. E.g. "What's the derivative of the exponential function?" --more abstract--> "What's a derivative in calculus?"


In [1]:
from langchain_community.vectorstores import Chroma
# Load documents
import bs4
from langchain_community.document_loaders import WebBaseLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Setting debug to True lets us see what LangChain is actually creating
import langchain
langchain.debug = True

# Get embedding model
from langchain_ollama import OllamaEmbeddings

# Get chat model
from langchain_ollama.chat_models import ChatOllama

from langchain.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

from operator import itemgetter

USER_AGENT environment variable not set, consider setting it to identify your requests.


In [42]:
# Everything in this cell is from previous notebooks
# Load docs from bs4
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)
blog_docs = loader.load()

# Split docs
text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=300, 
    chunk_overlap=50)

splits = text_splitter.split_documents(blog_docs)

# Get embedding ollama model
embed = OllamaEmbeddings(
    model="nomic-embed-text"
)

# Embed
vectorstore = Chroma.from_documents(
    documents=splits, 
    embedding=embed)

# Set up a retriever on top of the vector store
# (the embedding model was already created above, so it is not re-created here)
retriever = vectorstore.as_retriever(
    search_kwargs={"k": 5}, # How many to retrieve
    search_type='mmr'       # 'similarity' by default
)

# Get llm
llm = ChatOllama(model="llama3.2:3b-instruct-q5_K_M", temperature=0)
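
The retriever above uses `search_type='mmr'` (maximal marginal relevance), which trades relevance to the query against redundancy among the documents already picked. Below is a minimal pure-Python sketch of that idea, not Chroma's or LangChain's actual implementation. `mmr_select` and `cosine` are hypothetical helper names, the vectors are toy data, and `lambda_mult` is named after the LangChain search parameter of the same name.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def mmr_select(query_vec, doc_vecs, k=5, lambda_mult=0.5):
    """Return indices of up to k documents chosen by maximal marginal relevance.

    lambda_mult=1.0 ranks purely by relevance to the query;
    lower values penalize documents similar to ones already selected.
    """
    candidates = list(range(len(doc_vecs)))
    selected = []
    while candidates and len(selected) < k:
        best_idx, best_score = None, -math.inf
        for i in candidates:
            relevance = cosine(query_vec, doc_vecs[i])
            # Similarity to the closest already-selected document (0 if none yet)
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
                default=0.0,
            )
            score = lambda_mult * relevance - (1 - lambda_mult) * redundancy
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
        candidates.remove(best_idx)
    return selected

# Toy data (assumption): docs 0 and 1 are near-duplicates, doc 2 is different
query = [1.0, 0.0]
docs = [[0.9, 0.1], [0.9, 0.12], [0.7, -0.7]]

print(mmr_select(query, docs, k=2, lambda_mult=1.0))  # relevance only
print(mmr_select(query, docs, k=2, lambda_mult=0.5))  # relevance vs. diversity
```

With `lambda_mult=1.0` the two near-duplicates win. With `lambda_mult=0.5` the second pick switches to the less similar document, which is the behaviour that makes MMR useful when many chunks of a blog post say nearly the same thing.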

In [7]:
# This part looks unnecessary. It's probably easier to build the few-shot prompt myself than to call these helper functions.

from langchain_core.prompts import ChatPromptTemplate, FewShotChatMessagePromptTemplate
examples = [
    {
        "input": "Could the members of The Police perform lawful arrests?",
        "output": "what can the members of The Police do?",
    },
    {
        "input": "Jan Sindel’s was born in what country?",
        "output": "what is Jan Sindel’s personal history?",
    },
]

example_prompt = ChatPromptTemplate.from_messages(
    [
        ("human", "{input}"),
        ("ai", "{output}"),
    ]
)
few_shot_prompt = FewShotChatMessagePromptTemplate(
    example_prompt=example_prompt,
    examples=examples,
)
print(few_shot_prompt)

examples=[{'input': 'Could the members of The Police perform lawful arrests?', 'output': 'what can the members of The Police do?'}, {'input': 'Jan Sindel’s was born in what country?', 'output': 'what is Jan Sindel’s personal history?'}] input_variables=[] input_types={} partial_variables={} example_prompt=ChatPromptTemplate(input_variables=['input', 'output'], input_types={}, partial_variables={}, messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['input'], input_types={}, partial_variables={}, template='{input}'), additional_kwargs={}), AIMessagePromptTemplate(prompt=PromptTemplate(input_variables=['output'], input_types={}, partial_variables={}, template='{output}'), additional_kwargs={})])


In [47]:
examples = [
    {
        "input": "Could the members of The Police perform lawful arrests?",
        "output": "what can the members of The Police do?",
    },
    {
        "input": "Jan Sindel’s was born in what country?",
        "output": "what is Jan Sindel’s personal history?",
    },
]

few_shot_prompt = '\n'.join([f"user: {example['input']}\nassistant: {example['output']}" for example in examples])
print(few_shot_prompt)

user: Could the members of The Police perform lawful arrests?
assistant: what can the members of The Police do?
user: Jan Sindel’s was born in what country?
assistant: what is Jan Sindel’s personal history?


In [50]:
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            """You are an expert at world knowledge. Your task is to step back and paraphrase a question to a more generic step-back question, which is easier to answer. Here are a few examples:""",
        ),
        # Few shot examples
        few_shot_prompt,
        # New question
        ("user", "Please paraphrase the following question into a more generic version:\n{question}"),
    ]
)

In [51]:
question = 'What is task decomposition for LLM agents?'

stepback_chain = prompt | llm | StrOutputParser()
stepback_chain.invoke({'question': question})

[chain/start] [chain:RunnableSequence] Entering Chain run with input:
{
  "question": "What is task decomposition for LLM agents?"
}
[chain/start] [chain:RunnableSequence > prompt:ChatPromptTemplate] Entering Prompt run with input:
{
  "question": "What is task decomposition for LLM agents?"
}
[chain/end] [chain:RunnableSequence > prompt:ChatPromptTemplate] [1ms] Exiting Prompt run with output:
[outputs]
[llm/start] [chain:RunnableSequence > llm:ChatOllama] Entering LLM run with input:
{
  "prompts": [
    "System: You are an expert at world knowledge. Your task is to step back and paraphrase a question to a more generic step-back question, which is easier to answer. Here are a few examples:\nHuman: user: Could the members of The Police perform lawful arrests?\nassistant: what can the members of The Police do?\nuser: Jan Sindel’s was born in what country?\nassistant: what is Jan Sindel’s

"Here's a paraphrased version:\n\nWhat are the fundamental building blocks of complex tasks that an artificial intelligence system can perform?"
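
The debug output above shows the hand-built few-shot string arriving as one `Human:` message, because a bare string passed to `ChatPromptTemplate.from_messages` is treated as a human message template. If you'd rather keep each example as its own human/ai turn without `FewShotChatMessagePromptTemplate`, one option is to build the `(role, template)` tuples directly. This is a sketch only: `few_shot_messages` and `escape_braces` are hypothetical helper names, and the brace escaping assumes the default f-string style templates.

```python
def escape_braces(text):
    """Double curly braces so '{' and '}' in an example are kept literal
    instead of being read as template variables."""
    return text.replace("{", "{{").replace("}", "}}")

def few_shot_messages(examples):
    """Turn [{'input': ..., 'output': ...}, ...] into (role, template) tuples."""
    messages = []
    for example in examples:
        messages.append(("human", escape_braces(example["input"])))
        messages.append(("ai", escape_braces(example["output"])))
    return messages

examples = [
    {
        "input": "Could the members of The Police perform lawful arrests?",
        "output": "what can the members of The Police do?",
    },
]

for role, text in few_shot_messages(examples):
    print(f"{role}: {text}")

# In the notebook these tuples could be spliced into the prompt, e.g.
# ChatPromptTemplate.from_messages([("system", "..."), *few_shot_messages(examples), ("user", "...{question}")])
```

Whether separate turns work better than one combined message with a small local model is something to check empirically; this only changes how the examples are presented.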

In [57]:
# Response prompt 
response_prompt_template = """You are an expert of world knowledge. I am going to ask you a question. Your response should be comprehensive and not contradicted with the following context if they are relevant. Otherwise, ignore them if they are not relevant.

# {normal_context}
# {step_back_context}

# Original Question: {question}
# Answer:"""

response_prompt = ChatPromptTemplate.from_template(response_prompt_template)

rag_chain = (
    {
        'normal_context': itemgetter('question') | retriever,
        'step_back_context': itemgetter('question') | stepback_chain | retriever,
        'question': itemgetter('question')
    }
    | response_prompt 
    | llm
)
answer = rag_chain.invoke({'question': question})

[chain/start] [chain:RunnableSequence] Entering Chain run with input:
{
  "question": "What is task decomposition for LLM agents?"
}
[chain/start] [chain:RunnableSequence > chain:RunnableParallel<normal_context,step_back_context,question>] Entering Chain run with input:
{
  "question": "What is task decomposition for LLM agents?"
}
[chain/start] [chain:RunnableSequence > chain:RunnableParallel<normal_context,step_back_context,question> > chain:RunnableSequence] Entering Chain run with input:
{
  "question": "What is task decomposition for LLM agents?"
}
[chain/start] [chain:RunnableSequence > chain:RunnableParallel<normal_context,step_back_context,question> > chain:RunnableSequence > chain:RunnableLambda] Entering Chain run with input:
{
  "question": "What is task decomposition for LLM agents?"
}
[chain/end] [chain:RunnableSequence > chain:RunnableParallel<normal_co

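In the chain above, the retriever output goes into the prompt as a list of whole document objects, so the rendered prompt carries each document's metadata along with its text. A common way to pass only the text is a small formatting function between the retriever and the prompt. The sketch below uses a stand-in `Doc` dataclass (hypothetical, mimicking only the `page_content` and `metadata` attributes of a LangChain document) so it runs without LangChain installed.

```python
from dataclasses import dataclass, field

@dataclass
class Doc:
    """Stand-in for a retrieved document (assumption: only these two fields matter)."""
    page_content: str
    metadata: dict = field(default_factory=dict)

def format_docs(docs):
    """Join the text of each document, dropping metadata such as the source URL."""
    return "\n\n".join(doc.page_content for doc in docs)

retrieved = [
    Doc("Task decomposition splits a task into steps.", {"source": "https://example.com"}),
    Doc("Chain of thought prompts the model to think step by step."),
]
print(format_docs(retrieved))

# In the notebook, the function could be piped after the retriever, e.g.
# 'normal_context': itemgetter('question') | retriever | format_docs
```

Piping a plain function like this should work because LCEL wraps callables in a runnable when they are combined with `|`, but that wiring is untested here.
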
In [60]:
# Note that the retrieved context in the prompt looks a little dirty: it contains the document source (metadata) etc., instead of just the text.
print(answer.content)

Based on the provided text, here's a possible answer:

"Task decomposition for LLM (Large Language Model) agents refers to the process of breaking down complex tasks into smaller, more manageable sub-tasks. This allows the agent to focus on one task at a time and improve its performance on each individual sub-task.

In the context of LLM agents, task decomposition typically involves identifying the key components or features that need to be extracted from the input data in order to complete the task. For example, if the task is to generate text summarization, the agent might decompose it into smaller tasks such as:

* Tokenization: breaking down the input text into individual words or tokens
* Part-of-speech tagging: identifying the grammatical category of each word (e.g. noun, verb, adjective)
* Named entity recognition: identifying specific entities mentioned in the text (e.g. people, places, organizations)

By decomposing the task into smaller sub-tasks, the LLM agent can use its la