# Build a Local RAG Application

In [1]:
# Document loading, retrieval methods and text splitting
%pip install -qU langchain langchain_community

# Local vector store via Chroma
%pip install -qU langchain_chroma

# Local inference and embeddings via Ollama
%pip install -qU langchain_ollama

# Web Loader
%pip install -qU beautifulsoup4

Note: you may need to restart the kernel to use updated packages.


## Document Loading

In [1]:
from langchain_community.document_loaders import WebBaseLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

loader = WebBaseLoader("https://lilianweng.github.io/posts/2023-06-23-agent/")
data = loader.load()

text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
all_splits = text_splitter.split_documents(data)

USER_AGENT environment variable not set, consider setting it to identify your requests.
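The warning above comes from the web loader, which looks for a `USER_AGENT` environment variable. A minimal sketch of silencing it, assuming the variable is set before `langchain_community` is imported. The agent string is a hypothetical placeholder, not a value from the original notebook:

```python
import os

# Hypothetical identifier; replace it with something that describes your project.
# setdefault leaves any value that is already configured untouched.
os.environ.setdefault("USER_AGENT", "local-rag-tutorial/0.1")

# Import WebBaseLoader only after this point so the variable is already visible.
user_agent = os.environ["USER_AGENT"]
print(user_agent)
```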


In [2]:
from langchain_chroma import Chroma
from langchain_ollama import OllamaEmbeddings

local_embeddings = OllamaEmbeddings(model="nomic-embed-text")
vectorstore = Chroma.from_documents(documents=all_splits, embedding=local_embeddings)

In [3]:
question = "What are the approaches to Task Decomposition?"
docs = vectorstore.similarity_search(question)
len(docs)

4

In [4]:
docs[0]

Document(metadata={'description': 'Building agents with LLM (large language model) as its core controller is a cool concept. Several proof-of-concepts demos, such as AutoGPT, GPT-Engineer and BabyAGI, serve as inspiring examples. The potentiality of LLM extends beyond generating well-written copies, stories, essays and programs; it can be framed as a powerful general problem solver.\nAgent System Overview In a LLM-powered autonomous agent system, LLM functions as the agent’s brain, complemented by several key components:', 'language': 'en', 'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/', 'title': "LLM Powered Autonomous Agents | Lil'Log"}, page_content='Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs.')
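The search returns four documents because `similarity_search` defaults to `k=4`. As a rough illustration of what "nearest chunks" means, here is a toy top-k ranking over hand-written 2-D vectors using cosine similarity. This is a sketch only: the vectors and the helper names `cosine_similarity` and `top_k` are made up for illustration, and Chroma's actual index and distance metric may differ.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=4):
    """Return the indices of the k vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), idx) for idx, vec in enumerate(doc_vecs)]
    scored.sort(reverse=True)  # highest similarity first
    return [idx for _, idx in scored[:k]]

# Toy "embeddings" standing in for the real chunk vectors.
toy_vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.7, 0.7], [-1.0, 0.0]]
nearest = top_k([1.0, 0.0], toy_vectors, k=4)
print(nearest)
```

To retrieve a different number of chunks from the real vector store, pass `k` explicitly, for example `vectorstore.similarity_search(question, k=2)`.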

In [5]:
from langchain_ollama import ChatOllama

model = ChatOllama(
    model="llama3.1",
)

In [6]:
response_message = model.invoke(
    "Simulate a rap battle between Stephen Colbert and John Oliver"
)

print(response_message.content)

**The scene is set: A dark, crowded comedy club. The audience is on the edge of their seats as two comedic heavyweights face off in the ultimate rap battle showdown. In the blue corner, we have Stephen Colbert, aka "The O'Reilly Factor's" erstwhile host. And in the red corner, it's John Oliver, the biting satirist from Last Week Tonight. Let's get ready to rumble!**

**Round 1: Stephen Colbert**
(Speaking confidently)
Yo, I'm the king of late-night TV fame,
From The Daily Show to this comedy game.
My wit is sharp, my satire's tight,
I take on politics with all my might.

**(Colbert starts rapping)**

I'm not a pundit, no Fox in me,
Just a satirist, with sarcasm as my decree.
I skewer the right, and sometimes the left,
My words are truth, don't get it twisted, I'm correct!

**Round 1: John Oliver**
(Speaking with a hint of amusement)
Hold up, Stephen, let's not get ahead,
Your time at The Daily Show was quite... well-bred.
But since then, you've been playing the fool,
A conservative, tr

## Using in a chain

In [7]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_template(
    "Summarize the main themes in these retrieved docs: {docs}"
)

# Convert a list of Documents into one string by joining their page_content (metadata is ignored)
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

chain = {"docs": format_docs} | prompt | model | StrOutputParser()

question = "What are the approaches to Task Decomposition?"

docs = vectorstore.similarity_search(question)

chain.invoke(docs)

'The main themes in these retrieved documents appear to be:\n\n1. **Task Decomposition**: Breaking down complex tasks into smaller, manageable subgoals, which can be achieved through various methods such as:\n\t* Using Large Language Models (LLM) with simple prompting\n\t* Utilizing task-specific instructions\n\t* Incorporating human inputs\n2. **Planning and Execution**: A system or agent that can plan ahead, execute specific tasks, and log results efficiently.\n3. **Autonomous Agent System**: An overview of a system powered by LLMs, which involves multiple components, including:\n\t* Planning: Breaking down complex tasks into smaller subgoals and planning ahead\n\t* Task Execution: Expert models executing on specific tasks and logging results\n4. **Self-Improvement**: The ability of an agent to reflect on past actions, learn from mistakes, and refine them for future steps, leading to improved quality in final results.\n\nThese themes seem to be related to the development of autonomou
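To see what `format_docs` hands to the prompt without running a model, the helper can be tried on stand-in objects. In this sketch `FakeDocument` is a hypothetical substitute for LangChain's `Document`, carrying only the two attributes used in this notebook:

```python
from dataclasses import dataclass, field

@dataclass
class FakeDocument:
    """Hypothetical stand-in exposing page_content and metadata."""
    page_content: str
    metadata: dict = field(default_factory=dict)

def format_docs(docs):
    # Same helper as above: join page_content with blank lines, drop metadata.
    return "\n\n".join(doc.page_content for doc in docs)

sample = [
    FakeDocument("First chunk.", {"source": "a"}),
    FakeDocument("Second chunk.", {"source": "b"}),
]
formatted = format_docs(sample)
print(formatted)
```

Because the chain starts with `{"docs": format_docs}`, the list passed to `chain.invoke(docs)` is converted to this single string before it fills the `{docs}` slot in the prompt.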

## Q&A

In [9]:
from langchain_core.runnables import RunnablePassthrough

RAG_TEMPLATE = """
You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.

<context>
{context}
</context>

Answer the following question:

{question}"""

rag_prompt = ChatPromptTemplate.from_template(RAG_TEMPLATE)

chain = (
    RunnablePassthrough.assign(context=lambda inputs: format_docs(inputs["context"]))
    | rag_prompt
    | model
    | StrOutputParser()
)

question = "What are the approaches to Task Decomposition?"

docs = vectorstore.similarity_search(question)

# Run
chain.invoke({"context": docs, "question": question})

'There are three approaches to Task Decomposition: (1) using Large Language Models (LLM) with simple prompting, (2) using task-specific instructions, and (3) with human inputs. These approaches help break down large tasks into smaller, manageable subgoals for efficient handling. This process enables the agent to plan ahead and improve the quality of final results through reflection and refinement.'
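It can help to inspect the text the model actually receives. The sketch below approximates the prompt rendering with plain `str.format`, on the assumption that the template uses ordinary `{placeholder}` substitution. The context string is a made-up placeholder, and the real chain wraps the result in a chat message rather than passing a bare string.

```python
RAG_TEMPLATE = """
You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.

<context>
{context}
</context>

Answer the following question:

{question}"""

# Placeholder context standing in for the output of format_docs(docs).
rendered = RAG_TEMPLATE.format(
    context="Chunk one.\n\nChunk two.",
    question="What are the approaches to Task Decomposition?",
)
print(rendered)
```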

## Q&A with retrieval

In [10]:
retriever = vectorstore.as_retriever()

In [11]:
qa_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | rag_prompt
    | model
    | StrOutputParser()
)

In [12]:
question = "What are the approaches to Task Decomposition?"

qa_chain.invoke(question)

'Task decomposition can be done through three approaches: (1) using Large Language Models (LLM) with simple prompting, (2) utilizing task-specific instructions, and (3) receiving human inputs. This allows agents to break down complex tasks into manageable subgoals.'
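The dict at the head of `qa_chain` sends the same input question down two branches: one retrieves and formats context, the other passes the question through unchanged. A plain-Python sketch of that data flow follows. `fake_retriever`, `join_chunks`, and `build_prompt_inputs` are hypothetical stand-ins for illustration, not LangChain APIs.

```python
def fake_retriever(question):
    # Hypothetical stand-in for the retriever: returns chunk texts for a question.
    return [
        f"Chunk A relevant to: {question}",
        f"Chunk B relevant to: {question}",
    ]

def join_chunks(chunks):
    # Plays the role of format_docs for plain strings.
    return "\n\n".join(chunks)

def build_prompt_inputs(question):
    # Mirrors {"context": retriever | format_docs, "question": RunnablePassthrough()}
    return {
        "context": join_chunks(fake_retriever(question)),  # retrieval branch
        "question": question,                              # passthrough branch
    }

inputs = build_prompt_inputs("What is task decomposition?")
print(inputs)
```

The resulting dict has exactly the two keys that `rag_prompt` expects, which is why the chain can be invoked with a bare question string instead of a pre-built dict.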