# Build a Retrieval Augmented Generation (RAG) App
This file is a test run that follows the tutorial described at: https://python.langchain.com/docs/tutorials/rag/

## Installation

```console
py -m venv venv
.\venv\Scripts\activate
pip install --quiet --upgrade langchain langchain-community langchain-chroma
pip install -qU langchain-openai
pip install beautifulsoup4
```

## LangSmith
Many applications built with LangChain chain together several steps, each of which may call an LLM more than once. As these applications grow more complex, it becomes crucial to be able to inspect exactly what is happening inside your chain or agent. LangSmith is the recommended tool for this; the cell below enables tracing by setting two environment variables.

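```python
import os

import utils  # assumed to be a local helper module that exposes parsed settings as utils.config

# Enable LangSmith tracing and supply the API key (environment values must be strings).
os.environ["LANGCHAIN_TRACING_V2"] = utils.config["LANG"]["LANGCHAIN_TRACING_V2"]
os.environ["LANGCHAIN_API_KEY"] = utils.config["LANG"]["LANGCHAIN_API_KEY"]
```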
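The `utils` module imported above is not shown in this file. As a rough sketch of one shape it could take, the snippet below assumes that `utils.config` is a `configparser.ConfigParser` loaded from an INI file. The `load_config` helper, the INI text, and the placeholder key are all hypothetical; only the `LANG` section and its two key names come from the notebook.

```python
import configparser

# Hypothetical INI contents. The section and key names mirror the lookups
# used in the notebook; the values are placeholders.
EXAMPLE_INI = """
[LANG]
LANGCHAIN_TRACING_V2 = true
LANGCHAIN_API_KEY = YOUR_LANGSMITH_API_KEY
"""


def load_config(ini_text):
    """Parse INI text into a ConfigParser (hypothetical stand-in for utils.config)."""
    # interpolation=None keeps values such as API keys literal, even if they contain '%'.
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # preserve the upper-case key names
    parser.read_string(ini_text)
    return parser


config = load_config(EXAMPLE_INI)
tracing_flag = config["LANG"]["LANGCHAIN_TRACING_V2"]
```

ConfigParser returns every value as a string, which is what `os.environ` requires. In a real setup the file holding the keys would normally be kept out of version control.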


## 1. Indexing: Load
We first need to load the blog post contents. We can use DocumentLoaders for this: objects that load data from a source and return a list of Documents. A Document is an object with some page_content (str) and metadata (dict).

In this case we'll use the WebBaseLoader, which uses urllib to load HTML from web URLs and BeautifulSoup to parse it to text. We can customize the HTML -> text parsing by passing parameters to the BeautifulSoup parser via bs_kwargs (see the BeautifulSoup docs). Here only HTML tags with class "post-content", "post-title", or "post-header" are relevant, so we'll discard all others.
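To illustrate what filtering by class does, here is a minimal sketch that uses only Python's standard `html.parser`. It is not how BeautifulSoup or `SoupStrainer` is implemented; the `ClassFilter` class, the `VOID_TAGS` set, and the sample HTML are made up for illustration.

```python
from html.parser import HTMLParser

KEEP_CLASSES = {"post-title", "post-header", "post-content"}
# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ClassFilter(HTMLParser):
    """Collect text that sits inside an element carrying one of KEEP_CLASSES."""

    def __init__(self):
        super().__init__()
        self.depth = 0   # > 0 while we are inside a kept element
        self.parts = []  # text fragments that survived the filter

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = set((dict(attrs).get("class") or "").split())
        if self.depth > 0 or classes & KEEP_CLASSES:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.parts.append(data)


sample_html = (
    '<html><nav class="menu">Home</nav>'
    '<h1 class="post-title">Agents</h1>'
    '<div class="post-content"><p>Body <b>text</b>.</p></div>'
    '<footer>2023</footer></html>'
)

parser = ClassFilter()
parser.feed(sample_html)
kept_text = "".join(parser.parts)
```

The navigation and footer text is dropped, while the title and body text is kept. This is the effect the strainer has on the blog post.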

```python
import bs4
from langchain_community.document_loaders import WebBaseLoader

# Only keep post title, headers, and content from the full HTML.
bs4_strainer = bs4.SoupStrainer(class_=("post-title", "post-header", "post-content"))
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs={"parse_only": bs4_strainer},
)
docs = loader.load()

print(len(docs[0].page_content))   # total number of characters in the post
print(docs[0].page_content[:500])  # preview of the first 500 characters
```

```text
USER_AGENT environment variable not set, consider setting it to identify your requests.
43131


      LLM Powered Autonomous Agents
    
Date: June 23, 2023  |  Estimated Reading Time: 31 min  |  Author: Lilian Weng


Building agents with LLM (large language model) as its core controller is a cool concept. Several proof-of-concepts demos, such as AutoGPT, GPT-Engineer and BabyAGI, serve as inspiring examples. The potentiality of LLM extends beyond generating well-written copies, stories, essays and programs; it can be framed as a powerful general problem solver.
Agent System Overview#
In
```


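## 2. Indexing: Split

The loaded document is over 43k characters long, so we split it into chunks of at most 1000 characters with 200 characters of overlap. Setting add_start_index=True records each chunk's character offset in the original document as the start_index metadata field.

```python
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=200, add_start_index=True
)
all_splits = text_splitter.split_documents(docs)

print(len(all_splits))                  # number of chunks
print(len(all_splits[0].page_content))  # length of the first chunk
print(all_splits[10].metadata)          # source URL and start offset of chunk 10
```

```text
66
969
{'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/', 'start_index': 7056}
```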
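To make the roles of chunk_size, chunk_overlap, and start_index concrete, here is a deliberately simplified sketch: a fixed-width sliding window over characters. This is not the algorithm RecursiveCharacterTextSplitter uses. The real splitter breaks on separators such as paragraphs, lines, and words, which is why the first chunk above has 969 characters rather than exactly 1000. The `split_with_overlap` function is hypothetical.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append({"start_index": start, "page_content": text[start:start + chunk_size]})
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


text = "abcdefghijklmnopqrstuvwxyz"
chunks = split_with_overlap(text, chunk_size=10, chunk_overlap=4)
start_indexes = [chunk["start_index"] for chunk in chunks]
```

With these toy settings each window advances by 6 characters, so neighbouring chunks share 4 characters. The overlap keeps a sentence that straddles a boundary visible in both chunks.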


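## 3. Indexing: Store

Next we embed each chunk and store the resulting vectors in a Chroma vector store. The commented-out lines show an Azure OpenAI alternative for the embedding model.

```python
import os

from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings

# Azure OpenAI alternative:
# from langchain_openai import AzureOpenAIEmbeddings
#
# embed = AzureOpenAIEmbeddings(
#     model=utils.config["EMB"]["MODEL"],
#     azure_endpoint=utils.config["EMB"]["ENDPOINT"],
#     api_key=utils.config["EMB"]["API_KEY"],
#     api_version=utils.config["EMB"]["API_VERSION"],
# )

# OpenAIEmbeddings reads the API key from the environment.
os.environ["OPENAI_API_KEY"] = utils.config["EMB"]["OPENAI_API_KEY"]
vectorstore = Chroma.from_documents(documents=all_splits, embedding=OpenAIEmbeddings())
```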

## 4. Retrieval and Generation: Retrieve

We turn the vector store into a retriever that returns the 6 chunks most similar to the query.

```python
retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 6})

retrieved_docs = retriever.invoke("What are the approaches to Task Decomposition?")

print(len(retrieved_docs))
print(retrieved_docs[0].page_content)
```

```text
6
Tree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.
Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs.
```
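Conceptually, a similarity search with k results ranks the stored chunk vectors by closeness to the query vector and returns the top k. The sketch below shows that idea with cosine similarity over tiny hand-written vectors. It is only an illustration: the vectors, chunk labels, and function names are made up, and a real vector store such as Chroma uses model-generated embeddings, an approximate index, and its own configured distance metric.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, indexed, k):
    """Return the texts of the k entries whose vectors are most similar to query_vec."""
    ranked = sorted(
        indexed,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


# Toy "index": (chunk text, embedding) pairs with invented 3-dimensional embeddings.
indexed = [
    ("chunk about task decomposition", [0.9, 0.1, 0.0]),
    ("chunk about memory", [0.1, 0.9, 0.0]),
    ("chunk about tool use", [0.0, 0.2, 0.9]),
]
query_vec = [1.0, 0.2, 0.0]  # pretend embedding of the user question
results = top_k(query_vec, indexed, k=2)
```

Here the query vector points mostly along the first dimension, so the task decomposition chunk ranks first and the memory chunk second.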


## 5. Retrieval and Generation: Generate

We use the gpt-4o-mini chat model through ChatOpenAI. The commented-out lines show the Azure OpenAI equivalent.

```python
import os

from langchain_openai import ChatOpenAI

# Azure OpenAI alternative:
# os.environ["AZURE_OPENAI_ENDPOINT"] = utils.config["LLM"]["ENDPOINT"]
# os.environ["AZURE_OPENAI_API_KEY"] = utils.config["LLM"]["API_KEY"]
# os.environ["AZURE_OPENAI_API_VERSION"] = utils.config["LLM"]["API_VERSION"]
# os.environ["AZURE_OPENAI_DEPLOYMENT"] = utils.config["LLM"]["DEPLOYMENT"]
#
# from langchain_openai import AzureChatOpenAI
#
# llm = AzureChatOpenAI(
#     azure_endpoint=utils.config["LLM"]["ENDPOINT"],
#     azure_deployment=utils.config["LLM"]["DEPLOYMENT"],
#     openai_api_version=utils.config["LLM"]["API_VERSION"],
# )

# ChatOpenAI reads the API key from the environment.
os.environ["OPENAI_API_KEY"] = utils.config["LLM"]["OPENAI_API_KEY"]

llm = ChatOpenAI(model="gpt-4o-mini")
```


We'll use a prompt for RAG that is checked into the LangChain prompt hub under the name rlm/rag-prompt.

```python
from langchain import hub

prompt = hub.pull("rlm/rag-prompt")

# Fill the template with placeholder values to see the final message text.
example_messages = prompt.invoke(
    {"context": "filler context", "question": "filler question"}
).to_messages()

print(example_messages[0].content)
```

```text
You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: filler question 
Context: filler context 
Answer:
```
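The prompt is essentially a text template with two slots, context and question. As a rough sketch of what filling it amounts to, the snippet below rebuilds the text printed above with plain string formatting. The `RAG_TEMPLATE` constant and `build_prompt` helper are hypothetical; the actual hub object is a LangChain prompt template that produces chat messages rather than a bare string.

```python
# Template text copied from the output printed above, with the two slots restored.
RAG_TEMPLATE = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer the question. "
    "If you don't know the answer, just say that you don't know. "
    "Use three sentences maximum and keep the answer concise.\n"
    "Question: {question} \n"
    "Context: {context} \n"
    "Answer:"
)


def build_prompt(context, question):
    """Fill the two template slots with plain strings."""
    return RAG_TEMPLATE.format(context=context, question=question)


prompt_text = build_prompt(context="filler context", question="filler question")
```

In the full chain, the context slot receives the retrieved chunks joined into one string, and the question slot receives the user's question unchanged.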


We'll use the LCEL Runnable protocol to define the chain, allowing us to

- pipe together components and functions in a transparent way,
- automatically trace our chain in LangSmith,
- get streaming, async, and batched calling out of the box.

Here is the implementation:

```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough


def format_docs(docs):
    """Join the retrieved chunks into one context string."""
    return "\n\n".join(doc.page_content for doc in docs)


# The dict fans the question out: one branch retrieves and formats context,
# the other passes the question through unchanged.
rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

# Stream the answer piece by piece as it is generated.
for chunk in rag_chain.stream("What is Task Decomposition?"):
    print(chunk, end="", flush=True)
```

```text
Task decomposition is the process of breaking down complex tasks into smaller, manageable steps to facilitate better planning and execution. It often involves techniques like Chain of Thought (CoT), where a model is prompted to think step-by-step, or Tree of Thoughts, which explores multiple reasoning possibilities at each stage. This approach helps in organizing and simplifying the problem-solving process.
```
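To show how the `|` composition in the chain above fits together, here is a toy re-creation in plain Python. It is a sketch of the idea only, not LangChain's implementation. The `Step` class, the `parallel` helper, and every fake component are hypothetical stand-ins; `parallel` plays the role that the dict literal plays in the real chain.

```python
class Step:
    """Wrap a function so that steps can be chained with the | operator."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # self | other: run self first, then feed its result into other.
        other = other if isinstance(other, Step) else Step(other)
        return Step(lambda value: other.invoke(self.invoke(value)))


def parallel(**branches):
    """Run every branch on the same input and collect the results in a dict."""
    return Step(lambda value: {name: branch.invoke(value) for name, branch in branches.items()})


# Fake components standing in for the retriever, prompt, model, and parser.
fake_retriever = Step(lambda question: ["first chunk", "second chunk"])
join_chunks = Step(lambda chunks: "\n\n".join(chunks))
passthrough = Step(lambda value: value)
fake_prompt = Step(lambda slots: f"Context: {slots['context']}\nQuestion: {slots['question']}")
fake_llm = Step(lambda prompt_text: {"content": prompt_text.upper()})  # "model" that shouts
parse_output = Step(lambda message: message["content"])

chain = (
    parallel(context=fake_retriever | join_chunks, question=passthrough)
    | fake_prompt
    | fake_llm
    | parse_output
)
answer = chain.invoke("what is x?")
```

The input question reaches both branches of `parallel`. One branch turns it into context, the other forwards it as is, and the resulting dict is what the prompt step consumes.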