In [1]:
from dotenv import load_dotenv
import os


load_dotenv()

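# load_dotenv() has already copied the values from .env into os.environ.
# Assigning os.getenv(...) back raises a TypeError when a key is missing,
# so check for the keys explicitly and fail with a readable message.
for key in ("OPENAI_API_KEY", "LANGCHAIN_API_KEY"):
    if not os.getenv(key):
        raise RuntimeError(f"{key} is not set; add it to your .env file")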
os.environ["LANGCHAIN_TRACING_V2"] = "true"

### 1. Indexing: Load

In [3]:
import bs4
from langchain_community.document_loaders import WebBaseLoader

# Only keep the post title, headers, and content from the full HTML.
bs4_strainer = bs4.SoupStrainer(class_=("post-title", "post-header", "post-content"))
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs={"parse_only": bs4_strainer},
)

docs = loader.load()

len(docs[0].page_content)

43131

### 2. Indexing: Split

In [4]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=200, add_start_index=True
)

all_splits = text_splitter.split_documents(docs)

len(all_splits)

66
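To make the `chunk_size` and `chunk_overlap` settings concrete, here is a minimal sketch of fixed-window splitting with overlap. It is an assumption-laden simplification: `split_with_overlap` is a hypothetical helper, and `RecursiveCharacterTextSplitter` itself prefers to break on separators such as paragraphs and newlines rather than at fixed offsets. The `start_index` field mirrors what `add_start_index=True` records in each chunk's metadata.

```python
def split_with_overlap(text, chunk_size=1000, chunk_overlap=200):
    """Cut text into fixed-size windows that share chunk_overlap characters.

    Illustrative only: the real splitter breaks on separators first.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append({"start_index": start, "text": text[start:start + chunk_size]})
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


demo = split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=2)
for chunk in demo:
    print(chunk["start_index"], chunk["text"])
```

Each window starts `chunk_size - chunk_overlap` characters after the previous one, so neighbouring chunks repeat a little text and a sentence that straddles a boundary is less likely to be lost.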

### 3. Indexing: Store

In [5]:
from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings

verctorStore = Chroma.from_documents(documents=all_splits, embedding=OpenAIEmbeddings())

### 4. Retrieval and Generation: Retrieve

In [6]:
retriver = verctorStore.as_retriever(search_type="similarity", search_kwargs={"k": 6})
retrived_data = retriver.invoke("what are the approaches to task decomposition?")

len(retrived_data)

6

In [7]:
retrived_data[2].page_content

'Resources:\n1. Internet access for searches and information gathering.\n2. Long Term memory management.\n3. GPT-3.5 powered Agents for delegation of simple tasks.\n4. File output.\n\nPerformance Evaluation:\n1. Continuously review and analyze your actions to ensure you are performing to the best of your abilities.\n2. Constructively self-criticize your big-picture behavior constantly.\n3. Reflect on past decisions and strategies to refine your approach.\n4. Every command has a cost, so be smart and efficient. Aim to complete tasks in the least number of steps.'
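The chunk above is only loosely related to the query, which is a reminder that `k=6` returns the six nearest chunks whether or not all of them are relevant. The sketch below illustrates the general idea of similarity ranking with cosine similarity over toy vectors. It is not Chroma's implementation: `cosine_similarity` and `top_k` are hypothetical helpers, the vectors are made up, and the distance metric Chroma actually applies depends on how the collection is configured.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=6):
    """Return the indices of the k vectors most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), i) for i, vec in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:k]]


# Toy 2-dimensional "embeddings"; real embeddings have far more dimensions.
toy_docs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
ranking = top_k([1.0, 0.0], toy_docs, k=2)
print(ranking)
```

Because the ranking always returns `k` items, the lower-ranked hits can be weak matches; inspecting them, as done above with `retrived_data[2]`, is a cheap sanity check.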

### 5. Retrieval and Generation: Generate

In [10]:
from langchain import hub

prompt = hub.pull("rlm/rag-prompt")

example_message = prompt.invoke(
    {"context": "filler context", "question": "filler question"}
).to_messages()

example_message

[HumanMessage(content="You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.\nQuestion: filler question \nContext: filler context \nAnswer:")]
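The message above shows that the prompt is essentially a fixed instruction with two slots, `context` and `question`. As a rough stand-in, the same substitution can be sketched with plain string formatting. The template text is copied from the output above; `RAG_TEMPLATE` and `render_prompt` are hypothetical names, and the real object pulled from the hub is a chat prompt template rather than a bare string.

```python
# Template text copied from the rendered message shown above.
RAG_TEMPLATE = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer the question. "
    "If you don't know the answer, just say that you don't know. "
    "Use three sentences maximum and keep the answer concise.\n"
    "Question: {question} \nContext: {context} \nAnswer:"
)


def render_prompt(context, question):
    """Fill the two slots of the template with plain string formatting."""
    return RAG_TEMPLATE.format(context=context, question=question)


rendered = render_prompt("filler context", "filler question")
print(rendered)
```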

In [11]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI

llm = ChatOpenAI()

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


rag_chain = (
    {"context": retriver | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

for chunk in rag_chain.stream("what is Task Decomposition"):
    print(chunk, end="", flush=True)

Task decomposition is a technique used to break down complex tasks into smaller and simpler steps for better understanding and execution. It involves transforming big tasks into multiple manageable tasks, enhancing model performance on complex tasks. Task decomposition can be done by using simple prompting techniques, task-specific instructions, or with human inputs.
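The `|` syntax in `rag_chain` can look opaque at first. The toy below sketches the idea under simplifying assumptions: each step wraps a function, `|` feeds one step's output into the next, and the leading dict sends the same input to several branches. `Step`, `fan_out`, and every `toy_*` name are hypothetical and exist only for this illustration; this is not how LangChain implements its runnables, and `toy_llm` just upper-cases text instead of calling a model.

```python
class Step:
    """Wrap a function so that steps can be chained with the | operator."""

    def __init__(self, fn):
        self.fn = fn

    def invoke(self, value):
        return self.fn(value)

    def __or__(self, other):
        # Build a new step that runs self first, then passes the result to other.
        return Step(lambda value: other.invoke(self.invoke(value)))


def fan_out(mapping):
    """Send the same input to every branch and collect the results in a dict."""
    return Step(lambda value: {key: step.invoke(value) for key, step in mapping.items()})


toy_retrieve = Step(lambda question: ["chunk about " + question, "another chunk"])
toy_format_docs = Step(lambda docs: "\n\n".join(docs))
toy_passthrough = Step(lambda value: value)
toy_prompt = Step(lambda d: f"Question: {d['question']}\nContext: {d['context']}")
toy_llm = Step(lambda text: text.upper())  # placeholder for a real model call

toy_chain = (
    fan_out({"context": toy_retrieve | toy_format_docs, "question": toy_passthrough})
    | toy_prompt
    | toy_llm
)

result = toy_chain.invoke("task decomposition")
print(result)
```

The question therefore travels down two paths: one retrieves and formats the context, the other passes the question through unchanged, and both meet again in the prompt.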

In [12]:
for chunk in rag_chain.stream("who is author of this blog?"):
    print(chunk, end="", flush=True)

The author of the blog "LLM-powered Autonomous Agents" is Lilian Weng.

In [13]:
for chunk in rag_chain.stream("how many sections can i find in this blog and how much time it may take to read entire blog for an average human being?"):
    print(chunk, end="", flush=True)

The blog consists of multiple sections, but the exact number is not specified in the provided context. The time it may take to read the entire blog for an average human being would depend on the length and complexity of the content, making it difficult to estimate accurately. Due to the lack of specific details, it is challenging to provide a precise answer regarding the number of sections and the reading time for the blog.

In [15]:
for chunk in rag_chain.stream("how many components we have in blog and name them?"):
    print(chunk, end="", flush=True)

There are three components in a blog: Content, Tool Use, and Tree of Thoughts.

In [17]:
for chunk in rag_chain.stream("what is this blog about?"):
    print(chunk, end="", flush=True)

The blog is about providing instructions for writing code, ensuring that every detail of the architecture is implemented accurately. It includes steps to lay out core classes, functions, and methods, outputting all code in a markdown format. The performance evaluation emphasizes continuous improvement and smart, efficient task completion.
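The last two answers appear to describe whichever chunks were retrieved rather than the post as a whole; the final one echoes the performance-evaluation text seen in the retrieval step. A quick way to check this is to look at what the retriever returned for a question. The helper below is a minimal sketch: `preview_chunks` and `FakeDoc` are hypothetical names, and `FakeDoc` is only a stand-in with the `page_content` and `metadata` attributes that the retrieved documents expose, so the example runs without an API key.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDoc:
    """Stand-in exposing the two attributes the helper reads."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def preview_chunks(docs, width=60):
    """Return one summary line per chunk: rank, start index, and a short snippet."""
    lines = []
    for rank, doc in enumerate(docs, start=1):
        start = doc.metadata.get("start_index", "?")
        # Collapse newlines and repeated spaces so the snippet fits on one line.
        snippet = " ".join(doc.page_content.split())[:width]
        lines.append(f"{rank}. [start_index={start}] {snippet}")
    return lines


sample = [
    FakeDoc("Resources:\n1. Internet access", {"start_index": 100}),
    FakeDoc("Task decomposition can be done", {}),
]
for line in preview_chunks(sample):
    print(line)
```

In the notebook, the same helper could be called as `preview_chunks(retriver.invoke("what is this blog about?"))` to see which parts of the post the answer was built from.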