Loading Documents

In [1]:
import bs4
from langchain_community.document_loaders import WebBaseLoader


# Parse only the post title, header, and body; the rest of the page is skipped.
bs4_strainer = bs4.SoupStrainer(class_=("post-title", "post-header", "post-content"))

loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs={"parse_only": bs4_strainer},
)

docs = loader.load()

assert len(docs) == 1
print(len(docs[0].page_content))

USER_AGENT environment variable not set, consider setting it to identify your requests.


43130
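The USER_AGENT warning above is harmless, but it can be silenced. A minimal sketch, assuming the warning is raised when langchain_community is imported, so the variable has to be set before that import; the value "rag-notebook/0.1" is a made-up placeholder, not something the library requires.

```python
import os

# Hypothetical identifier; any descriptive string should do.
# setdefault leaves an already-configured value untouched.
os.environ.setdefault("USER_AGENT", "rag-notebook/0.1")

# Import the loader only after the variable is in place, e.g.:
# from langchain_community.document_loaders import WebBaseLoader
```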


Splitting Documents

In [2]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size = 1000,
    chunk_overlap = 200,
    add_start_index = True,
)

all_splits = text_splitter.split_documents(docs)

print(len(all_splits))

66
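To make chunk_size and chunk_overlap concrete, here is a simplified sketch of overlapping chunking in plain Python. It is not how RecursiveCharacterTextSplitter works internally: the real splitter prefers to break on separators such as paragraphs and newlines, which is presumably why it yields 66 chunks here while a plain sliding window over 43130 characters yields 54. The function name chunk_with_overlap is hypothetical.

```python
def chunk_with_overlap(text, chunk_size=1000, chunk_overlap=200):
    """Split text into fixed-size windows that share chunk_overlap characters.

    Returns (start_index, chunk) pairs, loosely mirroring add_start_index=True.
    Simplified illustration only; no separator-aware splitting.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    start = 0
    while start < len(text):
        end = start + chunk_size
        chunks.append((start, text[start:end]))
        if end >= len(text):  # this window already reaches the end
            break
        start += step
    return chunks


sample = "".join(chr(97 + i % 26) for i in range(2500))
pieces = chunk_with_overlap(sample)
print([(start, len(chunk)) for start, chunk in pieces])
```

The last 200 characters of each chunk reappear at the start of the next one, so a sentence that straddles a boundary is still seen whole in at least one chunk.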


Storing Documents

In [3]:
import getpass
import os

# langchain_mistralai reads the key from MISTRAL_API_KEY.
if not os.environ.get("MISTRAL_API_KEY"):
    os.environ["MISTRAL_API_KEY"] = getpass.getpass("Enter API key for MistralAI: ")

from langchain_mistralai import MistralAIEmbeddings

embeddings = MistralAIEmbeddings(model="mistral-embed")



In [5]:
from langchain.chat_models import init_chat_model

llm = init_chat_model("mistral-large-latest", model_provider="mistralai")

from langchain_core.vectorstores import InMemoryVectorStore

vector_store = InMemoryVectorStore(embeddings)

In [6]:
document_ids = vector_store.add_documents(documents=all_splits)
print(document_ids)

['4a4b6366-2b55-4485-a181-ac9b9b8b7675', 'a619e7f7-0154-447d-9fe6-3acb4e95433d', '06f88e38-4c21-4fa5-8e03-675bac0af9d5', 'bd61cc38-8965-45d0-ae5f-ff08525148ee', '073dbc1c-b03f-4055-8c55-aa9848715482', '65ad5e65-0821-42ac-8d37-6a4783439b73', 'c729866e-ad35-4cdd-9126-9ca7cd671ae7', 'c5a72909-3486-456f-b89e-ba092cb4d46a', '6077f466-30fc-4a2b-b2f6-e27a57ce12ab', '5779fa12-c9e1-4082-ba25-1425db311692', '129f90b7-60ca-4802-918f-bf5dcb2e8c6d', '79bedd1b-9c1c-4af3-9682-8b00110e5380', '1bb4b386-e02b-4364-914d-1dfc76b51767', '901361cd-2c36-43c5-bccc-b115bb7e3b95', 'f5572192-100e-406f-b850-c4225d2b2043', '478b8cea-c0c0-4e2d-984a-7c598058f3c2', 'e6b0babf-360f-42fc-8e4e-8722ee710d35', 'dfff670e-2437-4ca1-a103-3029001bd749', 'df4fb718-0599-4a72-8041-e1085418e858', 'd931ebba-0dcd-4b06-9949-bc98955244fe', '67acab27-80a3-4896-ad46-8a5621237876', '6b5e3cc3-b136-425a-a29a-6b6db18ca9a0', 'e6e0d4da-341d-4252-ad25-dc879a427a10', '958fb291-d02d-4529-9d6d-87d68b7f33d6', 'e27729f8-6ac8-48d5-98a6-06ae90ffa352',
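As a rough picture of what an in-memory vector store does, the sketch below keeps (id, text, vector) triples in a list and ranks them by cosine similarity at query time. This is an illustration under stated assumptions, not the InMemoryVectorStore implementation: ToyVectorStore and toy_embed are hypothetical names, and the letter-count "embedding" stands in for a real model such as mistral-embed.

```python
import math
import uuid


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: counts of letters a-z."""
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec


class ToyVectorStore:
    """Stores embedded texts in a list and searches them by brute force."""

    def __init__(self, embed):
        self.embed = embed
        self.items = []  # list of (id, text, vector)

    def add_texts(self, texts):
        ids = []
        for text in texts:
            doc_id = str(uuid.uuid4())
            self.items.append((doc_id, text, self.embed(text)))
            ids.append(doc_id)
        return ids

    def similarity_search(self, query, k=4):
        query_vec = self.embed(query)
        ranked = sorted(
            self.items,
            key=lambda item: cosine_similarity(query_vec, item[2]),
            reverse=True,
        )
        return [text for _, text, _ in ranked[:k]]


store = ToyVectorStore(toy_embed)
ids = store.add_texts(["aaaa", "bbbb", "abab"])
print(store.similarity_search("aaa", k=2))
```

Brute-force ranking like this is fine for a few dozen chunks; larger collections usually call for an indexed store.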

Retrieval and Generation

In [21]:
from langchain import hub

prompt = hub.pull("rlm/rag-prompt")

example_message = prompt.invoke({
    "context": "(context goes here)",
    "question": "(question goes here)",
}).to_messages()

print(example_message)

[HumanMessage(content="You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.\nQuestion: (question goes here) \nContext: (context goes here) \nAnswer:", additional_kwargs={}, response_metadata={})]
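The printed message shows that the pulled prompt is essentially a template with two slots. A minimal sketch that reproduces the same text with plain string formatting, assuming the template wording is exactly what the output above shows; RAG_TEMPLATE and build_rag_prompt are hypothetical names, not part of LangChain.

```python
# Template text copied from the printed HumanMessage above,
# with the two filled-in values turned back into placeholders.
RAG_TEMPLATE = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer the question. "
    "If you don't know the answer, just say that you don't know. "
    "Use three sentences maximum and keep the answer concise.\n"
    "Question: {question} \n"
    "Context: {context} \n"
    "Answer:"
)


def build_rag_prompt(question, context):
    """Fill the question and retrieved context into the template."""
    return RAG_TEMPLATE.format(question=question, context=context)


print(build_rag_prompt("(question goes here)", "(context goes here)"))
```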




In [26]:
question = "what is self reflection?"

retrieved_docs = vector_store.similarity_search(question)
docs_content = "\n\n".join(doc.page_content for doc in retrieved_docs)
print(docs_content)
answer = llm.invoke(prompt.invoke({"question": question, "context": docs_content}))
answer.content

Fig. 3. Illustration of the Reflexion framework. (Image source: Shinn & Labash, 2023)
The heuristic function determines when the trajectory is inefficient or contains hallucination and should be stopped. Inefficient planning refers to trajectories that take too long without success. Hallucination is defined as encountering a sequence of consecutive identical actions that lead to the same observation in the environment.
Self-reflection is created by showing two-shot examples to LLM and each example is a pair of (failed trajectory, ideal reflection for guiding future changes in the plan). Then reflections are added into the agent’s working memory, up to three, to be used as context for querying LLM.

Memory stream: is a long-term memory module (external database) that records a comprehensive list of agents’ experience in natural language.

Each element is an observation, an event directly provided by the agent.
- Inter-agent communication can trigger new natural language statements.


Re

'Self-reflection is a process that allows autonomous agents to improve by reviewing and learning from past actions and mistakes. It is particularly useful in real-world tasks where trial and error are common. In the context of the Reflexion framework, self-reflection is facilitated by providing examples of failed trajectories and ideal reflections to a language model, which then guides future planning.'
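The retrieve-then-generate steps in the cell above can be wrapped in one function so that each question gets a fresh retrieval and a fresh answer. A sketch, assuming only the calls already used in this notebook (similarity_search, prompt.invoke, llm.invoke, and the .content attribute); answer_question is a hypothetical helper name.

```python
def answer_question(question, vector_store, prompt, llm, k=4):
    """Retrieve the k most similar chunks, then ask the model to answer from them."""
    # 1. Retrieval: find the chunks closest to the question.
    retrieved_docs = vector_store.similarity_search(question, k=k)

    # 2. Build the context string exactly as in the cell above.
    docs_content = "\n\n".join(doc.page_content for doc in retrieved_docs)

    # 3. Generation: fill the prompt and pass it to the chat model.
    messages = prompt.invoke({"question": question, "context": docs_content})
    return llm.invoke(messages).content
```

With the objects defined earlier, a call would look like answer_question("what is self reflection?", vector_store, prompt, llm). Keeping the question as an argument avoids reading a stale answer variable left over from an earlier cell.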

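In [23]:
# Shown from an earlier run (execution count 23), where the question was evidently about task decomposition.
answer.content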

'Task decomposition in AI agents is done through methods like Chain of Thought (CoT), where the agent is prompted to "think step by step" to break down complex tasks into smaller, manageable ones. This process transforms large tasks into multiple simpler tasks, making them easier to handle and providing insight into the model\'s thinking process. The agent can then plan and execute these smaller tasks in a logical order.'