### Conversational Q&A Chatbot

In many Q&A applications we want to allow the user to have a back-and-forth conversation, meaning the application needs some sort of "memory" of past questions and answers, and some logic for incorporating those into its current thinking.

In this guide we focus on adding logic for incorporating historical messages. Further details on chat history management are covered in the previous videos.

We will cover two approaches:
- Chains, in which we always execute a retrieval step;
- Agents, in which we give an LLM discretion over whether and how to execute a retrieval step (or multiple steps).
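
The difference between the two approaches can be sketched in plain Python. This is a toy illustration only: `retrieve`, `answer`, `needs_retrieval`, and the `DOCS` dictionary are hypothetical stand-ins rather than LangChain APIs, and the keyword check merely stands in for the decision an LLM would make.

```python
# Toy sketch: every name below is a hypothetical stand-in, not a LangChain API.
DOCS = {
    "planning": "Planning breaks a large task into smaller subgoals.",
    "memory": "Memory lets the agent retain and recall information.",
}

def retrieve(query: str) -> list[str]:
    """Return every stored passage whose key appears in the query."""
    q = query.lower()
    return [text for key, text in DOCS.items() if key in q]

def answer(query: str, context: list[str]) -> str:
    """Stand-in for the LLM call: report how much context was supplied."""
    return f"answer to {query!r} using {len(context)} passage(s)"

def chain(query: str) -> str:
    # Chain: retrieval always runs, even for queries that do not need it.
    return answer(query, retrieve(query))

def needs_retrieval(query: str) -> bool:
    # Assumption: a keyword check replaces the LLM's own decision.
    return any(key in query.lower() for key in DOCS)

def agent(query: str) -> str:
    # Agent: retrieval runs only when the decision step asks for it.
    context = retrieve(query) if needs_retrieval(query) else []
    return answer(query, context)
```

In this sketch `chain("Hello there")` still performs a (fruitless) lookup, while `agent("Hello there")` skips retrieval entirely.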

In [13]:
import os
from dotenv import load_dotenv
load_dotenv()

from langchain_groq import ChatGroq

groq_api_key = os.getenv("GROQ_API_KEY")

llm = ChatGroq(
    api_key=groq_api_key,
    model="llama-3.1-8b-instant",
)

llm

ChatGroq(profile={'max_input_tokens': 131072, 'max_output_tokens': 8192, 'image_inputs': False, 'audio_inputs': False, 'video_inputs': False, 'image_outputs': False, 'audio_outputs': False, 'video_outputs': False, 'reasoning_output': False, 'tool_calling': True}, client=<groq.resources.chat.completions.Completions object at 0x00000185F0E43970>, async_client=<groq.resources.chat.completions.AsyncCompletions object at 0x00000185F0E70FA0>, model_name='llama-3.1-8b-instant', model_kwargs={}, groq_api_key=SecretStr('**********'))

In [14]:
!pip install bs4



In [15]:
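# Export the token only if it is set: assigning None to os.environ raises a TypeError
hf_token = os.getenv("HUGGINGFACEHUB_API_TOKEN")
if hf_token:
    os.environ["HUGGINGFACEHUB_API_TOKEN"] = hf_token
from langchain_huggingface import HuggingFaceEmbeddings
embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")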

In [16]:
!pip install -U langchain langchain-core langchain-community langchain-chroma



In [17]:
from langchain_chroma import Chroma
from langchain_community.document_loaders import WebBaseLoader
from langchain_core.prompts import ChatPromptTemplate
# from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_classic.chains import create_retrieval_chain
from langchain_classic.chains.combine_documents import create_stuff_documents_chain

In [18]:
# 1. Load, chunk and index the contents of the blog to create a retriever
import bs4
from langchain_community.document_loaders import WebBaseLoader

loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)

docs = loader.load()
docs


[Document(metadata={'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}, page_content='\n\n      LLM Powered Autonomous Agents\n    \nDate: June 23, 2023  |  Estimated Reading Time: 31 min  |  Author: Lilian Weng\n\n\nBuilding agents with LLM (large language model) as its core controller is a cool concept. Several proof-of-concepts demos, such as AutoGPT, GPT-Engineer and BabyAGI, serve as inspiring examples. The potentiality of LLM extends beyond generating well-written copies, stories, essays and programs; it can be framed as a powerful general problem solver.\nAgent System Overview#\nIn a LLM-powered autonomous agent system, LLM functions as the agent’s brain, complemented by several key components:\n\nPlanning\n\nSubgoal and decomposition: The agent breaks down large tasks into smaller, manageable subgoals, enabling efficient handling of complex tasks.\nReflection and refinement: The agent can do self-criticism and self-reflection over past actions, learn from mistake

In [19]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
texts = text_splitter.split_documents(docs)
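
To make the meaning of `chunk_size` and `chunk_overlap` concrete, here is a simplified sliding-window splitter. It is a sketch under a strong assumption: it cuts at fixed character offsets, whereas `RecursiveCharacterTextSplitter` prefers to split on separators such as paragraph and line breaks. The function name `split_with_overlap` is hypothetical.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into fixed-size windows; consecutive windows share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks

chunks = split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=2)
print(chunks)  # ['abcd', 'cdef', 'efgh', 'ghij']
```

The overlap means a sentence that straddles a chunk boundary is likely to appear whole in at least one chunk, at the cost of storing some text twice.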

In [20]:
vectorStore = Chroma.from_documents(documents=texts, embedding=embeddings)
retriever = vectorStore.as_retriever()
retriever

VectorStoreRetriever(tags=['Chroma', 'HuggingFaceEmbeddings'], vectorstore=<langchain_chroma.vectorstores.Chroma object at 0x00000185F0E733A0>, search_kwargs={})
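
Conceptually, a vector-store retriever embeds the query and returns the stored chunks whose embeddings are closest to it. The sketch below illustrates that idea with hand-written two-dimensional vectors. Cosine similarity is an assumption chosen for illustration (the metric Chroma actually uses depends on how the collection is configured), and `cosine`, `top_k`, and the toy vectors are hypothetical.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec: list[float], indexed: list[tuple[str, list[float]]], k: int) -> list[str]:
    """Return the texts of the k indexed chunks most similar to the query vector."""
    ranked = sorted(indexed, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

# Toy "embeddings": real ones come from the embedding model and have hundreds of dimensions.
indexed = [
    ("planning chunk", [1.0, 0.0]),
    ("memory chunk", [0.0, 1.0]),
    ("tool use chunk", [0.7, 0.7]),
]
result = top_k([0.9, 0.1], indexed, k=2)
print(result)  # ['planning chunk', 'tool use chunk']
```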

In [21]:
## Prompt template
prompt_template = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer "
    "the question. If you don't know the answer, just say that you don't know. "
    "Keep the answer concise."
    "\n\n"
    "{context}"
)

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", prompt_template),
        ("human", "{input}")
    ]
)

In [22]:
question_answering_chain = create_stuff_documents_chain(llm, prompt)
rag_chain = create_retrieval_chain(
    retriever,
    question_answering_chain
)
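
The "stuff" strategy simply concatenates the retrieved documents and places the result into the prompt's `{context}` slot. A minimal sketch of that idea follows; `Doc`, `stuff`, `build_messages`, and `SYSTEM_TEMPLATE` are hypothetical stand-ins, and the blank-line separator is an assumption for illustration.

```python
from dataclasses import dataclass

@dataclass
class Doc:
    # Hypothetical minimal stand-in for a LangChain Document.
    page_content: str

SYSTEM_TEMPLATE = "Use the following context to answer the question.\n\n{context}"

def stuff(docs: list[Doc], separator: str = "\n\n") -> str:
    """Join the text of all retrieved documents into one context string."""
    return separator.join(d.page_content for d in docs)

def build_messages(question: str, docs: list[Doc]) -> list[tuple[str, str]]:
    """Fill the {context} slot and pair the system message with the user question."""
    system = SYSTEM_TEMPLATE.format(context=stuff(docs))
    return [("system", system), ("human", question)]

messages = build_messages("What is planning?", [Doc("Chunk A"), Doc("Chunk B")])
```

Because every retrieved chunk lands in a single prompt, the total size of the chunks has to fit within the model's input limit.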

In [23]:
response = rag_chain.invoke(
    {"input": "What are the main components of an autonomous agent as described in the blog post?"}
)

response

{'input': 'What are the main components of an autonomous agent as described in the blog post?',
 'context': [Document(id='948e5ed9-544e-482f-899d-e6271b9b73d3', metadata={'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}, page_content='LLM Powered Autonomous Agents\n    \nDate: June 23, 2023  |  Estimated Reading Time: 31 min  |  Author: Lilian Weng\n\n\nBuilding agents with LLM (large language model) as its core controller is a cool concept. Several proof-of-concepts demos, such as AutoGPT, GPT-Engineer and BabyAGI, serve as inspiring examples. The potentiality of LLM extends beyond generating well-written copies, stories, essays and programs; it can be framed as a powerful general problem solver.\nAgent System Overview#\nIn a LLM-powered autonomous agent system, LLM functions as the agent’s brain, complemented by several key components:\n\nPlanning\n\nSubgoal and decomposition: The agent breaks down large tasks into smaller, manageable subgoals, enabling efficient han

In [24]:
response['answer']

'According to the blog post, the main components of an LLM-powered autonomous agent system include:\n\n1. Planning\n   - Subgoal and decomposition: Breaking down large tasks into smaller, manageable subgoals.\n   - Reflection and refinement: Self-criticism, self-reflection, learning from mistakes, and refining actions for future steps.\n\n2. Memory'

In [None]:
rag_chain.invoke(
    {"input": "How do we achieve it?"}  # the chain cannot resolve "it":
                                        # it has no memory of the previous question
)

{'input': 'How do we achieve it?',
 'context': [Document(id='2dc48a46-af82-4ca5-b7a5-151bf9bbb6db', metadata={'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}, page_content='Component One: Planning#\nA complicated task usually involves many steps. An agent needs to know what they are and plan ahead.\nTask Decomposition#\nChain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.\nTree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process ca

## Adding the Chat History

In [29]:
from langchain_classic.chains import create_history_aware_retriever
from langchain_core.prompts import MessagesPlaceholder

contextualize_q_system_prompt = (
    "Given a chat history and the latest user question "
    "which might reference context from the chat history, "
    "formulate a standalone question which can be understood "
    "without the chat history. Do NOT answer the question, "
    "just reformulate it if needed and otherwise return the original question."
)

contextualize_q_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", contextualize_q_system_prompt),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}")
    ]
)

In [30]:
history_aware_retriever = create_history_aware_retriever(llm, retriever, contextualize_q_prompt)
history_aware_retriever


RunnableBinding(bound=RunnableBranch(branches=[(RunnableLambda(lambda x: not x.get('chat_history', False)), RunnableLambda(lambda x: x['input'])
| VectorStoreRetriever(tags=['Chroma', 'HuggingFaceEmbeddings'], vectorstore=<langchain_chroma.vectorstores.Chroma object at 0x00000185F0E733A0>, search_kwargs={}))], default=ChatPromptTemplate(input_variables=['chat_history', 'input'], input_types={'chat_history': list[typing.Annotated[typing.Union[typing.Annotated[langchain_core.messages.ai.AIMessage, Tag(tag='ai')], typing.Annotated[langchain_core.messages.human.HumanMessage, Tag(tag='human')], typing.Annotated[langchain_core.messages.chat.ChatMessage, Tag(tag='chat')], typing.Annotated[langchain_core.messages.system.SystemMessage, Tag(tag='system')], typing.Annotated[langchain_core.messages.function.FunctionMessage, Tag(tag='function')], typing.Annotated[langchain_core.messages.tool.ToolMessage, Tag(tag='tool')], typing.Annotated[langchain_core.messages.ai.AIMessageChunk, Tag(tag='AIMessag
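
The branch visible in the output above can be sketched in plain Python: with an empty chat history the raw input goes straight to the retriever, otherwise the question is first rewritten into a standalone query. The names `make_history_aware_retriever`, `fake_rewrite`, and `fake_retrieve` are hypothetical, and the two stubs stand in for the LLM call and the vector store.

```python
from typing import Callable

def make_history_aware_retriever(
    rewrite: Callable[[list[str], str], str],
    retrieve: Callable[[str], list[str]],
) -> Callable[[dict], list[str]]:
    """Build a retriever that rewrites the question only when history exists."""
    def run(inputs: dict) -> list[str]:
        history = inputs.get("chat_history")
        if not history:
            # No history: the question is already standalone.
            return retrieve(inputs["input"])
        # History present: reformulate first, then retrieve with the new query.
        return retrieve(rewrite(history, inputs["input"]))
    return run

def fake_rewrite(history: list[str], question: str) -> str:
    # Stub for the LLM: attach the first message of the history as the topic.
    return f"{question} (topic: {history[0]})"

def fake_retrieve(query: str) -> list[str]:
    # Stub for the vector store: echo the query it was asked to search for.
    return [f"docs for: {query}"]

retriever_fn = make_history_aware_retriever(fake_rewrite, fake_retrieve)
print(retriever_fn({"input": "How is it useful?", "chat_history": []}))
print(retriever_fn({"input": "How is it useful?", "chat_history": ["self-reflection"]}))
```

The rewrite step costs an extra LLM call per turn, which is why it is skipped when there is nothing to resolve.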

In [41]:
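qa_prompt = ChatPromptTemplate.from_messages(
    [
        ("system",
         "You are a helpful assistant. Use the following context to answer the question.\n\n{context}"),
        # Pass the conversation so far to the answering model as well as to the retriever
        MessagesPlaceholder("chat_history"),
        ("human", "{input}")
    ]
)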

In [42]:
question_answering_chain = create_stuff_documents_chain(llm, qa_prompt)
rag_chain = create_retrieval_chain(
    history_aware_retriever,
    question_answering_chain
)

In [45]:
from langchain_core.messages import AIMessage, HumanMessage
chat_history = []
question = "What is Self-Reflextion"
response1 = rag_chain.invoke(
    {"input": question,
     "chat_history": chat_history}
)


chat_history.extend(
    [
        HumanMessage(content=question),
        AIMessage(content=response1['answer'])
    ]
)

question2 = "How is it useful?"
response2 = rag_chain.invoke(
    {"input": question2,
     "chat_history": chat_history}
)

print(response2['answer'])

The Reflexion framework is useful in several ways:

1. **Improves Iterative Planning**: By incorporating self-reflection, the agent can refine past action decisions and correct previous mistakes, leading to improved planning and decision-making.

2. **Enhances Real-World Tasks**: Self-reflection is essential for real-world tasks where trial and error are inevitable. The Reflexion framework helps the agent to learn from its mistakes and adapt to new situations.

3. **Reduces Hallucination**: Hallucination occurs when the agent encounters a sequence of consecutive identical actions that lead to the same observation in the environment. The heuristic function in the Reflexion framework determines when the trajectory contains hallucination and stops it, reducing the likelihood of hallucination.

4. **Increases Efficiency**: The framework's ability to detect inefficient planning and stop it helps the agent to focus on more productive paths, reducing the time spent on unsuccessful trajectorie

In [46]:
chat_history

[HumanMessage(content='What is Self-Reflextion', additional_kwargs={}, response_metadata={}),
 AIMessage(content="Self-reflection in the context of the Reflexion framework is a process where the agent analyzes its past actions and decisions to identify areas for improvement and adjust its future behavior accordingly. It is created by showing two-shot examples to a Large Language Model (LLM), where each example consists of a pair of a failed trajectory and an ideal reflection for guiding future changes in the plan.\n\nIn more detail, self-reflection is generated as follows:\n\n1. Two-shot examples are created by presenting a failed trajectory and an ideal reflection to the LLM.\n2. These examples are used to train the LLM to generate self-reflections that can guide the agent's future decisions.\n3. The self-reflections are added to the agent's working memory, up to a maximum of three, to be used as context for querying the LLM.\n\nSelf-reflection allows the agent to:\n\n* Identify ineff

In [55]:
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

store = {}

def get_session_history(session_id: str) -> BaseChatMessageHistory:
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]


conversational_rag_chain = RunnableWithMessageHistory(
    rag_chain,
    get_session_history,
    input_messages_key="input",
    history_messages_key="chat_history",
    output_messages_key="answer"
)
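
What the wrapper does on each call can be approximated as: look up the history for the session, pass it to the chain, then append the new question and answer. The sketch below mimics that flow with plain lists. `with_history`, `get_history`, `sessions`, and `echo_chain` are hypothetical names, and the real `RunnableWithMessageHistory` handles message objects and configuration that this sketch ignores.

```python
from typing import Callable

sessions: dict[str, list[tuple[str, str]]] = {}

def get_history(session_id: str) -> list[tuple[str, str]]:
    """Return the message list for a session, creating it on first use."""
    return sessions.setdefault(session_id, [])

def with_history(chain: Callable[[dict], dict]) -> Callable[[dict, str], dict]:
    """Wrap a chain so that history is injected before and updated after each call."""
    def invoke(inputs: dict, session_id: str) -> dict:
        history = get_history(session_id)
        # Pass a copy so the chain sees only the messages from earlier turns.
        result = chain({**inputs, "chat_history": list(history)})
        history.append(("human", inputs["input"]))
        history.append(("ai", result["answer"]))
        return result
    return invoke

def echo_chain(inputs: dict) -> dict:
    # Stub chain: report how many earlier messages it was given.
    return {"answer": f"{inputs['input']} | seen {len(inputs['chat_history'])} prior message(s)"}

chat = with_history(echo_chain)
first = chat({"input": "hi"}, "abc123")
second = chat({"input": "again"}, "abc123")
other = chat({"input": "hi"}, "xyz")
print(second["answer"])  # again | seen 2 prior message(s)
```

Keying the store by session ID keeps separate conversations from leaking into each other, as the `"xyz"` call shows: it starts with an empty history.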

In [58]:
conversational_rag_chain.invoke(
    {"input": "What is Self-Reflection?"},
    config = {
        "configurable": {"session_id": "abc123"}
    },  # construct a key "abc123" for session_id
)['answer']

'Self-reflection is a vital aspect that allows autonomous agents to improve iteratively by refining past action decisions and correcting previous mistakes. It is a crucial component in real-world tasks where trial and error are inevitable.\n\nIn the context of the given framework, self-reflection is achieved through the ReAct (Yao et al. 2023) model, which integrates reasoning and acting within a Large Language Model (LLM). It prompts the LLM to generate reasoning traces in natural language, incorporating explicit steps for the agent to think and reflect on its actions.\n\nThe self-reflection process involves:\n\n1. Identifying past mistakes or inefficient planning.\n2. Analyzing the causes of these mistakes.\n3. Refining future action decisions based on the reflections.\n\nThis process allows the agent to learn from its experiences, adapt to new situations, and improve its performance over time.'

In [None]:
## Check whether the context from the previous question is remembered
conversational_rag_chain.invoke(
    {"input": "What are the common types of it?"},
    config = {
        "configurable": {"session_id": "abc123"}
    },  # reuse the same session_id "abc123" so the earlier turn is loaded
)['answer']

"Based on the provided context, self-reflection in autonomous agents can be understood as a process to improve by refining past action decisions and correcting previous mistakes. The types mentioned directly are not explicitly stated, however, the following are general categories of self-reflection common in various contexts:\n\n1. **Retroactive Reflection**: This involves looking back at past decisions, actions, and events to identify mistakes or areas for improvement.\n\n2. **Proactive Reflection**: This type focuses on anticipating potential future scenarios and planning accordingly to prevent mistakes or improve outcomes.\n\n3. **Self-Awareness Reflection**: This involves understanding one's own thought processes, emotions, and biases to make more informed decisions.\n\n4. **Meta-Reflection**: This is a higher-level reflection that involves evaluating one's own reflection processes and considering how to improve them.\n\nThese categories can be adapted and applied to autonomous age