### Conversational Q&A Chatbot

In many Q&A applications we want to allow the user to have a back-and-forth conversation, which means the application needs some form of "memory" of past questions and answers, and some logic for incorporating those into its current thinking.

In this guide we focus on adding logic for incorporating historical messages. Further details on chat history management are covered in the previous videos.

We will cover two approaches:
- Chains, in which we always execute a retrieval step.
- Agents, in which we give an LLM discretion over whether and how to execute a retrieval step (or multiple steps).
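The difference between the two approaches can be illustrated with a small pure-Python sketch. Everything here is hypothetical and stands in for real components: `CORPUS`, `retrieve`, `chain_answer`, `agent_answer`, and `toy_policy` are made-up names, and `toy_policy` only imitates the decision an LLM would make in an agent.

```python
CORPUS = [
    "Self-reflection lets agents refine past decisions.",
    "Task decomposition splits a complex task into subtasks.",
]

def _words(text):
    """Lowercase words of at least 4 letters, with trailing punctuation removed."""
    cleaned = (w.strip("?.,!").lower() for w in text.split())
    return {w for w in cleaned if len(w) >= 4}

def retrieve(query):
    """Toy retriever: keep passages that share at least one word with the query."""
    terms = _words(query)
    return [p for p in CORPUS if terms & _words(p)]

def chain_answer(question):
    """Chain: the retrieval step always runs, whatever the question is."""
    return {"retrieved": True, "context": retrieve(question)}

def agent_answer(question, needs_retrieval):
    """Agent: a decision function (standing in for the LLM) chooses whether to retrieve."""
    if needs_retrieval(question):
        return {"retrieved": True, "context": retrieve(question)}
    return {"retrieved": False, "context": []}

def toy_policy(question):
    # Stand-in for the LLM's judgement: skip retrieval for small talk.
    return question.lower().strip("!. ") not in {"hi", "hello", "thanks"}
```

With these definitions, `chain_answer("Hello!")` still runs the retriever, while `agent_answer("Hello!", toy_policy)` skips it. The rest of this notebook follows the chain approach.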

langchain_core
langchain_community
langchain_huggingface
langchain_text_splitters
langchain_chroma
langchain_groq
bs4

langchain==0.3.26
langchain-chroma==0.1.4
langchain-community==0.3.26
langchain-core==0.3.66
langchain-groq==0.3.4
langchain-huggingface==0.2.0
langchain-text-splitters==0.3.8
bs4==0.0.2

In [1]:
import os
from dotenv import load_dotenv
load_dotenv()
from langchain_groq import ChatGroq

groq_api_key = os.getenv('GROQ_API_KEY')

llm = ChatGroq(groq_api_key=groq_api_key, model='llama3-8b-8192')

In [2]:
from langchain_huggingface import HuggingFaceEmbeddings

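# Re-export HF_TOKEN only when it is set; assigning None to os.environ raises a TypeError
hf_token = os.getenv('HF_TOKEN')
if hf_token:
    os.environ['HF_TOKEN'] = hf_token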
embeddings = HuggingFaceEmbeddings(model_name='all-MiniLM-L6-v2')



In [3]:
from langchain_community.document_loaders import WebBaseLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_chroma import Chroma
from langchain_core.prompts import ChatPromptTemplate
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
import bs4

USER_AGENT environment variable not set, consider setting it to identify your requests.


In [4]:
# Step 1: Load the HTML content
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(class_=("post-content", "post-header"))
    )
)

docs = loader.load()

In [5]:
# Step 2: Split the text into chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
splits = text_splitter.split_documents(docs)
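To get a feel for what `chunk_size` and `chunk_overlap` mean, here is a deliberately simplified sketch. It cuts at fixed character offsets, whereas `RecursiveCharacterTextSplitter` prefers to break on separators such as paragraphs and spaces, so treat `split_with_overlap` as a hypothetical illustration of the two parameters and not as the library's algorithm.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Fixed-width splitter: each chunk repeats the last `chunk_overlap` characters of the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks

toy_chunks = split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=2)
print(toy_chunks)  # ['abcd', 'cdef', 'efgh', 'ghij']
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.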

In [6]:
# Step 3: Store the vectors in Chroma DB
vectorstore = Chroma.from_documents(documents=splits, embedding=embeddings)
retriever = vectorstore.as_retriever()
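The retriever returns the chunks whose vectors are closest to the query vector. The sketch below shows that idea with word-count vectors and cosine similarity. It is an assumption-laden toy: `embed`, `cosine`, and `top_k` are hypothetical helpers, a real embedding model returns dense float vectors, and Chroma's actual index and distance settings are not modelled here.

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: word counts instead of a dense model vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def top_k(query, chunks, k=2):
    """Rank chunks by similarity to the query and keep the best k."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

toy_docs = [
    "agents use self reflection",
    "task decomposition splits work",
    "memory stores past events",
]
best = top_k("what is task decomposition", toy_docs, k=1)
```

Here `best` contains only the task-decomposition chunk, because it is the only one sharing words with the query.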

In [7]:
# Step 4: Prompt Template
system_prompt = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer "
    "the question. If you don't know the answer, say that you "
    "don't know. Use three sentences maximum and keep the "
    "answer concise."
    "\n\n"
    "{context}"
)

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        ("human", "{input}")
    ]
)

In [8]:
# Step 5: Chain
question_answer_chain = create_stuff_documents_chain(llm, prompt)
rag_chain = create_retrieval_chain(retriever, question_answer_chain)


In [9]:
# Step 6: Response
response = rag_chain.invoke({"input": "What is Self-Reflection"})
response['answer']

'Self-reflection is a vital aspect that allows autonomous agents to improve iteratively by refining past action decisions and correcting previous mistakes. It plays a crucial role in real-world tasks where trial and error are inevitable.'
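Conceptually, Steps 5 and 6 retrieve passages, "stuff" all of them into the `{context}` slot of the system prompt, and send the result to the model. The sketch below mimics that flow in plain Python. `toy_retrieval_chain`, `stuff_documents`, `fake_retriever`, and `fake_llm` are hypothetical stand-ins, and the `input`/`context`/`answer` keys mirror what I understand `create_retrieval_chain` to return.

```python
SYSTEM_TEMPLATE = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer the question."
    "\n\n"
    "{context}"
)

def stuff_documents(docs, question):
    """Join every retrieved passage and place it in the {context} slot."""
    context = "\n\n".join(docs)
    return [
        ("system", SYSTEM_TEMPLATE.format(context=context)),
        ("human", question),
    ]

def toy_retrieval_chain(question, retriever, llm):
    """Retrieve first, then answer; return the input, the context, and the answer."""
    docs = retriever(question)
    messages = stuff_documents(docs, question)
    return {"input": question, "context": docs, "answer": llm(messages)}

# Hypothetical stand-ins for the real retriever and chat model.
def fake_retriever(question):
    return ["Passage A about self-reflection.", "Passage B about agents."]

def fake_llm(messages):
    system_text = messages[0][1]
    return f"Answered using {system_text.count('Passage')} passages."

result = toy_retrieval_chain("What is Self-Reflection", fake_retriever, fake_llm)
```

Because every retrieved passage is pasted into one prompt, this style of chain only works while the combined chunks fit in the model's context window.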

In [10]:
# Step 7: Adding Chat History
from langchain.chains import create_history_aware_retriever
from langchain_core.prompts import MessagesPlaceholder
from langchain_core.messages import HumanMessage

contextualize_que_system_prompt = (
    "Given a chat history and the latest user question "
    "which might reference context in the chat history, "
    "formulate a standalone question which can be understood "
    "without the chat history. Do NOT answer the question, "
    "just reformulate it if needed and otherwise return it as is."
)

contextualize_que_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", contextualize_que_system_prompt),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}")
    ]
)

history_aware_retriever = create_history_aware_retriever(llm, retriever, contextualize_que_prompt)
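The history-aware retriever adds one step before retrieval: when there is prior conversation, the LLM rewrites the latest question into a standalone query, and when the history is empty the question is used as is. A minimal sketch of that branching, with hypothetical stand-ins (`fake_rewrite` for the LLM call, `echo_retriever` for the vector store retriever):

```python
def history_aware_retrieve(question, chat_history, rewrite, retriever):
    """Use the question as is on the first turn; otherwise rewrite it first."""
    query = rewrite(chat_history, question) if chat_history else question
    return retriever(query)

# Hypothetical stand-in for the LLM rewrite step.
def fake_rewrite(chat_history, question):
    last_user_turn = [text for role, text in chat_history if role == "human"][-1]
    return f"{question} (follow-up to: {last_user_turn})"

# Hypothetical retriever that simply reports the query it received.
def echo_retriever(query):
    return [f"docs for: {query}"]

first = history_aware_retrieve("What is Self-Reflection", [], fake_rewrite, echo_retriever)

history = [
    ("human", "What is Self-Reflection"),
    ("ai", "It lets agents learn from mistakes."),
]
second = history_aware_retrieve("Tell me more about it?", history, fake_rewrite, echo_retriever)
```

On the second call the retriever sees a query that names the topic, which is what lets a vague follow-up such as "Tell me more about it?" still fetch relevant chunks.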

In [11]:
qa_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}")
    ]
)

In [12]:
# Step 8: Response (rebuild the chain with the history-aware retriever)
question_answer_chain = create_stuff_documents_chain(llm, qa_prompt)
rag_chain = create_retrieval_chain(history_aware_retriever, question_answer_chain)

In [13]:
from langchain_core.messages import AIMessage, HumanMessage

chat_history = []
question = "What is Self-Reflection"
response = rag_chain.invoke({"input": question, "chat_history": chat_history})

chat_history.extend(
    [
        HumanMessage(content=question),
        AIMessage(content=response['answer'])
    ]
)

response['answer']

'Self-Reflection is a vital aspect that allows autonomous agents to improve iteratively by refining past action decisions and correcting previous mistakes. It is a process that enables agents to reflect on their past actions, identify mistakes, and adjust their behavior to improve future outcomes.'

In [14]:
question2 = "Tell me more about it?"
# Pass the follow-up (question2), not the first question, so "it" is resolved from chat_history
response2 = rag_chain.invoke({"input": question2, "chat_history": chat_history})
response2['answer']

'Self-Reflection is a mechanism that allows autonomous agents to reflect on their past actions, identify mistakes, and adjust their behavior to improve future outcomes.'

In [15]:
chat_history

[HumanMessage(content='What is Self-Reflection', additional_kwargs={}, response_metadata={}),
 AIMessage(content='Self-Reflection is a vital aspect that allows autonomous agents to improve iteratively by refining past action decisions and correcting previous mistakes. It is a process that enables agents to reflect on their past actions, identify mistakes, and adjust their behavior to improve future outcomes.', additional_kwargs={}, response_metadata={})]

In [22]:
# Chat Session Id
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

store = {}

def get_session_history(session_id: str) -> BaseChatMessageHistory:
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]

conversational_rag_chain = RunnableWithMessageHistory(
    rag_chain,
    get_session_history,
    input_messages_key="input",
    history_messages_key="chat_history",
    output_messages_key="answer"
)
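In effect, the wrapper automates the manual bookkeeping from the earlier cells: look up the history for the session, pass it to the chain, then append the new question and answer. The sketch below imitates that loop with plain lists. `toy_store`, `get_toy_history`, `invoke_with_history`, and `counting_chain` are hypothetical names, and this is a simplification of what `RunnableWithMessageHistory` does, not its implementation.

```python
toy_store = {}

def get_toy_history(session_id):
    """Return the message list for a session, creating it on first use."""
    return toy_store.setdefault(session_id, [])

def invoke_with_history(chain, user_input, session_id):
    """Load the session history, call the chain, then record both turns."""
    history = get_toy_history(session_id)
    result = chain({"input": user_input, "chat_history": list(history)})
    history.append(("human", user_input))
    history.append(("ai", result["answer"]))
    return result

# Hypothetical chain that reports how much history it was given.
def counting_chain(inputs):
    return {"answer": f"seen {len(inputs['chat_history'])} prior messages"}

a1 = invoke_with_history(counting_chain, "What is task decomposition?", "abc123")
a2 = invoke_with_history(counting_chain, "What are common ways of doing it?", "abc123")
b1 = invoke_with_history(counting_chain, "Hello", "xyz789")
```

The second call in session `abc123` sees the two messages from the first turn, while the new session `xyz789` starts empty, which is why the same `session_id` must be passed on every turn of a conversation.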

In [None]:
conversational_rag_chain.invoke(
    {"input": "What is task decomposition?"},
    config={
        "configurable": {"session_id": "abc123"}
    },
)["answer"]

'Task Decomposition is the process of breaking down a complex task into smaller, more manageable subtasks.'

In [27]:
conversational_rag_chain.invoke(
    {"input": "What are common ways of doing it?"},
    config={
        "configurable": {"session_id": "abc123"}
    },
)["answer"]

'According to the provided context, common ways of doing task decomposition include:\n\n1. Using simple prompting techniques, such as "Steps for XYZ.\\n1." or "What are the subgoals for achieving XYZ?"\n2. Using task-specific instructions, such as "Write a story outline." for writing a novel\n3. Using human inputs'