### Conversational Q&A Chatbot
In many Q&A applications we want to allow the user to have a back-and-forth conversation, meaning the application needs some sort of "memory" of past questions and answers, and some logic for incorporating those into its current thinking.

In this guide we focus on adding logic for incorporating historical messages. Chat history management is covered in more detail in the previous videos.

We will cover two approaches:

- Chains, in which we always execute a retrieval step;
- Agents, in which we give an LLM discretion over whether and how to execute a retrieval step (or multiple steps).
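The difference between the two approaches can be sketched in plain Python. This is only an illustration and not LangChain code: `DOCS`, `toy_retrieve`, `chain_answer`, `needs_lookup`, and `agent_answer` are hypothetical names, and the keyword rule in `needs_lookup` stands in for a decision that an LLM would make in a real agent.

```python
# Toy corpus standing in for a vector store (hypothetical data).
DOCS = [
    "LangChain is a framework for building LLM applications.",
    "A retriever returns documents relevant to a query.",
]

def toy_retrieve(query: str) -> list[str]:
    """Return documents that share at least one lowercase word with the query."""
    words = set(query.lower().split())
    return [d for d in DOCS if words & set(d.lower().rstrip(".").split())]

def chain_answer(query: str) -> dict:
    """Chain: the retrieval step always runs, whatever the query is."""
    return {"retrieved": toy_retrieve(query), "used_retrieval": True}

def needs_lookup(query: str) -> bool:
    """Stand-in for the LLM's judgement; here just a crude keyword rule."""
    return any(w in query.lower() for w in ("langchain", "retriever"))

def agent_answer(query: str) -> dict:
    """Agent: retrieval runs only when the decision step asks for it."""
    if needs_lookup(query):
        return {"retrieved": toy_retrieve(query), "used_retrieval": True}
    return {"retrieved": [], "used_retrieval": False}
```

For a greeting such as `"hello there"`, `chain_answer` still performs a lookup while `agent_answer` skips it.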

In [160]:
import os

import bs4
from dotenv import load_dotenv
from fastapi import FastAPI
from langchain_chroma import Chroma
from langchain_community.document_loaders import WebBaseLoader
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_groq import ChatGroq
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langserve import add_routes

# Load variables from a local .env file into os.environ. LANGCHAIN_API_KEY and
# LANGCHAIN_PROJECT are read from there, so they need no reassignment
# (assigning os.getenv(...) back would raise TypeError if a variable is missing).
load_dotenv()
os.environ["LANGCHAIN_TRACING_V2"] = "true"

# Chat model served by Groq.
groq_api_key = os.getenv("GROQ_API_KEY")
model = ChatGroq(model="moonshotai/kimi-k2-instruct", groq_api_key=groq_api_key)

# Embedding model; hf_api_key is read here but not passed to the embeddings object.
hf_api_key = os.getenv("HF_API_KEY")
emb = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")




In [27]:
# Load the page and keep only elements whose class matches one of the names below.
loader = WebBaseLoader(
    web_path="https://www.ibm.com/think/topics/langchain",
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "content-page page basicpage publish", "post-header")
        )
    ),
)

In [32]:
docs  = loader.load()
docs

[Document(metadata={'source': 'https://www.ibm.com/think/topics/langchain', 'title': 'Caret right'}, page_content="\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n                                   \n\n\n  \n  \n    \n    What is LangChain?\n\n  \n\n\n\n\n\n    \n\n\n                               \n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nAI Agents\n\n\n\nWelcome\n\n\n\n\n\nCaret right\n\nIntroduction\n\n\n\n\nOverview\n\n\n\n\nAI agents vs AI assistants\n\n\n\n\n\nCaret right\n\nAgentic AI\n\n\n\n\nWhat is agentic AI?\n\n\n\n\nWhy is agentic AI important?\n\n\n\n\n\n\nAgentic AI vs generative AI\n\n\n\n\n\nCaret right\n\nAI agent development\n\n\n\n\nWhat is AI agent development?\n\n\n\n\nAgentOps\n\n\n\n\nTutorial: AgentOps with IBM Telemetry using watsonx Orchestrate\n\n\n\n\nHow to build an AI agent\n\n\n\n\n\n\nEvolution of AI agents\n\n\n\n\n\nCaret right\n\nTypes of AI agents\n\n\n\n\nOverview\n\n\n\n\nGoal-based agent\n\n\n\n\nModel-based ref

In [36]:
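# Split the page into chunks of about 100 characters, with 10 characters of overlap.
text_splitter = RecursiveCharacterTextSplitter(chunk_size=100, chunk_overlap=10)
splits = text_splitter.split_documents(docs)

# Embed every chunk and index it in a Chroma vector store.
vs = Chroma.from_documents(documents=splits, embedding=emb)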
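
To make the `chunk_size` and `chunk_overlap` settings concrete, here is a minimal sketch of fixed-width chunking with overlap. It is a simplification and not the splitter's actual algorithm: `RecursiveCharacterTextSplitter` prefers to break at separators such as paragraph breaks and newlines, while the hypothetical `split_with_overlap` below simply slides a window over the characters.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Slide a window of chunk_size characters, stepping by chunk_size - chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks

# 250 characters of dummy text, split with the same settings as the cell above.
sample = "".join(str(i % 10) for i in range(250))
chunks = split_with_overlap(sample, chunk_size=100, chunk_overlap=10)
```

Each chunk begins with the last 10 characters of the previous one, so a sentence cut at a chunk boundary is still partly visible in the neighbouring chunk.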

In [37]:
retriever = vs.as_retriever()

retriever

VectorStoreRetriever(tags=['Chroma', 'HuggingFaceEmbeddings'], vectorstore=<langchain_chroma.vectorstores.Chroma object at 0x0000026C55171810>, search_kwargs={})
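
Conceptually, a retriever embeds the query and returns the stored chunks whose vectors are closest to it. The sketch below illustrates that idea with hand-written two-dimensional vectors and cosine similarity. The names `cosine`, `top_k`, and `doc_vecs` are hypothetical, and cosine similarity is an assumption made for illustration: the metric a given vector store actually uses depends on its configuration.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec: list[float], doc_vecs: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the names of the k vectors most similar to the query."""
    ranked = sorted(
        doc_vecs.items(),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]

# Hand-written toy embeddings (hypothetical values).
doc_vecs = {
    "chunk_a": [1.0, 0.0],
    "chunk_b": [0.7, 0.7],
    "chunk_c": [0.0, 1.0],
}
nearest = top_k([1.0, 0.1], doc_vecs, k=2)
```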

# Prompt Template

In [None]:
from langchain_core.prompts import ChatPromptTemplate , MessagesPlaceholder

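system = """You are a helpful agent. Please answer the question in just 20 words with important keywords
from the context below. If you don't know, just say "I don't know".

Context:
{context}
"""
# The {context} placeholder is filled with the documents returned by the retrieval step.
prompts = ChatPromptTemplate.from_messages(
    [
        ("system", system),
        ("human", "{input}"),
    ]
)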



In [60]:
from langchain_core.output_parsers import StrOutputParser

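# Prompt -> model -> plain string.
question_answer_chain = (
    prompts
    | model
    | StrOutputParser()
)

# Always retrieve first, then pass the retrieved text and the question on.
rag_chain = (
    {
        # Join the retrieved documents into one block of text.
        "context": lambda x: "\n\n".join(
            doc.page_content for doc in retriever.invoke(x["input"])
        ),
        "input": lambda x: x["input"],
    }
    | question_answer_chain
)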


In [82]:
rag_chain.invoke({"input": "Say one bad pickupline?"})


'"Are you a loan? Because my interest in you is predatory and your rate is sky-high—run!"'

# Adding Chat History

In [118]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_core.messages import HumanMessage, AIMessage


In [119]:
context = """
You are a helpful agent.
Answer in just 20 words.
Use only keywords from the context.
If you don't know, say: I don't know.
"""


In [124]:
context_qa = ChatPromptTemplate.from_messages(
    [
        ("system", context),
        MessagesPlaceholder("chat_history"),
        ("human", "Context:\n{context}\n\nQuestion:\n{input}")
    ]
)
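
The role of `MessagesPlaceholder("chat_history")` can be pictured as splicing the earlier turns between the system message and the new human message. The plain-Python sketch below shows that ordering with `(role, content)` tuples. `build_messages` is a hypothetical helper written for illustration and is not part of LangChain.

```python
def build_messages(system: str, chat_history: list, context: str, question: str) -> list:
    """Return messages in the order the prompt template lays them out."""
    return [
        ("system", system),
        *chat_history,  # earlier turns are inserted where the placeholder sits
        ("human", f"Context:\n{context}\n\nQuestion:\n{question}"),
    ]

# Hypothetical earlier exchange.
history = [
    ("human", "What is LangChain?"),
    ("ai", "A framework for building LLM applications."),
]
messages = build_messages(
    system="You are a helpful agent.",
    chat_history=history,
    context="(retrieved text goes here)",
    question="Who is the founder of this",
)
```

With an empty history the result is just the system message followed by the human message, which is why the same prompt works for the first turn as well.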


In [142]:
from operator import itemgetter

def retriever_to_text(docs):
    return "\n\n".join(doc.page_content for doc in docs)

history_qa = (
    {
        "input": itemgetter("input"),
        "chat_history": itemgetter("chat_history"),
        "context": itemgetter("input") | retriever | retriever_to_text
    }
    | context_qa
    | model
    | StrOutputParser()
)


In [153]:

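# Start with an empty history; it is filled in after the first exchange.
chat_history = []

question = "What is LangChain?"
response1 = history_qa.invoke({
    "input": question,
    "chat_history": chat_history
})

# Ask a follow-up before the history is updated. A separate variable keeps
# `question` pointing at the first question, which is what the history should record.
follow_up = "Who is the founder of this"

history_qa.invoke({
    "input": follow_up,
    "chat_history": chat_history
})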



'LangChain framework chains prompts models memory tools orchestrate LLM workflows'

In [154]:
chat_history.extend(
    [
        HumanMessage(content=question),
        AIMessage(content=response1)
    ]
)

In [157]:
question = "Who is the founder of this"

history_qa.invoke({
    "input":question  ,
    "chat_history": chat_history
})


'Harrison Chase'
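
Updating the history by hand after every call is easy to get wrong, for example by recording the wrong question next to an answer. One possible pattern is a small helper that answers and records the turn in one step. The sketch below is a plain-Python illustration under stated assumptions: `ask` and `fake_answer` are hypothetical, and turns are stored as `(role, content)` tuples. With the chain from this notebook, the callable could wrap `history_qa.invoke` and the entries could be `HumanMessage` and `AIMessage` objects instead.

```python
from typing import Callable

def ask(question: str, chat_history: list, answer_fn: Callable[[str, list], str]) -> str:
    """Answer a question, then append the question/answer pair to the history."""
    answer = answer_fn(question, chat_history)
    chat_history.append(("human", question))
    chat_history.append(("ai", answer))
    return answer

def fake_answer(question: str, chat_history: list) -> str:
    """Stand-in for a model call: reports how many earlier messages it could see."""
    return f"{len(chat_history)} prior messages"

history: list = []
first = ask("What is LangChain?", history, fake_answer)
second = ask("Who is the founder of this", history, fake_answer)
```

The second call sees the two messages recorded by the first, which is the property the follow-up question above relies on.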