# **Conversational RAG Part 3:**

This part uses the following LangChain components:

* BaseChatMessageHistory
* InMemoryChatMessageHistory
* RunnableWithMessageHistory
* trim_messages

In [1]:
# Install the necessary libraries:

%pip install --upgrade --quiet sentence_transformers
%pip install --upgrade --quiet  langchain langchain-community langchainhub langchain-google-genai langchain-chroma bs4 boto3
%pip install --upgrade --quiet langchain-aws

In [25]:
# Load the API tokens from Colab secrets:

from google.colab import userdata
import os

os.environ['GOOGLE_API_KEY'] = userdata.get('GEMINI_API_KEY')
os.environ['HF_TOKEN'] = userdata.get('HF_TOKEN')

## **Load LLM:**

In [26]:
from langchain_google_genai import ChatGoogleGenerativeAI

model = ChatGoogleGenerativeAI(
    model="gemini-1.5-pro",
    temperature=0.4,
    max_tokens=512,
    timeout=None,
    max_retries=2,
    # other params...
)

print(model.invoke("hi").content)

Hi there! How can I help you today?



## **Load Embeddings from HF:**

In [None]:
# Get the Embeddings:

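# HuggingFaceEmbeddings lives in langchain_community; the langchain.embeddings path is deprecated.
from langchain_community.embeddings import HuggingFaceEmbeddings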

model_name = "sentence-transformers/all-MiniLM-L6-v2"
embeddings = HuggingFaceEmbeddings(model_name=model_name)

## **Vector Store:**

In [7]:
import bs4
from langchain import hub
from langchain_chroma import Chroma
from langchain_community.document_loaders import WebBaseLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter



In [8]:
# Load, chunk and index the contents of the blog.
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)
docs = loader.load()

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
splits = text_splitter.split_documents(docs)
vectorstore = Chroma.from_documents(documents=splits, embedding=embeddings)
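
The cell above splits the post with `chunk_size=1000` and `chunk_overlap=200`. As a rough illustration of how those two numbers interact, here is a minimal plain-Python sketch of a fixed-size sliding window. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter also tries to break on separators such as paragraphs and sentences); the function name `chunk_text` and the toy sizes are assumptions made for this example.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Slide a fixed-size window over `text`, stepping by size minus overlap."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


# Toy sizes so the overlap is easy to see by eye.
sample = "abcdefghijklmnopqrstuvwxyz"
chunks = chunk_text(sample, chunk_size=10, chunk_overlap=4)
for chunk in chunks:
    print(chunk)
```

Each chunk repeats the last four characters of the previous one, which is the point of the overlap: a sentence cut at a chunk boundary still appears whole in at least one chunk.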

In [9]:
# retriever:

retriever = vectorstore.as_retriever(search_kwargs=dict(k=5))

In [None]:
# get_relevant_documents is deprecated; retrievers are Runnables, so use invoke.
retriever.invoke("What is LLM?")
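
The retriever is configured with `k=5`, meaning it returns the five chunks whose embeddings are closest to the query embedding. The sketch below shows that idea with hand-written three-dimensional vectors and cosine similarity. The vectors, the texts, and the helper names (`cosine_similarity`, `top_k`) are made up for illustration, and Chroma's actual index structure and distance metric are not reproduced here.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], indexed: list[tuple[str, list[float]]], k: int) -> list[str]:
    """Return the texts of the k entries most similar to the query vector."""
    ranked = sorted(
        indexed,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


# Hypothetical index: (chunk text, embedding) pairs.
toy_index = [
    ("agents plan tasks", [0.9, 0.1, 0.0]),
    ("memory stores history", [0.1, 0.9, 0.0]),
    ("tools call external APIs", [0.0, 0.2, 0.9]),
]

results = top_k([1.0, 0.2, 0.0], toy_index, k=2)
print(results)
```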

## **Build Conversational AI Bot:**

In [59]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.chains import create_history_aware_retriever
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.chains import create_retrieval_chain
from langchain_core.messages import HumanMessage, AIMessage

### **Step 1: Create History-Aware-Retriever**

In [27]:
# Create History Aware Retriever:

# Adjacent string literals are joined with no separator, so each fragment ends with a space.
retriever_prompt = (
    "Given a chat history and the latest user question which might reference context in the chat history, "
    "formulate a standalone question which can be understood without the chat history. "
    "Do NOT answer the question, just reformulate it if needed and otherwise return it as is."
)


contextualize_q_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", retriever_prompt),
        MessagesPlaceholder(variable_name="chat_history"),
        ("human", "{input}"),
    ]
)

history_aware_retriever = create_history_aware_retriever(model, retriever, contextualize_q_prompt)
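
The history-aware retriever branches on whether there is any chat history: with no history the user input goes straight to the retriever, and with history the LLM first rewrites the input into a standalone question that is then used as the search query. The sketch below shows only that control flow in plain Python. `fake_rewrite` and `fake_retrieve` are hypothetical stand-ins for the model and the vector-store retriever, not LangChain calls.

```python
from typing import Callable

History = list[tuple[str, str]]  # (role, text) pairs; an assumed toy format


def history_aware_retrieve(
    user_input: str,
    chat_history: History,
    rewrite: Callable[[str, History], str],
    retrieve: Callable[[str], list[str]],
) -> list[str]:
    """Retrieve directly on the first turn; otherwise rewrite the question first."""
    if not chat_history:
        return retrieve(user_input)
    standalone_question = rewrite(user_input, chat_history)
    return retrieve(standalone_question)


# Hypothetical stand-in for the LLM rewrite step.
def fake_rewrite(question: str, history: History) -> str:
    last_user_turn = [text for role, text in history if role == "human"][-1]
    return f"{question} (regarding: {last_user_turn})"


# Hypothetical stand-in for the vector-store retriever.
def fake_retrieve(query: str) -> list[str]:
    return [f"doc matching: {query}"]


first = history_aware_retrieve("What is Chain of Thought?", [], fake_rewrite, fake_retrieve)
follow_up = history_aware_retrieve(
    "What are common ways of doing it?",
    [("human", "What is Chain of Thought?"), ("ai", "It elicits reasoning steps.")],
    fake_rewrite,
    fake_retrieve,
)
print(first)
print(follow_up)
```

Without the rewrite step, a follow-up such as "What are common ways of doing it?" would be embedded as-is, and "it" carries no retrievable meaning on its own.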

### **Step 2: Define the Custom System Prompt and Create the Question-Answer Chain**

In [28]:
# Define the custom system prompt and create the question-answer chain:

system_prompt = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer the question. "
    "If you don't know the answer, say that you don't know. "
    "Use three sentences maximum and keep the answer concise."
    "\n\n"
    "{context}"
)

qa_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        MessagesPlaceholder(variable_name="chat_history"),
        ("human", "{input}"),
    ]
)


question_answer_chain = create_stuff_documents_chain(model, qa_prompt)
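
"Stuffing" means that all retrieved chunks are concatenated and substituted into the `{context}` slot of the system prompt before the model is called. A minimal sketch of that substitution follows; the `stuff_documents` helper, the short template, and the blank-line separator are assumptions for illustration rather than the library's implementation.

```python
def stuff_documents(system_template: str, documents: list[str], separator: str = "\n\n") -> str:
    """Join the document texts and substitute them into the {context} slot."""
    context = separator.join(documents)
    return system_template.format(context=context)


# Shortened, hypothetical template with the same {context} placeholder.
template = "Answer using the context below.\n\n{context}"
prompt_text = stuff_documents(template, ["Chunk one.", "Chunk two."])
print(prompt_text)
```

Because every chunk lands in one prompt, the combination of `k` and `chunk_size` chosen earlier bounds how much text the model has to read per question.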

### **Step 3: Create the RAG Chain from history_aware_retriever and question_answer_chain**

In [29]:
# Create the RAG chain from history_aware_retriever and question_answer_chain:

rag_chain = create_retrieval_chain(history_aware_retriever, question_answer_chain)
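
The retrieval chain runs the two pieces in sequence and returns a dictionary: the original inputs, plus the retrieved documents under `context`, plus the generated text under `answer`. That is why later cells index the result with `["answer"]`. The sketch below mimics that data flow with plain functions; `run_retrieval_chain`, `fake_retriever`, and `fake_answerer` are hypothetical names, not LangChain APIs.

```python
from typing import Callable


def run_retrieval_chain(
    inputs: dict,
    retrieve: Callable[[dict], list[str]],
    answer: Callable[[dict], str],
) -> dict:
    """Add a `context` key from the retriever, then an `answer` key from the QA step."""
    state = {**inputs, "context": retrieve(inputs)}
    state["answer"] = answer(state)
    return state


# Hypothetical stand-in for the history-aware retriever.
def fake_retriever(inputs: dict) -> list[str]:
    return [f"chunk about {inputs['input']}"]


# Hypothetical stand-in for the question-answer chain.
def fake_answerer(state: dict) -> str:
    return f"Answer built from {len(state['context'])} chunk(s)."


result = run_retrieval_chain(
    {"input": "task decomposition", "chat_history": []},
    fake_retriever,
    fake_answerer,
)
print(result["answer"])
```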

### **Step 4: Use Memory and Sessions to Store the Current Conversation**

* ChatMessageHistory
* BaseChatMessageHistory
* RunnableWithMessageHistory

In [42]:
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory


#### **Create a Session to Store Conversation:**

In [49]:
store = {}

In [50]:
def get_session_history(session_id: str) -> BaseChatMessageHistory:
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]

#### **Create the Conversational RAG Chain from the Session Memory and rag_chain:**

In [51]:
# Wrap rag_chain so that each call reads from and writes to the history of its session:

conversational_rag_chain = RunnableWithMessageHistory(
    rag_chain,
    get_session_history,
    input_messages_key="input",
    history_messages_key="chat_history",
    output_messages_key="answer",
)
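
Conceptually, the wrapper does three things on every call: it looks up the history for the given `session_id`, passes it to the chain under the history key, and afterwards appends the new user message and the answer to that history. The plain-Python sketch below illustrates this loop with lists of `(role, text)` tuples. `with_history`, `echo_chain`, and the `sessions` dictionary are assumptions for this example and do not reproduce `RunnableWithMessageHistory` itself.

```python
from collections import defaultdict
from typing import Callable

# One message list per session id, created on first use.
sessions: dict[str, list[tuple[str, str]]] = defaultdict(list)


def with_history(chain: Callable[[dict], dict], session_id: str, user_input: str) -> str:
    """Load the session history, call the chain, then record both new messages."""
    history = sessions[session_id]
    result = chain({"input": user_input, "chat_history": list(history)})
    history.append(("human", user_input))
    history.append(("ai", result["answer"]))
    return result["answer"]


# Hypothetical chain that reports which turn it is, based on the history length.
def echo_chain(inputs: dict) -> dict:
    turn = len(inputs["chat_history"]) // 2 + 1
    return {"answer": f"turn {turn}: {inputs['input']}"}


with_history(echo_chain, "abc123", "first question")
second = with_history(echo_chain, "abc123", "second question")
other = with_history(echo_chain, "xyz789", "unrelated question")
print(second)
print(other)
```

The second call in session `abc123` sees the first exchange, while session `xyz789` starts from an empty history, which is the isolation that the `session_id` key provides.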

#### **Generate Response and store the current user session:**

In [52]:
conversational_rag_chain.invoke(
    {"input": "What is Chain of Thought Prompt?"},
    config={
        "configurable": {"session_id": "abc123"}
    },  # constructs a key "abc123" in `store`.
)["answer"]

'Chain-of-Thought (CoT) prompting encourages large language models to generate intermediate reasoning steps before arriving at a final answer.  This is done by providing a few examples of question-answer pairs with reasoning steps included in the prompt.  CoT improves reasoning and performance on complex tasks.\n'

In [54]:
store

{'abc123': InMemoryChatMessageHistory(messages=[HumanMessage(content='What is Chain of Thought Prompt?', additional_kwargs={}, response_metadata={}), AIMessage(content='Chain-of-Thought (CoT) prompting encourages large language models to generate intermediate reasoning steps before arriving at a final answer.  This is done by providing a few examples of question-answer pairs with reasoning steps included in the prompt.  CoT improves reasoning and performance on complex tasks.\n', additional_kwargs={}, response_metadata={})])}

In [55]:
conversational_rag_chain.invoke(
    {"input": "What are common ways of doing it?"},
    config={"configurable": {"session_id": "abc123"}},
)["answer"]

'One common way is to simply add "Let\'s think step by step" to the prompt.  More generally, providing a few examples of question-answer pairs that include the reasoning steps in the prompt can elicit this behavior.  There are also more advanced techniques like Tree of Thoughts, which explores multiple reasoning possibilities.\n'

In [56]:
store

{'abc123': InMemoryChatMessageHistory(messages=[HumanMessage(content='What is Chain of Thought Prompt?', additional_kwargs={}, response_metadata={}), AIMessage(content='Chain-of-Thought (CoT) prompting encourages large language models to generate intermediate reasoning steps before arriving at a final answer.  This is done by providing a few examples of question-answer pairs with reasoning steps included in the prompt.  CoT improves reasoning and performance on complex tasks.\n', additional_kwargs={}, response_metadata={}), HumanMessage(content='What are common ways of doing it?', additional_kwargs={}, response_metadata={}), AIMessage(content='One common way is to simply add "Let\'s think step by step" to the prompt.  More generally, providing a few examples of question-answer pairs that include the reasoning steps in the prompt can elicit this behavior.  There are also more advanced techniques like Tree of Thoughts, which explores multiple reasoning possibilities.\n', additional_kwarg

In [57]:
conversational_rag_chain.invoke(
    {"input": "What are the questions I asked before?"},
    config={"configurable": {"session_id": "abc123"}},
)["answer"]

'You asked about Chain of Thought prompting, specifically what it is and common ways to implement it.\n'

In [58]:
for message in store["abc123"].messages:
    if isinstance(message, AIMessage):
        prefix = "AI"
    else:
        prefix = "User"

    print(f"{prefix}: {message.content}\n")

User: What is Chain of Thought Prompt?

AI: Chain-of-Thought (CoT) prompting encourages large language models to generate intermediate reasoning steps before arriving at a final answer.  This is done by providing a few examples of question-answer pairs with reasoning steps included in the prompt.  CoT improves reasoning and performance on complex tasks.


User: What are common ways of doing it?

AI: One common way is to simply add "Let's think step by step" to the prompt.  More generally, providing a few examples of question-answer pairs that include the reasoning steps in the prompt can elicit this behavior.  There are also more advanced techniques like Tree of Thoughts, which explores multiple reasoning possibilities.


User: What are the questions I asked before?

AI: You asked about Chain of Thought prompting, specifically what it is and common ways to implement it.
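
The stored transcript grows by two messages per turn, and the whole history is sent to the model on every call. `trim_messages`, listed at the top of this part, is the LangChain utility for bounding it. The sketch below does not call that utility; it only illustrates the underlying idea in plain Python by keeping the system message plus the most recent messages, and by making sure the kept window starts on a user turn. It counts messages, whereas the real utility works with a token budget. The name `keep_last_messages` and the tuple format are assumptions.

```python
Message = tuple[str, str]  # (role, text); an assumed toy format


def keep_last_messages(messages: list[Message], max_messages: int) -> list[Message]:
    """Keep system messages plus at most `max_messages` recent non-system messages."""
    system = [m for m in messages if m[0] == "system"]
    rest = [m for m in messages if m[0] != "system"]
    trimmed = rest[-max_messages:] if max_messages > 0 else []
    # Drop leading AI messages so the window starts with a user question.
    while trimmed and trimmed[0][0] != "human":
        trimmed = trimmed[1:]
    return system + trimmed


history = [
    ("system", "You are a helpful assistant."),
    ("human", "q1"), ("ai", "a1"),
    ("human", "q2"), ("ai", "a2"),
    ("human", "q3"), ("ai", "a3"),
]

trimmed_history = keep_last_messages(history, max_messages=3)
print(trimmed_history)
```

With a window of three, the orphaned answer `a2` is dropped because its question no longer fits, so the model never sees an answer without the question that prompted it.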




## **Note:**
As a next step, create a simple bot application with Flask and add user authentication to it.