# RAG with chat history

<img src="/home/mauricio/Documents/Projects/RAG-Mastery/Diagrams/RAG-chat-history.png" width="65%">

We'll start by copying and reusing certain components from our Simple_rag implementation:

- Document loading
- Text splitting
- Document retrieval

These components will remain largely unchanged, as they form the foundation of our RAG system.


In [18]:
import sys
import os
import tqdm as notebook_tqdm
from typing import List, Union

# Path to the directory containing config.py
config_path = '/home/mauricio/Documents/Projects/RAG-Mastery'

# Add the directory to sys.path
if config_path not in sys.path:
    sys.path.append(config_path)

# Now you can import the API_KEY from config.py
from config import API_KEY

path_to_docs = "/home/mauricio/Documents/Projects/RAG-Mastery/data"

import warnings
# Suppress the specific warning
warnings.filterwarnings("ignore", message="Could not download mistral tokenizer")
import logging
logging.getLogger().setLevel(logging.WARNING)
# Suppress HTTP request logs
logging.getLogger("httpx").setLevel(logging.WARNING)

# Suppress FAISS logs
logging.getLogger("faiss").setLevel(logging.WARNING)

# Suppress pikepdf logs
logging.getLogger("pikepdf").setLevel(logging.WARNING)

In [19]:
from langchain_mistralai.chat_models import ChatMistralAI


def get_llm_model():
    # Chat model used for both question reformulation and answer generation
    return ChatMistralAI(
        model_name="open-mixtral-8x22b",
        mistral_api_key=API_KEY
    )


from langchain_unstructured import UnstructuredLoader
from langchain.schema.document import Document

def load_documents(folder_path):
    documents = []
    for file in os.listdir(folder_path):
        if file.endswith(('.docx', '.pdf', '.txt')):
            loader = UnstructuredLoader(os.path.join(folder_path, file))
            documents.extend(loader.load())
        print("Document loaded lenght: ", len(documents))
    print("Documents loaded successfully ✅")
    # Guard against an empty folder before printing the first document's filename
    if documents:
        print(documents[0].metadata.get("filename"))
    return documents

from langchain_text_splitters import RecursiveCharacterTextSplitter


def split_documents(documents, chunk_size=1000, chunk_overlap=200):
    try:
        # Try paragraph, line, and word boundaries before falling back to single characters
        text_splitter = RecursiveCharacterTextSplitter(
            chunk_size=chunk_size,
            chunk_overlap=chunk_overlap,
            length_function=len,
            separators=["\n\n", "\n", " ", ""]
        )
        splits: List[Document] = text_splitter.split_documents(documents)
        print("Split document successfully ✅")
        print("Documents split: ", len(splits))
        return splits
    except Exception as e:
        print(f"Error splitting documents: {e}")
        raise


from langchain_mistralai import MistralAIEmbeddings
from langchain_community.vectorstores import FAISS

def embed_documents(splits):
    try:
        embeddings = MistralAIEmbeddings(
            model="mistral-embed",
            mistral_api_key=API_KEY
            )
        vectorstore = FAISS.from_documents(documents=splits, embedding=embeddings)
        retriever = vectorstore.as_retriever(
            search_type="mmr",
            search_kwargs={"k": 6},
        )

        print("Retriever succesfuly created ✅")
        return retriever
    except Exception as e:
        print(f"Error embedding/retrieving documents: {e}")
        raise
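
The retriever above uses `search_type="mmr"` (maximal marginal relevance), which balances similarity to the query against redundancy among the chunks already selected. The following is a minimal pure-Python sketch of that selection rule, not LangChain's internal implementation; the toy vectors and the names `cosine` and `mmr_select` are assumptions made for illustration, and `lambda_mult` mirrors the weighting parameter of the same name that LangChain's MMR search accepts.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mmr_select(query: Sequence[float], docs: List[Sequence[float]],
               k: int, lambda_mult: float = 0.5) -> List[int]:
    """Return the indices of up to k documents chosen by maximal marginal relevance."""
    selected: List[int] = []
    candidates = list(range(len(docs)))
    while candidates and len(selected) < k:
        def score(i: int) -> float:
            relevance = cosine(query, docs[i])
            # Redundancy is the similarity to the closest already-selected document
            redundancy = max((cosine(docs[i], docs[j]) for j in selected), default=0.0)
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


query = [1.0, 0.0, 0.0]
docs = [
    [1.0, 0.10, 0.0],  # highly relevant
    [1.0, 0.12, 0.0],  # near-duplicate of the first
    [0.7, 0.00, 0.7],  # less relevant, but points in a different direction
]
diverse = mmr_select(query, docs, k=2, lambda_mult=0.5)
by_similarity = mmr_select(query, docs, k=2, lambda_mult=1.0)
print(diverse, by_similarity)
```

With `lambda_mult=0.5` the near-duplicate is skipped in favor of the more distinct chunk, while `lambda_mult=1.0` reduces to plain similarity ranking. In the notebook's retriever, the same trade-off should be adjustable through `search_kwargs` (for example by adding `lambda_mult` or `fetch_k` next to `k`), though the best values depend on the corpus.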

We now enhance the RAG system with a chat history component. This involves creating a sub-chain that reformulates the user's latest question by incorporating context from the conversation history.

$$
(\text{query},\ \text{conversation history}) \rightarrow \text{LLM} \rightarrow \text{rephrased query} \rightarrow \text{retriever}
$$

This keeps the conversation flowing naturally while ensuring that each query sent to the retriever is self-contained and carries the context it needs to find relevant information.
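
The data flow described above can be sketched in plain Python. This is only an illustration under stated assumptions: `toy_rewrite` and `toy_retrieve` are hypothetical stand-ins for the LLM call and the vector-store retriever, and `history_aware_retrieve` is not a LangChain function. Skipping the rewrite when the history is empty mirrors how LangChain's history-aware retriever is documented to behave.

```python
from typing import Callable, List, Tuple

Message = Tuple[str, str]  # (role, content), e.g. ("human", "...")


def history_aware_retrieve(
    query: str,
    history: List[Message],
    rewrite: Callable[[str, List[Message]], str],
    retrieve: Callable[[str], List[str]],
) -> List[str]:
    """Rephrase the query using the history (if any), then retrieve with the result."""
    # With an empty history there is nothing to resolve, so the query is used as-is
    standalone = rewrite(query, history) if history else query
    return retrieve(standalone)


def toy_rewrite(query: str, history: List[Message]) -> str:
    """Hypothetical stand-in for the LLM call: append the previous human turn."""
    last_human = next(text for role, text in reversed(history) if role == "human")
    return f"{query} (context: {last_human})"


def toy_retrieve(query: str) -> List[str]:
    """Hypothetical stand-in for the retriever: echo the query it received."""
    return [f"chunk for: {query}"]


first = history_aware_retrieve("What is pass interference?", [], toy_rewrite, toy_retrieve)
history = [("human", "What is pass interference?"), ("ai", "<previous answer>")]
second = history_aware_retrieve("What's the penalty for that?", history, toy_rewrite, toy_retrieve)
print(first)
print(second)
```

The point of the sketch is that the retriever never sees the raw follow-up question: "that" alone would match nothing useful, whereas the rephrased query names what the user is asking about.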


In [20]:
from langchain_core.prompts import MessagesPlaceholder
from langchain_core.prompts import ChatPromptTemplate

contextualize_q_system_prompt = (
    """You are an AI assistant specializing in contextual analysis and question reformulation for an NFL rulebook system. Your task is to:

        1. Carefully analyze the provided chat history and the latest user question.
        2. Identify any contextual references or implied information from the chat history that are relevant to the latest question.
        3. Reformulate the question into a standalone format that incorporates all necessary context.
        4. Ensure the reformulated question can be fully understood without access to the chat history.
        5. If the original question is already standalone and doesn't require additional context, return it unchanged.

        Key Instructions:
        - Focus solely on reformulation; do not attempt to answer the question.
        - Maintain the original intent and scope of the question.
        - Use clear, concise language appropriate for an NFL rulebook context.
        - Include specific NFL terminology or references if they were part of the original context.

        Example:
        Original: "What's the penalty for that?"
        Reformulated: "What is the specific penalty for defensive pass interference in the NFL?"

        Output: Provide only the reformulated question or the original question if no reformulation is needed. Do not include any explanations or additional commentary.
    """
)

contextualize_q_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", contextualize_q_system_prompt),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}"),
    ]
    )
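
`MessagesPlaceholder("chat_history")` reserves a slot where the earlier messages are spliced in, between the system prompt and the latest human message. A rough sketch of the resulting message order, using plain tuples instead of LangChain message objects; `build_messages` is a hypothetical helper written for this illustration.

```python
from typing import List, Tuple

Message = Tuple[str, str]  # (role, content)


def build_messages(system_prompt: str, chat_history: List[Message],
                   user_input: str) -> List[Message]:
    """Assemble messages in the order the prompt template declares them."""
    return [("system", system_prompt), *chat_history, ("human", user_input)]


messages = build_messages(
    "Reformulate the latest question as a standalone question.",
    [("human", "What is defensive pass interference?"), ("ai", "<previous answer>")],
    "What's the penalty for that?",
)
for role, content in messages:
    print(f"{role}: {content}")
```

On the first turn of a conversation the history is empty, so the model receives only the system prompt followed by the user's question.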



Now we will build the QA chain for the RAG system. We will use the `create_history_aware_retriever` function from LangChain.

To generate answers, the `question_answer_chain` needs `context`, `chat_history`, and `input` as its input keys.

In [21]:
from langchain.chains import create_history_aware_retriever
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain

system_prompt = (
    """
        You are an expert NFL rulebook assistant specializing in concise, accurate answers. 
        Your primary function is to provide clear and precise responses to questions about NFL rules and regulations. 
        Key Instructions:
            1. Analyze the provided context carefully.
            2. Use only the given context to formulate your answer. Do not rely on external knowledge.
            3. If the context doesn't contain sufficient information to answer the question, respond with "I don't have enough information to answer that question based on the provided context."
            4. Limit your response to a maximum of three sentences.
            5. Prioritize clarity and accuracy over exhaustive explanations.
            6. If relevant, cite the specific NFL rule or section number.
            7. Avoid speculation or personal opinions.
            
        Context:
        {context}
    """
)

qa_prompt = ChatPromptTemplate.from_messages([
    ("system", system_prompt),
    MessagesPlaceholder("chat_history"),
    ("human", "{input}"),
])
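
The `{context}` placeholder in the system prompt is where the retrieved chunks end up. A "stuff" documents chain takes the simplest possible approach: it concatenates every retrieved chunk into one string and substitutes it into the prompt. A minimal sketch of that idea follows; `stuff_context` and the shortened template are hypothetical, and the blank-line separator is an assumption about the default.

```python
from typing import List

PROMPT_TEMPLATE = "Answer using only this context.\n\nContext:\n{context}"


def stuff_context(template: str, chunks: List[str], separator: str = "\n\n") -> str:
    """Join every retrieved chunk and place the result in the {context} slot."""
    return template.format(context=separator.join(chunks))


chunks = ["<retrieved chunk 1>", "<retrieved chunk 2>"]
stuffed = stuff_context(PROMPT_TEMPLATE, chunks)
print(stuffed)
```

Because all chunks go into a single prompt, the retriever settings bound the prompt size: with `k=6` and `chunk_size=1000`, the context section holds roughly 6,000 characters at most.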

Now let's create the:

* history_aware_retriever

* question_answer_chain

* rag_chain
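
Before wiring up the real objects, it may help to see how the three pieces relate. The sketch below is a simplified model rather than LangChain's implementation: the retrieval step adds a `context` key to the input dictionary, and the answering step then adds an `answer` key. `make_rag_chain`, `toy_retriever`, and `toy_qa` are hypothetical names.

```python
from typing import Any, Callable, Dict, List


def make_rag_chain(
    retrieve: Callable[[Dict[str, Any]], List[str]],
    answer: Callable[[Dict[str, Any]], str],
) -> Callable[[Dict[str, Any]], Dict[str, Any]]:
    """Compose a retrieval step and an answering step into one callable."""
    def rag_chain(inputs: Dict[str, Any]) -> Dict[str, Any]:
        # Step 1: retrieval sees the original inputs (question and history)
        with_context = {**inputs, "context": retrieve(inputs)}
        # Step 2: answering sees the inputs plus the retrieved context
        return {**with_context, "answer": answer(with_context)}
    return rag_chain


def toy_retriever(inputs: Dict[str, Any]) -> List[str]:
    """Hypothetical stand-in for history_aware_retriever."""
    return [f"chunk about {inputs['input']}"]


def toy_qa(inputs: Dict[str, Any]) -> str:
    """Hypothetical stand-in for question_answer_chain."""
    return f"Answer based on {len(inputs['context'])} chunk(s)"


toy_rag_chain = make_rag_chain(toy_retriever, toy_qa)
toy_result = toy_rag_chain({"input": "kickoff rules", "chat_history": []})
print(sorted(toy_result))
```

The output dictionary keeps the original `input` and `chat_history` keys alongside the added `context` and `answer`, so the final answer is read from the `answer` key.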


In [22]:
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.chains import create_history_aware_retriever

documents = load_documents(path_to_docs)
splits = split_documents(documents)
retriever = embed_documents(splits)

llm = get_llm_model()

history_aware_retriever = create_history_aware_retriever(
    llm, retriever, contextualize_q_prompt)

question_answer_chain = create_stuff_documents_chain(llm, qa_prompt)

rag_chain = create_retrieval_chain(history_aware_retriever, question_answer_chain)


Document loaded lenght:  1964
Documents loaded successfully ✅
2024-nfl-rulebook.pdf
Split document successfully ✅
Documents split:  1995
Retriever succesfuly created ✅


We're ready to try it!

In [29]:
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

store = {}


def get_session_history(session_id: str) -> BaseChatMessageHistory:
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]


conversational_rag_chain = RunnableWithMessageHistory(
    rag_chain,
    get_session_history,
    input_messages_key="input",
    history_messages_key="chat_history",
    output_messages_key="answer",
)
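
Conceptually, the wrapper looks up the history for the given `session_id`, passes it to the chain under `chat_history`, and records the new question and answer once the call returns. A simplified pure-Python model of that bookkeeping is sketched here; `with_history`, `toy_store`, and `toy_chain` are hypothetical, and the real `RunnableWithMessageHistory` handles more cases than this.

```python
from typing import Callable, Dict, List, Tuple

Message = Tuple[str, str]  # (role, content)
toy_store: Dict[str, List[Message]] = {}


def with_history(chain: Callable[[dict], dict]) -> Callable[[dict, str], dict]:
    """Wrap a chain so each call reads, then extends, its session's history."""
    def invoke(inputs: dict, session_id: str) -> dict:
        history = toy_store.setdefault(session_id, [])
        # The chain sees a snapshot of the history as it was before this turn
        result = chain({**inputs, "chat_history": list(history)})
        # Only after the call are the new question and answer recorded
        history.append(("human", inputs["input"]))
        history.append(("ai", result["answer"]))
        return result
    return invoke


def toy_chain(inputs: dict) -> dict:
    """Hypothetical stand-in for rag_chain: echo the inputs and add an answer."""
    return {**inputs, "answer": f"answer to: {inputs['input']}"}


chat = with_history(toy_chain)
first = chat({"input": "first question"}, "session-a")
second = chat({"input": "follow-up"}, "session-a")
other = chat({"input": "unrelated"}, "session-b")
print(len(first["chat_history"]), len(second["chat_history"]), len(other["chat_history"]))
```

In this model the first call in a session sees an empty history, each later call sees every earlier turn, and separate session IDs never share messages.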

In [30]:
result = conversational_rag_chain.invoke(
    {"input": "What are the new rules for kickoffs in the NFL for the 2024 season?"},
    config={
        "configurable": {"session_id": "nfl_rules_2024"}
    }
)
print(f"🔵 Chat History: {result['chat_history']}")
print(f"🟠 Answer: {result['answer']}")

🔵 Chat History: []
🟠 Answer: The new rule for kickoffs in the 2024 NFL season is that after the kick, the kicker may not cross the vicinity of the 50-yard line until the ball touches the ground or a player. Additionally, there is a new toss option for both halves and overtime, and a loss of 15 yards from the spot of the kickoff for the first half.


In [31]:
result_2 = conversational_rag_chain.invoke(
    {"input": "How do these new kickoff rules affect onside kick attempts?"},
    config={
        "configurable": {"session_id": "nfl_rules_2024"}
    }
)
print(f"🔵 Chat History: {result_2['chat_history']}")
print(f"🟠 Answer: {result_2['answer']}")

🔵 Chat History: [HumanMessage(content='What are the new rules for kickoffs in the NFL for the 2024 season?'), AIMessage(content='The new rule for kickoffs in the 2024 NFL season is that after the kick, the kicker may not cross the vicinity of the 50-yard line until the ball touches the ground or a player. Additionally, there is a new toss option for both halves and overtime, and a loss of 15 yards from the spot of the kickoff for the first half.')]
🟠 Answer: The new kickoff rules state that if an onside kick goes untouched beyond the onside kick setup zone, the kicking team will be penalized 15 yards from the kicking team's restraining line and the receiving team will take possession. This will make onside kick attempts more difficult for the kicking team.


In [32]:
result_3 = conversational_rag_chain.invoke(
    {"input": "Given these changes, what strategies might teams adopt for late-game situations when they're trailing?"},
    config={
        "configurable": {"session_id": "nfl_rules_2024"}
    }
)
print(f"🔵 Chat History: {result_3['chat_history']}")
print(f"🟠 Answer: {result_3['answer']}")

🔵 Chat History: [HumanMessage(content='What are the new rules for kickoffs in the NFL for the 2024 season?'), AIMessage(content='The new rule for kickoffs in the 2024 NFL season is that after the kick, the kicker may not cross the vicinity of the 50-yard line until the ball touches the ground or a player. Additionally, there is a new toss option for both halves and overtime, and a loss of 15 yards from the spot of the kickoff for the first half.'), HumanMessage(content='How do these new kickoff rules affect onside kick attempts?'), AIMessage(content="The new kickoff rules state that if an onside kick goes untouched beyond the onside kick setup zone, the kicking team will be penalized 15 yards from the kicking team's restraining line and the receiving team will take possession. This will make onside kick attempts more difficult for the kicking team.")]
🟠 Answer: Teams may choose to declare an onside kick if they are trailing in the fourth period, providing an opportunity to regain posse

In [33]:
result_4 = conversational_rag_chain.invoke(
    {"input": "How do the 2024 rule changes for kickoffs and late-game strategies impact player safety, especially for special teams players?"},
    config={
        "configurable": {"session_id": "nfl_rules_2024"}
    }
)
print(f"🔵 Chat History: {result_4['chat_history']}")
print(f"🟠 Answer: {result_4['answer']}")

🔵 Chat History: [HumanMessage(content='What are the new rules for kickoffs in the NFL for the 2024 season?'), AIMessage(content='The new rule for kickoffs in the 2024 NFL season is that after the kick, the kicker may not cross the vicinity of the 50-yard line until the ball touches the ground or a player. Additionally, there is a new toss option for both halves and overtime, and a loss of 15 yards from the spot of the kickoff for the first half.'), HumanMessage(content='How do these new kickoff rules affect onside kick attempts?'), AIMessage(content="The new kickoff rules state that if an onside kick goes untouched beyond the onside kick setup zone, the kicking team will be penalized 15 yards from the kicking team's restraining line and the receiving team will take possession. This will make onside kick attempts more difficult for the kicking team."), HumanMessage(content="Given these changes, what strategies might teams adopt for late-game situations when they're trailing?"), AIMessag