### Conversational Retrieval Chain

<b>Goal:</b> Make your RAG pipeline interactive and memory-aware

<b>Key Concepts:</b>
- Conversational Retrieval
- Chat Memory
- Answer grounding
- Source attribution

In [1]:
import os
from dotenv import load_dotenv
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma
from langchain_groq import ChatGroq
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory

In [2]:
# Load environment variables
load_dotenv()

True

### Load the docs

In [3]:
# Directory containing PDFs
pdf_dir = "policies"

# Load all PDFs
loaders = [PyPDFLoader(os.path.join(pdf_dir, file)) for file in os.listdir(pdf_dir) if file.endswith(".pdf")]

# Load and combine documents
documents = []
for loader in loaders:
    documents.extend(loader.load())

print(f"✅ Loaded {len(documents)} pages from {len(loaders)} documents:")
for file in os.listdir(pdf_dir):
    if file.endswith(".pdf"):
        print("-", file)

✅ Loaded 15 pages from 3 documents:
- Leave-Policy.pdf
- posh-policy.pdf
- Salary_Policy.pdf


### Splitting into chunks

In [4]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = text_splitter.split_documents(documents)

print(f"✅ Split into {len(chunks)} text chunks.")

✅ Split into 37 text chunks.


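Each “chunk” is a passage of up to 1,000 characters that is embedded and retrieved as a single unit.
The 200-character overlap means text near a chunk boundary also appears in the neighbouring chunk. Keeping chunks small prevents token overflow and improves retrieval precision.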
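
To make the interaction between chunk_size and chunk_overlap concrete, here is a minimal pure-Python sketch of fixed-window chunking. It is a simplification: RecursiveCharacterTextSplitter also tries to break on separators such as paragraph and line breaks, which this sketch ignores. The function name `sliding_chunks` and the sample string are hypothetical, chosen only for illustration.

```python
def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size windows that share `chunk_overlap` characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


# Tiny numbers so the overlap is easy to see by eye
sample = "abcdefghijklmnopqrstuvwxyz"
demo_chunks = sliding_chunks(sample, chunk_size=10, chunk_overlap=4)
for chunk in demo_chunks:
    print(chunk)
```

Each printed window repeats the last four characters of the previous one, which is the same idea as the 200-character overlap used above, just at a smaller scale.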

### Create Vector Store (Chroma) 

In [5]:
from langchain_community.embeddings import HuggingFaceEmbeddings

# Using a free open-source embedding model to avoid OpenAI key issues
embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

# Build a Chroma vector database persisted in chroma_db (re-running this cell may add the same chunks again)
vectorstore = Chroma.from_documents(chunks, embedding=embeddings, persist_directory="chroma_db")

# Initialize retriever
retriever = vectorstore.as_retriever(search_kwargs={"k": 1})
print("✅ Vector store created and retriever ready.")


✅ Vector store created and retriever ready.


"k": 1 makes the retriever return only the single most similar chunk, which keeps the prompt small.<br>
If you set k=3, it returns the 3 most similar chunks, which gives the model more context but increases prompt size.
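
The effect of k can be sketched without any vector database. The example below ranks a few toy vectors by cosine similarity and keeps the top k. Everything in it is an assumption made for illustration: the helper names `cosine` and `top_k`, the three-dimensional vectors, and the choice of cosine similarity (the metric Chroma actually uses depends on how the collection is configured).

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, indexed_chunks, k):
    """Return the k (text, score) pairs most similar to the query vector."""
    scored = [(text, cosine(query_vec, vec)) for text, vec in indexed_chunks]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional "embeddings"; all-MiniLM-L6-v2 produces 384 dimensions.
toy_index = [
    ("Casual leave rules", [0.9, 0.1, 0.0]),
    ("Salary structure", [0.1, 0.9, 0.1]),
    ("Complaint procedure", [0.0, 0.2, 0.9]),
]
query_vec = [1.0, 0.2, 0.0]  # pretend embedding of a question about leave

top1 = top_k(query_vec, toy_index, k=1)  # one chunk goes into the prompt
top3 = top_k(query_vec, toy_index, k=3)  # three chunks, so a longer prompt

print([text for text, _ in top1])
print([text for text, _ in top3])
```

With k=1 only the best match is passed to the LLM. With k=3 the lower-ranked chunks are included too, even when they are only weakly related to the question.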

### Set up LLM

In [6]:
llm = ChatGroq(
    model="llama-3.1-8b-instant", 
    api_key=os.getenv("GROQ_API_KEY")
)

print("✅ Connected to Groq LLM successfully.")

✅ Connected to Groq LLM successfully.


### Creating Conversational Memory

In [10]:
memory = ConversationBufferMemory(
    memory_key="chat_history",
    return_messages=True,
    output_key="answer"
)
print("✅ Conversation memory initialized.")

✅ Conversation memory initialized.


This memory stores past user–AI exchanges.

Without it, the system would treat every question as a new, unrelated query.

Think of it as the “chat log” behind your assistant’s responses.
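
The “chat log” idea can be sketched in a few lines of plain Python. The class below is a hypothetical stand-in, not the ConversationBufferMemory implementation: it only shows that earlier turns are stored and then supplied alongside a follow-up question. In the real chain, the stored history is used to rephrase a follow-up into a standalone question before retrieval, and that step is not reproduced here.

```python
class SimpleBufferMemory:
    """Hypothetical stand-in that keeps every turn of the conversation in a list."""

    def __init__(self):
        self.turns = []  # list of (question, answer) tuples

    def save(self, question, answer):
        self.turns.append((question, answer))

    def as_transcript(self):
        lines = []
        for question, answer in self.turns:
            lines.append(f"Human: {question}")
            lines.append(f"Assistant: {answer}")
        return "\n".join(lines)


toy_memory = SimpleBufferMemory()
toy_memory.save("What is Casual Leave?", "A short paid leave for personal reasons.")
follow_up = "How many days can I take?"

# With memory, the follow-up question is sent together with the earlier exchange.
prompt_with_history = f"{toy_memory.as_transcript()}\nHuman: {follow_up}"
print(prompt_with_history)
```

Without the first two lines of that prompt, "How many days can I take?" gives the model no way to know the question is about casual leave.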

### Building the Conversational Retrieval Chain

In [11]:
qa_chain = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=retriever,
    memory=memory,
    return_source_documents=True
)

print("✅ Conversational Retrieval Chain is ready.")

✅ Conversational Retrieval Chain is ready.


In [None]:
print("\n💬 Conversational RAG Assistant Ready!")
print("Type 'exit' to quit.\n")

chat_history = []  # optional, if you want to keep record manually

while True:
    query = input("You: ").strip()
    if query.lower() in ("exit", "quit"):
        print("👋 Goodbye!")
        break

    # ✅ print user query explicitly (so it's visible in notebook output)
    print(f"\n🧑‍💻 You asked: {query}\n")

    try:
        # run retrieval chain
        result = qa_chain.invoke({"question": query})
        answer = result.get("answer", "(no answer found)")

        # save to chat history (optional)
        chat_history.append({"user": query, "assistant": answer})

        # ✅ print assistant's answer
        print("🤖 Assistant:", answer)

        # ✅ print sources (if available)
        if "source_documents" in result:
            print("\n📚 Sources:")
            for doc in result["source_documents"]:
                print(f"- {doc.metadata.get('source', 'Unknown file')}")
        print()

    except Exception as e:
        print(f"⚠️ Error: {e}")


💬 Conversational RAG Assistant Ready!
Type 'exit' to quit.



You:  What is Casual Leave

🧑‍💻 You asked: What is Casual Leave

🤖 Assistant: Casual Leave is a type of leave that an employee can take when they need a day off for personal or miscellaneous reasons, not related to illness or injury. It is a paid leave, but has certain limitations and rules. Casual Leave can be taken for a minimum of 0.5 days to a maximum of 3 days, and any remaining unused leave lapses at the end of the year.

📚 Sources:
- policies\Leave-Policy.pdf



You:  What is the step to register complaints

🧑‍💻 You asked: What is the step to register complaints

🤖 Assistant: You can register complaints through the following methods:

1. Email: Submit your complaint to hrteam@BBIL Systems.com.
2. In-person discussion: You can discuss your complaint with any member of the Internal Committee mentioned in the policy within 3 months of the occurrence of the act of Sexual Harassment.

📚 Sources:
- policies\posh-policy.pdf



You:  What is the salary document about?

🧑‍💻 You asked: What is the salary document about?

🤖 Assistant: The salary document appears to be a public version of a salary policy. It is likely a guide or a set of rules that outlines how salaries are structured, paid, and managed within an entity. The document is part of a larger document set, titled "Talent Management and Culture" with a code of GT-02 and a revision number of 03.

📚 Sources:
- policies\Salary_Policy.pdf
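
With k=1 each answer cites a single file, as in the transcript above. If k is raised, the same file can appear several times in the source list. The sketch below shows one possible way to collapse duplicates and report page numbers. The helper `format_sources` and the example metadata are hypothetical. The sketch assumes each document's metadata carries a "source" path and, when available, a zero-based "page" index, which is how PyPDFLoader normally populates it.

```python
def format_sources(metadatas):
    """Collapse repeated files and list the pages cited from each one."""
    pages_by_file = {}
    for meta in metadatas:
        source = meta.get("source", "Unknown file")
        pages = pages_by_file.setdefault(source, set())
        if "page" in meta:
            pages.add(meta["page"] + 1)  # assumes a zero-based page index
    lines = []
    for source, pages in pages_by_file.items():
        if pages:
            page_list = ", ".join(str(p) for p in sorted(pages))
            lines.append(f"- {source} (pages: {page_list})")
        else:
            lines.append(f"- {source}")
    return lines


# Hypothetical metadata, shaped like doc.metadata for three retrieved chunks
example_metadata = [
    {"source": "policies/Leave-Policy.pdf", "page": 1},
    {"source": "policies/Leave-Policy.pdf", "page": 3},
    {"source": "policies/posh-policy.pdf"},
]
source_lines = format_sources(example_metadata)
print("\n".join(source_lines))
```

Inside the chat loop, this could replace the per-document print with `format_sources([doc.metadata for doc in result["source_documents"]])`.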

