# RAG (Retrieval-Augmented Generation)

RAG is a technique in which a language model first retrieves relevant information from an external source (such as a database or document store) and then generates a response grounded in that information.

## Simple Breakdown:
1. **Retrieval** – The system searches for relevant documents or text from a knowledge base.  
2. **Augmentation** – The retrieved information is given to the AI model as extra context.  
3. **Generation** – The AI uses this information to generate a well-informed response.  
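
The three steps above can be sketched in plain Python. This is only a toy illustration: the knowledge base, the word-overlap scoring, and the `generate` stub are all assumptions made for the example, and a real system would use embeddings for retrieval and an actual LLM call for generation.

```python
import re

# Toy knowledge base; in a real system these would be chunks from your documents.
KNOWLEDGE_BASE = [
    "A breach of contract may lead to damages awarded by a court.",
    "The warranty period for the product is two years.",
    "Employees accrue fifteen days of paid leave per year.",
]

def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(question, documents, k=1):
    """Retrieval: rank documents by how many words they share with the question."""
    q_words = tokenize(question)
    ranked = sorted(documents, key=lambda d: len(q_words & tokenize(d)), reverse=True)
    return ranked[:k]

def augment(question, context_docs):
    """Augmentation: place the retrieved text in the prompt as extra context."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

def generate(prompt):
    """Generation: stand-in for an LLM call (hypothetical stub, no model is contacted)."""
    return f"[model response to a prompt of {len(prompt)} characters]"

question = "What happens after a breach of contract?"
docs = retrieve(question, KNOWLEDGE_BASE)
prompt = augment(question, docs)
print(prompt)
print(generate(prompt))
```

The point of the sketch is the data flow: the model never sees the whole knowledge base, only the few passages that retrieval selected for this question.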

## Why Use RAG?
- **Improves accuracy** by providing up-to-date and relevant facts.  
- **Reduces hallucination** (wrong or made-up answers).  
- **Extends the model beyond its training data** by pulling in current or domain-specific knowledge at query time.  

## Example:
Imagine you have an AI chatbot for legal advice:  

- The user asks: *"What is the penalty for breaking a contract?"*  
- The AI **retrieves** relevant laws and case studies.  
- It **augments** the prompt with this data, so the model sees the retrieved text as context.  
- It **generates** a response like:  

  *"According to XYZ law, breaking a contract may result in fines or legal action depending on the terms stated."*


In [2]:
import os
from getpass import getpass

# Never hardcode an API key in a notebook: anyone who sees the file can use it.
# Read the key from the environment, or prompt for it if it is not set.
if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = getpass("OpenAI API key: ")

## **RAG-based Legal Document Analysis system**

## Knowledge Base Query Engine
- **Description:**

Build a query engine that lets users ask questions about a specific knowledge base (for example, company policies or research papers). The engine retrieves the relevant passages and generates answers from them.

### **LangChain Features Used:**

- Document loaders to ingest and preprocess knowledge base data.
- Retrieval-augmented generation (RAG) for answering questions.
- Memory to keep the conversation history, so follow-up questions are interpreted in context.

###  *Example Workflow:*
- The user asks a question about the knowledge base.
- The engine retrieves relevant documents and generates an answer.
- The user can ask follow-up questions to refine the results.



In [3]:
# pip install langchain langchain-community langchain-openai openai faiss-cpu tiktoken

In [4]:
# pip install pypdf

# Alternatively, set the key in your shell before launching the notebook:
# export OPENAI_API_KEY="your_api_key_here"

In [19]:
# Ingest and Preprocess the Knowledge Base
# Loaders live in the langchain-community package in recent LangChain releases
from langchain_community.document_loaders import DirectoryLoader, TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import PyPDFLoader, UnstructuredWordDocumentLoader

# Load documents from the directory
# loader = DirectoryLoader('knowledge_base/', glob="**/*.txt", loader_cls=TextLoader)
# documents = loader.load()

# Load documents from the PDF files
documents = PyPDFLoader(r'C:\Users\Abcom\Desktop\LangChain_Course\Indian_Constitution.pdf').load()
print(documents[0].page_content)

# Split documents into smaller chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = text_splitter.split_documents(documents)

print(f"Loaded {len(chunks)} chunks of text.")

THE CONSTITUTION OF INDIA 
[As on       May, 2022] 
2022
Loaded 1199 chunks of text.
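
To make `chunk_size` and `chunk_overlap` concrete, here is a simplified sliding-window splitter. It is only a sketch of the idea: `split_with_overlap` is a hypothetical helper, and `RecursiveCharacterTextSplitter` itself works differently in detail because it prefers to break on separators such as paragraphs and sentences rather than at fixed character offsets.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Split text into fixed-size character windows that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # How far each new window advances
    pieces = []
    for start in range(0, len(text), step):
        pieces.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # The last window already reaches the end of the text
    return pieces

sample = "abcdefghijklmnopqrstuvwxyz"
chunks_demo = split_with_overlap(sample, chunk_size=10, chunk_overlap=4)
for piece in chunks_demo:
    print(piece)
```

Each window repeats the last four characters of the previous one. The overlap exists so that a sentence cut at a chunk boundary still appears whole in at least one chunk.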


In [6]:
# Create a vector Database
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS  # Requires the langchain-community package

# Generate embeddings for the chunks
embeddings = OpenAIEmbeddings()

# Store embeddings in a FAISS vector database
vector_store = FAISS.from_documents(chunks, embeddings)

# Save the vector store locally (optional)
vector_store.save_local("faiss_index")
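
Conceptually, a vector store keeps one embedding vector per chunk and answers a query by returning the chunks whose vectors lie closest to the query's vector. The sketch below shows that idea with a brute-force search. The 3-dimensional vectors and labels are made up for illustration (real embeddings have hundreds or thousands of dimensions), Euclidean distance is used as one common choice of metric, and FAISS relies on optimized index structures rather than a Python loop.

```python
import math

def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def top_k(query_vec, indexed, k):
    """Return the k (text, vector) entries closest to query_vec."""
    return sorted(indexed, key=lambda item: l2_distance(query_vec, item[1]))[:k]

# Hypothetical index: each entry pairs a chunk label with a made-up embedding.
toy_index = [
    ("Fundamental rights", [0.9, 0.1, 0.0]),
    ("Union budget", [0.1, 0.9, 0.1]),
    ("Right to equality", [0.8, 0.2, 0.1]),
]
query_vector = [1.0, 0.0, 0.0]
nearest = [text for text, _ in top_k(query_vector, toy_index, k=2)]
print(nearest)
```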

In [8]:
# Implement Retrieval-Augmented Generation (RAG)
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationalRetrievalChain
from langchain_openai import ChatOpenAI

# Load the vector store saved earlier. allow_dangerous_deserialization=True
# unpickles the index, so only load files you created yourself.
vector_store = FAISS.load_local("faiss_index", embeddings, allow_dangerous_deserialization=True)

# Create a retriever that returns the 5 most similar chunks
retriever = vector_store.as_retriever(search_type="similarity", search_kwargs={"k": 5})

# Initialize the LLM (temperature=0 keeps answers as repeatable as possible)
llm = ChatOpenAI(model_name="gpt-4o", temperature=0)

# Add memory so follow-up questions see the conversation so far.
# output_key="answer" tells the memory which output to store, because the
# chain also returns source_documents.
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True, output_key="answer")

# Create the conversational RAG chain
qa_chain_with_memory = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=retriever,
    memory=memory,
    return_source_documents=True
)
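
A conversational retrieval chain typically rewrites each follow-up into a standalone question before searching, because a query like "What does it guarantee?" retrieves poorly on its own. The sketch below only builds such a rewriting prompt. The helper names, the prompt wording, and the sample turn are assumptions for illustration, not LangChain's actual internals.

```python
def format_history(turns):
    """Render (question, answer) pairs as plain text for a prompt."""
    return "\n".join(f"Human: {q}\nAssistant: {a}" for q, a in turns)

def build_condense_prompt(turns, follow_up):
    """Build a prompt asking a model to rewrite follow_up as a standalone question."""
    return (
        "Given the conversation below, rephrase the follow-up question "
        "so that it can be understood on its own.\n\n"
        f"Conversation:\n{format_history(turns)}\n\n"
        f"Follow-up question: {follow_up}\n"
        "Standalone question:"
    )

# Made-up conversation turn, used only to show the prompt's shape.
history_demo = [("Which article covers equality before law?", "Article 14.")]
condense_prompt = build_condense_prompt(history_demo, "What does it guarantee?")
print(condense_prompt)
```

The rewritten question, not the raw follow-up, is what gets embedded and sent to the retriever.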


In [9]:
def query_engine():
    """Interactive loop: read a question, run the RAG chain, print answer and sources."""
    print("Welcome to the Knowledge Base Query Engine!")

    while True:
        query = input("\nAsk a question (or type 'exit' to quit): ").strip()
        if query.lower() == "exit":
            print("Goodbye!")
            break
        if not query:
            continue  # Ignore empty input

        # Retrieve and generate an answer. The chain's memory supplies
        # chat_history, so only the question needs to be passed in.
        result = qa_chain_with_memory.invoke({"question": query})

        answer = result["answer"]  # This chain returns "answer", not "result"
        source_documents = result["source_documents"]

        print("\nAnswer:")
        print(answer)

        print("\nSource Documents:")
        for i, doc in enumerate(source_documents):
            print(f"{i + 1}. {doc.metadata['source']}")

if __name__ == "__main__":
    query_engine()


Welcome to the Knowledge Base Query Engine!

Answer:
Hello! How can I assist you today?

Source Documents:
1. C:\Users\Abcom\Desktop\LangChain_Course\Indian_Constitution.pdf
2. C:\Users\Abcom\Desktop\LangChain_Course\Indian_Constitution.pdf
3. C:\Users\Abcom\Desktop\LangChain_Course\Indian_Constitution.pdf
4. C:\Users\Abcom\Desktop\LangChain_Course\Indian_Constitution.pdf
5. C:\Users\Abcom\Desktop\LangChain_Course\Indian_Constitution.pdf

Answer:
Hello! I can help answer questions, provide information, assist with problem-solving, and more. How can I assist you today?

Source Documents:
1. C:\Users\Abcom\Desktop\LangChain_Course\Indian_Constitution.pdf
2. C:\Users\Abcom\Desktop\LangChain_Course\Indian_Constitution.pdf
3. C:\Users\Abcom\Desktop\LangChain_Course\Indian_Constitution.pdf
4. C:\Users\Abcom\Desktop\LangChain_Course\Indian_Constitution.pdf
5. C:\Users\Abcom\Desktop\LangChain_Course\Indian_Constitution.pdf
Goodbye!
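
The output above lists the same file path five times because every retrieved chunk comes from the same PDF. `PyPDFLoader` also records a `page` entry in each chunk's metadata, so the sources can be grouped by file and page instead. The helper below is a sketch: `summarize_sources` is a hypothetical name, and the page numbers in the demo data are made up.

```python
from collections import OrderedDict

def summarize_sources(metadatas):
    """Group page numbers by source file, preserving first-seen order."""
    pages_by_source = OrderedDict()
    for meta in metadatas:
        pages = pages_by_source.setdefault(meta.get("source", "unknown"), [])
        page = meta.get("page")
        if page is not None and page not in pages:
            pages.append(page)
    return pages_by_source

# Stand-in for [doc.metadata for doc in source_documents]; values are invented.
demo_metadata = [
    {"source": "Indian_Constitution.pdf", "page": 31},
    {"source": "Indian_Constitution.pdf", "page": 12},
    {"source": "Indian_Constitution.pdf", "page": 31},
]
for source, pages in summarize_sources(demo_metadata).items():
    print(f"{source}: pages {sorted(pages)}")
```

In the query loop, this would be called as `summarize_sources(doc.metadata for doc in source_documents)`.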
