# RAG System with Qdrant and Gemini

This notebook implements a complete Retrieval Augmented Generation (RAG) system using:
- **Qdrant** (in Docker) as the vector database
- **SentenceTransformers** for embeddings
- **Gemini** as the LLM
- **LangChain** to connect everything

## 2. Import Libraries and Setup

In [None]:
import os
import time
from dotenv import load_dotenv
import google.generativeai as genai
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain_community.embeddings import SentenceTransformerEmbeddings
from langchain_community.vectorstores import Qdrant
from langchain.prompts import PromptTemplate
from langchain.chains import RetrievalQA
from langchain_google_genai import ChatGoogleGenerativeAI
from qdrant_client import QdrantClient
from qdrant_client.http.models import Distance, VectorParams

## 3. Configure Environment

Set up the Google API key for Gemini. Store it in a `.env` file as `GOOGLE_API_KEY`, or assign it directly in the cell below.

In [None]:
# Load environment variables
load_dotenv()

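# Set Google API key
GOOGLE_API_KEY = os.getenv("GOOGLE_API_KEY")
if not GOOGLE_API_KEY:
    # Uncomment and input your key if not using .env
    # GOOGLE_API_KEY = "your_google_api_key_here"
    pass

# Fail early with a clear message; assigning None to os.environ raises a TypeError
if not GOOGLE_API_KEY:
    raise ValueError("GOOGLE_API_KEY is not set. Add it to .env or assign it above.")

os.environ["GOOGLE_API_KEY"] = GOOGLE_API_KEY
genai.configure(api_key=GOOGLE_API_KEY)

print("✅ Google API key configured")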

## 4. Set Up Embedding Model

We'll use the SentenceTransformers library with the `all-MiniLM-L6-v2` model, which produces 384-dimensional embeddings.

In [None]:
# Set up the embedding model
embeddings = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")
embedding_dimension = embeddings.client.get_sentence_embedding_dimension()
print(f"✅ Encoder ready, dim = {embedding_dimension}")

## 5. Connect to Qdrant and Create Collection

Connect to the Qdrant server running in Docker and set up the vector collection.

In [None]:
# Connect to Qdrant Docker container
qdrant_client = QdrantClient(host="localhost", port=6333)
collection_name = "pdf_documents"

# Start from a clean slate: if the collection already exists, delete it before recreating it
collections = qdrant_client.get_collections().collections
collection_names = [collection.name for collection in collections]

if collection_name in collection_names:
    # Delete collection if it exists
    qdrant_client.delete_collection(collection_name=collection_name)
    print(f"🗑️ Deleted existing collection: {collection_name}")

# Create new collection
qdrant_client.create_collection(
    collection_name=collection_name,
    vectors_config=VectorParams(size=embedding_dimension, distance=Distance.COSINE),
)
print(f"✅ Created new Qdrant collection: {collection_name}")
print("🌐 Qdrant dashboard available at: http://localhost:6333/dashboard")
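
`Distance.COSINE` tells Qdrant to rank vectors by the angle between them rather than by their length. As a rough illustration of that metric (a plain-Python sketch, not Qdrant's implementation; `cosine_similarity` is a hypothetical helper name and the vectors are toy values), the score can be computed like this:

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)

# Toy 3-dimensional vectors; the embeddings in this notebook have 384 dimensions.
query = [1.0, 0.0, 1.0]
same_direction = [2.0, 0.0, 2.0]   # same direction as the query, different length
orthogonal = [0.0, 1.0, 0.0]       # shares no component with the query

print(cosine_similarity(query, same_direction))  # close to 1: length is ignored
print(cosine_similarity(query, orthogonal))      # 0: no shared direction
```

Since only direction matters, two chunks of very different length can still score as near-identical if they discuss the same topic.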

## 6. Load and Split PDF Documents

We'll load PDFs from the `./pdfs` directory and split them into chunks of about 1,000 characters that overlap by up to 200 characters.

In [None]:
# Set up folder path and text splitter
folder_path = "./pdfs"  # folder containing your PDFs
loader_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=200)

# Check if folder exists
if not os.path.exists(folder_path):
    os.makedirs(folder_path)
    print(f"Created directory: {folder_path}")
    print("Please add your PDF files to this directory and run this cell again.")
else:
    # Load and process PDF files
    documents = []
    pdf_files = 0
    
    for fname in os.listdir(folder_path):
        if not fname.lower().endswith(".pdf"):
            continue
        pdf_files += 1
        path = os.path.join(folder_path, fname)
        print(f"Loading {fname}...")
        loader = PyPDFLoader(path)
        pages = loader.load_and_split(text_splitter=loader_splitter)
        documents.extend(pages)

    if pdf_files == 0:
        print(f"No PDF files found in {folder_path}. Please add PDF files and run this cell again.")
    else:
        print(f"✅ Loaded and split {len(documents)} chunks from {pdf_files} PDF files.")
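
To make `chunk_size` and `chunk_overlap` concrete, here is a simplified fixed-window chunker. It is only an illustration under stated assumptions: `CharacterTextSplitter` first splits on a separator and then merges the pieces, so its real output will differ, and `sliding_chunks` is a hypothetical helper rather than a LangChain function.

```python
def sliding_chunks(text, chunk_size, chunk_overlap):
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks

sample = "abcdefghijklmnopqrstuvwxyz"
chunks = sliding_chunks(sample, chunk_size=10, chunk_overlap=4)
for chunk in chunks:
    print(chunk)
```

Each window repeats the last four characters of the previous one, which is how overlap keeps a sentence that straddles a boundary retrievable from either chunk.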

## 7. Create Vector Store and Add Documents

We'll create embeddings for all document chunks and store them in Qdrant.

In [None]:
# Check if documents were loaded before proceeding
if 'documents' in locals() and len(documents) > 0:
    # Create vector store
    start_time = time.time()
    print("Creating embeddings and uploading to Qdrant...")
    
    vectorstore = Qdrant.from_documents(
        documents,
        embeddings,
        url="http://localhost:6333",
        collection_name=collection_name,
    )
    
    elapsed_time = time.time() - start_time
    print(f"✅ Uploaded {len(documents)} document chunks to Qdrant in {elapsed_time:.2f} seconds")
    print(f"🌐 View your collection at: http://localhost:6333/dashboard/#/collections/{collection_name}")
else:
    print("No documents loaded. Please run the previous cell successfully first.")
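
Conceptually, a query against the collection embeds the question and returns the stored chunks whose vectors score highest. The sketch below does this by brute force over an in-memory list. Qdrant relies on an approximate index instead, and the toy vectors, texts, and the `normalize` and `top_k` helpers are made up for illustration.

```python
import heapq
import math

def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]

def top_k(query_vec, stored, k=3):
    """Return the k (score, text) pairs most similar to query_vec.

    stored is a list of (text, vector) pairs.
    """
    q = normalize(query_vec)
    scored = []
    for text, vec in stored:
        v = normalize(vec)
        score = sum(a * b for a, b in zip(q, v))
        scored.append((score, text))
    # nlargest returns the pairs sorted from highest to lowest score
    return heapq.nlargest(k, scored, key=lambda pair: pair[0])

stored = [
    ("chunk about vector search", [0.9, 0.1, 0.0]),
    ("chunk about cooking", [0.0, 0.2, 0.9]),
    ("chunk about embeddings", [0.8, 0.3, 0.1]),
    ("chunk about gardening", [0.1, 0.9, 0.2]),
]

results = top_k([1.0, 0.2, 0.0], stored, k=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```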

## 8. Set Up the LLM and Prompt Template

We'll use Google's Gemini 2.0 Flash model (`gemini-2.0-flash`) and create a prompt template so that answers follow a consistent format.

In [None]:
# Set up the LLM (Gemini)
llm = ChatGoogleGenerativeAI(
    model="gemini-2.0-flash",
    google_api_key=GOOGLE_API_KEY,
    temperature=0.2,
    convert_system_message_to_human=True,
)

# Create the prompt template
template = """You are an expert assistant. Use the following context to answer the user's question.

Context:
{context}

Question:
{question}

Answer:
1. Summary:  
   Provide a succinct explanatory summary (1–4 sentences).

2. Key Points:  
   List the main supporting details in bullet form.

Example format:

1. Summary:  
   The primary purpose of vector databases is to store and query dense vector embeddings for similarity search.

2. Key Points:  
   - Vector databases offer fully managed services, eliminating infrastructure overhead.  
   - They support cosine and dot-product similarity metrics for fast nearest-neighbor retrieval.  
   - They integrate seamlessly with popular embedding libraries like SentenceTransformer.  
   - They provide automatic indexing to scale to millions of vectors.

Now, answer the question below following this format:
{question}
"""

prompt = PromptTemplate(
    input_variables=["context", "question"],
    template=template
)

print("✅ Prompt template created")

## 9. Create the RAG Chain

Now we'll connect all components to create our RAG system.

In [None]:
# Check if vectorstore exists
if 'vectorstore' in locals():
    # Create RetrievalQA chain
    retriever = vectorstore.as_retriever(search_kwargs={"k": 3})
    qa_chain = RetrievalQA.from_chain_type(
        llm=llm,
        chain_type="stuff",
        retriever=retriever,
        chain_type_kwargs={"prompt": prompt},
        return_source_documents=True
    )
    
    print("✅ RAG chain with Gemini is ready.")
else:
    print("Vectorstore not created. Please run previous cells successfully first.")
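
With `chain_type="stuff"`, the retrieved chunks are concatenated into a single string and substituted for `{context}` in the prompt before the LLM is called. A minimal sketch of that step, assuming a blank-line separator between chunks and using a shortened stand-in template (`build_stuff_prompt` is a hypothetical helper, not a LangChain function):

```python
def build_stuff_prompt(template, chunks, question, separator="\n\n"):
    """Join retrieved chunks into one context string and fill the prompt template."""
    context = separator.join(chunks)
    return template.format(context=context, question=question)

# Shortened stand-in for the notebook's prompt template
demo_template = "Context:\n{context}\n\nQuestion:\n{question}\n\nAnswer:"

# Made-up chunks standing in for the retriever's output
retrieved_chunks = [
    "Qdrant stores vectors in collections.",
    "Each collection has a fixed vector size and distance metric.",
]

filled = build_stuff_prompt(demo_template, retrieved_chunks, "What is a collection?")
print(filled)
```

Because every retrieved chunk goes into one prompt, `k` and `chunk_size` together determine how much text is sent to the model on each call.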

## 10. Test with Example Questions

Let's test our RAG system with some example questions.

In [None]:
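# Define a function to handle questions
def ask_question(question):
    # qa_chain is a notebook-level variable, so it lives in globals(), not in this function's locals()
    if 'qa_chain' not in globals():
        print("RAG chain not created. Please run previous cells successfully first.")
        return None
    
    # invoke() replaces the deprecated direct call qa_chain({...})
    result = qa_chain.invoke({"query": question})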
    
    print("\n" + "="*50)
    print(f"QUESTION: {question}")
    print("="*50)
    
    print("\nANSWER:")
    print(result["result"])
    
    return result

# Interactive question loop
def interactive_qa_session():
    print("\n" + "="*50)
    print("INTERACTIVE Q&A SESSION")
    print("Type 'exit', 'quit', or 'q' to end the session")
    print("="*50 + "\n")
    
    while True:
        user_question = input("\nEnter your question: ")
        
        # Check if user wants to exit
        if user_question.lower() in ['exit', 'quit', 'q']:
            print("\nEnding Q&A session. Goodbye!")
            break
        
        # Process the question
        ask_question(user_question)

# Run one example question, then start the interactive session.
# Without a chain there is nothing to query, so only report the problem.
if 'qa_chain' in globals():
    example_question = "What is the Anti-Corruption Layer (ACL) pattern?"
    result = ask_question(example_question)
    interactive_qa_session()
else:
    print("RAG chain not created. Please run previous cells successfully first.")
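
Because the chain is built with `return_source_documents=True`, each result also carries the retrieved chunks under `result["source_documents"]`, which the cell above never prints. The sketch below formats them for display. `format_sources` is a hypothetical helper, and it assumes each document exposes `page_content` and a `metadata` dict with the `source` and `page` keys that `PyPDFLoader` normally sets. Stand-in objects and a made-up file name are used so the example runs without the chain.

```python
from types import SimpleNamespace

def format_sources(source_documents, preview_chars=80):
    """Return one display line per retrieved chunk: index, location, short preview."""
    lines = []
    for i, doc in enumerate(source_documents, start=1):
        source = doc.metadata.get("source", "unknown source")
        page = doc.metadata.get("page")
        location = source if page is None else f"{source}, page index {page}"
        # Collapse whitespace so multi-line chunks fit on one line
        preview = " ".join(doc.page_content.split())[:preview_chars]
        lines.append(f"[{i}] {location}: {preview}")
    return lines

# Stand-ins for LangChain Document objects (hypothetical file name)
fake_docs = [
    SimpleNamespace(
        page_content="An anti-corruption layer\ntranslates between models.",
        metadata={"source": "./pdfs/patterns.pdf", "page": 3},
    ),
    SimpleNamespace(page_content="No metadata here.", metadata={}),
]

for line in format_sources(fake_docs):
    print(line)
```

In the notebook, the same helper could be called as `format_sources(result["source_documents"])` on the value returned by `ask_question`.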