# RAG System using LlamaIndex and LanceDB

This notebook demonstrates how to build a Retrieval-Augmented Generation (RAG) system for question answering on the jp-handbook-full.pdf document using:
- LlamaIndex for document processing and querying
- LanceDB as the vector store
- Google Gemini models for answer generation and embeddings

## Install Required Dependencies

First, let's install the necessary libraries if they aren't already installed.

In [1]:
# Install required packages
!pip install llama-index llama-index-readers-file llama-index-vector-stores-lancedb llama-index-llms-google-genai llama-index-embeddings-google-genai lancedb pypdf python-dotenv google-generativeai



## Import Libraries

Now let's import the necessary libraries for our RAG system.

In [2]:
# Import required libraries
import os
import lancedb
from dotenv import load_dotenv
from llama_index.core import SimpleDirectoryReader  # Updated import path
from llama_index.core import Settings, VectorStoreIndex, StorageContext  # Core imports
from llama_index.vector_stores.lancedb import LanceDBVectorStore
from llama_index.llms.google_genai import GoogleGenAI
from llama_index.embeddings.google_genai import GoogleGenAIEmbedding
from llama_index.core.node_parser import SentenceSplitter  # Updated import path

# Load environment variables from .env file
load_dotenv()

# Set up the Gemini models for LLM and embeddings
import google.generativeai as genai

# Get API key from environment variables
api_key = os.getenv("GOOGLE_API_KEY")
if not api_key:
    print("Warning: GOOGLE_API_KEY not found in environment variables.")
    print("Please create a .env file in the project root with your Google API key:")
    print("GOOGLE_API_KEY=your-api-key-here")
else:
    print("Google API key loaded successfully from environment variables.")
    genai.configure(api_key=api_key)

    # Set up the LLM using gemini-2.0-flash
    llm = GoogleGenAI(model="gemini-2.0-flash", temperature=0.1)

    # Set up the embedding model using text-embedding-004
    embed_model = GoogleGenAIEmbedding(model_name="models/text-embedding-004", api_key=api_key)

    # Configure LlamaIndex to use these models
    Settings.llm = llm
    Settings.embed_model = embed_model

Google API key loaded successfully from environment variables.
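
The cell above only prints a warning when the key is missing, so later cells would fail with a less obvious error because `llm` and `embed_model` are never defined. A minimal sketch of a fail-fast alternative follows; `require_env` is a hypothetical helper introduced here for illustration, not part of python-dotenv or LlamaIndex.

```python
import os


def require_env(name):
    """Return the value of environment variable `name`, or raise a clear error.

    Hypothetical helper for illustration only.
    """
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set. Add a line such as {name}=your-api-key-here "
            "to the .env file in the project root."
        )
    return value
```

Calling `api_key = require_env("GOOGLE_API_KEY")` right after `load_dotenv()` would stop the notebook at the first cell instead of at the first model call.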


## Load and Process the PDF Document

We'll load jp-handbook-full.pdf with SimpleDirectoryReader, which returns one Document object per PDF page. Splitting into chunks happens in the next section.

In [3]:
# Define the path to the PDF document
pdf_path = '../jp-handbook-full.pdf'

# Stop early if the file is missing instead of failing inside the reader
if not os.path.exists(pdf_path):
    raise FileNotFoundError(f"Document not found at {pdf_path}")
print(f"Loading document: {pdf_path}")

# Load the document using SimpleDirectoryReader (one Document per PDF page)
documents = SimpleDirectoryReader(input_files=[pdf_path]).load_data()
print(f"Loaded {len(documents)} document(s)")

Loading document: ../jp-handbook-full.pdf
Loaded 132 document(s)


## Parse the Document into Nodes

Split the page-level documents into chunks of up to 1024 tokens with a 200-token overlap, so that text near a chunk boundary appears in both neighboring chunks.

In [4]:
# Create a node parser to split the document into chunks
node_parser = SentenceSplitter(chunk_size=1024, chunk_overlap=200)

# Parse the document into nodes
nodes = node_parser.get_nodes_from_documents(documents)
print(f"Document split into {len(nodes)} nodes")

Document split into 132 nodes
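
The node count equals the page count here, which suggests that each page fit within a single 1024-token chunk. To see what `chunk_size` and `chunk_overlap` do when a text is longer than one chunk, here is a simplified sketch of a sliding window over words. It is an illustration only: `split_with_overlap` is a hypothetical function, and the real `SentenceSplitter` counts tokens and prefers to break at sentence boundaries.

```python
def split_with_overlap(words, chunk_size, chunk_overlap):
    """Split a list of words into windows of chunk_size sharing chunk_overlap words.

    Simplified illustration; SentenceSplitter works on tokens and sentences.
    """
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + chunk_size])
        if start + chunk_size >= len(words):
            break  # this window already reaches the end of the text
    return chunks


# Ten words, windows of four, each sharing two words with its neighbor
words = [f"w{i}" for i in range(10)]
for chunk in split_with_overlap(words, chunk_size=4, chunk_overlap=2):
    print(chunk)
```

A larger overlap makes it less likely that a sentence is cut off from its context, at the cost of storing and embedding more duplicated text.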


## Set Up LanceDB Vector Store

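Create a LanceDB table to hold the node embeddings, then build the index. Building the index calls the embedding model for every node, so this cell needs network access to the Google API.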

In [5]:
# Create LanceDB database and table
db = lancedb.connect('./lancedb')
table_name = 'jp_handbook'

# Remove existing table if it exists
if table_name in db.table_names():
    db.drop_table(table_name)

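# Create a new LanceDB vector store pointing at the same database directory
# (LanceDBVectorStore takes the database location as `uri`)
vector_store = LanceDBVectorStore(uri='./lancedb', table_name=table_name)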
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Create an index from the nodes
index = VectorStoreIndex(nodes, storage_context=storage_context)
print("Vector index created successfully")

ServerError: 503 UNAVAILABLE. {'error': {'code': 503, 'message': 'The service is currently unavailable.', 'status': 'UNAVAILABLE'}}
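
A 503 UNAVAILABLE response signals a temporary condition on the service side, so rerunning the cell later often succeeds. One way to automate that is retrying with exponential backoff. The sketch below is a generic pattern under stated assumptions: `with_retries` is a hypothetical helper, not part of LlamaIndex or LanceDB, and it catches every exception for simplicity.

```python
import random
import time


def with_retries(fn, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on failure wait base_delay * 2**attempt plus jitter, then retry.

    Hypothetical helper for illustration only.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:  # narrow this to the API's error type in real use
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the original error
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
            print(f"Attempt {attempt + 1} failed ({exc}); retrying in {delay:.1f}s")
            sleep(delay)
```

In this notebook the call to wrap would be the index construction, for example `index = with_retries(lambda: VectorStoreIndex(nodes, storage_context=storage_context))`. The cells below depend on `index`, so they cannot run until this step succeeds.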

## Create a Query Engine

Set up a query engine to answer questions based on the indexed document.

In [None]:
# Create a query engine
query_engine = index.as_query_engine(similarity_top_k=5)
print("Query engine ready for questions")
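
`similarity_top_k=5` tells the query engine to retrieve the five chunks whose embeddings are most similar to the question's embedding. The idea can be sketched as a brute-force cosine-similarity ranking over toy vectors. This is an illustration of the concept only: `cosine_similarity` and `top_k` are hypothetical names, and LanceDB uses its own search implementation and distance metric.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunk_vecs, k):
    """Return the indices of the k chunk vectors most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), idx) for idx, vec in enumerate(chunk_vecs)]
    scored.sort(reverse=True)  # highest similarity first
    return [idx for _, idx in scored[:k]]


# Toy 2-D "embeddings"; real embedding vectors have hundreds of dimensions
chunk_vecs = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
print(top_k([1.0, 0.0], chunk_vecs, k=2))  # → [0, 2]
```

A larger `k` gives the LLM more context to draw on but also lengthens the prompt and may pull in less relevant chunks.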

## Ask Questions

Now you can ask questions about the jp-handbook-full.pdf document.

In [None]:
from IPython.display import Markdown, display


def ask_question(question):
    """Query the index and render the answer with source previews as Markdown."""
    try:
        response = query_engine.query(question)
        markdown_response = f"""
### Question
{question}

### Answer
{response}

### Sources
"""
        # Show the first 150 characters of each retrieved chunk
        for i, source_node in enumerate(response.source_nodes, start=1):
            content_preview = source_node.node.get_content()[:150].replace('\n', ' ').strip()
            markdown_response += f"{i}. {content_preview}...\n\n"

        display(Markdown(markdown_response))
        return response
    except Exception as e:
        error_message = f"""
### Question
{question}

### Error
```
{e}
```

If this is a RESOURCE_EXHAUSTED error, you've hit Google API quota limits. Try again later.
"""
        display(Markdown(error_message))
        return None

In [None]:
# Example questions - replace with your own questions about the handbook
question = "List the duties and services of a Justice of the Peace."
response = ask_question(question)

In [None]:
# Try another question
question = "List the procedures for arresting a criminal."
response = ask_question(question)

## Multiple Q&A

Run a batch of questions in sequence, pausing between calls to stay under the API rate limits.

In [None]:
import time

# List of questions to ask about the JP handbook
jp_questions = [
    "What are the main responsibilities of a Justice of the Peace?",
    "What legal documents can a Justice of the Peace certify or witness?",
    "What is the procedure for handling bail applications?",
    "What ethical considerations should a Justice of the Peace keep in mind?",
    "How does a Justice of the Peace handle conflicts of interest?",
    "What are the requirements to become a Justice of the Peace?",
    "What are the limitations on a Justice of the Peace's authority?",
    "How should a Justice of the Peace handle confidentiality issues?",
    "What training is required for a Justice of the Peace?",
    "What record-keeping responsibilities does a Justice of the Peace have?"
]

# Ask each question in turn, pausing between calls to avoid rate limiting
def ask_multiple_questions(questions_list, delay_seconds=1):
    print(f"Processing {len(questions_list)} questions with a {delay_seconds}-second delay between each...\n")
    responses = []
    for i, question in enumerate(questions_list):
        responses.append(ask_question(question))
        # No need to wait after the final question
        if i < len(questions_list) - 1:
            print(f"\nWaiting {delay_seconds} seconds before next question...")
            time.sleep(delay_seconds)

    print("All questions processed!")
    return responses

In [None]:
# Call the function to process all questions
# You can set a longer delay if you encounter rate limiting issues
responses = ask_multiple_questions(jp_questions)


## Interactive Q&A

You can use the cell below to interactively ask questions about the handbook.

In [None]:
# Interactive question answering
question = input("Ask a question about the handbook: ")
response = ask_question(question)

## Conclusion

This RAG system answers questions about jp-handbook-full.pdf by retrieving the five most similar chunks from LanceDB and passing them to Gemini as context. To adapt it, adjust `chunk_size` and `chunk_overlap` in the splitter or `similarity_top_k` in the query engine.