# RAG Pipeline Testing

This notebook tests the RAG pipeline components one at a time (embedding model, FAISS vector store, LLM wrapper) and then runs them together end to end.

## Setup & Imports

In [13]:
import sys
from pathlib import Path

# Add scripts directory to path
project_root = Path.cwd().parent
scripts_path = project_root / "scripts"
sys.path.insert(0, str(scripts_path))

In [14]:
# Import RAG components
from embedding import EmbeddingModel
from vector import FaissVectorStore
from llm_wrapper import generate_answer, DEFAULT_LLM_MODEL

## 1. Test Embedding Model

In [15]:
# Initialize embedding model
embedding_model = EmbeddingModel()

print(f"Model: {embedding_model.model_name}")
print(f"Embedding dimension: {embedding_model.dimension}")

Loading weights: 100%|██████████| 103/103 [00:00<00:00, 763.53it/s, Materializing param=pooler.dense.weight]                             
BertModel LOAD REPORT from: sentence-transformers/all-MiniLM-L6-v2
Key                     | Status     |  | 
------------------------+------------+--+-
embeddings.position_ids | UNEXPECTED |  | 

Notes:
- UNEXPECTED	:can be ignored when loading from different task/architecture; not ok if you expect identical arch.


Model: sentence-transformers/all-MiniLM-L6-v2
Embedding dimension: 384


In [16]:
# Test embedding generation
test_texts = [
    "What are the citizenship requirements in Nepal?",
    "How to register a company in Nepal?",
    "What is the punishment for cyber crime?"
]

embeddings = embedding_model.embed(test_texts)
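
The cell above computes `embeddings` but never inspects them. The following is a minimal sanity-check sketch, assuming `embed` returns one equal-length vector per input text (its return type is not shown in this notebook). `check_embeddings` and `cosine_similarity` are hypothetical helpers, and the toy vectors stand in for real model output.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def check_embeddings(vectors, n_texts, dimension):
    """Verify there is one vector per text and each has the expected length."""
    rows = [list(v) for v in vectors]
    if len(rows) != n_texts:
        raise ValueError(f"expected {n_texts} vectors, got {len(rows)}")
    for i, row in enumerate(rows):
        if len(row) != dimension:
            raise ValueError(f"vector {i} has length {len(row)}, expected {dimension}")
    return rows


# Toy vectors (assumption: placeholders, not real model output)
toy = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]]
rows = check_embeddings(toy, n_texts=3, dimension=3)
print(round(cosine_similarity(rows[0], rows[2]), 4))  # 0.7071
```

In this notebook the equivalent call would be `check_embeddings(embeddings, len(test_texts), embedding_model.dimension)`, provided the assumption about the return type holds.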

## 2. Test Vector Store

In [17]:
# Initialize vector store with paths relative to project root
index_path = project_root / "database" / "legal_faiss.index"
metadata_path = project_root / "database" / "legal_faiss_meta.json"

vector_store = FaissVectorStore(
    index_path=index_path,
    metadata_path=metadata_path
)

In [18]:
# Load the vector store (if index exists)
try:
    vector_store.load()
    print("Vector store loaded successfully!")
    print(f"Number of documents: {len(vector_store.metadata)}")
except FileNotFoundError as e:
    print(f"Vector store not found: {e}")
    print("You may need to build the index first using vector_store.build()")

Vector store loaded successfully!
Number of documents: 5900
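
The `try`/`except` in the cell above reports a missing index only after `load()` fails. A small pre-flight check can name exactly which file is absent. This is a sketch: `missing_files` is a hypothetical helper, not part of `FaissVectorStore`, and the demo uses an empty placeholder file in a temporary directory rather than the real index.

```python
import tempfile
from pathlib import Path


def missing_files(paths):
    """Return the paths that do not exist as regular files."""
    return [Path(p) for p in paths if not Path(p).is_file()]


# Demo in a temporary directory so nothing in the project is touched
with tempfile.TemporaryDirectory() as tmp:
    index_file = Path(tmp) / "legal_faiss.index"
    meta_file = Path(tmp) / "legal_faiss_meta.json"
    index_file.write_bytes(b"")  # placeholder only, not a real FAISS index
    demo_missing_names = [p.name for p in missing_files([index_file, meta_file])]

print(demo_missing_names)  # ['legal_faiss_meta.json']
```

In the notebook, `missing_files([index_path, metadata_path])` could run before `vector_store.load()` to decide whether the index needs to be built first.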


## 3. Test LLM Wrapper

In [19]:
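# Retrieve the top 3 chunks (the query is an assumption: the first test text)
results = vector_store.search(query=test_texts[0], embedding_model=embedding_model, top_k=3)

# Prepare context from search results
context_chunks = [result['text'] for result in results]

print(f"Number of context chunks: {len(context_chunks)}")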

Number of context chunks: 3


In [20]:
# Generate answer using LLM
question = "What are the requirements to obtain Nepali citizenship?"

try:
    answer = generate_answer(
        question=question,
        context_chunks=context_chunks,
        max_tokens=512,
        temperature=0.3
    )
    print(f"Question: {question}\n")
    print(f"Answer:\n{answer}")
except Exception as e:
    print(f"Error generating answer: {e}")

Question: What are the requirements to obtain Nepali citizenship?

Answer:
According to Section 8 of the Nepal Citizenship Act 2063 (2006), to obtain Nepali citizenship by descent, a person must:

1. File an application in the prescribed form.
2. Attach copies of the following documents:
   a. Nepalese Citizenship Certificate of descendants of relatives within three generations from paternal or maternal or self side (except for Nepalese female citizens married to a foreigner).
   b. Recommendation from the concerned Village Development or Municipality certifying the place of birth and relationship.

To obtain Nepali citizenship by virtue of birth, a person must file an application with the required documents as prescribed in the Act.
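
`generate_answer` takes the question and `context_chunks`, but how it combines them into a prompt is not shown in this notebook. The sketch below is one hypothetical way to do it, assuming numbered chunks and an instruction to answer only from the context. `build_prompt` and its wording are illustrative and are not the contents of `llm_wrapper`.

```python
def build_prompt(question, context_chunks):
    """Join numbered context chunks and the question into one prompt string."""
    # Drop empty or whitespace-only chunks before numbering
    cleaned = [c.strip() for c in context_chunks if c and c.strip()]
    if not cleaned:
        raise ValueError("at least one non-empty context chunk is required")
    numbered = "\n\n".join(
        f"[{i}] {chunk}" for i, chunk in enumerate(cleaned, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{numbered}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Placeholder chunks (assumption: not real retrieved text)
demo_prompt = build_prompt("What is X?", ["First chunk.", "  ", "Second chunk."])
print(demo_prompt)
```

Numbering the chunks makes it possible to ask the model to cite which chunk supports each statement, which helps when checking an answer against the retrieved text.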


## 4. Full RAG Pipeline Test

In [None]:
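def ask_legal_question(question: str, top_k: int = 5) -> tuple[str, list[dict]]:
    """Retrieve the top_k chunks for a question and generate an answer from them.

    Returns the answer together with the raw search results.
    """
    # Retrieve the most relevant chunks from the vector store
    results = vector_store.search(
        query=question,
        embedding_model=embedding_model,
        top_k=top_k
    )

    # Use the chunk text as context for the LLM
    context_chunks = [result['text'] for result in results]

    answer = generate_answer(
        question=question,
        context_chunks=context_chunks
    )

    return answer, results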
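
Because `ask_legal_question` reads `vector_store` and `embedding_model` from notebook globals and calls the LLM, it is hard to test in isolation. The sketch below shows the same retrieve-then-generate flow with both steps passed in as functions, so stubs can replace FAISS and the LLM. `run_rag`, `fake_search`, and `fake_answer` are hypothetical names, and the result dicts only mimic the `'text'` and `'metadata'` keys used in this notebook.

```python
def run_rag(question, search_fn, answer_fn, top_k=5):
    """Retrieve top_k results, then generate an answer from their text."""
    results = search_fn(question, top_k)
    context_chunks = [r["text"] for r in results]
    answer = answer_fn(question, context_chunks)
    return answer, results


def fake_search(query, top_k):
    """Stub retriever: returns top_k placeholder results."""
    return [
        {"text": f"chunk {i}", "metadata": {"source": f"doc{i}.txt"}}
        for i in range(top_k)
    ]


def fake_answer(question, context_chunks):
    """Stub generator: reports how many chunks it received."""
    return f"{len(context_chunks)} chunks used for: {question}"


demo_answer, demo_results = run_rag("test question", fake_search, fake_answer, top_k=2)
print(demo_answer)  # 2 chunks used for: test question
```

With the real components, the two callables could be thin lambdas around `vector_store.search(query=..., embedding_model=embedding_model, top_k=...)` and `generate_answer(question=..., context_chunks=...)`, which are the call shapes already used in this notebook.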

In [22]:
# Test the full pipeline with different questions
test_questions = [
    "What is the punishment for cyber crime in Nepal?",
    "How to register a company in Nepal?",
    "What are the fundamental rights in Nepal's constitution?"
]

for q in test_questions:
    print(f"Q: {q}")
    try:
        answer, sources = ask_legal_question(q)
        print(f"\nA: {answer}")
        print(f"\nSources: {[r['metadata'].get('source', 'Unknown') for r in sources]}")
    except Exception as e:
        print(f"Error: {e}")

Q: What is the punishment for cyber crime in Nepal?

A: I don't have enough information to answer this question. The provided context does not mention any specific punishment for cybercrime in Nepal.

Sources: ['Unknown', 'Unknown', 'Unknown', 'Unknown', 'Unknown']
Q: How to register a company in Nepal?

A: According to the provided context, to register a company in Nepal, a foreign company must submit the following documents to the Office, along with the application:

1. Permission obtained by the foreign company from the competent authority to carry on its business or transaction in Nepal.
2. Copies of the charter, certificate of incorporation, memorandum of association, articles of association of the company, and Nepalese translation thereof.
3. Full name, address of the registered office and principal place of business of the company, date of incorporation of the company, description of the paid-up capital and major objectives of such company.
4. Names, addresses of directors, mana