# RAG Query Engine Testing

This notebook tests the `RAGQueryEngine` class to check that it produces results comparable to the manual RAG implementation in notebook 002.

In [1]:
import os
import sys
from pathlib import Path

from rag_project.constants import (
    BOOKS_CHROMA_DB_DIR, 
    BOOKS_COLLECTION_NAME
)
from rag_project.query_data import create_rag_engine, RAGQueryEngine
from rag_project.rag_models import LLMConfig



# Initialize RAG Query Engine

Create and configure the engine with the `create_rag_engine` factory function.

In [2]:
# Create LLM configuration matching the manual implementation
llm_config = LLMConfig()

# Initialize RAG Query Engine
engine = create_rag_engine(
    chroma_path=BOOKS_CHROMA_DB_DIR,
    collection_name=BOOKS_COLLECTION_NAME,
    llm_config=llm_config
)

print(f"RAG Query Engine initialized: {engine}")
print(f"Configuration: {engine.get_config_info()}")

INFO: create_rag_engine called with parameters: chroma_path='/home/lealdx/personal/rag_chat_project/data/chroma-db/books', collection_name='books', model='TheBloke/Llama-2-7B-Chat-GGUF', model_type='llama', llm_config={'max_new_tokens': 256, 'temperature': 0.3, 'top_k': 40, 'top_p': 0.9, 'repetition_penalty': 1.05, 'stop': ['</s>', '[/INST]'], 'context_length': 2048, 'threads': 4, 'seed': 42, 'stream': True}
INFO: Initializing LLM component
INFO: load_local_llama called with parameters: model='TheBloke/Llama-2-7B-Chat-GGUF', model_type='llama', config={'max_new_tokens': 256, 'temperature': 0.3, 'top_k': 40, 'top_p': 0.9, 'repetition_penalty': 1.05, 'stop': ['</s>', '[/INST]'], 'context_length': 2048, 'threads': 4, 'seed': 42, 'stream': True}
INFO: Checking model file existence at path: /home/lealdx/personal/rag_chat_project/models/llama-2-7b-chat.Q4_K_M.gguf
INFO: Initializing CTransformers with model file: /home/lealdx/personal/rag_chat_project/models/llama-2-7b-chat.Q4_K_M.gguf




INFO: load_local_llama completed successfully in 3.031 seconds. LLM type: CTransformers, Model: TheBloke/Llama-2-7B-Chat-GGUF
INFO: Initializing Chroma DB component
INFO: init_chroma called with parameters: chroma_str_path='/home/lealdx/personal/rag_chat_project/data/chroma-db/books', collection_name='books'
INFO: Initializing embeddings function
INFO: Embeddings function initialized successfully
INFO: Checking Chroma directory path: /home/lealdx/personal/rag_chat_project/data/chroma-db/books
INFO: Initializing Chroma DB at /home/lealdx/personal/rag_chat_project/data/chroma-db/books with collection 'books'
INFO: init_chroma completed successfully in 4.220 seconds. Collection: books, Path: /home/lealdx/personal/rag_chat_project/data/chroma-db/books, DB type: Chroma
INFO: Building engine configuration
INFO: Creating RAGQueryEngine instance
INFO: RAGQueryEngine.__init__ called with parameters: llm_type='CTransformers', chroma_type='Chroma', engine_config_keys=['model', 'model_type', 'chro
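
The log above shows the generation settings that the default `LLMConfig()` resolves to. As a rough illustration of how such a settings object might be organized, here is a minimal sketch. `LLMConfigSketch` is a hypothetical stand-in, not the real class in `rag_project.rag_models`; only the field names and default values are taken from the log.

```python
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class LLMConfigSketch:
    """Hypothetical mirror of the logged settings; not the real LLMConfig."""

    max_new_tokens: int = 256
    temperature: float = 0.3
    top_k: int = 40
    top_p: float = 0.9
    repetition_penalty: float = 1.05
    # default_factory gives every instance its own list
    stop: List[str] = field(default_factory=lambda: ["</s>", "[/INST]"])
    context_length: int = 2048
    threads: int = 4
    seed: int = 42
    stream: bool = True


# Override one setting and keep the remaining logged defaults
config = LLMConfigSketch(temperature=0.0)
print(asdict(config))
```

Overriding a single field leaves the other defaults unchanged, which may be convenient when comparing one setting at a time against notebook 002.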

# Test Queries

Define the primary test query used in notebook 002, plus additional queries taken from that notebook's comments, for comparison.

In [3]:
# Same query used in notebook 002 for comparison
query_text = "Who is the White Rabbit and how does Alice first meet him?"

# Additional test queries from notebook 002 comments
specific_queries = [
    "What does Alice drink or eat that makes her change size?",
    "Describe the Duchess's kitchen and what happens there",
    "What games are played in Wonderland?",
    "What poems or songs are recited in the story?",
    "How does Alice's adventure end?"
]

print(f"Primary test query: {query_text}")
print(f"Additional test queries: {len(specific_queries)} queries available")

Primary test query: Who is the White Rabbit and how does Alice first meet him?
Additional test queries: 5 queries available


# Test 1: Basic Query Method

Test the basic `query()` method, which returns only the answer text.

In [4]:
# Test basic query method
print("=== Basic Query Test ===")
basic_answer = engine.query(query_text)
print(f"Question: {query_text}")
print(f"Answer: {basic_answer}")
print(f"Answer length: {len(basic_answer)} characters")

=== Basic Query Test ===
INFO: query called with parameters: question_length=58, documents_retrieve=3, max_length=500, min_similarity_score=0.500, question_preview='Who is the White Rabbit and how does Alice first meet him?'
INFO: Validating query inputs
INFO: validate_query_inputs called with parameters: question_length=58, max_length=500, question_preview='Who is the White Rabbit and how does Alice first meet him?'
INFO: validate_query_inputs completed successfully in 0.003 seconds. Question validated: length=58
INFO: Processing query: Who is the White Rabbit and how does Alice first m...
INFO: Retrieving documents
INFO: retrieve_documents called with parameters: question_length=58, documents_retrieve=3, min_similarity_score=0.500, question_preview='Who is the White Rabbit and how does Alice first meet him?'
INFO: Performing similarity search with Chroma DB
INFO: Filtering documents by similarity score threshold: 0.500
INFO: retrieve_documents completed successfully in 0.143 seconds.

# Test 2: Query with Metadata

Test the `query_with_metadata()` method, which returns the answer together with its sources, retrieved documents, and similarity scores, as in notebook 002.

In [5]:
# Test query with metadata (similar to notebook 002 results)
print("=== Query with Metadata Test ===")
metadata_response = engine.query_with_metadata(query_text)

print(f"Question: {metadata_response.query}")
print(f"Answer: {metadata_response.answer}")
print(f"Sources: {metadata_response.sources}")
print(f"Retrieved documents: {metadata_response.retrieved_docs}")
print(f"Similarity scores: {metadata_response.similarity_scores}")

# Format similar to notebook 002 output
formatted_response = f"Response: {metadata_response.answer}\nSources: {metadata_response.sources}"
print("\n=== Formatted Response (like notebook 002) ===")
print(formatted_response)

=== Query with Metadata Test ===
INFO: query_with_metadata called with parameters: question_length=58, documents_retrieve=3, max_length=500, min_similarity_score=0.500, return_sources=True, question_preview='Who is the White Rabbit and how does Alice first meet him?'
INFO: Validating query inputs for metadata query
INFO: validate_query_inputs called with parameters: question_length=58, max_length=500, question_preview='Who is the White Rabbit and how does Alice first meet him?'
INFO: validate_query_inputs completed successfully in 0.002 seconds. Question validated: length=58
INFO: Processing query with metadata: Who is the White Rabbit and how does Alice first m...
INFO: Retrieving documents with scores
INFO: retrieve_documents_with_scores called with parameters: question_length=58, documents_retrieve=3, min_similarity_score=0.500, question_preview='Who is the White Rabbit and how does Alice first meet him?'
INFO: Performing similarity search with Chroma DB (with scores)
INFO: Filterin
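
The cell above reads five attributes from the returned object: `query`, `answer`, `sources`, `retrieved_docs`, and `similarity_scores`. A minimal sketch of a container with that shape follows. `QueryResponseSketch` and `format_like_notebook_002` are hypothetical names, and the field types (in particular `retrieved_docs` as a count) are assumptions rather than the actual model in `rag_project`.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class QueryResponseSketch:
    """Hypothetical stand-in for the object returned by query_with_metadata()."""

    query: str
    answer: str
    sources: List[str] = field(default_factory=list)
    retrieved_docs: int = 0  # assumed to be a count of retrieved documents
    similarity_scores: List[float] = field(default_factory=list)


def format_like_notebook_002(response: QueryResponseSketch) -> str:
    """Build the two-line 'Response / Sources' string used in the cell above."""
    return f"Response: {response.answer}\nSources: {response.sources}"


# Placeholder values only; these are not real engine output
example = QueryResponseSketch(
    query="Who is the White Rabbit?",
    answer="placeholder answer",
    sources=["placeholder_source.md"],
    retrieved_docs=1,
    similarity_scores=[0.72],
)
print(format_like_notebook_002(example))
```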

# Test 3: Document Retrieval Comparison

Test document retrieval on its own so the results can be compared with the manual retrieval in notebook 002.

In [6]:
# Test document retrieval with scores (similar to notebook 002)
print("=== Document Retrieval Test ===")
docs, scores = engine.retrieve_documents_with_scores(query_text)

print(f"Query: {query_text}")
print(f"Found {len(docs)} relevant documents")

if len(docs) == 0:
    print("Unable to find matching results.")
else:
    print(f"Retrieved {len(docs)} documents with scores: {scores}")
    
    # Show the first 100 characters of each document (as notebook 002 would)
    print("\n=== Retrieved Documents Preview ===")
    for i, (doc, score) in enumerate(zip(docs, scores)):
        print(f"Document {i+1} (score: {score:.4f}):")
        print(f"  Content preview: {doc.page_content[:100]}...")
        print(f"  Source: {doc.metadata.get('source', 'Unknown')}")
        print()

=== Document Retrieval Test ===
INFO: retrieve_documents_with_scores called with parameters: question_length=58, documents_retrieve=3, min_similarity_score=0.500, question_preview='Who is the White Rabbit and how does Alice first meet him?'
INFO: Performing similarity search with Chroma DB (with scores)
INFO: Filtering documents and scores by similarity threshold: 0.500
INFO: retrieve_documents_with_scores completed successfully in 0.040 seconds. Retrieved 3 documents (filtered from 3 total) with scores: [0.7235418775839446, 0.6555808008565527, 0.6501221306982401]
Query: Who is the White Rabbit and how does Alice first meet him?
Found 3 relevant documents
Retrieved 3 documents with scores: [0.7235418775839446, 0.6555808008565527, 0.6501221306982401]

=== Retrieved Documents Preview ===
Document 1 (score: 0.7235):
  Content preview: So Alice began telling them her adventures from the time when she first saw the White Rabbit. She wa...
  Source: /home/lealdx/personal/rag_chat_project/dat
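
The log reports that documents are filtered by a similarity threshold (`min_similarity_score=0.500`) after the search. One way such a filter could work is sketched below. `filter_by_score` is a hypothetical helper, and whether the real engine uses `>=` or `>` at the boundary is an assumption. The scores are the logged values rounded to four decimals, and the chunk strings are placeholders.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def filter_by_score(
    items: Sequence[T], item_scores: Sequence[float], min_score: float = 0.5
) -> Tuple[List[T], List[float]]:
    """Keep only (item, score) pairs whose score is at least min_score."""
    kept = [(item, s) for item, s in zip(items, item_scores) if s >= min_score]
    return [item for item, _ in kept], [s for _, s in kept]


chunks = ["chunk A", "chunk B", "chunk C"]  # placeholder documents
logged_scores = [0.7235, 0.6556, 0.6501]    # rounded from the log above

# At the default threshold of 0.5 every chunk passes
all_docs, all_scores = filter_by_score(chunks, logged_scores)

# Raising the threshold to 0.7 keeps only the best match
top_docs, top_scores = filter_by_score(chunks, logged_scores, min_score=0.7)
print(len(all_docs), top_docs, top_scores)
```

Under these assumptions, a threshold of 0.7 would keep one of the three logged documents, which is the kind of behavior the threshold sweep in Test 5 probes.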

# Test 4: Multiple Queries Comparison

Run several queries to check that the engine behaves consistently across different questions.

In [7]:
# Test multiple queries from the test set
print("=== Multiple Queries Test ===")

# Test a few queries from the specific_queries list
test_queries = specific_queries[:3]  # Test first 3 queries

for i, test_query in enumerate(test_queries, 1):
    print(f"\n--- Query {i} ---")
    print(f"Question: {test_query}")
    
    try:
        response = engine.query_with_metadata(test_query)
        # Truncate long answers to a 200-character preview
        answer = response.answer
        preview = f"{answer[:200]}..." if len(answer) > 200 else answer
        print(f"Answer: {preview}")
        print(f"Retrieved docs: {response.retrieved_docs}")
        print(f"Sources: {response.sources}")
    except Exception as e:
        print(f"Error processing query: {e}")
    
    print("-" * 50)

=== Multiple Queries Test ===

--- Query 1 ---
Question: What does Alice drink or eat that makes her change size?
INFO: query_with_metadata called with parameters: question_length=56, documents_retrieve=3, max_length=500, min_similarity_score=0.500, return_sources=True, question_preview='What does Alice drink or eat that makes her change size?'
INFO: Validating query inputs for metadata query
INFO: validate_query_inputs called with parameters: question_length=56, max_length=500, question_preview='What does Alice drink or eat that makes her change size?'
INFO: validate_query_inputs completed successfully in 0.001 seconds. Question validated: length=56
INFO: Processing query with metadata: What does Alice drink or eat that makes her change...
INFO: Retrieving documents with scores
INFO: retrieve_documents_with_scores called with parameters: question_length=56, documents_retrieve=3, min_similarity_score=0.500, question_preview='What does Alice drink or eat that makes her change size?'
INF

# Test 5: Configuration and Performance

Vary the number of retrieved documents and the similarity threshold, and record the response time, answer length, and number of documents found.

In [8]:
# Test different configuration parameters
import time

print("=== Configuration and Performance Test ===")

# Test with different number of documents retrieved
print("\n--- Testing different document retrieval counts ---")
for doc_count in [1, 3, 5]:
    start_time = time.perf_counter()  # monotonic clock, better suited to timing
    response = engine.query_with_metadata(
        query_text,
        documents_retrieve=doc_count
    )
    end_time = time.perf_counter()
    
    print(f"Documents retrieved: {doc_count}")
    print(f"Response time: {end_time - start_time:.2f} seconds")
    print(f"Answer length: {len(response.answer)} characters")
    print(f"Actual docs found: {response.retrieved_docs}")
    print()

# Test with different similarity thresholds
print("--- Testing different similarity thresholds ---")
for threshold in [0.3, 0.5, 0.7]:
    response = engine.query_with_metadata(
        query_text, 
        min_similarity_score=threshold
    )
    print(f"Similarity threshold: {threshold}")
    print(f"Documents found: {response.retrieved_docs}")
    print(f"Scores: {response.similarity_scores}")
    print()

=== Configuration and Performance Test ===

--- Testing different document retrieval counts ---
INFO: query_with_metadata called with parameters: question_length=58, documents_retrieve=1, max_length=500, min_similarity_score=0.500, return_sources=True, question_preview='Who is the White Rabbit and how does Alice first meet him?'
INFO: Validating query inputs for metadata query
INFO: validate_query_inputs called with parameters: question_length=58, max_length=500, question_preview='Who is the White Rabbit and how does Alice first meet him?'
INFO: validate_query_inputs completed successfully in 0.001 seconds. Question validated: length=58
INFO: Processing query with metadata: Who is the White Rabbit and how does Alice first m...
INFO: Retrieving documents with scores
INFO: retrieve_documents_with_scores called with parameters: question_length=58, documents_retrieve=1, min_similarity_score=0.500, question_preview='Who is the White Rabbit and how does Alice first meet him?'
INFO: Performin
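
The timing loop above wraps each call in a start/stop pair. As a possible simplification, the pattern can be factored into a small helper. `timed` and `fake_query` are hypothetical names introduced for this sketch; `fake_query` only stands in for `engine.query_with_metadata` so the example runs without the model.

```python
import time
from typing import Any, Callable, Tuple


def timed(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call fn and return (result, elapsed_seconds) using a monotonic clock."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def fake_query(question: str, documents_retrieve: int = 3) -> str:
    """Placeholder for the real engine call; it performs no retrieval."""
    return f"{documents_retrieve} docs for: {question}"


result, elapsed = timed(fake_query, "Who is the White Rabbit?", documents_retrieve=1)
print(f"{result!r} took {elapsed:.4f} seconds")
```

A single measurement per setting is noisy, especially with a local LLM, so repeating each call a few times and reporting the median would likely give a steadier comparison.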

# Test 6: Error Handling and Edge Cases

Test the engine's robustness with edge cases and error conditions.

In [9]:
# Test edge cases and error handling
print("=== Error Handling and Edge Cases Test ===")

# Test 1: Empty query
print("--- Test 1: Empty query ---")
try:
    response = engine.query("")
    print(f"Unexpected success: {response}")
except ValueError as e:
    print(f"Expected error caught: {e}")

# Test 2: Very long query
print("\n--- Test 2: Very long query ---")
long_query = "What is Alice in Wonderland about? " * 100  # Very long query
try:
    response = engine.query(long_query)
    print("Unexpected success with long query")
except ValueError as e:
    print(f"Expected error caught: {e}")

# Test 3: Query with no relevant documents (very high similarity threshold)
print("\n--- Test 3: Query with high similarity threshold ---")
response = engine.query_with_metadata(
    "Completely unrelated query about quantum physics and rocket science",
    min_similarity_score=0.9
)
print(f"Response: {response.answer}")
print(f"Documents found: {response.retrieved_docs}")

# Test 4: Query with zero documents requested
print("\n--- Test 4: Zero documents requested ---")
try:
    response = engine.query_with_metadata(query_text, documents_retrieve=0)
    print(f"Unexpected success: {response}")
except ValueError as e:
    print(f"Expected error caught: {e}")

print("\n=== Edge cases testing completed ===\n")

=== Error Handling and Edge Cases Test ===
--- Test 1: Empty query ---
INFO: query called with parameters: question_length=0, documents_retrieve=3, max_length=500, min_similarity_score=0.500, question_preview=''
INFO: Validating query inputs
INFO: validate_query_inputs called with parameters: question_length=0, max_length=500, question_preview=''
ERROR: validate_query_inputs failed after 0.001 seconds: Question cannot be empty
ERROR: validate_query_inputs encountered unexpected error after 0.003 seconds: Question cannot be empty. Parameters: question_length=0, max_length=500
ERROR: query validation error after 0.009 seconds: Question cannot be empty. Question: ...
Expected error caught: Question cannot be empty

--- Test 2: Very long query ---
INFO: query called with parameters: question_length=3500, documents_retrieve=3, max_length=500, min_similarity_score=0.500, question_preview='What is Alice in Wonderland about? What is Alice in Wonderland about? What is Alice in Wonderland ab...'
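
The logs show two checks at work: an empty question is rejected with "Question cannot be empty", and the question length is compared against `max_length=500`. A minimal sketch of validation along those lines is given below. `validate_question` is a hypothetical function, not the project's `validate_query_inputs`; stripping whitespace and the wording of the length error are assumptions, since the corresponding output is cut off.

```python
def validate_question(question: str, max_length: int = 500) -> str:
    """Return the stripped question, or raise ValueError if it is unusable."""
    if not isinstance(question, str):
        raise ValueError("Question must be a string")
    cleaned = question.strip()
    if not cleaned:
        # Message text taken from the log above
        raise ValueError("Question cannot be empty")
    if len(cleaned) > max_length:
        # Hypothetical wording; the real message is not visible in the output
        raise ValueError(
            f"Question too long: {len(cleaned)} characters (max {max_length})"
        )
    return cleaned


# Same construction as the long query in the cell above
long_query = "What is Alice in Wonderland about? " * 100

for candidate in ["", long_query, "  Who is the White Rabbit?  "]:
    try:
        print(f"accepted: {validate_question(candidate)!r}")
    except ValueError as exc:
        print(f"rejected: {exc}")
```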

# Summary and Comparison

Summarize the engine configuration and rerun the original query for comparison with the notebook 002 results.

In [11]:
# Final summary and comparison with notebook 002
print("=== FINAL SUMMARY ===")
print(f"RAG Query Engine: {engine}")
print(f"Engine Configuration: {engine.get_config_info()}")

# Run one final test with the same query from notebook 002
print("\n=== Final Test with Original Query ===")
print(f"Query: {query_text}")

final_response = engine.query_with_metadata(query_text)
print(f"\nFinal Answer: {final_response.answer}")
print(f"Sources: {final_response.sources}")
print(f"Retrieved Documents: {final_response.retrieved_docs}")
print(f"Similarity Scores: {final_response.similarity_scores}")

print("\n=== Tests Completed Successfully ===\n")

=== FINAL SUMMARY ===
RAG Query Engine: RAGQueryEngine(model='TheBloke/Llama-2-7B-Chat-GGUF', collection='books')
INFO: get_config_info called
INFO: get_config_info completed successfully in 0.017 seconds. Config keys: ['model', 'model_type', 'chroma_path', 'collection_name', 'llm_config']
Engine Configuration: {'model': 'TheBloke/Llama-2-7B-Chat-GGUF', 'model_type': 'llama', 'chroma_path': '/home/lealdx/personal/rag_chat_project/data/chroma-db/books', 'collection_name': 'books', 'llm_config': {'max_new_tokens': 256, 'temperature': 0.3, 'top_k': 40, 'top_p': 0.9, 'repetition_penalty': 1.05, 'stop': ['</s>', '[/INST]'], 'context_length': 2048, 'threads': 4, 'seed': 42, 'stream': True}}

=== Final Test with Original Query ===
Query: Who is the White Rabbit and how does Alice first meet him?
INFO: query_with_metadata called with parameters: question_length=58, documents_retrieve=3, max_length=500, min_similarity_score=0.500, return_sources=True, question_preview='Who is the White Rabbit a
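
The comparison with notebook 002 is done by reading the two answers side by side. If a rough numeric check is wanted, one option is a character-level similarity ratio from the standard library's `difflib`. This is only a sketch: `answer_similarity` is a hypothetical helper, the two strings are placeholders rather than real outputs, and lexical overlap is a crude proxy that says nothing about factual agreement.

```python
from difflib import SequenceMatcher


def answer_similarity(a: str, b: str) -> float:
    """Return a ratio in [0, 1] for two answers, ignoring case and extra spaces."""
    norm_a = " ".join(a.lower().split())
    norm_b = " ".join(b.lower().split())
    return SequenceMatcher(None, norm_a, norm_b).ratio()


# Placeholder strings; substitute the answers from this notebook and notebook 002
engine_answer = "The White Rabbit is a character Alice follows."
manual_answer = "the white rabbit is a character  alice follows."

print(f"Similarity: {answer_similarity(engine_answer, manual_answer):.2f}")
```

Because the LLM is configured with a fixed seed, the two notebooks may produce close or identical text for the same retrieved context, but that is not guaranteed, so any pass threshold on this ratio would need to be chosen empirically.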