# Query Reranking Experiment - LLM-based Relevance

This notebook experiments with LLM-based reranking:
- **Initial Retrieval**: Get top 10 chunks with hybrid search
- **Reranking**: Use LLM to score relevance
- **Final Selection**: Return top 3 most relevant

**Goal:** Improve answer quality by filtering out less relevant chunks.

In [1]:
import sys
sys.path.append('..')

from src.vector_store import initialize_chroma_db
from src.hybrid_search import HybridSearchEngine
from src.reranker import rerank_chunks
from typing import List, Dict



## Step 1: Load ChromaDB and Initialize Hybrid Search

In [2]:
print("Loading ChromaDB...")
client, collection = initialize_chroma_db(
    persist_directory="../chroma_db",
    collection_name="documents"
)
doc_count = collection.count()
print(f"✅ Loaded {doc_count:,} documents")

if doc_count == 0:
    print("\n❌ ERROR: No documents in collection!")
    raise SystemExit("Cannot continue without documents")



Loading ChromaDB...
Initializing ChromaDB at: ../chroma_db
✅ Loaded existing collection: documents
   Documents in collection: 31393
✅ Loaded 31,393 documents


In [3]:
print("\nInitializing Hybrid Search Engine...")
hybrid_engine = HybridSearchEngine(collection)
print("✅ Hybrid search ready")




Initializing Hybrid Search Engine...
Building BM25 index from ChromaDB...
✅ BM25 index built with 31,393 documents
✅ Hybrid search ready


## Step 2: Test Query - Get Top 10 Chunks

In [4]:
# Test query
test_query = "What is CAN protocol used for?"
test_domain = "automotive"

print(f"Query: {test_query}")
print(f"Domain: {test_domain}")
print("\n" + "="*80)

# Get top 10 chunks from hybrid search
results = hybrid_engine.search(
    query=test_query,
    n_results=10,
    domain=test_domain,
    method="hybrid"
)

chunks = [r['document'] for r in results]
metadatas = [r['metadata'] for r in results]

print(f"\n✅ Retrieved {len(chunks)} chunks from hybrid search")

Query: What is CAN protocol used for?
Domain: automotive






✅ Retrieved 10 chunks from hybrid search


## Step 3: Display Initial Top 10

In [5]:
print("\n" + "="*80)
print("INITIAL TOP 10 (Hybrid Search):")
print("="*80)

for i, (chunk, meta) in enumerate(zip(chunks, metadatas), 1):
    source = meta.get('source', 'Unknown').split('/')[-1]
    page = meta.get('page', 'N/A')
    print(f"\n[{i}] {source} (Page {page})")
    print(f"    {chunk[:150]}...")


INITIAL TOP 10 (Hybrid Search):

[1] CAN.pdf (Page 74)
    bytes. Further, the network speed is limited to 1 Mbit/s, restricting the implementation of data-producing features. CAN 
 FD resolves these issues - ...

[2] CAN.pdf (Page 129)
    ISOBUS (ISO 11783) Explained - A Simple Intro 
 Need a simple, practical intro to ISOBUS (ISO 11783)? 
 In this guide we introduce the ISOBUS protocol...

[3] CAN.pdf (Page 135)
    retrofitting a CAN data logger to record all messages 
 being communicated on the CAN buses. However, in 
 the case of LOG, the aim is to allow for the...

[4] CAN.pdf (Page 143)
    CCP / XCP on CAN Explained - A Simple Intro 
 Need a simple intro to CCP/XCP on CAN bus? 
 In this practical tutorial, we introduce the basics of the ...

[5] CAN.pdf (Page 112)
    party device (e.g. a sensor-to-CAN module) to inject data into an existing CAN bus. If you do not ensure the global 
 uniqueness of the CAN IDs of ext...

[6] CAN.pdf (Page 112)
    Flags (not visible to the re

## Step 4: Rerank with LLM

In [6]:
print("\n" + "="*80)
print("RERANKING WITH LLM...")
print("="*80)

# Rerank to get top 3
reranked_chunks, reranked_metadatas = rerank_chunks(
    query=test_query,
    chunks=chunks,
    metadatas=metadatas,
    top_k=3,
    method="ollama"
)

print("\n✅ Reranking complete")


RERANKING WITH LLM...
⚠️  Reranking error: HTTPConnectionPool(host='localhost', port=11434): Read timed out. (read timeout=30), using original order

✅ Reranking complete
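
The `using original order` message in the output above suggests that `rerank_chunks` falls back to the hybrid ordering when the Ollama call fails. The sketch below shows one way such a fallback could be structured. It is not the code in `src/reranker.py`: the function `rerank_with_fallback`, the `score_fn` parameter, and the toy `keyword_overlap` scorer are all hypothetical names introduced for illustration, and the scorer stands in for a real LLM call so the example runs offline.

```python
from typing import Callable, Dict, List, Tuple


def rerank_with_fallback(
    query: str,
    chunks: List[str],
    metadatas: List[Dict],
    score_fn: Callable[[str, str], float],
    top_k: int = 3,
) -> Tuple[List[str], List[Dict], bool]:
    """Score each chunk against the query and keep the top_k best.

    Returns (chunks, metadatas, reranked). `reranked` is False when scoring
    failed and the original retrieval order was kept instead.
    """
    try:
        scores = [score_fn(query, chunk) for chunk in chunks]
    except Exception as exc:  # e.g. a read timeout from the LLM server
        print(f"Reranking error: {exc}, using original order")
        return chunks[:top_k], metadatas[:top_k], False

    # sorted() is stable, so chunks with equal scores keep their retrieval order.
    order = sorted(range(len(chunks)), key=lambda i: -scores[i])[:top_k]
    return [chunks[i] for i in order], [metadatas[i] for i in order], True


def keyword_overlap(query: str, chunk: str) -> float:
    """Toy scorer (assumption): counts query words that occur in the chunk."""
    words = set(query.lower().split())
    return float(sum(1 for w in words if w in chunk.lower()))
```

Returning an explicit `reranked` flag lets later cells tell a genuine LLM ranking apart from the fallback, which the printed "Reranking complete" message alone does not reveal.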


## Step 5: Display Reranked Top 3

In [7]:
print("\n" + "="*80)
print("RERANKED TOP 3 (LLM-scored):")
print("="*80)

for i, (chunk, meta) in enumerate(zip(reranked_chunks, reranked_metadatas), 1):
    source = meta.get('source', 'Unknown').split('/')[-1]
    page = meta.get('page', 'N/A')
    print(f"\n[{i}] {source} (Page {page})")
    print(f"    {chunk[:200]}...")


RERANKED TOP 3 (LLM-scored):

[1] CAN.pdf (Page 74)
    bytes. Further, the network speed is limited to 1 Mbit/s, restricting the implementation of data-producing features. CAN 
 FD resolves these issues - making it future-proof. 
 What is CAN FD? 
 The CA...

[2] CAN.pdf (Page 129)
    ISOBUS (ISO 11783) Explained - A Simple Intro 
 Need a simple, practical intro to ISOBUS (ISO 11783)? 
 In this guide we introduce the ISOBUS protocol used in agricultural vehicles (like tractors) and...

[3] CAN.pdf (Page 135)
    retrofitting a CAN data logger to record all messages 
 being communicated on the CAN buses. However, in 
 the case of LOG, the aim is to allow for the data to be 
 exported in the ISO XML format (simi...


## Step 6: Compare - Which chunks were selected?

Let's see which positions from the top 10 were selected for the top 3.

In [8]:
print("\n" + "="*80)
print("COMPARISON: Original Position → Reranked Position")
print("="*80)

for new_rank, reranked_chunk in enumerate(reranked_chunks, 1):
    # Find original position
    try:
        old_rank = chunks.index(reranked_chunk) + 1
        print(f"Position {old_rank} → Position {new_rank}")
    except ValueError:
        print(f"? → Position {new_rank} (new chunk)")


COMPARISON: Original Position → Reranked Position
Position 1 → Position 1
Position 2 → Position 2
Position 3 → Position 3
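
The comparison cell looks up each reranked chunk with `chunks.index(...)`, which always returns the first match. If two retrieved chunks had identical text, both would be reported at the same original position. The sketch below is one possible alternative that consumes each original position only once and also reports whether the order changed at all. The helper names `position_changes` and `order_unchanged` are hypothetical and not part of the project.

```python
from typing import List, Optional, Tuple


def position_changes(
    original: List[str], reranked: List[str]
) -> List[Tuple[Optional[int], int]]:
    """Return 1-based (old_rank, new_rank) pairs; old_rank is None if not found."""
    used = set()  # original indices already matched to a reranked chunk
    pairs = []
    for new_rank, chunk in enumerate(reranked, 1):
        old_rank = None
        for i, candidate in enumerate(original):
            if i not in used and candidate == chunk:
                used.add(i)
                old_rank = i + 1
                break
        pairs.append((old_rank, new_rank))
    return pairs


def order_unchanged(pairs: List[Tuple[Optional[int], int]]) -> bool:
    """True when every chunk kept its original position."""
    return all(old == new for old, new in pairs)
```

An unchanged order, as in the output above, is what the fallback path produces, so a check like `order_unchanged` is a cheap signal that the reranker may not have run.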


## Step 7: Test Multiple Queries

Compare reranking across different query types.

In [9]:
test_cases = [
    {"query": "What is CAN FD?", "domain": "automotive"},
    {"query": "How does OBD-II diagnostics work?", "domain": "automotive"},
    {"query": "Show me dresses under 1000 rupees", "domain": "fashion"},
]

def test_reranking(query: str, domain: str):
    print("\n" + "="*80)
    print(f"Query: {query}")
    print(f"Domain: {domain}")
    print("="*80)
    
    # Get top 10
    results = hybrid_engine.search(
        query=query,
        n_results=10,
        domain=domain,
        method="hybrid"
    )
    
    chunks = [r['document'] for r in results]
    metadatas = [r['metadata'] for r in results]
    
    print("\n📊 Top 3 WITHOUT reranking:")
    for i in range(min(3, len(chunks))):
        source = metadatas[i].get('source', 'Unknown').split('/')[-1]
        print(f"{i+1}. [{source}] {chunks[i][:100]}...")
    
    # Rerank
    print("\n🔄 Reranking with LLM...")
    reranked_chunks, reranked_metadatas = rerank_chunks(
        query=query,
        chunks=chunks,
        metadatas=metadatas,
        top_k=3,
        method="ollama"
    )
    
    print("\n✅ Top 3 WITH reranking:")
    for i in range(len(reranked_chunks)):
        source = reranked_metadatas[i].get('source', 'Unknown').split('/')[-1]
        print(f"{i+1}. [{source}] {reranked_chunks[i][:100]}...")
    
    # Show position changes
    print("\n📈 Position changes:")
    for new_rank, chunk in enumerate(reranked_chunks, 1):
        try:
            old_rank = chunks.index(chunk) + 1
            change = "↑" if old_rank > new_rank else "→" if old_rank == new_rank else "↓"
            print(f"  {change} Position {old_rank} → {new_rank}")
        except ValueError:
            print(f"  ? Position ? → {new_rank}")

# Run tests
for test in test_cases:
    test_reranking(test['query'], test['domain'])


Query: What is CAN FD?
Domain: automotive

📊 Top 3 WITHOUT reranking:
1. [CAN.pdf] bytes. Further, the network speed is limited to 1 Mbit/s, restricting the implementation of data-pro...
2. [CAN.pdf] retrofitting a CAN data logger to record all messages 
 being communicated on the CAN buses. However,...
3. [CAN.pdf] extension of the Classical 
 CAN data link layer. It 
 increases the payload from 
 8 to 64 bytes an...

🔄 Reranking with LLM...
⚠️  Reranking error: HTTPConnectionPool(host='localhost', port=11434): Read timed out. (read timeout=30), using original order

✅ Top 3 WITH reranking:
1. [CAN.pdf] bytes. Further, the network speed is limited to 1 Mbit/s, restricting the implementation of data-pro...
2. [CAN.pdf] retrofitting a CAN data logger to record all messages 
 being communicated on the CAN buses. However,...
3. [CAN.pdf] extension of the Classical 
 CAN data link layer. It 
 increases the payload from 
 8 to 64 bytes an...

📈 Position changes:
  → Posi

## Step 8: Analysis

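**Expected Observations:**

In the runs shown above, the Ollama request hit the 30 s read timeout, so `rerank_chunks` fell back to the original order and the reranked top 3 matches the hybrid top 3. The points below are therefore expectations, not measured results.

1. **Without Reranking:**
   - Hybrid search orders chunks by RRF (Reciprocal Rank Fusion) score
   - The top 10 may include some less relevant chunks

2. **With Reranking:**
   - The LLM scores each chunk's actual relevance to the query
   - The top 3 should be more contextually appropriate
   - Position changes show where the LLM disagrees with the hybrid ranking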
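
For reference, Reciprocal Rank Fusion combines several ranked lists by giving each document the score $\sum_i 1/(k + \text{rank}_i)$ over the lists it appears in. The sketch below illustrates the formula only. How `HybridSearchEngine` actually fuses its BM25 and vector rankings is not shown in this notebook, and `k = 60` is the commonly used constant, assumed here rather than taken from the project. The document IDs are made up.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse ranked lists of document IDs; the best fused score comes first."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, 1):  # ranks are 1-based
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: -scores[doc_id])


# Hypothetical rankings from a keyword search and a vector search.
bm25_ranking = ["doc_a", "doc_b", "doc_c"]
vector_ranking = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
print(fused)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Because RRF uses only rank positions, a chunk that ranks well in both lists can beat one that tops a single list, regardless of how relevant its text is to the question. That gap is what the LLM reranking step is meant to close.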

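**Trade-offs:**
- ✅ Better quality: the LLM can promote chunks that directly answer the query
- ⚠️ Slower: one additional LLM call per query (estimated ~2-3 s; in this run the Ollama request exceeded the 30 s read timeout)
- ✅ Smarter: relevance is judged in context rather than by lexical and vector similarity alone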
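
To replace the latency estimate with a measurement, the reranking call can be wrapped in a small timer. This is a minimal sketch: `timed` is a hypothetical helper, and `fake_rerank` is a stand-in so the example runs without an Ollama server.

```python
import time
from typing import Any, Callable, Tuple


def timed(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def fake_rerank(chunks, top_k=3):
    """Stand-in for a reranking call (assumption); sleeps briefly, then truncates."""
    time.sleep(0.01)
    return chunks[:top_k]


top, seconds = timed(fake_rerank, ["a", "b", "c", "d"], top_k=3)
print(f"Reranking took {seconds:.3f}s for {len(top)} chunks")
```

In the notebook, the same wrapper could be applied to the existing call, for example `timed(rerank_chunks, query=test_query, chunks=chunks, metadatas=metadatas, top_k=3, method="ollama")`.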

**Next Steps:**
- Integrate into qa_chain.py
- Add reranking option to main.py
- Evaluate impact on answer quality