# RAG 02: Decomposing the retrieval process

This example demonstrates query expansion and multi-step retrieval using pydantic-ai embeddings and Chroma.
It uses the LLM-chunked collection created by example_RAG_02_load.ipynb.

## Initialize

In [None]:
from agentic_patterns.core.agents import get_agent, run_agent
from agentic_patterns.core.vectordb import get_vector_db, vdb_query

## Vector-db: Load existing collection

This step assumes the 'books_llm_chunked' collection was populated by running example_RAG_02_load.ipynb first.

In [None]:
vdb = get_vector_db("books_llm_chunked")

In [None]:
# Check database has documents
count = vdb.count()
assert count > 0, (
    "Vector database is empty, please run example_RAG_02_load.ipynb first to populate it."
)
print(f"Collection has {count} documents")

## RAG

In [None]:
query = "Who is a man with two heads?"

### Query expansion

In [None]:
prompt = f"""
Given the following user query, reformulate the query in three to five different ways to retrieve relevant documents from the vector database.

{query}
"""

In [None]:
agent = get_agent(output_type=list[str])  # type: ignore
agent_run, nodes = await run_agent(agent, prompt=prompt, verbose=True)

assert agent_run is not None and agent_run.result is not None
reformulated_queries = agent_run.result.output

print(f"\nAnswer (len {len(reformulated_queries)}):")
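# Use `q` as the loop variable so the original `query` is not overwritten;
# it is needed again when building the final prompt.
for i, q in enumerate(reformulated_queries):
    print(f"{i + 1:2d}: {q}")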

### Query vector database with metadata filtering

The `vdb_query` function accepts a `filter` parameter for metadata constraints. Filtering inside the database is generally more efficient than filtering after retrieval, because non-matching documents never take up slots in the result set.
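
To see why the order of filtering and ranking matters, here is a minimal sketch using a toy in-memory list instead of a real vector database. The `INDEX` data, the scores, and the helper names (`matches`, `top_k`, `query_prefiltered`, `query_postfiltered`) are all made up for illustration; this is not how Chroma or `vdb_query` is implemented.

```python
# Toy in-memory "index": (text, metadata, similarity score for some fixed query).
# All values are invented for illustration.
INDEX = [
    ("chunk a", {"source": "other_book"}, 0.95),
    ("chunk b", {"source": "other_book"}, 0.90),
    ("chunk c", {"source": "hhgttg"}, 0.80),
    ("chunk d", {"source": "hhgttg"}, 0.70),
]


def matches(meta: dict, flt: dict) -> bool:
    """True if every key/value pair in the filter is present in the metadata."""
    return all(meta.get(k) == v for k, v in flt.items())


def top_k(items: list, k: int) -> list:
    """Return the k highest-scoring items."""
    return sorted(items, key=lambda x: x[2], reverse=True)[:k]


def query_prefiltered(flt: dict, k: int) -> list:
    # Filter first, then rank: every one of the k slots goes to a matching chunk.
    return top_k([item for item in INDEX if matches(item[1], flt)], k)


def query_postfiltered(flt: dict, k: int) -> list:
    # Rank first, then filter: non-matching chunks can use up the k slots.
    return [item for item in top_k(INDEX, k) if matches(item[1], flt)]


flt = {"source": "hhgttg"}
pre = query_prefiltered(flt, k=2)
post = query_postfiltered(flt, k=2)
print(len(pre), len(post))  # 2 0
```

In this toy data the two best-scoring chunks come from another book, so filtering after retrieval returns nothing, while filtering first returns both matching chunks.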

In [None]:
# Define the metadata filter - only retrieve documents from this book
book_name = "hhgttg"
metadata_filter = {"source": book_name}

# Query with each reformulated query, applying the filter
documents_with_scores = []
for q in reformulated_queries:
    print(f"Query: {q}")
    documents_with_scores.extend(vdb_query(vdb, query=q, filter=metadata_filter))

print(f"\nFound {len(documents_with_scores)} documents from '{book_name}'")

### Deduplication

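Query expansion retrieves documents for each reformulated query, so the same document may appear multiple times if it matches several variations. Deduplication removes these duplicates. The code below keeps the first occurrence of each document, together with the score from the query that returned it.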
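
One possible variation, sketched here under the assumption that a higher score means a better match, is to keep the best score seen for each document rather than the first one. The function name `dedupe_keep_best` and the sample data are hypothetical; the `(doc, meta, score)` tuple shape and the `source`/`chunk` key follow the notebook's convention.

```python
def dedupe_keep_best(results: list) -> list:
    """Collapse duplicate chunks, keeping the highest score seen for each.

    `results` is a list of (doc, meta, score) tuples. Assumes a higher
    score means a better match.
    """
    best = {}
    for doc, meta, score in results:
        doc_id = f"{meta['source']}-{meta['chunk']}"
        if doc_id not in best or score > best[doc_id][2]:
            best[doc_id] = (doc, meta, score, doc_id)
    return list(best.values())


# Made-up results: chunk 7 is returned by two different reformulations.
results = [
    ("text A", {"source": "hhgttg", "chunk": 7}, 0.61),
    ("text B", {"source": "hhgttg", "chunk": 9}, 0.58),
    ("text A", {"source": "hhgttg", "chunk": 7}, 0.74),
]
unique = dedupe_keep_best(results)
print([(doc_id, score) for _, _, score, doc_id in unique])
# [('hhgttg-7', 0.74), ('hhgttg-9', 0.58)]
```

Keeping the best score means a document that matches one reformulation strongly is not ranked by a weaker match from another.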

In [None]:
# Deduplicate by creating a unique key from source and chunk
seen_ids = set()
documents_deduplicated = []
for doc, meta, score in documents_with_scores:
    doc_id = f"{meta['source']}-{meta['chunk']}"
    if doc_id in seen_ids:
        continue
    documents_deduplicated.append((doc, meta, score, doc_id))
    seen_ids.add(doc_id)
print(f"Deduplicated to {len(documents_deduplicated)} unique documents")

### Sorting and limiting

Sort by similarity score, highest first, and keep only the top results. In production systems, this step could use a cross-encoder model that jointly encodes each query-document pair for more accurate relevance scoring.
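
The shape of such a reranking step can be sketched with a pluggable pairwise scorer. In the sketch below, `token_overlap` is a toy stand-in that only counts shared words; it is not a cross-encoder, and `rerank`, `token_overlap`, and the sample documents are hypothetical names and data introduced for illustration.

```python
from typing import Callable


def token_overlap(query: str, doc: str) -> float:
    """Toy pairwise scorer: fraction of query tokens that appear in the document.

    This is NOT a cross-encoder; it only stands in for a function that
    scores a (query, document) pair jointly.
    """
    q_tokens = set(query.lower().split())
    d_tokens = set(doc.lower().split())
    return len(q_tokens & d_tokens) / len(q_tokens) if q_tokens else 0.0


def rerank(
    query: str,
    docs: list[str],
    score_fn: Callable[[str, str], float],
    top_n: int,
) -> list[tuple[str, float]]:
    """Score every (query, doc) pair and return the top_n by descending score."""
    scored = [(doc, score_fn(query, doc)) for doc in docs]
    return sorted(scored, key=lambda x: x[1], reverse=True)[:top_n]


# Made-up candidate documents.
docs = [
    "the ship landed on a distant planet",
    "a man with two heads walked in",
    "two towels and a cup of tea",
]
reranked = rerank("man with two heads", docs, token_overlap, top_n=2)
for doc, score in reranked:
    print(f"{score:.2f}  {doc}")
```

A real cross-encoder, for example one loaded through the sentence-transformers `CrossEncoder` class, could be wrapped in a function with the same `(query, doc) -> float` signature and passed as `score_fn`.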

In [None]:
# Sort by score (index 2) and limit results
documents_sorted = sorted(documents_deduplicated, key=lambda x: x[2], reverse=True)

max_results = 10
# Slicing is safe even when there are fewer than max_results documents
documents_sorted = documents_sorted[:max_results]

print(f"Top {len(documents_sorted)} documents by similarity score")

### Add results to prompt

In [None]:
docs_str = ""
for doc, meta, score, doc_id in documents_sorted:
    docs_str += (
        f"Similarity Score: {score:.3f}\nDocument ID: {doc_id}\nDocument:\n{doc}\n\n"
    )
    text = doc.replace("\n", " ")
    print(f"Score: {score:.3f}, ID: {doc_id}, Document: {text[:80]}...")

### Prompt: Grounding on retrieved documents

In [None]:
prompt = f"""
Given the following documents, answer the user's question.
Show used references (using document ids).

## Documents

{docs_str}

## User's question

{query}

"""

print(prompt[:1000])  # Print the first 1000 characters of the prompt

### Query the LLM with the vdb results

In [None]:
agent = get_agent()
agent_run, nodes = await run_agent(agent, prompt=prompt, verbose=True)

assert agent_run is not None and agent_run.result is not None
answer = agent_run.result.output
print(f"\nAnswer: {answer}")