# RAG vs Agentic RAG with Batman

## Objective
Compare two retrieval-augmented generation (RAG) architectures:
- **Vanilla RAG**: Retrieve -> Generate.
- **Agentic RAG**: Route -> Rewrite -> Retrieve -> Filter -> Generate -> Grounding Check.

We use `gpt-5-mini` for generation and `text-embedding-3-small` to embed documents and queries for retrieval.
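
To make the contrast between the two control flows concrete, here is a minimal sketch of both as plain functions. Every name in it (`CORPUS`, `retrieve`, `generate`, `vanilla_rag`, `agentic_rag`) is a hypothetical stand-in and not the API of `scripts.rag_pipelines`; the toy retriever ranks by keyword overlap instead of embeddings, and the "LLM" simply echoes its context.

```python
from typing import Dict, List, Set

# Toy corpus: a hypothetical stand-in for the indexed comic chunks.
CORPUS: List[str] = [
    "Bane breaks Batman's back in Knightfall.",
    "The Joker shoots Barbara Gordon in The Killing Joke.",
    "Alfred Pennyworth is the Wayne family butler.",
]


def tokens(text: str) -> Set[str]:
    """Lowercase words with surrounding punctuation stripped."""
    return {w.strip(".,?!'").lower() for w in text.split()}


def overlap(query: str, doc: str) -> int:
    """Number of distinct words shared by the query and the document."""
    return len(tokens(query) & tokens(doc))


def retrieve(query: str, k: int) -> List[str]:
    """Keyword-overlap ranking; a stand-in for embedding search."""
    return sorted(CORPUS, key=lambda d: overlap(query, d), reverse=True)[:k]


def generate(query: str, docs: List[str]) -> str:
    """Stand-in for the LLM call: echoes the query and its context."""
    return f"Q: {query} | context: {' '.join(docs)}"


def vanilla_rag(query: str, k: int = 2) -> str:
    # Retrieve -> Generate
    return generate(query, retrieve(query, k))


def agentic_rag(query: str, k: int = 3, min_overlap: int = 2) -> Dict[str, object]:
    # Route: skip retrieval for very short, chit-chat style inputs.
    route = "retrieve" if len(query.split()) > 2 else "direct"
    docs: List[str] = []
    if route == "retrieve":
        # Rewrite: normalize a nickname before searching.
        query = query.replace("Bats", "Batman")
        # Retrieve, then Filter out weakly matching documents.
        docs = [d for d in retrieve(query, k) if overlap(query, d) >= min_overlap]
    answer = generate(query, docs)
    # Grounding Check: every kept document must be reflected in the answer.
    grounded = bool(docs) and all(d in answer for d in docs)
    return {"route": route, "docs": docs, "answer": answer, "grounded": grounded}
```

Calling `agentic_rag("How does Bats fight Bane in Knightfall")` takes the `retrieve` route and keeps only the Knightfall document after filtering, while `vanilla_rag` passes its top-`k` documents to generation unfiltered. The extra steps are where the agentic variant spends its additional latency.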

In [None]:
from pathlib import Path
import sys
import pandas as pd

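# Assumes the notebook runs from the project root, so the local `scripts`
# package becomes importable.
ROOT = Path.cwd()
if str(ROOT) not in sys.path:
    sys.path.append(str(ROOT))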

from scripts.rag_pipelines import VanillaRAG, AgenticRAG
from scripts.vector_store_lab import build_index_from_json
from scripts.evaluation import (
    build_eval_questions,
    plot_architecture_difference,
    plot_pipeline_comparison,
    run_benchmark,
)

DATA_PATH = ROOT / 'data' / 'batman_comics.json'
OUTPUTS_DIR = ROOT / 'outputs'
OUTPUTS_DIR.mkdir(parents=True, exist_ok=True)

In [None]:
db, chunks, index_stats, chunk_stats = build_index_from_json(
    json_path=DATA_PATH,
    persist_dir=OUTPUTS_DIR / 'chroma_batman_rag',
    collection_name='batman_rag_lab',
    chunk_size=800,
    chunk_overlap=120,
    embedding_model='text-embedding-3-small',
)

print(index_stats)
print(chunk_stats)
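
The `chunk_size=800` and `chunk_overlap=120` arguments suggest a sliding-window split. How `build_index_from_json` actually splits the text is not shown in this notebook, so the following is only a character-based sketch of the idea; `split_with_overlap` is a hypothetical helper.

```python
from typing import List


def split_with_overlap(text: str, chunk_size: int = 800, chunk_overlap: int = 120) -> List[str]:
    """Split text into windows of chunk_size characters that share
    chunk_overlap characters with the previous window."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must satisfy 0 <= chunk_overlap < chunk_size")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

With these defaults the window advances 680 characters at a time, so a sentence cut at a chunk boundary still appears whole in one of the two neighboring chunks as long as it is shorter than the overlap.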

In [None]:
vanilla = VanillaRAG(
    vector_db=db,
    model='gpt-5-mini',
    embedding_model='text-embedding-3-small',
    k=4,
)

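# Retrieves a wider candidate set (k=6) than the vanilla pipeline (k=4);
# min_docs_after_filter presumably sets the floor kept after the Filter step.
agentic = AgenticRAG(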
    vector_db=db,
    model='gpt-5-mini',
    embedding_model='text-embedding-3-small',
    k=6,
    min_docs_after_filter=3,
)

print('Pipelines initialized.')

In [None]:
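# Comparison query spanning two story arcs (kept in Spanish). English gloss:
# "Compare how Batman confronts Bane in Knightfall versus his approach
# against the Joker in The Killing Joke."
query = 'Compara como Batman enfrenta a Bane en Knightfall versus su enfoque contra el Joker en The Killing Joke.'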

vanilla_result = vanilla.run(query)
agentic_result = agentic.run(query)

# Build one summary row per pipeline from the shared result fields.
comparison_preview = pd.DataFrame([
    {
        'pipeline': result.pipeline,
        'latency_seconds': result.latency_seconds,
        'groundedness': result.groundedness,
        'retrieved_docs': len(result.docs),
        'route': result.route,
        'llm_provider': result.llm_provider,
    }
    for result in (vanilla_result, agentic_result)
])
comparison_preview

## Structural difference between the pipelines

```mermaid
flowchart TD
  A["User Query"] --> B["Retrieve"] --> C["Generate"]

  D["User Query"] --> E["Route"] --> F["Rewrite"] --> G["Retrieve"] --> H["Filter"] --> I["Generate"] --> J["Grounding Check"]
```
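
The final `Grounding Check` node can be approximated with a simple lexical proxy. The notebook does not show how the `groundedness` value is actually computed, so the sketch below is an assumption: it scores an answer by the fraction of its sentences whose words are mostly covered by at least one retrieved document. The function name, the regexes, and the `threshold` of 0.6 are all hypothetical.

```python
import re
from typing import List, Set


def _sentences(text: str) -> List[str]:
    """Naive sentence split on ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def _tokens(text: str) -> Set[str]:
    """Lowercase word tokens (letters, digits, apostrophes)."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def groundedness(answer: str, docs: List[str], threshold: float = 0.6) -> float:
    """Fraction of answer sentences lexically supported by some document."""
    sentences = _sentences(answer)
    if not sentences:
        return 0.0
    doc_tokens = [_tokens(d) for d in docs]
    supported = 0
    for sentence in sentences:
        toks = _tokens(sentence)
        if not toks:
            continue
        # Best coverage of this sentence's words by any single document.
        best = max((len(toks & d) / len(toks) for d in doc_tokens), default=0.0)
        if best >= threshold:
            supported += 1
    return supported / len(sentences)
```

A lexical proxy like this is cheap and deterministic, but it misses paraphrases and cannot detect contradictions; an LLM-based or entailment-based judge is a common alternative when those cases matter.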

In [None]:
questions = build_eval_questions()
benchmark_df = run_benchmark(vanilla=vanilla, agentic=agentic, queries=questions)
benchmark_df.head()

In [None]:
summary_df = plot_pipeline_comparison(
    benchmark_df,
    output_path=OUTPUTS_DIR / 'rag_vs_agentic_rag_metrics.png',
)
plot_architecture_difference(OUTPUTS_DIR / 'rag_vs_agentic_architecture.png')
summary_df

In [None]:
csv_path = OUTPUTS_DIR / 'rag_vs_agentic_benchmark.csv'
benchmark_df.to_csv(csv_path, index=False)
print(f'Saved benchmark rows: {len(benchmark_df)}')
print(f'CSV: {csv_path}')
print(f'Metrics plot: {OUTPUTS_DIR / "rag_vs_agentic_rag_metrics.png"}')
print(f'Architecture plot: {OUTPUTS_DIR / "rag_vs_agentic_architecture.png"}')

## Engineering-oriented interpretation

- If Agentic RAG improves groundedness without pushing latency past what the use case tolerates, it is usually the production choice for complex queries.
- If the domain is narrow and stable, Vanilla RAG may be sufficient and cheaper.
- The key point is not "agents because they are trendy" but the measurable cost-benefit per use case.
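
The trade-off in the bullets above can be written as an explicit decision rule. This is a sketch under stated assumptions: `PipelineSummary`, `prefer_agentic`, and both thresholds (a minimum groundedness gain of 0.05 and a maximum latency ratio of 2.0) are hypothetical and should be tuned per use case. The example values are illustrative only, not results from this benchmark.

```python
from dataclasses import dataclass


@dataclass
class PipelineSummary:
    groundedness: float     # mean groundedness over the benchmark, in [0, 1]
    latency_seconds: float  # mean end-to-end latency per query


def prefer_agentic(
    vanilla: PipelineSummary,
    agentic: PipelineSummary,
    min_groundedness_gain: float = 0.05,  # hypothetical threshold
    max_latency_ratio: float = 2.0,       # hypothetical latency budget
) -> bool:
    """Return True when the groundedness gain justifies the extra latency."""
    if vanilla.latency_seconds <= 0:
        raise ValueError("vanilla latency must be positive")
    gain = agentic.groundedness - vanilla.groundedness
    ratio = agentic.latency_seconds / vanilla.latency_seconds
    return gain >= min_groundedness_gain and ratio <= max_latency_ratio


# Illustrative values only; substitute the means from the benchmark summary.
baseline = PipelineSummary(groundedness=0.70, latency_seconds=2.0)
candidate = PipelineSummary(groundedness=0.80, latency_seconds=3.0)
print(prefer_agentic(baseline, candidate))
```

Writing the rule down forces the thresholds to be agreed on before looking at the numbers, which keeps the comparison from being decided after the fact.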