# Advanced RAG Playground Notebook
This notebook lets you experiment with the modules in `src/` and reproduce the concepts from the Grok article: indexing, query processing, retrieval, reranking, REFRAG compression, and evaluation.

In [None]:
# %pip installs into the environment of the running kernel, which `!pip` does not guarantee
%pip install -r ../requirements.txt

In [None]:
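import sys
from pathlib import Path

# Assumption: the notebook sits one level below the repo root (the `../data` paths suggest this),
# so the root is added to sys.path to make the `src` package importable.
sys.path.insert(0, str(Path('..').resolve()))
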
from src.data_loader import load_documents
from src.indexing import SemanticChunker
from src.pipeline import build_pipeline, run_pipeline
from src.evaluation import load_samples, evaluate_sample

DATA_PATH = Path('../data/knowledge_base.json')
EVAL_PATH = Path('../data/eval_questions.json')

## 1. Inspect the Knowledge Base

In [None]:
documents = load_documents(DATA_PATH)
len(documents), documents[0]

## 2. Build Index + Retriever

In [None]:
retriever, processor = build_pipeline(str(DATA_PATH))
retriever.indexer.chunks[:2]

## 3. Run the Pipeline for a Question

In [None]:
artifacts = run_pipeline(
    query="How does reranking improve the RAG pipeline?",
    data_path=str(DATA_PATH),
    retriever=retriever,
    processor=processor,
)
artifacts.refrag_summary, artifacts.answer_outline

## 4. Evaluate Keyword Coverage

In [None]:
samples = load_samples(EVAL_PATH)
sample = samples[0]
evaluate_sample(sample, data_path=str(DATA_PATH), retriever=retriever, processor=processor)
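The scoring itself happens inside `src.evaluation` and is not shown in this notebook. As a rough illustration of what a keyword-coverage metric can look like, here is a minimal sketch that assumes case-insensitive substring matching. The function `keyword_coverage` and its inputs are hypothetical and may differ from what `evaluate_sample` actually computes.

```python
def keyword_coverage(answer: str, keywords: list[str]) -> float:
    """Fraction of expected keywords found in the answer (case-insensitive).

    Hypothetical helper for illustration; not the implementation in src.evaluation.
    """
    if not keywords:
        # No expected keywords: report zero coverage rather than dividing by zero.
        return 0.0
    text = answer.lower()
    hits = sum(1 for kw in keywords if kw.lower() in text)
    return hits / len(keywords)


# Two of the three expected keywords appear in the answer text.
score = keyword_coverage(
    "Reranking reorders retrieved chunks by relevance.",
    ["reranking", "relevance", "latency"],
)
print(round(score, 3))
```

Substring matching is simple but coarse: it counts "rank" as present inside "reranking", so a stricter variant might tokenize the answer first.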

## 5. Experiment with Chunking Parameters

In [None]:
chunker = SemanticChunker(chunk_size=40, overlap=10)
alt_chunks = chunker.chunk(documents[0])
len(alt_chunks)

The chunker above is standalone and does not change the `retriever` built in section 2. To compare retrieval performance, rebuild the pipeline with different chunk sizes, reranker weights, or REFRAG selection ratios, then rerun sections 3 and 4.
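Before rebuilding the pipeline, it can help to see how chunk size and overlap alone change the number of chunks. The sketch below uses a plain word-window splitter as a stand-in. `window_chunks` is a hypothetical helper, not `SemanticChunker`, which may split on different boundaries and count size in different units.

```python
def window_chunks(text: str, chunk_size: int, overlap: int) -> list[str]:
    """Split text into word windows of `chunk_size` words sharing `overlap` words.

    Hypothetical stand-in for illustration; SemanticChunker may behave differently.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reaches the end of the text
    return chunks


# Synthetic 100-word text so the counts are easy to verify by hand.
sample_text = " ".join(f"w{i}" for i in range(100))
counts = {
    (size, overlap): len(window_chunks(sample_text, size, overlap))
    for size, overlap in [(40, 10), (40, 0), (20, 5)]
}
print(counts)
```

On this 100-word text the three settings yield 3, 3, and 7 chunks respectively. Smaller windows produce more chunks to index and rerank, which is worth keeping in mind when comparing evaluation scores across settings.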