# Agentic RAG with LangGraph (timeframe drift fix)

This notebook adds explicit control flow to fix timeframe drift, where an answer is grounded in outdated notes: **rewrite → retrieve → grade → bounded retry → generate with citations + confidence**.


In [4]:
from __future__ import annotations

import json
import os
import sys
from pathlib import Path

from dotenv import load_dotenv


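def _find_project_root() -> Path:
    """Walk up from the current directory until a folder containing src/config.py is found.

    Also checks a nested 'agentic-rag-second-brain' folder at each level, so the
    notebook works when launched from the parent repository.
    """
    cwd = Path.cwd().resolve()
    for base in (cwd, *cwd.parents):
        if (base / "src" / "config.py").exists():
            return base
        nested = base / "agentic-rag-second-brain"
        if (nested / "src" / "config.py").exists():
            return nested
    raise RuntimeError("Could not locate project root containing src/config.py")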

PROJECT_ROOT = _find_project_root()
if str(PROJECT_ROOT) not in sys.path:
    sys.path.insert(0, str(PROJECT_ROOT))
os.chdir(PROJECT_ROOT)

from src.config import settings
from src.graph import build_agentic_rag_graph, run_agentic_rag
from src.rag_baseline import baseline_rag_answer
from src.retrieval import load_persisted_index

load_dotenv()
if not os.getenv('OPENAI_API_KEY'):
    raise EnvironmentError('OPENAI_API_KEY is missing. Set it in your environment or .env file.')

print('Config loaded.')
print(f"PROJECT_ROOT={PROJECT_ROOT}")
print(f"OPENAI_MODEL={settings.openai_model}")
print(f"TOP_K={settings.top_k}, MAX_RETRIES={settings.max_retries}, RECENCY_DAYS={settings.recency_days}")


Config loaded.
PROJECT_ROOT=C:\Repos\Intro-to-RAG-Agentic-RAG-2602\agentic-rag-second-brain
OPENAI_MODEL=gpt-4o-mini
TOP_K=6, MAX_RETRIES=2, RECENCY_DAYS=365


In [5]:
index_dir = Path(settings.chroma_dir)
if not index_dir.exists() or not any(index_dir.iterdir()):
    raise FileNotFoundError(
        f'Persisted index not found at {index_dir}. Run notebooks/02_indexing_chroma_llamaindex.ipynb first.'
    )

index = load_persisted_index(chroma_dir=index_dir, embed_model=settings.embed_model)
print(f'Index loaded from {index_dir}')


Index loaded from data\processed\chroma


In [6]:
DRIFT_QUERY = 'What embedding model should we use?'

# Optional reminder of baseline behavior from Notebook 03
baseline = baseline_rag_answer(
    index=index,
    query=DRIFT_QUERY,
    top_k=int(settings.top_k),
    model=settings.openai_model,
    temperature=float(settings.temperature),
    max_context_chars=int(settings.max_context_chars),
)
print('Baseline answer (Notebook 03 style):')
print(baseline['answer'])
print('Baseline citations:')
for c in baseline['citations']:
    print('-', c)


Baseline answer (Notebook 03 style):
The recommended embedding model to use is EmbedPro-v2, as it has been shown to improve retrieval quality for nuanced queries, despite a significant cost increase compared to EmbedLite-v1. This change was made due to the need for better handling of semantically subtle questions that EmbedLite-v1 struggled with.
Baseline citations:
- {'doc_title': 'Embedding Model Decision Update: Quality Priority', 'doc_date': '2025-07-05', 'chunk_id': '1843528f9966ef38e563dc60ec056795eab0a0b1:7', 'source_path': 'C:\\Repos\\Intro-to-RAG-Agentic-RAG-2602\\agentic-rag-second-brain\\data\\raw\\notes\\2025-07-05-embedding-model-quality-shift.md'}
- {'doc_title': 'Embedding Rollout Postmortem', 'doc_date': '2025-10-21', 'chunk_id': '3ce7ccdc6cae9be14a952f15d541a4f87c73ec51:11', 'source_path': 'C:\\Repos\\Intro-to-RAG-Agentic-RAG-2602\\agentic-rag-second-brain\\data\\raw\\notes\\2025-10-21-embedding-rollout-postmortem.md'}


In [7]:
graph = build_agentic_rag_graph(
    index=index,
    openai_model=settings.openai_model,
    temperature=float(settings.temperature),
    top_k=int(settings.top_k),
    max_context_chars=int(settings.max_context_chars),
    max_retries=int(settings.max_retries),
    recency_days=int(settings.recency_days),
    evidence_min_recent_chunks=int(settings.evidence_min_recent_chunks),
    use_llm_grader=(settings.use_llm_grader == '1'),  # True only when the setting is the string '1'
    raw_notes_dir=settings.raw_notes_dir,
)

result = run_agentic_rag(graph, DRIFT_QUERY)

print('Decision trace:')
for step in result['decision_trace']:
    print('-', step)

print('\nRewritten query:')
print(result['rewritten_query'])

print('\nRetrieved chunks (score | doc_date | doc_title | chunk_id):')
for chunk in result['retrieved_chunks']:
    print(
        f"- {chunk['score']} | {chunk['doc_date']} | {chunk['doc_title']} | {chunk['chunk_id']}"
    )

print('\nGrade + retries:')
print(f"evidence_ok={result['evidence_ok']}, retry_count={result['retry_count']}, confidence={result['confidence']}")

print('\nFinal answer payload:')
print(json.dumps(result['final_answer'], indent=2))


Decision trace:
- rewrite: What is the latest recommendation for which embedding model to use? Prefer latest notes by date.
- retrieve: 627528ea8b8af5f59df5fae9b902a22869e8e53f:0|2025-01-10|Embedding Model Decision: Cost-First Default, aa23b701925fb16b6312fa7dc6f53474541c46fc:3|2025-03-18|Q1 Embedding Evaluation Notes, 1843528f9966ef38e563dc60ec056795eab0a0b1:7|2025-07-05|Embedding Model Decision Update: Quality Priority, 3ce7ccdc6cae9be14a952f15d541a4f87c73ec51:11|2025-10-21|Embedding Rollout Postmortem, 23d249bcab98910ef4133cf9ece7801e2b179a31:10|2025-10-02|Demo Retro: Internal Stakeholder Session, 20f6f8a9b43928b36d1bb1bb1d53b6391bf5eec0:9|2025-09-03|Chunking Strategy v2: Smaller Chunks + Overlap
- grade: evidence_ok=True, confidence=high (recent_chunks=6, min_required=1, topic_match=True, conflict_signals=False)
- continue: evidence_ok=True retry_count=0
- generate: completed answer with citations

Rewritten query:
What is the latest recommendation for which embedding model to use?
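The `grade` line in the decision trace reports `recent_chunks`, `min_required`, and `evidence_ok`. As a rough illustration of how a recency check could produce those fields, here is a minimal sketch. It is not the grader in `src.graph`: the function names, the `reference_date` parameter, and the rule "recent means within `recency_days` of a reference date" are all assumptions made for this example, and it ignores the `topic_match` and `conflict_signals` checks entirely. Only the `doc_date` values are taken from the trace above.

```python
from datetime import date, timedelta


def count_recent_chunks(chunks, reference_date, recency_days):
    """Count chunks whose doc_date falls within recency_days of reference_date."""
    cutoff = reference_date - timedelta(days=recency_days)
    return sum(1 for c in chunks if date.fromisoformat(c["doc_date"]) >= cutoff)


def grade_recency(chunks, reference_date, recency_days=365, min_recent_chunks=1):
    """Hypothetical grader: evidence is OK when enough chunks are recent."""
    recent = count_recent_chunks(chunks, reference_date, recency_days)
    return {
        "recent_chunks": recent,
        "min_required": min_recent_chunks,
        "evidence_ok": recent >= min_recent_chunks,
    }


# doc_date values copied from the retrieve step in the trace above.
trace_chunks = [
    {"doc_date": "2025-01-10"},
    {"doc_date": "2025-03-18"},
    {"doc_date": "2025-07-05"},
    {"doc_date": "2025-10-21"},
    {"doc_date": "2025-10-02"},
    {"doc_date": "2025-09-03"},
]

# Assumption: an illustrative reference date, not the one used in the actual run.
grade = grade_recency(trace_chunks, reference_date=date(2025, 11, 1))
print(grade)
```

With this illustrative reference date, all six chunks fall inside the 365-day window. Moving the reference date later, or raising `min_recent_chunks`, is what would flip `evidence_ok` to `False` and send a flow like this one into its retry branch.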

## What changed vs baseline?

The baseline pipeline in Notebook 03 runs retrieve → generate with no explicit evidence checks, so nothing prevents it from drifting to stale recommendations.

Notebook 04 adds explicit control flow with LangGraph:

1. Rewrite the query for recency intent.
2. Retrieve chunks.
3. Grade the evidence for recency and relevance.
4. Retry, bounded by `MAX_RETRIES`, if the evidence is weak.
5. Generate a grounded answer with citations and confidence, plus an optional clarifying next step when confidence is low.
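The five steps can be sketched as an ordinary loop to make the bounded-retry logic concrete. This is a plain-Python stand-in, not the LangGraph graph built by `build_agentic_rag_graph`: `run_bounded_flow` and the `stub_*` helpers are hypothetical names, and the choice to rewrite the query again on each retry is an assumption, since the source does not show what a retry changes.

```python
from typing import Callable


def run_bounded_flow(
    query: str,
    rewrite: Callable[[str, int], str],
    retrieve: Callable[[str], list],
    grade: Callable[[list], bool],
    generate: Callable[[str, list, bool], dict],
    max_retries: int = 2,
) -> dict:
    """Rewrite, retrieve, and grade; retry at most max_retries times; then generate."""
    trace = []
    retry_count = 0
    while True:
        rewritten = rewrite(query, retry_count)
        trace.append(f"rewrite: {rewritten}")
        chunks = retrieve(rewritten)
        trace.append(f"retrieve: {len(chunks)} chunks")
        evidence_ok = grade(chunks)
        trace.append(f"grade: evidence_ok={evidence_ok}")
        # Stop when the evidence passes or the retry budget is spent.
        if evidence_ok or retry_count >= max_retries:
            break
        retry_count += 1
        trace.append(f"retry: retry_count={retry_count}")
    answer = generate(rewritten, chunks, evidence_ok)
    trace.append("generate: completed")
    return {
        "final_answer": answer,
        "evidence_ok": evidence_ok,
        "retry_count": retry_count,
        "decision_trace": trace,
    }


# Hypothetical stubs so the sketch runs without an index or an LLM.
def stub_rewrite(query, attempt):
    return f"{query} (latest, attempt {attempt})"


def stub_retrieve(query):
    # Pretend that only the first retry surfaces a usable chunk.
    return ["chunk"] if "attempt 1" in query else []


def stub_grade(chunks):
    return len(chunks) > 0


def stub_generate(query, chunks, evidence_ok):
    confidence = "high" if evidence_ok else "low"
    return {"answer": "stub answer", "citations": chunks, "confidence": confidence}


result = run_bounded_flow(
    "What embedding model should we use?",
    stub_rewrite,
    stub_retrieve,
    stub_grade,
    stub_generate,
    max_retries=2,
)
for step in result["decision_trace"]:
    print("-", step)
```

The key property is that the loop always terminates: even if the grader never passes, generation still runs after `max_retries` retries, with `evidence_ok=False` available so the answer can carry low confidence.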
