# ANC DBRX: Minimal LangGraph Notebook

This notebook is the Databricks-side companion to the GCP baseline. It demonstrates a minimal LangGraph flow and documents the Databricks architecture that the GCP baseline is expected to migrate to.

## Status
- Current role: proof-of-concept and parity harness.
- Target role: Databricks notebook + Model Serving + Vector Search flow once migrated.

## Target Architecture Summary
- Development/debugging in notebooks
- Jobs/Workflows for ingest refresh
- Model Serving for embeddings and generation
- Vector Search for indexed retrieval
- MLflow for parity/evaluation tracking

## Guardrails
- Wikipedia-only sourcing for factual claims.
- Allow a small number of uploaded pictures, and only when the files come from Wikipedia/Wikimedia pages.
- Explicit uncertainty when retrieval evidence is weak.
- Keep this notebook minimal; feature work moves to the Flask localhost implementation.
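
The sourcing and uncertainty guardrails above could be enforced with small helper checks. The following is a minimal sketch, not part of the notebook's graph: the names `is_allowed_source`, `apply_uncertainty_guardrail`, and `MIN_EVIDENCE_SCORE`, as well as the 0.5 threshold and the `(chunk, score)` input shape, are assumptions made for illustration.

```python
from typing import List, Tuple
from urllib.parse import urlparse

# Assumption: only these domains (and their subdomains) count as valid sources.
ALLOWED_DOMAINS = ("wikipedia.org", "wikimedia.org")

# Hypothetical threshold; tune it against real retrieval scores.
MIN_EVIDENCE_SCORE = 0.5


def is_allowed_source(url: str) -> bool:
    """Return True only for Wikipedia/Wikimedia hosts."""
    host = (urlparse(url).hostname or "").lower()
    # Match the domain itself or a true subdomain, not a lookalike suffix.
    return any(host == d or host.endswith("." + d) for d in ALLOWED_DOMAINS)


def apply_uncertainty_guardrail(
    answer: str, scored_chunks: List[Tuple[str, float]]
) -> str:
    """Prefix the answer with an explicit note when retrieval evidence is weak."""
    best_score = max((score for _, score in scored_chunks), default=0.0)
    if best_score < MIN_EVIDENCE_SCORE:
        return "Uncertain (weak retrieval evidence): " + answer
    return answer
```

A check like `is_allowed_source` would fit before a citation is attached, and `apply_uncertainty_guardrail` would fit inside or just after `finalize_answer` once real retrieval scores are available.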


## PlantUML Workflow Diagram

```plantuml
@startuml
title ANC DBRX Minimal LangGraph
start
:Input question;
:prepare_context node;
:draft_answer node;
:finalize_answer node;
:Return answer + citation placeholder;
stop
@enduml
```


In [None]:
# Uncomment in fresh environments:
# %pip install -q -r ../requirements-dbrx-dev.txt


In [None]:
from typing import TypedDict

from langgraph.graph import END, START, StateGraph


class GraphState(TypedDict):
    question: str
    provider: str
    context: str
    draft: str
    answer: str


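def prepare_context(state: GraphState) -> GraphState:
    """Build a stub context string from the provider and question.

    No retrieval happens here; full mode would use Databricks Vector Search.
    """
    context = (
        f"{state['provider']} context stub for: {state['question']}. "
        "In full mode this would use retrieved chunks from Databricks Vector Search."
    )
    return {**state, 'context': context}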


def draft_answer(state: GraphState) -> GraphState:
    """Produce a placeholder draft that echoes the prepared context."""
    draft = f"Draft answer based on context: {state['context']}"
    return {**state, 'draft': draft}


def finalize_answer(state: GraphState) -> GraphState:
    """Append a placeholder Wikipedia citation to the draft."""
    answer = (
        f"{state['draft']} Citation placeholder: "
        "https://en.wikipedia.org/wiki/Main_Page"
    )
    return {**state, 'answer': answer}


# Register the three nodes, then wire them into a linear START -> END pipeline.
graph = StateGraph(GraphState)
graph.add_node('prepare_context', prepare_context)
graph.add_node('draft_answer', draft_answer)
graph.add_node('finalize_answer', finalize_answer)

graph.add_edge(START, 'prepare_context')
graph.add_edge('prepare_context', 'draft_answer')
graph.add_edge('draft_answer', 'finalize_answer')
graph.add_edge('finalize_answer', END)

app = graph.compile()
print('Graph compiled successfully.')


In [None]:
result = app.invoke(
    {
        'question': 'What is the Appalachian Plateau?',
        'provider': 'Databricks/DBRX',
        'context': '',
        'draft': '',
        'answer': '',
    }
)

for key in ('provider', 'question', 'context', 'draft', 'answer'):
    print(f"{key}: {result[key]}")


## Phased Plan and Next Phases

Databricks phases:
1. Parity harness with minimal graph behavior.
2. Notebook port with provider integrations.
3. Vector Search + Model Serving + evaluation tracking.
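
As a rough illustration of the phase 1 parity harness, the sketch below compares a candidate result against a baseline result using only structural checks. The function name `parity_issues` and the specific checks are assumptions, not the project's actual parity criteria; the state keys are taken from `GraphState` in this notebook.

```python
from typing import Dict, List

# Keys defined by GraphState in this notebook.
STATE_KEYS = ("question", "provider", "context", "draft", "answer")


def parity_issues(baseline: Dict[str, str], candidate: Dict[str, str]) -> List[str]:
    """Return a list of human-readable parity problems (empty means parity holds)."""
    issues: List[str] = []
    # Every state key produced by the graph should be present in the candidate.
    for key in STATE_KEYS:
        if key not in candidate:
            issues.append(f"missing key: {key}")
    # Both runs should answer the same question.
    if baseline.get("question") != candidate.get("question"):
        issues.append("question differs")
    # The candidate must produce a non-empty answer with a Wikipedia citation.
    answer = candidate.get("answer", "")
    if not answer:
        issues.append("empty answer")
    elif "wikipedia.org" not in answer:
        issues.append("no Wikipedia citation in answer")
    return issues
```

Calling `parity_issues(gcp_result, result)` with the output of `app.invoke(...)` and a stored GCP baseline result (here a hypothetical `gcp_result` dict) would give a list that could later be logged as an evaluation artifact.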

Product phases from this point:
1. Keep this notebook as a stable PoC artifact only.
2. Finish notebook baselines with Wikipedia/Wikimedia-only sourcing (including notebook picture files).
3. Move all feature development into the Flask localhost app.
4. Expand app/flask data sources to additional public sources (for example USGS).
5. Start iOS planning after Flask behavior is reliable.
