# Phase 1 Notebook: `state_schema.py`

Target file: `execution/langgraph/state_schema.py`

Purpose: define the canonical run state, defaulting behavior, and hashing/time helpers that make LangGraph state durable and safe across retries.

## 3-Layer Context

- Layer 1 (Directive): `directives/phase1_langgraph.md` requires schema-first state and durable orchestration.
- Layer 2 (Orchestration): `graph.py` reads/writes this state on every node transition.
- Layer 3 (Execution): this file is deterministic and contains zero model calls.

In [None]:
import os
import sys
from pathlib import Path

from dotenv import load_dotenv


def bootstrap_repo_root() -> Path:
    """Find the repo root and put it on sys.path so `execution.*` imports resolve."""
    cwd = Path.cwd().resolve()
    # Search the working directory and its parents first; the absolute path
    # is a machine-specific fallback.
    candidates = [cwd, *cwd.parents, Path('/home/nir/dev/agent_phase0')]
    for candidate in candidates:
        if (candidate / 'execution' / 'langgraph' / 'state_schema.py').exists():
            if str(candidate) not in sys.path:
                sys.path.insert(0, str(candidate))
            return candidate
    raise RuntimeError('Could not locate repo root for Phase 1 notebooks.')


repo_root = bootstrap_repo_root()
load_dotenv(repo_root / '.env')  # load .env before reading P1_PROVIDER
print('P1_PROVIDER =', os.getenv('P1_PROVIDER', 'ollama'))
print('repo_root =', repo_root)
print('kernel_python =', sys.executable)
# Warn when the kernel is not running from the project's virtual environment.
if '/.venv/' not in sys.executable.replace('\\', '/'):
    print('WARNING: kernel is not the project .venv interpreter.')

## P0 vs P1 (State)

| Concern | Phase 0 (`orchestrator.py`) | Phase 1 (`state_schema.py`) | Why this matters |
|---|---|---|---|
| State shape | Implicit mutable dict/list usage | Explicit `RunState` typed contract | Fewer hidden key errors during retries/checkpoint resume |
| Defaulting | Assumed keys exist | `ensure_state_defaults` repairs partial state | Prevents runtime crashes after graph transitions |
| Trace identifiers | Local counters only | `run_id`, tool signatures, policy counters | Reproducible debugging and auditability |
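
To make the "explicit typed contract" row concrete, here is a minimal sketch of what such a contract could look like as a `TypedDict`. The class name `RunStateSketch` is hypothetical, the field names are taken from the required-key check later in this notebook, and every field type is an assumption. The real definition is whatever `state_schema.RunState` contains.

```python
from typing import Any, Optional, TypedDict


class RunStateSketch(TypedDict, total=False):
    """Hypothetical stand-in for RunState; all field types are assumptions."""

    run_id: str
    step: int
    messages: list[dict[str, Any]]
    completed_tasks: list[str]
    tool_history: list[dict[str, Any]]
    memo_events: list[dict[str, Any]]
    retry_counts: dict[str, int]
    policy_flags: dict[str, bool]
    seen_tool_signatures: list[str]
    tool_call_counts: dict[str, int]
    pending_action: Optional[dict[str, Any]]
    final_answer: str


# total=False lets a partial snapshot type-check; defaults are filled in later.
partial_state: RunStateSketch = {"run_id": "demo", "step": 0}
print(sorted(RunStateSketch.__annotations__))
```

Declaring the keys in one place means a type checker can flag a misspelled key before a retry or checkpoint resume hits it at runtime.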

## Why this block exists: inspect contracts

We inspect the source to confirm the state contract includes memo and duplicate-call fields required by Phase 1 policy logic.

In [None]:
import inspect
import execution.langgraph.state_schema as state_schema

print(inspect.getsource(state_schema.RunState))
print(inspect.getsource(state_schema.new_run_state))
print(inspect.getsource(state_schema.ensure_state_defaults))
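
The duplicate-call fields are easier to read with a picture of how they might be used. The sketch below is an illustration under stated assumptions, not the module's code: `tool_signature` and `record_tool_call` are hypothetical helpers, and SHA-256 over canonical JSON is one common way to build a stable signature. Only the state keys (`seen_tool_signatures`, `tool_call_counts`, `retry_counts['duplicate_tool']`) come from this notebook.

```python
import hashlib
import json
from typing import Any


def tool_signature(tool_name: str, args: dict[str, Any]) -> str:
    """Hypothetical helper: stable hash of a tool call, independent of key order."""
    payload = json.dumps(
        {"tool": tool_name, "args": args}, sort_keys=True, separators=(",", ":")
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def record_tool_call(state: dict[str, Any], tool_name: str, args: dict[str, Any]) -> bool:
    """Update counters in place; return True if this exact call was seen before."""
    signature = tool_signature(tool_name, args)
    seen = state.setdefault("seen_tool_signatures", [])
    counts = state.setdefault("tool_call_counts", {})
    retries = state.setdefault("retry_counts", {})
    counts[tool_name] = counts.get(tool_name, 0) + 1
    if signature in seen:
        retries["duplicate_tool"] = retries.get("duplicate_tool", 0) + 1
        return True
    seen.append(signature)
    return False


demo_state: dict[str, Any] = {}
first = record_tool_call(demo_state, "search", {"q": "langgraph", "k": 3})
# Same arguments in a different key order hash to the same signature.
second = record_tool_call(demo_state, "search", {"k": 3, "q": "langgraph"})
print(first, second, demo_state["retry_counts"])
```

Sorting keys before hashing is the design choice that matters here: without it, two semantically identical calls could produce different signatures and slip past the duplicate check.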

## Why this block exists: prove missing-key recovery

LangGraph nodes can receive partial state snapshots. The cells below pass a state containing only `run_id` and check that `ensure_state_defaults` fills in every missing key deterministically.

In [None]:
partial = {'run_id': 'demo'}
fixed = state_schema.ensure_state_defaults(partial, system_prompt='demo prompt')
fixed

In [None]:
required_keys = {
    'run_id', 'step', 'messages', 'completed_tasks', 'tool_history',
    'memo_events', 'retry_counts', 'policy_flags', 'seen_tool_signatures',
    'tool_call_counts', 'pending_action', 'final_answer'
}
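# Report exactly which keys are absent instead of failing with a bare AssertionError.
missing = required_keys.difference(fixed.keys())
assert not missing, f'missing keys: {sorted(missing)}'
assert fixed['retry_counts']['duplicate_tool'] == 0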
print('state defaults assertion passed')

## Takeaways

- This file is the single source of truth for graph state schema.
- `ensure_state_defaults` repairs partial snapshots, so nodes can rely on every required key being present after a transition or retry.
- P1 adds explicit counters/signatures that P0 did not model as first-class state.
- Next notebook: `execution/notebooks/p1_provider.ipynb`.