#**CHAPTER 5. LIQUIDITY REGIME EXECUTION MACHINE**
---

##0.CONTEXT

**Introduction: Notebook 5 as a Liquidity-Aware Execution Regime Machine**

Imagine you are sitting on a trading desk with an order ticket that looks deceptively simple: **BUY 250,000 units**. No options, no structured payoff, no fancy forecasting—just “get it done.” In real life, that simplicity is a trap. Execution is not a single action; it is a **sequence of controlled decisions** made under changing liquidity, widening spreads, thinning depth, and the constant fear that your own trading will move the market against you. Notebook 5 is built to teach that reality as an **agentic architecture**: not an “AI trader,” but a **state-driven workflow** that behaves like a professional execution process with explicit controls, bounded iteration, and auditable outcomes.

This notebook’s purpose is architectural: it demonstrates how a finance problem—**execution under liquidity stress**—is naturally a **state machine**, and how a LangGraph topology turns that state machine into a transparent, inspectable system. The novelty is not “prompting.” The novelty is that **every decision is routed by state**, the loop is bounded by explicit rules, and the system exports artifacts that make the run reviewable: **run_manifest.json**, **graph_spec.json**, and **final_state.json**.

---

**The task this agentic model performs**

The model’s job is to take an order and repeatedly answer a narrow professional question:

**Given the current microstructure conditions, what is the safest feasible execution tactic right now—and should we continue or stop?**

Notice what is missing on purpose: we are not predicting price direction, not estimating alpha, not optimizing portfolio weights. This is not a research model; it is an execution control system. Its “intelligence” is operational: detect liquidity regime, choose a tactic, estimate costs and slippage, and enforce risk limits.

This is what execution desks actually do, even when they use sophisticated internal tools. They do not “trade once.” They trade in increments, reassessing conditions repeatedly, and they stop when constraints say stop. Notebook 5 makes that structure explicit and teachable.

---

**The cast: agents and gates**

This notebook has **agents** (nodes that act) and **gates** (nodes that decide whether the system is allowed to proceed).

**Agents**
1. **market_data**  
   Produces a synthetic market snapshot: midprice, spread in bps, top-of-book depth proxy, volatility proxy, and an impact coefficient. The point is not realism of the numbers; the point is that the state changes each loop in a deterministic way so you can learn the control structure.

2. **regime_machine**  
   Computes a **regime score** and updates a discrete regime label: **NORMAL**, **TIGHT**, or **STRESSED**. This node is deliberately deterministic because regime classification is part of a risk control layer. You want stable, explainable transitions, not creativity.

3. **llm_tactic_proposer**  
   Calls the locked model (**claude-haiku-4-5-20251001**) to propose a tactic from a bounded action space: **TWAP, VWAP, ICEBERG, PAUSE**. The LLM is not allowed to invent tactics or “do strategy.” It is constrained to propose within a professional menu. The system captures the proposal, confidence, and rationale as audit notes. If the LLM fails (bad JSON, invalid choice), the system records an error and falls back to deterministic logic—no silent failure.

4. **execution_tactic**  
   Converts the proposed tactic (LLM or fallback) into simulated fills and cost estimates: fill quantity, remaining quantity, estimated cost bps, estimated slippage bps, total bps. This is the “execution engine” of the toy model.

**Gate**
5. **risk_gate**  
   This is the control point that decides whether the loop continues or terminates. It enforces three professional stop conditions:
   - **LIMIT_BREACH_HUMAN_REVIEW** if cost/slippage exceed thresholds
   - **ORDER_COMPLETE** if remaining quantity is zero
   - **MAX_STEPS_REACHED** if we hit the bounded loop ceiling

The important governance idea is that the LLM does not “own” stopping decisions. Stopping is a risk function, not a language function.
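
As a minimal sketch, the three stop conditions can be read as one pure function over state fields. The field names follow the state described in this notebook, but the function name **risk_gate_reason**, the order of the checks, and the strict comparisons are assumptions made for illustration; the notebook's own risk_gate node may differ.

```python
from typing import Any, Dict

def risk_gate_reason(s: Dict[str, Any]) -> str:
    """Return a termination reason, or "" if the loop may continue.

    Hypothetical helper: check order and comparison operators are assumptions.
    """
    # Governance stop: estimated cost or slippage is above its limit.
    if s["est_cost_bps"] > s["max_cost_bps"] or s["est_slippage_bps"] > s["max_slippage_bps"]:
        return "LIMIT_BREACH_HUMAN_REVIEW"
    # Happy path: nothing left to execute.
    if s["remaining_qty"] <= 0:
        return "ORDER_COMPLETE"
    # Bounded-loop stop: the step counter reached the ceiling.
    if s["step"] >= s["max_steps"]:
        return "MAX_STEPS_REACHED"
    return ""

base = {
    "est_cost_bps": 10.0, "est_slippage_bps": 5.0,
    "max_cost_bps": 35.0, "max_slippage_bps": 20.0,
    "remaining_qty": 1_000, "step": 2, "max_steps": 8,
}
print(repr(risk_gate_reason(base)))                          # continue
print(risk_gate_reason({**base, "est_cost_bps": 40.0}))      # limit breach
```

Note that no LLM output appears in the function signature: the gate reads numbers from state and nothing else.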

---

**How the system thinks: state, not chat**

Every node reads and writes the same explicit TypedDict state. There is no hidden memory and no implicit narrative. The system does not “remember” because of conversation; it “remembers” because the state carries forward fields like:

- **remaining_qty** (what is still left to execute)
- **spread_bps**, **depth_1**, **vol_bps** (liquidity conditions)
- **regime** and **regime_hysteresis** (regime control state)
- **tactic**, **est_total_bps** (latest execution decision and its estimated cost)
- **step** and **max_steps** (bounded loop drivers)
- **trace**, **notes**, **errors** (audit trail)

This is exactly the kind of discipline that financial control systems require: you can inspect the state at any moment and understand why the system routed the way it did.

---

**The loop as a professional execution cycle**

Now let’s narrate what happens, loop by loop, as if you were walking someone through a desk tool.

**Loop 0: initialization**
The run begins with an order: **BUY 250,000**. The market snapshot starts at a baseline: a midprice around 100, a modest spread, a depth proxy, and a volatility proxy. The regime begins as **NORMAL**. The system has limits: max cost bps and max slippage bps (35 and 20 in this notebook's configuration). It also has a strict ceiling on how many times it is allowed to iterate: **max_steps** (8 here). This ceiling is the notebook’s “bounded loop” principle, which is critical in production systems to prevent runaway workflows.

**Loop 1: market update → regime classification**
The **market_data** agent updates the snapshot. The key concept is **size pressure**: remaining quantity relative to depth. When remaining quantity is large relative to depth, spreads tend to widen and depth tends to thin. The model simulates that in a deterministic way. Immediately after, **regime_machine** converts these conditions into a regime score. If the spread is wider, depth is thinner, or volatility is higher, the score rises. The regime may remain NORMAL, move to TIGHT, or escalate to STRESSED depending on thresholds. Hysteresis prevents flip-flopping: once you move into stress, you don’t immediately bounce out due to tiny noise.

**Loop 1: LLM proposes, execution engine simulates**
Now **llm_tactic_proposer** is called. It sees the full snapshot—spread, depth, vol, impact coefficient, regime label, and remaining quantity—and must pick a tactic from the allowed list. This is where a human trader would say, “I’m not crossing wide spreads with size. If liquidity is tight, I’ll slow down; if it’s stressed, I might pause.”

Then **execution_tactic** takes the chosen tactic and simulates what happens: maybe a TWAP fills some fraction, VWAP fills a bit more gently, ICEBERG fills slower but with reduced impact, PAUSE fills nothing. It computes estimated costs and slippage. This is your “execution result” for the loop.

**Loop 1: risk gate decides continue or stop**
Finally, **risk_gate** checks: did we breach cost limits? Did we finish the order? Did we hit max steps? If none apply, it increments **step** and routes back to **market_data**. That routing is not an if-statement in your notebook; it is a LangGraph conditional edge. That is the architectural lesson: in agentic systems, the graph topology is the control surface.

**Loops 2–N: repeating under changing conditions**
Each loop repeats the same structure. The power is in the stability of the cycle:
1. observe market
2. classify regime
3. propose tactic
4. simulate results
5. gate decision

In finance terms, you can think of this as a simplified “microstructure control loop.” You are continuously answering: “Given current liquidity, what is my next safe action?”
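
The cycle above can be sketched in plain Python to make the control flow visible. This is only a stand-in: the notebook expresses the routing as a LangGraph conditional edge, and the names **run_cycle**, **toy_fill**, and **toy_gate**, as well as the fixed 40,000-unit fill, are hypothetical.

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]

def run_cycle(state: State,
              nodes: List[Callable[[State], State]],
              gate: Callable[[State], str]) -> State:
    """Run every node in order, then let the gate decide: stop or loop again."""
    while True:
        for node in nodes:
            state = node(state)
        reason = gate(state)
        if reason:                      # the gate, not a node, owns stopping
            state["done"] = True
            state["termination_reason"] = reason
            return state
        state["step"] += 1              # only the "continue" branch advances the counter

def toy_fill(s: State) -> State:
    # Hypothetical execution step: fill at most 40,000 units per cycle.
    s["remaining_qty"] -= min(s["remaining_qty"], 40_000)
    return s

def toy_gate(s: State) -> str:
    if s["remaining_qty"] == 0:
        return "ORDER_COMPLETE"
    if s["step"] + 1 >= s["max_steps"]:
        return "MAX_STEPS_REACHED"
    return ""

final = run_cycle({"remaining_qty": 250_000, "step": 0, "max_steps": 8}, [toy_fill], toy_gate)
capped = run_cycle({"remaining_qty": 250_000, "step": 0, "max_steps": 3}, [toy_fill], toy_gate)
print(final["termination_reason"], capped["termination_reason"], capped["remaining_qty"])
```

The second run shows the bounded-loop outcome: the ceiling is reached while quantity remains, and the state says so explicitly.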

---

**What stopping looks like in practice**

The system can stop in three ways, each professionally meaningful:

**1) ORDER_COMPLETE**
This is the happy path: remaining quantity reaches zero. The run ends with an explicit completion reason. In real execution, this is your “done” state.

**2) LIMIT_BREACH_HUMAN_REVIEW**
This is a governance stop. If estimated costs or slippage exceed limits, the system does not keep “trying.” It halts and escalates. This models real desk practice: beyond certain thresholds, automation should not push forward.

**3) MAX_STEPS_REACHED**
This is the bounded-loop stop. It means: “We have performed the allowed number of execution cycles, and we are not allowed to continue automatically.” In real life, this resembles a controlled batch execution window: you tried for a set number of intervals, conditions remained poor, and you stop for reassessment.

This is exactly what you saw: the run ended at max steps with significant remaining quantity. That outcome is not a failure of architecture; it is the architecture enforcing discipline.

---

**Interpreting your specific outcome**

You observed:
- **termination_reason: MAX_STEPS_REACHED**
- **regime: STRESSED**
- **last_tactic: PAUSE**
- **remaining_qty: large**

Translated into desk language:
“We entered stressed liquidity. The system repeatedly found conditions too hostile to execute safely, and the safe action became to pause. Because we cap the number of automated cycles, we terminated after the allowed attempts and produced a final state for review.”

That is a realistic pattern. In stress, doing nothing can be the correct action. But the important thing is that the system does not hide behind narrative. It tells you exactly why it stopped and what state it was in.

---

**Why this is valuable for financial professionals**

This notebook teaches a professional mental model: execution is a **governed control system**, not a single trade. The key learning artifacts are:

- **Visible topology** (Mermaid graph): you can point to the process and explain it.
- **Typed state**: every decision is anchored in fields you can audit.
- **Bounded loop**: no runaway systems.
- **Explicit gate**: risk limits override “smart” suggestions.
- **Run artifacts**: a supervisor can inspect **final_state.json**, see every node call in **trace**, and understand the stop reason from **run_manifest.json**.

In short: Notebook 5 is not “AI trading.” It is **AI-assisted execution governance**—a state machine that behaves like a disciplined desk workflow, with an LLM used in a bounded, reviewable role and risk gates that remain deterministic and enforceable.


##1.LIBRARIES AND ENVIRONMENT

**CELL 1 — Environment setup, conflict-safe installs, strict key policy, and Mermaid rendering**

Cell 1 is the “foundation slab” of the notebook. In a professional workflow, you cannot discuss reliability until you can reproduce the environment that produced the result. This cell therefore does four jobs: (1) stabilizes packages, (2) sets deterministic defaults, (3) enforces the strict API-key policy, and (4) defines the visualization layer that turns the workflow into a learning artifact.

First, it uses conflict-safe installation patterns for Colab. Colab comes with many preloaded libraries, sometimes with versions that can conflict with what LangGraph or Anthropic clients expect. Upgrading pip and pinning versions reduces “it worked yesterday” failures, which matter in a classroom and matter even more in audit-like settings. Note that the anthropic client is given only a lower bound (>=0.34.0) rather than an exact pin, so its resolved version can drift between sessions; the printed VERSIONS line records what was actually installed. The goal is not just to run once; it is to run consistently for many students and many replays.

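Second, it sets deterministic controls: a fixed random seed and PYTHONHASHSEED. One caveat: assigning PYTHONHASHSEED from inside a running interpreter does not change hash randomization for that interpreter, which is fixed at startup; the assignment only reaches subprocesses launched afterwards. In notebook 5, we generate synthetic microstructure dynamics. If the seed is not fixed, students cannot compare runs, and reviewers cannot replicate behavior. Determinism here is a governance requirement, not a convenience.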

Third, the cell enforces the strict requirement that the API key must exist. This is a deliberate professional posture: if the architecture claims to use a specific model, the run should fail fast if the key is missing. Silent fallbacks in finance are dangerous because they create false confidence. If we want an “LLM-in-the-loop” notebook, we enforce that the LLM can actually be called.

Fourth, the cell defines a hardened Mermaid ESM renderer. Visualization is mandatory because the graph is the lesson. Mermaid is pinned so the diagram renders the same way across machines. The renderer is designed to fail loudly, not silently. If the graph cannot be rendered, you should treat that as a meaningful error: the topology is part of the deliverable.

By the end of Cell 1, you have a stable runtime, a strict model access policy, deterministic behavior controls, and the ability to render the LangGraph topology. This is not “setup fluff.” It is the governance infrastructure that makes everything else trustworthy.
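
To make "the graph is the lesson" concrete, here is a small sketch that builds Mermaid flowchart text for the five-node loop described in the introduction. The node names come from this notebook; the helper **to_mermaid**, the **EDGES** list, and the edge labels "continue" and "stop" are assumptions for illustration, and this is not how LangGraph itself emits Mermaid.

```python
from typing import List, Tuple

# (source, target, label) triples; labels are hypothetical.
EDGES: List[Tuple[str, str, str]] = [
    ("market_data", "regime_machine", ""),
    ("regime_machine", "llm_tactic_proposer", ""),
    ("llm_tactic_proposer", "execution_tactic", ""),
    ("execution_tactic", "risk_gate", ""),
    ("risk_gate", "market_data", "continue"),
    ("risk_gate", "END", "stop"),
]

def to_mermaid(edges: List[Tuple[str, str, str]]) -> str:
    """Render an edge list as Mermaid flowchart source (top-down layout)."""
    lines = ["flowchart TD"]
    for src, dst, label in edges:
        arrow = f"-->|{label}|" if label else "-->"
        lines.append(f"    {src} {arrow} {dst}")
    return "\n".join(lines)

mermaid_code = to_mermaid(EDGES)
print(mermaid_code)
```

The resulting string is the kind of input a renderer such as display_langgraph_mermaid expects.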


In [1]:
# CELL 1/10 — Install + core imports (Colab-ready, conflict-safe) + deterministic config
# STRICT: Anthropic key must exist (fails fast)

!pip -q install --upgrade "pip>=24.0"
!pip -q install "httpx==0.28.1" "httpcore==1.0.5"
!pip -q install "langgraph==0.2.39" "langchain==0.3.14" "langchain-core==0.3.40" "anthropic>=0.34.0"

import os, json, uuid, hashlib, platform, random, time, sys
import datetime as _dt
from typing import TypedDict, Literal, Dict, Any, List, Optional, Callable, Tuple

from langgraph.graph import StateGraph, END
from google.colab import userdata
from IPython.display import HTML, display

# Determinism
SEED = 5
random.seed(SEED)
os.environ["PYTHONHASHSEED"] = str(SEED)

import importlib.metadata as md
def _ver(pkg: str) -> str:
    try:
        return md.version(pkg)
    except Exception:
        return "missing"

def utc_now_iso() -> str:
    return _dt.datetime.now(_dt.timezone.utc).isoformat()

def sha256_str(s: str) -> str:
    return hashlib.sha256(s.encode("utf-8")).hexdigest()

print("VERSIONS:", {
    "python": sys.version.split()[0],
    "platform": platform.platform(),
    "langgraph": _ver("langgraph"),
    "langchain": _ver("langchain"),
    "langchain-core": _ver("langchain-core"),
    "anthropic": _ver("anthropic"),
    "httpx": _ver("httpx"),
    "httpcore": _ver("httpcore"),
    "langgraph-prebuilt": _ver("langgraph-prebuilt"),  # observe only; do not use
    "seed": SEED
})

# STRICT KEY MUST EXIST (ALL CAPS)
API_KEY = userdata.get("ANTHROPIC_API_KEY")
if not API_KEY:
    raise RuntimeError('Missing Colab secret: userdata.get("ANTHROPIC_API_KEY") (ALL CAPS)')
print("ANTHROPIC_API_KEY loaded:", "yes")

# ---- Mermaid visualization (pinned) ----
MERMAID_VERSION = "10.6.1"

def display_langgraph_mermaid(mermaid_code: str, *, height_px: int = 520) -> None:
    diagram_id = f"mmd-{sha256_str(mermaid_code)[:10]}"
    html = f"""
    <div id="{diagram_id}" style="border:1px solid rgba(0,0,0,0.12); border-radius:10px; padding:10px; overflow:auto; min-height:{height_px}px;">
      <div style="font-family: ui-monospace, SFMono-Regular, Menlo, Monaco, Consolas, 'Liberation Mono', 'Courier New', monospace; font-size:12px; opacity:0.75; margin-bottom:6px;">
        Mermaid {MERMAID_VERSION} — LangGraph topology
      </div>
      <div class="mermaid">
{mermaid_code}
      </div>
    </div>

    <script type="module">
      try {{
        const mermaid = (await import("https://cdn.jsdelivr.net/npm/mermaid@{MERMAID_VERSION}/dist/mermaid.esm.min.mjs")).default;
        mermaid.initialize({{
          startOnLoad: false,  // rendering is triggered manually via mermaid.render below
          securityLevel: "strict",
          theme: "default",
          flowchart: {{ htmlLabels: true, curve: "basis" }}
        }});
        const el = document.querySelector("#{diagram_id} .mermaid");
        if (!el) throw new Error("Mermaid element not found");
        const svgId = "{diagram_id}-svg";
        const {{ svg }} = await mermaid.render(svgId, el.textContent);
        el.innerHTML = svg;
      }} catch (e) {{
        const host = document.getElementById("{diagram_id}");
        if (host) host.innerHTML = `<pre style="white-space:pre-wrap; color:#b00020;">Mermaid render failed: ${{String(e)}}</pre>`;
        console.error(e);
      }}
    </script>
    """
    display(HTML(html))


VERSIONS: {'python': '3.12.12', 'platform': 'Linux-6.6.105+-x86_64-with-glibc2.35', 'langgraph': '0.2.39', 'langchain': '0.3.14', 'langchain-core': '0.3.40', 'anthropic': '0.82.0', 'httpx': '0.28.1', 'httpcore': '1.0.5', 'langgraph-prebuilt': '1.0.7', 'seed': 5}
ANTHROPIC_API_KEY loaded: yes


##2.CONFIGURATION

###2.1.OVERVIEW

**CELL 2 — Typed state schema, configuration, and explicit initialization**

Cell 2 defines the notebook’s central idea: **state is the contract**. Instead of letting the system behave like a conversation, we define a TypedDict that lists every field the workflow is allowed to read and write. This is the clearest way to teach agentic architecture to financial professionals, because it matches how real systems are designed: you define the data model first, and you treat it as a source of truth.

The ExecState schema covers three categories. First are governance and control fields: run_id, timestamps, step counters, and termination reason. These fields guarantee that every run can be traced and that the loop cannot run forever. Second are market and order inputs: order size, urgency, spread, depth, volatility, and an impact coefficient. These variables are not meant to be “perfectly realistic,” but they are sufficient to explain the mechanism of liquidity stress. Third are decision outputs: tactic choice, estimated costs, fills, and remaining quantity.

Notebook 5 adds a major state extension versus earlier notebooks: LLM tracking fields. Instead of treating the model as a black box, the state records how many calls were made, what the last payload was, what tactic was proposed, and what rationale was given. This is a governance upgrade: it allows reviewers to distinguish “LLM suggested X” from “system executed X.” That separation matters in supervised environments.
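
A hedged sketch of how "LLM suggested X" can be kept apart from "system executed X": parse the raw model text, accept it only if the tactic is on the allowed menu, and otherwise record the error and use a fallback. The JSON keys "tactic" and "confidence", the function name **parse_proposal**, and the shape of the audit record are assumptions; the notebook's proposer node may structure this differently.

```python
import json
from typing import Any, Dict, Tuple

ALLOWED = ("TWAP", "VWAP", "ICEBERG", "PAUSE")  # bounded action space from the introduction

def parse_proposal(raw: str, fallback: str) -> Tuple[str, Dict[str, Any]]:
    """Return (tactic_to_use, audit_record). Never raises on bad model output."""
    try:
        payload = json.loads(raw)                      # bad JSON raises ValueError
        tactic = str(payload.get("tactic", "")).upper()
        if tactic not in ALLOWED:
            raise ValueError(f"tactic not in allowed menu: {tactic!r}")
        confidence = float(payload.get("confidence", 0.0))
        return tactic, {"source": "llm", "tactic": tactic,
                        "confidence": confidence, "error": ""}
    except (ValueError, TypeError, AttributeError) as exc:
        # No silent failure: the error text is kept for the audit trail.
        return fallback, {"source": "fallback", "tactic": fallback,
                          "confidence": 0.0, "error": str(exc)}

ok = parse_proposal('{"tactic": "iceberg", "confidence": 0.8}', "PAUSE")
bad_json = parse_proposal("not json at all", "PAUSE")
off_menu = parse_proposal('{"tactic": "MARKET_SWEEP"}', "TWAP")
print(ok[0], bad_json[1]["source"], off_menu[1]["error"])
```

The "source" field in the record is what lets a reviewer tell, after the fact, whether the executed tactic came from the model or from the deterministic fallback.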

Configuration is also explicit: the model name is locked, loop bounds are defined, and risk limits are stated in basis points. This is the professional difference between a demo and a system: limits must be visible and deliberate.

Finally, init_state constructs a fully populated state object with safe defaults. This is critical for reliability: every field exists from the beginning, which prevents hidden None behaviors and makes unit-style reasoning possible. The cell ends by printing a small subset of the initial state so students can verify the run identity and the starting conditions.
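
One way to check the "every field exists from the beginning" property is to compare a state object against its TypedDict schema. The sketch below uses a cut-down schema, **MiniState**, and two hypothetical helpers; the same idea should carry over to ExecState, though that is not shown in the notebook.

```python
from typing import Any, Dict, List, TypedDict, get_type_hints

class MiniState(TypedDict):
    # Cut-down stand-in for ExecState, for illustration only.
    step: int
    remaining_qty: int
    notes: List[str]

def missing_fields(schema: type, state: Dict[str, Any]) -> List[str]:
    """Schema keys that the state object does not carry."""
    return sorted(set(get_type_hints(schema)) - set(state))

def unknown_fields(schema: type, state: Dict[str, Any]) -> List[str]:
    """State keys that the schema does not declare."""
    return sorted(set(state) - set(get_type_hints(schema)))

partial = {"step": 0}
extra = {"step": 0, "remaining_qty": 250_000, "notes": [], "scratch": 1}
print(missing_fields(MiniState, partial))   # ['notes', 'remaining_qty']
print(unknown_fields(MiniState, extra))     # ['scratch']
```

TypedDict is not enforced at runtime, so a check like this is the only thing that would flag a node reading or writing a field the contract does not declare.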

In short: Cell 2 formalizes the system’s “ledger.” Everything that happens later is just controlled transformation of this ledger.


###2.2.CODE AND IMPLEMENTATION

In [2]:
# CELL 2/10 — Configuration, explicit TypedDict state, initialization (UPDATED: LLM proposer fields)

Regime = Literal["NORMAL", "TIGHT", "STRESSED"]
Tactic = Literal["TWAP", "VWAP", "ICEBERG", "PAUSE", "HALT_HUMAN_REVIEW"]

class ExecState(TypedDict):
    # Governance & run control
    run_id: str
    ts_start_utc: str
    model: str
    max_steps: int
    step: int
    done: bool
    termination_reason: str

    # Inputs (synthetic order)
    order_id: str
    side: Literal["BUY", "SELL"]
    qty: int
    urgency_bps: int

    # Market snapshot (synthetic microstructure)
    px_mid: float
    spread_bps: float
    depth_1: float
    adv: float
    vol_bps: float
    impact_k: float

    # Regime machine
    regime: Regime
    regime_score: float
    regime_hysteresis: int

    # LLM proposer (execution decision support)
    llm_calls: int
    llm_last: Dict[str, Any]     # last parsed JSON payload (or error record)
    llm_tactic: str              # proposed tactic (validated) or ""
    llm_confidence: float
    llm_rationale: str

    # Execution outputs (per step)
    tactic: Tactic
    fill_qty: int
    remaining_qty: int
    est_cost_bps: float
    est_slippage_bps: float
    est_total_bps: float

    # Controls / limits
    max_cost_bps: float
    max_slippage_bps: float

    # Logs (audit-friendly)
    trace: List[Dict[str, Any]]
    notes: List[str]
    errors: List[str]

CONFIG: Dict[str, Any] = {
    "model": "claude-haiku-4-5-20251001",  # STRICT, no substitution
    "max_steps": 8,                       # bounded loop driver
    "seed": SEED,
    "max_cost_bps": 35.0,
    "max_slippage_bps": 20.0,
    # LLM proposer policy
    "llm_temperature": 0.0,
    "llm_max_tokens": 240,
}

def init_state() -> ExecState:
    run_id = str(uuid.uuid4())
    order_id = f"ORD-{run_id[:8]}"
    qty = 250_000
    return ExecState(
        run_id=run_id,
        ts_start_utc=utc_now_iso(),
        model=CONFIG["model"],
        max_steps=int(CONFIG["max_steps"]),
        step=0,
        done=False,
        termination_reason="",

        order_id=order_id,
        side="BUY",
        qty=qty,
        urgency_bps=18,

        px_mid=100.00,
        spread_bps=6.0,
        depth_1=120_000.0,
        adv=8_000_000.0,
        vol_bps=45.0,
        impact_k=0.75,

        regime="NORMAL",
        regime_score=0.0,
        regime_hysteresis=0,

        llm_calls=0,
        llm_last={},
        llm_tactic="",
        llm_confidence=0.0,
        llm_rationale="",

        tactic="TWAP",
        fill_qty=0,
        remaining_qty=qty,
        est_cost_bps=0.0,
        est_slippage_bps=0.0,
        est_total_bps=0.0,

        max_cost_bps=float(CONFIG["max_cost_bps"]),
        max_slippage_bps=float(CONFIG["max_slippage_bps"]),

        trace=[],
        notes=[],
        errors=[],
    )

state0 = init_state()
print("INIT:", {k: state0[k] for k in ["run_id","order_id","qty","remaining_qty","regime","max_steps","model"]})


INIT: {'run_id': '308810e4-9a1b-47ed-bc89-016900ef690f', 'order_id': 'ORD-308810e4', 'qty': 250000, 'remaining_qty': 250000, 'regime': 'NORMAL', 'max_steps': 8, 'model': 'claude-haiku-4-5-20251001'}


##3.AGENT NODE

###3.1.OVERVIEW

**CELL 3 — AgentNode abstraction and governance utilities**

Cell 3 introduces the notebook’s core architectural discipline: every node is an **AgentNode** with a consistent interface and consistent accountability. In finance, you cannot approve a workflow that performs unlogged transformations. The AgentNode wrapper is the mechanism that prevents “invisible work.”

The AgentNode abstraction accomplishes three teaching goals. First, it makes every node callable in the same way: input state goes in, updated state comes out. This supports modular design and makes nodes testable. Second, it enforces trace logging automatically. Every time a node runs, the wrapper appends a structured trace event with timestamp, node name, step index, duration, current regime, tactic, and remaining quantity. This turns the execution into a timeline that can be reviewed after the fact. Third, it adds performance visibility. In classroom systems, speed matters. In professional systems, latency matters. Measuring per-node duration teaches students to think like system engineers.

Cell 3 also defines governance helpers: configuration hashing and environment fingerprinting. These are not cosmetic. A run_manifest is only meaningful if it references an immutable configuration hash and a reproducible environment fingerprint. Otherwise you cannot tell whether two runs are comparable. The environment fingerprint captures Python version and key package versions so reviewers can reconcile behavioral differences across machines.
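
The reason a configuration hash is useful for comparing runs can be shown in a few lines: serializing with sorted keys makes the hash independent of dictionary insertion order, while any changed value produces a different digest. The function name **canonical_hash** and the sample dictionaries are illustrative; the serialization settings mirror those used by config_hash in this cell.

```python
import hashlib
import json
from typing import Any, Dict

def canonical_hash(cfg: Dict[str, Any]) -> str:
    """SHA-256 over a canonical JSON form (sorted keys, no extra whitespace)."""
    blob = json.dumps(cfg, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()

a = {"max_steps": 8, "max_cost_bps": 35.0}
b = {"max_cost_bps": 35.0, "max_steps": 8}   # same content, different key order
c = {"max_steps": 9, "max_cost_bps": 35.0}   # one limit changed

print(canonical_hash(a) == canonical_hash(b))  # True
print(canonical_hash(a) == canonical_hash(c))  # False
```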

A subtle but important lesson: this cell embodies the separation between “business logic” and “governance mechanics.” Nodes should focus on their decision logic. Logging, timing, and provenance should be standardized and enforced by the framework. That is how you prevent fragile notebooks where each cell logs differently and nothing is consistent.

Compared to earlier notebooks, Cell 3 continues the same institutional theme: agentic systems are not defined by prompts; they are defined by interfaces, state contracts, and traceability. This abstraction sets up a scalable pattern that becomes more valuable as notebooks become more complex in later chapters.


###3.2.CODE AND IMPLEMENTATION

In [3]:
# CELL 3/10 — AgentNode abstraction + governance helpers (hashing, trace logging)

class AgentNode:
    """
    Minimal, testable node abstraction:
    - deterministic given state + config
    - explicit trace event appended on every call
    - no hidden globals
    """
    name: str
    def __init__(self, name: str):
        self.name = name

    def __call__(self, s: ExecState) -> ExecState:
        t0 = time.perf_counter()  # monotonic clock, better suited to measuring durations than time.time()
        out = self.run(s)
        dt_ms = int((time.perf_counter() - t0) * 1000)
        # Mandatory trace append
        out["trace"].append({
            "ts_utc": utc_now_iso(),
            "node": self.name,
            "step": out["step"],
            "duration_ms": dt_ms,
            "regime": out["regime"],
            "tactic": out["tactic"],
            "remaining_qty": out["remaining_qty"],
        })
        return out

    def run(self, s: ExecState) -> ExecState:
        raise NotImplementedError

def config_hash(cfg: Dict[str, Any]) -> str:
    cfg_norm = json.dumps(cfg, sort_keys=True, separators=(",", ":"))
    return sha256_str(cfg_norm)

def env_fingerprint() -> Dict[str, Any]:
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "packages": {
            "langgraph": _ver("langgraph"),
            "langchain": _ver("langchain"),
            "langchain-core": _ver("langchain-core"),
            "anthropic": _ver("anthropic"),
            "httpx": _ver("httpx"),
        },
        "seed": SEED,
    }

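def require_api_key_if_needed(s: ExecState) -> str:
    # "use_llm_notes" is not a field of ExecState in this notebook, so indexing it
    # directly would raise KeyError. Treat a missing flag as True so the strict
    # key policy from Cell 1 still applies.
    if not s.get("use_llm_notes", True):
        return ""
    key = userdata.get("ANTHROPIC_API_KEY")
    if not key:
        raise RuntimeError('Missing Colab secret: userdata.get("ANTHROPIC_API_KEY") (ALL CAPS). Set it or set use_llm_notes=False.')
    return key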

print("CONFIG_HASH:", config_hash(CONFIG))
print("ENV_FP:", env_fingerprint())


CONFIG_HASH: eb2ec1a2a989f0962c6b9075e723773775358a608f6b418142c10f8a5bdfaef7
ENV_FP: {'python': '3.12.12', 'platform': 'Linux-6.6.105+-x86_64-with-glibc2.35', 'packages': {'langgraph': '0.2.39', 'langchain': '0.3.14', 'langchain-core': '0.3.40', 'anthropic': '0.82.0', 'httpx': '0.28.1'}, 'seed': 5}


##4.MICROSTRUCTURE GENERATOR

###4.1.OVERVIEW

**CELL 4 — Synthetic market microstructure updates as an explicit agent**

Cell 4 creates the “world” in which the execution system operates. The market_data node updates a synthetic microstructure snapshot at each loop. The pedagogical purpose is to demonstrate that execution decisions must react to evolving conditions, not static inputs. If the environment never changes, a state machine is unnecessary.

The synthetic market step is designed to be explainable. The key driver is size pressure: remaining quantity relative to depth. When remaining quantity is large compared to available depth, execution becomes harder. In practice this shows up as wider spreads, reduced depth, higher volatility, and higher impact. The node simulates those relationships in a deterministic way. Determinism is achieved by seeding a pseudo-random generator with a function of run_id and step. This gives variation that feels like “market noise,” but still remains reproducible: the same run_id produces the same sequence.
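
The seeding scheme described above can be isolated in a short sketch: hash the run identifier together with the step, take the first eight hex digits as an integer, and use that to seed a private generator. The helper name **step_rng** and the run id "demo-run" are hypothetical; the hashing recipe follows the one used in this cell's code.

```python
import hashlib
import random

def step_rng(run_id: str, step: int) -> random.Random:
    """Private RNG whose seed depends only on (run_id, step)."""
    digest = hashlib.sha256(f"{run_id}:{step}".encode("utf-8")).hexdigest()
    return random.Random(int(digest[:8], 16))

first = step_rng("demo-run", 0).uniform(-0.4, 0.6)
replay = step_rng("demo-run", 0).uniform(-0.4, 0.6)
print(first == replay)  # True: same run_id and step give the same draw
```

One consequence worth keeping in mind: because init_state creates run_id with uuid.uuid4(), a fresh run gets a new noise sequence, so replaying a run exactly requires reusing its run_id.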

This node is not an alpha model. It is not trying to predict prices. Its role is to create stress conditions so students can see regime transitions and tactic shifts. It changes the midprice slightly, but the real teaching variables are spread, depth, volatility, and impact coefficient. These are the levers that execution desks care about when thinking about feasibility and costs.

Why make this an agent node rather than a simple function call? Because in real agentic architectures, “data acquisition” is a distinct step. Treating it as a node also makes the graph meaningful: you can see that every loop begins with refreshed conditions. This is closer to operational reality: a desk tool always begins with market observation before deciding.

By the end of Cell 4, students understand that execution control is inherently dynamic and that the system must be structured to ingest updated conditions each iteration. This is the environmental half of the regime machine: without a changing environment, regimes would not matter.
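
As a worked example of size pressure, the arithmetic below applies the spread and depth formulas from synthetic_market_step to the initial state (250,000 remaining, depth 120,000, spread 6.0 bps) with the noise term left out. The variable names are illustrative; the coefficients and noise ranges are the ones in this cell's code.

```python
def bounded(x: float, lo: float, hi: float) -> float:
    return max(lo, min(hi, x))

remaining_qty, depth_1, spread_bps = 250_000, 120_000.0, 6.0

size_pressure = remaining_qty / max(1.0, depth_1)                             # about 2.08
spread_drift = spread_bps * (1.0 + 0.12 * bounded(size_pressure, 0.0, 3.0))   # about 7.5 bps
depth_drift = depth_1 * (1.0 - 0.10 * bounded(size_pressure, 0.0, 2.5))       # about 95,000

# Noise ranges from the cell: spread in [-0.4, +0.6], depth in [-2500, +2500].
spread_band = (spread_drift - 0.4, spread_drift + 0.6)
depth_band = (depth_drift - 2500.0, depth_drift + 2500.0)
print(round(size_pressure, 3), round(spread_drift, 3), round(depth_drift, 1))
```

The SAMPLE_SNAPSHOT printed by this cell (spread about 7.32 bps, depth about 93,855) falls inside both bands, which is a quick sanity check that the deterministic drift, not the noise, does most of the work.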


###4.2.CODE AND IMPLEMENTATION

In [4]:
# CELL 4/10 — Synthetic microstructure generator node (market snapshot evolves per step)

def _bounded(x: float, lo: float, hi: float) -> float:
    return max(lo, min(hi, x))

def synthetic_market_step(s: ExecState) -> Dict[str, float]:
    """
    Deterministic synthetic market evolution:
    - Spread widens and depth thins as remaining_qty / depth rises (size pressure)
    - Vol can drift upward under stress; impact coefficient increases
    """
    step = s["step"]
    size_pressure = (s["remaining_qty"] / max(1.0, s["depth_1"]))  # >1 means order bigger than top depth proxy

    # Deterministic pseudo-randomness tied to run_id+step (still reproducible)
    salt = int(sha256_str(s["run_id"] + f":{step}")[:8], 16)
    rng = random.Random(salt)

    spread = s["spread_bps"] * (1.0 + 0.12 * _bounded(size_pressure, 0.0, 3.0)) + rng.uniform(-0.4, 0.6)
    depth = s["depth_1"] * (1.0 - 0.10 * _bounded(size_pressure, 0.0, 2.5)) + rng.uniform(-2500.0, 2500.0)
    vol = s["vol_bps"] * (1.0 + 0.06 * (1 if s["regime"] != "NORMAL" else 0)) + rng.uniform(-1.5, 2.0)
    impact_k = s["impact_k"] * (1.0 + 0.08 * (1 if s["regime"] == "STRESSED" else 0)) + rng.uniform(-0.03, 0.05)

    return {
        "spread_bps": float(_bounded(spread, 2.0, 60.0)),
        "depth_1": float(_bounded(depth, 10_000.0, 500_000.0)),
        "vol_bps": float(_bounded(vol, 10.0, 180.0)),
        "impact_k": float(_bounded(impact_k, 0.2, 2.0)),
        "px_mid": float(_bounded(s["px_mid"] + rng.uniform(-0.08, 0.10), 50.0, 200.0)),
    }

class MarketDataNode(AgentNode):
    def __init__(self):
        super().__init__("market_data")

    def run(self, s: ExecState) -> ExecState:
        snap = synthetic_market_step(s)
        s["px_mid"] = snap["px_mid"]
        s["spread_bps"] = snap["spread_bps"]
        s["depth_1"] = snap["depth_1"]
        s["vol_bps"] = snap["vol_bps"]
        s["impact_k"] = snap["impact_k"]
        return s

market_node = MarketDataNode()
test = market_node(init_state())
print("SAMPLE_SNAPSHOT:", {k: test[k] for k in ["step","px_mid","spread_bps","depth_1","vol_bps","impact_k"]})


SAMPLE_SNAPSHOT: {'step': 0, 'px_mid': 99.94479697383088, 'spread_bps': 7.315101915987836, 'depth_1': 93855.17540993745, 'vol_bps': 44.61023681570283, 'impact_k': 0.7990023586220082}


##5.STATE REGIME SCORE

###5.1.OVERVIEW

**CELL 5 — Regime classification as a state machine with hysteresis**

Cell 5 is the architectural center of Notebook 5. It introduces a true state machine concept: the system assigns the market into a discrete regime—NORMAL, TIGHT, or STRESSED—and that regime persists across iterations. This is more than a label; it is a control variable that governs later decisions.

The regime score aggregates three terms: spread, volatility, and size pressure. Depth does not have its own term; it enters through size pressure, which is remaining quantity divided by depth. Spread and volatility are divided by reference levels (10 bps and 60 bps) so that each contributes on a comparable scale and the score stays interpretable. This is an important didactic point: in professional systems, regime indicators should not be mysterious. You want to know which inputs drive classification and how sensitive the system is to each. The weights in the score are not “true” in a statistical sense; they are chosen to be pedagogically clear and to create meaningful transitions in the synthetic environment.
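
A worked example helps calibrate intuition for the score. The sketch below restates the weighted sum from regime_score as a standalone function (**stress_score** is a hypothetical name) and evaluates it twice: on the initial state values, and on the SAMPLE_SNAPSHOT values printed by Cell 4.

```python
def stress_score(spread_bps: float, vol_bps: float,
                 remaining_qty: float, depth_1: float) -> float:
    """Same weights and scaling as regime_score in this cell."""
    size_pressure = remaining_qty / max(1.0, depth_1)
    return (0.45 * (spread_bps / 10.0)
            + 0.35 * size_pressure
            + 0.20 * (vol_bps / 60.0))

# Initial state: 0.45*0.6 + 0.35*2.083 + 0.20*0.75, about 1.15
initial = stress_score(6.0, 45.0, 250_000, 120_000.0)

# After the first synthetic market update (values from SAMPLE_SNAPSHOT), about 1.41
after_update = stress_score(7.315101915987836, 44.61023681570283,
                            250_000, 93855.17540993745)
print(round(initial, 3), round(after_update, 3))
```

Both values sit above the 1.10 promotion threshold and below the 1.55 stress threshold used by next_regime, so a NORMAL regime would be promoted to TIGHT on the first classification. Size pressure is the dominant term in both cases.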

The state machine transition function uses thresholds. Promotion into more stressed states happens relatively quickly, because liquidity can deteriorate rapidly in real markets. Demotion happens more slowly, because conditions need to stabilize before you resume normal behavior. Hysteresis enforces this stability. Without hysteresis, the regime would flip-flop near thresholds due to noise. That flip-flop is not just annoying; it is dangerous, because it causes inconsistent tactic switching and increases cost.
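
The cost of flip-flopping can be made visible with a two-regime simplification. The sketch compares a naive single-threshold classifier with a hysteresis version that uses a separate exit threshold and a hold counter. The thresholds 1.10 and 0.95 and the hold of 2 are borrowed from this cell; the function names, the reduction to two regimes, and the score series are assumptions for illustration only.

```python
from typing import List

def naive_labels(scores: List[float], threshold: float = 1.10) -> List[str]:
    return ["TIGHT" if x >= threshold else "NORMAL" for x in scores]

def hysteresis_labels(scores: List[float], enter: float = 1.10,
                      exit_: float = 0.95, hold: int = 2) -> List[str]:
    labels, state, counter = [], "NORMAL", 0
    for x in scores:
        if state == "NORMAL" and x >= enter:
            state, counter = "TIGHT", hold      # promote immediately
        elif counter > 0:
            counter -= 1                        # hold period: no demotion allowed
        elif state == "TIGHT" and x <= exit_:
            state = "NORMAL"                    # demote only well below the entry level
        labels.append(state)
    return labels

def switches(labels: List[str]) -> int:
    return sum(a != b for a, b in zip(labels, labels[1:]))

scores = [1.05, 1.12, 1.08, 1.11, 1.07, 1.13, 0.90, 0.90, 0.90]
print(switches(naive_labels(scores)), switches(hysteresis_labels(scores)))  # 6 2
```

On this series the naive classifier changes regime six times while the hysteresis version changes twice: once into TIGHT and once back out after conditions clearly improve.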

This cell teaches a finance systems lesson: regime classification is not a one-off judgement; it is a controlled process with inertia. Many institutional workflows implement a similar idea: risk “tiers,” liquidity “states,” or volatility “bands” that govern what the desk is allowed to do.

Compared to earlier notebooks, this is the first time we introduce a state variable that behaves like an operational mode rather than a label attached to a single response. That is the progression: from routing based on missing info or suitability, to routing based on a persistent market state that evolves and must be stabilized.


###5.2.CODE AND IMPLEMENTATION

In [5]:
# CELL 5/10 — Stateful regime machine node (hysteresis + deterministic transitions)

def regime_score(s: ExecState) -> float:
    """
    Liquidity/execution stress score (higher = worse):
    - spread_bps up
    - depth_1 down (via inverse)
    - vol up
    - size_pressure up (remaining vs depth)
    """
    size_pressure = s["remaining_qty"] / max(1.0, s["depth_1"])
    score = 0.0
    score += 0.45 * (s["spread_bps"] / 10.0)         # normalize
    score += 0.35 * (size_pressure)                  # direct pressure
    score += 0.20 * (s["vol_bps"] / 60.0)            # normalize
    return float(score)

def next_regime(prev: Regime, score: float, hyst: int) -> Tuple[Regime, int]:
    """
    Deterministic regime state machine with hysteresis:
    - Promote quickly into stress, demote slowly out of it.
    - Hysteresis counter prevents flip-flop on boundary noise.
    """
    # Thresholds (tuned for pedagogical clarity)
    to_tight = 1.10
    to_stress = 1.55
    back_to_tight = 1.30
    back_to_normal = 0.95

    # Promote (fast)
    if prev == "NORMAL" and score >= to_tight:
        return ("TIGHT", 2)
    if prev == "TIGHT" and score >= to_stress:
        return ("STRESSED", 3)

    # Demote (slow + hysteresis)
    if hyst > 0:
        return (prev, hyst - 1)

    if prev == "STRESSED" and score <= back_to_tight:
        return ("TIGHT", 2)
    if prev == "TIGHT" and score <= back_to_normal:
        return ("NORMAL", 1)

    return (prev, hyst)

class RegimeNode(AgentNode):
    def __init__(self):
        super().__init__("regime_machine")

    def run(self, s: ExecState) -> ExecState:
        sc = regime_score(s)
        new_reg, new_hyst = next_regime(s["regime"], sc, s["regime_hysteresis"])
        s["regime_score"] = sc
        s["regime"] = new_reg
        s["regime_hysteresis"] = new_hyst
        return s

regime_node = RegimeNode()
s = init_state()
for _ in range(3):
    s = market_node(s)
    s = regime_node(s)
    s["step"] += 1
print("REGIME_TRACE_SAMPLE:", [(e["step"], e["node"], e["regime"]) for e in s["trace"][-6:]])


REGIME_TRACE_SAMPLE: [(0, 'market_data', 'NORMAL'), (0, 'regime_machine', 'TIGHT'), (1, 'market_data', 'TIGHT'), (1, 'regime_machine', 'STRESSED'), (2, 'market_data', 'STRESSED'), (2, 'regime_machine', 'STRESSED')]


##6.TACTIC SELECTION

###6.1.OVERVIEW

**CELL 6 — LLM tactic proposal, robust JSON enforcement, and deterministic fallback execution logic**

Cell 6 is where the notebook becomes “LLM-in-the-loop,” but in a controlled way. The LLM is not asked to invent strategies or predict returns. It is asked to do one thing: propose an execution tactic from a bounded menu. This is a professional architecture choice. In sensitive workflows, you do not give unconstrained action authority to a language model. You constrain the action space and you wrap it with logging and hard gates.

This cell includes three key pieces. First, it defines the deterministic fallback tactic logic. This ensures the workflow remains runnable and stable even if the LLM fails. It also sets a baseline behavior that students can compare against the LLM’s choices. Second, it defines the execution simulator: given a tactic and the current microstructure, estimate a fill fraction and compute cost components. The simulator is deliberately simple but structured: spread cost, volatility-driven slippage, and convex impact. This decomposition teaches students how execution cost is not a single number—it is a sum of channels.

Third, it defines the LLM proposer node with robust JSON parsing. An earlier version of this cell logged an error on every call for the most common operational reason: models sometimes return text around JSON. The robust parser first tries strict JSON. If that fails, it takes the substring from the first '{' to the last '}' and parses again. If that still fails, it raises, and the proposer node logs the error and falls back. This is the correct governance stance: attempt safe recovery, but never pretend the call succeeded.
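
The two-stage parse can be exercised without calling any model. The sketch below is a self-contained illustration, not the notebook's code: parse_llm_json mirrors the strict-then-brace-span approach described above, while parse_first_object is a swapped-in alternative based on the standard library's json.JSONDecoder.raw_decode, shown only to make the limits of brace-span recovery visible. Both helper names and the sample replies are hypothetical.

```python
import json
from typing import Any, Dict

def parse_llm_json(text: str) -> Dict[str, Any]:
    """Strict parse first; then recover the span from first '{' to last '}'."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        i, j = text.find("{"), text.rfind("}")
        if i == -1 or j <= i:
            raise ValueError("No JSON object boundaries found")
        return json.loads(text[i:j + 1])

def parse_first_object(text: str) -> Dict[str, Any]:
    """Alternative (not the notebook's method): decode exactly one object
    starting at the first '{' and ignore whatever follows it."""
    i = text.find("{")
    if i == -1:
        raise ValueError("No JSON object found")
    obj, _end = json.JSONDecoder().raw_decode(text, i)
    return obj

wrapped = 'Sure, here it is: {"tactic": "VWAP", "confidence": 0.7} Hope that helps.'
two_objects = '{"tactic": "PAUSE"} and also {"tactic": "TWAP"}'

print(parse_llm_json(wrapped))          # recovered by the brace-span fallback
print(parse_first_object(two_objects))  # first object only
try:
    parse_llm_json(two_objects)         # span covers both objects: invalid JSON
except json.JSONDecodeError as e:
    print("brace-span recovery failed:", e.msg)
```

The brace-span approach handles the common case of prose wrapped around a single object. A reply containing two objects still fails, and in the notebook that failure is logged and the deterministic fallback takes over.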

The cell also stores the LLM proposal, confidence, and rationale in state, and appends a concise note. This matters because a reviewer needs to understand whether the tactic came from the LLM or from fallback logic. The final execution_tactic node then chooses: if the LLM proposal is valid, use it; otherwise use the deterministic logic.

This is a major progression from previous notebooks: the LLM is now part of an operational control loop, but the loop is protected by bounded action space, explicit logging, and deterministic fallback. That is “agentic architecture” in finance: constrained autonomy under governance.
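
The cost decomposition used by the execution simulator can be read as three separate channels. The sketch below restates those channels with the simulator's constants and made-up inputs, assuming TWAP aggressiveness (0.75) and omitting the ICEBERG and PAUSE adjustments. The helper name cost_channels is hypothetical.

```python
from typing import Dict

def cost_channels(spread_bps: float, vol_bps: float, impact_k: float,
                  size_pressure: float, aggress: float) -> Dict[str, float]:
    """Per-channel cost estimate in bps, using the simulator's constants."""
    spread_cost = 0.50 * spread_bps * aggress
    vol_slip = 0.10 * (vol_bps / 10.0) * aggress
    impact = impact_k * (size_pressure ** 1.25) * 6.5 * aggress
    return {"spread": spread_cost, "vol_slip": vol_slip, "impact": impact,
            "total": spread_cost + vol_slip + impact}

# Illustrative inputs (made up): same market, two size-pressure levels.
low = cost_channels(8.0, 40.0, 0.8, size_pressure=1.0, aggress=0.75)
high = cost_channels(8.0, 40.0, 0.8, size_pressure=2.0, aggress=0.75)

for label, c in (("size_pressure=1.0", low), ("size_pressure=2.0", high)):
    print(label, {k: round(v, 2) for k, v in c.items()})
print("impact ratio:", round(high["impact"] / low["impact"], 3))
```

Doubling size pressure leaves the spread and volatility channels unchanged but multiplies impact by 2 to the power 1.25, roughly 2.38. That is the convexity the overview refers to, and it is why impact dominates the estimate when the remaining quantity is large relative to depth.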


###6.2.CODE AND IMPLEMENTATION

In [11]:
# CELL 6/10 — BEST OPTION (production-safe): robust JSON extraction + bounded LLM proposal + deterministic fallback
# Fixes: LLM returned non-JSON → errors_n=llm_calls.
# Design: try strict json.loads; if fails, extract first {...} block; if still fails → log + fallback.
# No silent failures. Hard action bounds. Risk gate still rules.

from anthropic import Anthropic

ALLOWED_TACTICS = ("TWAP", "VWAP", "ICEBERG", "PAUSE")

def _bounded(x: float, lo: float, hi: float) -> float:
    return max(lo, min(hi, x))

def _extract_first_json_object(text: str) -> str:
    """
    Best-effort JSON recovery:
    - takes substring from first '{' to last '}'
    - rejects if boundaries missing
    """
    i = text.find("{")
    j = text.rfind("}")
    if i == -1 or j == -1 or j <= i:
        raise ValueError("No JSON object boundaries found")
    return text[i:j+1]

def choose_tactic(s: ExecState) -> str:
    """
    Deterministic fallback selection (state-driven).
    """
    urg = int(s["urgency_bps"])
    rem = int(s["remaining_qty"])
    size_pressure = rem / max(1.0, float(s["depth_1"]))

    if s["regime"] == "STRESSED":
        if urg >= 20 and size_pressure < 1.0:
            return "ICEBERG"
        return "PAUSE"

    if s["regime"] == "TIGHT":
        if size_pressure > 1.2:
            return "ICEBERG"
        return "VWAP" if urg <= 18 else "TWAP"

    return "VWAP" if urg <= 15 else "TWAP"

def simulate_execution_step(s: ExecState, tactic: str) -> Dict[str, Any]:
    """
    Explainable execution simulator.
    """
    rem = int(s["remaining_qty"])
    if rem <= 0:
        return {"fill_qty": 0, "cost_bps": 0.0, "slip_bps": 0.0, "total_bps": 0.0}

    size_pressure = rem / max(1.0, float(s["depth_1"]))

    if tactic == "PAUSE":
        fill_frac = 0.00
        aggress = 0.00
    elif tactic == "ICEBERG":
        fill_frac = _bounded(0.18 - 0.04 * _bounded(size_pressure, 0.0, 2.0), 0.06, 0.18)
        aggress = 0.55
    elif tactic == "VWAP":
        fill_frac = 0.22
        aggress = 0.65
    else:  # TWAP
        fill_frac = 0.28
        aggress = 0.75

    if s["regime"] == "TIGHT":
        fill_frac *= 0.85
    elif s["regime"] == "STRESSED":
        fill_frac *= 0.65

    fill_qty = int(max(0, min(rem, round(rem * fill_frac))))

    spread_cost = 0.50 * float(s["spread_bps"]) * aggress
    vol_slip = 0.10 * (float(s["vol_bps"]) / 10.0) * aggress
    impact = (float(s["impact_k"]) * (size_pressure ** 1.25) * 6.5) * aggress

    if tactic == "ICEBERG":
        impact *= 0.75
        vol_slip *= 0.90
    if tactic == "PAUSE":
        spread_cost = 0.0
        vol_slip = 0.0
        impact = 0.0

    total = float(spread_cost + vol_slip + impact)
    return {
        "fill_qty": fill_qty,
        "cost_bps": float(spread_cost + impact),
        "slip_bps": float(vol_slip),
        "total_bps": total,
    }

def call_claude_json(api_key: str, model: str, system: str, user_json: Dict[str, Any]) -> Dict[str, Any]:
    """
    Strict JSON contract with robust recovery:
    1) parse full text as JSON
    2) if fails, extract first {...} JSON object and parse
    else raise (and caller logs) — no silent failures.
    """
    client = Anthropic(api_key=api_key)
    payload = json.dumps(user_json, sort_keys=True)
    msg = client.messages.create(
        model=model,
        max_tokens=int(CONFIG["llm_max_tokens"]),
        temperature=float(CONFIG["llm_temperature"]),
        system=system,
        messages=[{"role": "user", "content": payload}],
    )
    text = "".join([b.text for b in msg.content if hasattr(b, "text")])

    try:
        return json.loads(text)
    except Exception:
        try:
            return json.loads(_extract_first_json_object(text))
        except Exception as e:
            raise RuntimeError(f"LLM returned non-JSON. First 400 chars: {text[:400]}") from e

class LLMTacticProposerNode(AgentNode):
    """
    LLM proposes a tactic in a bounded action space.
    - If LLM fails, we log and fall back deterministically.
    - If tactic invalid, we treat as failure.
    """
    def __init__(self):
        super().__init__("llm_tactic_proposer")

    def run(self, s: ExecState) -> ExecState:
        system = (
            "Return ONLY valid JSON. No prose. No markdown. "
            "Allowed tactics: TWAP, VWAP, ICEBERG, PAUSE. "
            "Pick exactly one tactic from allowed_tactics. "
            "rationale must be <= 140 characters."
        )

        req = {
            "task": "Propose execution tactic under liquidity conditions.",
            "allowed_tactics": list(ALLOWED_TACTICS),
            "state": {
                "side": s["side"],
                "remaining_qty": s["remaining_qty"],
                "urgency_bps": s["urgency_bps"],
                "px_mid": s["px_mid"],
                "spread_bps": s["spread_bps"],
                "depth_1": s["depth_1"],
                "adv": s["adv"],
                "vol_bps": s["vol_bps"],
                "impact_k": s["impact_k"],
                "regime": s["regime"],
                "regime_score": s["regime_score"],
                "limits": {"max_cost_bps": s["max_cost_bps"], "max_slippage_bps": s["max_slippage_bps"]},
            },
            "output_schema": {
                "tactic": "one of allowed_tactics",
                "confidence": "float 0..1",
                "rationale": "string <= 140 chars"
            }
        }

        s["llm_calls"] = int(s["llm_calls"]) + 1

        try:
            out = call_claude_json(API_KEY, s["model"], system, req)
            tactic = str(out.get("tactic", "")).strip().upper()
            conf = float(out.get("confidence", 0.0))
            rat = str(out.get("rationale", "")).strip()

            if tactic not in ALLOWED_TACTICS:
                raise RuntimeError(f"Invalid tactic from LLM: {tactic}")

            s["llm_last"] = out
            s["llm_tactic"] = tactic
            s["llm_confidence"] = float(_bounded(conf, 0.0, 1.0))
            s["llm_rationale"] = rat[:140]
            s["notes"].append(f"LLM tactic={tactic} conf={s['llm_confidence']:.2f} why={s['llm_rationale']}")
            return s

        except Exception as e:
            err = f"LLM proposer failed: {type(e).__name__}: {str(e)[:260]}"
            s["errors"].append(err)
            s["llm_last"] = {"error": err}
            s["llm_tactic"] = ""
            s["llm_confidence"] = 0.0
            s["llm_rationale"] = ""
            s["notes"].append("LLM failed; falling back to deterministic tactic selection.")
            return s

llm_node = LLMTacticProposerNode()

class TacticNode(AgentNode):
    """
    Uses LLM proposal if valid; otherwise deterministic fallback.
    """
    def __init__(self):
        super().__init__("execution_tactic")

    def run(self, s: ExecState) -> ExecState:
        proposed = str(s.get("llm_tactic", "")).strip().upper()
        tactic = proposed if proposed in ALLOWED_TACTICS else choose_tactic(s)

        sim = simulate_execution_step(s, tactic)
        s["tactic"] = tactic
        s["fill_qty"] = int(sim["fill_qty"])
        s["remaining_qty"] = int(max(0, s["remaining_qty"] - s["fill_qty"]))
        s["est_cost_bps"] = float(sim["cost_bps"])
        s["est_slippage_bps"] = float(sim["slip_bps"])
        s["est_total_bps"] = float(sim["total_bps"])
        return s

tactic_node = TacticNode()

# quick smoke test
_s = init_state()
_s = market_node(_s); _s = regime_node(_s); _s = llm_node(_s); _s = tactic_node(_s)
print("TACTIC_SAMPLE:", {
    "regime": _s["regime"],
    "llm_tactic": _s["llm_tactic"],
    "tactic_used": _s["tactic"],
    "fill_qty": _s["fill_qty"],
    "remaining_qty": _s["remaining_qty"],
    "est_total_bps": _s["est_total_bps"],
    "errors_n": len(_s["errors"])
})


TACTIC_SAMPLE: {'regime': 'TIGHT', 'llm_tactic': 'TWAP', 'tactic_used': 'TWAP', 'fill_qty': 59500, 'remaining_qty': 190500, 'est_total_bps': 15.652011131732952, 'errors_n': 0}


##7.RISK CONTROLS

###7.1.OVERVIEW

**CELL 7 — Risk gate: hard stop conditions and conditional routing signal**

Cell 7 implements the most important professional concept in the notebook: the **gate**. A gate is not an “agent.” It is a control function that determines whether the system is permitted to continue. This distinction matters. In a finance setting, we do not delegate stop decisions to creativity. We encode stop decisions as enforceable policy.

The risk_gate checks three conditions in a strict order. First, it tests whether cost or slippage limits are breached. If they are, the workflow sets done=True, sets termination_reason to LIMIT_BREACH_HUMAN_REVIEW, and forces the tactic field into a human-review halt state. This models a real escalation: automation must stop when constraints are violated.

Second, it tests whether the order is complete—remaining quantity is zero. If so, it terminates with ORDER_COMPLETE. Third, if neither applies, it increments step and checks if max_steps is reached. If max_steps is reached, it terminates with MAX_STEPS_REACHED. This last condition is not about markets; it is about bounded loops. In professional systems, bounded loops are a safety principle: you never allow indefinite autonomous cycling.
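
The strict ordering of the checks matters most in edge cases, for example when the final fill completes the order but also breaches a limit. The sketch below is a pure-function restatement of the gate's precedence for illustration only. The name gate_outcome and the limit values are hypothetical, and the notebook's node additionally rewrites the tactic field and appends a note.

```python
from typing import Tuple

def gate_outcome(est_cost_bps: float, est_slippage_bps: float,
                 max_cost_bps: float, max_slippage_bps: float,
                 remaining_qty: int, step: int, max_steps: int) -> Tuple[bool, str, int]:
    """Return (done, termination_reason, step) using the gate's check order."""
    if est_cost_bps > max_cost_bps or est_slippage_bps > max_slippage_bps:
        return (True, "LIMIT_BREACH_HUMAN_REVIEW", step)
    if remaining_qty <= 0:
        return (True, "ORDER_COMPLETE", step)
    step += 1                          # only non-terminal passes advance the step
    if step >= max_steps:
        return (True, "MAX_STEPS_REACHED", step)
    return (False, "", step)

# Illustrative limits (made up): 20 bps cost, 5 bps slippage, 12 steps.
cases = {
    "breach on the final fill": gate_outcome(25.0, 1.0, 20.0, 5.0, 0, 3, 12),
    "clean completion": gate_outcome(10.0, 1.0, 20.0, 5.0, 0, 3, 12),
    "loop exhausted": gate_outcome(10.0, 1.0, 20.0, 5.0, 5000, 11, 12),
    "keep going": gate_outcome(10.0, 1.0, 20.0, 5.0, 5000, 3, 12),
}
for name, outcome in cases.items():
    print(f"{name:>26}: {outcome}")
```

A breach takes precedence over completion, so an order that fills at an unacceptable cost is still escalated rather than reported as complete. The step counter advances only when the run continues or exhausts the loop.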

The cell also defines the routing function route_after_risk. This is not a typical if-statement in notebook code. Instead, it is a routing signal that LangGraph uses to decide which edge to follow. This reinforces the course principle: conditional routing must live in the graph, not in hidden control flow scattered across cells.

Compared to earlier notebooks, the gate has evolved. Notebook 2 introduced early termination for suitability. Notebook 5 generalizes the idea into a runtime stop system: based on state variables and limits, the workflow continues, completes, or escalates. This is how agentic systems become professional: gates define the boundary between automation and supervision.


###7.2.CODE AND IMPLEMENTATION

In [12]:
# CELL 7/10 — Risk/controls gate node + routing (bounded loop + explicit termination)

class RiskGateNode(AgentNode):
    def __init__(self):
        super().__init__("risk_gate")

    def run(self, s: ExecState) -> ExecState:
        # Hard stop if limits breached (institutional guardrail)
        if s["est_cost_bps"] > s["max_cost_bps"] or s["est_slippage_bps"] > s["max_slippage_bps"]:
            s["done"] = True
            s["termination_reason"] = "LIMIT_BREACH_HUMAN_REVIEW"
            s["tactic"] = "HALT_HUMAN_REVIEW"
            s["notes"].append(
                f"Gate breach: cost_bps={s['est_cost_bps']:.2f} (limit {s['max_cost_bps']}), "
                f"slip_bps={s['est_slippage_bps']:.2f} (limit {s['max_slippage_bps']})."
            )

        # Completion condition
        if (not s["done"]) and s["remaining_qty"] <= 0:
            s["done"] = True
            s["termination_reason"] = "ORDER_COMPLETE"

        # Step accounting (bounded loop driver)
        if not s["done"]:
            s["step"] += 1
            if s["step"] >= s["max_steps"]:
                s["done"] = True
                s["termination_reason"] = "MAX_STEPS_REACHED"

        return s

risk_node = RiskGateNode()

def route_after_risk(s: ExecState) -> str:
    """
    Conditional routing via LangGraph only.
    """
    return "end" if s["done"] else "continue"

# quick sanity
s = init_state()
s = market_node(s); s = regime_node(s); s = tactic_node(s); s = risk_node(s)
print("ROUTE_TEST:", route_after_risk(s), "| done:", s["done"], "| term:", s["termination_reason"])


ROUTE_TEST: continue | done: False | term: 


##8.GRAPH

###8.1.OVERVIEW

**CELL 8 — Graph construction, topology visualization, and graph_spec export**

Cell 8 is where everything becomes explicit. Until this point, the notebook has defined nodes and routing functions. Now we assemble the topology with LangGraph: entry point, node sequence, and conditional edges. This is crucial pedagogically because it teaches that an agentic system is not a pile of functions—it is a directed graph with a control surface you can inspect.

The topology is a loop: market_data → regime_machine → llm_tactic_proposer → execution_tactic → risk_gate. The only decision point is after risk_gate. If done=True, the graph routes to END. Otherwise, it routes back to market_data. That is the execution cycle made structural. Because routing is in LangGraph, the notebook has no hidden while-loops; the loop is the graph.

This cell also extracts Mermaid from the compiled graph and renders it. This is not decoration. The rendered diagram is the learning artifact. Students can point at it and explain the system in one glance: where data comes in, where regimes are computed, where LLM advice enters, where execution is simulated, and where stopping decisions happen. In financial workflows, this kind of explicit topology is a governance benefit: it is easier to review and approve.

Finally, the cell writes graph_spec.json. This is the “contract documentation” of the run: nodes, edges, bounded loop parameters, and the Mermaid source. In professional settings, this matters because you want to confirm the run actually used the intended topology. It prevents accidental drift. If a future edit changes the graph, graph_spec.json changes, and that change is detectable.

This cell underscores the core philosophy of the series: architecture is not a narrative. Architecture is a graph you can see and audit.


###8.2.CODE AND IMPLEMENTATION

In [13]:
# CELL 8/10 — (UPDATED) Build LangGraph topology (now includes LLM proposer), compile, visualize, export graph_spec.json

g = StateGraph(ExecState)

# Nodes
g.add_node("market_data", market_node)
g.add_node("regime_machine", regime_node)
g.add_node("llm_tactic_proposer", llm_node)
g.add_node("execution_tactic", tactic_node)
g.add_node("risk_gate", risk_node)

# Edges (explicit topology)
g.set_entry_point("market_data")
g.add_edge("market_data", "regime_machine")
g.add_edge("regime_machine", "llm_tactic_proposer")
g.add_edge("llm_tactic_proposer", "execution_tactic")
g.add_edge("execution_tactic", "risk_gate")

# Conditional routing (bounded loop: risk_gate decides continue vs end)
g.add_conditional_edges(
    "risk_gate",
    route_after_risk,
    {
        "continue": "market_data",
        "end": END
    }
)

graph = g.compile()

# Mermaid spec from compiled graph
try:
    mermaid = graph.get_graph().draw_mermaid()
except Exception as e:
    raise RuntimeError(f"Unable to extract mermaid from LangGraph: {e}")

display_langgraph_mermaid(mermaid)

graph_spec = {
    "notebook": "AA-FIN-LG-2026 — Notebook 5 (Liquidity / Execution Regime Control)",
    "created_utc": utc_now_iso(),
    "mermaid_version": MERMAID_VERSION,
    "langgraph_version": _ver("langgraph"),
    "topology": {
        "entry_point": "market_data",
        "nodes": ["market_data", "regime_machine", "llm_tactic_proposer", "execution_tactic", "risk_gate", "END"],
        "edges": [
            ["market_data", "regime_machine"],
            ["regime_machine", "llm_tactic_proposer"],
            ["llm_tactic_proposer", "execution_tactic"],
            ["execution_tactic", "risk_gate"],
            ["risk_gate", "market_data", "if route_after_risk == continue"],
            ["risk_gate", "END", "if route_after_risk == end"],
        ],
        "bounded_loop": {"max_steps_state_field": "max_steps", "step_state_field": "step"},
    },
    "mermaid": mermaid,
}

with open("graph_spec.json", "w", encoding="utf-8") as f:
    json.dump(graph_spec, f, indent=2, sort_keys=True)

print("WROTE: graph_spec.json")


WROTE: graph_spec.json


##9.EXECUTION

###9.1.OVERVIEW

**CELL 9 — Execution run with recursion limit governance and final_state export**

Cell 9 is where the workflow runs end-to-end. The important pedagogical detail is that LangGraph has its own internal recursion limit, counted in graph super-steps (in this linear topology, one per node execution), not in your state.step counter. Because each loop executes five nodes, you can hit LangGraph’s default recursion limit even when your bounded loop is correct. This is why the cell sets recursion_limit explicitly at invocation time based on max_steps and the number of nodes per cycle, plus headroom. This is an operational lesson: systems often have framework-level safety limits in addition to your business logic.
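
The arithmetic behind the explicit limit is simple enough to check by hand. The sketch below assumes five nodes per cycle and 25 steps of headroom, as in the cell that follows, and assumes a LangGraph default limit of 25. The max_steps values are illustrative and the helper name recursion_limit_for is hypothetical.

```python
ASSUMED_DEFAULT_LIMIT = 25  # assumption: LangGraph's default recursion limit

def recursion_limit_for(max_steps: int, nodes_per_cycle: int = 5,
                        headroom: int = 25) -> int:
    """Node executions needed for max_steps full cycles, plus headroom."""
    return max_steps * nodes_per_cycle + headroom

for max_steps in (4, 12, 40):
    needed = max_steps * 5
    status = "fits" if needed < ASSUMED_DEFAULT_LIMIT else "exceeds"
    print(f"max_steps={max_steps:>2}: {needed:>3} node executions "
          f"({status} the assumed default), explicit limit={recursion_limit_for(max_steps)}")
```

Under these assumptions, any run longer than a handful of cycles needs the explicit limit, even though the business-level loop bound is respected.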

The run itself is deterministic in structure and mostly deterministic in behavior, except for the intrinsic variability of LLM outputs. However, the notebook reduces that variability with temperature=0 and strict action bounds. The run produces a final state that contains everything: termination reason, final regime, remaining quantity, last tactic, costs, and complete traces and notes.

The cell writes final_state.json. This is the most important artifact for review because it is the full state ledger at termination. A supervisor can inspect it to answer: “Why did we stop?” “What regime were we in?” “Did the LLM repeatedly fail?” “How many steps did we execute?” “Did we ever breach limits?” The trace provides a timeline of node calls, while notes explain key decisions like LLM proposals or fallback triggers.

In an earlier run, the workflow ended with MAX_STEPS_REACHED in the STRESSED regime with a PAUSE tactic; the run recorded in this section ends with LIMIT_BREACH_HUMAN_REVIEW instead. Either outcome is interpretable because final_state.json captures the full chain of state transitions.

Cell 9 teaches that running an agentic system is not just “get output.” It is “get output plus provenance,” under bounded iteration and framework-safe limits.


###9.2.CODE AND IMPLEMENTATION

In [15]:
# CELL 9/10 — (UPDATED) Execute the graph (bounded loop) with correct recursion_limit, capture final_state.json

def run_workflow(initial: ExecState) -> ExecState:
    """
    Runs the graph end-to-end with an explicit recursion_limit.
    LangGraph's recursion_limit counts internal ticks / node executions.
    We now have 5 nodes per cycle: market_data, regime_machine, llm_tactic_proposer, execution_tactic, risk_gate.
    """
    nodes_per_cycle = 5
    headroom = 25
    limit = int(initial["max_steps"]) * nodes_per_cycle + headroom

    final = graph.invoke(
        initial,
        config={"recursion_limit": limit}
    )
    return final

final_state = run_workflow(init_state())

with open("final_state.json", "w", encoding="utf-8") as f:
    json.dump(final_state, f, indent=2, sort_keys=True)

print("DONE:", {
    "termination_reason": final_state["termination_reason"],
    "steps_executed": final_state["step"],
    "regime": final_state["regime"],
    "remaining_qty": final_state["remaining_qty"],
    "last_tactic": final_state["tactic"],
    "llm_calls": final_state["llm_calls"],
    "errors_n": len(final_state["errors"]),
    "trace_len": len(final_state["trace"]),
})
print("WROTE: final_state.json")


DONE: {'termination_reason': 'LIMIT_BREACH_HUMAN_REVIEW', 'steps_executed': 6, 'regime': 'STRESSED', 'remaining_qty': 101512, 'last_tactic': 'HALT_HUMAN_REVIEW', 'llm_calls': 7, 'errors_n': 0, 'trace_len': 35}
WROTE: final_state.json


##10.AUDIT BUNDLE

###10.1.OVERVIEW

**CELL 10 — run_manifest export and end-of-run audit inspection**

Cell 10 completes the notebook’s governance cycle by producing the run_manifest.json and printing a concise human-readable summary. In finance workflows, it is not enough to have results; you must have metadata that makes those results reviewable and reproducible. The run_manifest is that metadata package.

The manifest captures run identity, timestamps, project name, notebook number, the locked model, configuration details, and a hash of the configuration. The config hash is particularly important: it ensures that if you rerun with different limits or different loop bounds, reviewers can detect that change. It also records an environment fingerprint so version drift is visible.
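
A configuration hash is only useful if it is stable under key reordering and sensitive to any value change. The notebook's config_hash helper is defined in an earlier cell that is not reproduced here, so the sketch below is an assumption about one reasonable way to build such a hash (canonical JSON plus SHA-256), not a copy of it. The function name config_hash_sketch and the sample configurations are hypothetical.

```python
import hashlib
import json

def config_hash_sketch(cfg: dict) -> str:
    """SHA-256 over a canonical JSON encoding (sorted keys, fixed separators)."""
    canonical = json.dumps(cfg, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

a = {"max_steps": 12, "max_cost_bps": 18.0}
b = {"max_cost_bps": 18.0, "max_steps": 12}   # same content, different key order
c = {"max_steps": 13, "max_cost_bps": 18.0}   # one loop bound changed

print("a == b:", config_hash_sketch(a) == config_hash_sketch(b))
print("a == c:", config_hash_sketch(a) == config_hash_sketch(c))
```

Sorting keys before hashing means two runs with the same settings produce the same fingerprint regardless of how the dictionary was built, while a single changed limit produces a different one.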

Notebook 5 extends the manifest relative to earlier notebooks by adding a structured LLM usage section: key presence (strict in this notebook), temperature, max tokens, call count, error count, and the last proposal payload. This is a governance requirement when LLMs influence decisions. It allows reviewers to separate “system policy” from “model suggestion,” and to diagnose when failures are due to parsing or model compliance.

The manifest also records artifact hashes for graph_spec.json and final_state.json. This is an integrity measure: if the files change after the run, the hash will no longer match. That is a lightweight but meaningful audit control.
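
A reviewer can use the recorded hashes to confirm that the artifacts were not edited after the run. The sketch below is a minimal illustration, assuming a manifest with the same artifacts layout as the one written in this cell. The verify_artifacts helper is hypothetical, and the file is created in a temporary directory so the example does not touch real artifacts.

```python
import hashlib
import json
import os
import tempfile
from typing import Dict

def file_sha256(path: str) -> str:
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

def verify_artifacts(manifest: dict, base_dir: str) -> Dict[str, bool]:
    """Map each artifact name to True if its current hash matches the manifest."""
    return {
        name: file_sha256(os.path.join(base_dir, name)) == meta["sha256"]
        for name, meta in manifest["artifacts"].items()
    }

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "final_state.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"termination_reason": "ORDER_COMPLETE"}, f)

    manifest = {"artifacts": {"final_state.json": {"sha256": file_sha256(path)}}}
    before = verify_artifacts(manifest, d)

    with open(path, "a", encoding="utf-8") as f:
        f.write("\n")               # simulate a post-run edit
    after = verify_artifacts(manifest, d)

print("before edit:", before)
print("after edit: ", after)
```

Even a one-byte change flips the check to False, which is what makes the hash a lightweight integrity control.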

Finally, the cell prints a minimal summary: termination reason, steps executed, remaining quantity, final regime, LLM calls, and error counts. It also prints the last segment of the trace and a few recent notes and errors. This gives students immediate feedback without forcing them to open JSON files manually, while still preserving the full artifacts for deeper inspection.

Cell 10 teaches the final professional lesson: a governed agentic system ends by producing an auditable bundle, not by producing a single answer. The deliverable is the process plus its evidence.


###10.2.CODE AND IMPLEMENTATION

In [16]:
# CELL 10/10 — Export run_manifest.json (audit artifacts) + quick inspection (UPDATED for LLM fields)

def _file_sha256(path: str) -> str:
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

manifest = {
    "run_id": final_state["run_id"],
    "ts_start_utc": final_state["ts_start_utc"],
    "ts_end_utc": utc_now_iso(),
    "project": "AA-FIN-LG-2026",
    "notebook_number": 5,
    "notebook_objective": "Liquidity / execution regime control via stateful regime machine + LLM tactic proposer",
    "model_lock": final_state["model"],
    "config": CONFIG,
    "config_hash_sha256": config_hash(CONFIG),
    "environment": env_fingerprint(),

    "llm_usage": {
        "api_key_present": True,  # strict Cell 1 enforces this
        "model": final_state["model"],
        "temperature": float(CONFIG["llm_temperature"]),
        "max_tokens": int(CONFIG["llm_max_tokens"]),
        "calls": int(final_state.get("llm_calls", 0)),
        "errors": int(len(final_state.get("errors", []))),
        "last_payload": final_state.get("llm_last", {}),
        "last_proposal": {
            "tactic": final_state.get("llm_tactic", ""),
            "confidence": float(final_state.get("llm_confidence", 0.0)),
            "rationale": final_state.get("llm_rationale", ""),
        },
    },

    "artifacts": {
        "graph_spec.json": {"sha256": _file_sha256("graph_spec.json")},
        "final_state.json": {"sha256": _file_sha256("final_state.json")},
    },

    "outcome": {
        "termination_reason": final_state["termination_reason"],
        "final_regime": final_state["regime"],
        "final_remaining_qty": final_state["remaining_qty"],
        "final_step": final_state["step"],
        "last_tactic_used": final_state["tactic"],
        "last_est_total_bps": float(final_state.get("est_total_bps", 0.0)),
    },

    "governance_notes": [
        "State-driven routing only (LangGraph conditional edges).",
        "All loops bounded by state.max_steps and explicit END.",
        "LLM is constrained to a bounded action space (TWAP/VWAP/ICEBERG/PAUSE).",
        "Robust JSON parsing attempts full JSON, then extracts first {...} object; failures are logged (no silent failures).",
        "Hard risk_gate overrides proposals and can force HALT_HUMAN_REVIEW.",
        "Deterministic market evolution uses seeded randomness tied to run_id and step."
    ],
}

with open("run_manifest.json", "w", encoding="utf-8") as f:
    json.dump(manifest, f, indent=2, sort_keys=True)

print("WROTE: run_manifest.json")

# Minimal human-readable inspection (still code cell)
print("\nFILES:", [p for p in os.listdir(".") if p.endswith(".json")])
print("\nSUMMARY:", {
    "termination_reason": final_state["termination_reason"],
    "steps_executed": final_state["step"],
    "remaining_qty": final_state["remaining_qty"],
    "final_regime": final_state["regime"],
    "llm_calls": final_state.get("llm_calls", 0),
    "errors_n": len(final_state.get("errors", [])),
})

print("\nTOP TRACE (last 10):")
for ev in final_state["trace"][-10:]:
    print(ev)

# Show last few notes/errors for quick audit
print("\nNOTES (last 6):")
for n in final_state.get("notes", [])[-6:]:
    print("-", n)

print("\nERRORS (last 6):")
for e in final_state.get("errors", [])[-6:]:
    print("-", e)


WROTE: run_manifest.json

FILES: ['run_manifest.json', 'final_state.json', 'graph_spec.json']

SUMMARY: {'termination_reason': 'LIMIT_BREACH_HUMAN_REVIEW', 'steps_executed': 6, 'remaining_qty': 101512, 'final_regime': 'STRESSED', 'llm_calls': 7, 'errors_n': 0}

TOP TRACE (last 10):
{'ts_utc': '2026-02-18T22:46:44.259231+00:00', 'node': 'market_data', 'step': 5, 'duration_ms': 0, 'regime': 'STRESSED', 'tactic': 'ICEBERG', 'remaining_qty': 116117}
{'ts_utc': '2026-02-18T22:46:44.259682+00:00', 'node': 'regime_machine', 'step': 5, 'duration_ms': 0, 'regime': 'STRESSED', 'tactic': 'ICEBERG', 'remaining_qty': 116117}
{'ts_utc': '2026-02-18T22:46:45.968980+00:00', 'node': 'llm_tactic_proposer', 'step': 5, 'duration_ms': 1708, 'regime': 'STRESSED', 'tactic': 'ICEBERG', 'remaining_qty': 116117}
{'ts_utc': '2026-02-18T22:46:45.969623+00:00', 'node': 'execution_tactic', 'step': 5, 'duration_ms': 0, 'regime': 'STRESSED', 'tactic': 'ICEBERG', 'remaining_qty': 108569}
{'ts_utc': '2026-02-18T22:46:4

##11.CONCLUSION

**Conclusion: from “agentic prompting” to a governed execution machine**

Notebook 5 is where the course stops pretending that “agents” are just clever text generation and starts behaving like a real finance control system. Up to now, each notebook added one structural capability—first learning to ask for missing inputs, then drawing a hard boundary around suitability, then drafting under evidence gaps, then wrapping a trading hypothesis in a tool-like backtest harness. Here, the architectural contribution is different in kind: we move from “single-pass reasoning with checks” into a **stateful regime machine** that can run, pause, escalate, and stop—while leaving behind an audit trail.

The key progression is that the agent is no longer best understood as a conversational entity. It is a **workflow graph** whose meaning is the topology: nodes, gates, transitions, bounded loops, and explicit termination. This matters in execution because “good” behavior is not eloquent; it is stable and reviewable. A trading desk does not need a story. It needs: “What’s the regime? What’s the tactic? What’s the estimated impact? Are we allowed to proceed?”

Notebook 1 taught the basic discipline: **triage is not a single answer**; it is a conditional loop that tries to close missing information, and it must be bounded. Notebook 2 made the model professionally safe by creating **hard branching and early termination**: the system learns that sometimes the correct output is refusal or redirection, not a recommendation. Notebook 3 pushed into institutional writing where evidence matters: the credit memo workflow introduced the idea that the model must separate claims, assumptions, and open items, and can iterate through critique loops without hallucinating authority. Notebook 4 brought in tool-like structure: a hypothesis is not accepted because it sounds plausible, but because it survives a defined backtest wrapper and returns inspectable outputs. Each step increased structural maturity: from “get an answer” to “run a governed process.”

Notebook 5 extends that maturity into the domain where finance is most unforgiving: execution and liquidity. The new dimension is **time and regime**. We now treat decisions as contingent on evolving state rather than static prompts. The system does not ask “what should I do?” once. It repeats an execution cycle: observe conditions, classify regime, propose a tactic, simulate consequences, and pass through a risk gate. This repeating cycle is not incidental—it is the lesson. Execution is a control loop, and the system must be designed as one.

Architecturally, Notebook 5 introduces three contributions that were not present earlier:

**First, a discrete regime state with hysteresis.** We are no longer routing based only on “did we get the info?” or “is it suitable?” We route based on a persistent state variable—**NORMAL / TIGHT / STRESSED**—and we prevent flapping using hysteresis counters. This is a real systems pattern: regimes are not momentary labels; they are operational states that must be stable enough to drive behavior.
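
As a rough illustration of the hysteresis idea, the sketch below switches regime only after a new label has been observed on consecutive steps. It is plain Python, not the notebook's code: the spread thresholds, the `confirm` count, and the names `RegimeState`, `raw_label`, and `update_regime` are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegimeState:
    regime: str = "NORMAL"     # regime currently driving behavior
    candidate: str = "NORMAL"  # label waiting to be confirmed
    streak: int = 0            # consecutive observations of the candidate


def raw_label(spread_bps: float) -> str:
    """Instantaneous label from the spread (thresholds are hypothetical)."""
    if spread_bps >= 15.0:
        return "STRESSED"
    if spread_bps >= 6.0:
        return "TIGHT"
    return "NORMAL"


def update_regime(state: RegimeState, spread_bps: float, confirm: int = 2) -> RegimeState:
    """Switch regime only after `confirm` consecutive observations of a new label."""
    label = raw_label(spread_bps)
    if label == state.regime:
        # Conditions agree with the current regime: discard any pending candidate.
        return RegimeState(state.regime, state.regime, 0)
    streak = state.streak + 1 if label == state.candidate else 1
    if streak >= confirm:
        return RegimeState(label, label, 0)
    return RegimeState(state.regime, label, streak)


def regime_path(spreads, confirm: int = 2):
    """Replay a list of spreads and return the regime after each observation."""
    state, path = RegimeState(), []
    for s in spreads:
        state = update_regime(state, s, confirm)
        path.append(state.regime)
    return path
```

With `confirm=2`, a single stressed print surrounded by normal ones does not change the regime; two stressed prints in a row do. That is the anti-flapping behavior the paragraph above describes.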

**Second, a bounded action space for the LLM.** Earlier notebooks used the model to draft or classify within constrained outputs. Here we go further: the LLM is a proposer inside a strict menu of tactics (TWAP/VWAP/ICEBERG/PAUSE), and the system captures the proposal as auditable state. The model is not “the trader.” It is a bounded advisor operating inside a governed machine.
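
One way to picture the bounded menu is an enum plus a parser that rejects anything outside it. This is a minimal sketch, not the notebook's implementation: the `Tactic` enum mirrors the four tactics named above, while `parse_proposal`, the record fields, and the choice to fall back to PAUSE on an off-menu proposal are assumptions for illustration.

```python
from enum import Enum


class Tactic(str, Enum):
    TWAP = "TWAP"
    VWAP = "VWAP"
    ICEBERG = "ICEBERG"
    PAUSE = "PAUSE"


def parse_proposal(raw: str, fallback: Tactic = Tactic.PAUSE) -> dict:
    """Map free-text model output onto the menu and record what happened.

    Off-menu text is not executed; it is replaced by `fallback`
    (an assumed policy) and flagged as not accepted.
    """
    token = raw.strip().upper()
    try:
        tactic = Tactic(token)
        accepted = True
    except ValueError:
        tactic = fallback
        accepted = False
    # Keep the raw text so the proposal stays auditable in state.
    return {"raw": raw, "tactic": tactic.value, "accepted": accepted}
```

For example, `parse_proposal(" iceberg ")` yields an accepted `ICEBERG` record, while `parse_proposal("market sweep")` is recorded as rejected and replaced by `PAUSE`. Either way the raw text is preserved alongside the resolved tactic.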

**Third, a hard risk gate that owns stopping.** The most important professional principle is that the model's proposal cannot be the final authority. Notebook 5 makes the gate explicit: it ends the run when a risk limit is breached, when the order completes, or when the bounded loop is exhausted. This is the architectural distinction between an agentic demo and an institutional workflow: the gate is the policy.
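
The three stopping conditions can be sketched as a single pure function that the loop consults after every step. This is a simplified stand-in, not the notebook's gate: only `MAX_STEPS_REACHED` appears in the source, while `LIMIT_BREACH`, `ORDER_COMPLETE`, `CONTINUE`, the state keys, the slippage limit, and the fixed fill size are hypothetical.

```python
def risk_gate(state: dict) -> str:
    """Return a stop reason, or CONTINUE if the loop may run another step."""
    if state["est_slippage_bps"] > state["max_slippage_bps"]:
        return "LIMIT_BREACH"
    if state["remaining_qty"] <= 0:
        return "ORDER_COMPLETE"
    if state["step"] >= state["max_steps"]:
        return "MAX_STEPS_REACHED"
    return "CONTINUE"


def run(order_qty: int, tactics: list, max_steps: int = 3, fill_per_step: int = 100_000):
    """Toy loop: every non-PAUSE tactic fills a fixed amount (an assumption)."""
    state = {
        "remaining_qty": order_qty,
        "step": 0,
        "max_steps": max_steps,
        "est_slippage_bps": 0.0,
        "max_slippage_bps": 25.0,
    }
    for tactic in tactics:
        if tactic != "PAUSE":
            state["remaining_qty"] = max(0, state["remaining_qty"] - fill_per_step)
        state["step"] += 1
        outcome = risk_gate(state)  # the gate, not the tactic, decides whether to stop
        if outcome != "CONTINUE":
            return outcome, state["remaining_qty"]
    return "CONTINUE", state["remaining_qty"]
```

In this toy setup, `run(250_000, ["TWAP", "PAUSE", "PAUSE"])` stops on `MAX_STEPS_REACHED` with 150,000 units unfilled, whereas three TWAP steps finish the order. The stop reason is produced by the gate alone, which keeps the policy in one inspectable place.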

The stopping outcomes themselves are part of the pedagogy. If the regime becomes stressed and the safest move is PAUSE, the system can end via **MAX_STEPS_REACHED** with part of the order still unfilled. That looks unsatisfying only if you think the goal is “always finish.” In real life, the goal is “finish only when feasible under constraints.” Notebook 5 teaches that the most professional decision is sometimes to stop and escalate with clean artifacts rather than grind forward.

Finally, Notebook 5 strengthens the series’ governance story. The graph is visible and matches the topology. Every node execution appends to the trace. Every failure is explicit. Every run exports artifacts. Compared to Notebook 1’s missing-info loop and Notebook 2’s refusal boundary, this notebook demonstrates a more operational form of governance: **execution governance**, the discipline of acting only when conditions allow and proving, after the fact, what happened and why.

This is the progression across notebooks in one sentence: we moved from **answering questions** to **running controlled systems**. Notebook 5 is the first time the system truly behaves like a desk process: iterative, stateful, bounded, and supervised by gates. That is the contribution—and it sets up what comes next, where we scale from a single regime machine into committees, hub-and-spoke constellations, retrieval routers, and event-driven monitoring.
