# Agents: Planning in JSON

**Overview:** Multiple agents coordinate roles, share context, and adapt their behavior to complete complex customer service tasks.

**Steps:** 
1) **Planning**: ask an LLM to produce a JSON-based plan for completing a task
2) **Reflection**: have another agent review and revise the plan
3) **Execution**: write code to execute each step in the plan
4) **Error explanation**: if an error occurs, give the query and the error to an LLM to analyze

**Example: handling customer requests in a retail scenario**

For example, imagine a walk-in customer who wants to return **two Aviator sunglasses**: you'll look up the item, compute the refund, update stock, and record the transaction.  
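To make the scenario concrete, here is a minimal sketch of the arithmetic behind that return. The price (103), stock level (23), and opening register balance (500.0) are taken from the tables loaded later in this notebook, and the negative sign on refunds follows the planning spec. The rule that the new balance equals the previous balance plus the transaction amount is an assumption; the `tools` module may compute it differently.

```python
# Values copied from the inventory and transaction tables shown in this notebook.
price = 103             # Aviator unit price
stock_before = 23       # Aviator quantity_in_stock
balance_before = 500.0  # opening register balance
qty_returned = 2

# Refunds are negative by design (see the planning spec).
refund_amount = -float(qty_returned * price)

# Returned items go back into stock.
stock_after = stock_before + qty_returned

# ASSUMPTION: new balance = previous balance + transaction amount.
balance_after = balance_before + refund_amount

print(refund_amount, stock_after, balance_after)
```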

In [1]:
import sys
print(sys.executable)

/Users/loctruong/Documents/Self-learn/AIAgents/course-agentic-ai/course_agentic_ai_venv/bin/python


## 0. Initialization

### Import

In [3]:
# =========================
# Imports & utilities
# =========================

# --- Standard library ---
from __future__ import annotations
from typing import Any, Callable, Optional
import json
import re

# --- Third-party ---
import pandas as pd
import duckdb
# from openai import OpenAI
from dotenv import load_dotenv

# --- Local ---
import inventory_utils
import utils
import tools

_ = load_dotenv()

In [None]:
from google import genai
from google.genai import types
import os

gemini_api_key = os.getenv("GEMINI_API_KEY")
# gemini_api_key = os.getenv("GEMINI_API_KEY_2")

client = genai.Client(api_key=gemini_api_key)

In [5]:
def get_response(client, model_name, config, prompt):
    response = client.models.generate_content(
        model=model_name,
        contents=prompt,
        config=config,
    )
    return response

### Load data

Create the initial **inventory** and **transactions** tables for the sunglasses store.


In [None]:
inventory_df = inventory_utils.create_inventory_dataframe()
transaction_df = inventory_utils.create_transaction_dataframe()

In [11]:
inventory_df.head()

Unnamed: 0,name,item_id,description,quantity_in_stock,price
0,Aviator,SG001,"Originally designed for pilots, these teardrop...",23,103
1,Wayfarer,SG002,"Featuring thick, angular frames that make a st...",6,92
2,Mystique,SG003,"Inspired by 1950s glamour, these frames sweep ...",3,88
3,Sport,SG004,"Designed for active lifestyles, these wraparou...",11,144
4,Round,SG005,Circular lenses set in minimalist frames creat...,10,86


In [12]:
transaction_df.head()

Unnamed: 0,transaction_id,customer_name,transaction_summary,transaction_amount,balance_after_transaction
0,TXN001,OPENING_BALANCE,Daily opening register balance,500.0,500.0


### DuckDB helpers

To make the data easier to query, create a helper function that sets up a **DuckDB connection** and registers both DataFrames as SQL views:

- `inventory_df`  
- `transaction_df`  

By doing this, the rest of the workflow can reason about `inventory` and `transactions` using familiar database operations. 

In [13]:
# =========================
# DuckDB helpers
# =========================
def create_duckdb_with_views(inventory_df: pd.DataFrame, transaction_df: pd.DataFrame) -> duckdb.DuckDBPyConnection:
    con = duckdb.connect()
    con.register("inventory_df", inventory_df)
    con.register("transaction_df", transaction_df)
    return con

con = create_duckdb_with_views(inventory_df, transaction_df)

In [14]:
result = con.sql("SELECT * FROM inventory_df")
result.df().head()

Unnamed: 0,name,item_id,description,quantity_in_stock,price
0,Aviator,SG001,"Originally designed for pilots, these teardrop...",23,103
1,Wayfarer,SG002,"Featuring thick, angular frames that make a st...",6,92
2,Mystique,SG003,"Inspired by 1950s glamour, these frames sweep ...",3,88
3,Sport,SG004,"Designed for active lifestyles, these wraparou...",11,144
4,Round,SG005,Circular lenses set in minimalist frames creat...,10,86


## 1. Tools

### 1.1 Overview


The `tools` module bundles a set of helper functions that provide:

* **READ**: fetch data with lightweight DuckDB SQL (e.g., look up inventory or transactions).  
* **WRITE**: update in-memory DataFrames (adjust inventory, append new transactions).  
* **Propose-only**: compute outcomes such as totals or balances without mutating state.  
* **Helpers**: perform quick math (e.g., totals, refunds) and simple assertions.  
* **Validations**: check conditions like non-null values or non-negative stock.  
* **Registry**: map string names (including aliases) to callables that the plan or LLM can execute.  

All tool functions are accessible via the registry. By exposing them this way, you can later call tools directly from the planning and execution workflow.
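The registry idea can be sketched in a few lines. This is not the actual `tools.TOOL_REGISTRY` implementation, which is not shown in this notebook. The names `SKETCH_REGISTRY`, `SKETCH_ALIASES`, and `call_tool`, as well as the alias names, are hypothetical, and the two helper functions are stand-ins that only mimic the documented return shape `{"amount": ...}`.

```python
from typing import Any, Callable


def compute_total(qty: float, price: float) -> dict[str, float]:
    """Stand-in helper: purchase total as a positive amount."""
    return {"amount": float(qty * price)}


def compute_refund(qty: float, price: float) -> dict[str, float]:
    """Stand-in helper: refund as a negative amount."""
    return {"amount": -float(qty * price)}


# Canonical tool names map to callables (hypothetical registry).
SKETCH_REGISTRY: dict[str, Callable[..., Any]] = {
    "compute_total": compute_total,
    "compute_refund": compute_refund,
}

# Aliases (illustrative names) map to canonical tool names.
SKETCH_ALIASES: dict[str, str] = {"total": "compute_total", "refund": "compute_refund"}


def call_tool(name: str, **kwargs: Any) -> Any:
    """Resolve an alias if needed, then dispatch to the registered callable."""
    canonical = SKETCH_ALIASES.get(name, name)
    if canonical not in SKETCH_REGISTRY:
        raise KeyError(f"Unknown tool: {name}")
    return SKETCH_REGISTRY[canonical](**kwargs)


print(call_tool("compute_total", qty=3, price=103))  # {'amount': 309.0}
print(call_tool("refund", qty=2, price=103))         # {'amount': -206.0}
```

Dispatching by string name is what lets a JSON plan refer to tools without containing any executable code: the plan carries only names and arguments, and the executor decides what may run.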

### 1.2 Testing

- Example 1

In [27]:
# Lookup a product (READ)
prod = tools.TOOL_REGISTRY["get_inventory_data"](con=con, product_name="Aviator")

In [28]:
prod

{'rows':       name item_id                                        description  \
 0  Aviator   SG001  Originally designed for pilots, these teardrop...   
 
    quantity_in_stock  price  
 0                 23    103  ,
 'match_count': 1,
 'item': {'name': 'Aviator',
  'item_id': 'SG001',
  'description': 'Originally designed for pilots, these teardrop-shaped lenses with thin metal frames offer timeless appeal. The large lenses provide excellent coverage while the lightweight construction ensures comfort during long wear.',
  'quantity_in_stock': 23,
  'price': 103}}

In [29]:
prod['match_count']

1

In [30]:
prod["rows"]

Unnamed: 0,name,item_id,description,quantity_in_stock,price
0,Aviator,SG001,"Originally designed for pilots, these teardrop...",23,103


In [31]:
prod['item']

{'name': 'Aviator',
 'item_id': 'SG001',
 'description': 'Originally designed for pilots, these teardrop-shaped lenses with thin metal frames offer timeless appeal. The large lenses provide excellent coverage while the lightweight construction ensures comfort during long wear.',
 'quantity_in_stock': 23,
 'price': 103}

- Example 2:

In [35]:
# Compute a purchase total (HELPER)
total = tools.TOOL_REGISTRY["compute_total"](qty=3, price=prod["item"]["price"])

In [39]:
total

{'amount': 309.0}

## 2. Plan execution  


### 2.1 execute_plan_tools_only

In [40]:
def execute_plan_tools_only(
    plan: dict[str, Any],
    inventory_df: pd.DataFrame,
    transaction_df: pd.DataFrame,
    return_updated_frames: bool = True,
    stop_on_failed_validation: bool = True  # halt at the first failed validation
) -> dict[str, Any]:
    """
    Executes a plan based ONLY on tools (no SQL visible in the plan).
    - Runs tools step by step and stores results in the context.
    - Performs validations using tools.
    - Write tools return updated DataFrames and the executor applies them automatically.
    - If stop_on_failed_validation=True, execution halts at the first failed validation.
    """
    con = create_duckdb_with_views(inventory_df, transaction_df)
    ctx: dict[str, Any] = {
        "__con__": con,
        "__frames__": {"inventory_df": inventory_df.copy(), "transaction_df": transaction_df.copy()}
    }

    report: dict[str, Any] = {"ok": True, "steps": []}
    try:
        for step in plan.get("steps", []):
            step_number = step.get("step_number")
            description = step.get("description", "")

            # tools
            tool_error = None
            try:
                ran = tools.run_tools_for_step(step, ctx)
            except Exception as e:
                ran = {}
                tool_error = str(e)
                report["ok"] = False

            # validations
            validations = [tools.run_tool_validation(v, ctx) for v in step.get("validations", [])]
            step_ok = (tool_error is None) and all(v.get("ok", False) for v in validations)
            if not step_ok:
                report["ok"] = False

            # record the step
            report["steps"].append({
                "step_number": step_number,
                "description": description,
                "tools_run": list(ran.keys()),
                "tool_error": tool_error,
                "validations": validations,
            })

            # stop execution if a validation failed
            if stop_on_failed_validation and any(not v.get("ok", False) for v in validations):
                report["aborted"] = True
                report["abort_step"] = step_number
                report["abort_reason"] = "validation_failed"
                break

    finally:
        con.close()

    if return_updated_frames:
        report["updated_frames"] = {
            "inventory_df": ctx["__frames__"]["inventory_df"],
            "transaction_df": ctx["__frames__"]["transaction_df"],
        }

    return report

<div style="background-color:#ffe4e1; padding:12px; border-radius:6px; color:black;">  
<strong>Note:</strong> The execution process follows these stages:  
<ol type="a">  
<li>A DuckDB connection is created and working copies of the DataFrames are stored in a <code>context</code>.</li>  
<li>Each step in the plan is executed in sequence, with tools run and results captured in the report.</li>  
<li>Validations are applied as tool calls (instead of exceptions).</li>  
<li>If any tool or validation fails, <code>report["ok"]</code> is set to <code>False</code>; if a validation fails and <code>stop_on_failed_validation=True</code>, execution halts immediately.</li>  
<li>Write tools update the DataFrames automatically, and the updated frames can optionally be returned at the end.</li>  
</ol>  
</div>  
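The control flow described in the note can be reduced to a small, dependency-free sketch. It is a simplification under stated assumptions: there is no DuckDB connection and no DataFrames, validations are plain truthiness checks on a context key (the hypothetical `"key"` field) rather than tool calls, and `run_plan_sketch` is an illustrative name, not part of the notebook's code.

```python
from typing import Any, Callable


def run_plan_sketch(
    plan: dict[str, Any],
    registry: dict[str, Callable[..., Any]],
    stop_on_failed_validation: bool = True,
) -> dict[str, Any]:
    ctx: dict[str, Any] = {}  # results of earlier tool calls, keyed by result_key
    report: dict[str, Any] = {"ok": True, "steps": []}

    for step in plan.get("steps", []):
        tool_error = None
        try:
            for call in step.get("tools", []):
                tool = registry[call["use"]]
                ctx[call["result_key"]] = tool(**call.get("args", {}))
        except Exception as exc:  # a failing tool is recorded, not re-raised
            tool_error = str(exc)

        # SIMPLIFICATION: a validation passes if the named context value is truthy.
        validations = [
            {"name": v["name"], "ok": bool(ctx.get(v["key"]))}
            for v in step.get("validations", [])
        ]
        validations_ok = all(v["ok"] for v in validations)
        if tool_error is not None or not validations_ok:
            report["ok"] = False

        report["steps"].append({
            "step_number": step.get("step_number"),
            "tool_error": tool_error,
            "validations": validations,
        })

        # Only a failed validation aborts the run; later steps are skipped.
        if stop_on_failed_validation and not validations_ok:
            report["aborted"] = True
            report["abort_step"] = step.get("step_number")
            break

    return report


stock = {"Aviator": 23}
registry = {"lookup": lambda name: stock.get(name)}
plan = {"steps": [
    {"step_number": 1,
     "tools": [{"use": "lookup", "args": {"name": "Classic"}, "result_key": "prod"}],
     "validations": [{"name": "product_found", "key": "prod"}]},
    {"step_number": 2,
     "tools": [{"use": "lookup", "args": {"name": "Aviator"}, "result_key": "prod2"}],
     "validations": []},
]}

report = run_plan_sketch(plan, registry)
print(report["ok"], report.get("abort_step"), len(report["steps"]))  # False 1 1
```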

### 2.2 Testing

In [46]:
# Minimal one-step plan: lookup the product "Aviator"
simple_plan = {
    "reasoning": "User wants to check availability of Aviator sunglasses.",
    "steps": [
        {
            "step_number": 1,
            "description": "Lookup Aviator sunglasses in inventory",
            "tools": [
                {"use": "get_inventory_data", "args": {"product_name": "Aviator"}, "result_key": "prod"}
            ],
            "validations": [
                {"name": "product_found", "use_tool": "assert_true", "args": {"value_from": "context.prod.item"}}
            ]
        }
    ]
}

# Run the plan with current inventory and transactions
report = execute_plan_tools_only(
    plan=simple_plan,
    inventory_df=inventory_df,
    transaction_df=transaction_df
)

In [47]:
report

{'ok': True,
 'steps': [{'step_number': 1,
   'description': 'Lookup Aviator sunglasses in inventory',
   'tools_run': ['prod'],
   'tool_error': None,
   'validations': [{'name': 'product_found',
     'ok': True,
     'result': {'ok': True}}]}],
 'updated_frames': {'inventory_df':        name item_id                                        description  \
  0   Aviator   SG001  Originally designed for pilots, these teardrop...   
  1  Wayfarer   SG002  Featuring thick, angular frames that make a st...   
  2  Mystique   SG003  Inspired by 1950s glamour, these frames sweep ...   
  3     Sport   SG004  Designed for active lifestyles, these wraparou...   
  4     Round   SG005  Circular lenses set in minimalist frames creat...   
  
     quantity_in_stock  price  
  0                 23    103  
  1                  6     92  
  2                  3     88  
  3                 11    144  
  4                 10     86  ,
  'transaction_df':   transaction_id    customer_name             t

In [44]:
report.keys()

dict_keys(['ok', 'steps', 'updated_frames'])

In [45]:
report['steps']

[{'step_number': 1,
  'description': 'Lookup Aviator sunglasses in inventory',
  'tools_run': ['prod'],
  'tool_error': None,
  'validations': [{'name': 'product_found',
    'ok': True,
    'result': {'ok': True}}]}]
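The plan above passes `"value_from": "context.prod.item"` instead of a literal value. How `tools.run_tools_for_step` resolves such references is not shown in this notebook, so the following is only a sketch of one plausible approach: strip the `_from` suffix and follow the dotted path through the context. `resolve_path` and `resolve_args` are hypothetical names.

```python
from typing import Any


def resolve_path(ctx: dict[str, Any], path: str) -> Any:
    """Follow a dotted path such as 'context.prod.item.price' through nested dicts."""
    parts = path.split(".")
    if parts[0] != "context":
        raise ValueError(f"Path must start with 'context.': {path}")
    value: Any = ctx
    for part in parts[1:]:
        value = value[part]  # raises KeyError if the path does not exist
    return value


def resolve_args(args: dict[str, Any], ctx: dict[str, Any]) -> dict[str, Any]:
    """Replace every '<name>_from' argument with the value it points to."""
    resolved: dict[str, Any] = {}
    for key, value in args.items():
        if key.endswith("_from"):
            resolved[key[: -len("_from")]] = resolve_path(ctx, value)
        else:
            resolved[key] = value
    return resolved


ctx = {"prod": {"item": {"item_id": "SG001", "price": 103}}}
print(resolve_args({"qty": 2, "price_from": "context.prod.item.price"}, ctx))
# {'qty': 2, 'price': 103}
```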

## 3. Agentic Workflow Steps

### 3.1 Planning Step

The planning step takes a **customer query** (e.g., *"Buy 2 Aviators"*) and transforms it into a structured plan made up entirely of tool calls. The plan specifies:

- Which tools to use  
- How to sequence them  
- What validations to apply  

The output is a JSON object containing `reasoning` and `steps`.  This structured format ensures the plan is both **machine-readable** and **auditable**.
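Because the plan is machine-readable, its structure can be checked before anything is executed. The sketch below is one possible pre-flight check, not part of the notebook's pipeline. The tool names come from the catalog in the planning spec, while `check_plan_shape` and the wording of its messages are hypothetical.

```python
from typing import Any

# Tool names as listed in the planning spec's catalog.
ALLOWED_TOOLS = {
    "get_inventory_data", "get_transaction_data", "compute_total", "compute_refund",
    "update_inventory", "append_transaction", "assert_true", "assert_nonnegative_stock",
}
ALLOWED_VALIDATION_TOOLS = {"assert_true", "assert_nonnegative_stock"}
REQUIRED_STEP_KEYS = ("step_number", "description", "tools", "validations")


def check_plan_shape(plan: Any) -> list[str]:
    """Return a list of structural problems; an empty list means the shape is fine."""
    if not isinstance(plan, dict):
        return ["plan is not a JSON object"]

    problems = [f"missing top-level key '{k}'" for k in ("reasoning", "steps") if k not in plan]
    steps = plan.get("steps", [])
    if not isinstance(steps, list):
        return problems + ["'steps' is not a list"]

    for i, step in enumerate(steps, start=1):
        if not isinstance(step, dict):
            problems.append(f"step {i}: not an object")
            continue
        for key in REQUIRED_STEP_KEYS:
            if key not in step:
                problems.append(f"step {i}: missing key '{key}'")
        for call in step.get("tools", []):
            if call.get("use") not in ALLOWED_TOOLS:
                problems.append(f"step {i}: unknown tool '{call.get('use')}'")
        for v in step.get("validations", []):
            if v.get("use_tool") not in ALLOWED_VALIDATION_TOOLS:
                problems.append(f"step {i}: validation tool '{v.get('use_tool')}' not allowed")
    return problems


bad_plan = {
    "reasoning": "Look up a product.",
    "steps": [{
        "step_number": 1,
        "description": "Lookup",
        "tools": [{"use": "lookup_product", "args": {}, "result_key": "prod"}],
    }],
}
for problem in check_plan_shape(bad_plan):
    print(problem)
```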

#### 3.1.1 Prompt

In [50]:
# =========================
# Planning spec (TOOLS-ONLY) and planning workflow
# =========================

# Shared planning spec: TOOLS ONLY (no raw SQL in the plan)
PLANNING_SPEC_TOOLS_ONLY = """
You are a planning system for a sunglasses store. Produce a FULL, AUTONOMOUS plan using TOOLS ONLY.
We will run this plan against two pandas DataFrames registered in DuckDB as views:
- inventory_df(name, item_id, description, quantity_in_stock, price)
- transaction_df(transaction_id, customer_name, transaction_summary, transaction_amount, balance_after_transaction)

Customer intents include:
- Purchase: "I want to buy 3 Aviators"
- Return: "I'd like to return two Sport sunglasses"
- Inquiry: "Do you have Mystique glasses?"
- Browse: "Show me what's available"

IMPORTANT: ALLOWED TOOLS ONLY (do NOT invent new tools)
Tool catalog (names, exact args, outputs):
1) get_inventory_data
   - args: { product_name?: string, item_id?: string }
   - returns: { rows: DataFrame, match_count: int, item: dict|null }
   - notes: Use this for product lookup (case-insensitive by name) or by item_id.
2) get_transaction_data
   - args: { mode?: "last_balance" }
   - returns: { mode: string, last_txn_id: string|null, last_balance: number }
3) compute_total
   - args: { qty: number, price: number }
   - returns: { amount: number }
4) compute_refund
   - args: { qty: number, price: number }   # Refund is negative by design
   - returns: { amount: number }
5) update_inventory
   - args: { item_id: string, delta?: number, quantity_new?: number }
   - returns: { inventory_df: DataFrame, updated: { item_id: string, quantity_in_stock: number } }
   - notes: For purchase use delta = -qty. For return use delta = +qty.
6) append_transaction
   - args: { customer_name: string, summary: string, amount: number }
   - returns: { transaction_df: DataFrame, transaction: { ... } }
7) assert_true
   - args: { value: any }                    # passes if truthy (non-null/non-zero/non-empty)
   - returns: { ok: boolean }
8) assert_nonnegative_stock
   - args: { inventory_df: DataFrame, item_id: string }
   - returns: { ok: boolean, qty: number }

STRICT RULES:
1) Return VALID JSON ONLY with keys: reasoning, steps.
2) Each step MUST contain:
   - "step_number": integer
   - "description": short human text
   - "tools": an array of tool calls in order. Each tool call is:
       {"use": "<tool_name>", "args": {...}, "result_key": "<context_key>"}
     * You MAY reference previous results using dotted paths starting with "context.", e.g., "context.prod.item.price".
     * Use *_from to resolve from context, e.g., {"price_from": "context.prod.item.price"}.
     * Use ONLY the tools listed above. Do NOT use names like assert_one, assert_gt, assert_contains, format_return_summary, lookup_product, propose_transaction, etc.
     * Strings like the transaction summary MUST be composed inline by you (e.g., "Return 2 Sport sunglasses").
   - "validations": array of tool validations:
       {"name": "...", "use_tool": "<tool_name>", "args": {...}}
     * Allowed validation tools: assert_true, assert_nonnegative_stock ONLY.
     * Examples:
         - product_found: assert_true with {"value_from": "context.prod.item"} (non-null)
         - nonnegative_stock_after_update: assert_nonnegative_stock with {"inventory_df_from": "context.__frames__.inventory_df", "item_id_from": "context.prod.item.item_id"}
3) Do NOT include raw SQL in the plan. Tools run any needed SQL internally.
4) For purchases/returns, include tool calls to:
   - Lookup product via get_inventory_data (case-insensitive by name)
   - (Purchase only) compute_total; (Return only) compute_refund
   - Create a clear summary STRING inline (e.g., "Purchase 3 Aviator sunglasses" / "Return 2 Sport sunglasses")
   - Update inventory via update_inventory (delta = -qty for purchases, +qty for returns)
   - Append the transaction via append_transaction (amount from compute_total/compute_refund)
5) Use canonical arg names exactly as in the Tool catalog:
   - quantity -> use qty
   - unit_price -> use price
   - Do NOT add extra args like sign; compute_refund already returns negative amounts.

OUTPUT JSON SHAPE:
{
  "reasoning": "...",
  "steps": [
    {
      "step_number": 1,
      "description": "...",
      "tools": [ {"use": "...", "args": {...}, "result_key": "..."} ],
      "validations": [ {"name": "...", "use_tool": "...", "args": {...}} ]
    }
  ]
}

EXAMPLE (Return 2 Sport sunglasses for a walk-in):
{
  "reasoning": "User requests a return of 2 Sport units. We'll lookup product, compute a negative refund, update stock (+2), and append a refund transaction.",
  "steps": [
    {
      "step_number": 1,
      "description": "Lookup product 'Sport' and capture item details",
      "tools": [
        {"use": "get_inventory_data", "args": {"product_name": "Sport"}, "result_key": "prod"}
      ],
      "validations": [
        {"name": "product_found", "use_tool": "assert_true", "args": {"value_from": "context.prod.item"}}
      ]
    },
    {
      "step_number": 2,
      "description": "Compute refund amount for qty=2",
      "tools": [
        {"use": "compute_refund", "args": {"qty": 2, "price_from": "context.prod.item.price"}, "result_key": "refund"}
      ],
      "validations": []
    },
    {
      "step_number": 3,
      "description": "Update inventory by adding returned quantity",
      "tools": [
        {"use": "update_inventory", "args": {"item_id_from": "context.prod.item.item_id", "delta": 2}, "result_key": "inv_after"}
      ],
      "validations": [
        {"name": "stock_nonnegative", "use_tool": "assert_nonnegative_stock",
         "args": {"inventory_df_from": "context.__frames__.inventory_df", "item_id_from": "context.prod.item.item_id"}}
      ]
    },
    {
      "step_number": 4,
      "description": "Append the refund transaction for a walk-in customer",
      "tools": [
        {"use": "append_transaction",
         "args": {
           "customer_name": "WALK_IN_CUSTOMER",
           "summary": "Return 2 Sport sunglasses",
           "amount_from": "context.refund.amount"
         },
         "result_key": "txn"}
      ],
      "validations": [
        {"name": "transaction_created", "use_tool": "assert_true",
         "args": {"value_from": "context.txn.transaction.transaction_id"}}
      ]
    }
  ]
}
"""

In [49]:
# def generate_plan(user_query: str, model: str = "o4-mini") -> dict[str, Any]:
#     context = f"{PLANNING_SPEC_TOOLS_ONLY}\n\nCustomer query: {user_query}\nProduce the plan now."
#     resp = client.chat.completions.create(
#         model=model,
#         messages=[
#             {"role": "system", "content": "Return JSON ONLY following the TOOLS-ONLY planning spec."},
#             {"role": "user", "content": context},
#         ],
#         response_format={"type": "json_object"},
#     )
#     return json.loads(resp.choices[0].message.content)

In [58]:
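def generate_plan(client: genai.Client, user_query: str, model: str = "gemini-2.5-flash") -> str:
    """Ask the model for a TOOLS-ONLY plan and return its raw text response.

    The text may be wrapped in a Markdown code fence, so parse it
    (e.g., with extract_json_from_text) before using it as a dict.
    """
    context = f"{PLANNING_SPEC_TOOLS_ONLY}\n\nCustomer query: {user_query}\nProduce the plan now."

    config = types.GenerateContentConfig(
        system_instruction="Return JSON ONLY following the TOOLS-ONLY planning spec."
    )

    resp = get_response(client, model, config, context)

    return resp.text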

#### 3.1.2 Testing

In [59]:
# =========================
# Try the planning step
# =========================
user_query = "I'd like to return two Aviator sunglasses"

draft_plan = generate_plan(client, user_query, model="gemini-2.5-flash")

In [None]:
print(draft_plan)

```json
{
  "reasoning": "The user wants to return 2 Aviator sunglasses. This involves looking up the product, calculating the refund amount, updating the inventory to add the returned items back, and recording the transaction.",
  "steps": [
    {
      "step_number": 1,
      "description": "Lookup the product 'Aviator' to get its details, including price and item_id.",
      "tools": [
        {
          "use": "get_inventory_data",
          "args": {
            "product_name": "Aviator"
          },
          "result_key": "prod"
        }
      ],
      "validations": [
        {
          "name": "product_found",
          "use_tool": "assert_true",
          "args": {
            "value_from": "context.prod.item"
          }
        }
      ]
    },
    {
      "step_number": 2,
      "description": "Calculate the refund amount for 2 units of Aviator sunglasses.",
      "tools": [
        {
          "use": "compute_refund",
          "args": {
            "qty": 2,
         

In [73]:
import re
import json

def extract_json_from_text(s: str) -> dict[str, Any]:
    # If the string starts/ends with a single quote, remove it
    if s.startswith("'") and s.endswith("'"):
        s = s[1:-1]

    # Extract JSON between ```json and ```
    m = re.search(r"```(?:json)?\s*(\{[\s\S]*\})\s*```", s, re.DOTALL)
    if not m:
        # fallback: extract first { .. last } block
        start = s.find('{')
        end = s.rfind('}')
        if start != -1 and end > start:
            json_text = s[start:end+1]
        else:
            raise ValueError("Could not find JSON in the string")
    else:
        json_text = m.group(1)

    # parse
    data = json.loads(json_text)
    return data

In [75]:
draft_plan = extract_json_from_text(draft_plan)

In [81]:
print(json.dumps(draft_plan, indent=2))

{
  "reasoning": "The user wants to return 2 Aviator sunglasses. This involves looking up the product, calculating the refund amount, updating the inventory to add the returned items back, and recording the transaction.",
  "steps": [
    {
      "step_number": 1,
      "description": "Lookup the product 'Aviator' to get its details, including price and item_id.",
      "tools": [
        {
          "use": "get_inventory_data",
          "args": {
            "product_name": "Aviator"
          },
          "result_key": "prod"
        }
      ],
      "validations": [
        {
          "name": "product_found",
          "use_tool": "assert_true",
          "args": {
            "value_from": "context.prod.item"
          }
        }
      ]
    },
    {
      "step_number": 2,
      "description": "Calculate the refund amount for 2 units of Aviator sunglasses.",
      "tools": [
        {
          "use": "compute_refund",
          "args": {
            "qty": 2,
            "pric

### 3.2 Reflection Step

The reflection step acts as a **quality control process**:  

- It repairs **minor issues** (e.g., invalid JSON escapes).  
- It critiques the draft and produces a corrected `revised_plan` that follows the **TOOLS-ONLY spec**.  
- It guarantees structure by always returning a JSON object with two keys:  
  - `critique` → feedback on the draft  
  - `revised_plan` → the corrected plan  

If the reflection output is malformed, the system gracefully falls back to the original draft.
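To illustrate what repairing an invalid JSON escape can look like, here is a small heuristic sketch. It drops backslashes that do not start a legal JSON escape sequence, using only the standard library. Note that the notebook's `_parse_json_or_repair` currently falls back to `extract_json_from_text` rather than to a repair like this one; `repair_invalid_escapes` is a hypothetical name, and the heuristic can misfire on strings that contain an escaped backslash followed by an arbitrary character.

```python
import json
import re

# Characters that may legally follow a backslash inside a JSON string.
ALLOWED_ESCAPES = r'["\\/bfnrtu]'


def repair_invalid_escapes(text: str) -> str:
    """Heuristically drop backslashes that do not start a legal JSON escape."""
    text = text.replace("\\'", "'")  # \' is legal in Python, but not in JSON
    return re.sub(rf"\\(?!{ALLOWED_ESCAPES})", "", text)


# A model response with two illegal escapes: \' and \%
raw = r'''{"summary": "It\'s a return", "note": "50\% off"}'''

try:
    json.loads(raw)
except json.JSONDecodeError as exc:
    print("strict parse failed:", exc.msg)

parsed = json.loads(repair_invalid_escapes(raw))
print(parsed)  # {'summary': "It's a return", 'note': '50% off'}
```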

#### 3.2.1 reflect_on_plan

In [95]:
# =========================
# Reflection step (enforces the same TOOLS-ONLY spec)
# =========================
_ALLOWED_ESC = r'["\\/bfnrtu]'
def _repair_invalid_json_escapes(s: str) -> str:
    s = s.replace("\\'", "'")
    return re.sub(rf'\\(?!{_ALLOWED_ESC})', r'', s)

def _parse_json_or_repair(s: str) -> dict[str, Any]:
    try:
        return json.loads(s)
    except Exception:
        # return json.loads(_repair_invalid_json_escapes(s))
        return extract_json_from_text(s)

def reflect_on_plan(client: genai.Client, user_query: str, draft_plan: dict[str, Any], model: str = "gemini-2.5-flash") -> dict[str, Any]:
    sys = (
        "You are a senior plan reviewer. Return STRICT JSON with keys "
        "'critique' (string) and 'revised_plan' (object). The revised_plan MUST follow the TOOLS-ONLY spec."
    )
    user = (
        "TOOLS-ONLY PLANNING SPEC (enforce exactly):\n"
        f"{PLANNING_SPEC_TOOLS_ONLY}\n\n"
        "Customer query:\n"
        f"{user_query}\n\n"
        "Draft plan (JSON):\n"
        f"{json.dumps(draft_plan, ensure_ascii=False)}\n\n"
        "Task: Critique the draft against the spec and return a corrected 'revised_plan' if needed. "
        "Ensure valid JSON and that no raw SQL appears in the plan (only tool calls)."
    )
    # resp = client.chat.completions.create(
    #     model=model,
    #     messages=[{"role": "system", "content": sys},
    #               {"role": "user", "content": user}],
    #     response_format={"type": "json_object"},
    # )
    # data = _parse_json_or_repair(resp.choices[0].message.content)

    config = types.GenerateContentConfig(
        system_instruction=sys
    )

    resp = get_response(client, model, config, user)

    data = _parse_json_or_repair(resp.text)
    if "revised_plan" not in data or not isinstance(data["revised_plan"], dict):
        if "steps" in data:
            data = {"critique": "No explicit critique provided.", "revised_plan": data}
        else:
            data = {"critique": "Malformed reflection output; falling back to draft.", "revised_plan": draft_plan}
    return data

#### 3.2.2 Testing

For this, you will intentionally craft a slightly flawed draft plan (e.g., wrong argument names and missing validations). Run the next cell to see how this step produces a **critique** and a **revised_plan** that complies with the TOOLS-ONLY spec.

In [None]:
# A purposely imperfect draft plan (arg mismatch + missing validation)
draft_plan = {
    "reasoning": "User wants to buy 2 Aviators.",
    "steps": [
        {
            "step_number": 1,
            "description": "Lookup product 'Aviator'",
            "tools": [
                {"use": "get_inventory_data", "args": {"product_name": "Aviator"}, "result_key": "prod"}
            ],
            "validations": []  # (missing product_found validation)
        },
        {
            "step_number": 2,
            "description": "Compute total for purchase",
            "tools": [
                # <-- wrong arg name: should be qty
                {"use": "compute_total", "args": {"quantity": 2, "price_from": "context.prod.item.price"}, "result_key": "total"}
            ],
            "validations": []
        }
    ]
}

user_query = "I'd like to buy 2 Aviator sunglasses for a walk-in customer"

# Run the reflection step to critique and revise the draft
reflection = reflect_on_plan(client, user_query=user_query, draft_plan=draft_plan, model="gemini-2.5-flash")

# Display results
utils.print_html(reflection.get("critique"), title="Reflection Critique")
utils.print_html(json.dumps(reflection.get("revised_plan"), indent=2), title="Revised Plan (TOOLS-ONLY compliant)")

### 3.3 Error Explanation Step

Even with validations in place, execution may still fail, perhaps due to insufficient stock or a malformed plan. The error explanation step translates these failures into **human-readable guidance**.  

- It takes the **customer query** and the **execution report** as input.  
- It uses a model to **summarize errors in plain language**.  
- It outputs clear feedback, e.g.: *"Stock would go negative. Try reducing the quantity."*  

This closes the loop, helping learners connect technical validation errors to **actionable insights**.
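Before handing a report to a model, it can help to pull out just the failing parts. The following sketch assumes the report layout produced by `execute_plan_tools_only` (a `steps` list whose entries carry `tool_error` and `validations`); `summarize_failures` is a hypothetical helper and is not used elsewhere in this notebook.

```python
from typing import Any


def summarize_failures(report: dict[str, Any]) -> list[str]:
    """Collect one line per tool error or failed validation in an execution report."""
    lines: list[str] = []
    for step in report.get("steps", []):
        label = f"Step {step.get('step_number')} ({step.get('description', '')})"
        if step.get("tool_error"):
            lines.append(f"{label}: tool error: {step['tool_error']}")
        for validation in step.get("validations", []):
            if not validation.get("ok", False):
                lines.append(f"{label}: validation '{validation.get('name')}' failed")
    return lines


example_report = {
    "ok": False,
    "steps": [
        {"step_number": 1, "description": "Lookup product", "tool_error": None,
         "validations": [{"name": "product_found", "ok": True}]},
        {"step_number": 2, "description": "Update inventory", "tool_error": None,
         "validations": [{"name": "stock_nonnegative", "ok": False}]},
    ],
}

for line in summarize_failures(example_report):
    print(line)
# Step 2 (Update inventory): validation 'stock_nonnegative' failed
```

A compact summary like this keeps the prompt short and avoids sending large DataFrame dumps to the model.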

<div style="background-color:#ffe4e1; padding:12px; border-radius:6px; color:black;">
  <strong>Note:</strong> Use the <em>Error Explanation Step</em> when <code>report['ok']</code> is <code>False</code>.  
  This allows you to turn a failed execution report into a clear, plain-English explanation of what went wrong and how to fix it.
</div>

#### 3.3.1 explain_execution_error

In [96]:
def explain_execution_error(client: genai.Client, user_query: str, execution_report: dict[str, Any], model: str = "gemini-2.5-flash") -> str:
    sys = (
        "You are a senior plan reviewer. Given a json with the report, "
        "explain in simple terms what went wrong and how to fix it."
    )
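    user = (
        "Customer query:\n"
        f"{user_query}\n\n"
        "Execution report (JSON):\n"
        # Serialize as real JSON; default=str covers DataFrames and other non-JSON values
        f"{json.dumps(execution_report, indent=2, default=str)}\n\n"
        "Task: Explain in simple terms what went wrong and how to fix it."
    )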
    # resp = client.chat.completions.create(
    #     model=model,
    #     messages=[{"role": "system", "content": sys},
    #               {"role": "user", "content": user}],
    #     response_format={"type": "text"},
    # )

    config = types.GenerateContentConfig(
        system_instruction=sys
    )
    resp = get_response(client, model, config, user)

    return resp.text

#### 3.3.2 Testing

To demonstrate, we'll simulate a failed execution where a purchase request would make the stock go negative. The step will take the **customer query** and the **execution report**, then translate the technical error into simple guidance.

Run the next cell to see how it summarizes the problem and suggests a fix.

In [90]:
# Example: simulate a failed execution report
failed_report = {
    "ok": False,
    "steps": [
        {
            "step_number": 1,
            "description": "Lookup product 'Aviator'",
            "tools_run": ["get_inventory_data"],
            "tool_error": None,
            "validations": [{"name": "product_found", "ok": True}],
        },
        {
            "step_number": 2,
            "description": "Update inventory by subtracting 999 units",
            "tools_run": ["update_inventory"],
            "tool_error": None,
            "validations": [{"name": "stock_nonnegative", "ok": False, "qty": -995}],
        },
    ],
    "aborted": True,
    "abort_step": 2,
    "abort_reason": "validation_failed",
}

# Use the error explanation step to explain the error
explanation = explain_execution_error(
    client,
    user_query="Buy 999 Aviator sunglasses",
    execution_report=failed_report
)

utils.print_html(content=explanation, title="Error Explanation")

## 4. End-to-end pipeline

Now that you have tools, an executor, and workflow steps in place, you will run the whole pipeline from a single prompt. This walkthrough helps you see the full flow and where each piece fits.

**What you will do:**

a) Preview the initial data.  
b) Build a **tools-only** plan from a natural-language request.  
c) Run a **reflection** pass to enforce the spec and fix small issues.  
d) Execute the final plan safely.  
e) Inspect the step-by-step report and updated tables.  
f) (If something fails) Generate a plain-English error explanation.

**Prompt ideas (expected behavior)**

- "I'd like to **buy five Mystique** sunglasses for a walk-in customer" → **Fail** (would drive stock negative)
- "I'd like to **return two Aviator** sunglasses for a walk-in customer" → **Pass**
- "I'd like to **buy ten Sport** sunglasses for a walk-in customer" → **Pass**
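The expected outcomes follow directly from the stock levels in the inventory table (Mystique 3, Aviator 23, Sport 11). A quick sanity check, written as a standalone sketch with a hypothetical `projected_stock` helper:

```python
# Stock levels copied from the inventory table shown earlier.
stock = {"Mystique": 3, "Aviator": 23, "Sport": 11}

# (action, product, quantity) for the three prompt ideas.
requests = [("buy", "Mystique", 5), ("return", "Aviator", 2), ("buy", "Sport", 10)]


def projected_stock(action: str, name: str, qty: int) -> int:
    """Purchases subtract from stock; returns add to it."""
    delta = -qty if action == "buy" else qty
    return stock[name] + delta


results = []
for action, name, qty in requests:
    remaining = projected_stock(action, name, qty)
    verdict = "PASS" if remaining >= 0 else "FAIL"
    results.append((name, remaining, verdict))
    print(f"{action} {qty} {name}: stock would be {remaining} -> {verdict}")
```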

In [100]:
def run_workflow(
        client: genai.Client, 
        user_prompt: str, 
        inventory_df: pd.DataFrame, 
        transaction_df: pd.DataFrame, 
        model_planner: str,
        model_reflector: str,
        model_analysis: str,
        ) -> None:
    utils.print_html(user_prompt, title="User Prompt")

    # Assumes you already created inventory_df and transaction_df.
    utils.print_html(inventory_df, title="Initial Inventory (sample)")
    utils.print_html(transaction_df, title="Initial Transactions (tail)")


    # 1) Create a plan (TOOLS-ONLY)
    # generate_plan returns raw model text, so parse it into a dict before use
    draft_plan = extract_json_from_text(generate_plan(client, user_prompt, model=model_planner))
    utils.print_html(json.dumps(draft_plan, indent=2), title="Draft Plan")

    # 2) Reflect and possibly revise
    reflection = reflect_on_plan(client, user_prompt, draft_plan, model=model_reflector)
    utils.print_html(reflection.get("critique"), title="Reflection Critique")
    final_plan = reflection["revised_plan"]
    utils.print_html(json.dumps(final_plan, indent=2), title="Final Plan (After Reflection)")

    # 3) Execute the final plan (no 'apply' arg; the executor auto-applies when tools return updated DataFrames)
    report = execute_plan_tools_only(
        final_plan,
        inventory_df,
        transaction_df,
        return_updated_frames=True
    )

    # 4) Show the execution report
    utils.print_html(f"Overall execution status: {'SUCCESS' if report['ok'] else 'FAILED'}", title="Execution Status")
    for step in report["steps"]:
        utils.print_html(json.dumps(step, indent=2), title=f"Execution Report Step {step['step_number']}: {step['description']}")

    if report["ok"]:
        inventory_df = report["updated_frames"]["inventory_df"]
        transaction_df = report["updated_frames"]["transaction_df"]
        utils.print_html(inventory_df, title="Updated Inventory")
        utils.print_html(transaction_df, title="Updated Transactions")
    else:
        utils.print_html("Some validations failed, no transactions were made; check the message below.", title="Execution Status")
        error_ = explain_execution_error(client, user_prompt, report, model=model_analysis)
        utils.print_html(f"<pre>{error_}</pre>", title="Error Explanation")


In [101]:
user_prompt = "I'd like to return two Aviator sunglasses for a walk-in customer"
model_planner = "gemini-2.5-flash"
model_reflector = "gemini-2.5-flash"
model_analysis = "gemini-2.5-flash"

run_workflow(client, user_prompt, inventory_df, transaction_df, model_planner, model_reflector, model_analysis)

name,item_id,description,quantity_in_stock,price
Aviator,SG001,"Originally designed for pilots, these teardrop-shaped lenses with thin metal frames offer timeless appeal. The large lenses provide excellent coverage while the lightweight construction ensures comfort during long wear.",23,103
Wayfarer,SG002,"Featuring thick, angular frames that make a statement, these sunglasses combine retro charm with modern edge. The rectangular lenses and sturdy acetate construction create a confident look.",6,92
Mystique,SG003,"Inspired by 1950s glamour, these frames sweep upward at the outer corners to create an elegant, feminine silhouette. The subtle curves and often embellished temples add sophistication to any outfit.",3,88
Sport,SG004,"Designed for active lifestyles, these wraparound sunglasses feature a single curved lens that provides maximum coverage and wind protection. The lightweight, flexible frames include rubber grips.",11,144
Round,SG005,"Circular lenses set in minimalist frames create a thoughtful, artistic appearance. These sunglasses evoke a scholarly or creative vibe while remaining effortlessly stylish.",10,86


transaction_id,customer_name,transaction_summary,transaction_amount,balance_after_transaction
TXN001,OPENING_BALANCE,Daily opening register balance,500.0,500.0


name,item_id,description,quantity_in_stock,price
Aviator,SG001,"Originally designed for pilots, these teardrop-shaped lenses with thin metal frames offer timeless appeal. The large lenses provide excellent coverage while the lightweight construction ensures comfort during long wear.",25,103
Wayfarer,SG002,"Featuring thick, angular frames that make a statement, these sunglasses combine retro charm with modern edge. The rectangular lenses and sturdy acetate construction create a confident look.",6,92
Mystique,SG003,"Inspired by 1950s glamour, these frames sweep upward at the outer corners to create an elegant, feminine silhouette. The subtle curves and often embellished temples add sophistication to any outfit.",3,88
Sport,SG004,"Designed for active lifestyles, these wraparound sunglasses feature a single curved lens that provides maximum coverage and wind protection. The lightweight, flexible frames include rubber grips.",11,144
Round,SG005,"Circular lenses set in minimalist frames create a thoughtful, artistic appearance. These sunglasses evoke a scholarly or creative vibe while remaining effortlessly stylish.",10,86


transaction_id,customer_name,transaction_summary,transaction_amount,balance_after_transaction
TXN001,OPENING_BALANCE,Daily opening register balance,500.0,500.0
TXN002,WALK_IN_CUSTOMER,Return 2 Aviator sunglasses,-206.0,294.0
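
The updated tables can be cross-checked by hand. The following is a minimal sketch with values copied from the output above; the variable names are illustrative and not part of the workflow code.

```python
# Values copied from the tables printed by this run.
aviator_price = 103       # price column, Aviator row
returned_qty = 2          # "return two Aviator sunglasses"
opening_balance = 500.0   # TXN001 balance_after_transaction
stock_before = 23         # Aviator quantity_in_stock before the return

refund = aviator_price * returned_qty       # money leaving the register
balance_after = opening_balance - refund    # register balance after the refund
stock_after = stock_before + returned_qty   # returned items go back into stock

print(f"transaction_amount: {-float(refund)}")
print(f"balance_after_transaction: {balance_after}")
print(f"quantity_in_stock: {stock_after}")
```

These values agree with the TXN002 row (-206.0 and 294.0) and with the Aviator quantity of 25 in the updated inventory.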
