# Level 2 - Week 7 - 03 Trace Debugging

**Estimated time:** 60-90 minutes

## Learning Objectives

- Record decisions per step
- Summarize tool outputs
- Make traces readable


## Overview

If you cannot explain why the agent chose a tool, you don't control it.

Traces make behavior diagnosable.

## Underlying theory: a trace is a causal explanation

Each step is a decision made from state:

$$
\text{decision}_t = \pi(s_t)
$$

So a trace must record enough of $s_t$ (task, plan, evidence summaries, error history) to explain why $\pi$ chose that decision.
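
One way to make this concrete is to store a small snapshot of $s_t$ next to each decision and check that replaying the policy on the snapshot gives the same decision. The sketch below is illustrative only: `StateSnapshot`, `policy`, and `record_step` are hypothetical names, and the toy rule-based `policy` stands in for whatever model or planner the agent actually uses.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class StateSnapshot:
    """Hypothetical summary of s_t; the field names are assumptions."""
    task: str
    plan: list[str] = field(default_factory=list)
    evidence_summaries: list[str] = field(default_factory=list)
    error_history: list[str] = field(default_factory=list)


def policy(state: StateSnapshot) -> str:
    """Toy stand-in for pi: a pure function of the recorded state."""
    if state.error_history:
        return "fallback_refuse"
    if not state.evidence_summaries:
        return "call_search"
    return "write_answer"


def record_step(trace: dict, state: StateSnapshot) -> str:
    """Store the state snapshot alongside the decision it produced."""
    decision = policy(state)
    trace["steps"].append(
        {
            "step_index": len(trace["steps"]) + 1,
            "state": asdict(state),
            "decision": decision,
        }
    )
    return decision


trace = {"request_id": "req-123", "steps": []}
s1 = StateSnapshot(task="answer refund policy question")
record_step(trace, s1)
s2 = StateSnapshot(task=s1.task, evidence_summaries=["1 hit: kb#001"])
record_step(trace, s2)

# Replay: the recorded state alone is enough to reproduce each decision.
for step in trace["steps"]:
    assert policy(StateSnapshot(**step["state"])) == step["decision"]
```

With a real model the replay will rarely be exact, so in practice the snapshot serves as the evidence a human reads to judge whether the decision was reasonable.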

## What a good trace contains

- `request_id`
- `step_index`
- `tool_call`
- `tool_result` (or short `output_summary`)
- `decision`
- `error` (if any)

Practical rule:

- if you cannot reproduce the decision from the trace, the trace is missing critical fields
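
The field list above can be turned into a quick completeness check. This is a minimal sketch under assumptions: `missing_fields` is a hypothetical helper, steps are treated as either decision steps or tool steps, and a tool step is expected to carry at least one of `tool_result`, `output_summary`, or `error`. Adjust the required sets to match your own schema (the code cells in this lesson use `tool` and `input` rather than `tool_call`).

```python
def missing_fields(trace: dict) -> list[str]:
    """Return human-readable problems; an empty list means the trace passes."""
    problems = []
    if "request_id" not in trace:
        problems.append("trace: missing request_id")

    for position, step in enumerate(trace.get("steps", []), start=1):
        if "decision" in step:
            required = {"step_index", "decision"}
        else:
            # Assumed rule: anything that is not a decision is a tool step.
            required = {"step_index", "tool_call"}
            outcome_keys = ("tool_result", "output_summary", "error")
            if not any(key in step for key in outcome_keys):
                problems.append(
                    f"step {position}: no tool_result, output_summary, or error"
                )
        for name in sorted(required - set(step)):
            problems.append(f"step {position}: missing {name}")
    return problems


good = {
    "request_id": "req-123",
    "steps": [
        {"step_index": 1, "decision": "call_search"},
        {
            "step_index": 2,
            "tool_call": {"tool": "search", "query": "refund policy"},
            "output_summary": "5 hits",
        },
    ],
}
bad = {"steps": [{"step_index": 1, "tool_call": {"tool": "search"}}]}

print(missing_fields(good))
print(missing_fields(bad))
```

A check like this only confirms that the fields exist. Whether their contents actually explain the decision still needs a human read.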

## Practice Steps

- Create a trace object with steps.
- Add `output_summary` (short) and optionally store full outputs separately.
- Add `error` fields and confirm failures are explainable from the trace.
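
The second step can be sketched as follows: keep a short `output_summary` in the trace and put the full output elsewhere, linked by a reference. Everything here is an assumption for illustration: the 80-character budget, the `output_ref` key, the helper names, and the in-memory `full_outputs` dict standing in for a file or blob store.

```python
import hashlib
import json

MAX_SUMMARY_CHARS = 80  # assumed budget; pick what stays readable for you


def summarize(tool_output: dict, limit: int = MAX_SUMMARY_CHARS) -> str:
    """Serialize the output and truncate it to at most `limit` characters."""
    text = json.dumps(tool_output, sort_keys=True)
    return text if len(text) <= limit else text[: limit - 3] + "..."


def store_full_output(store: dict, tool_output: dict) -> str:
    """Save the full output under a content hash and return that reference."""
    payload = json.dumps(tool_output, sort_keys=True)
    ref = hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]
    store[ref] = payload
    return ref


full_outputs = {}  # in-memory stand-in for a separate store
trace = {"request_id": "req-123", "steps": []}

output = {
    "hits": [
        {"id": f"kb#{i:03d}", "text": "refund policy text " * 20}
        for i in range(5)
    ]
}
trace["steps"].append(
    {
        "step_index": 1,
        "tool": "search",
        "output_summary": summarize(output),
        "output_ref": store_full_output(full_outputs, output),
        "error": None,
    }
)

print(json.dumps(trace, indent=2))
```

Blind truncation is the crudest possible summary. A tool-specific summary such as a hit count is usually more useful, and the reference keeps the full output available when the summary is not enough.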

### Sample code

Trace structure with steps and decisions.


```python
trace = {
    'request_id': 'req-123',
    'steps': [
        {'step_index': 1, 'decision': 'call_search'},
        {'step_index': 2, 'tool': 'search', 'output_summary': '5 hits'},
        {'step_index': 3, 'decision': 'write_answer'},
    ],
}

print(trace)
```


### Student fill-in

Add `model_input_summary` or `error` fields to the steps.


```python
# Lets the `dict | None` annotations below load on Python 3.7 to 3.9 as well.
from __future__ import annotations

import json


def add_tool_step(
    trace: dict,
    step_index: int,
    tool: str,
    tool_input: dict,
    tool_output: dict | None,
    error: str | None = None,
) -> None:
    """Append a tool step that stores a short summary, not the full output."""
    output_summary = None
    if tool_output is not None:
        if tool == "search":
            # For search, the hit count is enough to explain the next decision.
            output_summary = f"{len(tool_output.get('hits', []))} hits"
        else:
            output_summary = "ok"

    trace["steps"].append(
        {
            "step_index": step_index,
            "tool": tool,
            "input": tool_input,
            "output_summary": output_summary,
            "error": error,
        }
    )


def add_decision_step(
    trace: dict,
    step_index: int,
    decision: str,
    model_input_summary: str | None = None,
) -> None:
    """Append a decision step with a short note on what the model saw."""
    trace["steps"].append(
        {
            "step_index": step_index,
            "decision": decision,
            "model_input_summary": model_input_summary,
        }
    )


trace = {"request_id": "req-123", "steps": []}

add_decision_step(trace, 1, decision="call_search", model_input_summary="need evidence for refund policy")
add_tool_step(trace, 2, tool="search", tool_input={"query": "refund policy", "top_k": 3}, tool_output={"hits": ["kb#001"]})
add_decision_step(trace, 3, decision="write_answer_with_citations", model_input_summary="have evidence; proceed")
# A failed tool call: no output, but the error explains the fallback in step 5.
add_tool_step(trace, 4, tool="write_answer", tool_input={"question": "refund policy"}, tool_output=None, error="timeout")
add_decision_step(trace, 5, decision="fallback_refuse", model_input_summary="tool failed; return safe response")

print(json.dumps(trace, indent=2))
```
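
For the "make traces readable" objective, JSON is accurate but slow to scan. A one-line-per-step view is often easier to read during debugging. The formatter below is a sketch: `format_trace` and its `DECIDE`/`TOOL` labels are hypothetical, and it assumes the step shape used in this lesson (`decision` plus `model_input_summary`, or `tool` plus `output_summary` and `error`).

```python
def format_trace(trace: dict) -> str:
    """Render a trace as one short line per step."""
    lines = [f"request {trace['request_id']}"]
    for step in trace["steps"]:
        index = step["step_index"]
        if "decision" in step:
            reason = step.get("model_input_summary") or "no reason recorded"
            lines.append(f"  {index}. DECIDE {step['decision']} ({reason})")
        else:
            # Show the error when there is one, otherwise the output summary.
            if step.get("error"):
                outcome = f"ERROR {step['error']}"
            else:
                outcome = step.get("output_summary")
            lines.append(f"  {index}. TOOL   {step['tool']} -> {outcome}")
    return "\n".join(lines)


example = {
    "request_id": "req-123",
    "steps": [
        {"step_index": 1, "decision": "call_search", "model_input_summary": "need evidence"},
        {"step_index": 2, "tool": "search", "output_summary": "1 hits", "error": None},
        {"step_index": 3, "tool": "write_answer", "output_summary": None, "error": "timeout"},
        {"step_index": 4, "decision": "fallback_refuse"},
    ],
}

print(format_trace(example))
```

A decision with no recorded reason stands out immediately in this view, which makes gaps in the trace easy to spot.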

## Self-check

- Can you explain each decision using only the trace?
- Is `output_summary` short and clear?
