```{contents}
```
## Tracing 

**Tracing** in LangGraph is the systematic capture of **execution events, state transitions, decisions, tool calls, and model interactions** during the runtime of a graph.
It enables **observability, debugging, performance analysis, auditing, and governance** of complex LLM workflows.

LangGraph tracing is built on **LangChain’s callback system** and integrates natively with **LangSmith**.

---

### **1. Why Tracing Is Essential**

LLM systems are:

* Non-deterministic
* Multi-step
* Stateful
* Distributed
* Autonomous

Without tracing, failures are **invisible** and optimization is **guesswork**.

| Capability   | Without Tracing     | With Tracing            |
| ------------ | ------------------- | ----------------------- |
| Debugging    | Blind               | Evidence-based          |
| Performance  | Unknown             | Measured                |
| Compliance   | Hard to demonstrate | Auditable               |
| Cost control | Untracked           | Measured per run        |
| Safety       | Unverifiable        | Verifiable after the fact |

---

### **2. What Gets Traced**

LangGraph emits structured events at each execution stage.

| Layer   | Traced Elements                 |
| ------- | ------------------------------- |
| Graph   | Node entry/exit, edge traversal |
| State   | State diffs, versions           |
| LLM     | Prompts, responses, tokens      |
| Tools   | Inputs, outputs, latency        |
| Routing | Branch decisions                |
| Errors  | Exceptions, retries             |
| Timing  | Latency per node                |
| Cost    | Token usage, cost per run       |

---

### **3. Tracing Architecture**

```
LangGraph Runtime
   |
Callback Manager
   |
Tracer (LangSmith / Custom)
   |
Trace Store → UI / Analytics / Alerts
```

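Each run receives a unique **Run ID**. A **Thread ID** is not generated automatically: it is supplied by the caller in the run config (typically alongside a checkpointer) and lets related runs be grouped into one conversation.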
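
One way to make runs easy to find later is to set the correlation identifiers yourself. The sketch below builds a config dict with an explicit `run_id`, a `thread_id` under `configurable`, and free-form `tags` and `metadata`. The helper name `make_run_config`, the tag value, and the `user_id` field are assumptions made for illustration.

```python
import uuid
from typing import Any


def make_run_config(thread_id: str, user_id: str) -> dict[str, Any]:
    """Build a run config that carries correlation identifiers.

    `make_run_config` and the `user_id` field are illustrative names,
    not part of LangGraph.
    """
    return {
        "run_id": uuid.uuid4(),                     # unique per invocation
        "configurable": {"thread_id": thread_id},   # identifies the conversation thread
        "tags": ["tracing-demo"],                   # hypothetical tag for filtering
        "metadata": {"user_id": user_id},           # hypothetical session metadata
    }


config = make_run_config(thread_id="thread-42", user_id="user-7")
# graph.invoke(inputs, config=config)  # `graph` and `inputs` are defined elsewhere
print(config["run_id"], config["configurable"]["thread_id"])
```

Generating the `run_id` on the caller side means application logs can record the same identifier that appears in the trace.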

---

### **4. Enabling Tracing**

### Environment Setup

```bash
export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_API_KEY=your_key
```
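
A missing or empty key only surfaces later as an authentication error while traces are uploaded, so it can help to check the environment before the first run. The following is a minimal sketch that checks the two variables set above; the function name `tracing_env_problems` is hypothetical, and the check only confirms that the values are present, not that the key is valid.

```python
import os
from collections.abc import Mapping


def tracing_env_problems(env: Mapping[str, str]) -> list[str]:
    """Return human-readable problems with the tracing environment."""
    problems = []
    if env.get("LANGCHAIN_TRACING_V2", "").lower() != "true":
        problems.append("LANGCHAIN_TRACING_V2 is not set to 'true'")
    if not env.get("LANGCHAIN_API_KEY"):
        problems.append("LANGCHAIN_API_KEY is missing or empty")
    return problems


# Report every problem found in the current process environment.
for problem in tracing_env_problems(os.environ):
    print("tracing setup:", problem)
```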

### Basic Tracing Example

```python
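from langchain_core.tracers import LangChainTracer

# Sends run data to LangSmith; reads the API key and project from the environment.
tracer = LangChainTracer()

# `graph` is a compiled LangGraph graph and `inputs` its initial state (defined elsewhere).
# Handlers are passed as a plain list; no CallbackManager wrapper is needed.
graph.invoke(inputs, config={"callbacks": [tracer]})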
```

---

### **5. Trace Data Model**

Each trace forms a **hierarchical execution tree**:

```
Graph Run
 ├─ Node: reason
 │   ├─ LLM Call
 │   └─ Tool Call
 ├─ Node: act
 └─ Node: observe
```

Each node contains:

* Inputs
* Outputs
* State before/after
* Timing
* Cost
* Metadata
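
The tree above can be modeled with a small recursive structure in which each parent aggregates the timing and token usage of its children. This is an illustrative sketch only: the `Span` class and its fields are assumptions, not LangSmith's actual run schema, and the numbers are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    """One entry in the execution tree (illustrative, not LangSmith's schema)."""
    name: str
    latency_ms: float = 0.0   # time spent in this span itself
    tokens: int = 0           # tokens consumed by this span itself
    children: list["Span"] = field(default_factory=list)

    def total_latency_ms(self) -> float:
        return self.latency_ms + sum(c.total_latency_ms() for c in self.children)

    def total_tokens(self) -> int:
        return self.tokens + sum(c.total_tokens() for c in self.children)

    def render(self, depth: int = 0) -> list[str]:
        """Return one indented line per span, with aggregated totals."""
        line = f"{'  ' * depth}{self.name}: {self.total_latency_ms():.0f} ms, {self.total_tokens()} tokens"
        lines = [line]
        for child in self.children:
            lines.extend(child.render(depth + 1))
        return lines


# Placeholder values mirroring the tree shown above.
run = Span("Graph Run", children=[
    Span("Node: reason", children=[
        Span("LLM Call", latency_ms=820.0, tokens=150),
        Span("Tool Call", latency_ms=95.0),
    ]),
    Span("Node: act", latency_ms=12.0),
    Span("Node: observe", latency_ms=8.0),
])
print("\n".join(run.render()))
```

Aggregating bottom-up is what lets a trace viewer show cost and latency at any level of the tree, from a single LLM call to the whole run.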

---

### **6. Visualizing Traces (LangSmith)**

LangSmith UI provides:

* Timeline view
* Node execution tree
* Token & cost dashboard
* Failure inspection
* State diff viewer
* Reproducible runs

---

### **7. Custom Tracing**

```python
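from langchain_core.callbacks import BaseCallbackHandler


class MyTracer(BaseCallbackHandler):
    def on_chain_start(self, serialized, inputs, **kwargs):
        # `serialized` can be None for graph nodes, so prefer the `name` keyword.
        name = kwargs.get("name") or (serialized or {}).get("name", "unknown")
        print("START:", name)

    def on_chain_end(self, outputs, **kwargs):
        print("END:", outputs)
```

```python
# `graph` and `inputs` are defined elsewhere.
graph.invoke(inputs, config={"callbacks": [MyTracer()]})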
```
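
A custom handler can also collect measurements rather than print. The sketch below pairs start and end events by `run_id` to record per-node latency. The class name `LatencyTracer` is hypothetical, and the `ImportError` fallback exists only so the sketch can run without LangChain installed; with LangChain available it subclasses the real `BaseCallbackHandler`.

```python
import time
from uuid import uuid4

try:
    from langchain_core.callbacks import BaseCallbackHandler
except ImportError:  # fallback so the sketch runs without LangChain installed
    BaseCallbackHandler = object


class LatencyTracer(BaseCallbackHandler):
    """Collects (name, seconds) pairs for every completed chain or node run."""

    def __init__(self):
        super().__init__()
        self._started = {}   # run_id -> (name, start time)
        self.timings = []    # completed runs as (name, elapsed seconds)

    def on_chain_start(self, serialized, inputs, *, run_id, **kwargs):
        name = kwargs.get("name") or (serialized or {}).get("name", "unknown")
        self._started[run_id] = (name, time.perf_counter())

    def on_chain_end(self, outputs, *, run_id, **kwargs):
        name, start = self._started.pop(run_id, ("unknown", time.perf_counter()))
        self.timings.append((name, time.perf_counter() - start))

    def on_chain_error(self, error, *, run_id, **kwargs):
        # Drop the entry so failed runs do not accumulate in memory.
        self._started.pop(run_id, None)


# Simulate one node run by calling the hooks directly.
latency_tracer = LatencyTracer()
rid = uuid4()
latency_tracer.on_chain_start(None, {}, run_id=rid, name="reason")
latency_tracer.on_chain_end({}, run_id=rid)
print(latency_tracer.timings)
```

In a real run the handler would be passed the same way as `MyTracer`, and `timings` would then contain one entry for the graph and one per node.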

---

### **8. Production Use Cases**

| Use Case           | How Tracing Helps        |
| ------------------ | ------------------------ |
| Debugging          | Identify faulty node     |
| Cost optimization  | Find expensive steps     |
| Safety audits      | Verify tool usage        |
| Compliance         | Immutable execution logs |
| Performance tuning | Detect slow components   |
| Incident response  | Replay failures          |

---

### **9. Advanced Features**

| Feature          | Purpose                |
| ---------------- | ---------------------- |
| Trace replay     | Reproduce failures     |
| State diffing    | Inspect data evolution |
| Span correlation | Distributed tracing    |
| Sampling         | Reduce trace volume    |
| Alert hooks      | Trigger on anomalies   |
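
Sampling can be sketched as a deterministic decision per run key, so that all runs of one thread are either traced or skipped together. This is a minimal illustration, not a built-in LangGraph feature: `should_trace` is a hypothetical helper, and the approach assumes the tracer is attached per call rather than enabled globally through the environment.

```python
import hashlib


def should_trace(run_key: str, sample_rate: float) -> bool:
    """Deterministically select roughly `sample_rate` of all run keys."""
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0 and 1")
    digest = hashlib.sha256(run_key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") / 2**32   # uniform in [0, 1)
    return bucket < sample_rate


# Attach callbacks only for sampled runs; `tracer`, `graph`, `inputs`, and
# `thread_id` are defined elsewhere.
# callbacks = [tracer] if should_trace(thread_id, 0.1) else []
# graph.invoke(inputs, config={"callbacks": callbacks})
sampled = sum(should_trace(f"thread-{i}", 0.1) for i in range(10_000))
print(f"{sampled} of 10000 runs sampled")
```

Hashing the key instead of drawing a random number keeps the decision stable across processes and retries.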

---

### **10. Best Practices**

* Trace **every production run**, or a defined sample when trace volume is high
* Store traces immutably
* Alert on error patterns
* Track cost per node
* Use trace replay for debugging
* Correlate traces with user sessions
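
Alerting on error patterns can start as simply as watching the failure share over a sliding window of recent runs. The sketch below is an assumption about how such a hook might look; `ErrorRateAlert`, its window size, and its threshold are hypothetical and would be fed from whatever outcome data the traces provide.

```python
from collections import deque


class ErrorRateAlert:
    """Fires when the failure share of the last `window` runs reaches `threshold`."""

    def __init__(self, window: int = 20, threshold: float = 0.25):
        self.outcomes = deque(maxlen=window)   # True means the run failed
        self.threshold = threshold

    def record(self, failed: bool) -> bool:
        """Record one run outcome; return True if an alert should fire."""
        self.outcomes.append(failed)
        window_full = len(self.outcomes) == self.outcomes.maxlen
        return window_full and sum(self.outcomes) / len(self.outcomes) >= self.threshold


alert = ErrorRateAlert(window=4, threshold=0.5)
for failed in [False, True, False, True]:
    fired = alert.record(failed)
print("alert fired:", fired)
```

Waiting until the window is full avoids alerting on a single early failure.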

---

### **Mental Model**

Tracing turns LangGraph from a black box into a **fully observable distributed system**, where every decision, cost, and failure is measurable and controllable.


### Demonstration

```python
# ===== 1. Enable Tracing =====
import os
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_PROJECT"] = "langgraph-tracing-demo"
# os.environ["LANGCHAIN_API_KEY"] = "YOUR_API_KEY"   # set once in your system

# ===== 2. Build a Simple Cyclic LangGraph =====
from typing import TypedDict
from langgraph.graph import StateGraph, END
from langchain_openai import ChatOpenAI
from langchain_core.tracers import LangChainTracer

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

class State(TypedDict):
    question: str
    answer: str
    step: int

def reason(state: State) -> dict:
    # Node: returns a partial state update as a dict.
    response = llm.invoke(f"Answer briefly: {state['question']}")
    return {"answer": response.content, "step": state["step"] + 1}

def check(state: State) -> str:
    # Router: returns the name of the next node (or END), not a state update,
    # so it is used only as a conditional edge and never registered as a node.
    return END if state["step"] >= 1 else "reason"

builder = StateGraph(State)
builder.add_node("reason", reason)

builder.set_entry_point("reason")
builder.add_conditional_edges("reason", check, {"reason": "reason", END: END})

graph = builder.compile()

# ===== 3. Attach Tracer =====
tracer = LangChainTracer()

# ===== 4. Invoke Graph with Tracing =====
result = graph.invoke(
    {"question": "What is LangGraph?", "step": 0},
    config={"callbacks": [tracer]},
)

print("Final Output:", result)
```

Two errors are common when running this cell:

* `InvalidUpdateError: Expected dict, got __end__` is raised when a routing function such as `check` is registered as a node with `add_node`, because a node must return a dict of state updates. See https://docs.langchain.com/oss/python/langgraph/errors/INVALID_GRAPH_NODE_RETURN_VALUE for troubleshooting.
* `LangSmithAuthError` with `401 Client Error: Unauthorized` for `https://api.smith.langchain.com/runs/multipart` means `LANGCHAIN_API_KEY` is missing or invalid, so the traces are not uploaded.
