```{contents}
```
## Run ID 

A **Run ID** in LangGraph is a **globally unique identifier** assigned to each execution of a graph.
It provides **traceability, reproducibility, debugging, observability, and lifecycle management** for every workflow invocation.

---

### **1. Why Run ID Exists**

Every graph execution is a **distributed computation** involving:

* Multiple nodes
* Possibly multiple agents
* External tools
* Human interventions
* Long-running state transitions

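Without a per-execution identifier, it becomes difficult to:

* Correlate logs
* Trace execution steps
* Tell one attempt of a workflow from another
* Audit decisions
* Reproduce outputs

The **Run ID is the primary handle for a single execution** in all of the above. Resuming persisted state additionally depends on the Thread ID, covered in section 5.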
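
As a rough illustration of log correlation, the sketch below stamps every log record of one execution with a run ID using only the standard library. The logger name `demo.run`, the `run_logger` helper, and the log format are assumptions made for this example; they are not part of LangGraph.

```python
import io
import logging
import uuid


def run_logger(run_id: uuid.UUID, stream: io.StringIO) -> logging.LoggerAdapter:
    """Return a logger that stamps every record with the run ID (hypothetical helper)."""
    logger = logging.getLogger(f"demo.run.{run_id}")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the demo output away from the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("run_id=%(run_id)s %(message)s"))
    logger.addHandler(handler)
    # The adapter injects run_id into each record so the formatter can print it.
    return logging.LoggerAdapter(logger, {"run_id": str(run_id)})


buffer = io.StringIO()
run_id = uuid.uuid4()
log = run_logger(run_id, buffer)
log.info("node=increment started")
log.info("node=check finished")

lines = buffer.getvalue().splitlines()
# Filtering on the run ID recovers exactly this execution's records.
matching = [line for line in lines if f"run_id={run_id}" in line]
```

In a real deployment the same idea usually lives in structured logging fields rather than in the message prefix.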

---

### **2. Where Run ID Lives in the System**

```
Client Request
   ↓
LangGraph Runtime
   ↓
[ Run ID ]  →  Execution Engine  →  State Store → Logs → Traces → Metrics
```

Each run produces:

* One Run ID
* Many node executions
* Many state transitions
* Many logs and events

---

### **3. Creation & Structure**

When invoking a graph:

```python
result = graph.invoke(input_data)
```

When no ID is supplied, one is generated automatically as a random UUID:

```text
run_id = UUID4
```

Example:

```
"6bde8b3c-2e3a-4c7a-94f7-0f3e0b71c2ab"
```

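You may also supply one explicitly. The `run_id` config field is typed as a UUID, so pass a `uuid.UUID` and put any human-readable label in `run_name` (or `metadata`):

```python
import uuid

result = graph.invoke(
    input_data,
    config={
        "run_id": uuid.uuid4(),
        "run_name": "customer-42-request-7",
    },
)
```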

---

### **4. What Run ID Controls**

| Component        | How Run ID Is Used                                          |
| ---------------- | ----------------------------------------------------------- |
| State store      | Not the lookup key; persisted state is keyed by Thread ID   |
| Checkpoint store | Resume & replay go through the thread's checkpoints         |
| Logs             | Correlation key                                             |
| Traces           | Identifies the trace for one execution                      |
| Metrics          | Per-run statistics                                          |
| Human review     | Attach comments & approvals to a specific execution         |
| Recovery         | Identifies the failed execution; restart uses the Thread ID |
| Auditing         | Legal & compliance record                                   |

---

### **5. Run ID vs Thread ID**

| Concept   | Purpose                                  |
| --------- | ---------------------------------------- |
| Run ID    | One execution instance                   |
| Thread ID | Long-lived conversation / session        |
| State     | Can span multiple runs under same thread |

```
Thread ID
 ├── Run ID #1
 ├── Run ID #2
 └── Run ID #3
```
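
The hierarchy above can be modeled as a mapping from a thread ID to the run IDs executed under it. This is a toy bookkeeping sketch; `RunRegistry` and its methods are hypothetical names, and LangGraph does not expose such a class.

```python
import uuid
from collections import defaultdict
from typing import List


class RunRegistry:
    """Toy index of which runs belong to which thread (hypothetical)."""

    def __init__(self) -> None:
        self._runs_by_thread = defaultdict(list)

    def start_run(self, thread_id: str) -> uuid.UUID:
        """Record a new run under the thread and return its fresh run ID."""
        run_id = uuid.uuid4()
        self._runs_by_thread[thread_id].append(run_id)
        return run_id

    def runs_for(self, thread_id: str) -> List[uuid.UUID]:
        """Return the thread's run IDs in the order they were started."""
        return list(self._runs_by_thread.get(thread_id, []))


registry = RunRegistry()
first = registry.start_run("customer-88291")
second = registry.start_run("customer-88291")
other = registry.start_run("customer-10001")
```

Each call to `start_run` corresponds to one invocation of the graph, while the thread ID stays fixed for the whole conversation.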

---

### **6. Production Workflow Example**

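```python
import uuid

config = {
    "run_id": uuid.uuid4(),
    "run_name": "order-88291-validation",
    "configurable": {"thread_id": "customer-88291"},
}

result = graph.invoke(data, config=config)
```

The `thread_id` belongs under the `configurable` key, and it only affects persistence when the graph is compiled with a checkpointer. Logs, errors, tool calls, and metrics for this execution can now be correlated through the run ID, while checkpoints are stored under the thread ID:

```
thread_id = customer-88291
run_id    = <generated UUID>
run_name  = order-88291-validation
```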

---

### **7. Recovery & Replay**

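If a crash occurs and the graph was compiled with a checkpointer, invoke it again with the same Thread ID and `None` as the input:

```python
config = {"configurable": {"thread_id": "customer-88291"}}
result = graph.invoke(None, config=config)  # None input: continue from the last checkpoint
```

The engine:

1. Loads the thread's last checkpoint
2. Restores state
3. Continues execution

The resumed execution is a new invocation, so it receives its own Run ID unless you supply one.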
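
The load, restore, continue sequence can be illustrated with a toy checkpoint store. This sketch deliberately avoids LangGraph itself: `CHECKPOINTS`, `run_steps`, and the simulated crash are assumptions made for the example, and the store is keyed by a thread ID because that is how LangGraph checkpointers are addressed.

```python
from typing import Dict, Optional

# Toy in-memory checkpoint store: thread ID -> last saved state (hypothetical).
CHECKPOINTS: Dict[str, dict] = {}


def run_steps(thread_id: str, total_steps: int, crash_at: Optional[int] = None) -> dict:
    """Run a counter workflow, saving a checkpoint after every completed step."""
    # Load the last checkpoint and restore its state, or start fresh.
    state = dict(CHECKPOINTS.get(thread_id, {"step": 0}))
    # Continue execution from wherever the restored state left off.
    while state["step"] < total_steps:
        if crash_at is not None and state["step"] == crash_at:
            raise RuntimeError(f"simulated crash at step {crash_at}")
        state["step"] += 1
        CHECKPOINTS[thread_id] = dict(state)  # persist progress
    return state


try:
    run_steps("customer-88291", total_steps=5, crash_at=3)
except RuntimeError:
    pass  # the first attempt dies after completing three steps

saved = dict(CHECKPOINTS["customer-88291"])
resumed = run_steps("customer-88291", total_steps=5)  # picks up at step 3
```

Only the steps after the last saved checkpoint are repeated, which is the property that makes long workflows recoverable.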

---

### **8. Observability Integration**

With LangSmith or OpenTelemetry:

```
TraceID  ←→  Run ID
SpanID   ←→  Node Execution
```

This lets a tracing backend group every node execution of one run under a single trace.
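
The mapping above can be pictured as a small span tree in which the run ID doubles as the trace ID and each node execution gets its own span ID. The `Span` dataclass and `trace_run` helper below are hypothetical and use only the standard library; real LangSmith or OpenTelemetry SDKs have their own span types and ID formats.

```python
import uuid
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Span:
    """One unit of traced work (hypothetical structure for illustration)."""
    name: str
    trace_id: uuid.UUID
    parent_id: Optional[uuid.UUID] = None
    span_id: uuid.UUID = field(default_factory=uuid.uuid4)


def trace_run(run_id: uuid.UUID, node_names: List[str]) -> List[Span]:
    """Build a root span for the run plus one child span per node execution."""
    root = Span(name="graph", trace_id=run_id, span_id=run_id)
    children = [
        Span(name=name, trace_id=run_id, parent_id=root.span_id)
        for name in node_names
    ]
    return [root, *children]


run_id = uuid.uuid4()
spans = trace_run(run_id, ["increment", "check"])
```

Every span carries the same `trace_id`, so querying a backend by run ID returns the whole execution tree.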

---

### **9. Mental Model**

> **Run ID = Execution Identity**
> **Thread ID = Conversation Identity**
> **State = Execution Memory**

Together they form LangGraph’s execution model.

---

### **10. When to Control Run IDs**

| Scenario          | Why                 |
| ----------------- | ------------------- |
| Financial systems | Auditability        |
| Healthcare        | Compliance          |
| Multi-tenant SaaS | Isolation           |
| Long workflows    | Recovery            |
| Human review      | Traceability        |
| A/B testing       | Experiment tracking |

---

### **Summary**

The **Run ID** is the backbone of LangGraph’s **reliability, observability, and governance**.
Without it, production-grade agent systems would be far harder to debug and operate.

| Concept       | What It Represents                         | Lifetime       | Purpose                      |
| ------------- | ------------------------------------------ | -------------- | ---------------------------- |
| **Run ID**    | A **single execution instance** of a graph | One invocation | Debugging, tracing           |
| **Session**   | A **user interaction context**             | Multiple runs  | UX continuity                |
| **Thread ID** | A **persistent workflow memory**           | Long-running   | State persistence & recovery |

A minimal runnable example:

```python
from typing import TypedDict
from langgraph.graph import StateGraph, END

class State(TypedDict):
    step: int
    done: bool

def increment(state: State):
    print(f"[NODE] increment | step={state['step']}")
    return {"step": state["step"] + 1}

def check(state: State):
    print(f"[NODE] check | step={state['step']}")
    return {"done": state["step"] >= 3}

builder = StateGraph(State)
builder.add_node("increment", increment)
builder.add_node("check", check)

builder.set_entry_point("increment")
builder.add_edge("increment", "check")

# Loop back to "increment" until check() marks the state as done.
builder.add_conditional_edges(
    "check",
    lambda s: END if s["done"] else "increment",
    {"increment": "increment", END: END}
)

graph = builder.compile()
```


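```python
import uuid

config = {
    "run_id": uuid.uuid4(),  # the run_id config field expects a UUID
    "run_name": "demo-run-001",
    "configurable": {"thread_id": "user-session-42"},
}

result = graph.invoke({"step": 0, "done": False}, config=config)
print("\nFinal Result:", result)
```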


```
[NODE] increment | step=0
[NODE] check | step=1
[NODE] increment | step=1
[NODE] check | step=2
[NODE] increment | step=2
[NODE] check | step=3

Final Result: {'step': 3, 'done': True}
```
