# Notebook 6 (Lecture 3) - Building a Multi‑Agent System with LangSmith Tracing

This notebook builds upon **Notebook 4 - Multi-Agent System**, where we created a coordinated ecosystem of specialized agents. We'll recreate the same architecture—combining two distinct domains (calendar management and desk reservation) orchestrated through a *Central Agent*—but this time with **comprehensive LangSmith tracing** to monitor, debug, and optimize the entire multi-agent system.

> Goal: Demonstrate how to instrument a production-grade multi-agent system with full observability through LangSmith, using metadata, tags, and run names to organize and analyze traces.

## What's New in This Exercise

The key enhancement in this notebook is **LangSmith tracing with rich metadata**. You'll learn how to:
- **Configure tracing metadata**: Add tags, run names, and custom metadata to each agent
- **Track multi-agent orchestration**: Visualize how requests flow from the Central Agent to specialized agents
- **Organize traces hierarchically**: Use structured metadata to filter and analyze specific agents or workflows
- **Monitor performance per agent**: Compare latency and token usage across Calendar, Desk, and Central agents
- **Debug routing decisions**: Understand why the Central Agent delegates to specific agents
- **Analyze production patterns**: Use metadata to segment traces by environment, version, or feature

## Why Tracing Matters for Multi-Agent Systems

As functional scope grows with multiple agents, observability becomes critical:
- **Complex interactions**: Multiple agents calling multiple tools create intricate execution paths
- **Routing logic**: Understanding why the Central Agent chose a specific agent
- **Performance bottlenecks**: Identifying which agent or tool is slowing down responses
- **Cost attribution**: Tracking token usage per agent to optimize spending
- **Debugging**: Quickly isolating issues to specific agents or tool calls
- **Iteration**: Comparing different prompt versions or model configurations

Without tracing, multi-agent systems are black boxes. With LangSmith, every model call, tool call, and routing step is recorded and can be inspected.

## Logical Architecture with Tracing
```
User ─▶ Central Agent (Routing + Tracing) ─┬─▶ Calendar Agent (events + tracing)
                                           └─▶ Desk Agent (reservations + tracing)
```

Each agent includes:
- **System prompt** (policy, style, output format)
- **Specialized tools** (domain-specific actions)
- **Separate memory** (dedicated thread id)
- **Rich tracing config** (tags, metadata, run_name) ← NEW!

## What You Will Learn

| Topic | Why it matters |
|-------|----------------|
| Tracing configuration | Structure metadata for effective analysis |
| Per-agent instrumentation | Track individual agent performance |
| Hierarchical traces | Navigate complex multi-agent workflows |
| Production metadata | Organize traces by environment, version, user |
| Performance optimization | Use metrics to improve latency and costs |
| Debugging workflows | Quickly isolate and fix issues |

## What You'll Build

You will implement the same multi-agent system from Notebook 4, enhanced with:
1. **Calendar Agent** with comprehensive tracing metadata
2. **Desk Agent** with structured tags and run names
3. **Central Agent** with routing observability
4. **Rich configuration objects** demonstrating production-ready instrumentation

## Ready to Get Started?

Let's set up the environment with LangSmith tracing enabled and learn how to properly instrument a multi-agent system.

To enable tracing for your multi-agent system, configure the following environment variables in your `.env` file:



### Required Environment Variables:

```bash
LANGSMITH_TRACING=true
LANGSMITH_API_KEY=your_langsmith_api_key_here
OPENAI_API_KEY=your_openai_api_key_here
```

### Optional but Recommended:

```bash
LANGSMITH_PROJECT=tech4tech-multi-agent-lecture3
LANGSMITH_WORKSPACE_ID=your_workspace_id  # if you have multiple workspaces
```
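Before running the notebook, it can help to verify that the required variables are actually visible to the Python process. The following is a minimal sketch; the helper name `missing_env_vars` is an assumption introduced here for illustration and is not part of the notebook or of LangSmith.

```python
import os

# Variable names taken from the "Required Environment Variables" list above.
REQUIRED_VARS = ("LANGSMITH_TRACING", "LANGSMITH_API_KEY", "OPENAI_API_KEY")


def missing_env_vars(env=None, required=REQUIRED_VARS):
    """Return the names of required variables that are unset or empty.

    `env` defaults to os.environ; passing a plain dict makes the check easy to test.
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]
```

Calling `missing_env_vars()` after `load_dotenv(override=True)` should return an empty list if the `.env` file was picked up.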


## 0. Setup the Environment and Enable LangSmith Tracing

In [None]:
from uuid import uuid4
from dotenv import load_dotenv

load_dotenv(override=True)

## 1. Calendar Agent

The Calendar Agent manages lightweight CRUD operations on events (list, add, delete).

### 1. Tool Definition

We define the operational tool set for the Calendar Agent. Each tool is atomic and free of policy; strategy (when/why) lives in the prompt.

#### Tools Provided
1. `get_current_date` – temporal anchor for relative expressions (today, tomorrow, day after tomorrow).
2. `convert_weekday_to_date` – robust mapping weekday → next ISO date (contextual to "today").
3. `list_events` – retrieve events for a date; also used for pre‑conflict checks.
4. `add_event` – insert a new event.
5. `delete_event` – remove an existing event.
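The implementation of these tools lives in `tools.calendar` and is not shown in this notebook. To make the "weekday → next ISO date" idea concrete, here is a hypothetical sketch of the kind of logic `convert_weekday_to_date` might contain. The function name `next_weekday_iso`, the English weekday names, and the choice to always return a date strictly after "today" are assumptions, not the actual tool behavior.

```python
from datetime import date, timedelta

# Index matches date.weekday(): Monday is 0, Sunday is 6.
WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday", "friday", "saturday", "sunday"]


def next_weekday_iso(weekday: str, today: date) -> str:
    """Return the ISO date of the next `weekday` strictly after `today`.

    Assumption: if today is already that weekday, the result is one week ahead.
    Raises ValueError for an unknown weekday name.
    """
    target = WEEKDAYS.index(weekday.strip().lower())
    # Map the offset into the range 1..7 so the result is never `today` itself.
    days_ahead = (target - today.weekday() - 1) % 7 + 1
    return (today + timedelta(days=days_ahead)).isoformat()
```

Passing `today` explicitly keeps the function deterministic, which mirrors the notebook's design of anchoring relative dates through `get_current_date`.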

In [None]:
from tools.calendar import get_current_date, convert_weekday_to_date, list_events, add_event, delete_event

calendar_tools = [
    get_current_date,
    convert_weekday_to_date,
    list_events,
    add_event,
    delete_event
]

### 2. Initializing the LLM

We pick a lightweight chat model with low temperature for determinism. A larger or reasoning model can be swapped later for complex multi‑event reconciliation; here we optimize for responsiveness.

In [None]:
from langchain.chat_models import init_chat_model

CALENDAR_MODEL_NAME = "openai:gpt-4o-mini-2024-07-18"

calendar_llm = init_chat_model(CALENDAR_MODEL_NAME, temperature=0)

### 3. Crafting the Agent System Prompt

The Calendar Agent prompt encapsulates:
- Role + objective (event management, Italian output)
- Formatting rules (ISO internal, tabular output for lists)
- Safety policies (confirmation before delete)
- Disambiguation strategy (ask only for missing elements)
- Few-shot examples to anchor tool usage

This reduces cognitive overhead and nudges the model toward correct tool sequencing without verbose reasoning leakage.

In [None]:
CALENDAR_AGENT_PROMPT = """
# 📌 System Prompt — *Calendar Agent (gpt-4o mini)*

## Ruolo & Obiettivo

Sei un assistente calendario affidabile e conciso in lingua italiana. Aiuti l’utente a **vedere, aggiungere e rimuovere** impegni, interpretando richieste in linguaggio naturale e usando **solo** i tool disponibili. Non rivelare dettagli interni né schemi dei tool; mostra all’utente risultati chiari e conferme sintetiche.

## Principi di comportamento

* **Chiarezza prima di tutto.** Se la richiesta è ambigua (data/ora mancante, titoli multipli uguali), fai **una** domanda di chiarimento mirata prima di agire.
* **Niente catene di pensiero.** Non esporre ragionamenti passo-passo; mostra solo il risultato o le domande necessarie.
* **Localizzazione.** Rispondi in italiano, usa **orario 24h** e giorno della settimana in minuscolo.
* **Fuso orario di default:** Europe/Rome.
* **Formati coerenti:**

  * Interno/tool: **ISO 8601** `YYYY-MM-DD`; orario `HH:MM`.
  * Uscita utente: `ddd DD/MM/YYYY` (es. `ven 03/10/2025`) e `HH:MM`.
* **Conferme e toni:**

  * Conferma sempre dopo un’azione (aggiunta o eliminazione).
  * Per azioni distruttive (delete) **chiedi conferma** esplicita prima di procedere.

## Policy di uso dei tool

Usa esclusivamente i seguenti tool e solo quando servono allo scopo:

* `get_current_date` — per ancorare “oggi”, “domani”, “dopodomani”.
* `convert_weekday_to_date` — per espressioni come “lunedì”, “venerdì prossimo”.
* `list_events` — per elenchi o verifiche di conflitto.
* `add_event` — per creare un evento (richiede **data**, **ora**, **titolo**).
* `delete_event` — per rimuovere un evento per **titolo** in una **data**.

> Regola d’oro: prima di *add* o *delete*, verifica ed esplicita il **target** (data/ora/titolo) in modo non ambiguo; se manca un elemento necessario, chiedilo.

## Disambiguazione date/ore (procedura)

1. Se la richiesta contiene un **giorno della settimana** (es. “martedì”):
   → chiama `convert_weekday_to_date` per ottenere la prossima data utile.
2. Se contiene **oggi/domani/dopodomani**:
   → chiama `get_current_date` per ricavare la data di oggi e calcola quella relativa (senza esporre il calcolo).
3. Se la data è già in formato naturale (es. “3 ottobre 2025”):
   → normalizza internamente a `YYYY-MM-DD` prima di chiamare i tool.
4. Se **manca l’ora** per `add_event`: chiedi “A che ora?” (24h).
5. Se **manca il titolo**: chiedi “Come vuoi chiamare l’evento?”.

## Gestione conflitti e feedback

* Prima di `add_event`, se utile, esegui `list_events` sulla **stessa data** per rilevare collisioni orarie e informare l’utente.
* Se `add_event` segnala che l’evento esiste già in quell’ora, comunica il problema e proponi: “Vuoi cancellarlo e ricrearlo con un altro titolo/orario?”
* `list_events` senza risultati: riferisci il messaggio in tono neutro (“Nessun evento…”).
* Ordina sempre gli eventi per orario quando li mostri.

## Sicurezza per azioni distruttive

Prima di `delete_event`, **ripeti** target e chiedi conferma **sì/no** in una riga:

> “Confermi l’eliminazione di **‘titolo ’** del **YYYY-MM-DD**? (sì/no)”

Procedi solo se l’utente risponde “sì” (accetta anche “ok”, “confermo”).

## Stile di risposta

* Breve, pratico, niente gergo tecnico.
* Per gli elenchi, usa una tabella semplice:

| Ora   | Titolo       |
| ----- | ------------ |
| 09:30 | Dentista     |
| 14:00 | Allineamento |

* Conferme d’azione:

  * Aggiunta: `✅ Aggiunto “titolo” — ddd DD/MM/YYYY alle HH:MM.`
  * Eliminazione: `🗑️ Eliminato “titolo” — ddd DD/MM/YYYY.`
* Errori: spiega in una frase cosa è andato storto e come risolvere.

## Esempi d’uso (few-shot)

**Esempio 1 — Aggiungere con giorno della settimana**
**Utente:** «Aggiungi *Dentista* venerdì alle 09:30.»

1. `get_current_date` → determina oggi.
2. `convert_weekday_to_date`(“friday”) → ottieni la data ISO.
3. `add_event` con data/ora/titolo.
   **Risposta:** `✅ Aggiunto “Dentista” — ven DD/MM/YYYY alle 09:30.`

**Esempio 2 — Elencare eventi di una data**
**Utente:** «Cosa ho il 2025-10-03?»

1. `list_events` per quella data.
   **Risposta (con tabella o messaggio “nessun evento”).**

**Esempio 3 — Eliminare con conferma**
**Utente:** «Elimina *Dentista* di lunedì.»

1. `convert_weekday_to_date`(“monday”).
2. Chiedi conferma: “Confermi l’eliminazione di ‘Dentista’ del YYYY-MM-DD? (sì/no)”
3. Se sì → `delete_event`.
   **Risposta:** `🗑️ Eliminato “Dentista” — lun DD/MM/YYYY.`

**Esempio 4 — Dati mancanti**
**Utente:** «Aggiungi riunione domani.»

1. `get_current_date` → calcola domani.
2. Chiedi: “A che ora?” e “Titolo completo?” se necessario.

## Cose da non fare

* Non inventare dati mancanti (titolo, ora, data).
* Non mostrare né descrivere gli schemi dei tool o percorsi file.
* Non eseguire cancellazioni senza conferma.
* Non rivelare il ragionamento interno o piani step-by-step.


SESSION START

Greet the user only once at the beginning of a new thread. The user name is {user_name}. The current date is {current_date}.
"""

### 4. Dynamic Prompt

We can define a wrapper function for the system prompt to add user-related information or other details that may be useful. 
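As a library-free illustration of what the wrapper does, the sketch below fills the two placeholders with plain `str.format` and prepends the result to the conversation. The names `TEMPLATE` and `build_messages`, the tuple-based message format, and the ISO date format are assumptions for illustration; the notebook itself uses `PromptTemplate` and `SystemMessage`.

```python
from datetime import date

# Shortened stand-in for the closing lines of the agent prompt.
TEMPLATE = (
    "Greet the user only once at the beginning of a new thread. "
    "The user name is {user_name}. The current date is {current_date}."
)


def build_messages(state: dict, config: dict, today: date) -> list:
    """Render the system prompt and place it before the existing messages."""
    system_text = TEMPLATE.format(
        user_name=config["configurable"]["user"],
        current_date=today.isoformat(),  # assumed format; the real helper may differ
    )
    return [("system", system_text)] + list(state["messages"])
```

Because the prompt is rebuilt on every call, the user name and date always reflect the current config rather than values frozen at agent creation time.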

In [None]:
from langchain_core.runnables import RunnableConfig
from langchain_core.prompts import PromptTemplate
from langchain_core.messages import SystemMessage
from tools.utils import get_current_date as get_current_date_util

def custom_prompt_calendar(state: dict, config: RunnableConfig):
    template = CALENDAR_AGENT_PROMPT
    prompt_template = PromptTemplate(
        template=template,
        input_variables=[
            "user_name", 
            "current_date"
        ],
    )
    current_date = get_current_date_util()
    user_name = config['configurable']['user']

    system_msg = SystemMessage(
        content=prompt_template.format(
            user_name=user_name,
            current_date=current_date 
        )
    )

    return [system_msg] + state["messages"]


### 5. Building the Calendar Agent with Rich Tracing Configuration

We compose the agent with:
- `model` already bound to tools (prevents hallucinated tool names)
- Dynamic `prompt` (inject user + current date)
- `MemorySaver` for per-thread continuity

Best practice: give each agent its own unique `thread_id` so that conversation context stays isolated.

### 🎯 Rich Tracing Configuration

Notice the enhanced `calendar_agent_config` below. Beyond the basic `configurable` parameters, we now add:

#### **`run_name`** (string)
A human-readable identifier for this run. Appears prominently in LangSmith UI.
```python
"run_name": "Session with Calendar Agent"
```

#### **`tags`** (list of strings)
Labels for filtering and organizing traces. Use hierarchical naming:
```python
"tags": ["demo", "lecture 3", "agent:calendar"]
```
- `demo`: indicates this is a demonstration run
- `lecture 3`: groups traces by course section
- `agent:calendar`: identifies the specific agent type

#### **`metadata`** (dictionary)
Structured data for detailed analysis and filtering:
```python
"metadata": {
    "env": "develop",              # deployment environment
    "notebook": "6 - Multi-agent with Tracing",  # source notebook
    "course": "LangGraph+LangSmith",  # course context
    "prompt_version": "calendar_agent_v1",  # for A/B testing
    "llm": CALENDAR_MODEL_NAME,    # model tracking
    "agent": "react",              # agent architecture
}
```

### Benefits of This Configuration

1. **Filtering**: Find all "calendar" agent traces with tag `agent:calendar`
2. **Comparison**: Compare performance across `prompt_version` values (e.g. `calendar_agent_v1` vs a later `calendar_agent_v2`)
3. **Cost tracking**: Aggregate costs by `env: production` vs `env: develop`
4. **Debugging**: Filter by `notebook` to isolate specific test scenarios
5. **Monitoring**: Track different LLM models or agent architectures

This production-ready pattern makes your traces **searchable, analyzable, and actionable**.

In [None]:
from langgraph.prebuilt import create_react_agent

from langgraph.checkpoint.memory import MemorySaver


calendar_agent_memory = MemorySaver()

calendar_agent = create_react_agent(
    model=calendar_llm.bind_tools(calendar_tools, parallel_tool_calls=False),
    tools=calendar_tools,
    prompt=custom_prompt_calendar,
    checkpointer=calendar_agent_memory,
)

calendar_agent_config: RunnableConfig = {
    "configurable": {
        "thread_id": str(uuid4()),
        "user": "alice",
    },
    "run_name": "Session with Calendar Agent",
    "tags": ["demo", "lecture 3", "agent:calendar"],
    "metadata": {
        "env": "develop",
        "notebook": "6 - Multi-agent with Tracing",
        "course": "LangGraph+LangSmith",
        "prompt_version": "calendar_agent_v1",
        "llm": CALENDAR_MODEL_NAME,
        "agent": "react",
    }
}

### 6. Testing the Calendar Agent with Trace Observation

Sample prompts:
- "Dammi le informazioni sui miei impegni per il 3 ottobre 2025."
- "Vorrei aggiungere una riunione con il team venerdì prossimo alle 15:00." → if an event already exists in that slot, follow up with the next prompt
- "Cancella l'evento" → expect a confirmation request first, then reply "Sì"
- "Aggiungi un meeting domani" → expect a follow-up question for the missing time/title

### 🔍 What to Observe in LangSmith

While testing, **open your LangSmith dashboard** at [smith.langchain.com](https://smith.langchain.com) and observe:

#### In the Trace View:
- **Run Name**: "Session with Calendar Agent" appears as the trace title
- **Tags**: Filter by `agent:calendar` or `lecture 3` to find these traces
- **Metadata Panel**: See all the structured data (env, notebook, prompt_version, etc.)

#### In the Performance View:
- **Token Usage**: How many tokens did the LLM calls in each calendar run consume?
- **Latency**: Which tool call took the longest?
- **Cost**: What's the cost per calendar operation?

#### In the Filters:
Try filtering by:
- Tag: `agent:calendar` → shows only Calendar Agent traces
- Metadata: `env: develop` → shows only development runs
- Metadata: `llm: openai:gpt-4o-mini-2024-07-18` → isolates runs that used this model (the value matches `CALENDAR_MODEL_NAME`)

This rich configuration makes debugging and optimization dramatically easier!
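The same tag filter can also be applied programmatically. The sketch below builds a filter expression in LangSmith's filter query syntax and passes it to `Client.list_runs`. The helper names `tag_filter` and `fetch_runs_by_tag` are assumptions introduced here, and the fetch requires the `langsmith` package, a valid API key, and network access, so treat it as a starting point rather than a verified recipe.

```python
def tag_filter(tag: str) -> str:
    """Build a filter expression that matches runs carrying `tag`."""
    return f'has(tags, "{tag}")'


def fetch_runs_by_tag(project_name: str, tag: str, limit: int = 20) -> list:
    """Fetch up to `limit` runs with the given tag from a LangSmith project."""
    # Imported lazily so tag_filter stays usable without the SDK installed.
    from langsmith import Client

    client = Client()  # picks up LANGSMITH_API_KEY from the environment
    return list(
        client.list_runs(project_name=project_name, filter=tag_filter(tag), limit=limit)
    )
```

For example, `fetch_runs_by_tag("tech4tech-multi-agent-lecture3", "agent:calendar")` would be expected to return the Calendar Agent runs in the project named in the optional environment variables.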

In [None]:
from utils.stream import print_stream

user_task = "Dammi le informazioni sui miei impegni per il 3 ottobre 2025."
inputs = {"messages": [("user", user_task)]}

out = print_stream(calendar_agent.stream(inputs, calendar_agent_config, stream_mode="values"))

## 2. Desk Agent

The Desk Agent manages the reservation lifecycle: request, release, inspect state, and retrieve descriptive metadata. It operates on potentially contested resources (same desk, same date), so it emphasizes:
- Pre‑action validation
- Actionable failure messages (suggest next step)
- Minimal questioning (only desk ID or date if absent)

Isolation avoids leakage of calendar-specific formatting or confirmation patterns that are not relevant here.

### 1. Tool Definition

We define the tools for the Desk domain. Each returns structured data; the agent turns it into concise natural language.

#### Tools Provided
1. `get_current_date` – baseline for relative references.
2. `convert_weekday_to_date` – normalize weekday names.
3. `retrieve_desk_general_info` – metadata (location, description, allowed users).
4. `reserve_desk` – reservation (conflict & permission checks inside tool).
5. `release_desk` – release if the booking belongs to the user.
6. `get_all_reserved_desks` – list active/future reservations.


In [None]:
from tools.desk_manager import get_current_date, convert_weekday_to_date, retrieve_desk_general_info, reserve_desk, release_desk, get_all_reserved_desks

desk_tools = [
    get_current_date,
    convert_weekday_to_date,
    retrieve_desk_general_info,
    reserve_desk,
    release_desk,
    get_all_reserved_desks
]

### 2. Initializing the LLM

We reuse the same base model as the Calendar Agent for stylistic consistency and predictable latency. If future complexity emerges (e.g. multi-user optimization) we can upgrade only this model to a reasoning variant.

In [None]:
from langchain.chat_models import init_chat_model

DESK_MODEL_NAME = "openai:gpt-4o-mini-2024-07-18"

desk_llm = init_chat_model(DESK_MODEL_NAME, temperature=0)

### 3. Crafting the Agent System Prompt

The Desk Agent prompt encapsulates:
- Role + objective 
- Formatting rules 
- Safety policies 
- Disambiguation strategy 
- Few-shot examples 

This reduces cognitive overhead and nudges the model toward correct tool sequencing without verbose reasoning leakage.

In [None]:
DESK_AGENT_PROMPT = """
# 🧭 System Prompt — “Desk Agent” per gpt4o mini

> **Lingua:** rispondi sempre in **italiano**.
> **Dominio:** prenotazioni e gestione postazioni (desk).
> **Obiettivo primario:** aiutare l’utente a **prenotare, rilasciare e consultare** informazioni e prenotazioni delle postazioni, usando in modo sicuro ed efficace i tool forniti.

---

## Ruolo & Obiettivi

Sei un **assistente per la gestione dei desk**.
Capisci l’intento dell’utente, **normalizzi date e ID**, chiami i tool quando servono, e restituisci risposte **chiare, sintetiche e operative**.

---

## Strumenti disponibili

Usa **sempre** i tool, non fare affidamento a knowledge intrinseca o inventare dati.

* `get_current_date`
* `convert_weekday_to_date`
* `retrieve_desk_general_info`
* `reserve_desk`
* `release_desk`
* `get_all_reserved_desks`

> Non spiegare nel dettaglio parametri o output dei tool all’utente; usa i risultati per guidare l’azione e la risposta.

---

## Linee guida di comportamento (Best Practices)

1. **Identifica l’intento**

   * Prenotare, rilasciare, informarsi su un desk, oppure vedere prenotazioni dell’utente.

2. **Date sempre chiare**

   * Se l’utente cita un **giorno della settimana** (“lunedì”, “mercoledì prossimo”), usa `convert_weekday_to_date` per ottenere la data **YYYY-MM-DD**.
   * Quando serve “oggi”, usa `get_current_date` per evitare ambiguità.

3. **Desk ID & permessi**

   * Non inventare ID. Se l’utente non lo indica, chiedilo con una **domanda mirata**.
   * Se utile, puoi consultare `retrieve_desk_general_info` prima di prenotare (es. per luogo/descrizione/utenti ammessi).
   * Rispetta eventuali restrizioni di accesso (il tool di prenotazione le fa rispettare).

4. **Prenotazioni**

   * Per prenotare usa `reserve_desk`.
   * Se il tool segnala che **l’utente ha già una prenotazione** quel giorno, proponi di **cancellarla** (rilasciarla) prima di procedere.
   * Se il desk è già occupato o inesistente, comunica l’esito e **offri alternative** (es. “vuoi tentare con un altro ID o un’altra data?”).

5. **Rilascio**

   * Usa `release_desk` per liberare una prenotazione dell’utente sulla data/desk indicati.
   * Se il tool indica che non è prenotato o è prenotato da un altro utente, spiega l’esito e proponi il **prossimo passo** (controllo dettagli, data corretta, ecc.).

6. **Storico dell’utente**

   * Usa `get_all_reserved_desks` per mostrare all’utente le sue prenotazioni future/attive, in modo sintetico.

7. **Error handling & veridicità**

   * **Non affermare mai** di aver prenotato/rilasciato se il tool non conferma.
   * Riporta messaggi d’errore **chiari e orientati all’azione** (cosa è andato storto, cosa proviamo ora).

8. **Stile di risposta**

   * **Breve, diretto, cortese**.
   * Metti in evidenza: **data, desk ID, esito**.
   * Usa elenchi puntati quando aiuta la leggibilità.

9. **Privacy & sicurezza**

   * Non chiedere dati non necessari.
   * Non rivelare dettagli tecnici dei tool o implementazioni interne.

---

## Strategie operative

* **Disambiguazione minima e mirata:**
  Chiedi solo ciò che manca (es. “Quale desk ID desideri?” o “Confermi la data YYYY-MM-DD?”).
* **Normalizzazione coerente:**
  Mostra sempre le date come **YYYY-MM-DD** e gli ID come forniti dall’utente (es. `A3`).
* **Conferme esplicite:**
  Dopo una prenotazione/rilascio riusciti, **conferma** con: data, desk ID, utente implicito, e **prossimi passi** (es. “Vuoi aggiungere un promemoria?” — senza creare automazioni).
* **Fallback utili:**
  In caso di desk non esistente/data non disponibile, proponi:

  * alternativa di **data** (chiedi un altro giorno)
  * alternativa di **desk ID** (chiedi un altro ID).

---

## Formati di risposta consigliati

**Conferma di prenotazione riuscita**

* “✅ Prenotazione confermata: **Desk ID** in data **YYYY-MM-DD**. Vuoi fare altro?”

**Conferma di rilascio riuscito**

* “♻️ Prenotazione rilasciata: **Desk ID** in data **YYYY-MM-DD**.”

**Richiesta di disambiguazione**

* “Per procedere mi serve il **desk ID** (es. A3). Quale vuoi usare?”

**Esito negativo con step successivi**

* “❌ Il desk **ID** risulta già occupato il **YYYY-MM-DD**. Vuoi provare con **un altro desk ID** o **un’altra data**?”

---

## Esempi rapidi

**Esempio 1 — “Prenota lunedì A3”**

1. Usa `convert_weekday_to_date` su “lunedì”.
2. Chiama `reserve_desk` con la data ottenuta e l’ID `A3`.
3. Se confermato: “✅ Prenotazione confermata: Desk A3 il 2025-09-29.”
4. Se rifiutato (già occupato/non disponibile): comunica l’errore e proponi alternative.

**Esempio 2 — “Rilascia A3 domani”**

1. Ottieni oggi con `get_current_date`, calcola “domani” chiedendo all’utente la data se non chiara, oppure fai chiedere una **data precisa**.
2. Esegui `release_desk`.
3. Conferma l’esito o proponi correzioni se l’operazione non è possibile.

**Esempio 3 — “Che desk ho prenotato?”**

1. Esegui `get_all_reserved_desks`.
2. Se vuoto: “Nessuna prenotazione trovata.”
3. Se presente: elenca per righe “Data — Desk ID”.

**Esempio 4 — “Dove si trova B2?”**

1. Esegui `retrieve_desk_general_info` con `B2`.
2. Riassumi posizione/descrizione/utenti ammessi.
3. Offri di prenotare per una data specifica.

---

## Cose da **fare**

* Chiedere **solo** le informazioni mancanti (data o desk ID).
* **Usare i tool** per ogni dato dinamico (soprattutto date).
* Mostrare sempre **esito e dettagli** (data, ID).
* Proporre **prossimi passi** quando qualcosa non va.

## Cose da **evitare**

* Non fare calcoli di calendario “a mano” se puoi usare i tool.
* Non inventare ID, date o disponibilità.
* Non rivelare logiche interne, parametri o eccezioni di implementazione.
* Non dire di aver eseguito un’azione se il tool non lo conferma.

---

## SESSION START
Greet the user only once at the beginning of a new thread. The user name is {user_name}. Current date is {current_date}.
"""

### 4. Dynamic Prompt

We can define a wrapper function for the system prompt to add user-related information or other details that may be useful. 

In [None]:
def custom_prompt_desk(state: dict, config: RunnableConfig):
    template = DESK_AGENT_PROMPT
    prompt_template = PromptTemplate(
        template=template,
        input_variables=[
            "user_name", 
            "current_date"
        ],
    )

    user_name = config['configurable']['user']
    current_date = get_current_date_util()

    system_msg = SystemMessage(
        content=prompt_template.format(
            user_name=user_name, 
            current_date=current_date
        )
    )

    return [system_msg] + state["messages"]


### 5. Building the Desk Agent with Rich Tracing Configuration

Parallels the Calendar Agent:
- Tool binding
- Dynamic prompt (user + current date)
- Dedicated memory

### 🎯 Consistent Tracing Pattern

Notice how the `desk_agent_config` follows the **same tracing pattern** as the Calendar Agent, but with agent-specific values:

```python
"run_name": "Session with Desk Agent"        # Different run name
"tags": ["demo", "lecture 3", "agent:desk"]  # agent:desk tag
"metadata": {
    "prompt_version": "desk_agent_v1",       # Desk-specific version
    "llm": DESK_MODEL_NAME,                  # Can use different model
    ...
}
```

### Why Consistent Configuration Matters

Using a **consistent structure** across all agents enables:

1. **Comparative Analysis**: Compare Calendar vs Desk agent performance side-by-side
2. **Unified Filtering**: Use the same tag structure (`agent:*`) to filter any agent
3. **Cross-Agent Metrics**: Aggregate metrics across the entire multi-agent system
4. **Team Standards**: Establish organization-wide tracing conventions
5. **Automated Analysis**: Build dashboards that work across all agents

This pattern scales from 2 agents to 20+ in production systems.
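One way to keep the structure consistent as the number of agents grows is to generate every config from a single helper. This is a minimal sketch: `build_agent_config` is a hypothetical function, not something the notebook defines, and the field values simply mirror the ones used for the Calendar and Desk agents.

```python
from uuid import uuid4


def build_agent_config(agent_name: str, model_name: str, user: str, env: str = "develop") -> dict:
    """Build a tracing config with the same shape for any agent.

    `agent_name` is a short lowercase label such as "calendar" or "desk".
    """
    return {
        "configurable": {
            "thread_id": str(uuid4()),  # fresh thread per config keeps memories isolated
            "user": user,
        },
        "run_name": f"Session with {agent_name.capitalize()} Agent",
        "tags": ["demo", "lecture 3", f"agent:{agent_name}"],
        "metadata": {
            "env": env,
            "notebook": "6 - Multi-agent with Tracing",
            "course": "LangGraph+LangSmith",
            "prompt_version": f"{agent_name}_agent_v1",
            "llm": model_name,
            "agent": "react",
        },
    }
```

With such a helper, `build_agent_config("desk", DESK_MODEL_NAME, "alice")` would produce the same dictionary shape as the hand-written config, and adding a new agent cannot silently drop a metadata key.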

In [None]:
desk_agent_memory = MemorySaver()

desk_agent = create_react_agent(
    model=desk_llm.bind_tools(desk_tools, parallel_tool_calls=False),
    tools=desk_tools,
    prompt=custom_prompt_desk,
    checkpointer=desk_agent_memory,
)

desk_agent_config: RunnableConfig = {
    "configurable": {
        "thread_id": str(uuid4()),
        "user": "alice",
    },
    "run_name": "Session with Desk Agent",
    "tags": ["demo", "lecture 3", "agent:desk"],
    "metadata": {
        "env": "develop",
        "notebook": "6 - Multi-agent with Tracing",
        "course": "LangGraph+LangSmith",
        "prompt_version": "desk_agent_v1",
        "llm": DESK_MODEL_NAME,
        "agent": "react",
    }
}

### 6. Testing the Desk Agent with Trace Organization

Suggested cases:
- "Quali prenotazioni ho per la prossima settimana?"
- "Mi cancelli tutte le mie prenotazioni?"
- "Aggiungi una prenotazione per il desk5" → expect a question about the missing date; then follow up with "In realtà volevo prenotarlo per il 14 ottobre"
- "Mi prenoti il desk A1 per il 13 ottobre?"

Observe the tone: responses should be succinct and state the outcome plus the next step.

### 🔍 Comparing Agent Traces in LangSmith

Now that you have traces from both Calendar and Desk agents, you can:

#### Compare Performance:
- Filter by `agent:calendar` vs `agent:desk` tags
- Compare average latency, token usage, and costs
- Identify which agent needs optimization

#### Analyze Tool Usage:
- Which agent calls more tools per request?
- Are there unnecessary tool calls to optimize?
- What's the success rate for each tool?

#### Track Prompt Evolution:
- Use `prompt_version` metadata to A/B test improvements
- Compare `calendar_agent_v1` vs `calendar_agent_v2`
- Roll back if a new version performs worse

#### Environment Segmentation:
- All traces have `env: develop` currently
- In production, you'd add `env: staging` and `env: production`
- Track metrics separately per environment

This structured approach transforms debugging from guesswork to data-driven analysis.
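Once run data has been exported, the per-agent comparison is a simple group-by on the `agent:*` tag. The sketch below assumes each record is a plain dict with `tags`, `latency_s`, and `total_tokens` keys. That record shape and the function name `summarize_by_agent` are hypothetical and do not reflect the LangSmith export schema.

```python
from collections import defaultdict
from statistics import mean


def summarize_by_agent(records):
    """Group run records by their `agent:*` tag and average latency and tokens."""
    grouped = defaultdict(list)
    for record in records:
        for tag in record["tags"]:
            if tag.startswith("agent:"):
                grouped[tag].append(record)
    return {
        tag: {
            "runs": len(items),
            "avg_latency_s": mean(r["latency_s"] for r in items),
            "avg_tokens": mean(r["total_tokens"] for r in items),
        }
        for tag, items in grouped.items()
    }
```

Because both agents share the `agent:*` tag convention, the same function works unchanged when a third agent is added.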

In [None]:
from utils.stream import print_stream

user_task = "Quali prenotazioni ho per la prossima settimana?"
inputs = {"messages": [("user", user_task)]}

print_stream(desk_agent.stream(inputs, desk_agent_config, stream_mode="values"))

## 3. Multi-Agent System with Comprehensive Tracing

We now compose the two agents into a coordinated system with **full observability**. Core principle: **the Central Agent does not re‑implement domain logic**; it acts purely as an intelligent router.

Adopted pattern:
- Wrap each agent as a tool (`chat_with_calendar_agent`, `chat_with_desk_agent`)
- Central Agent picks which tool to invoke based on inferred intent
- Separate memories maintained through supplied thread IDs in the config

### 🎯 Enhanced with Hierarchical Tracing

The real power of LangSmith emerges in multi-agent systems:

**Hierarchical Trace Structure:**
```
📊 Central Agent (coordinator trace)
  ├─ 🔀 Tool Call: chat_with_calendar_agent
  │   └─ 📅 Calendar Agent (nested trace)
  │       ├─ 🔧 Tool: get_current_date
  │       ├─ 🔧 Tool: convert_weekday_to_date
  │       └─ 🔧 Tool: list_events
  └─ ✅ Final Response
```

Each level has its own tags and metadata, creating a **rich hierarchy** for analysis.

### Benefits with Tracing:

- **Decoupling**: easy to swap/upgrade a domain agent (track performance before/after)
- **Scalability**: adding a new domain = new prompt + tool wrapper + tracing config
- **Traceability**: routing logs distinct from domain action logs
- **Performance Attribution**: know if slowness is in routing or domain execution
- **Cost Breakdown**: attribute token usage to coordinator vs domain agents
- **Error Isolation**: quickly identify which agent failed and why

Next: define the agent wrappers with proper config propagation for nested tracing.

### 1. Re-instantiating Domain Agents for Multi-Agent Coordination

In this section, we re-instantiate the Calendar Agent and Desk Agent to ensure each has a fresh, isolated memory and configuration. This is necessary for multi-agent orchestration, allowing the central agent to route requests to dedicated, independent agent instances for calendar and desk management.

### 🎯 Important: Fresh Configurations with Tracing

Notice that we create **new instances** with their tracing configurations intact:

```python
calendar_agent_config: RunnableConfig = {
    "configurable": {...},
    "run_name": "Session with Calendar Agent",
    "tags": ["demo", "lecture 3", "agent:calendar"],
    "metadata": {...}
}
```

These configurations will be **propagated through nested calls** when the Central Agent delegates to these agents. This creates the hierarchical trace structure in LangSmith, where you can see:

1. **Root Trace**: Central Agent receives user request
2. **Child Trace**: Central Agent calls `chat_with_calendar_agent` tool
3. **Nested Trace**: Calendar Agent executes with its own config
4. **Leaf Traces**: Individual tool calls (get_current_date, list_events, etc.)

Each level maintains its own tags and metadata, enabling multi-dimensional analysis.

In [None]:
calendar_agent_memory = MemorySaver()

calendar_agent = create_react_agent(
    model=calendar_llm.bind_tools(calendar_tools, parallel_tool_calls=False),
    tools=calendar_tools,
    prompt=custom_prompt_calendar,
    checkpointer=calendar_agent_memory,
)

calendar_agent_config: RunnableConfig = {
    "configurable": {
        "thread_id": str(uuid4()),
        "user": "alice",
    },
    "run_name": "Session with Calendar Agent",
    "tags": ["demo", "lecture 3", "agent:calendar"],
    "metadata": {
        "env": "develop",
        "notebook": "6 - Multi-agent with Tracing",
        "course": "LangGraph+LangSmith",
        "prompt_version": "calendar_agent_v1",
        "llm": CALENDAR_MODEL_NAME,
        "agent": "react",
    }
}

In [None]:
from langgraph.prebuilt import create_react_agent

from langgraph.checkpoint.memory import MemorySaver

desk_agent_memory = MemorySaver()

desk_agent = create_react_agent(
    model=desk_llm.bind_tools(desk_tools, parallel_tool_calls=False),
    tools=desk_tools,
    prompt=custom_prompt_desk,
    checkpointer=desk_agent_memory,
)

desk_agent_config: RunnableConfig = {
    "configurable": {
        "thread_id": str(uuid4()),
        "user": "alice",
    },
    "run_name": "Session with Desk Agent",
    "tags": ["demo", "lecture 3", "agent:desk"],
    "metadata": {
        "env": "develop",
        "notebook": "6 - Multi-agent with Tracing",
        "course": "LangGraph+LangSmith",
        "prompt_version": "desk_agent_v1",
        "llm": DESK_MODEL_NAME,
        "agent": "react",
    }
}
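
The two configuration dictionaries above differ only in the agent name, the prompt version, and the model name. As a minimal sketch (not part of the original notebook), a small factory can build them so that tags and metadata keys stay consistent across agents. The helper `make_agent_config` and the `"example-model"` string are hypothetical names introduced here for illustration; a plain `dict` stands in for `RunnableConfig`.

```python
from uuid import uuid4


def make_agent_config(agent_name: str, model_name: str, prompt_version: str,
                      user: str = "alice") -> dict:
    """Build a RunnableConfig-shaped dict for one agent (hypothetical helper)."""
    return {
        "configurable": {
            "thread_id": str(uuid4()),  # fresh thread id -> isolated memory per agent
            "user": user,
        },
        "run_name": f"Session with {agent_name.capitalize()} Agent",
        "tags": ["demo", "lecture 3", f"agent:{agent_name}"],
        "metadata": {
            "env": "develop",
            "notebook": "6 - Multi-agent with Tracing",
            "course": "LangGraph+LangSmith",
            "prompt_version": prompt_version,
            "llm": model_name,
            "agent": "react",
        },
    }


# "example-model" is a placeholder, not a real model identifier.
calendar_cfg = make_agent_config("calendar", "example-model", "calendar_agent_v1")
desk_cfg = make_agent_config("desk", "example-model", "desk_agent_v1")
print(calendar_cfg["run_name"])
print(desk_cfg["tags"])
```

Each call generates its own `thread_id`, so the two agents never share a checkpointer thread.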

### 2. Agent As-a-Tool with Configuration Propagation

In this section, we wrap each domain agent (Calendar Agent and Desk Agent) as a callable tool. This allows the Central Agent to delegate user requests to the appropriate specialized agent based on the detected intent. We expose each agent as a tool enabling modular orchestration and maintaining clear separation of responsibilities between calendar management and desk reservation functionalities.

### 🎯 Critical: Passing Complete Configurations

Each tool wrapper (e.g., `chat_with_calendar_agent`, `chat_with_desk_agent`) receives the user's input and a `RunnableConfig` object. The wrapper extracts the **complete agent configuration** (including tracing metadata) from `config['configurable']` and forwards it to the sub-agent when invoking it.


```python
calendar_agent_config = config['configurable']['calendar_agent_config']
```

This retrieves the **entire configuration object**, including:
- `configurable` (thread_id, user)
- `run_name` (appears in traces)
- `tags` (for filtering)
- `metadata` (for analysis)

### Why This Matters for Tracing

By passing the complete config object, we ensure:

1. **Nested Trace Inheritance**: Child traces preserve parent context
2. **Tag Propagation**: Filter by `agent:calendar` to see all related traces
3. **Metadata Continuity**: Analysis works across the entire call chain
4. **Attribution**: Know which coordinator session spawned which agent calls
5. **Performance Tracking**: Measure end-to-end latency including nested calls

This pattern ensures **complete observability** through the entire multi-agent workflow.

### Example Trace Hierarchy:

```
📊 Coordinator: "Session with Coordinator Agent" [agent:coordinator]
  └─ 🔧 Tool: chat_with_calendar_agent
      └─ 📅 "Session with Calendar Agent" [agent:calendar]
          ├─ 🔧 Tool: get_current_date
          └─ 🔧 Tool: list_events
```

Each level is independently queryable but maintains the hierarchical relationship.

In [None]:
from langchain_core.tools import tool

@tool
def chat_with_calendar_agent(
    user_input: str,
    config: RunnableConfig,  # injected at call time; not exposed to the model as a tool argument
    ):
    """
    Chat with the calendar management agent.

    Inputs:
    - user_input: str : The user's input message.
    - config: RunnableConfig : Run configuration, injected at call time.

    Output:
    - calendar agent's response as str.
    """
    inputs = {"messages": [("user", user_input)]}
    calendar_agent_config = config['configurable']['calendar_agent_config']
    return calendar_agent.invoke(inputs, calendar_agent_config)['messages'][-1].content

In [None]:
@tool
def chat_with_desk_agent(
    user_input: str,
    config: RunnableConfig,  # injected at call time; not exposed to the model as a tool argument
    ):
    """
    Chat with the desk management agent.

    Inputs:
    - user_input: str : The user's input message.
    - config: RunnableConfig : Run configuration, injected at call time.

    Output:
    - desk agent's response as str.
    """
    inputs = {"messages": [("user", user_input)]}
    desk_agent_config = config['configurable']['desk_agent_config']
    return desk_agent.invoke(inputs, desk_agent_config)['messages'][-1].content
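
Both wrappers read the sub-agent config with a direct dictionary lookup, which raises a bare `KeyError` if the coordinator config was built without the nested entry. A minimal sketch of a more defensive lookup follows; `get_sub_agent_config` is a hypothetical helper, and the dictionaries are reduced stand-ins for the real `RunnableConfig` objects.

```python
def get_sub_agent_config(config: dict, key: str) -> dict:
    """Return the nested sub-agent config stored under config['configurable'][key]."""
    configurable = config.get("configurable", {})
    if key not in configurable:
        # Fail with a message that names the missing key and what is available.
        raise KeyError(
            f"Missing '{key}' in config['configurable']; "
            f"available keys: {sorted(configurable)}"
        )
    return configurable[key]


# Hypothetical coordinator config, reduced to the keys relevant here.
coordinator_config = {
    "configurable": {
        "thread_id": "coordinator-thread",
        "calendar_agent_config": {
            "configurable": {"thread_id": "calendar-thread", "user": "alice"},
            "tags": ["agent:calendar"],
        },
    }
}

calendar_cfg = get_sub_agent_config(coordinator_config, "calendar_agent_config")
print(calendar_cfg["tags"])

try:
    get_sub_agent_config(coordinator_config, "desk_agent_config")
except KeyError as exc:
    print("lookup failed:", exc)
```

Inside a tool wrapper, the same helper would replace the `config['configurable'][...]` expression, making a misconfigured coordinator easier to diagnose from the trace.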

### 3. Defining the Coordinator Agent Prompt

In this section, we define the system prompt for the Coordinator Agent, which orchestrates interactions between the Calendar Agent and Desk Agent. The prompt includes:

- **Role definition:** The Coordinator Agent acts as a router, delegating user requests to the appropriate specialized agent.
- **Delegation guidelines:** It must identify user intent (calendar vs desk), use the correct tool, and never handle domain logic directly.
- **Clarification strategy:** If the intent is ambiguous, the agent asks clarifying questions before routing.
- **Context management:** Maintains conversation context and ensures coherent multi-turn interactions.

In [None]:
CENTRAL_AGENT_PROMPT = """
# 🧭 System Prompt — "Central Agent"

You are the central coordinator for multiple specialized agents: a Calendar Agent and a Desk Agent. Your role is to understand user requests and delegate them to the appropriate agent based on the context.

Do not handle requests directly; always route them to the relevant agent using the provided tools.

Calendar Agent capabilities:
* Manage calendar events: view, add, delete.

Desk Agent capabilities:
* Manage desk reservations: view, reserve, release.

## Guidelines for Delegation
1. **Identify Intent**: Determine if the user's request pertains to calendar management or desk reservations.
2. **Use Tools**: Always use the `chat_with_calendar_agent` or `chat_with_desk_agent` tools to forward requests.
3. **Clarify When Needed**: If the request is ambiguous, ask a clarifying question to the user before delegating.
4. **Maintain Context**: Keep track of the conversation context to ensure coherent interactions across multiple turns.

## CONVERSATION START
Greet the user only once at the beginning of a new thread. The user name is {user_name}. Current date is {current_date}.
"""

### 4. Dynamic Prompt

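Instead of a static string, we build the system prompt at run time. The function below fills the template with the user name taken from `config['configurable']['user']` and with the current date, then prepends the result as a system message to the conversation history.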

In [None]:
def custom_prompt_modifier(
    state: dict, 
    config: RunnableConfig
    ):
    template = CENTRAL_AGENT_PROMPT
    prompt_template = PromptTemplate(
        template=template,
        input_variables=[
            "user_name", 
            "current_date"
        ],
    )

    user_name = config['configurable']['user']
    current_date = get_current_date_util()

    system_msg = SystemMessage(
        content=prompt_template.format(
            user_name=user_name, 
            current_date=current_date
        )
    )

    return [system_msg] + state["messages"]
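
To make the substitution step concrete, here is a minimal sketch that renders a shortened stand-in for `CENTRAL_AGENT_PROMPT` using plain `str.format` in place of `PromptTemplate`. The names `TEMPLATE` and `render_system_prompt` are hypothetical, and the ISO date format is an assumption, since the output format of `get_current_date_util` is not shown in this notebook.

```python
from datetime import date

# Shortened stand-in for CENTRAL_AGENT_PROMPT (same two placeholders).
TEMPLATE = (
    "You are the central coordinator.\n"
    "The user name is {user_name}. Current date is {current_date}."
)


def render_system_prompt(config: dict, today: date) -> str:
    """Fill the template from the run config, mirroring custom_prompt_modifier."""
    user_name = config["configurable"]["user"]
    # Assumption: the date is rendered in ISO format (YYYY-MM-DD).
    return TEMPLATE.format(user_name=user_name, current_date=today.isoformat())


rendered = render_system_prompt({"configurable": {"user": "alice"}}, date(2025, 1, 28))
print(rendered)
```

Because the user name comes from the run config, the same agent object can serve different users by changing only `configurable["user"]`.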

### 5. Initializing the LLM
In this section, we initialize the language model (LLM) for the Central Agent. For this orchestration layer, we use a lightweight model (`openai:gpt-4o-mini-2024-07-18`) to ensure fast routing and low latency. The Central Agent does not perform domain reasoning; it only delegates requests to the specialized agents.


In [None]:
CENTRAL_AGENT_MODEL_NAME = "openai:gpt-4o-mini-2024-07-18"

central_agent_llm = init_chat_model(CENTRAL_AGENT_MODEL_NAME, temperature=0)

### 6. Building the Coordinator Agent with Complete Tracing

We set up the Coordinator Agent to route user requests to the appropriate specialized agent, maintaining modularity and clear separation of responsibilities.

### 🎯 Critical: Embedding Sub-Agent Configs

The `config` dictionary for the Coordinator Agent includes the **complete configuration objects** for each sub-agent:

```python
central_agent_config = {
    "configurable": {
        "thread_id": str(uuid4()),           
        "user": "alice",
        "calendar_agent_config": calendar_agent_config,  # Complete config object!
        "desk_agent_config": desk_agent_config           # Complete config object!
    },
    "run_name": "Session with Coordinator Agent",
    "tags": ["demo", "lecture 3", "agent:coordinator"],
    "metadata": {
        "env": "develop",
        "notebook": "6 - Multi-agent with Tracing",
        "course": "LangGraph+LangSmith",
        "prompt_version": "central_agent_v1",
        "llm": CENTRAL_AGENT_MODEL_NAME,
        "agent": "react",
    }
}
```

### Why This Pattern Works

By embedding the **full config objects** (not just thread IDs), we enable:

1. **Complete Config Propagation**: All tags, metadata, and run names flow through
2. **Hierarchical Tracing**: LangSmith shows nested trace structure
3. **Independent Analysis**: Query coordinator or sub-agents independently
4. **Performance Attribution**: Separate coordinator overhead from agent execution
5. **Multi-Level Filtering**: Filter by coordinator tags OR sub-agent tags

### Trace Organization in LangSmith

With this configuration, your dashboard shows:

```
📊 Project: tech4tech-multi-agent-lecture3
├─ Session with Coordinator Agent [agent:coordinator, env:develop]
│  ├─ Tool: chat_with_calendar_agent
│  │  └─ Session with Calendar Agent [agent:calendar, env:develop]
│  │     └─ Tool: list_events
│  └─ Tool: chat_with_desk_agent
│     └─ Session with Desk Agent [agent:desk, env:develop]
│        └─ Tool: reserve_desk
```

### Production Benefits

This structured approach enables:
- **Cost Analysis**: "How much does routing add vs domain logic?"
- **Performance Optimization**: "Is the coordinator or agents the bottleneck?"
- **A/B Testing**: "Does central_agent_v2 route more accurately?"
- **Error Tracking**: "Which agent failed and in what context?"
- **Usage Patterns**: "Which agent gets called most often?"

When the central agent delegates a request, it forwards the complete config, ensuring each agent preserves its own isolated memory, context, **and tracing metadata**.

In [None]:
central_agent_memory = MemorySaver()

central_agent = create_react_agent(
    model=central_agent_llm.bind_tools([chat_with_calendar_agent, chat_with_desk_agent], parallel_tool_calls=False),
    tools=[chat_with_calendar_agent, chat_with_desk_agent],
    prompt=custom_prompt_modifier,
    checkpointer=central_agent_memory,
)

central_agent_config = {
    "configurable": {
        "thread_id": str(uuid4()), 
        "user": "alice",
        "calendar_agent_config": calendar_agent_config,
        "desk_agent_config": desk_agent_config 
    },
    "run_name": "Session with Coordinator Agent",
    "tags": ["demo", "lecture 3", "agent:coordinator"],
    "metadata": {
        "env": "develop",
        "notebook": "6 - Multi-agent with Tracing",
        "course": "LangGraph+LangSmith",
        "prompt_version": "central_agent_v1",
        "llm": CENTRAL_AGENT_MODEL_NAME,
        "agent": "react",
    }
}
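
Memory isolation depends on every agent having its own `thread_id`, and tag-based filtering depends on every sub-agent config carrying an `agent:` tag. A minimal sketch of a sanity check that could be run before the first invocation is shown below; `check_config_isolation` is a hypothetical helper, and the dictionaries are reduced stand-ins for the configs defined above.

```python
def check_config_isolation(central_config: dict, sub_keys: list[str]) -> list[str]:
    """Return a list of problems found; an empty list means the configs look isolated."""
    problems = []
    configurable = central_config["configurable"]
    thread_ids = {"central": configurable["thread_id"]}
    for key in sub_keys:
        sub = configurable.get(key)
        if sub is None:
            problems.append(f"{key} is missing")
            continue
        thread_ids[key] = sub["configurable"]["thread_id"]
        # Each sub-agent should be identifiable by an 'agent:<name>' tag.
        if not any(tag.startswith("agent:") for tag in sub.get("tags", [])):
            problems.append(f"{key} has no 'agent:' tag")
    # Every agent needs its own thread id to keep checkpointer memory separate.
    if len(set(thread_ids.values())) != len(thread_ids):
        problems.append("thread_id values are not unique")
    return problems


# Reduced stand-in for central_agent_config with placeholder thread ids.
central = {
    "configurable": {
        "thread_id": "t-central",
        "calendar_agent_config": {
            "configurable": {"thread_id": "t-cal"},
            "tags": ["agent:calendar"],
        },
        "desk_agent_config": {
            "configurable": {"thread_id": "t-desk"},
            "tags": ["agent:desk"],
        },
    }
}

problems = check_config_isolation(central, ["calendar_agent_config", "desk_agent_config"])
print(problems or "configs look isolated")
```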

### 7. Testing the Multi-Agent System with Full Observability

You can test the multi-agent system by asking questions that span both domains. The orchestrator (central coordinator agent) will automatically route each request to the appropriate specialized agent, ensuring correct handling and modular separation of logic.

### 🔍 What to Observe in LangSmith Dashboard

**While testing, keep your LangSmith dashboard open** at [smith.langchain.com](https://smith.langchain.com) and observe:

#### 1. Hierarchical Trace Structure
See the complete flow:
- **Root**: Coordinator receives "Mi dai gli impegni del 29? Prenotami un desk per domani, A2." ("Can you give me my appointments for the 29th? Book me a desk for tomorrow, A2.")
- **Branch 1**: Coordinator → Calendar Agent → list_events tool
- **Branch 2**: Coordinator → Desk Agent → reserve_desk tool
- **Convergence**: Coordinator synthesizes final response

#### 2. Tag-Based Filtering
Try these filters in LangSmith:
- `agent:coordinator` → See only routing decisions
- `agent:calendar` → See only calendar operations
- `agent:desk` → See only desk operations
- `lecture 3` → See all traces from this exercise
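- `env` = `develop` → Separate dev from production traces (note that `env` is stored under `metadata`, not `tags`, so this is a metadata filter rather than a tag filter)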

#### 3. Metadata Analysis
Explore metadata panels to see:
- Which prompt version was used
- Which LLM model powered each agent
- Which notebook generated the trace
- Course context for educational tracking
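
The same tag and metadata filters can also be expressed as query strings for programmatic access. The sketch below only builds the strings; the syntax (`has(tags, ...)`, `eq(metadata_key, ...)`, `eq(metadata_value, ...)`, `and(...)`) follows the LangSmith run filter language as we understand it and should be checked against the current LangSmith documentation. The helper names are hypothetical.

```python
def tag_filter(tag: str) -> str:
    """Filter clause matching runs that carry the given tag."""
    return f'has(tags, "{tag}")'


def metadata_filter(key: str, value: str) -> str:
    """Filter clause matching runs whose metadata has key == value."""
    return f'and(eq(metadata_key, "{key}"), eq(metadata_value, "{value}"))'


def all_of(*clauses: str) -> str:
    """Combine clauses with a logical AND (a single clause is returned unchanged)."""
    if len(clauses) == 1:
        return clauses[0]
    return "and(" + ", ".join(clauses) + ")"


# Calendar Agent runs recorded in the develop environment.
query = all_of(tag_filter("agent:calendar"), metadata_filter("env", "develop"))
print(query)

# Assumed usage with the LangSmith SDK (requires an API key, so it is left commented out):
# from langsmith import Client
# runs = Client().list_runs(project_name="<your project>", filter=query)
```

Keeping tags for coarse categories (`agent:calendar`) and metadata for key/value attributes (`env`, `prompt_version`) makes such queries easier to compose.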

#### 4. Performance Metrics
Compare across agents:
- **Latency**: Which agent responds fastest?
- **Token Usage**: Which agent is most expensive?
- **Tool Calls**: Which agent uses more tools per request?
- **Success Rate**: Which agent has fewer errors?

#### 5. Cost Attribution
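See token consumption broken down (the figures below are rough, illustrative orders of magnitude; actual counts depend on prompts, tools, and conversation length):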
- Coordinator routing: ~100-200 tokens
- Calendar Agent: ~300-500 tokens (with tools)
- Desk Agent: ~300-500 tokens (with tools)
- **Total**: Transparent per-request costs
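
Once run records are exported, per-agent cost attribution reduces to grouping token counts by the `agent:` tag. The sketch below shows that aggregation on hand-written sample records; the record shape (`tags`, `total_tokens`) and the token numbers are invented for illustration and are not measurements from this notebook.

```python
from collections import defaultdict


def tokens_per_agent(runs: list[dict]) -> dict[str, int]:
    """Sum total_tokens per agent, using the 'agent:<name>' tag as the grouping key."""
    totals: dict[str, int] = defaultdict(int)
    for run in runs:
        agent = next(
            (tag.split(":", 1)[1] for tag in run.get("tags", []) if tag.startswith("agent:")),
            "unknown",  # runs without an agent tag are grouped separately
        )
        totals[agent] += run.get("total_tokens", 0)
    return dict(totals)


# Made-up sample records, only to exercise the function.
sample_runs = [
    {"tags": ["demo", "agent:coordinator"], "total_tokens": 150},
    {"tags": ["demo", "agent:calendar"], "total_tokens": 400},
    {"tags": ["demo", "agent:calendar"], "total_tokens": 50},
    {"tags": ["demo", "agent:desk"], "total_tokens": 380},
]

usage = tokens_per_agent(sample_runs)
print(usage)
```

This is why consistent `agent:<name>` tags matter: they are the join key for any per-agent report.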

### Suggested Test Queries

**Single Domain:**
- "Mi dai gli impegni del 29?" ("Can you give me my appointments for the 29th?") → Routes to Calendar Agent
- "Quali prenotazioni ho?" ("Which reservations do I have?") → Routes to Desk Agent

**Multi-Domain:**
- "Aggiungi un meeting domani e prenota il desk A3" ("Add a meeting tomorrow and book desk A3") → Routes to both agents
- Observe how the coordinator sequences the operations

**Ambiguous:**
- "Cosa ho domani?" ("What do I have tomorrow?") → Coordinator should ask for clarification
- Watch the reasoning trace

### 🎯 Pro Tips for Trace Analysis

1. **Use the comparison view**: Select multiple traces to compare performance
2. **Export traces**: Download JSON for offline analysis or reporting
3. **Set up monitoring**: Configure alerts for latency or error spikes
4. **Share traces**: Send trace URLs to teammates for collaborative debugging
5. **Track versions**: Use metadata to A/B test prompt improvements

This comprehensive instrumentation transforms your multi-agent system from a black box into a **fully observable, analyzable, and optimizable production system**.

In [None]:
from utils.stream import display_stream

# English: "Can you give me my appointments for the 29th? Book me a desk for tomorrow, A2."
user_task = "Mi dai gli impegni del 29? Prenotami un desk per domani, A2."
inputs = {"messages": [("user", user_task)]}


display_stream(central_agent.stream(inputs, central_agent_config, stream_mode="values"), thinking=False)

## 5. Conclusion

🎉 **Congratulations on Completing Notebook 6!** 🎉

You've successfully built and instrumented a production-grade multi-agent system with comprehensive LangSmith tracing. This exercise demonstrated how observability transforms complex AI systems from black boxes into transparent, analyzable, and optimizable architectures.

### 🎯 What We Accomplished

#### 1. **Built a Multi-Agent System**
- Calendar Agent for event management
- Desk Agent for reservation management  
- Central Coordinator for intelligent routing
- All with complete functional separation and memory isolation

#### 2. **Instrumented with Rich Tracing**
- **run_name**: Human-readable session identifiers
- **tags**: Multi-dimensional filtering (`demo`, `lecture 3`, `agent:<name>`)
- **metadata**: Structured key/value data for deep analysis (`env`, `prompt_version`, `llm`, `notebook`, etc.)

#### 3. **Demonstrated Hierarchical Observability**
- Root traces: Coordinator decisions
- Nested traces: Agent executions
- Leaf traces: Individual tool calls
- Complete visibility through the entire workflow

#### 4. **Explored Production Patterns**
- Environment segmentation (develop/staging/production)
- Version tracking for A/B testing
- Feature flags and experimental tracking
- User cohort analysis
- Cost attribution per agent

### 🚀 The Power of Observability

With proper instrumentation, you can now:

✅ **Debug faster**: Isolate issues to specific agents or tools immediately  
✅ **Optimize smarter**: Use data to guide performance improvements  
✅ **Deploy confidently**: Monitor production behavior in real-time  
✅ **Experiment safely**: A/B test changes with measurable outcomes  
✅ **Scale effectively**: Understand costs and bottlenecks before they become problems  
