# Example 1: Temporal Coordination Failure

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Javihaus/agents_observability_bootcamp/blob/main/chapter_01_diagnosing_agent_failures/examples/example_01_temporal_coordination_failure.ipynb)

**Instructor demonstration** - Students follow along without running code

---

## Objective

Demonstrate how multi-agent systems fail when temporal constraints are introduced, even when they coordinate successfully under static conditions.

**Key lesson**: Coordination success under static conditions does NOT predict success under time pressure.

---

## Scenario

**Medical Emergency Response**
- Patient arrives with severe symptoms requiring multi-specialist coordination
- Three agents: Triage Nurse, Emergency Physician, Specialist Consultant
- Static case: Standard protocol, no time pressure
- Dynamic case: Rapidly deteriorating patient, 30-minute window

**Hypothesis**: Agents will coordinate well in the static case but fail to adapt their plans in the dynamic case.

## Setup

In [None]:
# Install dependencies
!pip install -q langchain==0.1.0 langchain-anthropic==0.1.1 anthropic==0.18.1
!pip install -q python-dotenv pandas

print("Installation complete!")

In [None]:
from google.colab import userdata
from langchain_anthropic import ChatAnthropic
from langchain.schema import HumanMessage, SystemMessage
import time
from datetime import datetime

# Get API key (instructor's key)
ANTHROPIC_API_KEY = userdata.get('ANTHROPIC_API_KEY')

print("Imports successful!")
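`userdata.get` only works inside Colab, so the import cell fails when the notebook is run locally. The following is a minimal sketch of a loader that falls back to an environment variable. `load_api_key` is a hypothetical helper, not part of the notebook, and the broad `except Exception` is an assumption made so the sketch does not depend on Colab's exact error type.

```python
import os


def load_api_key(name: str = "ANTHROPIC_API_KEY") -> str:
    """Return the API key from Colab secrets if available, else from the environment."""
    try:
        from google.colab import userdata  # importable only inside Colab
    except ImportError:
        userdata = None

    if userdata is not None:
        try:
            key = userdata.get(name)
        except Exception:  # assumption: treat any Colab secret error as "not set"
            key = None
        if key:
            return key

    key = os.environ.get(name)
    if not key:
        raise RuntimeError(f"{name} not found in Colab secrets or environment")
    return key
```

In the notebook, `ANTHROPIC_API_KEY = load_api_key()` could then stand in for the direct `userdata.get` call.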

## Scenario 1: Static Coordination (No Time Pressure)

In this scenario, agents coordinate a standard emergency response without temporal constraints.

In [None]:
# Define agent configurations
TRIAGE_NURSE_SYSTEM = """You are a triage nurse in an emergency department.
Your role is to assess incoming patients and coordinate with the emergency physician.
Provide clear, concise assessments focusing on:
- Chief complaint
- Vital signs interpretation
- Recommended urgency level
- Initial action items
"""

EMERGENCY_PHYSICIAN_SYSTEM = """You are an emergency physician.
You receive triage assessments and coordinate with specialists when needed.
Provide clear medical plans including:
- Differential diagnosis
- Immediate interventions
- Specialist consultation needs
- Expected timeline
"""

SPECIALIST_SYSTEM = """You are a specialist consultant.
You receive consultation requests from emergency physicians.
Provide expert recommendations including:
- Specialist assessment
- Recommended procedures
- Resource requirements
- Timeline for intervention
"""

print("Agent configurations defined")

In [None]:
# Initialize LLM for all agents
llm = ChatAnthropic(
    model="claude-sonnet-4-20250514",
    anthropic_api_key=ANTHROPIC_API_KEY,
    max_tokens=500,
    temperature=0
)

# Patient presentation (static)
patient_info = """Patient: 58-year-old male
Chief complaint: Chest pain for 2 hours
Vitals: BP 145/92, HR 88, RR 18, SpO2 94% on room air
History: Hypertension, hyperlipidemia, 20 pack-year smoking history
Pain: 7/10, substernal, radiating to left arm
"""

print("=" * 60)
print("STATIC SCENARIO: Standard Emergency Response")
print("=" * 60)
print(f"\nPatient Information:\n{patient_info}")

# Agent 1: Triage Nurse
print("\n" + "-" * 60)
print("TRIAGE NURSE ASSESSMENT")
print("-" * 60)

triage_messages = [
    SystemMessage(content=TRIAGE_NURSE_SYSTEM),
    HumanMessage(content=f"Assess this patient:\n{patient_info}")
]

triage_response = llm.invoke(triage_messages)
print(triage_response.content)

# Agent 2: Emergency Physician
print("\n" + "-" * 60)
print("EMERGENCY PHYSICIAN PLAN")
print("-" * 60)

physician_messages = [
    SystemMessage(content=EMERGENCY_PHYSICIAN_SYSTEM),
    HumanMessage(content=f"""Triage assessment:
{triage_response.content}

Create a treatment plan for this patient.""")
]

physician_response = llm.invoke(physician_messages)
print(physician_response.content)

# Agent 3: Specialist
print("\n" + "-" * 60)
print("SPECIALIST CONSULTATION")
print("-" * 60)

specialist_messages = [
    SystemMessage(content=SPECIALIST_SYSTEM),
    HumanMessage(content=f"""Emergency physician requests consultation:
{physician_response.content}

Provide specialist recommendations.""")
]

specialist_response = llm.invoke(specialist_messages)
print(specialist_response.content)

print("\n" + "=" * 60)
print("STATIC SCENARIO COMPLETE")
print("=" * 60)
print("\nExpected observation: agents produce coherent, compatible plans (check the output above)")
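The three hand-offs in this cell follow one pattern: each agent receives the previous agent's output wrapped in a short instruction. A minimal sketch of that pattern as a reusable helper is shown here. `run_handoff_chain`, `Stage`, and the `invoke` callable are hypothetical names, and a stub replaces the real model call so the sketch runs without an API key.

```python
from typing import Callable, List, Tuple

# Each stage: (agent name, system prompt, user template with a {previous} placeholder).
Stage = Tuple[str, str, str]


def run_handoff_chain(
    invoke: Callable[[str, str], str],
    stages: List[Stage],
    initial_input: str,
) -> List[Tuple[str, str]]:
    """Run agents sequentially, feeding each one the previous agent's output."""
    transcript = []
    previous = initial_input
    for name, system_prompt, template in stages:
        user_prompt = template.format(previous=previous)
        previous = invoke(system_prompt, user_prompt)
        transcript.append((name, previous))
    return transcript


# Stub model call. Assumption: in the notebook this would wrap
# llm.invoke([SystemMessage(...), HumanMessage(...)]).content instead.
def fake_invoke(system_prompt: str, user_prompt: str) -> str:
    return f"<reply to: {user_prompt.splitlines()[0]}>"


stages = [
    ("triage", "You are a triage nurse.", "Assess this patient:\n{previous}"),
    ("physician", "You are an emergency physician.",
     "Triage assessment:\n{previous}\n\nCreate a treatment plan for this patient."),
    ("specialist", "You are a specialist consultant.",
     "Emergency physician requests consultation:\n{previous}\n\nProvide specialist recommendations."),
]

for name, output in run_handoff_chain(fake_invoke, stages, "58-year-old male, chest pain"):
    print(f"{name}: {output}")
```

The helper makes the strictly sequential structure explicit: no agent sees anything except the output of the agent immediately before it.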

## Scenario 2: Dynamic Coordination (Time Pressure)

**Same patient, but now**: The ECG shows ST elevation (STEMI), and the patient requires intervention within 30 minutes.

**Critical difference**: A time constraint has been added. Will the agents adapt their plans?

In [None]:
# Updated patient info with temporal constraint
urgent_patient_info = """Patient: 58-year-old male
Chief complaint: Chest pain for 2 hours
Vitals: BP 145/92, HR 88, RR 18, SpO2 94% on room air
History: Hypertension, hyperlipidemia, 20 pack-year smoking history
Pain: 7/10, substernal, radiating to left arm

CRITICAL UPDATE: ECG shows ST elevation (STEMI)
Time to intervention: 30 minutes maximum (door-to-balloon time)
Current time: 10 minutes since arrival
"""

print("=" * 60)
print("DYNAMIC SCENARIO: Emergency Response Under Time Pressure")
print("=" * 60)
print(f"\nUpdated Patient Information:\n{urgent_patient_info}")

# Agent 1: Triage Nurse (with urgency)
print("\n" + "-" * 60)
print("TRIAGE NURSE ASSESSMENT (URGENT)")
print("-" * 60)

urgent_triage_messages = [
    SystemMessage(content=TRIAGE_NURSE_SYSTEM),
    HumanMessage(content=f"Assess this patient (URGENT):\n{urgent_patient_info}")
]

urgent_triage_response = llm.invoke(urgent_triage_messages)
print(urgent_triage_response.content)

# Agent 2: Emergency Physician (with time constraint)
print("\n" + "-" * 60)
print("EMERGENCY PHYSICIAN PLAN (TIME-CRITICAL)")
print("-" * 60)

urgent_physician_messages = [
    SystemMessage(content=EMERGENCY_PHYSICIAN_SYSTEM),
    HumanMessage(content=f"""Triage assessment:
{urgent_triage_response.content}

Patient requires intervention within 30 minutes. 10 minutes have elapsed.
Create an emergency treatment plan.""")
]

urgent_physician_response = llm.invoke(urgent_physician_messages)
print(urgent_physician_response.content)

# Agent 3: Specialist (must respond to urgency)
print("\n" + "-" * 60)
print("SPECIALIST CONSULTATION (URGENT)")
print("-" * 60)

urgent_specialist_messages = [
    SystemMessage(content=SPECIALIST_SYSTEM),
    HumanMessage(content=f"""URGENT consultation request:
{urgent_physician_response.content}

Patient has STEMI. 20 minutes remaining until critical window closes.
Provide immediate recommendations.""")
]

urgent_specialist_response = llm.invoke(urgent_specialist_messages)
print(urgent_specialist_response.content)

print("\n" + "=" * 60)
print("DYNAMIC SCENARIO COMPLETE")
print("=" * 60)
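In the cell above, the elapsed and remaining times are typed by hand into each prompt ("10 minutes have elapsed", "20 minutes remaining"), so the scenario clock never advances between hand-offs. A minimal sketch of a single simulated clock that generates those lines is shown here. `ScenarioClock` is a hypothetical name, and the 3-minute hand-off cost is an assumption chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ScenarioClock:
    """Simulated scenario time in minutes (not wall-clock time)."""
    window_minutes: int
    elapsed_minutes: int = 0

    @property
    def remaining_minutes(self) -> int:
        # Never report negative time once the window has closed.
        return max(self.window_minutes - self.elapsed_minutes, 0)

    def advance(self, minutes: int) -> None:
        self.elapsed_minutes += minutes

    def status_line(self) -> str:
        return (
            f"Patient requires intervention within {self.window_minutes} minutes. "
            f"{self.elapsed_minutes} minutes have elapsed; "
            f"{self.remaining_minutes} minutes remain."
        )


clock = ScenarioClock(window_minutes=30, elapsed_minutes=10)
print("Physician prompt: ", clock.status_line())

# Assumption: the physician hand-off consumes 3 simulated minutes.
clock.advance(3)
print("Specialist prompt:", clock.status_line())
```

Generating the time line from one object keeps elapsed and remaining minutes consistent across every agent prompt.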

## Analysis: Comparing Static vs Dynamic Coordination

In [None]:
print("=" * 60)
print("COORDINATION ANALYSIS")
print("=" * 60)

# Analysis framework
analysis = """\n
**Questions to consider:**

1. Did the triage nurse's assessment change appropriately with urgency?
   - Static: Standard assessment protocol
   - Dynamic: Should emphasize time-critical nature
   - Observe: Did urgency propagate to assessment?

2. Did the physician's plan adapt to time constraint?
   - Static: Methodical diagnostic workup
   - Dynamic: Should prioritize rapid intervention
   - Observe: Were time-consuming steps removed?

3. Did the specialist recognize temporal constraint?
   - Static: Comprehensive consultation
   - Dynamic: Should provide immediate actionable guidance
   - Observe: Was response appropriately abbreviated?

4. Are the three plans mutually compatible given time constraint?
   - Do timelines align across all three agents?
   - Are resource allocations consistent?
   - Can all steps be executed within 30-minute window?

**Typical failure modes observed:**

- Triage nurse mentions urgency but doesn't change protocol
- Physician includes time-consuming tests despite constraint
- Specialist provides detailed workup requiring >30 minutes
- No explicit "time budget" allocation across agents
- Agents act independently without shared temporal awareness

**Root cause:**

Agents process "urgency" as a text token, not as a constraint that
should fundamentally restructure their decision process.

They lack:
1. Continuous temporal state representation
2. Explicit constraint checking mechanisms  
3. Shared temporal awareness for coordination

**Expected behavior in human system:**

- Immediate code STEMI activation
- All non-essential steps skipped
- Direct communication (not sequential hand-offs)
- Parallel processing where possible
- Explicit time budget: "Cath lab ready in 5 minutes"
"""

print(analysis)

print("\n" + "=" * 60)
print("KEY INSIGHT")
print("=" * 60)
print("""
Coordination that works in static scenarios often fails under
time pressure because agents cannot dynamically adapt plans
or maintain shared temporal awareness.

Prompt engineering alone does not fix this.
It is an architectural limitation.
""")

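Question 4 in the analysis asks whether every step fits inside the 30-minute window. A rough way to check this is to scan each agent's response for stated durations and flag any that exceed the remaining time. The sketch below is a heuristic only: `mentioned_minutes` and `over_budget` are hypothetical helpers, the sample responses are made up for illustration, and a regex cannot tell a plan step from background text such as "chest pain for 2 hours".

```python
import re
from typing import Dict, List

# Matches "45 min", "45 mins", "45 minutes", "20-minute", "1 hour", "2 hrs".
_DURATION = re.compile(
    r"(\d+(?:\.\d+)?)\s*-?\s*(minutes?|mins?|hours?|hrs?)\b",
    re.IGNORECASE,
)


def mentioned_minutes(text: str) -> List[float]:
    """Return every duration mentioned in `text`, converted to minutes."""
    minutes = []
    for value, unit in _DURATION.findall(text):
        factor = 60.0 if unit.lower().startswith("h") else 1.0
        minutes.append(float(value) * factor)
    return minutes


def over_budget(responses: Dict[str, str], remaining_minutes: float) -> Dict[str, List[float]]:
    """Map each agent to the mentioned durations that exceed the remaining window."""
    flagged = {}
    for agent, text in responses.items():
        too_long = [m for m in mentioned_minutes(text) if m > remaining_minutes]
        if too_long:
            flagged[agent] = too_long
    return flagged


# Made-up snippets; in the notebook these would be
# urgent_triage_response.content, urgent_physician_response.content, and so on.
sample = {
    "triage": "Activate code STEMI now. ECG repeated in 5 minutes.",
    "physician": "Order troponin series (results in 1 hour) and chest X-ray.",
    "specialist": "Cath lab team assembled within 45 min; 20-minute prep.",
}
print(over_budget(sample, remaining_minutes=20))
```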
---

## Instructor Notes

### Teaching Strategy

**Before running**: Ask students to predict what will happen
- Will agents adapt to time pressure?
- How would humans handle this scenario?
- Where do they expect breakdown?

**During execution**: Point out specific coordination patterns
- Note when agents mention urgency but don't change behavior
- Highlight time-consuming steps that persist
- Show lack of explicit time allocation

**After completion**: Compare agent behavior to human protocols
- Real STEMI activation takes <5 minutes
- All non-essential steps are skipped
- Communication is direct and parallel

### Common Student Questions

**Q: Can we fix this with better prompts?**
A: Partially, but the fix is brittle. Adding explicit time reminders helps but doesn't create systematic temporal awareness.

**Q: What about few-shot examples?**
A: They help for scenarios similar to the examples but fail on novel urgent situations.

**Q: Is this a Claude-specific issue?**
A: No. Other autoregressive LLMs show similar limitations, so the cause is architectural rather than vendor-specific.

**Q: How do we solve this in production?**
A: Use a hybrid architecture (Chapter 4): an LLM for understanding plus deterministic logic for temporal constraints.
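To make the "deterministic logic" half of that answer concrete, here is a minimal sketch of a time-budget gate. It assumes the LLM output has already been parsed into structured steps, that steps run sequentially, and that each step carries an `essential` flag. `Step` and `fit_to_window` are hypothetical names, the durations are made-up illustration values, and this is not the Chapter 4 design.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Step:
    name: str
    minutes: int
    essential: bool  # True if the step cannot be skipped


def fit_to_window(steps: List[Step], remaining_minutes: int) -> Tuple[List[Step], List[Step]]:
    """Keep essential steps, then add optional ones only while time remains.

    Raises ValueError if the essential steps alone exceed the window, so the
    failure is explicit instead of being buried in generated text.
    """
    needed = sum(s.minutes for s in steps if s.essential)
    if needed > remaining_minutes:
        raise ValueError(
            f"Essential steps need {needed} min, only {remaining_minutes} min remain"
        )

    spare = remaining_minutes - needed
    kept: List[Step] = []
    dropped: List[Step] = []
    for step in steps:  # preserve the original plan order
        if step.essential:
            kept.append(step)
        elif step.minutes <= spare:
            kept.append(step)
            spare -= step.minutes
        else:
            dropped.append(step)
    return kept, dropped


# Made-up plan for illustration only.
plan = [
    Step("Activate cath lab", 5, essential=True),
    Step("Aspirin and heparin", 3, essential=True),
    Step("Serial troponin", 60, essential=False),
    Step("Chest X-ray", 15, essential=False),
    Step("Transfer to cath lab", 8, essential=True),
]
kept, dropped = fit_to_window(plan, remaining_minutes=20)
print("Kept:   ", [s.name for s in kept])
print("Dropped:", [s.name for s in dropped])
```

The constraint check lives outside the model, so it behaves the same way regardless of how the prompt is worded.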

### Time Management

- Setup: 2 minutes
- Static scenario: 5 minutes
- Dynamic scenario: 5 minutes
- Analysis discussion: 8 minutes
- Student questions: 5 minutes
- **Total: 25 minutes**

### Variations

If time permits, try:
- Different temporal constraints (60 min, 15 min, 5 min)
- More agents (adding pharmacy, lab, radiology)
- Resource constraints (only one cath lab available)
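For the first variation, the urgent patient text can be generated for each window instead of edited by hand. The sketch below mirrors the format of `urgent_patient_info`. `build_urgent_info` is a hypothetical helper, and scaling the elapsed time to one third of the window (as in the original 10 of 30 minutes) is an assumption.

```python
# Copied from the notebook's static scenario.
BASE_PATIENT_INFO = """Patient: 58-year-old male
Chief complaint: Chest pain for 2 hours
Vitals: BP 145/92, HR 88, RR 18, SpO2 94% on room air
History: Hypertension, hyperlipidemia, 20 pack-year smoking history
Pain: 7/10, substernal, radiating to left arm
"""


def build_urgent_info(window_minutes: int, elapsed_minutes: int) -> str:
    """Build the dynamic-scenario patient text for a given time window."""
    if not 0 <= elapsed_minutes < window_minutes:
        raise ValueError("elapsed_minutes must be >= 0 and less than window_minutes")
    unit = "minute" if elapsed_minutes == 1 else "minutes"
    return (
        f"{BASE_PATIENT_INFO}\n"
        "CRITICAL UPDATE: ECG shows ST elevation (STEMI)\n"
        f"Time to intervention: {window_minutes} minutes maximum (door-to-balloon time)\n"
        f"Current time: {elapsed_minutes} {unit} since arrival\n"
    )


# Assumption: one third of each window has already elapsed.
variants = {w: build_urgent_info(w, elapsed_minutes=w // 3) for w in (60, 15, 5)}
for window, info in variants.items():
    print(f"--- {window}-minute window ---")
    print(info)
```

Each variant can then be passed through the same three-agent sequence to compare how the plans change as the window shrinks.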

### Transition to Example 2

"We've seen temporal coordination failure. Now let's see how this causes cost explosion when agents retry and repeat without proper monitoring..."