# Observability & Telemetry

This notebook demonstrates how to use Llama Stack's telemetry for tracing and debugging workflows.

## Prerequisites

- Llama Stack running at `http://localhost:5001`
- MyloWare API running at `http://localhost:8000`
- Jaeger or similar tracing backend (optional)


In [None]:
import sys
sys.path.insert(0, '../src')

import os
import logging
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:5001")
API_URL = os.getenv("MYLOWARE_API_URL", "http://localhost:8000")


## 1. Structured Logging

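MyloWare uses structured logging throughout the codebase. The cell below configures Python's standard `logging` module so those messages are visible in the notebook, with debug output enabled for selected modules: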


In [None]:
# Configure logging to see MyloWare's internal operations
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)

# Enable debug logging for specific modules
logging.getLogger('workflows.orchestrator').setLevel(logging.DEBUG)
logging.getLogger('tools').setLevel(logging.DEBUG)

print("Logging configured")
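The format string above produces plain text lines. If machine-readable output is preferred, one option is a small JSON formatter. The following is a minimal sketch using only the standard library; `JsonFormatter`, the `run_id` and `step` fields, and the logger name are illustrative assumptions, not part of MyloWare's actual logging setup.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object (hypothetical helper)."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record),
            "logger": record.name,
            "level": record.levelname,
            "message": record.getMessage(),
        }
        # Values passed via `extra=` become attributes on the record.
        # `run_id` and `step` are assumed field names for illustration.
        for key in ("run_id", "step"):
            if hasattr(record, key):
                payload[key] = getattr(record, key)
        return json.dumps(payload)


# Attach the formatter to a dedicated example logger.
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
example_logger = logging.getLogger("structured_logging_example")
example_logger.addHandler(handler)
example_logger.setLevel(logging.INFO)
example_logger.propagate = False  # avoid duplicate output via the root logger

example_logger.info("step started", extra={"run_id": "run-123", "step": "example"})
```

Each emitted line is then a single JSON object, which log aggregators can index by field instead of parsing free text.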


## 2. Llama Stack Telemetry

Llama Stack provides built-in telemetry for tracking agent operations. The example below shows a telemetry section for `run.yaml` that exports to an OTLP endpoint:


In [None]:
# Example telemetry configuration for llama_stack/run.yaml:
telemetry_config = """
telemetry:
  - provider_id: otel
    provider_type: remote::opentelemetry
    config:
      service_name: myloware
      exporter_type: otlp
      otlp_endpoint: http://localhost:4317
"""
print("Telemetry configuration example:")
print(telemetry_config)


## 3. Trace Correlation

MyloWare correlates traces across the entire workflow:


In [None]:
# Each workflow run has a unique run_id that can be traced
print("Trace correlation:")
print("- run_id links all operations in a workflow")
print("- span_id identifies individual agent/tool calls")
print("- trace_id groups related runs for analysis")
print()
print("Use Jaeger UI at http://localhost:16686 to view traces")
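To make log lines easy to match against traces, the run identifier can be attached to every log record. The sketch below shows one way to do this with `contextvars` and a `logging.Filter`; `current_run_id`, `RunIdFilter`, and the logger name are hypothetical and do not reflect how MyloWare propagates `run_id` internally.

```python
import contextvars
import logging
import uuid

# Hypothetical context variable holding the active workflow run's ID.
current_run_id = contextvars.ContextVar("current_run_id", default="-")


class RunIdFilter(logging.Filter):
    """Copy the active run_id onto each record so the formatter can print it."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.run_id = current_run_id.get()
        return True  # never drop records, only annotate them


handler = logging.StreamHandler()
handler.setFormatter(
    logging.Formatter("%(asctime)s run_id=%(run_id)s %(name)s %(message)s")
)
handler.addFilter(RunIdFilter())

correlation_logger = logging.getLogger("correlation_example")
correlation_logger.addHandler(handler)
correlation_logger.setLevel(logging.INFO)
correlation_logger.propagate = False

# Set the run_id for the duration of one (simulated) workflow run.
token = current_run_id.set(uuid.uuid4().hex)
try:
    correlation_logger.info("workflow started")
    correlation_logger.info("workflow finished")
finally:
    current_run_id.reset(token)
```

Because `contextvars` values are scoped per task, concurrent `asyncio` workflows would each see their own `run_id` under this approach.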


## 4. Key Metrics

Important metrics to track for production:


In [None]:
print("Key metrics to track:")
metrics = [
    "workflow_completion_rate - % of runs that complete successfully",
    "step_duration_seconds - Time spent in each workflow step",
    "tool_invocation_count - Number of tool calls per run",
    "tool_success_rate - % of tool calls that succeed",
    "token_usage_total - Total tokens consumed per agent",
    "error_rate_by_step - Error rate broken down by step type",
    "external_api_latency - Response time from KIE.ai, Remotion, etc.",
]
for m in metrics:
    print(f"  • {m}")
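As an illustration of how two of these metrics could be derived, the sketch below computes `workflow_completion_rate` and `tool_success_rate` from a list of run summaries. The `RunRecord` shape and the sample values are made up for the example; real data would come from wherever run results are stored.

```python
from dataclasses import dataclass


@dataclass
class RunRecord:
    """Hypothetical per-run summary used only for this example."""
    run_id: str
    completed: bool
    tool_calls: int
    tool_failures: int


def workflow_completion_rate(runs: list[RunRecord]) -> float:
    """Percentage of runs that completed successfully."""
    if not runs:
        return 0.0
    return 100.0 * sum(r.completed for r in runs) / len(runs)


def tool_success_rate(runs: list[RunRecord]) -> float:
    """Percentage of tool calls that succeeded, across all runs."""
    total = sum(r.tool_calls for r in runs)
    if total == 0:
        return 0.0
    failures = sum(r.tool_failures for r in runs)
    return 100.0 * (total - failures) / total


# Made-up sample data.
sample_runs = [
    RunRecord("a", True, 4, 0),
    RunRecord("b", False, 6, 2),
    RunRecord("c", True, 10, 0),
    RunRecord("d", True, 0, 0),
]

print(f"workflow_completion_rate: {workflow_completion_rate(sample_runs):.1f}%")
print(f"tool_success_rate: {tool_success_rate(sample_runs):.1f}%")
```

Both functions return `0.0` for empty input instead of raising, which keeps dashboards from failing before any runs exist.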


## 5. Jaeger Setup

Start Jaeger for distributed tracing. Port 16686 serves the Jaeger UI and port 4317 accepts OTLP over gRPC:

```bash
docker run -d \
  -p 16686:16686 \
  -p 4317:4317 \
  jaegertracing/all-in-one:latest
```

Then configure Llama Stack telemetry to export to `localhost:4317` and open http://localhost:16686 to view traces.
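Before sending traces, it can help to confirm that the container's ports are actually reachable. The following is a small sketch using only the standard library; `port_is_open` is a hypothetical helper, and a successful TCP connection only shows that something is listening on the port, not that it is Jaeger.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


# Ports taken from the docker command above.
for name, port in (("Jaeger UI", 16686), ("OTLP gRPC", 4317)):
    status = "reachable" if port_is_open("localhost", port) else "not reachable"
    print(f"{name} on port {port}: {status}")
```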
