AI Agents Compose Stack — Multi-Agent Workflow with Built-in Observability


A Researcher → Writer sequential workflow built with Microsoft Agent Framework, fully containerized, with OpenTelemetry traces, metrics and logs shipped to a standalone Aspire Dashboard — all orchestrated by Docker Compose.

```
┌──────────────────────────────┐      OTLP gRPC      ┌──────────────────────────┐
│  workflow (.NET 8)           │  ────────────────▶  │  Aspire Dashboard        │
│  Researcher → Writer         │   traces + metrics  │  (localhost:18888 UI)    │
│  OpenTelemetry instrumented  │   + logs            │  mcr.microsoft.com/...   │
└──────────────────────────────┘                     └──────────────────────────┘
                   │                                             │
                   └───────── otel-net (bridge) ─────────────────┘
```

Part of a Docker-first series for Microsoft Agent Framework: agent-framework-devcontainer · mcp-docker-starter · ai-agents-compose-stack


Why this matters

Agent workflows in production fail silently without observability. A naive agent loop can:

  • Loop on tool calls,
  • Burn tokens,
  • Miss rate-limit errors,
  • Drift from the intended orchestration.

This stack shows the full observability loop — from a multi-agent workflow down to a dashboard — with zero external cloud dependencies. Bring it up with `docker compose up`, then watch every LLM call, span duration, token count, log line and custom activity in real time.


What you get

| Component | Image | Purpose |
| --- | --- | --- |
| `workflow` | built from the local `Dockerfile` (.NET 8 Alpine) | runs the multi-agent workflow |
| `aspire-dashboard` | `mcr.microsoft.com/dotnet/aspire-dashboard:9.0` | OTLP receiver + UI (traces/metrics/logs) |

Everything is joined by a private `otel-net` bridge network. Ports 18888 (UI) and 18889 (OTLP gRPC) are exposed on the host for convenience.
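The topology above can be written down as a minimal compose fragment (a sketch of the shape only; the service and network names match the stack, but the exact options in the repo's compose.yaml may differ):

```yaml
# Sketch of the two-service topology on a private bridge network.
networks:
  otel-net:
    driver: bridge            # private network joining both services

services:
  aspire-dashboard:
    image: mcr.microsoft.com/dotnet/aspire-dashboard:9.0
    ports:
      - "18888:18888"         # dashboard UI on the host
      - "18889:18889"         # OTLP gRPC receiver on the host
    networks: [otel-net]

  workflow:
    build: .                  # local multi-stage Dockerfile (.NET 8 Alpine)
    depends_on: [aspire-dashboard]
    networks: [otel-net]
```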


Requirements

  • Docker Desktop (or Docker Engine + Compose v2)
  • An Azure OpenAI resource with a chat deployment (e.g. gpt-4o-mini)

Pull from GHCR (skip the build)

The workflow image ships multi-arch (linux/amd64 + linux/arm64) with SBOM and provenance. Reference it in compose.yaml instead of building locally:

```yaml
services:
  aspire-dashboard:
    image: mcr.microsoft.com/dotnet/aspire-dashboard:9.0
    # ...unchanged

  workflow:
    image: ghcr.io/ppiova/ai-agents-compose-stack:latest
    env_file: [.env]
    environment:
      OTEL_EXPORTER_OTLP_ENDPOINT: "http://aspire-dashboard:18889"
    depends_on: [aspire-dashboard]
    networks: [otel-net]
```

See the published package at ghcr.io/ppiova/ai-agents-compose-stack.


Quickstart

```shell
git clone https://github.com/ppiova/ai-agents-compose-stack.git
cd ai-agents-compose-stack
cp .env.example .env
# edit .env with your Azure OpenAI values
docker compose up --build
```

Then open the dashboard:

👉 http://localhost:18888

You'll see:

  • Structured logs with scopes and levels from the workflow
  • Traces with spans for sequential-workflow, each agent run, and each Azure OpenAI call
  • Metrics — token counts, request durations, tool-call counts

Run on a custom topic

```shell
docker compose run --rm workflow "Cómo diseñar un pipeline de RAG escalable con Azure AI Search"
```
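To bake a default topic into the stack rather than passing one per run, the same argument can presumably be set as the service's command in compose.yaml (a sketch; it assumes the image's entrypoint treats its first argument as the topic, exactly as the `docker compose run` example suggests):

```yaml
services:
  workflow:
    # Hypothetical override: the first argument becomes the workflow topic.
    command: ["How to design a scalable RAG pipeline with Azure AI Search"]
```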

Tail the workflow logs only

```shell
docker compose logs -f workflow
```

How the telemetry is wired

src/Program.cs:

```csharp
using var tracerProvider = Sdk.CreateTracerProviderBuilder()
    .SetResourceBuilder(resource)
    .AddSource("AgentWorkflow")
    .AddSource("Microsoft.Agents.AI")
    .AddSource("Experimental.Microsoft.Extensions.AI")
    .AddOtlpExporter(o => o.Endpoint = new Uri(otlpEndpoint))
    .Build();

// Wrap the chat client so every LLM call gets spans + token metrics
IChatClient instrumentedChat = new ChatClientBuilder(baseChat)
    .UseOpenTelemetry(
        loggerFactory: loggerFactory,
        sourceName: "Experimental.Microsoft.Extensions.AI",
        configure: o => o.EnableSensitiveData = true)
    .Build();
```

Two activity sources are subscribed:

  • Microsoft.Agents.AI — spans for agent runs
  • Experimental.Microsoft.Extensions.AI — spans for individual IChatClient calls (includes prompt/response when sensitive data is enabled)

The OTLP gRPC endpoint defaults to http://aspire-dashboard:18889 — Docker Compose service discovery does the rest.

⚠️ `EnableSensitiveData = true` ships prompts and completions to the dashboard. Great for local debugging — do not enable it in production unless you control access to the downstream telemetry storage.


The workflow

Two specialized agents chained sequentially:

| Agent | Role |
| --- | --- |
| Researcher | produces 3–5 concrete bullets about a topic (in Spanish) |
| Writer | turns those bullets into a technical note of at most 120 words |

A root activity `sequential-workflow` wraps each topic so you can see the complete span tree in the dashboard: topic → researcher → OpenAI call → writer → OpenAI call.


Project layout

```
.
├── .devcontainer/
│   └── devcontainer.json            # Codespaces/VS Code ready, forwards 18888/18889
├── src/
│   ├── AgentWorkflow.csproj         # .NET 8 + Microsoft.Agents.AI + OpenTelemetry
│   └── Program.cs                   # OTel setup + Researcher → Writer workflow
├── compose.yaml                     # workflow + aspire-dashboard on otel-net
├── Dockerfile                       # multi-stage, Alpine runtime, non-root
├── .dockerignore
├── .env.example
├── .gitignore
├── LICENSE
└── README.md
```

Extend this stack

  • Add a persistence service (Redis / Postgres) for agent memory across runs
  • Replace Aspire Dashboard with Jaeger or Grafana Tempo + Loki — same OTLP wire, different receiver
  • Scale workflow with deploy.replicas: N to simulate concurrent agent runs and compare traces
  • Add Prometheus to scrape metrics from OTLP collector alongside the dashboard
  • Swap the workflow for a group chat or handoff pattern — the observability wiring is the same
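The scaling idea from the list above can be sketched as a compose override file (assumptions: the replica count is arbitrary, and your Compose v2 version honors `deploy.replicas` with a plain `docker compose up`):

```yaml
# compose.override.yaml: run several workflow containers in parallel
# so their traces can be compared side by side in the dashboard.
services:
  workflow:
    deploy:
      replicas: 3   # arbitrary count for the experiment
```

Each replica ships its telemetry to the same dashboard endpoint, so one UI shows all the concurrent runs.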

Auth modes

Same pattern as the rest of the series:

  1. AZURE_OPENAI_API_KEY if set → key auth (simplest inside Docker)
  2. Otherwise → AzureCliCredential (Dev Container after az login)
  3. Otherwise → DefaultAzureCredential (Managed Identity in production)

License

MIT — by Pablo Piovano · Microsoft MVP in AI.
