Forge v0.14.0 — End-to-end OpenTelemetry distributed tracing for AI agents

Published: 2026-06-09 · Tag: v0.14.0 · Previous release: v0.13.0 (2026-06-07)

Forge v0.14.0 ships OpenTelemetry Tracing v1: end-to-end distributed tracing for AI agents covering A2A dispatch, the executor loop, every LLM completion, every tool call, and every outbound HTTP request. Traces export over OTLP to Tempo, Jaeger, Honeycomb, Datadog, Grafana Cloud, or any other OTLP-compatible backend. Multi-hop A2A flows display as one connected trace. Audit events carry the active span's trace_id and span_id, so operators can pivot from a compliance row to a performance trace with a single copy-paste. The OTLP collector hostname is auto-injected into the build's egress allowlist, so Kubernetes deployments need no second NetworkPolicy edit.

This release closes initiative #108 across seven phases (one PR per phase, #122–#128), plus a bug-2 fix for A2A tasks/send message validation and the docs sync across the operator-facing surface.

45 files changed, +5,494 / −42 lines, 8 issues closed, 10 PRs merged since v0.13.0. Full per-PR detail in CHANGELOG.md.

OpenTelemetry tracing — what it covers

When observability.tracing.enabled: true is set in forge.yaml, Forge produces one OTLP span tree per inbound A2A request:

a2a.<method>                          [SpanKindServer; dispatcher]
└── agent.execute                     [outer loop; root for the task]
    ├── llm.completion (× N turns)    [per LLM provider call — Anthropic / OpenAI / Ollama]
    │   └── http.client (× outbound)  [auto via otelhttp on egress transport]
    └── tool.<tool_name> (× M calls)  [per tool invocation]
        └── http.client (if HTTP)

Every llm.completion carries OpenTelemetry GenAI semantic-convention attributes: gen_ai.system, gen_ai.request.model, gen_ai.response.model, gen_ai.usage.input_tokens, gen_ai.usage.output_tokens, gen_ai.response.finish_reasons. Because these are the standard attribute names, cost-attribution dashboards in any OTLP backend (Tempo + Grafana, Jaeger, Honeycomb, Datadog APM, Grafana Cloud) can query them without Forge-specific mapping.

Forge-specific attributes use the forge.* namespace: forge.task.id, forge.task.final_state (completed / failed / canceled), forge.tool.name, forge.workflow.id / .stage.id / .step.id (from FWS-2 X-Workflow-* headers), forge.correlation_id, forge.loop.iteration.

How to enable

In forge.yaml:

observability:
  tracing:
    enabled: true
    endpoint: https://otel-collector.monitoring.svc.cluster.local:4318/v1/traces
    sampler: parentbased_always_on

Or via standard OTEL_* environment variables:

export OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=http://localhost:4318/v1/traces
export OTEL_TRACES_SAMPLER=always_on
forge run --otel-enabled

Or per-command via nine new --otel-* CLI flags on forge run and forge serve start:

forge run --otel-enabled \
  --otel-endpoint http://localhost:4318/v1/traces \
  --otel-sampler always_on \
  --otel-service-name my-agent-staging

Resolution precedence: CLI flags > OTEL_* env vars > observability.tracing in forge.yaml > defaults. A set-but-empty env var does not wipe a non-empty yaml value (absence-of-value is "no override," not "unset").
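The precedence rule above can be sketched as a first-non-empty lookup. This is an illustration only, not Forge's resolver: resolveEndpoint and firstNonEmpty are hypothetical names, and the sketch assumes each layer has already been read into a plain string, so an unset value and a set-but-empty value both arrive as "".

```go
package main

import "fmt"

// firstNonEmpty returns the first non-empty candidate, so an empty value
// at a higher-precedence layer means "no override" rather than "unset".
func firstNonEmpty(candidates ...string) string {
	for _, c := range candidates {
		if c != "" {
			return c
		}
	}
	return ""
}

// resolveEndpoint is a hypothetical helper, not Forge's actual resolver.
// Precedence: CLI flag > OTEL_* env var > forge.yaml > default.
func resolveEndpoint(flag, env, yaml, def string) string {
	return firstNonEmpty(flag, env, yaml, def)
}

func main() {
	yaml := "https://otel-collector.monitoring.svc.cluster.local:4318/v1/traces"

	// The env var is set but empty: the yaml value survives.
	fmt.Println(resolveEndpoint("", "", yaml, ""))

	// A CLI flag beats both the env var and the yaml value.
	fmt.Println(resolveEndpoint("http://localhost:4318/v1/traces", "", yaml, ""))
}
```

Treating the empty string as "no override" at every layer is what keeps an exported-but-blank OTEL_* variable from silently disabling a working yaml configuration.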

All 10 standard OTEL_* SDK env vars are honored: OTEL_SDK_DISABLED, OTEL_EXPORTER_OTLP_TRACES_ENDPOINT, OTEL_EXPORTER_OTLP_ENDPOINT, OTEL_EXPORTER_OTLP_PROTOCOL, OTEL_EXPORTER_OTLP_HEADERS, OTEL_EXPORTER_OTLP_TIMEOUT, OTEL_SERVICE_NAME, OTEL_RESOURCE_ATTRIBUTES, OTEL_TRACES_SAMPLER, OTEL_TRACES_SAMPLER_ARG.

End-to-end propagation across multi-hop A2A flows

Forge installs the W3C tracecontext + baggage composite propagator on the OpenTelemetry global at startup. The JSON-RPC dispatcher extracts inbound traceparent headers before opening its own span, and outbound HTTP through the egress-enforced transport re-injects traceparent automatically via otelhttp. Multi-hop flows — orchestrator → agent A → agent B → external API — show as one connected trace in the backend:

orchestrator
    │  traceparent: 00-T-S1-01
    ▼
┌───────────────┐
│  forge agent  │  span_id=S2, parent=S1   (a2a.tasks/send)
│      ▼        │  span_id=S3, parent=S2   (agent.execute)
│      ▼        │  span_id=S4, parent=S3   (llm.completion)
└──────│────────┘
       ▼  traceparent: 00-T-S2-01  ← otelhttp re-injects on outbound
                                      via the egress-enforced transport
┌───────────────┐
│  downstream   │  span_id=S5, parent=S2   (a2a.tasks/send)
│  forge agent  │
└───────────────┘

All spans share trace_id = T and chain by parent_span_id. The operator sees one flame graph across every hop.

baggage (the other half of the composite propagator) flows through to the handler context so application-level identifiers — tenant_id, user_id, A/B-test bucket — travel with the trace.
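To make the traceparent hand-off in this section concrete, here is a minimal standard-library sketch of parsing a W3C traceparent header and building the header for the next hop. Forge itself relies on the OpenTelemetry propagator and otelhttp for this; TraceContext, parseTraceparent, and childHeader are hypothetical names, and the sketch accepts only version 00.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// TraceContext holds the fields of a version-00 W3C traceparent header.
type TraceContext struct {
	TraceID string // 32 lowercase hex chars (128-bit)
	SpanID  string // 16 lowercase hex chars (64-bit)
	Flags   string // 2 lowercase hex chars
}

// isLowerHex reports whether s is exactly n lowercase hex characters.
func isLowerHex(s string, n int) bool {
	if len(s) != n {
		return false
	}
	for _, r := range s {
		if !(r >= '0' && r <= '9') && !(r >= 'a' && r <= 'f') {
			return false
		}
	}
	return true
}

// parseTraceparent validates "00-<trace_id>-<span_id>-<flags>".
func parseTraceparent(h string) (TraceContext, error) {
	parts := strings.Split(h, "-")
	if len(parts) != 4 || parts[0] != "00" {
		return TraceContext{}, errors.New("unsupported or malformed traceparent")
	}
	tc := TraceContext{TraceID: parts[1], SpanID: parts[2], Flags: parts[3]}
	if !isLowerHex(tc.TraceID, 32) || !isLowerHex(tc.SpanID, 16) || !isLowerHex(tc.Flags, 2) {
		return TraceContext{}, errors.New("traceparent fields must be lowercase hex")
	}
	if tc.TraceID == strings.Repeat("0", 32) || tc.SpanID == strings.Repeat("0", 16) {
		return TraceContext{}, errors.New("all-zero trace_id or span_id is invalid")
	}
	return tc, nil
}

// childHeader keeps the inbound trace_id and swaps in the caller's own
// span_id, which the next hop records as its parent.
func childHeader(in TraceContext, ownSpanID string) string {
	return fmt.Sprintf("00-%s-%s-%s", in.TraceID, ownSpanID, in.Flags)
}

func main() {
	in, err := parseTraceparent("00-4a8f95a0e1bedda42c9dd5350fb3b33a-ad8b2c91e44f0a72-01")
	if err != nil {
		fmt.Println("ignoring inbound trace context:", err)
		return
	}
	fmt.Println(childHeader(in, "00f067aa0ba902b7"))
}
```

The trace_id is carried through unchanged on every hop, which is why all spans end up in one trace; only the span_id segment changes.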

Audit ↔ trace cross-link

Forge has shipped a structured audit pipeline since FWS-1 (NDJSON to stderr + opt-in Unix Domain Socket sink). As of v0.14.0, every audit event emitted from a request-scoped context carries the active span's trace_id and span_id:

{
  "event": "llm_call",
  "task_id": "t-1234",
  "trace_id": "4a8f95a0e1bedda42c9dd5350fb3b33a",
  "span_id":  "ad8b2c91e44f0a72",
  "model": "claude-sonnet-4",
  "input_tokens": 1240,
  "output_tokens": 187,
  "duration_ms": 2310
}

Two new pivot directions for operators:

Audit row → trace: paste the trace_id into Tempo / Jaeger / Honeycomb to land on the matching trace; paste the span_id to jump directly to the llm.completion child carrying the matching gen_ai.usage.* tokens.
Trace → audit row: copy the trace_id from a trace browser, grep the audit log for the corresponding row, and read the FWS-8 payload metadata that the trace doesn't carry.

Format: lowercase hex matching W3C traceparent semantics — 32-char (128-bit) trace_id, 16-char (64-bit) span_id.

Backward compatible by construction: both fields use omitempty. When tracing is disabled (the default), audit JSON is byte-identical to pre-v0.14.0. The audit schema version ("1.0") is not bumped — the documented policy treats adding optional fields as schema-compatible.
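The omitempty behavior can be illustrated with a trimmed struct. AuditEvent below is a hypothetical stand-in with only four fields, not Forge's actual audit type; the point is that empty trace fields vanish from the encoded JSON rather than appearing as empty strings.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// AuditEvent is a trimmed, hypothetical stand-in for Forge's audit record.
type AuditEvent struct {
	Event   string `json:"event"`
	TaskID  string `json:"task_id"`
	TraceID string `json:"trace_id,omitempty"` // omitted from JSON when empty
	SpanID  string `json:"span_id,omitempty"`  // omitted from JSON when empty
}

// encode marshals the event to a JSON string.
func encode(e AuditEvent) string {
	b, err := json.Marshal(e)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	// Tracing disabled: no trace fields in the output.
	fmt.Println(encode(AuditEvent{Event: "llm_call", TaskID: "t-1234"}))

	// Tracing enabled: both IDs are stamped.
	fmt.Println(encode(AuditEvent{
		Event:   "llm_call",
		TaskID:  "t-1234",
		TraceID: "4a8f95a0e1bedda42c9dd5350fb3b33a",
		SpanID:  "ad8b2c91e44f0a72",
	}))
}
```

With tracing off, the first call prints {"event":"llm_call","task_id":"t-1234"}, the same bytes a struct without the two new fields would produce.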

Egress-enforced OTLP transport + build-time auto-merge

The OTLP HTTP exporter rides through the same egress enforcer every other in-process Forge HTTP client uses. The operator's egress allowlist bounds where Forge can send spans — a misconfigured collector URL cannot exfiltrate span content to an unapproved destination. The post-DNS IP guard applies to OTLP too.

No second egress edit needed. forge package and forge run both extract the OTLP endpoint hostname and auto-inject it into egress_allowlist.json. The Kubernetes NetworkPolicy generated by forge package therefore admits OTLP traffic automatically:

# forge.yaml — this is sufficient. The collector is added to the allowlist automatically.
egress:
  mode: allowlist
  allowed_domains:
    - api.anthropic.com    # operator-declared
observability:
  tracing:
    enabled: true
    endpoint: https://otel-collector.monitoring.svc.cluster.local:4318/v1/traces

# generated egress_allowlist.json:
# all_domains = [api.anthropic.com, otel-collector.monitoring.svc.cluster.local]

Disabled tracing produces no allowlist entry — turning tracing off in yaml does not leave a stale entry punched through the NetworkPolicy.
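A rough sketch of the auto-merge, assuming it amounts to extracting the hostname from the endpoint URL and appending it to the declared domains if it is not already there. otlpHost and mergeAllowlist are hypothetical names; the release notes do not show Forge's implementation.

```go
package main

import (
	"fmt"
	"net/url"
)

// otlpHost returns the collector hostname, or "" when tracing is disabled,
// the endpoint is empty, or the endpoint cannot be parsed.
func otlpHost(enabled bool, endpoint string) string {
	if !enabled || endpoint == "" {
		return ""
	}
	u, err := url.Parse(endpoint)
	if err != nil {
		return ""
	}
	return u.Hostname() // hostname only, without the port
}

// mergeAllowlist returns a copy of declared with host appended once.
// An empty host leaves the list unchanged, so disabled tracing adds nothing.
func mergeAllowlist(declared []string, host string) []string {
	out := append([]string(nil), declared...)
	if host == "" {
		return out
	}
	for _, d := range out {
		if d == host {
			return out
		}
	}
	return append(out, host)
}

func main() {
	declared := []string{"api.anthropic.com"}
	endpoint := "https://otel-collector.monitoring.svc.cluster.local:4318/v1/traces"

	fmt.Println(mergeAllowlist(declared, otlpHost(true, endpoint)))
	fmt.Println(mergeAllowlist(declared, otlpHost(false, endpoint)))
}
```

Deriving the entry from the tracing config on every build, instead of persisting it, is what prevents a stale allowlist entry once tracing is turned off.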

Tracing-off safety: nothing crashes the agent

Tracing is off by default, per the initiative ruling. When it is not configured, enabled is false, or the endpoint is empty:

coreruntime.Tracer() returns a non-recording no-op tracer; spans cost near zero (one interface dispatch).
EmitFromContext does not stamp trace_id / span_id; audit JSON is byte-identical to pre-Phase-4.
OTelDomain returns nil; no entry in egress_allowlist.json.
observability.WrapHTTPTransport is a near pass-through.

Telemetry failures never crash the agent. A misconfigured endpoint, a malformed traceparent header, an unreachable collector — every failure mode falls through to the no-op tracer with a warning in the ops log. The CLI resolver is the single place that fails loudly on bad config at startup.

Also in this release

A2A message validation at tasks/send entry points (#119 / PR #121). The most common failure was a client sending the pre-A2A-0.3.0 "type": "text" discriminator instead of the spec-correct "kind": "text". encoding/json silently dropped the unknown field, and the executor then produced a confused reply ("It looks like your message didn't come through"). Such messages are now rejected at the dispatcher with a clear InvalidParams error naming the spec divergence: "parts[0]: part kind is required (A2A 0.3.0); got empty kind with non-empty content — did you send \"type\" instead of \"kind\"? \"type\" is from the pre-0.3.0 dialect and is silently ignored by the decoder." Sentinel errors (ErrPartKindMissing, ErrPartKindUnknown, ErrMessageRoleMissing, ErrMessagePartsEmpty) are exposed for callers that branch on the cause.
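The failure mode is easy to reproduce with encoding/json alone. In the sketch below, Part and checkPart are simplified, hypothetical stand-ins, and ErrPartKindMissing is a local variable that only borrows the name of the sentinel listed above; Forge's real types and error text may differ.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// ErrPartKindMissing is a local stand-in for the sentinel of the same name.
var ErrPartKindMissing = errors.New("part kind is required (A2A 0.3.0)")

// Part is a simplified message part using the 0.3.0 "kind" discriminator.
type Part struct {
	Kind string `json:"kind"`
	Text string `json:"text"`
}

// checkPart decodes one part and rejects a missing kind.
func checkPart(raw string) error {
	var p Part
	if err := json.Unmarshal([]byte(raw), &p); err != nil {
		return err
	}
	// encoding/json ignores the unknown "type" key, so Kind stays empty
	// even though the client believed it had sent a discriminator.
	if p.Kind == "" {
		return ErrPartKindMissing
	}
	return nil
}

func main() {
	// Pre-0.3.0 dialect: decodes without error, then fails validation.
	fmt.Println(checkPart(`{"type":"text","text":"hello"}`))

	// Spec-correct 0.3.0 part: passes.
	fmt.Println(checkPart(`{"kind":"text","text":"hello"}`))
}
```

Validating after decode, rather than relying on the decoder to complain, is what turns the silent field drop into an explicit error the client can act on.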

FWS-3 X-Forge-* response headers stamped on JSON-RPC tasks/send (PR #120). Closes a bug where the workflow correlation and usage headers were present on the REST endpoint but missing from the JSON-RPC tasks/send path.

Documentation sync across the operator-facing surface (PR #129). New comprehensive reference at docs/core-concepts/observability-tracing.md. Updates to forge.yaml schema, CLI reference, environment-variables reference, audit-logging schema, egress-control, and deployment monitoring. The .claude/skills/forge.md knowledge skill includes a new § 12.9 Observability subsection covering all seven phases.

Per-phase issue + PR mapping

| Phase | Issue | PR | What shipped |
|---|---|---|---|
| 0 | #101 | #122 | Tracer seam in `forge-core/runtime/tracing.go` (no-op default) + composite W3C `tracecontext + baggage` propagator installed on OTel global |
| 1 | #102 | #123 | Real OTLP SDK tracer provider in `forge-core/observability`: HTTP/protobuf + gRPC, batch processor, semconv 1.26.0 resource |
| 2 | #103 | #124 | Config resolver (yaml < env < flags), nine `--otel-*` CLI flags, runner wiring |
| 3 | #104 | #125 | Span instrumentation across A2A dispatcher, executor loop, LLM provider calls, tool invocations, outbound HTTP |
| 4 | #105 | #126 | Audit ↔ trace cross-link: `trace_id` + `span_id` on every audit event via `EmitFromContext` |
| 5 | #106 | #127 | Inbound `traceparent` extraction at the JSON-RPC dispatcher boundary, so multi-hop A2A flows form one trace |
| 6 | #107 | #128 | Build-time + runtime egress auto-merge of the OTLP collector hostname into `egress_allowlist.json` |

Upgrade guide

Backward-compatible. Default behavior is unchanged: tracing is off unless observability.tracing.enabled: true is set in forge.yaml, OTEL_SDK_DISABLED=false is set in env with an endpoint, or --otel-enabled is passed on the command line.

If you have an OTel collector deployed, the minimum config is:

# forge.yaml
observability:
  tracing:
    enabled: true
    endpoint: <YOUR_OTLP_HTTP_ENDPOINT>

Then forge build && forge package && kubectl apply -f ... — the generated NetworkPolicy will admit OTLP traffic automatically. No second egress edit needed. No code changes required to existing agents.

If your collector listens on localhost (e.g. via kubectl port-forward) during local development, add egress.allow_private_ips: true to forge.yaml so the egress enforcer's post-DNS IP guard admits the loopback address.

Documentation

Observability — Tracing — full reference (schema, CLI flags, env, span hierarchy, propagation, audit cross-link, egress)
forge.yaml schema — observability.tracing section
CLI reference — all --otel-* flags
Environment variables — OTEL_* env vars Forge honors
Audit logging — trace cross-link semantics
Egress control — OTel collector auto-extension
Deployment monitoring — operator-level integration
CHANGELOG.md — full per-PR detail

Installation

Download a pre-built binary from the assets attached below (forge-Darwin-arm64.tar.gz, forge-Darwin-x86_64.tar.gz, forge-Linux-arm64.tar.gz, forge-Linux-x86_64.tar.gz, forge-Windows-arm64.zip, forge-Windows-x86_64.zip). Verify with checksums.txt.

Container image: ghcr.io/initializ/forge:v0.14.0.

Homebrew: brew install initializ/tap/forge.

From source:

git clone --branch v0.14.0 https://github.com/initializ/forge
cd forge
go install ./forge-cli/cmd/forge

Compatibility

Go: 1.25+
A2A: 0.3.0
OpenTelemetry SDK: 1.44.0 (semconv 1.26.0)
OTLP: HTTP/protobuf (recommended) and gRPC, against any OTLP-compatible backend including Grafana Tempo, Jaeger, Honeycomb, Datadog APM, Grafana Cloud, New Relic, and the OpenTelemetry Collector

Acknowledgements

OpenTelemetry Tracing v1 was shipped as a strict 7-PR plan across two days, one PR per phase. Implementation followed an audit-pipeline mirror pattern: every design decision that produced a usable audit signal in earlier work (FWS-1 through FWS-10) was applied to the new tracing pipeline. Failure-domain isolation, backward-compatible schema additions, off-by-default posture, and the egress-enforced transport contract carry forward verbatim.
