v0.13.0
Forge v0.13.0 — A2A 0.3.0 conformance, audit pipeline hardening, and configurable platform policy
Published: 2026-06-07
Forge v0.13.0 is a feature release covering every commit merged to main
since v0.12.0 on 2026-05-25. 119 files changed, +16,324 / −566 lines,
13 issues closed, 10 numbered workstreams (FWS-1 through FWS-10)
plus the new Claude knowledge skills.
If you operate Forge in production, this release lands the
verifiable version of the security and observability claims the
project has made since v0.10: an A2A 0.3.0 spec-conformant Agent Card,
a documented schema-versioned audit contract with monotonic sequence
numbers and a metadata-only invariant pinned by regression tests, a
dedicated Unix Domain Socket sink for audit export, a three-layer
platform policy (system / user / workspace) with first-match-wins
attribution, configurable per-IP rate limiting with sensible
orchestration-friendly defaults, and a one-line stream split so SIEM
pipelines can route ops logs and audit NDJSON without parsing
payloads.
The full per-PR detail lives in CHANGELOG.md.
Highlights
- A2A 0.3.0 spec-conformant Agent Card at the canonical
/.well-known/agent-card.json path. Every required field populated
from forge.yaml and SKILL.md. Legacy /.well-known/agent.json
alias preserved with a Deprecation: true header per RFC 8594.
- Hardened audit emission: every event carries
schema_version: "1.0" and a monotonic per-invocation seq so
consumers detect gaps and reordering by grouping
(correlation_id, task_id). The default audit posture is
metadata-only (token counts, sizes, durations; never raw prompt or
completion text or tool args/results), and a TestNoPayloadByDefault_LLMCall
regression test pins the invariant.
- Audit event export via Unix Domain Socket sink (preferred) or
localhost HTTP fallback. Fire-and-forget, 50ms per-event timeout,
exponential reconnect backoff, periodic audit_export_status event
every 60s carrying per-sink health.
- Three-layer platform policy. The single-layer
FORGE_PLATFORM_POLICY env from v0.12 is now joined by a system
layer (/etc/forge/policy.yaml, sysadmin) and a user layer
(~/.forge/policy.yaml, developer). Deny lists union; max-bounds
use smallest non-zero; audit attribution credits the first layer
to deny.
- Channel deny is first-class. A new denied_channels field in
the platform policy schema skips named adapters at startup with a
channel_denied_by_policy audit event. forge channel disable /
enable edit the user layer; --system retargets to the system
layer.
- Per-IP rate limit is configurable end-to-end via
server.rate_limit: in forge.yaml, matching --rate-limit-*
CLI flags, and FORGE_RATE_LIMIT_* env vars. Resolution order:
CLI > env > yaml > defaults. Defaults bumped for orchestrated
workloads: write side now 60/min sustained with burst 20 (was
10/min, burst 3). tasks/cancel is exempt from the write bucket
by default so a cost-ceiling cancel burst never throttles the
recovery mechanism.
- Workflow correlation: X-Workflow-ID, X-Workflow-Stage-ID,
X-Workflow-Step-ID, and X-Invocation-Caller headers are extracted
on every request and stamped on every audit event for multi-agent
flow tracing.
- Token usage in audit and response headers. Every llm_call
event carries input_tokens / output_tokens / duration_ms /
model / provider; every A2A response carries
X-Forge-Tokens-In/Out, X-Forge-Duration-Ms, X-Forge-Model,
X-Forge-Provider for cost-enforcement pipelines.
- tasks/cancel actually cancels. The JSON-RPC tasks/cancel
method now signals an in-flight invocation via
context.CancelCauseFunc with a typed reason. The agent loop
honors cancellation at iteration and tool-call boundaries. An
invocation_cancelled audit event closes the cancelled invocation
with the reason + partial token totals.
- Ops logs to stdout, audit to stderr. Container log collectors
and SIEM pipelines can now split the two at the stream level,
with no payload parsing.
- New Claude-side reference: .claude/skills/forge.md ships a
single self-contained knowledge file for Forge, and
.claude/skills/forge-skill-builder.md ports the Web UI Skill
Builder's prompt for use in any Claude chat. The sync-docs
workflow keeps both current with the code.
What's new
1. A2A 0.3.0 Agent Card conformance (FWS-1, #85)
Forge's Agent Card now conforms to the A2A 0.3.0 spec end-to-end:
canonical path, required-field shape, security schemes derived from
the configured auth chain, and SKILL.md skills bridged into the
published card at both forge build and forge run time.
| What | Where |
|---|---|
| Canonical path | GET /.well-known/agent-card.json |
| Legacy alias | GET /.well-known/agent.json — same body + Deprecation: true + Link: rel=successor-version per RFC 8594. Removable one release after this ships. |
| Required fields | name, description, url, version (from forge.yaml), protocolVersion: "0.3.0", defaultInputModes, defaultOutputModes, skills[], capabilities, securitySchemes, security |
| Skills mapping | SKILL.md frontmatter (name, description, category, tags) → A2A AgentSkill objects with deterministic ID-sorted output so card bytes are stable across restarts |
| Security schemes | Derived from auth.providers — oidc/azure_ad → openIdConnect with discovery URL; static_token/http_verifier → http+bearer; gcp_iap → apiKey in header; aws_sigv4 → http+bearer with bearerFormat: "forge-aws-v1" |
| New audit event | agent_card_published at startup + hot-reload, carrying name, version, protocol_version, url, skill_count, capabilities, security_schemes, card_size_bytes, card_sha256 — consumers detect config drift across deploys |
See docs/reference/a2a-agent-card.md.
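A consumer that wants to confirm the card shape before trusting it could run a check along these lines. This is a minimal sketch, not Forge code: the field list is copied from the table above, while the function names and the sample card values are hypothetical.

```python
# Required top-level fields, as listed in the "Required fields" table row.
REQUIRED_FIELDS = (
    "name", "description", "url", "version", "protocolVersion",
    "defaultInputModes", "defaultOutputModes", "skills",
    "capabilities", "securitySchemes", "security",
)


def missing_card_fields(card):
    """Return the required Agent Card fields that are absent from `card`."""
    return [field for field in REQUIRED_FIELDS if field not in card]


def is_conformant(card):
    """True when every required field is present and protocolVersion is 0.3.0."""
    return not missing_card_fields(card) and card.get("protocolVersion") == "0.3.0"


# Hypothetical card body, e.g. parsed from GET /.well-known/agent-card.json.
sample_card = {
    "name": "demo-agent",
    "description": "Illustrative agent",
    "url": "https://agent.example.test",
    "version": "1.2.0",
    "protocolVersion": "0.3.0",
    "defaultInputModes": ["text/plain"],
    "defaultOutputModes": ["text/plain"],
    "skills": [],
    "capabilities": {},
    "securitySchemes": {},
    "security": [],
}

print(is_conformant(sample_card))
```

The check only covers field presence and the protocol version; it does not validate the nested shapes of skills or securitySchemes.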
2. Workflow correlation ID threading (FWS-2, #86)
Orchestrators (initializ Command, custom controllers, GitOps tooling)
that fire multi-agent workflows now get end-to-end audit traceability
without any per-agent code change. Inbound requests carrying
X-Workflow-ID / X-Workflow-Stage-ID / X-Workflow-Step-ID /
X-Invocation-Caller headers have those values extracted into the
request context, threaded through the agent loop, and stamped on
every audit event the invocation produces — invocation_complete,
llm_call, tool_exec, etc.
The emitted JSON is byte-for-byte identical to the pre-FWS-2 shape
for direct (non-orchestrated) calls, so consumers reading the v0.12
audit stream continue to work unchanged.
See docs/security/workflow-correlation.md.
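On the orchestrator side, the four headers can be assembled with a small helper before each A2A call. The sketch below is illustrative only: the helper name and the identifier values are hypothetical, and dropping unset headers is a choice made for this example rather than a documented Forge requirement.

```python
def workflow_headers(workflow_id, stage_id=None, step_id=None, caller=None):
    """Build the workflow correlation headers; unset values are left out."""
    candidates = {
        "X-Workflow-ID": workflow_id,
        "X-Workflow-Stage-ID": stage_id,
        "X-Workflow-Step-ID": step_id,
        "X-Invocation-Caller": caller,
    }
    return {name: value for name, value in candidates.items() if value}


# Hypothetical identifiers for one step of a multi-agent workflow.
headers = workflow_headers(
    "wf-2041", stage_id="stage-2", step_id="step-7", caller="orchestrator"
)
print(headers)
```

Passing the resulting dictionary as request headers to any HTTP client is enough; no per-agent code change is involved.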
3. Token usage + execution duration in audit (FWS-3, #87)
Every llm_call audit event now carries input_tokens,
output_tokens, model, provider, duration_ms, and request_id
captured directly from provider response metadata (Anthropic, OpenAI,
Ollama via the OpenAI-compatible path, OpenAI Responses). Field naming
aligns with OTel GenAI semantic conventions
(gen_ai.usage.input_tokens / gen_ai.usage.output_tokens) so audit
consumers correlate to OTel traces without a translation table.
When a provider returns no usage (some self-hosted Ollama setups),
the event flags tokens_unavailable: true rather than silent zeros.
Each tool_exec event gains duration_ms plus structured arg-shape
metadata (args_size, result_size). Raw arg values are not emitted
(that's FWS-8's payload-stripping invariant, not FWS-3's).
A new invocation_complete audit event closes every A2A invocation
with the wall-clock duration and aggregated input_tokens_total /
output_tokens_total / llm_call_count. Response headers
X-Forge-Tokens-In, X-Forge-Tokens-Out, X-Forge-Duration-Ms,
X-Forge-Model, X-Forge-Provider carry the same totals so
cost-enforcement layers act without parsing the body.
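A cost-tracking consumer might total the llm_call events of one invocation roughly as follows. This is a sketch under stated assumptions: the token field names come from the text above, but the "event" key used to identify the event type and the sample values are assumptions, not the documented schema.

```python
import json


def summarize_llm_calls(ndjson_lines):
    """Aggregate token usage across the llm_call events of one invocation."""
    summary = {
        "input_tokens_total": 0,
        "output_tokens_total": 0,
        "llm_call_count": 0,
        "calls_without_usage": 0,
    }
    for line in ndjson_lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        # Assumption: the event type is stored under an "event" key.
        if event.get("event") != "llm_call":
            continue
        summary["llm_call_count"] += 1
        if event.get("tokens_unavailable"):
            # The provider returned no usage; count the call, add no tokens.
            summary["calls_without_usage"] += 1
            continue
        summary["input_tokens_total"] += event.get("input_tokens", 0)
        summary["output_tokens_total"] += event.get("output_tokens", 0)
    return summary


sample_stream = [
    '{"event": "llm_call", "input_tokens": 120, "output_tokens": 30}',
    '{"event": "tool_exec", "duration_ms": 12, "args_size": 64}',
    '{"event": "llm_call", "tokens_unavailable": true}',
    '{"event": "llm_call", "input_tokens": 200, "output_tokens": 45}',
]
print(summarize_llm_calls(sample_stream))
```

Tracking calls without usage separately keeps a tokens_unavailable event from being mistaken for a free call.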
4. Cancellation signal handling (FWS-4, #88)
The A2A tasks/cancel JSON-RPC method now actually cancels in-flight
invocations instead of merely flipping the stored task state. A
per-Runner CancellationRegistry tracks every active invocation by
task ID; the cancel handler signals the registered
context.CancelCauseFunc with a typed reason
(workflow_failure / cost_limit_exceeded / timeout /
external_signal).
The agent loop honors cancellation at the iteration boundary and
between tool calls within an iteration, so cancellation latency is
bounded by the current LLM call or tool exec.
A new invocation_cancelled audit event closes every cancelled
invocation with the classified reason, duration_ms up to
cancellation, and partial token totals consumed before the signal.
The A2A response carries state canceled plus a cancelled: <reason>
message so the orchestrator can react. Cancel-after-complete is
idempotent.
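The boundary-checking behavior described above can be pictured with a small Python analog. Forge itself uses Go's context.CancelCauseFunc; the CancelToken class, the run_agent_loop function, and the step names below are hypothetical stand-ins used only to show where the checks sit and why latency is bounded by the step in progress.

```python
import threading


class CancelToken:
    """First cancel wins, loosely mirroring a cancel-with-cause context."""

    def __init__(self):
        self._lock = threading.Lock()
        self._cancelled = False
        self.reason = None

    def cancel(self, reason):
        with self._lock:
            if not self._cancelled:
                self._cancelled = True
                self.reason = reason

    @property
    def cancelled(self):
        with self._lock:
            return self._cancelled


def run_agent_loop(token, iterations, tools_per_iteration, do_work):
    """Check the token at each iteration boundary and between tool calls."""
    completed = []
    for i in range(iterations):
        if token.cancelled:  # iteration boundary
            return "canceled", token.reason, completed
        do_work(f"llm_call:{i}")
        completed.append(f"llm_call:{i}")
        for t in range(tools_per_iteration):
            if token.cancelled:  # between tool calls
                return "canceled", token.reason, completed
            do_work(f"tool_exec:{i}.{t}")
            completed.append(f"tool_exec:{i}.{t}")
    return "completed", None, completed
```

A step that is already running is allowed to finish; the loop stops at the next check, which is the sense in which latency is bounded by the current LLM call or tool exec.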
5. Three-layer platform policy + channel scope (FWS-5 / FWS-6, #89 + #90)
The single-layer FORGE_PLATFORM_POLICY env from v0.12.0 is now
joined by two more layers so different operators with different
trust contexts can each express their bounds independently.
| Layer | Path | Set by |
|---|---|---|
| system | /etc/forge/policy.yaml (override FORGE_SYSTEM_POLICY) | Sysadmin (MDM, corporate image, Ansible) |
| user | ~/.forge/policy.yaml | Developer (via forge channel disable or Web UI chip) |
| workspace | path at FORGE_PLATFORM_POLICY | Operator (Initializ Command, GitOps tooling); forge package pre-wires this env into generated Kubernetes manifests |
Resolution semantics:
- Deny lists (denied_egress_domains, denied_tools,
forbidden_models, denied_channels) → union across layers
- Max bounds (max_egress_allowlist_size, max_tool_count) →
smallest non-zero value wins
- Audit attribution → first layer to deny in load order
(system → user → workspace) takes credit, so operators grepping
layer=system see every sysadmin-enforced violation without false
positives from per-user overrides
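The three resolution rules above can be sketched in a few lines. This is an illustration rather than Forge's implementation: the field names come from the list, the function names and sample layers are hypothetical, and treating a zero or missing bound as "unset" is an assumption read from "smallest non-zero value wins".

```python
LAYER_ORDER = ("system", "user", "workspace")  # load order
DENY_LISTS = (
    "denied_egress_domains", "denied_tools",
    "forbidden_models", "denied_channels",
)
MAX_BOUNDS = ("max_egress_allowlist_size", "max_tool_count")


def merge_policies(layers):
    """Union the deny lists and take the smallest non-zero max bound."""
    merged = {}
    for key in DENY_LISTS:
        values = set()
        for name in LAYER_ORDER:
            values.update(layers.get(name, {}).get(key, []))
        merged[key] = sorted(values)
    for key in MAX_BOUNDS:
        bounds = [layers.get(name, {}).get(key, 0) for name in LAYER_ORDER]
        non_zero = [bound for bound in bounds if bound > 0]
        # Assumption: 0 (or a missing key) means the layer sets no bound.
        merged[key] = min(non_zero) if non_zero else 0
    return merged


def denying_layer(layers, key, value):
    """Return the first layer, in load order, whose deny list has `value`."""
    for name in LAYER_ORDER:
        if value in layers.get(name, {}).get(key, []):
            return name
    return None


sample_layers = {
    "system": {"denied_channels": ["telegram"], "max_tool_count": 0},
    "user": {"denied_channels": ["slack"], "denied_tools": ["shell"],
             "max_tool_count": 12},
    "workspace": {"denied_channels": ["slack"], "max_tool_count": 8,
                  "max_egress_allowlist_size": 25},
}
print(merge_policies(sample_layers))
print(denying_layer(sample_layers, "denied_channels", "slack"))
```

In the sample, slack is denied by both the user and workspace layers, and the user layer takes the attribution because it loads first.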
Channel deny is now first-class: denied_channels in any layer skips
the named adapter at startup with a channel_denied_by_policy
audit event. The runner continues with the remaining channels —
channel skip is non-fatal, unlike egress / tool / model violations
which abort startup.
CLI:
forge channel disable slack # edits ~/.forge/policy.yaml (user layer)
forge channel disable slack --system # edits /etc/forge/policy.yaml (system layer; warns when not root)
forge channel enable slack # removes from the user layer
The Web UI surfaces all three layers via new GET/PUT /api/user-policy
endpoints. The agent card renders denied channels as locked / dimmed
chips with a tooltip naming the enforcing layer.
See docs/security/platform-policy.md
and examples/platform-policy.yaml.
6. Audit event export — UDS sink + HTTP fallback (FWS-7, #95)
Audit events can now be exported to a local Unix Domain Socket
(preferred) or localhost HTTP endpoint in addition to the existing
NDJSON-to-stderr stream. An in-pod sidecar (the initializ platform
receiver or a customer SIEM) can consume audit at low latency while
stderr stays as the safety-net fallback.
Configure via:
| Flag | Env | Purpose |
|---|---|---|
| --audit-socket | FORGE_AUDIT_SOCKET | Unix Domain Socket path (preferred) |
| --audit-http-endpoint | FORGE_AUDIT_HTTP_ENDPOINT | localhost HTTP POST endpoint (fallback) |
| --audit-write-timeout | FORGE_AUDIT_WRITE_TIMEOUT | per-event timeout (default 50ms) |
Behavior:
- Lazy connect: the socket need not exist at agent startup; the
first emit dials. Sidecar deploys that come up later pick up future
events without restarting the agent.
- Per-event timeout: 50ms default. Beyond that the event drops
and increments drops_timeout. A slow sidecar cannot back-pressure
the agent.
- Exponential backoff between failed dials: 100ms → 200ms → … →
5s cap. During backoff, writes drop without dialing.
- No buffering: the sink is fire-and-forget; buffering is the
sidecar's job.
- Byte parity: events leaving the export sink are byte-identical
to events leaving stderr. No sink transforms the payload.
A periodic audit_export_status event fires every 60s carrying
per-sink writes_ok, drops_timeout, drops_dial, and connected
counters so operators tail the audit stream itself to confirm export
health.
See docs/security/audit-logging.md § Audit Event Export.
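The reconnect schedule from the Behavior list can be worked out with a short helper. This is a sketch only: the 100ms start and 5s cap come from the text, while the doubling factor is an assumption inferred from the "100ms → 200ms" progression, and the function name is hypothetical.

```python
def backoff_delays_ms(failed_dials, base_ms=100, cap_ms=5000):
    """Delay before each retry after consecutive failed dials, in ms."""
    # Assumption: the delay doubles after every failure until it hits the cap.
    return [min(base_ms * 2 ** n, cap_ms) for n in range(failed_dials)]


print(backoff_delays_ms(8))
# [100, 200, 400, 800, 1600, 3200, 5000, 5000]
```

Under this assumption the cap is reached on the seventh consecutive failure, after which a missing sidecar costs at most one dial attempt every 5 seconds.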
7. Hardened audit emission — schema, sequence, payload (FWS-8, #91)
The audit event schema is now a stable, versioned contract.
- Every event carries schema_version: "1.0". The schema is
additive-by-default; the version bumps only on removals or semantic
changes.
- Every event emitted on behalf of an A2A invocation carries a
monotonic seq field starting at 1. Consumers detect gaps
(lost events) and reordering by grouping
(correlation_id, task_id). Startup events
(policy_loaded, agent_card_published, audit_export_status)
omit seq (no invocation scope).
- The default audit posture stays metadata-only: token counts,
sizes, durations, tool names. No prompt text, no completion text,
no raw tool args / results. A TestNoPayloadByDefault_LLMCall
regression test pins the invariant: any future caller that
smuggles raw user content into a default event fails the test.
- Customers who need raw payloads (debugging incidents,
supervised-learning corpora, compliance replay) opt in field by
field via AuditPayloadCapture on the runner config:
LLMMessages / LLMResponse / ToolArgs / ToolResult. Captured
strings are truncated to a per-field byte cap (default 16 KiB) with
a …[truncated:N] marker so a runaway prompt cannot bloat one
audit event.
Audit-event signing is deferred per the issue's architectural
recommendation — sequence numbers cover gap detection in the
meantime, and schema_version gives signing a stable place to land
when added.
See docs/security/audit-logging.md § Schema contract.
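A consumer-side gap check built on the seq contract might look like the following. It is a minimal sketch, assuming events arrive as already-parsed dictionaries with the correlation_id, task_id, and seq fields named above; the function name and the problem labels are hypothetical.

```python
def find_sequence_problems(events):
    """Flag gaps and out-of-order seq values per (correlation_id, task_id)."""
    last_seen = {}
    problems = []
    for event in events:
        if "seq" not in event:
            # Startup events carry no seq and have no invocation scope.
            continue
        key = (event.get("correlation_id"), event.get("task_id"))
        seq = event["seq"]
        expected = last_seen.get(key, 0) + 1  # seq starts at 1
        if seq > expected:
            problems.append((key, "gap", expected, seq))
        elif seq < expected:
            problems.append((key, "reordered_or_duplicate", expected, seq))
        last_seen[key] = max(last_seen.get(key, 0), seq)
    return problems


sample_events = [
    {"event": "policy_loaded"},
    {"correlation_id": "c1", "task_id": "t1", "seq": 1},
    {"correlation_id": "c1", "task_id": "t1", "seq": 2},
    {"correlation_id": "c2", "task_id": "t9", "seq": 1},
    {"correlation_id": "c1", "task_id": "t1", "seq": 4},
]
print(find_sequence_problems(sample_events))
```

In the sample, the c1/t1 group jumps from 2 to 4, so one gap is reported with expected value 3; the interleaved c2/t9 event does not disturb the check because each group is tracked separately.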
8. Ops logger output to stdout (FWS-9, #100)
forge run / forge serve use the OS streams as a stream-level
audit-vs-ops split so container log collectors and SIEM pipelines can
route the two concerns separately without parsing any payload.
| Stream | Carries | Consumer |
|---|---|---|
| stdout | Ops logs: startup banner, request lines, runtime errors via r.logger.Info/Warn/Error (JSONLogger) | Container log collector / local debugging |
| stderr | Audit NDJSON: every event constant. After FWS-7, also lands on the dedicated UDS / HTTP sink in parallel (stderr stays as the safety-net fallback) | SIEM pipeline today / FWS-7 sink primary |
Operator migration: if you previously captured ops logs via
forge run 2> ops.log, switch to forge run > ops.log (and
2> audit.log for audit). Container deployments that capture both
streams via the runtime's standard log collector are unaffected.
Interactive CLI commands (forge init, forge build,
forge channel) keep writing user-facing warnings + errors to stderr
— those are UX messages, not server ops logs, and the stream-split
policy doesn't apply.
9. Rate limit configurability + cancel exemption (FWS-10, #110)
The per-IP A2A rate limiter from #31 is now configurable end-to-end
via a new top-level server.rate_limit: block in forge.yaml,
matching --rate-limit-* CLI flags, and FORGE_RATE_LIMIT_* env vars.
| Field | New default | Previous |
|---|---|---|
| read_rps | 1.0 (60/min) | unchanged |
| read_burst | 10 | unchanged |
| write_rps | 1.0 (60/min) | 10/60 ≈ 0.167 (10/min) |
| write_burst | 20 | 3 |
| cancel_exempt | true | (didn't exist) |
Why the bump. The old defaults predated parallel workflow
execution and cron bursts — a 10-step parallel stage was serialized
after the 3rd dispatch. The new 60/min sustained + burst 20 absorbs
orchestrator dispatch without throttling and still bounds anonymous
public-internet DoS at 1 task/sec sustained.
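The effect of the burst change on a 10-step parallel stage can be reproduced with a toy limiter. This sketch assumes a standard token-bucket model, which the rps/burst vocabulary suggests but the release notes do not spell out; the class and function names are hypothetical.

```python
class TokenBucket:
    """Toy token bucket: refills at `rate` tokens/sec up to `burst`."""

    def __init__(self, rate_per_sec, burst):
        self.rate = rate_per_sec
        self.burst = burst
        self.tokens = float(burst)  # the bucket starts full
        self.updated_at = 0.0

    def allow(self, now):
        elapsed = now - self.updated_at
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.updated_at = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def admitted_at_once(rate_per_sec, burst, requests):
    """Count how many simultaneous requests pass a fresh bucket."""
    bucket = TokenBucket(rate_per_sec, burst)
    return sum(bucket.allow(now=0.0) for _ in range(requests))


print(admitted_at_once(10 / 60, 3, 10))   # old write defaults
print(admitted_at_once(1.0, 20, 10))      # new write defaults
```

Under this model the old defaults admit 3 of 10 simultaneous dispatches and the new defaults admit all 10, which matches the "serialized after the 3rd dispatch" description.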
Why cancel exemption. The cost-ceiling cancel-burst case
(orchestrator firing N parallel tasks/cancel when a workflow budget
trips) was hitting -32603: rate limit exceeded at exactly the moment
cancellation matters most. DoS via cancel-spam is naturally bounded by
the registry's O(1) unknown-task lookup, so exempting cancel from the
write bucket is safe by default. Configurable via
cancel_exempt: false for stricter threat models.
Resolution per field is CLI > env > yaml > defaults.
See docs/reference/forge-yaml-schema.md § server.rate_limit.
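Per-field resolution in the CLI > env > yaml > defaults order can be sketched as below. The default values are taken from the table above; the function name and the idea of passing each source as an already-parsed dictionary are assumptions made for the example, not Forge's actual code path.

```python
DEFAULTS = {
    "read_rps": 1.0,
    "read_burst": 10,
    "write_rps": 1.0,
    "write_burst": 20,
    "cancel_exempt": True,
}


def resolve_rate_limit(cli, env, yaml_cfg):
    """Resolve each field independently: CLI > env > yaml > defaults."""
    resolved = {}
    for field, default in DEFAULTS.items():
        for source in (cli, env, yaml_cfg):
            if source.get(field) is not None:
                resolved[field] = source[field]
                break
        else:
            # No source set this field, so the default applies.
            resolved[field] = default
    return resolved


print(resolve_rate_limit(
    cli={"write_burst": 5},
    env={"write_burst": 8, "read_rps": 2.0},
    yaml_cfg={"read_rps": 3.0, "cancel_exempt": False},
))
```

Because resolution is per field, a single CLI flag overrides only its own field and leaves the remaining values to come from env, yaml, or the defaults.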
10. Workspace-level Skill Builder LLM config (#92)
The Web UI Skill Builder now resolves its LLM from a workspace-level
.forge/ui.yaml (with ~/.forge/ui.yaml as fallback), decoupled from
any agent's runtime model. Operator picks the model verbatim — no
hardcoded "upgrade" to gpt-4.1 or claude-opus-4-6. Credentials are
threaded as request-scoped values; the pre-#92 os.Setenv leak that
caused cross-agent credential stomping when switching agents in the UI
is removed.
See docs/ui/skill-builder-llm.md.
11. Claude knowledge skills (PR #118)
Two new Markdown files under .claude/skills/ make Forge knowledge
loadable into any Claude conversation:
- .claude/skills/forge.md: single comprehensive knowledge skill.
Architecture, security model, A2A protocol, CLI surface, forge.yaml
schema, audit pipeline, how to build agents + skills, FWS-1–FWS-10
recap, docs map, recipes. ~600 lines, table of contents, every
claim cross-links to the canonical doc.
- .claude/skills/forge-skill-builder.md: byte-identical port of
the Web UI Skill Builder's system prompt
(forge-ui/skill_builder_context.go:skillBuilderSystemPrompt).
Author a valid SKILL.md in any Claude chat without running
forge ui.
The sync-docs workflow keeps both current with the code; the
byte-identical invariant for the skill-builder file is documented as
a rule in .claude/commands/sync-docs.md.
New audit events
Eight new event constants land in this release. Every event carries the
FWS-8 schema_version field; events emitted in an invocation scope also
carry seq.
| Event | When |
|---|---|
| agent_card_published | Agent Card finalized at startup + hot-reload. card_size_bytes, card_sha256 for drift detection. (FWS-1) |
| invocation_complete | A2A invocation closed; duration_ms, token totals, llm_call_count. (FWS-3) |
| llm_call_cancelled | Streaming LLM call aborted mid-flight; partial usage counts. (FWS-3 / FWS-4) |
| invocation_cancelled | A2A invocation cancelled via tasks/cancel; classified reason + partial token totals. (FWS-4) |
| policy_loaded | One per non-empty platform-policy layer at startup; layer, source, per-list size counters. (FWS-5 / FWS-6) |
| policy_violation_at_build_time | forge.yaml conflicts with a policy layer at startup. violation_kind, offending_value, forge_yaml_field, layer, source. (FWS-5 / FWS-6) |
| channel_denied_by_policy | Channel adapter skipped at startup. channel, layer, source. (FWS-6) |
| audit_export_status | Every 60s when an export sink is configured. Per-sink writes_ok, drops_timeout, drops_dial, connected. (FWS-7) |
See the full event-type table in
docs/security/audit-logging.md § Event Types.
New CLI flags and env vars
| Flag (on forge run / forge serve start) | Env var | Purpose |
|---|---|---|
| --audit-socket | FORGE_AUDIT_SOCKET | Unix Domain Socket export path (FWS-7) |
| --audit-http-endpoint | FORGE_AUDIT_HTTP_ENDPOINT | Localhost HTTP fallback (FWS-7) |
| --audit-write-timeout | FORGE_AUDIT_WRITE_TIMEOUT | Per-event sink timeout, default 50ms (FWS-7) |
| --rate-limit-read-rps | FORGE_RATE_LIMIT_READ_RPS | Per-IP read RPS (FWS-10) |
| --rate-limit-read-burst | FORGE_RATE_LIMIT_READ_BURST | Per-IP read burst (FWS-10) |
| --rate-limit-write-rps | FORGE_RATE_LIMIT_WRITE_RPS | Per-IP write RPS (FWS-10) |
| --rate-limit-write-burst | FORGE_RATE_LIMIT_WRITE_BURST | Per-IP write burst (FWS-10) |
| --rate-limit-cancel-exempt | FORGE_RATE_LIMIT_CANCEL_EXEMPT | Exempt tasks/cancel from the write bucket (FWS-10) |
forge serve start forwards each flag to the forked forge run; env
vars flow through os.Environ() regardless.
forge.yaml schema additions
New top-level server: block:
server:
  rate_limit:
    read_rps: 1.0
    read_burst: 10
    write_rps: 1.0
    write_burst: 20
    cancel_exempt: true
All five fields are optional; omitting them inherits the new defaults.
See docs/reference/forge-yaml-schema.md § server.rate_limit.
Removed
- Per-agent disabled_channels: field on forge.yaml. FWS-6
shipped this in its first cut, then replaced it with the three-layer
policy's denied_channels. Migration: move any
disabled_channels: [X] from forge.yaml into
~/.forge/policy.yaml's denied_channels: (developer scope),
/etc/forge/policy.yaml (laptop-wide), or the workspace ConfigMap
(deployed-agent). forge channel disable <name> does this
automatically.
- channel_disabled_by_config audit event.
channel_denied_by_policy (with layer attribution) carries every
skip now.
Upgrade notes
- No action required for existing deployments that don't use the
new audit export sinks, platform policy, or server.rate_limit.
Defaults are backward-compatible and metadata-only audit is
unchanged on the wire.
- If you previously captured ops logs via forge run 2> ops.log,
switch to forge run > ops.log (and 2> audit.log for audit). The
FWS-9 stream split routes ops to stdout. Container deployments that
capture both streams via the runtime's standard log collector are
unaffected.
- If your SIEM grouped audit by correlation_id alone, you can
now add seq for gap detection. The schema is additive; no
existing field changed shape.
- If you carry a per-agent disabled_channels: block in your
forge.yaml, migrate it to denied_channels: in
~/.forge/policy.yaml (per-developer) or the workspace policy
ConfigMap (per-deployment). forge channel disable <name> does
this for you.
- Rate-limit defaults are more permissive. If you previously
relied on the WriteBurst=3 ceiling for DoS protection, set
server.rate_limit.write_burst: 3 in your forge.yaml to restore
the prior behavior. The new 20-burst default targets
orchestration-friendly workloads.
- Audit-event signing is deferred to a future release per the
FWS-8 architectural recommendation. Sequence numbers cover gap
detection in the meantime.
Merged pull requests
| PR | Title |
|---|---|
| #118 | docs(claude): ship the forge knowledge skill + skill-builder skill (+ sync-docs hook) |
| #117 | feat(server): rate-limit configurability + orchestration-friendly defaults + cancel exemption (FWS-10) |
| #116 | chore(runtime): ops logs to stdout — stream separation from audit (FWS-9) |
| #115 | feat(runtime): hardened audit emission — sequence numbers, schema version, opt-in payload capture (FWS-8) |
| #114 | feat(runtime): audit event export — UDS sink + HTTP fallback (FWS-7) |
| #113 | feat(security): three-layer platform policy + channel scope (FWS-6) |
| #112 | feat(security): platform policy enforcement at runtime (FWS-5) |
| #111 | feat(runtime): cancellation signal handling (FWS-4) |
| #109 | feat(runtime): token usage + execution duration emission (FWS-3) |
| #108 | feat(runtime): workflow correlation ID threading (FWS-2) |
| #107 | feat(a2a): A2A 0.3.0 Agent Card conformance (FWS-1) |
| #84 | feat(ui): workspace-level skill-builder LLM config |
(The PRs listed above are the per-feature PRs merged since the v0.12
release. The CHANGELOG entry for each FWS workstream lives in
CHANGELOG.md.)
Issues closed
- FWS-1 → #85
- FWS-2 → #86
- FWS-3 → #87
- FWS-4 → #88
- FWS-5 → #89
- FWS-6 → #90
- FWS-7 → #95
- FWS-8 → #91
- FWS-9 → #100
- FWS-10 → #110
- Custom-provider regression → #83
- Skill Builder LLM workspace config → #92
Documentation
- Forge knowledge skill: single-file Claude context for the whole project
- Forge skill-builder skill: author SKILL.md in any Claude chat
- A2A Agent Card reference (FWS-1)
- Workflow correlation (FWS-2)
- Audit logging: full event matrix, schema contract, export, payload capture, streams (FWS-3 / FWS-7 / FWS-8 / FWS-9)
- Platform policy: three-layer model (FWS-5 / FWS-6)
- forge.yaml schema reference, including server.rate_limit (FWS-10)
- CLI reference
- Full changelog
About Forge
Forge is an open-source runtime for building, packaging, and operating
LLM-backed agents that do real work. The design commitment is three
properties at once — atomicity (explicit skills, declared tools,
declared dependencies), security (restricted egress, encrypted
secrets, end-to-end audit), and portability (the same agent runs
locally, in a container, or in Kubernetes with the same forge.yaml).
The runtime speaks A2A 0.3.0 over JSON-RPC and REST, integrates with
multiple LLM providers (Anthropic, OpenAI, Ollama, OpenAI-compatible)
behind a common interface, and ships with a pluggable channel system
(Slack / Telegram / MS Teams) plus an MCP client for external tool
servers.
Full diff: v0.12.0...v0.13.0