docs: AW harness design plan (Pi SDK + extensions) #29667

Merged
pelikhan merged 15 commits into main from pelikhan/aw-harness-plan
May 2, 2026

@pelikhan
Collaborator

@pelikhan pelikhan commented May 2, 2026

AW Harness Design Plan

Design plan for the aw engine harness — a Pi SDK-based replacement for copilot_harness.cjs that adds multi-step DAG orchestration, multi-agent/multi-model support, context engineering, cost tracking, steering, session repair, and observability.

Key Design Decisions

  • Built on Pi agent ecosystem — Uses @mariozechner/pi-coding-agent, pi-agent-core, pi-ai as the runtime
  • Extensions-first — All gh-aw capabilities (safe-outputs, MCP bridge, cost tracking, steering, repair, observability, checkpoint) are implemented as Pi extensions using ExtensionAPI
  • api-proxy for model resolution — Model names/aliases pass through to the existing api-proxy, which handles recursive resolution, provider routing, and semver selection
  • TypeScript compiled for Node 24 — Compiled to ES2024, bundled via esbuild to a single .cjs in actions/setup/js/
  • New opt-in engine - engine: aw is a new choice; existing engines (copilot, claude, codex) are untouched
  • No CLI spawning — Pi SDK talks directly to LLM APIs through the api-proxy

Architecture

aw_harness.cjs (entry point)
  |-- Parser (markdown -> steps + config)
  |-- DAG Planner (topological sort, parallel groups)
  |-- DAG Runner (orchestrates multiple Pi AgentSessions)
  +-- Pi Extensions (loaded into each session):
      |-- api-proxy-provider (registerProvider for api-proxy)
      |-- mcp-bridge (MCP gateway tools -> Pi AgentTool)
      |-- safe-outputs (tools + artifact writing)
      |-- cost-tracker (budget gates via turn_end events)
      |-- steering (time/budget pressure via steer())
      |-- repair (broken session recovery)
      |-- observability (JSONL + OTel spans)
      +-- checkpoint (persist/resume state)
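
The "topological sort, parallel groups" role of the DAG Planner can be illustrated with a level-by-level variant of Kahn's algorithm. This is only a sketch under stated assumptions: the `Step` shape and the `planParallelGroups` name are hypothetical and do not come from the spec.

```typescript
// Hypothetical step shape; the spec's actual config schema may differ.
interface Step {
  name: string;
  depends: string[];
}

// Kahn's algorithm, emitting one group per level: every step in a group has
// all of its dependencies in earlier groups, so a group can run in parallel.
function planParallelGroups(steps: Step[]): string[][] {
  const remaining = new Map<string, Set<string>>();
  for (const step of steps) {
    remaining.set(step.name, new Set(step.depends));
  }
  for (const [name, deps] of remaining) {
    for (const dep of deps) {
      if (!remaining.has(dep)) {
        throw new Error(`Step "${name}" depends on unknown step "${dep}"`);
      }
    }
  }

  const groups: string[][] = [];
  while (remaining.size > 0) {
    // Steps whose dependencies have all been scheduled already.
    const ready = [...remaining.entries()]
      .filter(([, deps]) => deps.size === 0)
      .map(([name]) => name);
    if (ready.length === 0) {
      throw new Error("Dependency cycle detected");
    }
    for (const name of ready) remaining.delete(name);
    for (const deps of remaining.values()) {
      for (const name of ready) deps.delete(name);
    }
    groups.push(ready);
  }
  return groups;
}
```

A runner could then iterate the groups in order and start every step of a group concurrently, which matches the group-by-group loop quoted later in the review.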

Files

  • specs/aw-harness.md - Full design document

This is a design-only PR - no implementation code yet.

Design plan for the 'aw' engine harness — a Pi SDK-based replacement
for copilot_harness.cjs that adds multi-step DAG orchestration,
multi-agent/multi-model support, context engineering, cost tracking,
steering, session repair, and observability.

Key design decisions:
- Built on Pi agent ecosystem (pi-coding-agent, pi-agent-core, pi-ai)
- All gh-aw capabilities implemented as Pi extensions (ExtensionAPI)
- Model resolution via existing api-proxy (no provider inference)
- TypeScript compiled for Node 24, bundled as single .cjs
- New opt-in engine (engine: aw) — existing engines untouched

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@pelikhan pelikhan marked this pull request as ready for review May 2, 2026 01:38
Copilot AI review requested due to automatic review settings May 2, 2026 01:38
@github-actions
Contributor

github-actions Bot commented May 2, 2026

✅ smoke-ci: safeoutputs CLI comment + comment-memory run (25240542151)

Generated by Smoke CI for issue #29667 ·

@github-actions
Contributor

github-actions Bot commented May 2, 2026

Comment Memory

CI lights the path
Green checks bloom at dawn
Quiet bots still sing

Note

This comment is managed by comment memory.

It stores persistent context for this thread in the code block at the top of this comment.
Edit only the text inside the backtick fences; workflow metadata and the footer are regenerated automatically.

Learn more about comment memory

Generated by Smoke CI for issue #29667 ·

@github-actions github-actions Bot mentioned this pull request May 2, 2026
Contributor

Copilot AI left a comment

Pull request overview

Adds a design document for a proposed aw engine harness that would replace the current single-CLI retry wrapper with a Pi SDK-based orchestration layer for multi-step, multi-agent workflows.

Changes:

  • Introduces the proposed AW harness architecture, including Pi extension-based capabilities such as MCP bridging, safe outputs, cost tracking, repair, checkpointing, and observability.
  • Defines a draft workflow/frontmatter format for engine: aw, step extraction from markdown headings, and DAG-based execution.
  • Documents an implementation/build/testing plan for packaging a bundled aw_harness.cjs into actions/setup/js/.
Summary per file:

| File | Description |
| --- | --- |
| specs/aw-harness.md | New end-to-end design spec for the proposed AW engine, covering architecture, workflow format, extension model, runtime orchestration, and rollout/build plan. |

Copilot's findings

  • Files reviewed: 1/1 changed files
  • Comments generated: 9

Comment thread specs/aw-harness.md Outdated
Comment on lines +391 to +392
pi.on("agent_end", async (event, ctx) => {
  await finalizeSafeOutputManifest();
Comment thread specs/aw-harness.md Outdated
Comment on lines +563 to +578

for (const stepGroup of dag.executionOrder()) {
  await Promise.all(stepGroup.map(async (step) => {
    const { session } = await createAgentSession({
      sessionManager: SessionManager.inMemory(),
      extensions,
      model: resolveModel(step.agent?.model || workflow.defaultModel),
      systemPrompt: buildSystemPrompt(step),
    });

    const prompt = buildStepPrompt(step, transcripts);
    await session.prompt(prompt);

    transcripts[step.name] = captureTranscript(session);
    session.dispose();
  }));
Comment thread specs/aw-harness.md Outdated
Comment on lines +547 to +551
// index.ts — entry point
import { createAgentSession, SessionManager } from "@mariozechner/pi-coding-agent";

async function main() {
  const workflow = parseWorkflow(process.argv[2]);
Comment thread specs/aw-harness.md Outdated
Comment on lines +404 to +409
export default function(pi: ExtensionAPI) {
  const budget = loadBudgetConfig();
  let totalCost = 0;

  pi.on("turn_end", async (event, ctx) => {
    totalCost += extractCostFromTurn(event);
Comment thread specs/aw-harness.md Outdated
Comment on lines +694 to +696
1. **Scaffold project** — Initialize TypeScript project in `aw-harness/`. Configure package.json with Pi SDK deps (`@mariozechner/pi-coding-agent`, `pi-agent-core`, `pi-ai`). Set up tsconfig for ES2024/Node 24. Configure esbuild bundle → `dist/aw_harness.cjs`.

2. **Implement api-proxy provider extension** — Pi extension that registers the api-proxy as a custom provider via `pi.registerProvider()`. Async factory fetches available models at startup. All model requests route through the proxy.
Comment thread specs/aw-harness.md Outdated
Comment on lines +293 to +298
**Step extraction**: Each `## Heading` or `### Heading` is a potential step. Linked to `harness.steps` via `<!-- harness-step: name -->` HTML comments.

**Implicit behavior**:

- No `<!-- harness-step -->` → sequential by default (document order)
- With annotation → follows `harness.steps` config (parallel, depends, agent)
Comment thread specs/aw-harness.md Outdated
for (const stepGroup of dag.executionOrder()) {
  await Promise.all(stepGroup.map(async (step) => {
    const { session } = await createAgentSession({
      sessionManager: SessionManager.inMemory(),
Comment thread specs/aw-harness.md Outdated
Comment on lines +587 to +593
| `engine: copilot` (existing) | Uses current `copilot_harness.cjs` — unchanged |
| `engine: claude` (existing) | Uses current Claude Code flow — unchanged |
| `engine: codex` (existing) | Uses current Codex flow — unchanged |
| `engine: aw` without `harness:` block | Single-step: entire body = one Pi session prompt |
| `engine: aw` with `harness:` block | Multi-step orchestration mode |
| `engine: aw` with `harness.steps` | Explicit DAG (parallel, depends, agent assignment) |
| `engine: aw` without `harness.agents` | All steps use `engine.model` |
Comment thread specs/aw-harness.md Outdated
Comment on lines +172 to +173
apiKey: "AW_PROXY_API_KEY", // env var name — Pi resolves it
api: "openai-completions", // api-proxy speaks OpenAI protocol
@pelikhan
Collaborator Author

pelikhan commented May 2, 2026

@copilot when this harness is selected, GH-proxy and CLI proxy features are always enabled. You cannot disable them because it does not support MCP natively.

… native MCP)

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/01e7b2ba-9d7a-4706-a21a-f8b576ecb523

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Contributor

Copilot AI commented May 2, 2026

@copilot when this harness is selected, GH-proxy and CLI proxy features are always enabled. You cannot disable them because it does not support MCP natively.

Updated in 370d10c. The spec now documents in the following places that gh-proxy and cli-proxy are always enabled for engine: aw and cannot be disabled (the Pi SDK does not support MCP natively):

  • "What this IS" bullet added
  • Design Principle 5 added
  • Frontmatter example now shows cli-proxy: true and mode: gh-proxy with explanatory comments
  • MCP Bridge extension section has a Note callout explaining the mandatory dependency
  • Backwards Compatibility table has two new rows: cli-proxy: false is ignored, and tools.github.mode: remote is overridden to gh-proxy
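
The two override rows described above could be applied by a small normalization step. The sketch below is illustrative only: the `ProxySettings` shape, the field names, and `applyFixedSettings` are assumptions, not the spec's schema. It collects warning messages so that a caller can print them to stderr, in line with the warning requirement added in a later commit on this PR.

```typescript
// Hypothetical subset of the compiled workflow configuration.
interface ProxySettings {
  cliProxy?: boolean;
  githubMode?: string;
}

interface NormalizedSettings {
  settings: { cliProxy: true; githubMode: "gh-proxy" };
  warnings: string[];
}

// For engine: aw, both proxies are always on; attempts to disable them are
// ignored, and each ignored setting produces a warning for the author.
function applyFixedSettings(input: ProxySettings): NormalizedSettings {
  const warnings: string[] = [];
  if (input.cliProxy === false) {
    warnings.push("cli-proxy: false is ignored for engine: aw (cli-proxy is always enabled)");
  }
  if (input.githubMode !== undefined && input.githubMode !== "gh-proxy") {
    warnings.push(
      `tools.github.mode: ${input.githubMode} is overridden to gh-proxy for engine: aw`,
    );
  }
  return { settings: { cliProxy: true, githubMode: "gh-proxy" }, warnings };
}
```

Returning the warnings instead of printing them keeps the function pure, which makes the override behavior easy to unit test.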

@pelikhan
Collaborator Author

pelikhan commented May 2, 2026

@copilot rewrite harness spec using the W3c-spec-writer

@pelikhan
Collaborator Author

pelikhan commented May 2, 2026

@copilot the harness has no predefined context, all ambient imported files or skills must be declared

Copilot AI and others added 2 commits May 2, 2026 02:35
- Add W3C-style title block, Abstract, Status of This Document
- Add RFC 2119 Conformance section with keyword table
- Add formal Terminology and Definitions section for all key terms
- Restructure Architecture as normative section with numbered Design Principles
- Add Harness Invocation Contract (entry point, env vars, exit codes, streams)
- Restructure Workflow Definition with formal sub-schemas for each harness.* key
- Add normative Overrides and Fixed Settings table (Section 6.2)
- Add formal DAG Execution Model with normative algorithm (Section 7)
- Restructure all 8 extensions with Requirements + non-normative examples
- Move Model Resolution to its own normative section (Section 9)
- Move Build/Deployment and Implementation Plan to informative section (Section 10)
- Preserve full backwards-compatibility table including gh-proxy override rows
- Add Security Considerations and Privacy Considerations sections
- Add Normative and Informative References section
- Fix duplicate Design Principle number (two items numbered 5)
- Fix env var inconsistency: use AWF_API_PROXY_TOKEN consistently throughout
- Make gh-proxy/cli-proxy always-on constraint a MUST NOT normative requirement
- Frame all code examples as non-normative with >[!NOTE] callouts

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
- Clarify MUST NOT wording: 'MUST NOT honor attempts to disable'
- Add normative requirement to emit a stderr warning when cli-proxy
  or tools.github.mode overrides are applied, aiding author debugging

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Contributor

Copilot AI commented May 2, 2026

@copilot rewrite harness spec using the W3c-spec-writer

Done in 678615c. The spec has been fully rewritten in W3C specification format with:

  • Abstract, Status of This Document, and Table of Contents
  • §2 Conformance — RFC 2119 keyword table (MUST, SHOULD, MAY, etc.)
  • §3 Terminology and Definitions — 14 formal definition entries
  • §4 Architecture — design principles expressed as normative MUST/MUST NOT requirements (duplicate item 5 renumbered, AWF_API_PROXY_TOKEN used consistently)
  • §5 Harness Invocation Contract — entry point, env vars table, exit codes, stdio rules
  • §6 Workflow Definition — formal sub-schema per harness.* key + §6.2 Overrides and Fixed Settings table (gh-proxy/cli-proxy always-on is a normative MUST NOT)
  • §7 DAG Execution Model — normative construction and execution algorithms
  • §8 Extensions — each of the 8 extensions as a subsection with a Requirements list + >[!NOTE] Non-normative example code blocks
  • §9 Model Resolution (informative), §10 Build and Deployment (informative, preserves all original content)
  • §11 Security Considerations, §12 Privacy Considerations (informative)
  • §13 References — normative (RFC 2119) and informative (Pi SDK, esbuild, OTel, gh-aw)

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • github.com (HTTP Only)
    • Triggering command: /usr/bin/ssh ssh -T git@github.com (packet block)
  • https://api.github.com/graphql
    • Triggering command: /usr/bin/gh gh auth status (http block)
  • https://api.github.com/repos/fastlane/fastlane/releases
    • Triggering command: /usr/local/bin/fastlane fastlane bin-proxy --help (http block)
  • https://api.github.com/repos/github/gh-aw
    • Triggering command: /usr/bin/curl curl -s -o/dev/null -w %{http_code} -H Authorization: token ****** REDACTED (http block)
    • Triggering command: /usr/bin/curl curl -s -H Authorization: token ****** REDACTED (http block)
  • https://api.github.com/repos/github/gh-aw/git/refs/heads/pelikhan/aw-harness-plan
    • Triggering command: /usr/bin/gh gh api repos/github/gh-aw/git/refs/heads/pelikhan/aw-harness-plan (http block)
    • Triggering command: /usr/bin/curl curl -s -o/dev/null -w %{http_code} -X PATCH -H Authorization: token ****** -H Accept: application/vnd.github.v3+json REDACTED -d {"sha":"678615c870e4122e67dfbec403ad91c366e32344","force":false} (http block)
  • https://api.github.com/user
    • Triggering command: /usr/bin/gh gh api user --jq .login (http block)
    • Triggering command: /usr/bin/curl curl -s -H Authorization: token ****** REDACTED (http block)
    • Triggering command: /usr/bin/curl curl -v REDACTED -H Authorization: token ****** (http block)

…ports must be declared

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/8df5106a-c84d-4c72-b43d-280243d2ebae

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Contributor

Copilot AI commented May 2, 2026

@copilot the harness has no predefined context, all ambient imported files or skills must be declared

Added §6.4 Initial Prompt Context in fe1e2a2. The new normative section states:

  • The harness MUST NOT inject any predefined or ambient context into agent sessions
  • Every item in a session's prompt MUST come from an explicitly declared origin: the step body, upstream transcripts, harness.agents[*].system, or imports:
  • The harness MUST NOT auto-load AGENTS.md, .github/agents/ entries, or skills directories
  • An [!IMPORTANT] callout reminds workflow authors to declare all needed files/skills via imports:

Comment thread specs/aw-harness.md Outdated
Comment on lines +126 to +127
**MCP bridge**
: The `mcp-bridge` Pi extension (Extension 2) that translates MCP gateway tool definitions into Pi `AgentTool` instances, making them available to agent sessions without native MCP support in the Pi SDK.
Collaborator Author

@copilot not needed, MCPs are already wrapped as CLIs. No MCP support needed.

Contributor

Done in 945545f. Removed Extension 2 (MCP Gateway Bridge) entirely — the spec now states that MCP tools are available to agent sessions as CLI commands on PATH via cli-proxy, so no Pi-level bridge is needed. Updated the architecture diagram, Design Principle 5, §7.3 execution summary, §10 project structure, and implementation plan accordingly. The extension count dropped from eight to seven.

Comment thread specs/aw-harness.md Outdated
: The complete message history of a completed `AgentSession`, optionally summarized, passed as context to downstream steps.

**workflow document**
: A Markdown file with YAML frontmatter that declares an `engine: aw` workflow. The frontmatter **MUST** conform to the schema in [Section 6](#6-workflow-definition).
Collaborator Author

The frontmatter is pre-parsed by the gh-aw compiler, and the aw harness will instead receive a config.json, a prompt.txt, and agent files.

Contributor

Done in 945545f. Updated the spec throughout to reflect that the gh-aw compiler pre-processes the workflow markdown and provides the harness with config.json (parsed harness config), prompt.txt (extracted prompt body), and agent files. Key changes:

  • §3 "workflow document" definition updated
  • §5.1 Entry Point rewritten: node aw_harness.cjs --config <config-path> --prompt <prompt-path> [--agent-file <agent-path>]
  • Normative requirement added: the harness MUST NOT read or parse workflow Markdown files directly
  • §6.1 intro updated to clarify the compiler emits config.json; the harness consumes it at runtime
  • §6.3 Step Extraction Algorithm updated to describe reading from config.json + prompt.txt
  • §7.2 normative requirements and code example updated accordingly
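
The entry-point contract described in this reply could be parsed along the following lines. This is a minimal sketch, assuming flags always arrive as `--flag value` pairs; `parseHarnessArgs` and `HarnessArgs` are hypothetical names, and the optional `--agent-file` flag is included only because this reply lists it.

```typescript
interface HarnessArgs {
  configPath: string;
  promptPath: string;
  agentFile?: string;
}

const KNOWN_FLAGS = ["--config", "--prompt", "--agent-file"];

// Parses `--flag value` pairs and rejects unknown or incomplete arguments.
function parseHarnessArgs(argv: string[]): HarnessArgs {
  const values = new Map<string, string>();
  for (let i = 0; i < argv.length; i += 2) {
    const flag = argv[i];
    const value = argv[i + 1];
    if (flag === undefined || !KNOWN_FLAGS.includes(flag)) {
      throw new Error(`Unknown argument: ${flag}`);
    }
    if (value === undefined || value.startsWith("--")) {
      throw new Error(`Missing value for ${flag}`);
    }
    values.set(flag, value);
  }
  const configPath = values.get("--config");
  const promptPath = values.get("--prompt");
  if (configPath === undefined || promptPath === undefined) {
    throw new Error(
      "Usage: node aw_harness.cjs --config <config-path> --prompt <prompt-path>",
    );
  }
  const agentFile = values.get("--agent-file");
  return { configPath, promptPath, ...(agentFile !== undefined ? { agentFile } : {}) };
}
```

In a real entry point this would typically be called with `process.argv.slice(2)`, with a thrown error mapped to a non-zero exit code.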

…onfig.json+prompt.txt

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/c4564ad2-de2d-4e3d-8a0f-da65187abf59

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Comment thread specs/aw-harness.md Outdated
where:
- `<config-path>` is the path to the compiler-generated `config.json` file containing the parsed harness configuration.
- `<prompt-path>` is the path to the compiler-generated `prompt.txt` file containing the extracted prompt body.
- `<agent-path>` is the optional path to a compiled agent file (`.md` with frontmatter stripped), provided when the workflow references an agent via `imports:`.
Collaborator Author

@copilot this information is in the config file, not needed

Contributor

Done in 3844198. Removed --agent-file from the invocation command and its description. The config.json already contains resolved agent file paths, so no separate flag is needed.

Comment thread specs/aw-harness.md Outdated
Comment on lines +250 to +252
| `AWF_API_PROXY_URL` | Base URL of the api-proxy OpenAI-compatible endpoint | `http://localhost:8080/v1` |
| `AWF_API_PROXY_TOKEN` | Bearer token for api-proxy authentication | *(required; no default)* |
| `AWF_MODEL_ALIASES` | JSON string containing model alias configuration | *(empty; aliases resolved by proxy)* |
Collaborator Author

@copilot AWF uses the environment from the various providers and must connect to those directly

Contributor

Done in 3844198. Updated throughout to reflect that AWF injects provider-specific credentials (e.g., ANTHROPIC_API_KEY, OPENAI_API_KEY, GITHUB_TOKEN) into the container and the harness reads them directly. Changes: §5.2 env vars table replaced with a note about provider env vars; Design Principle 3 updated to "Direct provider connections"; Extension 1 renamed from "api-proxy Provider" to "Provider Setup"; §9 Model Resolution, §11 Security, §12 Privacy, and the architecture diagram all updated to remove the api-proxy middleman.

Comment thread specs/aw-harness.md Outdated
> engine:
> id: aw
> model: sonnet # Model alias — api-proxy resolves
> harness: aw_harness.cjs
Collaborator Author

@copilot this is implicit

Contributor

Done in 3844198. Removed harness: aw_harness.cjs from the frontmatter example.

Comment thread specs/aw-harness.md Outdated
> You synthesize multiple review perspectives into a single,
> prioritized action list.
>
> steps:
Collaborator Author

@copilot "steps" is confusing because it is a GitHub Actions idiom. Use different terminology, related to processes.

Contributor

Done in 3844198. Renamed all harness execution unit references from "step/steps" to "task/tasks" throughout the spec: harness.steps → harness.tasks, <!-- harness-step: --> → <!-- harness-task: -->, "harness step" → "harness task", "step annotation" → "task annotation", section titles "Step Extraction Algorithm" → "Task Extraction Algorithm", "Step Execution Summary" → "Task Execution Summary", exit code table, observability events, checkpoint extension, backwards compatibility table, implementation plan, and project structure.

Contributor

Copilot AI commented May 2, 2026

@copilot remove subagents from this proposal. It just runs a single prompt instruction.

Done in 355543b. Removed all subagent/multi-task machinery from the spec:

  • harness.agents and harness.tasks keys removed from §6.1 (schema, example, subsections)
  • harness.checkpoint removed (§6.1.6 dropped)
  • transcript-mode removed from harness.context
  • §7 DAG Execution Model → replaced with §7 Single-Session Execution Model — one createAgentSession(), one session.prompt(), done
  • Extension 7 (Checkpoint) removed; extension count is now 6
  • §6.3 Task Extraction Algorithm → §6.3 Prompt Loading: reads the entire prompt.txt as a single atomic prompt
  • §6.4 updated — removed transcript upstream references; only prompt.txt, harness.system, and imports: feed the session
  • Project structure simplified — removed parser.ts, planner.ts, dag-runner.ts, transcript.ts
  • Implementation plan reduced from 17 items to 13, removing DAG planner, runner, transcript manager, checkpoint
  • Backwards compatibility table simplified — removed multi-task rows, added all 6 existing engines
  • §11/§12 updated — removed transcript confidentiality and retention paragraphs

@pelikhan
Collaborator Author

pelikhan commented May 2, 2026

@copilot add a section on context observability.

  • the harness should produce a file that provides a full timeline of the context creation (provenance of each piece of text) so that a downstream tool can do a deep analysis on the context size

  • the harness should render output to the console in a format that is natively Markdown-friendly for the GitHub Actions step summary

  • the harness should provide effective token consumption updates in the output on each turn

Contributor

Copilot AI commented May 2, 2026

@copilot add a section on context observability.

  • the harness should produce a file that provides a full timeline of the context creation (provenance of each piece of text) so that a downstream tool ...

Done in 9be73be. Added four normative sub-sections to §8.6 and updated §5.4:

§8.6.1 JSONL Event Stream — unchanged existing JSONL + OTel requirements.

§8.6.2 Context Provenance File — the extension MUST produce /tmp/gh-aw/context-provenance.jsonl on session completion. Each record captures timestamp, source ("prompt", "import", or "system"), path (for imports), tokens, cumulative_tokens, and role — enabling downstream tools (e.g., gh aw audit) to do deep analysis of context growth and identify which imports consumed the most token budget.
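
As a rough illustration of the record shape described for §8.6.2, the following sketch turns an ordered list of context pieces into JSONL with a running token total. The field names follow the list above; the `ContextPiece` input type, the `buildProvenanceJsonl` name, and the injectable clock are assumptions made for the example. Only metadata is recorded, never the text itself.

```typescript
type ProvenanceSource = "prompt" | "import" | "system";

// Hypothetical input: one entry per piece of text added to the context.
interface ContextPiece {
  source: ProvenanceSource;
  role: string;
  tokens: number;
  path?: string; // only meaningful for imports
}

interface ProvenanceRecord extends ContextPiece {
  timestamp: string;
  cumulative_tokens: number;
}

// Serializes one JSON object per line, tracking the running token total.
function buildProvenanceJsonl(
  pieces: ContextPiece[],
  now: () => Date = () => new Date(),
): string {
  let cumulative = 0;
  const lines = pieces.map((piece) => {
    cumulative += piece.tokens;
    const record: ProvenanceRecord = {
      timestamp: now().toISOString(),
      source: piece.source,
      ...(piece.path !== undefined ? { path: piece.path } : {}),
      tokens: piece.tokens,
      cumulative_tokens: cumulative,
      role: piece.role,
    };
    return JSON.stringify(record);
  });
  return lines.length > 0 ? lines.join("\n") + "\n" : "";
}
```

A downstream tool can then read the file line by line and sort imports by `tokens` to find the largest contributors to context size.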

§8.6.3 GitHub Actions Step Summary — when GITHUB_STEP_SUMMARY is set, the extension MUST write a GitHub-flavored Markdown summary including a per-turn token table and a context provenance section listing each imports: file with its token contribution. §5.4 also updated to document the GITHUB_STEP_SUMMARY contract.

§8.6.4 Per-Turn Token Consumption Output — the extension MUST emit a human-readable line to stderr after each turn_end reporting turn number, input/output tokens, cumulative totals, and cost, formatted as a Markdown blockquote so it renders correctly in the step summary.
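
The per-turn line described for §8.6.4 might look like the output of the sketch below. The exact wording of the line is not given in this thread, so the format string, the `TurnUsage` shape, and the `TokenReporter` class are all assumptions; cost is optional because, as noted later in the review, it is not always known.

```typescript
// Hypothetical per-turn usage figures, e.g. extracted from a turn_end event.
interface TurnUsage {
  turn: number;
  inputTokens: number;
  outputTokens: number;
  costUsd?: number; // cost is not always known
}

// Keeps cumulative totals and renders each turn as a Markdown blockquote line.
class TokenReporter {
  private totalInput = 0;
  private totalOutput = 0;

  formatTurn(usage: TurnUsage): string {
    this.totalInput += usage.inputTokens;
    this.totalOutput += usage.outputTokens;
    const cost = usage.costUsd === undefined ? "n/a" : `$${usage.costUsd.toFixed(4)}`;
    return (
      `> **Turn ${usage.turn}**: ${usage.inputTokens} in / ${usage.outputTokens} out ` +
      `(cumulative ${this.totalInput} in / ${this.totalOutput} out), cost ${cost}`
    );
  }
}
```

The caller would write each returned line to stderr after every turn; because the line starts with `>`, it renders as a blockquote when copied into a Markdown summary.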

Also added a privacy note about the provenance file (metadata only, no raw prompt text) and expanded implementation plan item 10 to enumerate all four sub-requirements.

@pelikhan
Collaborator Author

pelikhan commented May 2, 2026

@copilot

  • review pi agent and openclaw/openclaw extension format and provide compatibility requirement if applicable

Contributor

Copilot AI commented May 2, 2026

@copilot

...

Done in cbc260c. Added §4.3 Pi SDK and OpenClaw Extension Format Compatibility (non-normative):

§4.3.1 API Surface Comparison — side-by-side table mapping Pi ExtensionAPI methods to their OpenClaw OpenClawPluginApi counterparts:

  • pi.registerTool() → api.registerTool() (semantically equivalent; schema shapes differ: Pi-native vs. TypeBox Type.Object)
  • pi.on() → api.on() (same intent; event names differ, e.g. Pi "turn_end" vs. OpenClaw "after_agent_turn")
  • pi.registerProvider() → api.registerProvider() (config shape differs; OpenClaw also requires openclaw.plugin.json manifest)
  • api.registerAgentHarness(...) — OpenClaw-only experimental low-level executor seam (no Pi equivalent)
  • api.registerSessionExtension(...) — OpenClaw-only; no Pi equivalent

§4.3.2 Manifest Requirements — documents that OpenClaw plugins require openclaw.plugin.json + definePluginEntry(...) while Pi extensions are plain functions; the two formats are not interchangeable.

§4.3.3 Compatibility Requirements — normative SHOULDs for writing Pi extensions in a way that eases future portability: isolate Pi SDK dependencies to the entry boundary, express tool definitions as plain objects, treat event source names as a configuration layer, avoid Pi-internal APIs with no OpenClaw equivalent.

§4.3.4 registerAgentHarness Consideration — notes that OpenClaw's experimental executor seam could host the aw harness loop if a future OpenClaw gateway integration is desired, but the current spec does not target it.

Also updated the ToC, the §8 intro cross-reference, and added [OpenClaw] to the informative references.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • www.npmjs.com
    • Triggering command: /home/REDACTED/work/_temp/ghcca-node/node/bin/node /home/REDACTED/work/_temp/ghcca-node/node/bin/node --enable-source-maps /home/REDACTED/work/_temp/copilot-developer-action-main/dist/index.js (dns block)

@pelikhan
Collaborator Author

pelikhan commented May 2, 2026

@copilot

  • remove openclaw section
  • investigate a strategy for declaring Pi agent-compatible extensions in frontmatter and registering them at runtime

Contributor

Copilot AI commented May 2, 2026

@copilot

  • remove openclaw section...

Done in 4abef9b:

§4.3 removed — the OpenClaw compatibility section, its ToC entry, §8 cross-reference, and informative reference are all gone.

§6.1.4 harness.extensions added — new normative sub-section documenting how workflow authors declare Pi SDK-compatible extensions in frontmatter:

  • Two entry forms: repository-relative path to a compiled .cjs file (e.g., ./extensions/custom-tool.cjs) or an npm package name (e.g., @my-org/pi-extension)
  • Each module MUST export (pi: ExtensionAPI) => void | Promise<void>
  • Harness dynamically loads user extensions after all six built-in gh-aw extensions; user extensions MUST NOT override built-ins
  • Failed loads emit a warning; session only aborts if harness.extensions-required: true is set
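
The loading rules in the list above (load each declared module, warn on failure, abort only when extensions are required) could be sketched as follows. Everything here is illustrative: `ExtensionAPI` is a stand-in type rather than the Pi SDK's real one, and the module loader is injected so the logic can be exercised without touching the file system or npm.

```typescript
// Stand-in for the Pi SDK's ExtensionAPI; the real type comes from the SDK.
type ExtensionAPI = unknown;
type Extension = (pi: ExtensionAPI) => void | Promise<void>;
// Injected loader, e.g. (spec) => import(spec) in a real harness.
type ModuleLoader = (spec: string) => Promise<unknown>;

interface LoadResult {
  extensions: Extension[];
  warnings: string[];
}

async function loadUserExtensions(
  specs: string[],
  loadModule: ModuleLoader,
  extensionsRequired = false,
): Promise<LoadResult> {
  const extensions: Extension[] = [];
  const warnings: string[] = [];
  for (const spec of specs) {
    try {
      const mod = await loadModule(spec);
      // Accept a default export, or a module that is itself the function.
      const candidate =
        typeof mod === "function" ? mod : (mod as { default?: unknown } | null)?.default;
      if (typeof candidate !== "function") {
        throw new Error("module does not export an extension function");
      }
      extensions.push(candidate as Extension);
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      if (extensionsRequired) {
        throw new Error(`Required extension "${spec}" failed to load: ${reason}`);
      }
      warnings.push(`Extension "${spec}" failed to load: ${reason}`);
    }
  }
  return { extensions, warnings };
}
```

The harness would then pass the built-in extensions first and the returned user extensions after them, so user code is registered last. How the harness prevents a user extension from overriding a built-in is not shown here.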

§7 updated — execution code example now shows loadUserExtensions() spreading user extensions after built-ins; normative requirement #3 references §6.1.4. §7.2 summary updated.

§11 Security Considerations updated — added user extension isolation paragraph noting extensions run in-process, with guidance on auditing third-party npm packages.

§10.8 Implementation Plan updated — new item 5a documents the user extension loader implementation.

Comment thread specs/aw-harness.md Outdated

The `harness.budget` key is **OPTIONAL**. When present, it **MUST** contain:

- `max-cost-usd` (number): Maximum total cost in USD for the run. The cost-tracker extension **MUST** abort the current session if this limit is exceeded.
Collaborator Author

@copilot use max-effective-tokens instead, as cost is not always known

Comment thread specs/aw-harness.md Outdated
The `harness.budget` key is **OPTIONAL**. When present, it **MUST** contain:

- `max-cost-usd` (number): Maximum total cost in USD for the run. The cost-tracker extension **MUST** abort the current session if this limit is exceeded.
- `warn-at-percent` (number, 0–100): Percentage of `max-cost-usd` at which a steering warning **MUST** be injected.
Collaborator Author

@copilot already covered in steering

Comment thread specs/aw-harness.md Outdated
A conforming implementation **MUST** source every item included in the session's initial prompt from one of the following explicitly declared origins:

- The Markdown body from `prompt.txt` (loaded per [Section 6.3](#63-prompt-loading)).
- The `harness.system` prompt if declared in the `harness:` frontmatter block.
Collaborator Author

@copilot remove this field

Comment thread specs/aw-harness.md

**Requirements:**

- The extension **MUST** call `pi.registerProvider()` for each LLM provider whose credentials are present in the environment (e.g., `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, `GITHUB_TOKEN`).
Collaborator Author

@copilot look for the base URL env var as well
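
Taken together with the quoted requirement, credential detection plus an optional base URL lookup might be sketched like this. The credential variable names come from the quoted requirement; the `*_BASE_URL` names, the `ProviderConfig` shape, and `detectProviders` are assumptions for illustration, and the sketch stops short of calling `pi.registerProvider()` because its configuration shape is not shown in this thread.

```typescript
interface ProviderEnvSpec {
  provider: string;
  apiKeyEnv: string;
  baseUrlEnv?: string; // assumed name; not confirmed by the spec
}

interface ProviderConfig {
  provider: string;
  apiKeyEnv: string; // name of the variable, never the secret itself
  baseUrl?: string;
}

const PROVIDER_ENV: ProviderEnvSpec[] = [
  { provider: "anthropic", apiKeyEnv: "ANTHROPIC_API_KEY", baseUrlEnv: "ANTHROPIC_BASE_URL" },
  { provider: "openai", apiKeyEnv: "OPENAI_API_KEY", baseUrlEnv: "OPENAI_BASE_URL" },
  { provider: "github", apiKeyEnv: "GITHUB_TOKEN" },
];

// Returns one entry per provider whose credential is present in the environment.
function detectProviders(env: Record<string, string | undefined>): ProviderConfig[] {
  const found: ProviderConfig[] = [];
  for (const spec of PROVIDER_ENV) {
    if (!env[spec.apiKeyEnv]) continue;
    const baseUrl = spec.baseUrlEnv ? env[spec.baseUrlEnv] : undefined;
    found.push({
      provider: spec.provider,
      apiKeyEnv: spec.apiKeyEnv,
      ...(baseUrl ? { baseUrl } : {}),
    });
  }
  return found;
}
```

Each returned entry would be the input to one provider registration; keeping only the variable name in the result avoids copying secrets into logs or provenance files.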

Comment thread specs/aw-harness.md Outdated
> }
> ```

### 8.2 Extension 2: Safe Outputs
Collaborator Author

@copilot safe-outputs are CLI tools; no special support needed

Comment thread specs/aw-harness.md

### 8.6 Extension 6: Observability

**Purpose:** Emits structured event streams to stderr, writes a context provenance file for downstream analysis, renders a Markdown step summary, and reports per-turn token consumption.
Collaborator Author

@copilot all events should be stored

Copilot AI and others added 2 commits May 2, 2026 11:42
…dget/safe-outputs design

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/d1715722-ea93-4860-873f-802b6fab33c5

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Contributor

Copilot AI commented May 2, 2026

@copilot

  • remove openclaw section...

Done in 1af3ca4:

OpenClaw section removed — §4.3 and its ToC entry, §8 cross-reference, and informative reference are gone.

Pi SDK-compatible extension declaration strategy added as §6.1.4 harness.extensions:

  • Two entry forms: repo-relative path to a compiled .cjs file or an npm package name
  • Each module MUST export (pi: ExtensionAPI) => void | Promise<void> (Pi SDK-native signature)
  • Harness dynamically loads user extensions after the five built-in gh-aw extensions; user extensions MUST NOT override built-ins
  • Failed loads emit a warning; session only aborts if harness.extensions-required: true is set
  • §7 code example updated to show loadUserExtensions() call

Also fixed in the same pass per other review comments: max-cost-usd → max-effective-tokens, removed warn-at-percent (covered by steering), removed harness.system, removed Safe Outputs Pi extension (safe-outputs are CLI tools via cli-proxy), added base URL env var support to Provider Setup, updated Observability to subscribe to all events.
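
A budget gate keyed on max-effective-tokens could be as small as the sketch below. The thread does not define "effective tokens", so the sketch assumes input plus output tokens per turn; that definition and the `EffectiveTokenBudget` class are hypothetical.

```typescript
interface TurnTokens {
  inputTokens: number;
  outputTokens: number;
}

// Accumulates token usage per turn and reports when the limit is exceeded.
class EffectiveTokenBudget {
  private used = 0;
  private readonly maxEffectiveTokens: number;

  constructor(maxEffectiveTokens: number) {
    this.maxEffectiveTokens = maxEffectiveTokens;
  }

  // Assumption: effective tokens = input + output tokens of the turn.
  // Returns true when the session should be aborted.
  recordTurn(turn: TurnTokens): boolean {
    this.used += turn.inputTokens + turn.outputTokens;
    return this.used > this.maxEffectiveTokens;
  }

  get total(): number {
    return this.used;
  }
}
```

A cost-tracker extension could call `recordTurn` from its turn_end handler and abort the session when it returns true.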

@pelikhan pelikhan merged commit 20ed6d7 into main May 2, 2026
@pelikhan pelikhan deleted the pelikhan/aw-harness-plan branch May 2, 2026 11:59
3 participants