feat: query /reflect before and after running the agent in harnesses #30028
Agent-Logs-Url: https://github.com/github/gh-aw/sessions/e7a915e7-dc86-42df-b7cb-3bd3a0835751 Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
@copilot also update the pi agent api-proxy provider extension
Agent-Logs-Url: https://github.com/github/gh-aw/sessions/d5620cf5-a2f1-40d2-9071-c6eae9dca649 Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Done in 91c864b. Added an
Warning: Firewall rules blocked me from connecting to one or more addresses.
Pull request overview
This PR is intended to improve observability around AWF proxy state by capturing /reflect snapshots before agent execution begins and, for Pi, after the session ends.
Changes:
- Added a best-effort pre-run `fetchAWFReflect` call to the Copilot and Claude harnesses.
- Added a post-run `agent_end` `fetchAWFReflect` hook to the Pi provider for parity with other harnesses.
- Also includes unrelated documentation/catalog updates outside the stated `/reflect` harness scope.
Show a summary per file
| File | Description |
|---|---|
| docs/src/content/docs/reference/frontmatter-full.md | Adds/removes frontmatter reference documentation entries unrelated to the harness /reflect change. |
| docs/src/content/docs/agent-factory-status.mdx | Updates the agent/workflow status table with additional workflow entries and engine labels. |
| actions/setup/js/pi_provider.cjs | Adds pre-run and post-run /reflect fetches via Pi SDK event handlers. |
| actions/setup/js/copilot_harness.cjs | Adds a pre-run /reflect fetch before entering the Copilot retry loop. |
| actions/setup/js/claude_harness.cjs | Adds a pre-run /reflect fetch before entering the Claude retry loop. |
Copilot's findings
Comments suppressed due to low confidence (1)
actions/setup/js/pi_provider.cjs:117
- The new `agent_end` `/reflect` fetch is not covered by tests. This file currently has no test companion even though nearby Pi extensions are covered (`actions/setup/js/pi_steering_extension.test.cjs`), so a regression in whether the handler is registered or awaited would go unnoticed.
pi.on("agent_end", async () => {
// Fetch AWF API proxy reflection data after the agent finishes for the post-run step summary.
// This is best-effort: failures are logged but do not affect the agent exit code.
await fetchAWFReflect({
reflectUrl: AWF_API_PROXY_REFLECT_URL,
outputPath: AWF_REFLECT_OUTPUT_PATH,
timeoutMs: AWF_REFLECT_TIMEOUT_MS,
modelsTimeoutMs: AWF_MODELS_URL_TIMEOUT_MS,
logger: log,
- Files reviewed: 5/5 changed files
- Comments generated: 4
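The coverage gap noted above (whether the `agent_end` handler is registered and awaited) could be checked with a small fake in place of the Pi SDK. The sketch below is only an illustration under stated assumptions: `createFakePi` and `registerReflectHandlers` are hypothetical names, the fake implements only an `on`/`emit` pair, and the injected `fetchReflect` stands in for `fetchAWFReflect`. It is not the structure of the real `pi_provider.cjs`.

```javascript
"use strict";

// Hypothetical stand-in for the Pi extension API: it records handlers per
// event and awaits each one in order when the event is emitted.
function createFakePi() {
  const handlers = new Map();
  return {
    on(event, handler) {
      if (!handlers.has(event)) handlers.set(event, []);
      handlers.get(event).push(handler);
    },
    async emit(event) {
      for (const handler of handlers.get(event) || []) {
        await handler();
      }
    },
    count(event) {
      return (handlers.get(event) || []).length;
    },
  };
}

// Hypothetical registration function (assumption): wires a pre-run and a
// post-run reflect fetch to the two lifecycle events named in the PR.
function registerReflectHandlers(pi, fetchReflect) {
  pi.on("agent_start", async () => {
    await fetchReflect("pre-run");
  });
  pi.on("agent_end", async () => {
    await fetchReflect("post-run");
  });
}
```

A test built on this fake can assert that exactly one handler is registered per event and that the fetch has completed by the time `emit("agent_end")` resolves, which would fail if the handler dropped its `await`.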
# Named model alias definitions with ordered fallback lists, resolved recursively
# by AWF. Each key is an alias name (use empty string "" for the default policy).
# Each value is an ordered list of vendor/modelid glob patterns or other alias
# names to try in sequence. Entries defined here are merged on top of the builtin
# aliases; the main workflow file always wins over imported aliases. Builtin
# aliases include: sonnet, haiku, opus, gpt-5, gpt-5-mini, gpt-5-codex,
# gemini-flash, gemini-pro, small, mini, large, auto.
# (optional)
models:
| [Deployment Incident Monitor](https://github.com/github/gh-aw/blob/main/.github/workflows/deployment-incident-monitor.md) | copilot | [](https://github.com/github/gh-aw/actions/workflows/deployment-incident-monitor.lock.yml) | - | - |
| [Design Decision Gate 🏗️](https://github.com/github/gh-aw/blob/main/.github/workflows/design-decision-gate.md) | claude | [](https://github.com/github/gh-aw/actions/workflows/design-decision-gate.lock.yml) | - | - |
| [Dev](https://github.com/github/gh-aw/blob/main/.github/workflows/dev.md) | copilot | [](https://github.com/github/gh-aw/actions/workflows/dev.lock.yml) | `daily around 9:00` | - |
| [Dev](https://github.com/github/gh-aw/blob/main/.github/workflows/dev.md) | pi | [](https://github.com/github/gh-aw/actions/workflows/dev.lock.yml) | `daily around 9:00` | - |
// Fetch AWF API proxy reflection data before running the agent to capture initial proxy state.
// This is best-effort: failures are logged but do not affect the agent run.
await fetchAWFReflect({ logger: log });
// Fetch AWF API proxy reflection data before running the agent to capture initial proxy state.
// This is best-effort: failures are logged but do not affect the agent run.
await fetchAWFReflect({ logger: log });
The harnesses previously called `fetchAWFReflect` only after the agent run. If a harness exits unexpectedly before reaching the post-run fetch, the proxy state is never captured.

Changes
- `claude_harness.cjs` / `copilot_harness.cjs`: Add a best-effort `fetchAWFReflect` call before the agent execution loop, immediately after startup validation. This writes the initial api-proxy `/reflect` snapshot to disk and logs it before any agent output, making proxy configuration available for diagnostics regardless of how the run ends.
- `pi_provider.cjs`: Add an `agent_end` handler that calls `fetchAWFReflect` after the Pi agent session finishes. The extension already fetched `/reflect` on `agent_start` (pre-run); this adds the matching post-run fetch for parity with the other harnesses.

The post-run call in the copilot and claude harnesses is unchanged; it remains the authoritative snapshot for the step summary.
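The ordering described above (best-effort snapshot, agent attempts, best-effort snapshot) can be sketched roughly as follows. This is a simplified illustration, not the harness code: `bestEffort` and `runHarness` are hypothetical names, `snapshot` stands in for the `fetchAWFReflect` call, and the retry policy (stop on exit code 0, up to `maxAttempts`) is an assumption made only for the example.

```javascript
"use strict";

// Hypothetical helper (assumption): run an async step, log any failure, and
// never let it propagate to the caller. Returns true on success.
async function bestEffort(label, step, logger = console.error) {
  try {
    await step();
    return true;
  } catch (err) {
    const message = err && err.message ? err.message : String(err);
    logger(`${label} failed (ignored): ${message}`);
    return false;
  }
}

// Hypothetical harness shape (assumption): snapshot before the attempt loop,
// run the agent with retries, then snapshot again after the loop.
// `runAgent(attempt)` resolves to an exit code.
async function runHarness({ runAgent, snapshot, maxAttempts = 3, logger = console.error }) {
  // Pre-run: capture the initial proxy state before any agent output.
  await bestEffort("pre-run reflect", snapshot, logger);

  let exitCode = 1;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    exitCode = await runAgent(attempt);
    if (exitCode === 0) break;
  }

  // Post-run: capture the final proxy state; failures do not change exitCode.
  await bestEffort("post-run reflect", snapshot, logger);
  return exitCode;
}
```

Because both snapshot calls go through `bestEffort`, a failing `/reflect` request is logged but neither blocks the agent from starting nor alters the exit code the harness returns.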