feat(compile): add execution-context plugin with PR contributor (#860)#865
feat(compile): add execution-context plugin with PR contributor (#860)#865jamesadevine wants to merge 6 commits into
Conversation
🔍 Rust PR ReviewSummary: Looks good overall — the trust boundary design is solid and the test coverage is thorough. One stale doc/comment inconsistency in the security-critical section warrants a fix before merge; two minor latent hazards noted below. Findings🔒 Security Concerns
|
🔍 Rust PR ReviewSummary: Well-engineered feature with a solid security design — a few correctness and documentation gaps worth addressing before merge. Findings🐛 Bugs / Logic Issues
🔒 Security Concerns
|
🔍 Rust PR ReviewSummary: Looks good overall — the trust-boundary design is solid and the test suite is thorough. Two items worth addressing before merge. Findings
|
🔍 Rust PR ReviewSummary: Looks good — well-designed, with excellent security posture and thorough test coverage. One defense-in-depth gap in a regex and a cosmetic EOF issue worth fixing before merge. Findings
|
🔍 Rust PR ReviewSummary: Well-designed with strong trust-boundary controls and excellent test coverage; one low-severity prompt injection vector worth addressing. Findings🔒 Security Concerns
🐛 Bugs / Logic Issues
|
Adds an always-on ExecContextExtension that materialises `aw-context/pr/*` on disk for PR-triggered runs, so reviewer/triage agents stop reinventing `git fetch` / `git diff` / merge-base resolution in every workflow body. Bearer is mapped only into the precompute step's env and injected into git via `GIT_CONFIG_COUNT/KEY_0/VALUE_0` (never argv); no `persistCredentials: true` and no checkout override, so the AWF-sandboxed agent never sees `SYSTEM_ACCESSTOKEN`. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Aligns the execution-context plugin with gh-aw's model: the agent
does its own `git diff` against precomputed SHAs, the compiler only
handles the parts the sandbox can't (fetch+merge-base resolution,
bearer-protected).
Artefacts collapse from ~6 files to 2 (`base.sha`, `head.sha`) plus
one failure file (`error.txt`). Short identifiers (PR id / project /
repo) are interpolated directly into the prompt fragment via
`printf` calls appended to `/tmp/awf-tools/agent-prompt.md` from the
prepare step (same mechanism gh-aw uses for built-in prompt sections).
The prompt fragment lists the right `git` commands and example ADO
MCP tool calls (`repo_get_pull_request_by_id` etc.) with PR id /
project / repo pre-filled. When the precompute fails, a failure
fragment tells the agent to surface the error rather than produce an
empty review (ADO MCP calls still possible).
Auto-extends the agent's bash allow-list with 7 read-only git
commands (`git`, `git diff`, `git log`, `git show`,
`git status`, `git rev-parse`, `git symbolic-ref`) when the PR
contributor activates, so a restricted-bash agent can still inspect
the diff. Opt-out via `execution-context.pr.enabled: false`.
Dropped from v1: `status.txt` protocol, `metadata.txt`,
`changed-files*.txt`, `diff.patch` (with scope/unified/max-bytes
knobs), and `head-files/` / `base-files/` snapshots. The
`execution-context.pr` block collapses to `{ enabled }` only.
Trust boundary unchanged: `SYSTEM_ACCESSTOKEN` scoped to the
prepare step's env, `GIT_CONFIG_*` bearer injection (no argv leak),
no `persistCredentials`, no checkout override, condition gate on
`Build.Reason == 'PullRequest'`.
Issue: #860
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Four review findings from the automated Rust PR reviewer
(workflow run 27037700172).
1. Bug — inconsistent `base.sha` semantics across paths. The
synthetic-merge-commit path used `HEAD^1` (target tip), while the
progressive-deepening path used `git merge-base` (common ancestor).
`git diff $BASE..$HEAD` therefore produced different change sets
depending on ADO's checkout mode. Fixed: synthetic path now also
computes `git merge-base HEAD^1 HEAD^2`, so `BASE_SHA` is always
the common ancestor ("PR diff since branch-point").
2. Cleanup — unreachable `|| echo 0`. `wc -w` always exits 0 and
prints "0" on empty input, so the fallback never fired. Replaced
with a clean `${PARENTS:-0}` parameter-expansion default.
3. Security defence-in-depth — missing `PR_TARGET_BRANCH` validation.
`PR_ID`/`PROJECT`/`REPO` were strictly regex-validated; only
`PR_TARGET_BRANCH` was just checked for emptiness despite being
interpolated into a git refspec. Added `^[A-Za-z0-9._/-]+$` regex
guard with the same posture as the other identifiers.
4. Footgun — explicit `pr.enabled: true` widened agent bash on
non-PR agents. Without `on.pr` configured, the prepare step is dead
code (runtime `Build.Reason == 'PullRequest'` gate), but the 7 git
commands were still appended to the agent's bash allow-list at
compile time. Now `on.pr` is REQUIRED — `pr.enabled: true` without
it is treated the same as the default (inactive). Aggregate in
`ExecContextExtension::new` updated to match.
Tests:
- `test_execution_context_pr_synthetic_merge_uses_merge_base` (new)
— regression guard against the `HEAD^1`-direct-assignment shape.
- `test_execution_context_pr_target_branch_validated` (new)
— asserts the new regex guard is emitted.
- `test_execution_context_pr_enabled_true_without_on_pr_is_inactive`
(new) — prepare step and bash allow-list both suppressed when
on.pr is absent.
- All 9 `test_execution_context_pr_*` tests pass; full `cargo test`
green (1739 unit + 127 compiler + bash_lint with shellcheck).
Docs updated: synthetic-path behaviour, `PR_TARGET_BRANCH` validation,
and `on.pr`-required activation rule in `docs/execution-context.md`.
Note: the first review (run 27020920083) was against the pre-v6.2
implementation; its concerns about `git -c http.extraheader` in
docs/comments were already resolved by the v6.2 rewrite (both now
describe `GIT_CONFIG_*` env-var injection).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Two additional suggestions from the latest review pass.
1. Divergence-trap unit tests for `any_contributor_active`.
The pre-computation in `ExecContextExtension::new` duplicates each
contributor's `should_activate` logic so the aggregate flag can gate
`required_bash_commands` (which receives no `CompileContext`). The
existing tests exercise this only via fixture-compile paths through
`prepare_steps`, so a future contributor that wires up
`should_activate` but forgets to OR-in the aggregate could silently
suppress its bash allow-list contributions and only be caught at
E2E time.
Added 6 focused unit tests in `src/compile/extensions/exec_context/
mod.rs::tests` that construct `ExecContextExtension::new` directly
and assert `required_bash_commands` matches `should_activate` for
every combination of:
- on.pr configured / absent
- pr.enabled default / explicit true / explicit false
- master switch on / off
A future divergence trips here at unit-test time rather than at E2E
time.
2. Hard-fail on infra failures BEFORE the soft `fail()` machinery.
Without `set -e`, a failed `mkdir -p "$AW_PR_DIR"` (e.g. read-only
workspace) would silently continue, `fail()` would itself fail to
write `error.txt`, and the step would exit 0 with nothing staged
AND no failure signal in the agent prompt.
Added an explicit `mkdir -p "$AW_PR_DIR" || { echo ...; exit 1; }`
hard-fail at the top of the prepare step, with a message that
points the operator at `BUILD_SOURCESDIRECTORY` permissions.
Unlike the merge-base soft-fail path (normal operational case),
a missing output directory always indicates a broken pipeline
configuration and warrants a hard build break.
Tests: 1745 unit (+6) + 127 compiler + bash_lint with shellcheck;
clippy clean.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ript bundle The PR contributor's prepare step used to embed ~190 lines of bash heredoc into the emitted YAML, with only end-to-end shellcheck for coverage. Port that work to a new `exec-context-pr.js` ado-script bundle alongside the existing `gate.js` and `import.js` bundles. What's in the bundle (scripts/ado-script/src/exec-context-pr/): - validate.ts - strict allowlist regex guards for PR id, project, repo, target branch. - git.ts - execFile wrappers + bearerEnv helper that injects GIT_CONFIG_* into the spawned git child env only. - merge-base.ts - synthetic-merge detection (parents >= 3) + progressive-deepening fetch (--depth=200/500/2000/--unshallow). - prompt.ts - success/failure prompt-fragment writers. - index.ts - orchestrator; hard-fails (exit 1) on mkdir failure, soft-fails (exit 0 + error.txt) on validation/merge-base failures. - 32 vitest unit tests + 3 e2e smoke tests against a synthetic-merge git repo. Rust side: - pr.rs shrinks from ~320 lines to ~145 lines; the emitted bash is 4 lines wrapping a single `node /tmp/ado-aw-scripts/ado-script/exec-context-pr.js` invocation. - AdoScriptExtension gains exec_context_pr_active: bool so the Agent-job install/download fires whenever either import.js or exec-context-pr.js is needed. - Shared pr_contributor_will_activate predicate keeps the new flag and ExecContextExtension::new in lockstep; cfg-aware variant lets unit tests pass a custom config without divergence. - Tests in compiler_tests.rs updated to assert the new node-invocation shape; two tests that asserted bash literals (synthetic-merge, target-branch regex) deleted - their logic is now covered by vitest. - dedupe_gate_only fixture pins execution-context.pr.enabled: false to keep the gate-only download-placement test narrow. Trust boundary improves: the bearer (SYSTEM_ACCESSTOKEN) is mapped into the Node process's env and is only propagated into the spawned git child via GIT_CONFIG_COUNT/KEY_0/VALUE_0 - never into the wrapping bash shell, never on argv, never written to .git/config. Release packaging (.github/workflows/release.yml) zips the new bundle into ado-script.zip. Docs (docs/ado-script.md, docs/execution-context.md, AGENTS.md) updated to reflect three bundles. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Five findings from the latest PR review on commit 7bcf107: 1. Security: sanitize unvalidated env values before embedding in the failure prompt fragment. A PR author with push access could craft a branch name containing CR/LF + markdown headers to inject content into the agent prompt via the validation-failure path. validate.ts now passes raw values through sanitizeForPrompt (CR/LF -> space, truncate to 80 chars with ellipsis) before building the reason string. Covers PR_ID, PROJECT, REPO, and PR_TARGET_BRANCH symmetrically. 2 new vitest cases guard the sanitization invariant. 2. Naming clarity: rename countParents -> countParentTokens in merge-base.ts. The function returns `1 + parentCount` (rev-list --parents output includes the commit SHA itself), so a normal 2-parent merge yields 3 tokens. The previous name made debug output (`parents=3`) misleading. Failure message updated to `parentTokens=N`. 3. Stale docstring: git.ts::bearerEnv mentioned the v6.2 bash `git_fetch` wrapper that no longer exists. Rewritten to drop the obsolete reference. 4. Smoke-test gap: the synthetic-merge success smoke test now cross-checks the bundle's emitted base.sha / head.sha against git's own `git rev-parse HEAD^2` / `git merge-base HEAD^1 HEAD^2` answer. Guards against silent SHA-transposition regressions. 5. Dedupe-fixture coverage gap: `dedupe_gate_only.md` pins `execution-context.pr.enabled: false` to keep its narrow contract, which left the `inlined-imports: true + on.pr + exec-context PR active` combination uncovered. Added `tests/fixtures/dedupe_exec_context_pr_only.md` and `test_exec_context_pr_only_downloads_bundle_in_agent_job_not_setup` to assert the Agent-job download fires for exec-context-pr.js alone (no gate, imports inlined). Verified: cargo build/test (1746 unit + 126 compiler_tests + all integration suites green), cargo clippy --all-targets --all-features clean, `npm run typecheck/test/build/test:smoke` (245 vitest + 6 smoke tests green). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
7bcf107 to
e5777a0
Compare
🔍 Rust PR ReviewSummary: Well-engineered with good trust-boundary design — one security gap in the TypeScript failure path, one visibility inconsistency in Rust. Findings🔒 Security Concerns
|
Summary
Closes #860.
Adds an always-on
ExecContextExtensionthat precomputes aminimal, focused set of execution context signals for the agent
on PR-triggered builds, and appends a tailored fragment to the agent
prompt so reviewer / triage agents stop reinventing
git fetch,merge-base resolution, and shallow-clone handling in every workflow
body.
The design follows the issue author's broader-scope request: a
plugin surface (
ContextContributortrait +Contributorenum)rather than a one-off PR helper. Future contributors (work-item,
build-history, schedule, …) plug into the same step slot and share
the same trust-boundary.
Design (v6.2 — aligned with gh-aw)
After several iterations, the design converged on a much smaller
surface than v1, inspired by how
gh-aw handles PR context. The
compiler does ONLY what the agent cannot do itself inside the AWF
sandbox:
git merge-baseresolves (requires the bearer; cannot happeninside the agent's network-isolated sandbox).
aw-context/pr/base.shaandaw-context/pr/head.sha(40-char hex SHAs) so the agent reuses them across many
git diffinvocations./tmp/awf-tools/agent-prompt.mdlisting the right
gitcommands and example ADO MCP tool calls(
repo_get_pull_request_by_id,repo_list_pull_request_threads,repo_create_pull_request_thread) with PR id / project / repopre-filled as literal values.
commands (
git,git diff,git log,git show,git status,git rev-parse,git symbolic-ref) so the agentcan inspect the diff locally.
The agent then runs
git diff $BASE..$HEADitself inside the AWFsandbox — the objects are already fetched into the workspace and
mounted in. No
changed-files.txt, nodiff.patch, nohead-files//base-files/snapshots, nostatus.txtprotocol.base.shaalways has the SAME semantics in both code paths(synthetic-merge-commit and progressive-deepening): the true common
ancestor via
git merge-base. The agent therefore sees the same"PR diff since branch-point" change set regardless of how ADO
checked out the workspace.
Trust boundary (non-negotiable)
persistCredentials: trueand no checkout override —would write the bearer to
.git/configinside the AWF-mountedworkspace where the agent could exfiltrate it.
SYSTEM_ACCESSTOKENis mapped from$(System.AccessToken)only into the precompute step's
env:block; never into theagent step.
git fetchviaGIT_CONFIG_COUNT/GIT_CONFIG_KEY_0/GIT_CONFIG_VALUE_0env vars (not
git -c http.extraheader=…argv — noprocess-listing leak).
condition: eq(variables['Build.Reason'], 'PullRequest').refspecs (
PR_ID,PROJECT,REPO,PR_TARGET_BRANCH) arestrictly regex-validated before use, even though they come from
ADO infra rather than user-controlled input.
Front matter (additive, optional)
on.pris REQUIRED for the contributor to activate. Settingpr.enabled: trueon a non-PR-triggered agent does NOT activatethe contributor (the prepare step would be dead code, and silently
widening the agent's bash surface would be a footgun).
Iterating to v6.2 — what changed vs v1
The original implementation staged 6 files and a
status.txtprotocol with 4 status values. After review feedback:
status.txt,metadata.txt,changed-files*.txt,diff.patch,head-files/,base-files/. The agent runs itsown
git diffagainst the precomputed SHAs.execution-context.prfrom 5 knobs(
scope/unified/max-diff-bytes/snapshots/enabled) toone (
enabled).directly into the prompt fragment via
printf, not staged asfiles.
error.txt+ failure prompt fragment that tellsthe agent to surface the error (and use ADO MCP as a fallback)
rather than produce an empty review.
Robustness fixes from review
base.shasemantics are now consistent across thesynthetic-merge-commit path (
HEAD^1 + HEAD^2) and theprogressive-deepening path. Both compute
git merge-basesogit diff $BASE..$HEADproduces the samechange set regardless of ADO's checkout mode.
PR_TARGET_BRANCHis regex-validated beforeinterpolation into the git refspec
(
^[A-Za-z0-9._/-]+$), matching the posture of the otheridentifiers.
pr.enabled: truewithouton.prdoes not activate thecontributor (no compile-time widening of the agent bash
allow-list for a step that can never run at runtime).
loop only breaks when
git merge-baseresolves (not on thefirst successful fetch).
GIT_CONFIG_*env, not argv.and asserts
SYSTEM_ACCESSTOKENappears only on theexec-context prepare step's
env:block (matched bydisplayName).Docs
docs/execution-context.md— full reference (motivation,contributors, field reference, agent-visible layout, prompt
fragment shape, bash-allowlist auto-extension, trust boundary,
precompute step walkthrough).
docs/front-matter.md— addedexecution-context:block tothe example.
AGENTS.md— index entry + architecture tree updated.prompts/create-ado-agentic-workflow.md— directs PR-reviewerauthors to use the auto-staged context and prompt fragment
rather than rolling their own precompute.
Test plan
cargo build— clean.cargo clippy --all-targets --all-features— clean.cargo test— 1739 unit tests + 127 compiler tests(9 exec-context tests) +
bash_lint_testswithENFORCE_BASH_LINT=1(shellcheck v0.10.0) + every other suite:all pass.
Exec-context tests (9 total)
test_execution_context_pr_compiled_output_is_valid_yamltest_execution_context_pr_emits_prepare_step_and_prompt_supplementtest_execution_context_pr_does_not_leak_system_accesstoken(parses YAML, asserts token env appears only on the
exec-context prepare step by
displayName)test_execution_context_pr_not_emitted_when_no_pr_triggertest_execution_context_pr_auto_extends_bash_allowlisttest_execution_context_pr_does_not_extend_bash_when_disabledtest_execution_context_pr_enabled_true_without_on_pr_is_inactivetest_execution_context_pr_synthetic_merge_uses_merge_base(regression guard against the inconsistent-
base.shabug)test_execution_context_pr_target_branch_validatedThe
execution-context-agent.mdfixture is also in thebash_lint_testsFIXTURESlist so the generated bash isshellchecked on every CI run.
E2E validation in a real ADO build (replacing existing
pr-reviewer steps with this contributor) is intentionally out of
scope for this PR — tracked separately because it requires a
real ADO run.