Skip to content

extracts amber.review from github workflow for reuse#707

Open
markturansky wants to merge 5 commits intoambient-code:mainfrom
markturansky:improvement/amber_review
Open

extracts amber.review from github workflow for reuse#707
markturansky wants to merge 5 commits intoambient-code:mainfrom
markturansky:improvement/amber_review

Conversation

@markturansky
Copy link
Contributor

No description provided.

@github-actions
Copy link
Contributor

test

@github-actions
Copy link
Contributor

Amber Code Review

Summary

This PR extracts the inline amber-auto-review prompt into a reusable slash command at .claude/commands/amber.review.md, enabling the review to be invoked both from CI and interactively from the Claude Code CLI. The command file is well-structured and comprehensive. The primary concern is that the two-step invocation pattern in the new workflow prompt introduces ambiguity that could cause double-execution.

Issues by Severity

Blocker Issues

None.

Critical Issues

None.

Major Issues

[1] Potential double-execution from step 1 + step 2 coupling

File: .github/workflows/amber-auto-review.yml, new prompt block (lines 81-83)

The new workflow prompt instructs Claude to (1) read the command file, then (2) invoke amber.review by name. If the Skill tool is available in the CI runner -- the --allowedTools constraint only restricts specific Bash subcommands, not the Skill tool -- Claude may treat amber.review as a discrete slash command invocation after already absorbing the full instructions from step 1. This re-expands the same command file and runs the review a second time, potentially posting two gh pr comment entries on every PR.

The original approach was unambiguous: all instructions were inline, one clear execution path.

Suggested fix -- pick one model, not both:

Option A -- pure skill invocation (simpler):

prompt: |
  amber.review REPO: github-repo PR NUMBER: pr-number
  After completing the review, use gh pr comment to post your findings.

Option B -- read-then-follow, no skill invocation:

prompt: |
  Read .claude/commands/amber.review.md and follow its instructions exactly.
  Arguments: REPO: github-repo PR NUMBER: pr-number
  After completing the review, use gh pr comment to post your findings.

Minor Issues

[1] Argument format inconsistency between workflow and command file

File: .github/workflows/amber-auto-review.yml line 82

The canonical invocation (including this review run) passes arguments as REPO: ambient-code/platform PR NUMBER: 707. The workflow now generates PR 707 in ambient-code/platform. Claude will parse either form, but aligning with the documented convention avoids edge-case ambiguity.

[2] No evidence of end-to-end workflow validation

The PR modifies the active CI review pipeline with no indication the new invocation pattern was tested against a real PR run. A silent failure here would disable automated code reviews with no immediate alert. Consider triggering a dry-run on a test PR, or noting in the PR description that this was manually validated.

Positive Highlights

  • Strong DRY improvement. Extracting the prompt into .claude/commands/amber.review.md means review logic lives in exactly one place. Criteria updates propagate automatically to both CI and local CLI invocations -- no more drift between the workflow prompt and the interactive experience.
  • Command file quality is high. Well-organized with clear sections: Goal, Execution Steps, per-layer checklists (Backend, Frontend, Security), severity classification, and output format. The Operating Principles guard against rubber-stamp reviews.
  • Workflow YAML is significantly cleaned up. Removing ~45 lines of inline prompt makes the CI configuration much easier to read and maintain.
  • Correct placement and formatting. .claude/commands/amber.review.md is the right location for Claude Code slash commands; the YAML frontmatter description field is properly formatted.
  • Scope guidance is complete. The Identify Changes to Review section handles all four invocation modes (PR number, file paths, branch, uncommitted changes), making the command genuinely general-purpose beyond CI.

Recommendations

  1. Resolve the double-execution risk before merge (Major Outcome: Reduce Refinement Time with agent System #1). Choose one invocation model. Option A (pure skill invocation) is simpler and more idiomatic for Claude Code commands.
  2. Align the argument format (Minor Outcome: Reduce Refinement Time with agent System #1) to REPO: X PR NUMBER: Y for consistency with canonical usage.
  3. Confirm end-to-end behavior (Minor Epic: RAT Architecture & Design #2) by triggering the workflow on a test PR and verifying exactly one review comment is posted.

Review performed by Amber (.claude/commands/amber.review.md) against repository standards in .claude/context/ and .claude/patterns/.

@github-actions
Copy link
Contributor

Amber Code Review

Summary

PR #707 extracts the inline amber review prompt from the GitHub Actions workflow into a reusable slash command at .claude/commands/amber.review.md. The refactoring concept is sound (DRY, reusable across CLI and CI), but the workflow migration introduces a blocker regression that will break every automated review, plus a secondary issue where comment-detection logic is mismatched with the new output heading.

Issues by Severity

Blocker Issues

.github/workflows/amber-auto-review.yml:82 — Hardcoded placeholder strings break all automated reviews

The PR replaces the working GitHub Actions context variable interpolation with literal placeholder strings:

# AFTER (broken)
Arguments: REPO: github-repo PR NUMBER: pr-number

# BEFORE (working)
REPO: ${{ github.repository }}
PR NUMBER: ${{ github.event.pull_request.number }}

When this workflow runs, Claude will receive github-repo and pr-number as literal strings. Every gh pr diff and gh pr view call will fail because those aren't real values. The automated review will produce no useful output (or error out entirely) on every PR.

Fix:

prompt: |
  Read .claude/commands/amber.review.md and follow its instructions exactly.
  Arguments: REPO: ${{ github.repository }} PR NUMBER: ${{ github.event.pull_request.number }}
  After completing the review, use gh pr comment to post your findings.

Critical Issues

None.

Major Issues

.github/workflows/amber-auto-review.yml:46,107 — Comment-detection heading mismatch breaks minimize and transparency-link steps

The new slash command defines the output format as:

# Amber Code Review

But two downstream workflow steps still look for the old heading:

// Line 46 — minimize old comments
'.body | startswith("# Claude Code Review")'

// Line 107 — add transparency link
c.body.startsWith('# Claude Code Review')

Neither step will match # Amber Code Review. The result:

  • Old review comments will never be minimized on re-runs (comment spam accumulates)
  • The transparency/AI-decision-process link will never be appended

Both fail silently (continue-on-error: true on the minimize step; the transparency step just logs "No review comment found").

Fix: Either update the command output heading to keep # Claude Code Review (lowest blast radius), or update both startsWith checks in the workflow to use # Amber Code Review.

Minor Issues

.claude/commands/amber.review.md:7-9$ARGUMENTS substitution is non-standard in CI context

The command file uses the Claude Code slash command $ARGUMENTS substitution syntax, which works correctly when a user types /amber.review REPO: foo PR: 123 in the CLI. However, the workflow invokes the command by asking Claude to read the file and follow it, not through the slash command system. In that context, $ARGUMENTS appears as a literal string and Claude must infer from the surrounding prompt what to substitute.

This works in practice but is fragile — it depends on Claude interpreting ambiguous instructions rather than deterministic substitution. Consider documenting this dual-use pattern in a comment in the workflow.

Positive Highlights

  • Excellent refactoring concept. Extracting the review logic into .claude/commands/amber.review.md is exactly the right move — the slash command is now usable both from the CLI (/amber.review) and from CI, eliminating duplication and making the review logic a first-class artifact.
  • Command structure is comprehensive and well-organized. The review axes, severity classification, backend/frontend/security checklists, and operating principles are all well-articulated and match the memory system standards precisely.
  • The description frontmatter is correct and will surface properly in Claude's slash command list.
  • The allowed tools restriction (gh pr comment, gh pr diff, gh pr view, gh issue list) is appropriately scoped — principle of least privilege is maintained.

Recommendations

  1. (Blocker — fix before merge) Replace github-repo and pr-number with ${{ github.repository }} and ${{ github.event.pull_request.number }} in the workflow prompt.
  2. (Major — fix before merge) Align the comment heading. Simplest fix: change # Amber Code Review in amber.review.md to # Claude Code Review to match the two downstream startsWith filters, or update both filters to # Amber Code Review.
  3. (Minor — optional) Add a comment in the workflow YAML explaining the $ARGUMENTS pattern so future maintainers understand why the prompt says "Arguments:" instead of invoking the slash command natively.

@jeremyeder
Copy link
Collaborator

@markturansky
Copy link
Contributor Author

@jeremyeder the command is inlined in the git workflow:

Read .claude/commands/amber.review.md and follow its instructions exactly. Arguments: REPO: github-repo PR NUMBER: pr-number After completing the review, use gh pr comment to post your findings.

@github-actions
Copy link
Contributor

Amber Code Review

Summary

This PR refactors the amber-auto-review GitHub workflow by extracting the 49-line inline Claude prompt into a reusable slash command at .claude/commands/amber.review.md. The intent — enabling both CI automation and interactive /amber.review use — is a solid architectural decision. However, a comment-header mismatch introduces a functional regression that silently breaks two workflow features, and the refactoring shifts the prompt into a checked-out file, which changes the security profile of the pull_request_target workflow.


Issues by Severity

Blocker Issues

1. Comment header mismatch breaks minimization and transparency link

  • Files: .github/workflows/amber-auto-review.yml (existing lines 46 and 107) vs .claude/commands/amber.review.md (line 73)
  • Problem: The command file instructs Claude to produce output with the heading # Amber Code Review, but two existing workflow steps filter for the old heading # Claude Code Review:
    • Minimize old Claude review comments step (line 46): startswith("# Claude Code Review")
    • Add workflow link with memory system visibility step (line 107): c.body.startsWith('# Claude Code Review')
  • Consequence:
    • Minimization never fires on new-format comments. Each PR push will accumulate a new unminimized # Amber Code Review comment indefinitely.
    • The transparency link is never appended. The actions/github-script step silently exits with "No review comment found", so users lose the AI decision-process link and memory-system visibility — the transparency features this workflow was specifically designed to provide.
  • Standard violated: Functional correctness / existing workflow contract.
  • Fix: Update both string comparisons in the workflow to match the new header, OR update amber.review.md to emit # Claude Code Review to preserve the existing filters. The latter is the lower-diff fix.

Critical Issues

2. Prompt injection risk introduced by reading command file from PR-head checkout

  • Files: .github/workflows/amber-auto-review.yml lines 26-31 (checkout step) and line 81 (prompt)

  • Problem: The workflow uses pull_request_target (which holds CLAUDE_CODE_OAUTH_TOKEN) and checks out the PR head — including forks. The updated prompt now reads from that checked-out tree: Read .claude/commands/amber.review.md and follow its instructions exactly.

    Before this PR the prompt was hardcoded in the YAML, which always comes from the base branch in pull_request_target context — making it tamper-proof. Now a fork PR that modifies .claude/commands/amber.review.md can inject arbitrary instructions into the Claude agent.

  • Mitigating factor: The allowedTools restriction limits Claude to gh pr comment, gh pr diff, gh pr view, and gh issue list, preventing arbitrary shell execution.

  • Residual risk: An attacker can still inject misleading review findings, post disinformation to the PR, or consume the OAuth token budget.

  • Standard violated: security-standards.md — no secrets should be reachable via untrusted input paths.

  • Fix (choose one):

    • Read the command file from the base ref, not the PR-head checkout.
    • Keep the prompt hardcoded inline (pre-PR approach) and keep the slash command file for interactive use only.
    • Add if: github.event.pull_request.head.repo.full_name == github.repository so external forks cannot trigger the workflow with a modified file.

Major Issues

3. $ARGUMENTS substitution relies on Claude inference, not a reliable mechanism

  • File: .claude/commands/amber.review.md lines 3-7
  • Problem: The command file embeds $ARGUMENTS inside a fenced code block. When Claude reads this file from the workflow, $ARGUMENTS is never substituted — it appears as a literal string. The workflow compensates with Arguments: REPO: ... PR NUMBER: ... and Claude must infer the mapping. This works in practice, but any future rewording of the workflow prompt could silently break argument passing with no error or warning.
  • Fix: Document the substitution contract explicitly in both files, or restructure so the argument section is a plain paragraph rather than a code-fenced placeholder.

Minor Issues

4. No guard comment documenting the header coupling

  • File: .github/workflows/amber-auto-review.yml lines 43-46 and 106-108
  • Problem: The hardcoded string # Claude Code Review in the minimize and transparency-link steps is now a hidden dependency on the output format of amber.review.md. There is no comment coupling the two, making it easy to break again on a future rename.
  • Fix: Add # Must match heading emitted by .claude/commands/amber.review.md.

5. Slash command omits guidance on CI vs interactive usage

  • File: .claude/commands/amber.review.md
  • Problem: When used interactively as /amber.review, the command produces the review but does not post it. The command file and workflow have diverged responsibilities with no documentation of the split, which may confuse future maintainers.
  • Fix: Add a note at the bottom of the command file explaining that in CI the calling workflow posts via gh pr comment, while in interactive use output is returned directly.

Positive Highlights

  • Strong refactoring instinct. Extracting 49 lines of inline YAML prompt into a versioned, reusable command file is the right DRY move. It enables /amber.review as an interactive slash command in any context, not just CI.
  • Command file structure is thorough and well-formed. The frontmatter description, checklist-style review axes, severity taxonomy, and structured output template are clear and tightly aligned with the memory-system standards.
  • Tool restrictions maintained. The allowedTools constraint was correctly preserved and acts as a meaningful blast-radius limiter.
  • Defensive workflow design. continue-on-error: true on the minimize step is good practice — the workflow won't fail on comment-API errors.

Recommendations

  1. (Blocker — fix before merge) Align comment-header strings. Easiest fix: change the output heading in amber.review.md from # Amber Code Review to # Claude Code Review.

  2. (Critical — strongly recommended before merge) Harden against prompt injection by reading .claude/commands/amber.review.md from the base ref, or guard the workflow against fork PRs.

  3. (Major) Document or eliminate the implicit $ARGUMENTS substitution contract between the workflow prompt and the command file placeholder.

  4. (Minor) Add coupling comments in the workflow wherever the command file's output format is assumed.

@ambient-code ambient-code bot added this to the Merge Queue milestone Feb 27, 2026
@github-actions

This comment has been minimized.

@ambient-code ambient-code bot removed this from the Merge Queue milestone Feb 27, 2026
@markturansky
Copy link
Contributor Author

Claude Code Review — PR #707

Summary

Extracts the inline review prompt from amber-auto-review.yml into a reusable .claude/commands/amber.review.md command file, and adds fork-PR prompt injection protection by checking out the base ref separately. Clean, well-structured refactor that improves maintainability and security.

Issues by Severity

Blocker Issues

None

Critical Issues

None

Major Issues

1. Fork PR detection relies on prompt-level instruction, not workflow enforcement.github/workflows/amber-auto-review.yml:88-96

The fork detection (Fork PR: ${{ ... }}) is passed as a text hint to Claude, which must then decide which file to read. A malicious fork PR could include a modified .claude/commands/amber.review.md in the PR head that tells Claude to ignore the "Security Note" and read from the head checkout instead. Since Claude is an LLM, this is a soft gate, not a hard gate.

Suggested hardening: Use a workflow-level conditional to set the command file path deterministically:

- name: Determine review command source
  id: review-source
  run:  < /dev/null | 
    if [ "${{ github.event.pull_request.head.repo.full_name }}" \!= "${{ github.repository }}" ]; then
      echo "cmd_path=base-ref/.claude/commands/amber.review.md" >> "$GITHUB_OUTPUT"
    else
      echo "cmd_path=.claude/commands/amber.review.md" >> "$GITHUB_OUTPUT"
    fi

# Then in the prompt:
# Read and follow ${{ steps.review-source.outputs.cmd_path }} exactly.

This removes the decision from Claude and makes it a workflow-enforced path.

2. Trailing whitespace on line 89.github/workflows/amber-auto-review.yml:89

Line 89 (For fork PRs, read the command file from base-ref/.claude/commands/amber.review.md ) has a trailing space. The pre-commit trailing-whitespace hook should catch this, but YAML strings in | blocks may bypass it.

Minor Issues

1. The base-ref checkout runs on every PR, including same-repo PRs.github/workflows/amber-auto-review.yml:26-31

Minor overhead. Consider adding a condition: if: github.event.pull_request.head.repo.full_name \!= github.repository to skip the extra checkout for internal PRs. Low priority since fetch-depth: 1 makes it fast.

2. Command file frontmatter description could mention fork protection.claude/commands/amber.review.md:2

The description says "Perform a comprehensive code review..." but doesn't mention this is also the source-of-truth for the CI workflow. Could help future maintainers understand the dual use.

Positive Highlights

  1. Smart fork-PR security model: Checking out the base ref separately to prevent prompt injection from forks is a solid defensive pattern. The pull_request_target + base-ref approach is the correct GitHub Actions security pattern.
  2. Clean extraction: The command file is well-structured with clear review axes, severity classifications, and output format. Moving from inline YAML to a standalone file improves readability and testability significantly.
  3. All referenced memory files exist: The 7 context/pattern files referenced in the command all exist in the repo.
  4. Same-repo PRs can iterate on the review prompt: Allowing same-repo PRs to use their own modified command file is a good developer experience choice — authors can test review prompt improvements in their PR.

Recommendations

  1. (Major) Harden fork detection to workflow-level path resolution instead of prompt-level instruction
  2. (Minor) Fix trailing whitespace on workflow line 89
  3. (Minor) Consider conditional base-ref checkout only for fork PRs

@github-actions
Copy link
Contributor

github-actions bot commented Feb 27, 2026

Claude Code Review

Summary

This PR extracts the amber.review prompt from the inline GitHub Actions workflow into a reusable slash command file (.claude/commands/amber.review.md), and adds a meaningful security control that prevents fork PRs from injecting malicious review instructions. The change is well-scoped and the core security logic is sound. Two minor issues are worth addressing before merge.

Issues by Severity

Blocker Issues

None.

Critical Issues

None.

Major Issues

None.

Minor Issues

1. Direct expression interpolation in shell script

.github/workflows/amber-auto-review.yml -- Determine command file path step:

Using GitHub context expressions directly inside run: scripts is a GitHub Actions security anti-pattern (see GitHub hardening docs). github.event.pull_request.head.repo.full_name is technically user-controlled, and while GitHub naming rules make shell injection practically impossible here, the pattern is discouraged and sets a bad precedent. The fix is to assign the context values to env: variables in the step definition, then reference those env vars inside the shell script -- the standard pattern for safely using user-controlled context values in shell.

2. Double space in generated prompt

.github/workflows/amber-auto-review.yml -- prompt step:

There are two consecutive spaces between the repo value and PR NUMBER: in the arguments string. Unlikely to break anything but inconsistent.

3. ARGUMENTS template variable is undocumented in the command file

.claude/commands/amber.review.md -- lines 8-10:

The ARGUMENTS placeholder in the User Input section is a Claude slash-command template substitution, but this is not obvious to contributors reading the raw file. A brief note in the frontmatter description would aid discoverability.

Positive Highlights

  • Excellent security design: The fork detection logic is fail-safe -- if head.repo.full_name is null/empty, the comparison evaluates to true and correctly defaults to the safer base-ref path. Security failures close in the right direction.
  • Clean separation of concerns: The workflow handles CI orchestration and security gating; the command file owns review logic. This matches the project slash-command pattern and makes the review instructions reusable outside CI.
  • Meaningful inline comment: The comment 'Deterministic security logic - no AI reasoning involved' is excellent documentation of intent -- it tells future maintainers why the fork check is a shell script rather than part of the AI prompt.
  • Shallow clone for base-ref: Using fetch-depth: 1 for the base-ref checkout is a sensible performance choice since only the command file is needed.
  • Significant prompt reduction: Replacing 55 lines of inline prompt with 3 lines that delegate to the command file is a clear maintenance win -- future changes to review behavior only require touching one file.

Recommendations

  1. (Minor, easy fix) Apply the env variable indirection pattern to the shell script to align with GitHub security best practices.
  2. (Minor) Fix the double space in the prompt arguments line.
  3. (Optional) Add a brief note in the command file clarifying that ARGUMENTS is a template substitution variable replaced at invocation time.

Review generated by amber.review


🔍 View AI decision process (logs available for 90 days)

📋 View memory system files loaded (click to expand)

What Amber Loaded for Code Review

Amber automatically loaded these repository standards from the memory system:

  1. CLAUDE.md - Master project instructions, development standards
  2. backend-development.md - Go backend, K8s integration patterns
  3. frontend-development.md - NextJS, Shadcn UI, React Query patterns
  4. security-standards.md - Auth, RBAC, token handling
  5. k8s-client-usage.md - User token vs service account patterns
  6. error-handling.md - Consistent error patterns
  7. react-query-usage.md - Data fetching patterns

Impact: This review used your repository's specific code quality standards, security patterns, and best practices from the memory system (PRs #359, #360) - not just generic code review guidelines.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants