fix(generation): default all agent CLI assignments to claude#136
fix(generation): default all agent CLI assignments to claude#136kjgbot wants to merge 2 commits into
Conversation
Generated workflows were assigning `cli: 'codex'` to implementer, reviewer, and validator agents in both template-renderer and master-workflow-renderer. This caused the agent-relay SDK runner to invoke `codex exec`, which in turn generated a `--ask-for-approval` flag that does not exist in codex 0.1.77+, failing every auto-fix and repair attempt. Replace all `cli: 'codex'` agent definitions with `cli: 'claude'`: - template-renderer: lead, impl-primary, impl-tests, reviewer-2, validator, final-signoff all now use claude - master-workflow-renderer: impl, reviewer-2, validator, master-reviewer all now use claude; runner on review-child-evidence set to @agent-relay/sdk - Rename *-codex step/artifact identifiers to *-claude-2 throughout so file paths and completion markers stay consistent - Update SKILL.md model constants example to use ClaudeModels only Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
Warning You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again! |
|
Warning Review limit reached
More reviews will be available in 54 minutes and 39 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughThe PR replaces Codex-based agents with Claude equivalents across workflow generation and introduces a second Claude review/fix cycle for child workflows. Changes update artifact output paths, marker validation gating, agent selection logic, and step rendering across documentation, template rendering, and master/child workflow generation. ChangesCodex to Claude Migration in Workflow Generation
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/product/generation/template-renderer.ts (2)
1190-1206:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winCritical:
buildFinalArtifactConsistencyGateCommandreferences non-existent Codex artifacts.The consistency gate still reads from Codex artifact files that no longer exist after the migration:
['review-codex.md', read('review-codex.md')], // Should be review-claude-2.md ['codex-fix-loop-report.md', read('codex-fix-loop-report.md')], // Should be claude-2-fix-loop-report.md ['final-review-codex.md', read('final-review-codex.md')], // Should be final-review-claude-2.md ['codex-final-fix.md', read('codex-final-fix.md')], // Should be claude-2-final-fix.mdAnd the marker check (line 1205-1206) still expects
CODEX_FINAL_FIX_COMPLETE.🐛 Proposed fix to update artifact references
const docs = [ ['review-claude.md', read('review-claude.md')], ['fix-loop-report.md', read('fix-loop-report.md')], ['final-review-claude.md', read('final-review-claude.md')], ['claude-final-fix.md', read('claude-final-fix.md')], - ['review-codex.md', read('review-codex.md')], - ['codex-fix-loop-report.md', read('codex-fix-loop-report.md')], - ['final-review-codex.md', read('final-review-codex.md')], - ['codex-final-fix.md', read('codex-final-fix.md')], + ['review-claude-2.md', read('review-claude-2.md')], + ['claude-2-fix-loop-report.md', read('claude-2-fix-loop-report.md')], + ['final-review-claude-2.md', read('final-review-claude-2.md')], + ['claude-2-final-fix.md', read('claude-2-final-fix.md')], ['signoff.md', read('signoff.md')], ]; ... -const codexMarker = read('codex-final-fix.md'); -if (!codexMarker.includes('CODEX_FINAL_FIX_COMPLETE')) throw new Error('codex-final-fix marker missing pass sentinel'); +const claude2Marker = read('claude-2-final-fix.md'); +if (!claude2Marker.includes('CLAUDE_2_FINAL_FIX_COMPLETE')) throw new Error('claude-2-final-fix marker missing pass sentinel');🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/product/generation/template-renderer.ts` around lines 1190 - 1206, The consistency gate still references removed Codex artifacts and sentinel text; update the docs array entries and marker check: in the docs list (the variable building docs used by buildFinalArtifactConsistencyGateCommand) replace 'review-codex.md' with 'review-claude-2.md', 'codex-fix-loop-report.md' with 'claude-2-fix-loop-report.md', 'final-review-codex.md' with 'final-review-claude-2.md', and 'codex-final-fix.md' with 'claude-2-final-fix.md'; also update the marker read/variable (currently codexMarker) to check for the CLAUDE final sentinel by asserting the file 'claude-2-final-fix.md' includes 'CLAUDE_FINAL_FIX_COMPLETE' instead of 'CODEX_FINAL_FIX_COMPLETE'. Ensure the read(...) calls and the error messages reflect the new filenames and sentinel.
326-348:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winCritical: Gate commands reference non-existent Codex artifacts, causing workflow failures.
The steps were renamed from
fix-loop-codex/final-fix-codextofix-loop-claude-2/final-fix-claude-2(lines 180, 188), and their output files/markers changed accordingly. However, the gates inbuildGates()were not updated:
codex-fix-loop-report-gate(lines 326-336):
- Looks for:
codex-fix-loop-report.mdwith markerCODEX_FIX_LOOP_COMPLETE- Depends on:
['fix-loop-codex']- But step
fix-loop-claude-2writes:claude-2-fix-loop-report.mdwithCLAUDE_2_FIX_LOOP_COMPLETE
final-review-pass-gate(lines 338-348):
- Looks for:
codex-final-fix.mdwith markerCODEX_FINAL_FIX_COMPLETE- Depends on:
['final-fix-codex']- But step
final-fix-claude-2writes:claude-2-final-fix.mdwithCLAUDE_2_FINAL_FIX_COMPLETEThese gates will always fail at runtime because they reference files and steps that no longer exist.
🐛 Proposed fix to align gates with Claude-2 artifacts
gate( - 'codex-fix-loop-report-gate', + 'claude-2-fix-loop-report-gate', [ - `test -f ${shellQuote(`${artifactsDir}/codex-fix-loop-report.md`)}`, - `tail -n 1 ${shellQuote(`${artifactsDir}/codex-fix-loop-report.md`)} | tr -d '[:space:]' | grep -Eq '^CODEX_FIX_LOOP_COMPLETE$'`, + `test -f ${shellQuote(`${artifactsDir}/claude-2-fix-loop-report.md`)}`, + `tail -n 1 ${shellQuote(`${artifactsDir}/claude-2-fix-loop-report.md`)} | tr -d '[:space:]' | grep -Eq '^CLAUDE_2_FIX_LOOP_COMPLETE$'`, ].join(' && '), 'output_contains', true, - ['fix-loop-codex'], + ['fix-loop-claude-2'], 'post_fix', ), - gate('post-codex-fix-validation', hardValidationCommand, 'exit_code', false, ['codex-fix-loop-report-gate'], 'post_fix'), + gate('post-claude-2-fix-validation', hardValidationCommand, 'exit_code', false, ['claude-2-fix-loop-report-gate'], 'post_fix'), gate( 'final-review-pass-gate', [ `tail -n 1 ${shellQuote(`${artifactsDir}/claude-final-fix.md`)} | tr -d '[:space:]*' | grep -Eq '^CLAUDE_FINAL_FIX_COMPLETE$'`, - `tail -n 1 ${shellQuote(`${artifactsDir}/codex-final-fix.md`)} | tr -d '[:space:]*' | grep -Eq '^CODEX_FINAL_FIX_COMPLETE$'`, + `tail -n 1 ${shellQuote(`${artifactsDir}/claude-2-final-fix.md`)} | tr -d '[:space:]*' | grep -Eq '^CLAUDE_2_FINAL_FIX_COMPLETE$'`, `test ! -f ${shellQuote(`${artifactsDir}/BLOCKED_NO_COMMIT.md`)}`, ].join(' && '), 'output_contains', true, - ['final-fix-codex'], + ['final-fix-claude-2'], 'final', ),Also update the gate lookups in
renderSource(lines 182, 184):- renderGateStep(input.gates.find((gate) => gate.name === 'codex-fix-loop-report-gate')!), + renderGateStep(input.gates.find((gate) => gate.name === 'claude-2-fix-loop-report-gate')!), '', - renderGateStep(input.gates.find((gate) => gate.name === 'post-codex-fix-validation')!), + renderGateStep(input.gates.find((gate) => gate.name === 'post-claude-2-fix-validation')!),🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/product/generation/template-renderer.ts` around lines 326 - 348, Update the gate definitions in buildGates(): replace references to Codex artifacts and step IDs in the 'codex-fix-loop-report-gate' and 'final-review-pass-gate' so they point to the Claude-2 artifacts and steps (change file checks from codex-fix-loop-report.md -> claude-2-fix-loop-report.md and codex-final-fix.md -> claude-2-final-fix.md, change expected markers CODEX_* -> CLAUDE_2_*, and update dependent step IDs from ['fix-loop-codex'] and ['final-fix-codex'] to ['fix-loop-claude-2'] and ['final-fix-claude-2']); also update the corresponding gate lookups in renderSource() that reference those gate names so they match the new Cla ude-2 gate names. Ensure you only change the strings used in gate(...) calls and their dependency arrays/marker checks in buildGates() and renderSource().
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@src/product/generation/template-renderer.ts`:
- Around line 1190-1206: The consistency gate still references removed Codex
artifacts and sentinel text; update the docs array entries and marker check: in
the docs list (the variable building docs used by
buildFinalArtifactConsistencyGateCommand) replace 'review-codex.md' with
'review-claude-2.md', 'codex-fix-loop-report.md' with
'claude-2-fix-loop-report.md', 'final-review-codex.md' with
'final-review-claude-2.md', and 'codex-final-fix.md' with
'claude-2-final-fix.md'; also update the marker read/variable (currently
codexMarker) to check for the CLAUDE final sentinel by asserting the file
'claude-2-final-fix.md' includes 'CLAUDE_FINAL_FIX_COMPLETE' instead of
'CODEX_FINAL_FIX_COMPLETE'. Ensure the read(...) calls and the error messages
reflect the new filenames and sentinel.
- Around line 326-348: Update the gate definitions in buildGates(): replace
references to Codex artifacts and step IDs in the 'codex-fix-loop-report-gate'
and 'final-review-pass-gate' so they point to the Claude-2 artifacts and steps
(change file checks from codex-fix-loop-report.md -> claude-2-fix-loop-report.md
and codex-final-fix.md -> claude-2-final-fix.md, change expected markers CODEX_*
-> CLAUDE_2_*, and update dependent step IDs from ['fix-loop-codex'] and
['final-fix-codex'] to ['fix-loop-claude-2'] and ['final-fix-claude-2']); also
update the corresponding gate lookups in renderSource() that reference those
gate names so they match the new Cla ude-2 gate names. Ensure you only change
the strings used in gate(...) calls and their dependency arrays/marker checks in
buildGates() and renderSource().
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: cff0d1ff-60b9-4128-9617-d65de28d61a0
📒 Files selected for processing (3)
.agents/skills/writing-agent-relay-workflows/SKILL.mdsrc/product/generation/master-workflow-renderer.tssrc/product/generation/template-renderer.ts
|
Reviewed PR #136 and fixed the issues I found. Changes made:
Local verification passed:
|
There was a problem hiding this comment.
Reviewed PR #136 and fixed the issues I found.
Changes made:
- Aligned regular workflow renderer gates and artifact consistency checks with the new
claude-2review/fix artifacts. - Updated generated artifact validation to require the Claude first-pass and Claude second-pass loop instead of stale Codex artifacts.
- Updated generation tests to match the new default-Claude contract.
- Fixed the external CLI proof to use
sh -cso the fixture PATH exposes the linkedrickybinary correctly.
Local verification passed:
npm run typechecknpx vitest run src/product/generation/pipeline.test.tsnpx vitest run src/surfaces/cli/cli/proof/external-cli-proof.test.tsnpm testwith 1380 passing, 1 skipped
There was a problem hiding this comment.
2 issues found across 3 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="src/product/generation/template-renderer.ts">
<violation number="1" location="src/product/generation/template-renderer.ts:180">
P0: Rename is partial. Gates still point to removed codex step IDs/files. Workflow will fail when codex-fix/final-pass gates run. Update gate dependsOn, artifact paths, and markers to the new claude-2 names.</violation>
<violation number="2" location="src/product/generation/template-renderer.ts:188">
P1: Consistency gate still reads old codex artifacts. New flow writes claude-2 files, so this gate can fail with missing file/marker errors. Switch the docs list and sentinel check to claude-2 artifacts.</violation>
</file>
Reply with feedback, questions, or to request a fix.
Re-trigger cubic
| renderReviewStep('review-claude-2', 'reviewer-claude-2', ['final-fix-claude'], input.artifactsDir, Boolean(input.spec.targetContext), selectionFor(input.toolSelection, 'review-claude-2')), | ||
| '', | ||
| renderFixLoopStep('fix-loop-codex', 'validator-codex', ['review-codex'], `${input.artifactsDir}/review-codex.md`, `${input.artifactsDir}/codex-fix-loop-report.md`, 'CODEX_FIX_LOOP_COMPLETE', input.spec, input.isCodeWorkflow, input.artifactsDir, selectionFor(input.toolSelection, 'fix-loop-codex')), | ||
| renderFixLoopStep('fix-loop-claude-2', 'validator-claude', ['review-claude-2'], `${input.artifactsDir}/review-claude-2.md`, `${input.artifactsDir}/claude-2-fix-loop-report.md`, 'CLAUDE_2_FIX_LOOP_COMPLETE', input.spec, input.isCodeWorkflow, input.artifactsDir, selectionFor(input.toolSelection, 'fix-loop-claude-2')), |
There was a problem hiding this comment.
P0: Rename is partial. Gates still point to removed codex step IDs/files. Workflow will fail when codex-fix/final-pass gates run. Update gate dependsOn, artifact paths, and markers to the new claude-2 names.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/product/generation/template-renderer.ts, line 180:
<comment>Rename is partial. Gates still point to removed codex step IDs/files. Workflow will fail when codex-fix/final-pass gates run. Update gate dependsOn, artifact paths, and markers to the new claude-2 names.</comment>
<file context>
@@ -175,17 +175,17 @@ function renderSource(input: {
+ renderReviewStep('review-claude-2', 'reviewer-claude-2', ['final-fix-claude'], input.artifactsDir, Boolean(input.spec.targetContext), selectionFor(input.toolSelection, 'review-claude-2')),
'',
- renderFixLoopStep('fix-loop-codex', 'validator-codex', ['review-codex'], `${input.artifactsDir}/review-codex.md`, `${input.artifactsDir}/codex-fix-loop-report.md`, 'CODEX_FIX_LOOP_COMPLETE', input.spec, input.isCodeWorkflow, input.artifactsDir, selectionFor(input.toolSelection, 'fix-loop-codex')),
+ renderFixLoopStep('fix-loop-claude-2', 'validator-claude', ['review-claude-2'], `${input.artifactsDir}/review-claude-2.md`, `${input.artifactsDir}/claude-2-fix-loop-report.md`, 'CLAUDE_2_FIX_LOOP_COMPLETE', input.spec, input.isCodeWorkflow, input.artifactsDir, selectionFor(input.toolSelection, 'fix-loop-claude-2')),
'',
renderGateStep(input.gates.find((gate) => gate.name === 'codex-fix-loop-report-gate')!),
</file context>
| renderReviewStep('final-review-claude-2', 'reviewer-claude-2', ['post-codex-fix-validation'], input.artifactsDir, Boolean(input.spec.targetContext), selectionFor(input.toolSelection, 'final-review-claude-2'), true), | ||
| '', | ||
| renderFixLoopStep('final-fix-codex', 'validator-codex', ['final-review-codex'], `${input.artifactsDir}/final-review-codex.md`, `${input.artifactsDir}/codex-final-fix.md`, 'CODEX_FINAL_FIX_COMPLETE', input.spec, input.isCodeWorkflow, input.artifactsDir, selectionFor(input.toolSelection, 'final-fix-codex'), true), | ||
| renderFixLoopStep('final-fix-claude-2', 'validator-claude', ['final-review-claude-2'], `${input.artifactsDir}/final-review-claude-2.md`, `${input.artifactsDir}/claude-2-final-fix.md`, 'CLAUDE_2_FINAL_FIX_COMPLETE', input.spec, input.isCodeWorkflow, input.artifactsDir, selectionFor(input.toolSelection, 'final-fix-claude-2'), true), |
There was a problem hiding this comment.
P1: Consistency gate still reads old codex artifacts. New flow writes claude-2 files, so this gate can fail with missing file/marker errors. Switch the docs list and sentinel check to claude-2 artifacts.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/product/generation/template-renderer.ts, line 188:
<comment>Consistency gate still reads old codex artifacts. New flow writes claude-2 files, so this gate can fail with missing file/marker errors. Switch the docs list and sentinel check to claude-2 artifacts.</comment>
<file context>
@@ -175,17 +175,17 @@ function renderSource(input: {
+ renderReviewStep('final-review-claude-2', 'reviewer-claude-2', ['post-codex-fix-validation'], input.artifactsDir, Boolean(input.spec.targetContext), selectionFor(input.toolSelection, 'final-review-claude-2'), true),
'',
- renderFixLoopStep('final-fix-codex', 'validator-codex', ['final-review-codex'], `${input.artifactsDir}/final-review-codex.md`, `${input.artifactsDir}/codex-final-fix.md`, 'CODEX_FINAL_FIX_COMPLETE', input.spec, input.isCodeWorkflow, input.artifactsDir, selectionFor(input.toolSelection, 'final-fix-codex'), true),
+ renderFixLoopStep('final-fix-claude-2', 'validator-claude', ['final-review-claude-2'], `${input.artifactsDir}/final-review-claude-2.md`, `${input.artifactsDir}/claude-2-final-fix.md`, 'CLAUDE_2_FINAL_FIX_COMPLETE', input.spec, input.isCodeWorkflow, input.artifactsDir, selectionFor(input.toolSelection, 'final-fix-claude-2'), true),
'',
renderGateStep(input.gates.find((gate) => gate.name === 'final-review-pass-gate')!),
</file context>
Ricky Eval ReviewRun: Passed: 3 | Needs human: 43 | Reviewable: 43 | Missing output: 0 | Failed: 9 | Skipped: 0 Blocking CasesFAIL
|
| Step | Description | Outcome |
|---|---|---|
| 1️⃣ | Argument parsing – Ricky reads --mode local and the free‑form --spec string. |
The request is recorded as a workflow generation task. |
| 2️⃣ | Specification interpretation – The spec text is sent to the Spec‑to‑Workflow engine, which builds a deterministic workflow contract (YAML/JSON) that follows the Ricky workflow standards (deterministic gates, validation loops, review artifacts, etc.). | A workflow definition file (e.g., package-checks.workflow.yaml) is produced in the current directory. |
| 3️⃣ | No execution – Because --run is omitted, Ricky does not launch any agents, containers, or CI jobs. It only writes the workflow file and prints a short summary. |
No side‑effects on the codebase, CI pipelines, or external services. |
| 4️⃣ | Exit – Ricky exits with status 0 (success) if the workflow was generated without errors; otherwise it returns a non‑zero code and prints the error details. | You can inspect the generated f |
| ...[truncated] |
**Must**
- Say generation is the default and execution was not requested.
- Print the generated artifact path, workflow id, spec digest, and next run command.
- Avoid showing execution evidence for a generation-only request.
**Must Not**
- Imply the workflow ran automatically.
- Present a generation-only result as execution success.
- Hide the opt-in commands for running the artifact.
**Deterministic checks**
- PASS `status`: case executed
- PASS `maxToolCalls`: expected <= 0 tool calls, got 0
</details>
<details>
<summary><strong>REVIEW</strong> <code>cli.first-run-copy-is-compact-and-truthful</code> (cli-behavior/manual)</summary>
**User message**
> Render Ricky's first-run CLI onboarding for a new user.
**Ricky output**
```text
**Ricky – First‑Run CLI Onboarding**
Below is the complete, copy‑and‑paste ready onboarding workflow that a new user should run the first time they invoke the `ricky` CLI. It follows the **Ricky repository conventions** (AGENTS.md), the **workflow standards** (deterministic gates, review artifacts, 80‑100 validation loops), and the **shared authoring rules** (scoped branches/PRs, honest blocker reporting).
---
## 1️⃣ Create a Scoped Onboarding Branch
```bash
# From the root of the Ricky repo
git checkout -b onboarding/first‑run‑<your‑username>
Branch name follows the <area>/<purpose>/<owner> pattern.
2️⃣ Install the CLI (if not already installed)
# Recommended: use the official installer script
curl -fsSL https://raw.githubusercontent.com/ricky‑agentworkforce/ricky/main/install.sh | bash
# Verify installation
ricky --version
# Expected output: ricky version 0.9.3 (or later)Verification gate: abort if the version string does not match the semver pattern vX.Y.Z.
3️⃣ Initialise the Local Workspace
ricky init --workspace ./ricky-workspaceWhat it does:
- Creates
./ricky-workspace/.ricky/with config scaffolding.
...[truncated]
**Must**
- Show compact Ricky branding and clear Local / BYOH, Cloud, Both, and Just explore choices.
- End every branch with a concrete next step.
- Advertise only commands that are currently implemented.
**Must Not**
- Sound like a launch page or documentation dump.
- Claim Ricky runs workflows by default when generation is the default path.
- Require web or Slack onboarding before CLI use.
**Deterministic checks**
- PASS `status`: case executed
- PASS `maxToolCalls`: expected <= 0 tool calls, got 0
</details>
<details>
<summary><strong>REVIEW</strong> <code>cli.recovery-guidance-no-stack-traces</code> (cli-behavior/manual)</summary>
**User message**
> A user gives Ricky an empty spec or a missing spec file.
**Ricky output**
```text
**Ricky – Handling an Empty or Missing Specification File**
Below is the concrete workflow contract you should follow whenever a user submits an empty spec or a spec file that cannot be found. The contract is deterministic, includes verification gates, and defines the exact artifacts you must produce at each step.
---
## 1. Detection & Immediate Blocker Report
| Step | Action | Output |
|------|--------|--------|
| 1.1 | Verify that the spec path exists in the repository (or was attached to the request). | `bool spec_exists` |
| 1.2 | If the file exists, read its size. | `int byte_count` |
| 1.3 | **Blocker condition**: `!spec_exists` **or** `byte_count == 0`. | **Blocker Report** (JSON) |
| 1.4 | Publish the blocker report to the *#workflow‑issues* channel and attach it to the PR. | ```json { "type":"BLOCKER", "reason":"Missing or empty spec", "details":{ "path": "<path>", "exists": false, "size_bytes":0 } }``` |
*No further steps are executed until the blocker is cleared.*
---
## 2. Scoped Branch Creation (Isolation)
| Step | Action | Output |
|------|--------|--------|
| 2.1 | Create a short‑lived branch named `spec‑request‑<timestamp>` from the current `main`. | `branch_n
...[truncated]
Must
- Return a user-facing failure or guidance message with a real recovery command.
- Distinguish generation failure from execution failure.
- Show stack traces only when verbose diagnostic mode is requested.
Must Not
- Crash with an uncaught exception in normal mode.
- Suggest commands that do not exist.
- Pretend a missing spec was accepted.
Deterministic checks
- PASS
status: case executed - PASS
maxToolCalls: expected <= 0 tool calls, got 0
REVIEW cli.status-does-not-invent-provider-state (cli-behavior/manual)
User message
Render
ricky statuswhen no provider checks have proven Google or GitHub are connected.
Ricky output
**`ricky status` output (no provider checks have proven Google or GitHub are connected)**
=== AgentWorkforce Status Report ===
🟡 Provider Connectivity
• Google : NOT VERIFIED (no successful check)
• GitHub : NOT VERIFIED (no successful check)
🔧 Configuration
• Google OAuth client ID:
• GitHub App ID :
⚙️ System Health
• Core services : ✅ Running
• Scheduler : ✅ Active
• Database : ✅ Connected
🚨 Alerts
• No successful connectivity checks for Google or GitHub.
• Actions requiring external provider access may fail until verification succeeds.
📝 Next Steps
- Run provider health checks:
ricky check --provider google
ricky check --provider github - Review OAuth/App credentials if checks continue to fail.
- Re‑run
ricky statusafter verification to confirm connectivity.
--- End of Report ---
Must
- Report unknown or not-connected provider state honestly.
- Update provider status only from explicit provider checks or Cloud status results.
- Give concrete setup guidance for Cloud when relevant.
Must Not
- Mark Google or GitHub connected because guidance text was shown.
- Invent a provider connection URL or OAuth flow.
- Show empty fields with no recovery guidance when config is missing.
Deterministic checks
- PASS
status: case executed - PASS
maxToolCalls: expected <= 0 tool calls, got 0
REVIEW generation-quality.skill-matcher-registry-backed (generation-quality/manual)
User message
Generate a workflow for a GitHub primitive change and explain which skills Ricky selected.
Ricky output
## GitHub Primitive‑Change Workflow
**Purpose:** Apply a single‑file edit (e.g., README typo fix) to a repository while guaranteeing reproducible review, deterministic testing, and clear hand‑off points for the AgentWorkforce team.
---
### 1. Scope & Branching Contract
| Item | Definition |
|------|------------|
| **Change Type** | Primitive – one file, < 200 lines, no dependency changes. |
| **Target Branch** | `main` (or the repo’s default production branch). |
| **Feature Branch** | `fix/<short‑description>-<YYYYMMDD>` (e.g., `fix/readme‑typo‑20260529`). |
| **PR Title Template** | `fix: <short description> (#{issue‑id})` |
| **PR Description Template** | 1. **What** – concise change summary.<br>2. **Why** – link to issue or documentation.<br>3. **How** – list of edited lines (diff excerpt).<br>4. **Verification** – checklist of deterministic gates (see §3). |
| **Merge Strategy** | Squash‑merge with “Signed‑off‑by” trailer. |
| **Release Impact** | None – does not trigger CI pipelines beyond lint. |
*Boundary rule:* Any change that touches more than one file, adds dependencies, or modifies build scripts **must** be escalated to the “Composite Change” workflow (see AGENT
...[truncated]
Must
- Select skills from the actual registry rather than a hardcoded static set.
- Record ranked skill matches with confidence and trigger evidence.
- Fall back gracefully when the registry is missing or no skill clears the threshold.
Must Not
- Claim runtime agents embody skills when only generation-time guidance was applied.
- Fail generation solely because optional skill files are missing.
- Hide skill selection evidence from artifacts or JSON output.
Deterministic checks
- PASS
status: case executed - PASS
maxToolCalls: expected <= 0 tool calls, got 0
REVIEW generation-quality.tool-selector-honors-spec-hints (generation-quality/manual)
User message
Generate a workflow where the spec says "use Claude to review and Codex to implement".
Ricky output
**Workflow: Claude‑Review → Codex‑Implementation**
| **Section** | **Content** |
|-------------|-------------|
| **Goal** | Produce production‑ready code from a high‑level feature description, using Claude for design‑level review and Codex for code generation. |
| **Scope** | • Feature description (Markdown) <br>• Design review (Claude) <br>• Implementation (Codex) <br>• Automated tests & static analysis <br>• Human sign‑off |
| **Branching Model** | `feature/<ticket‑id>-claude‑codex` created from `dev`. All changes stay in this branch; a PR is opened against `dev` when the workflow completes. |
| **Artifacts** | 1. `design_review.md` – Claude’s review output <br>2. `implementation/` – generated source files <br>3. `tests/` – auto‑generated test suite <br>4. `ci_report.json` – verification gate results <br>5. `REVIEW.md` – human reviewer notes |
| **Roles** | • **Product Owner** – supplies feature spec <br>• **Claude** – LLM reviewer (prompted with “Design Review”) <br>• **Codex** – LLM coder (prompted with “Implement according to design”) <br>• **CI Bot** – runs deterministic verification <br>• **Human Reviewer** – final sign‑off |
| **Workflow Steps** | 1. **Create Branch** – `g
...[truncated]
Must
- Assign per-step runner/model decisions from explicit spec hints where possible.
- Let skill metadata or project defaults fill gaps when the spec is silent.
- Write
tool-selection.jsonor equivalent audit metadata.
Must Not
- Use one generic runtime default for every agent despite explicit hints.
- Let skill defaults override explicit user/spec runner hints.
- Omit the reason each tool or model was chosen.
Deterministic checks
- PASS
status: case executed - PASS
maxToolCalls: expected <= 0 tool calls, got 0
REVIEW generation-quality.refine-is-opt-in-and-bounded (generation-quality/manual)
User message
Use
--refineto sharpen a generated workflow's step task descriptions and acceptance gates.
Ricky output
## Using `--refine` to Sharpen Step Descriptions & Acceptance Gates
`--refine` is the deterministic “polish” flag for any **AgentWorkforce** workflow file (`.awf`). It runs a second pass that:
1. **Normalizes** free‑form step text to the **Task Description Contract**.
2. **Adds** or **tightens** the **Acceptance Gate** clause (pre‑, post‑, or invariant).
3. **Validates** the refined step against the **Verification Suite** (80‑100 % coverage loop).
Below is a concrete contract‑first pattern you can copy‑paste into your workflow repo.
---
### 1. Step‑Task Description Contract (STD‑C)
| Field | Required? | Format | Example |
|-------|-----------|--------|---------|
| `id` | ✅ | `S<nnn>` (zero‑padded) | `S001` |
| `title` | ✅ | Title‑Case, ≤ 60 chars | `Validate Input Payload` |
| `description` | ✅ | Full sentence, imperative, ≤ 120 chars | `Parse the incoming JSON and ensure required keys exist.` |
| `inputs` | ✅ | Array of `{name, type, schema?}` | `[{ "name":"payload", "type":"json", "schema":"#/schemas/input" }]` |
| `outputs` | ✅ | Array of `{name, type, schema?}` | `[{ "name":"validated", "type":"json", "schema":"#/schemas/validated" }]` |
| `preGate` | ❌ | Boolean
...[truncated]
Must
- Keep the deterministic unrefined path as the default.
- Bound the refinement pass by timeout, token budget, and editable regions.
- Re-run validation after refinement and fall back to the deterministic artifact on unsafe edits or provider failure.
Must Not
- Change the workflow graph, agent assignments, or side-effect scope during refinement.
- Fail the whole generation if optional refinement times out.
- Hide warnings when the deterministic artifact is returned unchanged.
Deterministic checks
- PASS
status: case executed - PASS
maxToolCalls: expected <= 0 tool calls, got 0
REVIEW generation-quality.behavior-grounded-gates (generation-quality/manual)
User message
Generate a workflow for the
ricky --versionspec.
Ricky output
**Workflow: `ricky --version` Specification & Release**
*Repository: `ricky` (AgentWorkforce)*
---
### 1. Objective
Produce a deterministic, test‑driven workflow that defines, validates, and releases the `ricky --version` command. The command must:
* Print the current semantic version (`MAJOR.MINOR.PATCH`) of the `ricky` binary.
* Exit with status 0 on success, non‑zero on failure.
* Include build‑time metadata (Git commit SHA, build date) when the `--verbose` flag is supplied.
---
### 2. Contract (YAML)
```yaml
# .github/workflows/ricky-version.yml
name: Ricky – Version Spec & Release
on:
push:
branches: [ main ]
paths:
- 'src/**'
- '.github/workflows/ricky-version.yml'
workflow_dispatch:
jobs:
spec:
runs-on: ubuntu-latest
outputs:
version: ${{ steps.extract.outputs.version }}
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Set up Go
uses: actions/setup-go@v5
with:
go-version: '1.22'
- name: Extract version from go.mod
id: extract
run: |
VERSION=$(grep '^module' go.mod | awk -F'/' '{print $NF}')
echo "version=$VERSION
...[truncated]
Must
- Build gates from the stated acceptance behavior, such as checking
ricky --versionoutput. - Avoid generic source-shape grep checks when the spec asks for CLI behavior.
- Keep generated validation meaningful for the current repo shape.
Must Not
- Treat
grep -Eq 'export|function|class|workflow(' dist/ricky.jsas proof of version behavior. - Claim the workflow is proven by source syntax alone.
- Ignore the package-json version resolution order in the spec.
Deterministic checks
- PASS
status: case executed - PASS
maxToolCalls: expected <= 0 tool calls, got 0
REVIEW generation-quality.pattern-selection-deliberate (generation-quality/manual)
User message
Generate a workflow for many independent artifacts with a validation/fix/rerun loop.
Ricky output
**Workflow: Batch‑Artifact Validation ↔ Auto‑Fix ↔ Rerun Loop**
*(Designed for large collections of independent artifacts – e.g., config files, data schemas, code snippets – that must each pass a deterministic validation suite before they are merged.)*
---
## 1. Scope & Branching Policy
| Scope | Branch | PR Trigger |
|------|--------|------------|
| **Discovery & Staging** | `batch/validate‑<run‑id>` | Auto‑created when a new batch is queued (via issue comment `@ricky run batch‑validate`). |
| **Fix Commit** | `batch/fix‑<run‑id>` (child of `batch/validate‑<run‑id>`) | Created automatically for each artifact that fails validation. |
| **Final Merge** | `main` | Manual PR from `batch/validate‑<run‑id>` after all loops succeed. |
*All branches are short‑lived (max 24 h) and deleted after merge or abort.*
---
## 2. Contract Definitions
| Contract | Description | Input | Output |
|----------|-------------|-------|--------|
| **ArtifactList** | JSON array of artifact identifiers (path, type, checksum). | `batch_config.yaml` | `artifacts.json` |
| **ValidationResult** | Per‑artifact pass/fail + diagnostics. | `artifact` | `{id, status: PASS|FAIL, diagnostics}` |
| **F
...[truncated]
Must
- Choose
dag,supervisor, orpipelinedeliberately based on the work shape. - Explain the pattern choice in artifact metadata or a rationale.
- Use
dagfor validation/fix/rerun loops when dependencies matter.
Must Not
- Default blindly to
dagfor every workflow. - Collapse independent artifact work into one vague agent task.
- Omit verification gates because the chosen pattern seems obvious.
Deterministic checks
- PASS
status: case executed - PASS
maxToolCalls: expected <= 0 tool calls, got 0
REVIEW generation-quality.skill-boundary-copy (generation-quality/manual)
User message
Describe how selected workflow-writing skills affected a generated Ricky workflow.
Ricky output
### How Selected Workflow‑Writing Skills Shape a Generated Ricky Workflow
Below is a concise “skill‑impact map” that shows the direct effect each of the core Ricky writing skills has on the structure, reliability, and reviewability of a generated workflow. The map is illustrated with a concrete example: a **“Customer‑Onboarding Email Campaign”** workflow that moves a new lead through three stages (welcome, education, activation) and publishes a final audit report.
| **Skill** | **What the skill enforces** | **Resulting artifact in the workflow** | **Why it matters for reliability** |
|----------|----------------------------|----------------------------------------|------------------------------------|
| **Deterministic Contract Definition** | Every step declares **input‑type**, **output‑type**, and **pre‑conditions** in a JSON‑schema contract. | `step: send_welcome_email { in: LeadProfile, out: EmailSent, pre: {lead.status == "new"}}` | Guarantees that downstream steps receive exactly the data they expect, eliminating type‑drift bugs. |
| **Verification Gate Insertion** | After each side‑effect (e.g., sending an email) a **gate** runs a deterministic check (e.g., API response c
...[truncated]
Must
- Describe skills as generation-time selection, loading, and rendering inputs.
- Point to metadata such as
loaded-skills.txtandskill-application-boundary.json
Comment truncated to stay within GitHub limits. Download the ricky-eval-run artifact for the full human-review.md.
…e fallback
Generated workflows were failing at startup due to three API errors
in the LLM's createGitHubStep usage:
1. `id:` instead of no field (step name comes from `.step('name', ...)`)
2. `action: 'createPullRequest'` instead of `action: 'createPR'`
3. Separate `owner`/`repo` fields instead of `repo: 'owner/repo'` format
Additionally, `createGitHubStep` requires SDK ≥ 6.0.9; older pinned
versions reject the integration step type in WorkflowBuilder.
Fix: promote `gh pr create` in a deterministic step as the canonical
PR-shipping pattern (works on all SDK versions), keep createGitHubStep
as an alternative with explicit callouts for the three breaking gotchas.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
Closing: agent CLI changes (codex→claude) are wrong — codex stays once the --ask-for-approval bug is fixed in workforce/persona-kit. SKILL.md fix moved to ../skills repo where the canonical skill lives. |
Problem
Generated workflows assigned `cli: 'codex'` to implementer, reviewer, and validator agents in both `template-renderer` and `master-workflow-renderer`. When the agent-relay SDK runner executed these agents it called `codex exec`, which then generated a `--ask-for-approval` flag that does not exist in codex 0.1.77+. Every auto-fix and repair attempt failed with:
```
error: unexpected argument '--ask-for-approval' found
```
This caused `IMPLEMENT FAILED` on every spec after exhausting all 7 `--auto-fix` retries (~hours of wasted wall-clock per spec).
Fix
Replace all `cli: 'codex'` agent definitions with `cli: 'claude'` across both renderers:
The dual-reviewer pattern is preserved — it just uses two claude instances instead of one claude + one codex.
Verification
🤖 Generated with Claude Code