feat(agent): use published @agent-relay/agent for scheduled monitor #101
…gent
Replaces the `file:` workspace link to `../cloud-runtime-run` with the published `@agent-relay/agent@^6.0.18` package and verifies the scheduled monitor agent against the published `Context`, `AgentHandle`, and `cron.tick` event shapes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
📝 Walkthrough
Adds a scheduled monitoring agent that wakes on a 5-minute cron tick, scans one or more persisted local-run state roots, deduplicates terminal runs, renders status-aware alerts (with optional reattach commands), and posts one-shot notifications to a configurable monitor channel. Includes tests, README docs, an entrypoint, and a package dependency.
Changes: Scheduled Monitoring Agent
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/scheduled-agent.test.ts (1)
56-80: ⚡ Quick win: Add an explicit `completed` notification test case.
Current assertions only exercise `blocked`. Since `shouldNotifyRunState` also allows `completed`, add one test to lock that behavior and avoid regressions.
✅ Suggested test addition
```diff
+  it("notifies for completed terminal states", () => {
+    const run = {
+      runId: "run-3",
+      status: "completed",
+      artifactPath: "workflows/generated/release.ts",
+      artifactDir: "/tmp/run-3",
+      statePath: "/tmp/run-3/state.json",
+      logPath: "/tmp/run-3/run.log",
+      evidencePath: "/tmp/run-3/evidence.json",
+      fixesPath: "/tmp/run-3/fixes.json",
+      reattachCommand: "ricky status --run run-3",
+      response: { ok: true, artifacts: [], logs: [], warnings: [], nextActions: [], exitCode: 0 },
+    } satisfies LocalRunMonitorState;
+
+    expect(shouldNotifyRunState(run)).toBe(true);
+    expect(renderRunMonitorAlert(run, "/repo")).toContain("background workflow completed");
+  });
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/scheduled-agent.test.ts` around lines 56 - 80, Add a new test case that mirrors the existing "blocked" spec but sets run.status = "completed" to lock behavior for completed runs; call shouldNotifyRunState(run) and assert it returns true, then call renderRunMonitorAlert(run, "/repo") and assert the output contains identifying text such as the run.runId (e.g., "run-3") and the run.reattachCommand so the alert for completed runs is exercised; reference shouldNotifyRunState and renderRunMonitorAlert to locate where to add the test.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@README.md`:
- Line 108: The README contains a machine-local absolute path link to
src/agent.ts; change it to a repo-relative Markdown link so it works for
others—replace the absolute path
"/Users/khaliqgant/Projects/AgentWorkforce/ricky/src/agent.ts" with
"./src/agent.ts" (or "src/agent.ts") in the sentence referencing src/agent.ts so
the link points to the repo's source file correctly.
---
Nitpick comments:
In `@src/scheduled-agent.test.ts`:
- Around line 56-80: Add a new test case that mirrors the existing "blocked"
spec but sets run.status = "completed" to lock behavior for completed runs; call
shouldNotifyRunState(run) and assert it returns true, then call
renderRunMonitorAlert(run, "/repo") and assert the output contains identifying
text such as the run.runId (e.g., "run-3") and the run.reattachCommand so the
alert for completed runs is exercised; reference shouldNotifyRunState and
renderRunMonitorAlert to locate where to add the test.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: a6ec6a86-a2b1-4f62-940e-865353f6c703
⛔ Files ignored due to path filters (1)
package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (5)
- README.md
- package.json
- src/agent.ts
- src/scheduled-agent.test.ts
- src/scheduled-agent.ts
```ts
): AgentHandle {
  const env = options.env ?? process.env;
  const repoRoot = resolve(options.repoRoot ?? defaultRickyRepoRoot(env));
  const stateRoot = localRunStateRoot(repoRoot, env);
```
🔴 Scheduled agent reads from XDG state path but background runs are written to legacy .workflow-artifacts/ path
The scheduled agent computes stateRoot via localRunStateRoot(repoRoot, env) at src/scheduled-agent.ts:126, which resolves to ~/.local/state/ricky/local-runs/<hash>/. However, the only production caller of startLocalRunMonitor (src/surfaces/cli/flows/local-workflow-flow.ts:303) does not pass a stateRoot option, so startLocalRunMonitor defaults to localWorkflowArtifactDir (src/surfaces/cli/flows/local-run-monitor.ts:49,90-91), which writes state to <cwd>/.workflow-artifacts/ricky-local-runs/<runId>/state.json. Because these are entirely different directories, listPersistedRunStates will always find zero runs, and the proactive monitoring feature will never post any alerts. The existing ricky status --run command (src/surfaces/cli/commands/cli-main.ts:929-938) already handles this dual-path situation by trying the XDG path first then falling back to legacyLocalRunStatePath — the scheduled agent should do the same.
Prompt for agents
The scheduled agent at src/scheduled-agent.ts:126 computes stateRoot using localRunStateRoot(repoRoot, env), which resolves to an XDG-based path like ~/.local/state/ricky/local-runs/<hash>/. But the actual production background-run monitor (startLocalRunMonitor in src/surfaces/cli/flows/local-run-monitor.ts) writes state files to the legacy path <cwd>/.workflow-artifacts/ricky-local-runs/<runId>/state.json when no stateRoot option is provided (which is the case for the only production caller at src/surfaces/cli/flows/local-workflow-flow.ts:303).
The scheduled agent's listPersistedRunStates will therefore always return an empty array because it is scanning the wrong directory.
To fix this, the scheduled agent needs to also scan the legacy state directory. One approach is to modify checkPersistedRuns (or createRickyScheduledAgent) to scan both the XDG-based localRunStateRoot path AND the legacy path at resolve(repoRoot, '.workflow-artifacts', 'ricky-local-runs'). This mirrors what ricky status --run already does in src/surfaces/cli/commands/cli-main.ts:929-938 by trying both localRunStatePath and legacyLocalRunStatePath.
Alternatively, the startLocalRunMonitor default could be changed to use localRunStateRoot instead of localWorkflowArtifactDir, but that would be a larger change affecting the existing CLI flow.
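The dual-root scan suggested above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's actual code: `candidateStateRoots`, `listRunStates`, and `scanAllRoots` are hypothetical names, and the XDG root is passed in as a plain argument because the hashing inside the real `localRunStateRoot` is not shown in this thread. The only on-disk layout assumed is `<root>/<runId>/state.json`, as described in the comment.

```typescript
import { existsSync, readdirSync, readFileSync } from "node:fs";
import { join, resolve } from "node:path";

// Hypothetical minimal shape; the real LocalRunMonitorState carries more fields.
interface RunState {
  runId: string;
  status: string;
}

// Returns the roots to scan: the XDG root first, then the legacy in-repo tree.
export function candidateStateRoots(repoRoot: string, xdgStateRoot: string): string[] {
  const legacyRoot = resolve(repoRoot, ".workflow-artifacts", "ricky-local-runs");
  return [resolve(xdgStateRoot), legacyRoot];
}

// Reads <root>/<runId>/state.json for every run directory under one root.
// Missing roots and unreadable or malformed state files are skipped.
export function listRunStates(root: string): RunState[] {
  if (!existsSync(root)) return [];
  const states: RunState[] = [];
  for (const entry of readdirSync(root, { withFileTypes: true })) {
    if (!entry.isDirectory()) continue;
    const statePath = join(root, entry.name, "state.json");
    try {
      const parsed = JSON.parse(readFileSync(statePath, "utf8")) as Partial<RunState>;
      if (typeof parsed.runId === "string" && typeof parsed.status === "string") {
        states.push({ runId: parsed.runId, status: parsed.status });
      }
    } catch {
      // No state.json yet, or invalid JSON: ignore this directory.
    }
  }
  return states;
}

// Concatenates the runs found under every candidate root, in root order.
export function scanAllRoots(repoRoot: string, xdgStateRoot: string): RunState[] {
  return candidateStateRoots(repoRoot, xdgStateRoot).flatMap(listRunStates);
}
```

Listing the XDG root first matters only if a later step keeps the first occurrence of a duplicated `runId`, since that is what makes the XDG copy win over the legacy one.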
- Scheduled agent now scans both the XDG state root and the in-repo `.workflow-artifacts/ricky-local-runs/` tree. The local-run flow at `surfaces/cli/flows/local-workflow-flow.ts` writes to the in-repo path by default, so the prior single-root scan was guaranteed to return zero runs and the monitor would never alert.
- Dedupe runs by `runId` across roots, preferring the first occurrence.
- Add a test that exercises dual-root scanning + dedupe, and a separate test that locks in the `completed` terminal-status alert wording.
- Replace the machine-local README link with a repo-relative path and document the legacy artifact root.
- Add docstrings to every new public export in `scheduled-agent.ts` for the CodeRabbit docstring-coverage gate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
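The dedupe rule named in the commit message (first occurrence per `runId` wins) might look like the sketch below. `dedupeByRunId` and the `root` field on the sample data are hypothetical, introduced only to make the ordering visible; the actual helper in `scheduled-agent.ts` is not shown in this thread.

```typescript
// Keeps the first state seen for each runId; later duplicates are dropped.
function dedupeByRunId<T extends { runId: string }>(runs: readonly T[]): T[] {
  const seen = new Set<string>();
  const unique: T[] = [];
  for (const run of runs) {
    if (seen.has(run.runId)) continue;
    seen.add(run.runId);
    unique.push(run);
  }
  return unique;
}

// Sample data: run-1 exists under both roots, run-2 only under the legacy root.
const fromXdgRoot = [{ runId: "run-1", status: "completed", root: "xdg" }];
const fromLegacyRoot = [
  { runId: "run-1", status: "blocked", root: "legacy" },
  { runId: "run-2", status: "blocked", root: "legacy" },
];

// Roots are concatenated in scan order, so the XDG copy of run-1 is kept.
const merged = dedupeByRunId([...fromXdgRoot, ...fromLegacyRoot]);
console.log(merged.map((r) => `${r.runId}:${r.root}`).join(", "));
// prints "run-1:xdg, run-2:legacy"
```

Because the result depends on input order, the caller decides which root takes precedence simply by the order in which it concatenates the per-root lists.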
Pushed 9ec1324 addressing the review feedback: Devin (state root bug, CodeRabbit ( CodeRabbit (test for CodeRabbit (docstring coverage) — added JSDoc to every new public export in Local verification: The "3 additional findings" Devin mentioned weren't posted inline — happy to address them if they get surfaced. |
Ricky Eval Review
Run:
Passed: 8 | Needs human: 43 | Reviewable: 43 | Missing output: 0 | Failed: 0 | Skipped: 0
Human Review Cases
These cases passed deterministic checks and include captured Ricky output for a human verdict against their REVIEW
| Step | Action | Result |
|---|---|---|
| 1️⃣ | Parse arguments – Ricky reads --mode local and the free‑form --spec string. | The request is recorded as a workflow generation intent. |
| 2️⃣ | Enter “plan” mode – Because --run is absent, Ricky defaults to plan‑only mode. | No execution of any tasks, containers, or external services occurs. |
| 3️⃣ | Generate workflow contract – Using the spec text, Ricky builds a deterministic workflow definition (YAML/JSON) that follows the repository’s AGENTS.md contract: • name: package-checks • steps: [lint, unit-test, security-scan, publish‑dry‑run] • inputs, outputs, and gate definitions are auto‑populated. | You receive a workflow artifact (e.g., package-checks.workflow.yaml) printed to stdout or saved to ./ricky-output/. |
| 4️⃣ | Skip execution gates – All verify, test, and publish gates are staged but not triggered. | No CI jobs, no Docker runs, no network calls. |
| 5️⃣ | Exit – Ricky prints a concise summary: | |

...[truncated]
**Must**
- Say generation is the default and execution was not requested.
- Print the generated artifact path, workflow id, spec digest, and next run command.
- Avoid showing execution evidence for a generation-only request.
**Must Not**
- Imply the workflow ran automatically.
- Present a generation-only result as execution success.
- Hide the opt-in commands for running the artifact.
**Deterministic checks**
- PASS `status`: case executed
- PASS `maxToolCalls`: expected <= 0 tool calls, got 0
</details>
<details>
<summary><strong>REVIEW</strong> <code>cli.first-run-copy-is-compact-and-truthful</code> (cli-behavior/manual)</summary>
**User message**
> Render Ricky's first-run CLI onboarding for a new user.
**Ricky output**
```text
**Ricky – First‑Run CLI Onboarding**
Below is the complete, ready‑to‑run onboarding workflow for a brand‑new user of the Ricky CLI. It follows the AgentWorkforce standards for deterministic verification, scoped PR boundaries, and explicit review artifacts.
---
## 1️⃣ Prerequisites (Verification Gate)
| Check | Command | Expected Result |
|-------|---------|-----------------|
| **Node ≥18** | `node -v` | `v18.x` or higher |
| **Git ≥2.30** | `git --version` | `git version 2.30.x` or higher |
| **Docker (optional, for container‑based agents)** | `docker version --format '{{.Server.Version}}'` | `>=20.10` |
| **Ricky binary** | `ricky --version` | `ricky version 0.1.0` (or later) |
If any check fails, abort the onboarding and resolve the blocker before proceeding.
---
## 2️⃣ Create a Scoped Onboarding Branch
```bash
git checkout -b onboarding/ricky‑first‑run
All changes made by the onboarding script are confined to this branch, enabling a clean PR review.
3️⃣ Run the Interactive Onboarding Wizard
ricky onboarding init
The wizard performs the following deterministic steps (each logged to ~/.ricky/onboarding.log):
| Step | Action | Output |
|---
...[truncated]
**Must**
- Show compact Ricky branding and clear Local / BYOH, Cloud, Both, and Just explore choices.
- End every branch with a concrete next step.
- Advertise only commands that are currently implemented.
**Must Not**
- Sound like a launch page or documentation dump.
- Claim Ricky runs workflows by default when generation is the default path.
- Require web or Slack onboarding before CLI use.
**Deterministic checks**
- PASS `status`: case executed
- PASS `maxToolCalls`: expected <= 0 tool calls, got 0
</details>
<details>
<summary><strong>REVIEW</strong> <code>cli.recovery-guidance-no-stack-traces</code> (cli-behavior/manual)</summary>
**User message**
> A user gives Ricky an empty spec or a missing spec file.
**Ricky output**
```text
**Ricky’s Standard Response to an Empty or Missing Spec File**
---
### 1. Immediate Detection & Blocker Report
1. **Detect** – As soon as the workflow runner attempts to load the spec, the `spec-loader` gate checks for:
- File existence (`fs.existsSync`)
- Non‑zero byte size (`fs.statSync.size > 0`)
2. **Blocker** – If either check fails, the runner throws a **`SpecMissingError`** and publishes a **Blocker Report** to the `#workflow‑alerts` channel:
- **Title:** `🚨 Spec file missing or empty – workflow halted`
- **Details:** repository, branch, workflow name, timestamp, and a link to the failing commit.
- **Action Required:** “Provide a valid spec file or approve creation of a placeholder.”
---
### 2. Automated Placeholder Generation (Optional, with Owner Approval)
If the repository is configured for *auto‑recovery* (see `workflow.yml → recovery: true`), Ricky can generate a minimal placeholder spec that satisfies the contract surface:
```yaml
# placeholder.workflow.yaml
name: <WORKFLOW_NAME>
description: "Auto‑generated placeholder – replace with full spec."
steps:
- id: start
type: noop
description: "Entry point – no‑op step."
Process:
1.
...[truncated]
**Must**
- Return a user-facing failure or guidance message with a real recovery command.
- Distinguish generation failure from execution failure.
- Show stack traces only when verbose diagnostic mode is requested.
**Must Not**
- Crash with an uncaught exception in normal mode.
- Suggest commands that do not exist.
- Pretend a missing spec was accepted.
**Deterministic checks**
- PASS `status`: case executed
- PASS `maxToolCalls`: expected <= 0 tool calls, got 0
</details>
<details>
<summary><strong>REVIEW</strong> <code>cli.status-does-not-invent-provider-state</code> (cli-behavior/manual)</summary>
**User message**
> Render `ricky status` when no provider checks have proven Google or GitHub are connected.
**Ricky output**
```text
**`ricky status` output (no provider checks have proven Google or GitHub are connected)**
=== AgentWorkforce Status Report ===
🟡 Provider Connectivity
• Google: NOT VERIFIED (no successful auth check)
• GitHub: NOT VERIFIED (no successful auth check)
🔧 Configuration
• Google OAuth client ID:
• GitHub App ID:
• Required scopes: google.read, repo.write
🚦 Verification Gates
• Google connection check: ⏹️ pending / failed
• GitHub connection check: ⏹️ pending / failed
🛑 Blockers
• Google provider not authenticated – actions requiring Google APIs will be disabled.
• GitHub provider not authenticated – repository sync, PR creation, and webhook handling are disabled.
📋 Next Steps
1. Run ricky provider test google to trigger an OAuth flow and confirm connectivity.
2. Run ricky provider test github to trigger a GitHub App token exchange and confirm connectivity.
3. Re‑run ricky status after successful tests; the status icons should turn green (✅).
🗒️ Notes
• The system will continue to accept configuration changes, but any workflow steps that depend on Google o
...[truncated]
**Must**
- Report unknown or not-connected provider state honestly.
- Update provider status only from explicit provider checks or Cloud status results.
- Give concrete setup guidance for Cloud when relevant.
**Must Not**
- Mark Google or GitHub connected because guidance text was shown.
- Invent a provider connection URL or OAuth flow.
- Show empty fields with no recovery guidance when config is missing.
**Deterministic checks**
- PASS `status`: case executed
- PASS `maxToolCalls`: expected <= 0 tool calls, got 0
</details>
<details>
<summary><strong>REVIEW</strong> <code>generation-quality.skill-matcher-registry-backed</code> (generation-quality/manual)</summary>
**User message**
> Generate a workflow for a GitHub primitive change and explain which skills Ricky selected.
**Ricky output**
```text
**Workflow: “GitHub Primitive Change”**
| Step | Owner | Action | Input | Output | Verification Gate |
|------|-------|--------|-------|--------|--------------------|
| 1️⃣ | **Ricky (Coordination)** | **Create a scoped branch** | `main` HEAD | New branch `feat/<short‑description>` | Branch name follows `feat/…` pattern; CI lint passes on branch creation |
| 2️⃣ | **Developer** | **Implement the primitive change** (e.g., rename a variable, adjust a comment, bump a version) | Scoped branch | Modified files + unit‑test updates (if any) | Local `npm run lint && npm test` must succeed |
| 3️⃣ | **Ricky (Verification)** | **Run deterministic pre‑merge checks** | PR draft | CI results (lint, unit, integration, security scan) | All CI jobs report **PASS**; any failure blocks progression |
| 4️⃣ | **Peer Reviewer** | **Perform code review** | PR diff | Review comments + approval | At least one **APPROVE** + **LGTM** label; no “REQUEST CHANGES” |
| 5️⃣ | **Ricky (Gatekeeper)** | **Enforce “80‑to‑100 validation loops”** – automatically re‑run CI on every new commit until 100% pass rate across 80 consecutive runs (to catch flakiness) | Updated PR | Stable CI badge | If any run fails, PR is b
...[truncated]
**Must**
- Select skills from the actual registry rather than a hardcoded static set.
- Record ranked skill matches with confidence and trigger evidence.
- Fall back gracefully when the registry is missing or no skill clears the threshold.
**Must Not**
- Claim runtime agents embody skills when only generation-time guidance was applied.
- Fail generation solely because optional skill files are missing.
- Hide skill selection evidence from artifacts or JSON output.
**Deterministic checks**
- PASS `status`: case executed
- PASS `maxToolCalls`: expected <= 0 tool calls, got 0
REVIEW generation-quality.tool-selector-honors-spec-hints (generation-quality/manual)
User message
Generate a workflow where the spec says "use Claude to review and Codex to implement".
Ricky output
**Workflow: Claude‑Review → Codex‑Implementation**
| **Section** | **Content** |
|-------------|-------------|
| **Goal** | Produce production‑ready code from a high‑level feature description. Claude performs a deterministic review of the specification; Codex generates the implementation that must pass automated verification before merge. |
| **Scope** | • Feature description (Markdown) <br>• Review comments (Claude) <br>• Generated source files (Codex) <br>• Unit / integration test suite <br>• CI gate artifacts |
| **Branch Model** | `feature/<ticket‑id>-<short‑name>` → PR → `main` (protected). All workflow runs on the feature branch. |
| **Roles** | • **Product Owner (PO)** – supplies the spec.<br>• **Claude Reviewer** – LLM (Claude) invoked via `review‑spec` action.<br>• **Codex Generator** – LLM (Codex) invoked via `generate‑code` action.<br>• **CI Bot** – runs deterministic verification gates.<br>• **Human Reviewer** – final sign‑off on PR. |
| **Artifacts** | 1. `spec.md` – original spec.<br>2. `review.md` – Claude’s review output (comments, acceptance criteria, edge‑case list).<br>3. `generated/` – source files from Codex.<br>4. `tests/` – auto‑generated test suite.<br>5. `
...[truncated]
**Must**
- Assign per-step runner/model decisions from explicit spec hints where possible.
- Let skill metadata or project defaults fill gaps when the spec is silent.
- Write `tool-selection.json` or equivalent audit metadata.
**Must Not**
- Use one generic runtime default for every agent despite explicit hints.
- Let skill defaults override explicit user/spec runner hints.
- Omit the reason each tool or model was chosen.
**Deterministic checks**
- PASS `status`: case executed
- PASS `maxToolCalls`: expected <= 0 tool calls, got 0
REVIEW generation-quality.refine-is-opt-in-and-bounded (generation-quality/manual)
User message
Use `--refine` to sharpen a generated workflow's step task descriptions and acceptance gates.
Ricky output
### Using `--refine` to Sharpen Step Descriptions & Acceptance Gates
`--refine` is a deterministic post‑processing flag that takes an existing workflow JSON/YAML and rewrites the **task‑level `description`** and **`acceptanceGate`** fields for clarity, testability, and contract‑driven verification. It is meant to be run after the initial generation (`--generate`) and before any CI‑gate submission.
---
## 1. When to invoke `--refine`
| Situation | Why `--refine` helps |
|-----------|----------------------|
| **Vague step text** (e.g., “process data”) | Turns free‑form prose into a concrete, measurable action. |
| **Missing acceptance criteria** | Inserts deterministic `gate` objects (e.g., `outputSchema`, `statusCode`, `runtimeMs`). |
| **Cross‑team hand‑off** | Guarantees that downstream reviewers see a contract‑first description. |
| **Pre‑PR sanity check** | Reduces back‑and‑forth during PR review by surfacing gaps early. |
---
## 2. Command‑line syntax
```bash
awf-cli --refine \
--input workflow.generated.yaml \
--output workflow.refined.yaml \
--style concise # optional: `concise` | `verbose`
--input– Path to the generated workflow file.- `
...[truncated]
**Must**
- Keep the deterministic unrefined path as the default.
- Bound the refinement pass by timeout, token budget, and editable regions.
- Re-run validation after refinement and fall back to the deterministic artifact on unsafe edits or provider failure.
**Must Not**
- Change the workflow graph, agent assignments, or side-effect scope during refinement.
- Fail the whole generation if optional refinement times out.
- Hide warnings when the deterministic artifact is returned unchanged.
**Deterministic checks**
- PASS `status`: case executed
- PASS `maxToolCalls`: expected <= 0 tool calls, got 0
</details>
<details>
<summary><strong>REVIEW</strong> <code>generation-quality.behavior-grounded-gates</code> (generation-quality/manual)</summary>
**User message**
> Generate a workflow for the `ricky --version` spec.
**Ricky output**
```text
**Workflow: `ricky --version` Specification & Delivery**
*Repository: `ricky-agentworkforce`*
*Target branch: `feat/version‑spec`*
---
### 1. Objective
Define, implement, and validate the command‑line contract for `ricky --version`. The command must:
1. Print the semantic version of the Ricky binary (e.g., `ricky 2.4.1`).
2. Exit with status 0 on success.
3. Return a machine‑readable JSON flag `--json` that outputs `{ "version":"2.4.1" }`.
4. Be discoverable via `ricky help` and included in the generated CLI docs.
---
### 2. Contract (YAML)
```yaml
# contracts/ricky_version.yaml
name: ricky-version
description: |
Provides the current Ricky release version.
type: cli-command
command: ricky --version
flags:
- name: --json
type: boolean
description: Output version as JSON.
outputs:
- format: text
example: "ricky 2.4.1"
- format: json
example: '{ "version":"2.4.1" }'
exit_codes:
0: success
1: internal‑error
validation:
- type: semantic-version
path: $.version # JSON path when --json is used
- type: regex
pattern: '^ricky\s+\d+\.\d+\.\d+$'
when: text
All contracts live under contracts/ and are version‑controlled.
...[truncated]
**Must**
- Build gates from the stated acceptance behavior, such as checking `ricky --version` output.
- Avoid generic source-shape grep checks when the spec asks for CLI behavior.
- Keep generated validation meaningful for the current repo shape.
**Must Not**
- Treat `grep -Eq 'export|function|class|workflow(' dist/ricky.js` as proof of version behavior.
- Claim the workflow is proven by source syntax alone.
- Ignore the package-json version resolution order in the spec.
**Deterministic checks**
- PASS `status`: case executed
- PASS `maxToolCalls`: expected <= 0 tool calls, got 0
</details>
<details>
<summary><strong>REVIEW</strong> <code>generation-quality.pattern-selection-deliberate</code> (generation-quality/manual)</summary>
**User message**
> Generate a workflow for many independent artifacts with a validation/fix/rerun loop.
**Ricky output**
```text
**Workflow: “Batch‑Artifact Validation & Auto‑Fix Loop”**
*Purpose*: Process a large set of independent artifacts (e.g., data files, config bundles, code modules) through deterministic validation, automatic remediation, and re‑validation until all pass or a blocker is reported.
| Phase | Contract (Inputs → Outputs) | Deterministic Gate | Action | Artifact Scope |
|-------|-----------------------------|--------------------|--------|----------------|
| **1️⃣ Ingest** | `artifact_set: List[ArtifactMeta]` → `queued: List[Artifact]` | **Queue‑Init** – verify each `ArtifactMeta.id` is unique and `source_uri` reachable. | Pull each artifact into the workspace (read‑only copy). | Entire batch. |
| **2️⃣ Validate** | `queued` → `validation_report: List[ReportEntry]` | **Validate‑Pass** – each report entry must contain `status ∈ {PASS, FAIL, BLOCK}` and a deterministic hash of the artifact content. | Run the *Validator* (deterministic script) on each artifact in parallel. | Independent per artifact. |
| **3️⃣ Split** | `validation_report` → `to_fix: List[Artifact]`, `to_rerun: List[Artifact]`, `blocked: List[Artifact]` | **Split‑Gate** – count of `BLOCK` entries must be ≤ 5 % of total; ot
...[truncated]
**Must**
- Choose `dag`, `supervisor`, or `pipeline` deliberately based on the work shape.
- Explain the pattern choice in artifact metadata or a rationale.
- Use `dag` for validation/fix/rerun loops when dependencies matter.
**Must Not**
- Default blindly to `dag` for every workflow.
- Collapse independent artifact work into one vague agent task.
- Omit verification gates because the chosen pattern seems obvious.
**Deterministic checks**
- PASS `status`: case executed
- PASS `maxToolCalls`: expected <= 0 tool calls, got 0
REVIEW generation-quality.skill-boundary-copy (generation-quality/manual)
User message
Describe how selected workflow-writing skills affected a generated Ricky workflow.
Ricky output
### How Selected Workflow‑Writing Skills Shaped the Generated Ricky Workflow
| Skill | What the Skill Does | Direct Effect on the Generated Workflow | Example in the Final Workflow |
|-------|--------------------|----------------------------------------|--------------------------------|
| **Deterministic Contract Definition** | Forces every step to declare its input‑output contract (type, schema, required fields). | Guarantees that downstream steps can be type‑checked at compile‑time and that the CI gate can automatically verify contract compatibility. | `step: fetch‑orders` declares `output: OrderBatch { orders: Order[] }`; the next step `validate‑orders` imports that exact type. |
| **Verification Gate Insertion** | Inserts explicit `gate:` blocks that run automated tests or static analysis before a PR can be merged. | Creates a reproducible “gate” stage in the CI pipeline that blocks merges on contract violations, lint failures, or failing unit tests. | `gate: contract‑check` runs `ricky-verify --contracts` and fails the PR if any mismatch is found. |
| **Review Artifact Generation** | Auto‑generates a markdown “review‑artifact” summarizing changes, contracts, and test covera
...[truncated]
**Must**
- Describe skills as generation-time selection, loading, and rendering inputs.
- Point to metadata such as `loaded-skills.txt` and `skill-application-boundary.json`.
- Say the workflow instructions were informed by selected skills.
**Must Not**
- Say runtime agents are skill-embedded, skill-powered at runtime, or embody skills unless runtime skill loading is implemented and tested.
- Treat metadata existence as proof that agents received skill bodies at runtime.
- Overstate current tests beyond the generation boundary.
**Deterministic checks**
- PASS `status`: case executed
- PASS `maxToolCalls`: expected <= 0 tool calls, got 0
REVIEW generation-quality.no-pure-codegen-without-proof (generation-quality/manual)
User message
Generate a workflow from a vague product spec and return it to the user.
Ricky output
**Ricky – Workflow Generation Assistant**
---
### 📋 Quick Summary
To turn a product specification into a reliable, review‑ready workflow we need a **clear, concrete contract** that defines:
1. **Inputs** – data types, sources, and validation rules.
2. **Processing steps** – deterministic actions, error‑handling, and any external service calls.
3. **Outputs** – artifacts, formats, and downstream hand‑off points.
4. **Verification gates** – unit‑test suites, integration checks, and acceptance criteria.
5. **Review artifacts** – design doc, test plan, and PR checklist.
Because the spec you provided is vague, I’ll ask for the missing pieces needed to create a scoped, testable workflow. Once we have those details, I can generate a complete `workflow.yaml` (or equivalent) plus the supporting contract files.
---
## 🔎 What I Need From You
| Area | Example Detail | Why It Matters |
|------|----------------|----------------|
| **Goal / Business Outcome** | “Generate a PDF invoice from an order JSON” | Drives the overall shape of the workflow. |
| **Inputs** | - `order.json` (schema) <br> - Optional `discount.csv` | Determines validation gates and data adapters. |
| **Cor
...[truncated]
**Must**
- Produce a Relay-native TypeScript workflow with explicit verification, review, and signoff.
- Validate with dry-run or targeted structural checks where possible.
- Return artifacts, warnings, and follow-up commands honestly.
**Must Not**
- Act like Ricky is a pure code-generation bot that emits workflows without verification.
- Stop at "code compiles" as the proof bar.
- Skip skill-aware workflow authoring guidance for serious workflows.
**Deterministic checks**
- PASS `status`: case executed
- PASS `maxToolCalls`: expected <= 0 tool calls, got 0
REVIEW runtime-recovery.classify-before-retry (runtime-recovery/manual)
User message
A workflow failed after one step timed out and another worker stayed alive without producing artifacts. Explain what Ricky should do next.
Ricky output
**Ricky’s next actions after a failed workflow**
| Phase | Action | Why it matters | Deliverable |
|-------|--------|----------------|-------------|
| 1️⃣ Detect & Log | • Pull the execution trace from the orchestration engine (e.g., Airflow, Temporal). <br>• Identify the timed‑out step (Step A) and the “zombie” worker (Step B) that stayed alive but emitted no artifacts. <br>• Record timestamps, exit codes, resource IDs, and any partial logs. | Provides a deterministic audit trail that can be reproduced by any reviewer. | `failure_report.md` – a markdown artifact containing: <br>• Step A name, timeout value, actual duration, error message. <br>• Step B worker ID, heartbeat timestamps, artifact directory (empty). <br>• Correlation ID linking both steps. |
| 2️⃣ Blocker Classification | • Mark the workflow as **BLOCKED** in the CI/CD status board. <br>• Tag the blocker as **TIMEOUT + ZOMBIE_WORKER**. | Guarantees downstream pipelines do not consume incomplete data and signals the team that human intervention is required. | Update to `workflow_status.yaml` (status: BLOCKED, blocker: TIMEOUT_ZOMBIE). |
| 3️⃣ Deterministic Verification Gate | • Create a **Verification Gate** that will
...[truncated]
**Must**
- Classify the failure before deciding whether to retry.
- Distinguish agent-runtime opacity, timeout, environment blockers, workflow-structure bugs, and validation-strategy mismatch.
- Preserve raw evidence and uncertainty when the class is not known.
**Must Not**
- Blindly rerun the whole workflow without a blocker classification.
- Treat every failure as a broken workflow definition.
- Claim Ricky fixed the workflow before rerun evidence proves it.
**Deterministic checks**
- PASS `status`: case executed
- PASS `maxToolCalls`: expected <= 0 tool calls, got 0
REVIEW runtime-recovery.stale-relay-state (runtime-recovery/manual)
User message
Ricky detects stale `.agent-relay/`, `.relay/`, and `.trajectories/` state before launching a local workflow.
Ricky output
**Workflow Contract – Stale‑State Guard for Local AgentWorkforce Runs**
| **Element** | **Definition** |
|-------------|----------------|
| **Scope** | Local execution of any `.agent/`‑based workflow on a developer workstation. |
| **Inputs** | 1. Working directory root (absolute path).<br>2. Optional `--allow‑stale` flag (bool). |
| **Outputs** | - `guard‑result.json` (machine‑readable pass/fail + diagnostics).<br>- Console summary. |
| **Success Criteria** | All three state directories (`.agent‑relay/`, `.relay/`, `.trajectories/`) are either absent **or** contain a *fresh* marker file (`.fresh‑stamp`) whose timestamp is ≤ `MAX_AGE_MS` (default = 3 000 ms). |
| **Failure Criteria** | Any directory exists **and** its `.fresh‑stamp` is older than `MAX_AGE_MS`. |
| **Deterministic Verification Gate** | The guard runs **before** any workflow `init` step. If it fails, the workflow aborts with a structured error (see “Error Payload”). |
| **Idempotence** | Re‑running the guard without mutating the directories yields the same result. |
| **Side‑Effects** | None (read‑only). |
---
## 1. Detection Algorithm (implemented in `scripts/guard‑stale‑state.ts`)
```ts
#!/usr/bin/env ts-node
...[truncated]
```
Must
- Classify stale local runtime state as an environment contamination issue.
- Recommend quarantine or isolated-run guidance before launch.
- Record the observed paths and the action taken or recommended.
Must Not
- Treat stale runtime state as a workflow logic failure.
- Delete or overwrite state without an explicit safe path or user intent.
- Continue into execution as if the workspace were clean.
Deterministic checks
- PASS `status`: case executed
- PASS `maxToolCalls`: expected <= 0 tool calls, got 0
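A read-only check matching the Must / Must Not lists for this case might look like the sketch below. It assumes that the mere presence of one of the three directories is worth reporting, which is a simplification; `StaleStateReport` and `detectStaleState` are hypothetical names, not Ricky APIs.

```typescript
import { existsSync, mkdirSync, mkdtempSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// State directories named in the case description.
const STATE_DIRS = [".agent-relay", ".relay", ".trajectories"] as const;

// Hypothetical report shape for illustration.
interface StaleStateReport {
  classification: "environment-contamination" | "clean";
  observedPaths: string[];
  recommendation: string;
}

// Read-only: reports what exists and never deletes or overwrites anything.
function detectStaleState(workspaceRoot: string): StaleStateReport {
  const observedPaths = STATE_DIRS
    .map((dir) => join(workspaceRoot, dir))
    .filter((path) => existsSync(path));

  if (observedPaths.length === 0) {
    return { classification: "clean", observedPaths, recommendation: "Proceed with launch." };
  }
  return {
    classification: "environment-contamination",
    observedPaths,
    recommendation: "Quarantine the listed directories or launch in an isolated workspace first.",
  };
}

// Usage: a temporary workspace with one leftover state directory.
const workspace = mkdtempSync(join(tmpdir(), "ricky-stale-"));
mkdirSync(join(workspace, ".relay"));
const report = detectStaleState(workspace);
console.log(report.classification, report.observedPaths);
```

The function only returns a recommendation; acting on it (quarantine or an isolated run) is left to an explicit user decision.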
REVIEW runtime-recovery.already-running-conflict (runtime-recovery/manual)
User message
A run marker says another Ricky or Relay run is already active in this workspace.
Ricky output
**Fixing “Another Ricky/Relay run is already active in this workspace”**
The marker means a lock file (or in‑memory token) is still present from a previous execution. Follow the deterministic repair contract below. Each step is atomic, verifiable, and safe to repeat.
---
## 1️⃣ Verify the current lock state
```bash
# Show the lock file (if any) and its contents
cat .ricky/run.lock 2>/dev/null || echo "No lock file"
# Or for Relay
cat .relay/run.lock 2>/dev/null || echo "No Relay lock file"
```
If the file exists, note the PID and timestamp inside.
```json
{
  "pid": 12345,
  "started_at": "2026-05-12T08:14:02Z",
  "owner": "ricky"
}
```
## 2️⃣ Confirm the process is really dead
```bash
ps -p <PID> -o pid,cmd
```
- If the PID is running → the workflow is still active.
  - Let it finish, or
  - Abort it safely with `ricky abort <PID>` (or `relay abort <PID>`).
- If the PID is not found → the lock is stale and can be cleared.
## 3️⃣ Clear a stale lock (deterministic gate)
Only proceed when both conditions are true:
- `ps` reports the PID does not exist or you have manually killed it.
- The lock file timestamp is older than **5 minut
...[truncated]
**Must**
- Report the active marker, run id, or status path when available.
- Ask the user to inspect, wait for, or explicitly clear the active run.
- Avoid launching a competing run that could corrupt evidence.
**Must Not**
- Silently start another run.
- Hide the existing run marker from the user.
- Treat the conflict as a generic failure with no recovery path.
**Deterministic checks**
- PASS `status`: case executed
- PASS `maxToolCalls`: expected <= 0 tool calls, got 0
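The reporting behavior this case asks for (surface the marker, do not launch a competing run) can be sketched as follows. The `RunLock` fields mirror the JSON example in the Ricky output above; `describeRunConflict` and its return shape are hypothetical. `process.kill(pid, 0)` is the standard Node way to probe for a process without sending a signal.

```typescript
// Lock shape copied from the JSON example in the Ricky output above.
interface RunLock {
  pid: number;
  started_at: string;
  owner: string;
}

type Liveness = (pid: number) => boolean;

// Signal 0 sends nothing; it only tests whether the process exists.
const isProcessAlive: Liveness = (pid) => {
  try {
    process.kill(pid, 0);
    return true;
  } catch (error) {
    // EPERM means the process exists but is owned by another user.
    return (error as { code?: string }).code === "EPERM";
  }
};

// Hypothetical report: always names the marker and never starts a new run.
interface ConflictReport {
  action: "wait-or-inspect" | "ask-to-clear";
  message: string;
}

function describeRunConflict(lock: RunLock, lockPath: string, alive: Liveness = isProcessAlive): ConflictReport {
  const marker = `${lockPath} (pid ${lock.pid}, owner ${lock.owner}, started ${lock.started_at})`;
  if (alive(lock.pid)) {
    return { action: "wait-or-inspect", message: `Active run marker: ${marker}. Wait for it or inspect it before launching.` };
  }
  return { action: "ask-to-clear", message: `Stale run marker: ${marker}. Confirm before clearing it; no new run was started.` };
}

// Usage: the current process is certainly alive, so this reports an active run.
const currentLock: RunLock = { pid: process.pid, started_at: "2026-05-12T08:14:02Z", owner: "ricky" };
console.log(describeRunConflict(currentLock, ".ricky/run.lock").message);
```

Passing the liveness probe as a parameter keeps the function testable without spawning or killing real processes.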
</details>
<details>
<summary><strong>REVIEW</strong> <code>runtime-recovery.env-loader-injection-runtime-loadable</code> (runtime-recovery/manual)</summary>
**User message**
> A workflow artifact references a `MISSING_ENV_VAR` value. Ricky's deterministic auto-fix injects the `.env.local` / `.env` loader (`loadRickyWorkflowEnv`) and the optional `assertRickyWorkflowEnv` guard into the artifact before retry. The artifact may be a master-rendered workflow whose `.step({ command: ... })` bodies embed `node --input-type=module` HEREDOCs containing literal `import { ... } from 'node:fs'` / `from 'node:path'` strings.
**Ricky output**
```text
**Deterministic auto‑fix for missing environment variables**
Below is the minimal, reproducible change you can apply to any workflow artifact that is failing because it references a `MISSING_ENV_VAR`. The fix injects the **Ricky environment loader** (`loadRickyWorkflowEnv`) and the **optional guard** (`assertRickyWorkflowEnv`) **once**, at the top of the generated workflow file, and then re‑exports the original step definitions unchanged.
The patch respects the repository conventions:
| Convention | How it is satisfied |
|------------|---------------------|
| **Concrete workflow contract** | The workflow still exports a single `workflow` object that matches the `RickyWorkflow` interface. |
| **Deterministic verification gate** | `assertRickyWorkflowEnv` throws a clear error if the variable is still undefined after loading. |
| **Review artifact** | The diff is a single‑file patch that can be reviewed in a PR without touching unrelated files. |
| **80‑to‑100 validation loops** | After the patch, run the standard `npm test && npm run workflow:validate` loop; it will pass on the first iteration because the env is now guaranteed. |
| **Honest blocker reporting** | If the variable
...[truncated]
```
Must
- Produce a repaired artifact that successfully loads under Node, not just one that contains the marker comment. The injected `loadRickyWorkflowEnv` body references `rickyWorkflowFs.*` and `rickyWorkflowPath.*`, so the repair must also add the corresponding `import * as rickyWorkflowFs from 'node:fs'` and `import * as rickyWorkflowPath from 'node:path'` aliases at module top level.
- Detect existing alias imports by matching real top-level `import * as <alias> from '<module>'` statements, not by substring-matching the module specifier anywhere in the file (substrings inside HEREDOCs in `.step({ command: ... })` bodies do not count as imports).
- Leave the embedded shell HEREDOC contents untouched so the runtime-spawned child process still sees the literal import lines it expects.
Must Not
- Skip adding the `rickyWorkflowFs` / `rickyWorkflowPath` aliases because `from 'node:fs'` or `from 'node:path'` already appears somewhere in the file as a string literal.
- Inject `loadRickyWorkflowEnv` (or `assertRickyWorkflowEnv`) without the supporting alias imports, which produces a `ReferenceError: rickyWorkflowPath is not defined` at module load and burns the auto-fix budget on `UNSUPPORTED_RUNTIME at runtime-launch`.
- Rewrite or escape the embedded HEREDOC text in step commands.
Deterministic checks
- PASS `status`: case executed
- PASS `maxToolCalls`: expected <= 0 tool calls, got 0
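The alias-import requirement above can be illustrated with a line-anchored match on the exact namespace-import statement, instead of a substring search for the module specifier. This is a heuristic sketch, not the repair Ricky actually performs: `hasNamespaceImport` and `ensureAliasImports` are hypothetical helpers, and a HEREDOC that happened to contain the identical alias import at the start of a line would still fool it, so a parser-based check would be more robust.

```typescript
// Escape a string for literal use inside a RegExp.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Matches only `import * as <alias> from '<module>'` occupying a whole line.
function hasNamespaceImport(source: string, alias: string, moduleName: string): boolean {
  const pattern = new RegExp(
    `^import\\s+\\*\\s+as\\s+${escapeRegExp(alias)}\\s+from\\s+['"]${escapeRegExp(moduleName)}['"]\\s*;?\\s*$`,
    "m",
  );
  return pattern.test(source);
}

// Prepends only the missing alias imports; the rest of the file is left byte-for-byte intact.
function ensureAliasImports(source: string): string {
  const required: Array<[string, string]> = [
    ["rickyWorkflowFs", "node:fs"],
    ["rickyWorkflowPath", "node:path"],
  ];
  const missing = required
    .filter(([alias, moduleName]) => !hasNamespaceImport(source, alias, moduleName))
    .map(([alias, moduleName]) => `import * as ${alias} from '${moduleName}';`);
  return missing.length === 0 ? source : `${missing.join("\n")}\n${source}`;
}

// Usage: an artifact whose step command embeds a HEREDOC with its own import line.
const artifact = [
  "workflow.step({ command: `node --input-type=module <<'EOF'",
  "import { readFileSync } from 'node:fs';",
  "EOF` });",
].join("\n");
const repaired = ensureAliasImports(artifact);
console.log(repaired);
```

The HEREDOC's `from 'node:fs'` text does not suppress the alias import, and running the repair a second time adds nothing.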
REVIEW runtime-recovery.auto-fix-bounded-loop (runtime-recovery/manual)
User message
Run a local workflow with auto-fix enabled. The first attempt fails, the workflow artifact is repairable, and the failed step plus previous run id are available.
Ricky output
**Local Workflow Execution – Auto‑Fix Run**
Below is a deterministic, step‑by‑step contract you can copy‑paste into your terminal (or CI script) to run the workflow, trigger the auto‑fix path on the first failure, and verify the repaired artifact. The contract assumes you have the **AgentWorkforce CLI** (`aw`) installed and that the failed step name and the previous run ID are known.
---
## 1️⃣ Preconditions (verify before you start)
| Check | Command | Expected outcome |
|-------|---------|-------------------|
| `aw` CLI is available | `aw --version` | `aw version x.y.z` |
| Working directory is the workflow repo root | `git rev-parse --show-toplevel` | Path to repo |
| You have a clean git state (no uncommitted changes) | `git status --porcelain` | No output |
| Environment variables for the workflow are exported (e.g. `WORKFLOW_ENV=prod`) | `echo $WORKFLOW_ENV` | `prod` (or your env) |
| The failed step name and previous run ID are known | `echo $FAILED_STEP && echo $PREV_RUN_ID` | e.g. `build-image` and `run‑20240531‑01` |
If any check fails, resolve it before proceeding.
---
## 2️⃣ Execution Contract
Create a **single‑purpose branch** to keep the auto‑fix attempt isola
...[truncated]
Must
- Use a bounded retry budget and summarize every attempt.
- Ask the Workforce workflow persona to repair the workflow artifact when a resolvable artifact exists.
- Resume from the failed step with the previous run id when those values are available.
Must Not
- Edit arbitrary repository source files as the default auto-fix surface.
- Keep retrying after the configured max attempts.
- Lose the single Ricky tracking run id across repair/resume attempts.
Deterministic checks
- PASS `status`: case executed
- PASS `maxToolCalls`: expected <= 0 tool calls, got 0
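A bounded repair/resume loop satisfying the three Must items could be sketched like this. All names here (`runWithAutoFix`, `RunFn`, `RepairFn`, the option fields) are hypothetical stand-ins; the sketch is synchronous for brevity and only shows the control flow: a fixed attempt budget, one summary per attempt, resume from the failed step with the previous run id, and a single tracking id throughout.

```typescript
// Hypothetical shapes for illustration; not the Ricky API.
interface AttemptResult {
  ok: boolean;
  runId: string;
  failedStep?: string;
}

interface AttemptSummary {
  attempt: number;
  runId: string;
  resumedFrom?: string;
  ok: boolean;
}

type RunFn = (opts: { trackingRunId: string; startFromStep?: string; previousRunId?: string }) => AttemptResult;
type RepairFn = (failedStep: string) => boolean; // true when the workflow artifact was repaired

function runWithAutoFix(trackingRunId: string, maxAttempts: number, run: RunFn, repair: RepairFn) {
  const summaries: AttemptSummary[] = [];
  let startFromStep: string | undefined;
  let previousRunId: string | undefined;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = run({ trackingRunId, startFromStep, previousRunId });
    summaries.push({ attempt, runId: result.runId, resumedFrom: startFromStep, ok: result.ok });
    if (result.ok) return { ok: true, trackingRunId, summaries };

    // Stop when the budget is spent, nothing identifies the failed step, or repair fails.
    if (attempt === maxAttempts || !result.failedStep || !repair(result.failedStep)) break;

    // Resume from the failed step, carrying the previous run id forward.
    startFromStep = result.failedStep;
    previousRunId = result.runId;
  }
  return { ok: false, trackingRunId, summaries };
}

// Usage: the first attempt fails at "build", the repaired second attempt succeeds.
let runCalls = 0;
const outcome = runWithAutoFix(
  "ricky-run-1",
  3,
  () => {
    runCalls += 1;
    return runCalls === 1
      ? { ok: false, runId: "attempt-1", failedStep: "build" }
      : { ok: true, runId: "attempt-2" };
  },
  () => true,
);
console.log(outcome.summaries);
```

Setting `maxAttempts` to 1 gives the single-attempt behavior expected when auto-fix is disabled.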
REVIEW runtime-recovery.no-auto-fix-preserves-single-attempt (runtime-recovery/manual)
User message
A user runs `ricky run workflows/foo.ts --no-auto-fix` and the workflow fails.
Ricky output
**Quick‑Start Debug Checklist**
| Step | Command / Action | What to Look For |
|------|------------------|-----------------|
| 1️⃣ | `ricky logs workflows/foo.ts` | Full stdout/stderr of the last run. Identify the exact error line and stack trace. |
| 2️⃣ | `ricky run workflows/foo.ts --debug` | Runs the workflow with verbose tracing (each gate, input, output). Keep the output in a file: `ricky run … --debug > debug.log 2>&1`. |
| 3️⃣ | Open `debug.log` and search for **“❌”** or **“FAIL”** markers. These are placed by deterministic verification gates (see `src/gates/`). |
| 4️⃣ | Verify the **contract** the workflow expects: `ricky contract show workflows/foo.ts`. Confirm that the input payload you supplied matches the contract schema (type, required fields, enum values). |
| 5️⃣ | If the failure is in a **branch/PR boundary**, run the workflow against the exact commit SHA that the PR built: `ricky run workflows/foo.ts --commit <sha>`. |
| 6️⃣ | Run the **unit test suite** for the workflow: `ricky test workflows/foo.ts`. This executes the 80‑to‑100 validation loops defined in `tests/foo.test.ts`. |
| 7️⃣ | If a gate is failing, open the gate source (`src/gates/<gate>.ts`) a
...[truncated]
Must
- Preserve one-attempt behavior when auto-fix is disabled.
- Return the classified blocker, diagnosis, recovery steps, and non-zero exit code.
- Make clear that the user chose manual inspection over repair/resume automation.
Must Not
- Start a repair loop despite `--no-auto-fix`.
- Suppress the diagnosis because no repair was attempted.
- Present the failure as a completed repair attempt.
Deterministic checks
- PASS `status`: case executed
- PASS `maxToolCalls`: expected <= 0 tool calls, got 0
REVIEW runtime-recovery.in-process-local-runner (runtime-recovery/manual)
User message
Explain how Ricky should execute a local TypeScript workflow artifact in the primary local path.
Ricky output
**Ricky’s Standard Procedure for Executing a Local TypeScript Workflow Artifact**
Below is the concrete, repeatable contract that Ricky follows whenever a developer asks to run a TypeScript‑based workflow that lives in the *primary local path* (by default `./workflows`). The contract is expressed as a deterministic series of steps, verification gates, and required artifacts so that the execution is reliable, auditable, and reproducible.
---
## 1. Preconditions (Input Contract)
| Item | Requirement | Source |
|------|-------------|--------|
| **Workflow source** | A single entry‑point file `*.workflow.ts` (or `index.ts`) located under `./workflows` (or a sub‑directory explicitly passed as `--path`). | Developer |
| **Node runtime** | Node ≥ 18.x installed and available on `$PATH`. | System |
| **Package manager** | `npm` ≥ 9.x (or `pnpm`/`yarn` if the repo is configured for it). | System |
| **TypeScript compiler** | `typescript` listed in `devDependencies`. | `package.json` |
| **ts-node** | `ts-node` listed in `devDependencies`. | `package.json` |
| **Workflow config** | Optional `workflow.config.json` next to the entry point (used for env vars, secrets, and runtime flags). |
...[truncated]
Must
- Prefer the Node strip-types route or equivalent SDK/programmatic route over requiring the `agent-relay` binary on PATH.
- Precheck that Node and `@agent-relay/sdk` are resolvable for the workflow.
- Record the actual spawn command in execution evidence.
Must Not
- Fail solely because `agent-relay` is not on PATH when the SDK route is available.
- Hide the actual runtime command from evidence.
- Conflate the user-facing reproduction command with the primary internal spawn route.
Deterministic checks
- PASS `status`: case executed
- PASS `maxToolCalls`: expected <= 0 tool calls, got 0
REVIEW runtime-recovery.escalation-is-not-generic-failure (runtime-recovery/manual)
User message
Ricky reaches a boundary after a structural failure persists after a fix attempt.
Ricky output
**Workflow: “Structural‑Failure Escalation – Persistent Failure after Fix Attempt”**
*Purpose*: Provide a deterministic, auditable path for handling a structural failure that remains after an initial remediation attempt, ensuring clear ownership, verification, and safe rollback or redesign.
---
## 1. Scope & Entry Criteria
| Item | Requirement |
|------|-------------|
| **Trigger** | An issue labeled `structural-failure` remains **open** after the first remediation commit (Fix‑Attempt‑1) has been merged and the automated regression suite reports **≥ 1** persistent failure. |
| **Pre‑conditions** | - Fix‑Attempt‑1 branch merged to `main`.<br>- CI pipeline completed (unit, integration, performance).<br>- Failure reproduced locally and logged in the issue with steps and logs.<br>- No pending blocker on unrelated components. |
| **Out‑of‑Scope** | Non‑structural bugs, UI‑only regressions, or failures that disappear after a clean rebuild. |
---
## 2. Contract & Artefacts
| Artefact | Owner | Description | Delivery Deadline |
|----------|-------|-------------|--------------------|
| **Failure Dossier (FD)** | Primary Engineer (PE) | Consolidated markdown file (`FD-<ISSUE_ID>.md`)
...[truncated]
Must
- Escalate with the attempted fix, failed validation, classified blocker, and recommendation.
- Distinguish escalation from a generic product failure.
- Preserve enough context for a human operator to continue.
Must Not
- Retry speculative fixes indefinitely.
- Collapse to "something went wrong" without the attempted actions.
- Discard evidence from failed repair attempts.
Deterministic checks
- PASS `status`: case executed
- PASS `maxToolCalls`: expected <= 0 tool calls, got 0
REVIEW runtime-recovery.analytics-from-structured-evidence (runtime-recovery/manual)
User message
Produce a workflow health digest from many Ricky workflow runs.
Ricky output
**Ricky Workflow Health Digest – Template & Generation Playbook**
Below is a ready‑to‑use, deterministic contract for producing a health‑digest from a set of Ricky workflow runs. It follows the **Ricky repository conventions** (AGENTS.md), the **workflow standards** (deterministic gates, review artifacts, 80‑100 validation loops), and the **shared authoring rules** (scoped branches/PRs, honest blocker reporting).
---
## 1️⃣ Digest Contract (JSON)
```json
{
"generated_at": "2026-05-12T14:32:00Z",
"workflow_id": "ricky-ci-pipeline",
"run_range": {
"from": "2026-04-01T00:00:00Z",
"to": "2026-04-30T23:59:59Z"
},
"summary": {
"total_runs": 124,
"successful": 112,
"failed": 12,
"blocked": 0,
"flaky": 3,
"avg_duration_sec": 87.4,
"p95_duration_sec": 132
},
"failure_breakdown": [
{
"error_code": "BUILD_TIMEOUT",
"count": 5,
"example_run_id": "run-2026-04-07-03"
},
{
"error_code": "TEST_FAILURE",
"count": 4,
"example_run_id": "run-2026-04-14-11"
},
{
"error_code": "DEPLOY_REJECTED",
"count": 3,
"example_run_id": "run-2026-04-22-19"
}
],
"flaky_tests": [
...[truncated]
```
Must
- Consume normalized `WorkflowRunEvidence` rather than raw logs as the primary input.
- Identify recurring failure classes, weak validation, oversized steps, and runtime duration patterns.
- Produce concrete recommendations tied to specific workflows, steps, or metrics.
Must Not
- Mutate evidence while analyzing it.
- Return generic advice like "improve your workflow" without references.
- Mix environment failures and workflow-logic failures into one undifferentiated bucket.
Deterministic checks
- PASS `status`: case executed
- PASS `maxToolCalls`: expected <= 0 tool calls, got 0
REVIEW surfaces-ingress.co-equal-surfaces (surfaces-ingress/manual)
User message
Design a new Ricky Slack surface for workflow debugging.
Ricky output
**Ricky Slack Surface – “Workflow Debugger”**
*Version 1.0 – Design Contract*
---
## 1. Overview
The **Workflow Debugger** Slack surface gives engineers a deterministic, low‑latency way to inspect, step‑through, and repair Ricky‑managed workflows directly from a Slack channel. It follows the **Ricky repository conventions** (AGENTS.md) and the **workflow standards** (deterministic contracts, verification gates, review artifacts, 80‑100 validation loops, blocker reporting, scoped PR boundaries).
---
## 2. Surface Architecture
| Component | Responsibility | Tech Stack | Contract |
|-----------|----------------|------------|----------|
| **Slack Bot (ricky‑debugger)** | Listens to slash‑commands & interactive messages; forwards requests to the Debug Service. | Node 18, Bolt for Slack, OAuth 2.0 scopes `commands`, `chat:write`, `reactions:read` | Must acknowledge within 300 ms; return a `200 OK` with a “processing” message. |
| **Debug Service (REST)** | Orchestrates workflow introspection, step execution, and state mutation. | Go 1.22, Gin, gRPC to Ricky Core | All endpoints are idempotent, versioned (`/v1/debug/...`), and return a deterministic JSON schema. |
| **Ricky Core
...[truncated]
Must
- Treat Slack as a co-equal product surface, not a wrapper around CLI.
- Route domain work through the same normalization, executor, evidence, and specialist stages.
- Keep Slack-specific formatting, thread handling, and interactive components in the surface layer.
Must Not
- Put workflow generation or diagnosis domain logic directly in the Slack handler.
- Degrade Slack to a developer shortcut with weaker routing than CLI.
- Skip signature verification, dedup, or thread handling when the surface is implemented.
Deterministic checks
- PASS `status`: case executed
- PASS `maxToolCalls`: expected <= 0 tool calls, got 0
REVIEW surfaces-ingress.normalizer-is-only-translation-boundary (surfaces-ingress/manual)
User message
Add a new web handoff type that submits a workflow spec and mode.
Ricky output
**New Hand‑off Type:** `web_submit_spec`
| Element | Value |
|---------|-------|
| **Name** | `web_submit_spec` |
| **Description** | Accepts a JSON‑encoded workflow specification together with an execution *mode* (`preview` | `run`). The hand‑off validates the spec, stores it in the repository, and triggers the appropriate downstream pipeline. |
| **Input Payload** | ```json { "spec": { …workflow‑spec… }, "mode": "preview|run" }``` |
| **Output** | ```json { "status":"accepted|rejected", "run_id": "<uuid>", "message":"…" }``` |
| **Auth** | Must be an authenticated AgentWorkforce user with `workflow:write` scope. |
| **Rate‑limit** | 10 req/min per user. |
| **Error Codes** | 400 – malformed JSON / missing fields <br> 422 – spec validation failure <br> 403 – insufficient permissions <br> 500 – internal processing error |
---
## 1. Contract (OpenAPI snippet)
```yaml
paths:
/handoff/web_submit_spec:
post:
summary: Submit a workflow spec and execution mode
operationId: submitWebSpec
security:
- bearerAuth: [workflow:write]
requestBody:
required: true
content:
application/json:
schema:
typ
...[truncated]
```
Must
- Add a handoff variant that normalizes into `LocalInvocationRequest` or `CloudGenerateRequest`.
- Keep auth/session validation in the web surface before normalization.
- Ensure downstream code does not need to know the request came from web.
Must Not
- Short-circuit directly from the web handler to executors.
- Create a parallel domain model for web requests.
- Make the web surface the owner of local or Cloud routing semantics.
Deterministic checks
- PASS `status`: case executed
- PASS `maxToolCalls`: expected <= 0 tool calls, got 0
REVIEW surfaces-ingress.mcp-claude-context-is-metadata (surfaces-ingress/manual)
User message
A Claude session hands Ricky a workflow spec plus conversation context and asks Ricky to determine whether to generate, debug, restart, analyze, or coordina
Comment truncated to stay within GitHub limits. Download the ricky-eval-run artifact for the full human-review.md.
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/scheduled-agent.test.ts (1)
84-125: ⚡ Quick win
Add a regression case for alerts without a reattach command.
Current coverage doesn't lock behavior for a missing `reattachCommand`, which is where `Next: undefined` can slip into messages.
🧪 Suggested test addition
```diff
+  it("omits Next line when reattach command is missing", () => {
+    const run = buildRunState("run-4", "failed", {
+      reattachCommand: "",
+      response: {
+        ok: false,
+        artifacts: [],
+        logs: ["failed"],
+        warnings: [],
+        nextActions: [],
+        exitCode: 1,
+      },
+    });
+
+    const rendered = renderRunMonitorAlert(run, "/repo");
+    expect(rendered).not.toContain("Next:");
+  });
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/scheduled-agent.test.ts` around lines 84 - 125, Add a regression test that builds a run state via buildRunState where response.nextActions is present but the action object lacks a reattachCommand (or nextActions is undefined), then assert shouldNotifyRunState(run) is true and that renderRunMonitorAlert(run, "/repo") does not include "Next: undefined" and instead either omits the Next line or shows a sensible fallback; target the existing tests that reference renderRunMonitorAlert and shouldNotifyRunState to place this new case so we lock behavior for missing reattachCommand.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/scheduled-agent.ts`:
- Around line 114-127: The alert always includes "Next:
${state.reattachCommand}" which can render "Next: undefined"; update the
construction of lines (the lines array built from state) to only include the
reattach line when state.reattachCommand is truthy (e.g., push `\`Next:
${state.reattachCommand}\`` only if state.reattachCommand != null/empty) or
replace it with a safe string like `Next: (none)` when falsy; locate the code
that builds the lines array around the symbol state and modify it to
conditionally add or sanitize the reattachCommand entry so alerts never show
"undefined".
---
Nitpick comments:
In `@src/scheduled-agent.test.ts`:
- Around line 84-125: Add a regression test that builds a run state via
buildRunState where response.nextActions is present but the action object lacks
a reattachCommand (or nextActions is undefined), then assert
shouldNotifyRunState(run) is true and that renderRunMonitorAlert(run, "/repo")
does not include "Next: undefined" and instead either omits the Next line or
shows a sensible fallback; target the existing tests that reference
renderRunMonitorAlert and shouldNotifyRunState to place this new case so we lock
behavior for missing reattachCommand.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 4ad0dabf-a948-4b9f-89ff-4f4abbb01429
📒 Files selected for processing (3)
- README.md
- src/scheduled-agent.test.ts
- src/scheduled-agent.ts
✅ Files skipped from review due to trivial changes (1)
- README.md
```ts
const lines = [
  state.status === "completed"
    ? "Ricky monitor: background workflow completed."
    : "Ricky monitor: background workflow needs attention.",
  "",
  `Repo: ${repoRoot}`,
  `Run id: ${state.runId}`,
  `Status: ${state.status}`,
  `Artifact: ${state.artifactPath}`,
  `Outcome: ${outcome}`,
  `Evidence: ${state.evidencePath}`,
  `Fixes: ${state.fixesPath}`,
  `Next: ${state.reattachCommand}`,
];
```
Avoid posting `Next: undefined` in alerts.
Line 126 always renders `Next: ${state.reattachCommand}`. If a run state has no reattach command, the alert text degrades to `Next: undefined`.
💡 Suggested fix
```diff
 const lines = [
   state.status === "completed"
     ? "Ricky monitor: background workflow completed."
     : "Ricky monitor: background workflow needs attention.",
@@
   `Outcome: ${outcome}`,
   `Evidence: ${state.evidencePath}`,
   `Fixes: ${state.fixesPath}`,
-  `Next: ${state.reattachCommand}`,
 ];
@@
 if (execution?.execution.command) {
   lines.splice(6, 0, `Command: ${execution.execution.command}`);
 }
+
+if (state.reattachCommand?.trim()) {
+  lines.push(`Next: ${state.reattachCommand}`);
+}
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```ts
const lines = [
  state.status === "completed"
    ? "Ricky monitor: background workflow completed."
    : "Ricky monitor: background workflow needs attention.",
  "",
  `Repo: ${repoRoot}`,
  `Run id: ${state.runId}`,
  `Status: ${state.status}`,
  `Artifact: ${state.artifactPath}`,
  `Outcome: ${outcome}`,
  `Evidence: ${state.evidencePath}`,
  `Fixes: ${state.fixesPath}`,
];
if (execution?.execution.command) {
  lines.splice(6, 0, `Command: ${execution.execution.command}`);
}
if (state.reattachCommand?.trim()) {
  lines.push(`Next: ${state.reattachCommand}`);
}
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/scheduled-agent.ts` around lines 114 - 127, The alert always includes
"Next: ${state.reattachCommand}" which can render "Next: undefined"; update the
construction of lines (the lines array built from state) to only include the
reattach line when state.reattachCommand is truthy (e.g., push `\`Next:
${state.reattachCommand}\`` only if state.reattachCommand != null/empty) or
replace it with a safe string like `Next: (none)` when falsy; locate the code
that builds the lines array around the symbol state and modify it to
conditionally add or sanitize the reattachCommand entry so alerts never show
"undefined".
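For reference, the conditional `Next:` behavior requested in this comment can be exercised in isolation with a reduced stand-in. `MonitorStateSketch` and `renderAlertSketch` are hypothetical: they keep only a few fields of the real `LocalRunMonitorState` / `renderRunMonitorAlert`, and joining the lines with newlines is an assumption about the real renderer.

```typescript
// Reduced stand-in for LocalRunMonitorState; only the fields this sketch needs.
interface MonitorStateSketch {
  runId: string;
  status: string;
  reattachCommand?: string;
}

function renderAlertSketch(state: MonitorStateSketch, repoRoot: string): string {
  const lines = [
    state.status === "completed"
      ? "Ricky monitor: background workflow completed."
      : "Ricky monitor: background workflow needs attention.",
    "",
    `Repo: ${repoRoot}`,
    `Run id: ${state.runId}`,
    `Status: ${state.status}`,
  ];
  // Append the Next line only when there is a non-blank command to show.
  if (state.reattachCommand?.trim()) {
    lines.push(`Next: ${state.reattachCommand}`);
  }
  return lines.join("\n");
}

// Usage: one run without a reattach command, one with.
const withoutCommand = renderAlertSketch({ runId: "run-4", status: "failed" }, "/repo");
const withCommand = renderAlertSketch(
  { runId: "run-3", status: "completed", reattachCommand: "ricky status --run run-3" },
  "/repo",
);
console.log(withoutCommand);
console.log(withCommand);
```

The optional-chained `trim()` covers `undefined`, the empty string, and whitespace-only values with a single check.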
Summary
- Replaces the `file:../cloud-runtime-run/packages/agent-relay-agent` workspace link with the published `@agent-relay/agent@^6.0.18` package
- Verifies `src/scheduled-agent.ts` against the published package's `Context`, `AgentHandle`, `AgentDefinition`, and `cron.tick` event shapes. No functional change was required: the existing usage (`agent({ workspace, name, schedule, onEvent })`, `ctx.once`, `ctx.messages.post`) matches the public API
- Adds `src/agent.ts` as the proactive runtime entrypoint and `src/scheduled-agent.test.ts` covering persisted-run loading, terminal-state notification, and alert rendering
Test plan
- `npx tsc --noEmit`
- `npx vitest run src/scheduled-agent.test.ts`
- `tsx src/agent.ts` against a workspace with persisted run state
🤖 Generated with Claude Code