feat: Add cron jobs, OpenClaw ingestor, and sandbox testing#16
feat: Add cron jobs, OpenClaw ingestor, and sandbox testing#16WellDunDun merged 5 commits intodevfrom
Conversation
This commit adds comprehensive support for recurring cron job management, OpenClaw API integration for consuming and persisting query data, and a complete sandbox testing infrastructure with Docker support for isolated environment testing. Includes detailed test fixtures, provisioning scripts, and strategy documentation for ICP/GTM and OpenClaw integration. Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
📝 WalkthroughSummary by CodeRabbit
WalkthroughAdds OpenClaw integration: new ingest and cron CLI commands, session parser/writers, cron management, sandbox/devcontainer test infra and fixtures, docs/Makefile updates, and unit/integration tests to ingest OpenClaw sessions into existing log formats. Changes
Sequence Diagram(s)sequenceDiagram
actor User
participant CLI as Selftune CLI
participant Ingestor as OpenClaw Ingestor
participant Sessions as Agents/Sessions FS
participant Skills as Skills FS
participant Logs as Shared Logs
User->>CLI: ingest-openclaw --agents-dir --since
CLI->>Ingestor: cliMain()
Ingestor->>Sessions: findOpenClawSessions(agentsDir, since)
Sessions-->>Ingestor: SessionFile[]
loop per session
Ingestor->>Sessions: parseOpenClawSession(filePath)
Sessions-->>Ingestor: ParsedSession
Ingestor->>Skills: findOpenClawSkillNames(agentsDir)
Skills-->>Ingestor: Set<string>
Ingestor->>Logs: writeSession(parsedSession, dryRun?, logPaths)
Logs-->>Ingestor: ack
end
Ingestor->>CLI: update marker
CLI-->>User: ingestion complete
sequenceDiagram
actor User
participant CLI as Selftune CLI
participant CronMgr as Cron Manager
participant OpenClaw as openclaw binary
User->>CLI: cron setup --tz <tz> [--dry-run]
CLI->>CronMgr: setupCronJobs(tz, dryRun)
CronMgr->>CronMgr: load DEFAULT_CRON_JOBS
loop each job
CronMgr->>CronMgr: buildCronAddArgs(job, tz)
CronMgr->>OpenClaw: spawn openclaw cron add ...
OpenClaw-->>CronMgr: exitCode / output
end
CronMgr-->>CLI: setup complete
CLI-->>User: jobs registered / dry-run output
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Comment |
Merges dev commits (sandbox harness PR #17, PAI Infrastructure PR #15) into feature branch. Resolved conflicts by combining both branches' unique additions: dev's structural improvements (stream deadlock fix, nullglob guards, bash history, iptables corrections) with HEAD's OpenClaw integration (ingestor, cron, platform references). Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 24
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
cli/selftune/init.ts (1)
323-325:⚠️ Potential issue | 🟡 MinorUpdate missing-agent error text to include OpenClaw.
At Line 324, the error message omits
openclaweven though it is now supported, which can mislead users during setup.Suggested patch
- "No supported agent CLI detected (claude, codex, opencode). Install one, then rerun `selftune init`.", + "No supported agent CLI detected (claude, codex, opencode, openclaw). Install one, then rerun `selftune init`.",🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cli/selftune/init.ts` around lines 323 - 325, The thrown error message in init.ts (the Error thrown where the CLI detects supported agents) omits the newly supported "openclaw" agent; update the error string thrown (the throw new Error(...) call) to list "openclaw" alongside "claude, codex, opencode" so users see all supported agent CLIs when the check fails.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cli/selftune/cron/setup.ts`:
- Around line 131-137: The code is logging success after awaiting Bun.spawn
without checking the process exit code; update both the setup and remove flows
(the Bun.spawn call stored in proc, and the subsequent await proc.exited) to
inspect proc.exitCode (or the resolved exit status) and only log "Registered:
${job.name} — ${job.description}" (or the equivalent removal success message)
when the exit code is 0; if non-zero, log an error including the exit code and
command context (including job.name/job.description) and fail the operation
(throw or process.exit with a non-zero code) so failures are not reported as
successes.
- Around line 114-119: Replace the direct console.error calls that print the
OpenClaw install guidance with the CLI's structured JSON logging utility used
across cli/selftune (import or reuse the module logger, commonly named
jsonLogger or logger); remove all console.* calls in this block and call the
logger's error/info methods instead, emitting structured objects (e.g., { level:
"error", message: "openclaw is not installed or not in PATH.", suggestion:
"https://openclaw.dev/install" } and additional fields for follow-up lines), and
do the same for the other instances noted (around lines 123-124, 148-153,
183-184, 232-243) so all CLI output uses the JSON logger API rather than
console.*.
In `@cli/selftune/ingestors/openclaw-ingest.ts`:
- Around line 241-252: The current toolResult branch double-counts errors by
incrementing errors for msg.isError and again for each content block with
isError/is_error; change the logic to count at most one error per message:
compute a single boolean (e.g., hasError) that is true if msg.isError === true
OR any contentBlocks have isError === true or is_error === true (use
Array.prototype.some on contentBlocks), then increment errors only once when
hasError is true and remove the per-block increments; refer to the role ===
"toolResult" branch, the msg.isError checks, the contentBlocks loop, and the
errors (errors_encountered) variable when making the change.
- Around line 415-431: parseOpenClawSession can return an empty record on
failure but the loop always calls writeSession and marks the session ingested;
change the loop over pending so that after calling parseOpenClawSession you
validate the parsed session (e.g., check for null/undefined or
Object.keys(session).length === 0 and optionally validate required schema
fields) and if the parse is invalid skip writeSession, do not add to newIngested
or increment ingestedCount, and log a warning/error; keep the existing
saveMarker(OPENCLAW_INGEST_MARKER, ...) behavior but ensure it only includes
successful session IDs collected in newIngested so failed parses are not marked
as ingested.
- Around line 337-433: The CLI currently uses raw console.log/console.error in
cliMain and writeSession (e.g., messages showing agentsDir missing, counts,
dry-run/session ingesting, invalid --since error, and final summary); replace
all console.* calls in cliMain and writeSession with the project's structured
JSON logger (use logger.info / logger.error or the project's logging utility),
ensure logs include contextual fields (agentsDir, sessionId, since,
pending/newIngested counts, dryRun flag, and error details) rather than freeform
strings, and update the import at the top of this module to use the structured
logger so all messages (directory-not-found, invalid date, Found X sessions, N
not yet ingested, per-session ingest start, and final Done summary) are emitted
as JSON-structured logs.
In `@docs/design-docs/sandbox-test-harness.md`:
- Around line 25-27: The markdown table starting with "| Command | Expected
Behavior |" lacks surrounding blank lines and triggers markdownlint MD022/MD058;
add a single blank line before and after that table and similarly ensure there
is a blank line above and below the headings and tables referenced around the
sections that correspond to the other flagged ranges (the headings near the
middle and the larger table later), so every heading and table is separated by
one empty line to satisfy markdownlint rules.
In `@docs/strategy/icp-gtm-strategy.md`:
- Line 235: Several fenced code blocks in icp-gtm-strategy.md are missing
language identifiers and one lacks required surrounding blank-line spacing;
locate the bare triple-backtick fences (``` ) used for code samples and add the
appropriate language tag (e.g., ```json, ```yaml, ```bash, or ```markdown) to
each fenced block and ensure there is a blank line before and after the fenced
block where markdownlint flagged MD031; update the specific fences noted (the
bare ``` occurrences around the flagged sections) so they comply with MD040 and
MD031.
In `@docs/strategy/openclaw-integration.md`:
- Around line 64-77: The markdown fences in the OpenClaw integration doc are
missing language identifiers and/or surrounding blank lines; update each fenced
code block (including the one containing the exported async function register
and other fences referenced) to include the appropriate language tag (e.g.,
```typescript) and ensure there is a blank line before and after the fenced
block so markdownlint rules MD031/MD040 pass; focus on the block containing the
exported register function and other blocks that wrap api.on, api.registerCli,
and program.command usages to add the language identifier and surrounding blank
lines consistently.
In `@Makefile`:
- Line 1: The Makefile's .PHONY list is missing the required targets; update the
.PHONY declaration that currently lists lint/test/check/... to include at least
all and clean, and add simple placeholder targets named all (as the default
build/no-op or aggregate target) and clean (to remove build artifacts or
temporary files) so the minphony check passes; modify the existing .PHONY line
and create/ensure the targets all and clean are present in the Makefile.
In `@package.json`:
- Around line 55-56: The package.json defines identical npm scripts "start" and
"boot" both running "bun run cli/selftune/index.ts --help"; either remove the
redundant "boot" script or change one to its intended behavior—update the "boot"
value to the actual startup command or delete the "boot" entry so only "start"
remains; ensure you modify the "boot" and/or "start" keys in package.json
accordingly and run a quick smoke test (npm run start / npm run boot) to verify
the intended script executes.
In `@README.md`:
- Around line 59-63: Add a blank line before the fenced code block in README.md
so the OpenClaw example is separated from the preceding paragraph; locate the
block that starts with the triple backticks and the "bash" language tag and
insert a single empty line immediately above it to satisfy MD031 and ensure
proper rendering.
In `@tests/cron/setup.test.ts`:
- Line 2: The import list includes an unused symbol mkdirSync which fails the
noUnusedImports lint; edit the import statement that currently imports
mkdirSync, mkdtempSync, rmSync, writeFileSync and remove mkdirSync from that
import so only mkdtempSync, rmSync, and writeFileSync remain, ensuring no other
references to mkdirSync exist in the test (check functions using mkdtempSync,
rmSync, writeFileSync) before committing.
In `@tests/ingestors/openclaw-ingest.test.ts`:
- Line 33: Replace the string concatenation used when writing the file by using
a template/template-like join to satisfy the linter: in the test where
writeFileSync is called (the line using writeFileSync(filePath, lines.map((l) =>
JSON.stringify(l)).join("\n") + "\n", "utf-8")), build the file contents without
using `+` (for example, use an array join that includes the trailing newline or
a template literal) so that the call to writeFileSync(filePath, <contents>,
"utf-8") uses a single expression without `+`; update the code around
writeFileSync and the lines.map(JSON.stringify) expression accordingly.
In `@tests/sandbox/docker/Dockerfile.openclaw`:
- Around line 1-26: The Dockerfile runs provisioning and tests as root; create a
dedicated unprivileged user (e.g., "sandbox" or "runner") and switch to it
before runtime: add a step that creates the user and group (useradd/adduser),
chown the workspace and sandbox directories (/app and /sandbox) and any bun
install location (e.g., /root/.bun or move bun into the new user's home), set
ENV HOME to the new user's home, adjust PATH if needed so bun is available to
that user, and add a USER instruction before the CMD so that RUN bash tests/...
and CMD run as the non-root user; ensure the provisioning scripts and results
directory are owned by that user so they can write to them (references: the
Dockerfile RUN provision-claude.sh, RUN provision-openclaw.sh, ENV HOME, RUN
mkdir -p /app/tests/sandbox/results, and CMD
["bun","run","tests/sandbox/docker/run-openclaw-tests.ts"]).
In `@tests/sandbox/docker/run-openclaw-tests.ts`:
- Line 17: Remove the unused symbols causing lint failures: delete existsSync
from the import list at the top (keep mkdirSync and writeFileSync) and remove
the unused stderr variable/reference inside the testDoctor function; update
testDoctor to not declare or reference stderr (or rename/use it if it was
intended) so the linter no longer flags an unused symbol.
- Around line 70-78: Remove the unused import existsSync and the unused stderr
variable in the testDoctor function, and change the subprocess drain logic in
the function that spawns Bun (the code using Bun.spawn and the proc variable) to
read stdout/stderr concurrently with waiting for proc.exited to avoid deadlocks:
concurrently await Promise.all([proc.exited, new Response(proc.stdout).text(),
new Response(proc.stderr).text()]) (or equivalent) and then return the exitCode,
stdout and stderr; reference the Bun.spawn call and the testDoctor symbol to
locate the places to edit.
In `@tests/sandbox/docker/seed-openclaw.sh`:
- Around line 32-39: The fixture copy currently suppresses errors via `cp ...
2>/dev/null || true`, hiding missing/invalid data; remove that suppression and
instead explicitly detect whether JSONL fixtures exist before copying (check the
glob matching `"${agent_dir}/sessions/"*.jsonl`), log a clear error and exit
non-zero if expected fixtures are absent or if `cp` fails, and otherwise perform
the `cp` into `sessions_dir` letting failures surface; apply the same change to
the other similar block (the second `cp` around lines 43-48).
In `@tests/sandbox/results/sandbox-run-1772396841751.json`:
- Around line 4-90: The sandbox artifact contains host-specific absolute paths
in the JSON keys "command" and "stdout"; update the code that emits these
sandbox results to redact or normalize absolute paths (e.g., replace user home
segments and workspace roots with placeholders like <HOME> or <REPO_ROOT>)
before writing the file. Locate the emitter that serializes the test run JSON
(the object containing the "command" and "stdout" fields) and apply a sanitizer
that canonicalizes paths (strip /Users/<user>/..., TMP dirs, and repo-local
absolute paths) and optionally mask timestamps/IDs, then write the sanitized
payload to disk so committed artifacts are portable and leak-free.
In `@tests/sandbox/results/sandbox-run-1772442003340.json`:
- Around line 135-137: The JSON artifact
tests/sandbox/results/sandbox-run-1772442003340.json contains an unformatted
JSON block around the "stderr" field; reformat the file with your project's JSON
formatter (e.g., prettier --write or jq .) so the array/object indentation and
line breaks follow the repo style and produce valid JSON, ensuring the "stderr"
value retains its newline escapes and content exactly as shown and that the file
parses successfully (focus on the "stderr" key and surrounding array/object
structure).
- Around line 4-130: The sandbox result writer is committing host-specific
absolute paths into the "command" field (e.g., "/Users/danielpetro/...") —
update the serialization flow (the code that produces sandbox-run-*.json, e.g.,
serializeSandboxRun / writeSandboxResult) to sanitize commands before writing:
detect absolute workspace/user paths in the "command" string and replace them
with repo-relative paths (or strip the user/home prefix) via a sanitizeCommand
helper invoked just prior to file write; ensure the sanitizer is used for all
command entries emitted by the serialization logic so tests produce portable,
non-leaking artifacts.
In `@tests/sandbox/run-sandbox.ts`:
- Line 535: Replace the string concatenation at the combined assignment with a
template literal to satisfy the useTemplate lint: construct combined using
cronSetupResult.fullStdout and cronSetupResult.stderr via a template string
(e.g., `${cronSetupResult.fullStdout} ${cronSetupResult.stderr}`). Keep the
subsequent .toLowerCase() call and the variable name combined unchanged so the
rest of the code (references to combined) continues to work.
- Around line 100-121: The loops currently use hard-coded arrays
("agent-alpha","agent-beta") and ("Deploy","CodeReview"); instead, discover
fixtures dynamically by listing the agents and skills directories under
openclawFixturesDir and iterate those names. For agents: use
readdirSync(join(openclawFixturesDir, "agents")) to get agentId values (filter
for directories), then proceed with existing logic using
SANDBOX_OPENCLAW_AGENTS, dstSessions, existsSync, readdirSync, and copyFileSync
to copy .jsonl session files. For skills: list entries from
join(openclawFixturesDir, "skills"), filter for directories, then copy SKILL.md
into SANDBOX_OPENCLAW_DIR using dstSkillDir, existsSync, and copyFileSync.
Ensure directory checks still guard copies and preserve the current file filters
(e.g., file.endsWith(".jsonl") and SKILL.md).
- Around line 437-470: The current presence-only checks (in run-sandbox.ts
around ingestResult checks using queryLogPath, telemetryLogPath, and
skillLogPath) allow false positives if OpenClaw records pre-exist; change the
assertions to detect new records produced by this ingestion/dry-run by capturing
a baseline (count or latest timestamp/marker) before running ingestion and then
verifying an increased count or new entries with timestamp/ID >= the run start
after ingestion. Specifically, in the logic that builds openclawQueries,
openclawTelemetry, and openclawSkills, compare against the pre-ingest snapshot
(or check for entries with a run-specific marker/timestamp) and fail if no new
entries were added; apply the same stronger check for the dry-run path that
currently only checks all_queries_log.jsonl so telemetry and skill logs are
validated as well.
---
Outside diff comments:
In `@cli/selftune/init.ts`:
- Around line 323-325: The thrown error message in init.ts (the Error thrown
where the CLI detects supported agents) omits the newly supported "openclaw"
agent; update the error string thrown (the throw new Error(...) call) to list
"openclaw" alongside "claude, codex, opencode" so users see all supported agent
CLIs when the check fails.
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (39)
.devcontainer/devcontainer.json.devcontainer/init-firewall.shARCHITECTURE.mdMakefileREADME.mdcli/selftune/constants.tscli/selftune/cron/setup.tscli/selftune/index.tscli/selftune/ingestors/openclaw-ingest.tscli/selftune/init.tscli/selftune/types.tsdocs/design-docs/index.mddocs/design-docs/sandbox-test-harness.mddocs/product-specs/index.mddocs/strategy/icp-gtm-strategy.mddocs/strategy/openclaw-integration.mdpackage.jsontests/cron/setup.test.tstests/ingestors/openclaw-ingest.test.tstests/sandbox/docker/Dockerfile.openclawtests/sandbox/docker/docker-compose.openclaw.ymltests/sandbox/docker/run-openclaw-tests.tstests/sandbox/docker/seed-openclaw.shtests/sandbox/fixtures/openclaw/agents/agent-alpha/sessions/sess-oc-001.jsonltests/sandbox/fixtures/openclaw/agents/agent-alpha/sessions/sess-oc-002.jsonltests/sandbox/fixtures/openclaw/agents/agent-alpha/sessions/sess-oc-003.jsonltests/sandbox/fixtures/openclaw/agents/agent-beta/sessions/sess-oc-004.jsonltests/sandbox/fixtures/openclaw/agents/agent-beta/sessions/sess-oc-005.jsonltests/sandbox/fixtures/openclaw/cron/jobs.jsontests/sandbox/fixtures/openclaw/skills/CodeReview/SKILL.mdtests/sandbox/fixtures/openclaw/skills/Deploy/SKILL.mdtests/sandbox/provision-openclaw.shtests/sandbox/results/llm-run-1772434673048.jsontests/sandbox/results/llm-run-1772434741410.jsontests/sandbox/results/llm-run-1772434823061.jsontests/sandbox/results/sandbox-run-1772396841751.jsontests/sandbox/results/sandbox-run-1772442003340.jsontests/sandbox/results/sandbox-run-1772442160263.jsontests/sandbox/run-sandbox.ts
…infra - Add isCronJobConfig type guard for runtime validation in cron setup - Check subprocess exit codes in cron setup/removal - Fix error double-counting in OpenClaw session ingestion - Add session validation before writing ingested sessions - Fix subprocess pipe deadlock with Promise.all in Docker test runner - Add non-root user to Dockerfile.openclaw - Use nullglob instead of error swallowing in seed-openclaw.sh - Dynamic fixture discovery in sandbox runner instead of hardcoded lists - Remove unused imports flagged by Biome (mkdirSync, existsSync, stderr) - Fix template literal lint errors in test files - Add all/clean targets to Makefile, remove redundant boot script - Fix markdown lint issues (MD031, MD040, MD058) in docs - Remove outdated milestones section from README (CHANGELOG is source of truth) - Sanitize host-specific paths in sandbox result samples for portability Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (6)
tests/sandbox/docker/seed-openclaw.sh (1)
37-39:⚠️ Potential issue | 🟠 MajorHandle empty fixture globs explicitly before
cp.With
nullglob, empty matches can triggercpwith missing operands and an unclear failure path. Validate matched files and fail with a clear message.🔧 Suggested guard rails
if [ -d "${agent_dir}/sessions" ]; then - cp "${agent_dir}/sessions/"*.jsonl "$sessions_dir/" + session_files=( "${agent_dir}/sessions/"*.jsonl ) + if [ "${`#session_files`[@]}" -eq 0 ]; then + echo "ERROR: No session fixtures found in ${agent_dir}/sessions" >&2 + exit 1 + fi + cp "${session_files[@]}" "$sessions_dir/" fi done @@ target_skill="${TARGET_DIR}/skills/${skill_name}" mkdir -p "$target_skill" - cp "${skill_dir}"* "$target_skill/" + skill_files=( "${skill_dir}"* ) + if [ "${`#skill_files`[@]}" -eq 0 ]; then + echo "ERROR: No skill fixtures found in ${skill_dir}" >&2 + exit 1 + fi + cp -R "${skill_files[@]}" "$target_skill/" done fiAlso applies to: 48-48
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/sandbox/docker/seed-openclaw.sh` around lines 37 - 39, The cp call can be invoked with no operands if the glob matches nothing; modify the logic around copying session fixtures to first expand the glob into an array (e.g., files=("${agent_dir}/sessions/"*.jsonl")) and test that at least one match exists (test -e "${files[0]}") before calling cp "${files[@]}" "$sessions_dir/"; if no matches, print a clear error and exit non‑zero. Apply the same guard for the other cp at the second occurrence (the copy at the other location mentioned in the comment).docs/strategy/icp-gtm-strategy.md (1)
571-576:⚠️ Potential issue | 🟡 MinorAdd required blank-line spacing around the fenced setup block (MD031).
The fenced bash block near Line 572 still violates markdownlint fence spacing rules.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/strategy/icp-gtm-strategy.md` around lines 571 - 576, The fenced bash block under the "Setup:" heading violates MD031 (missing blank line(s) around fenced code); fix by adding a blank line immediately before the triple-backtick fence and a blank line immediately after the closing triple-backtick so the fenced block is separated from surrounding text (update the block shown with the openclaw cron add command).docs/strategy/openclaw-integration.md (1)
64-77:⚠️ Potential issue | 🟡 MinorResolve remaining fenced-block spacing violations (MD031).
These fenced code blocks are still missing required blank lines around fences, so markdownlint will continue to warn.
Also applies to: 177-185, 188-231, 293-303
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/strategy/openclaw-integration.md` around lines 64 - 77, The fenced code block in the register function (export default async function register) violates MD031 because there are no blank lines immediately inside the opening and closing triple-backtick fences; fix by adding a blank line after the opening fence and before the closing fence for the block containing api.on("message:received"), api.on("agent:bootstrap"), and api.registerCli (and apply the same blank-line fixes to the other fenced blocks mentioned); ensure every fenced code block (including blocks around the program.command('selftune') example) has one empty line after the opening ``` and one empty line before the closing ```.cli/selftune/ingestors/openclaw-ingest.ts (1)
332-338: 🧹 Nitpick | 🔵 TrivialUse structured JSON logging utility instead of raw console output.
This module uses
console.log/console.errorthroughoutwriteSessionandcliMain(lines 333-336, 383-385, 392, 402, 405, 414-416, 420-424, 435). Per coding guidelines, CLI modules should route logs through the project's structured JSON logger.As per coding guidelines:
cli/selftune/**/*.ts: "Use structured JSON logging utility for all log output".🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cli/selftune/ingestors/openclaw-ingest.ts` around lines 332 - 338, Replace all raw console.log/console.error calls in writeSession and cliMain with the project's structured JSON logging utility (use the existing CLI logger used elsewhere in the repo, e.g., logger.info / logger.error). Specifically update the dry-run block that logs sessionId, session.assistant_turns, skills, and prompt, plus every other console.* occurrences in writeSession and cliMain so they emit structured JSON entries (include context fields like sessionId, prompt, turns, skills, error). Ensure you import and use the same logger instance/pattern used by other cli/selftune modules.cli/selftune/cron/setup.ts (1)
124-131: 🧹 Nitpick | 🔵 TrivialUse structured JSON logging utility instead of raw console output.
This module uses
console.log/console.errorthroughout (lines 124-130, 133, 139, 147-149, 152, 156, 166-168, 177-182, 191, 195, 199, 207-209, 212, 216, 249-259). Per coding guidelines, CLI modules should route logs through the project's structured JSON logger.As per coding guidelines:
cli/selftune/**/*.ts: "Use structured JSON logging utility for all log output".tests/sandbox/run-sandbox.ts (1)
433-513:⚠️ Potential issue | 🟠 MajorHarden OpenClaw assertions to verify per-run deltas (not absolute presence).
These checks can pass on false positives when OpenClaw records already exist in fixtures. Also,
ingest-openclaw --dry-runcurrently validates onlyall_queries_log.jsonl, so telemetry/skill mutations can slip through.Proposed fix
+ const markerPath = join(SANDBOX_SELFTUNE_DIR, "openclaw-ingest-marker.json"); + const countSource = (path: string, source: string): number => { + if (!existsSync(path)) return 0; + return readFileSync(path, "utf-8") + .split("\n") + .filter((line) => { + if (!line.trim()) return false; + try { + return JSON.parse(line).source === source; + } catch { + return false; + } + }).length; + }; + + const queryOpenclawBefore = countSource(queryLogPath, "openclaw"); + const telemetryOpenclawBefore = countSource(telemetryLogPath, "openclaw"); + const skillOpenclawBefore = countSource(skillLogPath, "openclaw"); + const ingestResult = await runCliCommand("ingest-openclaw", [ "ingest-openclaw", "--agents-dir", SANDBOX_OPENCLAW_AGENTS, ]); - // Verify: exit 0 + new records in logs with source: "openclaw" + // Verify: exit 0 + new OpenClaw records were added by this run if (ingestResult.passed) { - // existing presence-only parsing... - if (openclawQueries.length === 0) { ... } - if (openclawTelemetry.length === 0) { ... } - if (openclawSkills.length === 0) { ... } + if (countSource(queryLogPath, "openclaw") <= queryOpenclawBefore) { + ingestResult.passed = false; + ingestResult.error = "Expected new openclaw query records after ingestion"; + } + if (countSource(telemetryLogPath, "openclaw") <= telemetryOpenclawBefore) { + ingestResult.passed = false; + ingestResult.error = "Expected new openclaw telemetry records after ingestion"; + } + if (countSource(skillLogPath, "openclaw") <= skillOpenclawBefore) { + ingestResult.passed = false; + ingestResult.error = "Expected new openclaw skill records after ingestion"; + } - - const markerPath = join(SANDBOX_SELFTUNE_DIR, "openclaw-ingest-marker.json"); if (!existsSync(markerPath)) { ingestResult.passed = false; ingestResult.error = "openclaw-ingest-marker.json not created after ingestion"; } } - const queryLinesBeforeDry = countLines(queryLogPath); + const queryLinesBeforeDry = countLines(queryLogPath); + const telemetryLinesBeforeDry = countLines(telemetryLogPath); + const skillLinesBeforeDry = countLines(skillLogPath); const dryRunResult = await runCliCommand("ingest-openclaw --dry-run", [ "ingest-openclaw", "--agents-dir", SANDBOX_OPENCLAW_AGENTS, "--dry-run", ]); if (dryRunResult.passed) { const queryLinesAfterDry = countLines(queryLogPath); + const telemetryLinesAfterDry = countLines(telemetryLogPath); + const skillLinesAfterDry = countLines(skillLogPath); if (queryLinesAfterDry !== queryLinesBeforeDry) { dryRunResult.passed = false; dryRunResult.error = `Dry run should not write records (before: ${queryLinesBeforeDry}, after: ${queryLinesAfterDry})`; } + if ( + telemetryLinesAfterDry !== telemetryLinesBeforeDry || + skillLinesAfterDry !== skillLinesBeforeDry + ) { + dryRunResult.passed = false; + dryRunResult.error = "Dry run should not mutate telemetry or skill logs"; + } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/sandbox/run-sandbox.ts` around lines 433 - 513, The current assertions inspect logs for any "openclaw" records which can be false positives if fixtures already contain those entries; change the checks in the ingest and dry-run blocks to verify per-run deltas instead of absolute presence: before running ingest-openclaw capture line counts for queryLogPath, telemetryLogPath, and skillLogPath (and existence of markerPath), run runCliCommand, then compare counts after to ensure each increased by the expected number of records (or marker created) using the same variables (ingestResult, queryLogPath, telemetryLogPath, skillLogPath, markerPath, SANDBOX_SELFTUNE_DIR); for the dry-run block (dryRunResult) capture the same pre-run counts and marker state and assert none of those changed (including marker not created) so telemetry and skill logs are validated alongside all_queries_log.jsonl; reuse helper countLines and only set ingestResult.error/dryRunResult.error with clear delta details on failure.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/sandbox/docker/seed-openclaw.sh`:
- Line 21: The curl healthcheck in the if condition that checks
"${GATEWAY_URL}/healthz" can hang because it lacks explicit timeouts; update the
command used in that conditional (the if curl -sf "${GATEWAY_URL}/healthz" >
/dev/null 2>&1 check) to add sensible --connect-timeout and --max-time values
(e.g., --connect-timeout 2 --max-time 5) so the check respects the retry budget
and will fail fast instead of blocking.
In `@tests/sandbox/results/sandbox-run-1772442160263.json`:
- Line 35: The status table is using padEnd(16) on skill names without
truncation, causing long names (e.g., "ai-image-generation3%") to overflow into
the next column; update the rendering where skill names are formatted (look for
usages of skill.name.padEnd(16) or padEnd(16) inside the status table
generation, e.g., in the function that builds the "Skills" table) to use
skill.name.slice(0, 16).padEnd(16) so names are truncated to 16 chars then
padded to preserve column alignment.
---
Duplicate comments:
In `@cli/selftune/ingestors/openclaw-ingest.ts`:
- Around line 332-338: Replace all raw console.log/console.error calls in
writeSession and cliMain with the project's structured JSON logging utility (use
the existing CLI logger used elsewhere in the repo, e.g., logger.info /
logger.error). Specifically update the dry-run block that logs sessionId,
session.assistant_turns, skills, and prompt, plus every other console.*
occurrences in writeSession and cliMain so they emit structured JSON entries
(include context fields like sessionId, prompt, turns, skills, error). Ensure
you import and use the same logger instance/pattern used by other cli/selftune
modules.
In `@docs/strategy/icp-gtm-strategy.md`:
- Around line 571-576: The fenced bash block under the "Setup:" heading violates
MD031 (missing blank line(s) around fenced code); fix by adding a blank line
immediately before the triple-backtick fence and a blank line immediately after
the closing triple-backtick so the fenced block is separated from surrounding
text (update the block shown with the openclaw cron add command).
In `@docs/strategy/openclaw-integration.md`:
- Around line 64-77: The fenced code block in the register function (export
default async function register) violates MD031 because there are no blank lines
immediately inside the opening and closing triple-backtick fences; fix by adding
a blank line after the opening fence and before the closing fence for the block
containing api.on("message:received"), api.on("agent:bootstrap"), and
api.registerCli (and apply the same blank-line fixes to the other fenced blocks
mentioned); ensure every fenced code block (including blocks around the
program.command('selftune') example) has one empty line after the opening ```
and one empty line before the closing ```.
In `@tests/sandbox/docker/seed-openclaw.sh`:
- Around line 37-39: The cp call can be invoked with no operands if the glob
matches nothing; modify the logic around copying session fixtures to first
expand the glob into an array (e.g., files=("${agent_dir}/sessions/"*.jsonl"))
and test that at least one match exists (test -e "${files[0]}") before calling
cp "${files[@]}" "$sessions_dir/"; if no matches, print a clear error and exit
non‑zero. Apply the same guard for the other cp at the second occurrence (the
copy at the other location mentioned in the comment).
In `@tests/sandbox/run-sandbox.ts`:
- Around line 433-513: The current assertions inspect logs for any "openclaw"
records which can be false positives if fixtures already contain those entries;
change the checks in the ingest and dry-run blocks to verify per-run deltas
instead of absolute presence: before running ingest-openclaw capture line counts
for queryLogPath, telemetryLogPath, and skillLogPath (and existence of
markerPath), run runCliCommand, then compare counts after to ensure each
increased by the expected number of records (or marker created) using the same
variables (ingestResult, queryLogPath, telemetryLogPath, skillLogPath,
markerPath, SANDBOX_SELFTUNE_DIR); for the dry-run block (dryRunResult) capture
the same pre-run counts and marker state and assert none of those changed
(including marker not created) so telemetry and skill logs are validated
alongside all_queries_log.jsonl; reuse helper countLines and only set
ingestResult.error/dryRunResult.error with clear delta details on failure.
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (19)
MakefileREADME.mdcli/selftune/cron/setup.tscli/selftune/ingestors/openclaw-ingest.tsdocs/design-docs/sandbox-test-harness.mddocs/strategy/icp-gtm-strategy.mddocs/strategy/openclaw-integration.mdtests/cron/setup.test.tstests/ingestors/openclaw-ingest.test.tstests/sandbox/docker/Dockerfile.openclawtests/sandbox/docker/run-openclaw-tests.tstests/sandbox/docker/seed-openclaw.shtests/sandbox/results/llm-run-1772434673048.jsontests/sandbox/results/llm-run-1772434741410.jsontests/sandbox/results/llm-run-1772434823061.jsontests/sandbox/results/sandbox-run-1772396841751.jsontests/sandbox/results/sandbox-run-1772442003340.jsontests/sandbox/results/sandbox-run-1772442160263.jsontests/sandbox/run-sandbox.ts
- Add --connect-timeout 2 --max-time 5 to curl healthcheck in seed-openclaw.sh - Truncate skill names to 16 chars in status table to prevent column overflow Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/sandbox/docker/seed-openclaw.sh`:
- Around line 58-59: The script currently only prints the number of seeded
sessions and can silently continue if that count is zero; change it to compute
the seeded-files count (using the same find invocation against
"${TARGET_DIR}/agents" and *.jsonl), then if the count is 0 emit an error
message indicating no sessions were seeded (including the evaluated path/glob)
and exit with non-zero status, otherwise print the existing "Seeded: ...
sessions" and "Done." messages; refer to the existing variables TARGET_DIR and
the find/wc pipeline used on the lines with the echo that produces the count.
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
cli/selftune/status.tstests/sandbox/docker/seed-openclaw.sh
The script previously printed the seeded session count but continued silently even when that count was zero. Now it captures the count, checks for zero, and exits non-zero with a descriptive error message including the evaluated path/glob. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
♻️ Duplicate comments (1)
tests/sandbox/docker/seed-openclaw.sh (1)
44-49: 🧹 Nitpick | 🔵 TrivialUse
cp -rfor robustness with skill fixtures.The current glob pattern
cp "${skill_dir}"*works for the existing fixtures (which contain only files), but fails silently if subdirectories are added. Usingcp -rensures directory structures are copied and future-proofs against changes to fixture organization.Suggested fix
- cp "${skill_dir}"* "$target_skill/" + cp -r "${skill_dir}"* "$target_skill/"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/sandbox/docker/seed-openclaw.sh` around lines 44 - 49, The loop that copies fixtures uses cp "${skill_dir}"* which fails silently when a skill fixture contains subdirectories; update the copy to use recursive copy (cp -r) so directories are preserved—modify the code that references skill_dir/skill_name/target_skill (inside the for loop) to call cp -r "${skill_dir}"* "$target_skill/" to ensure nested directories and files are copied robustly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@tests/sandbox/docker/seed-openclaw.sh`:
- Around line 44-49: The loop that copies fixtures uses cp "${skill_dir}"* which
fails silently when a skill fixture contains subdirectories; update the copy to
use recursive copy (cp -r) so directories are preserved—modify the code that
references skill_dir/skill_name/target_skill (inside the for loop) to call cp -r
"${skill_dir}"* "$target_skill/" to ensure nested directories and files are
copied robustly.
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
tests/sandbox/docker/seed-openclaw.sh
Bug fixes: - Fix routing retries using wrong refiner in evolve-body.ts (#15) - Fix baseline gate evaluating wrong skill body in evolve.ts (#19) - Fix Gate 1 accepting routing tables with zero data rows (#20) - Fix section heading prefix match allowing false positives (#13) Security: - Add ReDoS protection for regex assertions (length cap, try-catch, u flag) (#10) - Harden YES/NO parsing to first-token extraction with punctuation stripping (#23) Type safety & validation: - Add type params to readJsonl calls in baseline.ts (#3) - Validate LLM JSON shape before treating as SkillUnitTest[] (#4) - Add JSON.parse validation in unit-test-cli.ts (#9) Code quality: - Replace console.warn with structured JSON logging (#5, #11) - Sort readdirSync for deterministic traversal (#6) - Tighten fuzzy match to prevent over-selection (#7) - Rename evalFailures to positiveEvals for clarity (#8) - Enforce timeout_ms via Promise.race in runUnitTest (#12) - Enrich deploy audit entry with gates snapshot (#16) - Add Pareto 5th dimension comment (#18) - Extract composability CLI into composability-cli.ts (#21) - Add missing eval files to lint-architecture.ts (#28) - Reformat AssertionType union (#22), template literal (#14), imports (#17) - Reformat baseline conditional (#2) Documentation: - Renumber scope expansion list (#24) - Add text language to fenced code blocks (#25, #27, #30) - Add rollback audit payload note (#26) - Clarify --dir expects corpus root (#29) - Make --tests optional in syntax (#31) Tests: - Remove unused afterEach import (#32) - Replace ! assertion with ?. optional chaining (#33) - Fix temp dir cleanup leak with afterAll (#34) - Add Gate 2 status assertion (#40) - Strengthen routing validation assertions (#41) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
Integrates OpenClaw platform support with session ingestion, adds a two-layer sandbox test harness (local and devcontainer-based), introduces firewall-isolated development containers, updates documentation to milestone nomenclature, and expands CLI with cron job management and new subcommands.
New Features
selftune croncommand for managing scheduled jobsChanges by Cohort
Test Coverage
Stats