Skip to content

feat(skillopt): wire pi + generalize skill detection + CodeQL fix#244

Merged
kaghni merged 10 commits into
mainfrom
feat/skillopt-multiagent-hardening
Jun 8, 2026
Merged

feat(skillopt): wire pi + generalize skill detection + CodeQL fix#244
kaghni merged 10 commits into
mainfrom
feat/skillopt-multiagent-hardening

Conversation

@kaghni
Copy link
Copy Markdown
Collaborator

@kaghni kaghni commented Jun 7, 2026

Combined follow-up to the SkillOpt release (3 items in one PR per request).

1. Other agents — generalize detection + wire pi

  • armSkillOptOnSkillUse now arms on EITHER a Skill tool (claude) OR a read of …/skills/<name--author>/SKILL.md (agents with no first-class Skill tool). markSkillPending still gates org-shape + pull manifest.
  • pi wired: the extension arms on a tool_result SKILL.md read and, on the next user prompt (the reaction), spawns the bundled skillopt-worker (agent=pi). Logic inlined into the raw-.ts extension (zero-dep); HIVEMIND_SKILLOPT_* env names are the cross-process contract. esbuild builds + install-pi installs the worker for pi.
  • codex still blocked: its exec_command read of SKILL.md isn't captured (0.136.0 PostToolUse gap) — nothing to detect yet. Filed as a prerequisite.

2. CodeQL — command-injection fix

resolveAgentBin did execSync(\command -v ${AGENT_CMD[agent] ?? agent}`); agenttraces toHIVEMIND_SKILLOPT_AGENTand the?? agent` fallback let an arbitrary value reach a shell. Now resolves KNOWN agents only by walking $PATH in Node (no shell/subprocess). Verified: known agent → bin path, injection payload → undefined.

3. Coverage

  • skillopt-hook: SKILL.md-read arming + ref extraction (cherry-picked with the generalization).
  • pi-extension-source: 5 new source-level guards for the pi wiring (runtime-loaded → source is the bundle guard).

tsc clean (changed files); 69 tests green.

⚠️ Before merge

pi wiring needs a live pi-session E2E — the extension is loaded + compiled by pi's runtime, so it's only source-tested here. Same pattern we used for claude (which is now live-verified). Don't merge until a real pi session has been driven through arm→react→publish.

Summary by CodeRabbit

  • New Features

    • SkillOpt: session-aware detection of org-skill usage that can create per-session judgment windows and spawn background judgment workers triggered by subsequent user reactions.
  • Improvements

    • SkillOpt worker bundles are built and installed for Codex, Hermes, and Pi so judgments run where applicable.
    • More robust agent-binary resolution avoiding shell subprocesses.
    • SkillOpt respects a runtime kill-switch and recursion guard.
  • Tests

    • Expanded test coverage for wiring, detection, env-contracts, arming/reacting flows, and edge cases.

kaghni added 3 commits June 7, 2026 01:08
…ables pi/codex)

Claude invokes skills via a Skill tool; pi/codex have no Skill tool — they READ the skill's
SKILL.md file. Generalize armSkillOptOnSkillUse to also arm on a 'read' tool whose path
matches */skills/<ref>/SKILL.md, recovering the name--author ref from the dir segment
(markSkillPending still gates org-shape + manifest, so bare/non-org dirs are rejected).
Restricted to read tools so EDITING a SKILL.md doesn't arm. Makes pi wireable now, and codex
wireable once its capture records the SKILL.md read.

Tests: SKILL.md-read arming, non-read/non-SKILL.md rejection, skillRefFromSkillFileRead path
extraction (10 hook tests).
…and-injection)

CodeQL flagged resolveAgentBin: `execSync(\`command -v ${AGENT_CMD[agent] ?? agent}\`)`
as 'indirect uncontrolled command line' — `agent` traces back to the HIVEMIND_SKILLOPT_AGENT
env var, and the `?? agent` fallback let an arbitrary value (e.g. 'x; rm -rf ~') reach a shell.

Fix: drop the fallback (resolve KNOWN agents only) and walk $PATH in Node — no shell, no
subprocess — so no command line is ever built from env-derived input. Still finds nvm/volta/fnm
installs (the reason the old `command -v` existed), since those are on PATH. Verified: a known
agent resolves to its bin, an injection payload resolves to undefined.
…n worker

pi has no first-class Skill tool — it USES an org skill by READING its SKILL.md — so the pi
extension now arms SkillOpt on a tool_result whose path is .../skills/<name--author>/SKILL.md
and, on the next user prompt (the reaction), spawns the bundled skillopt-worker (agent=pi) to
judge + improve. Logic is inlined into the raw-.ts extension (zero non-builtin deps, can't import
the trigger); the HIVEMIND_SKILLOPT_* env names are the cross-process contract with the worker.
Gated on org-shape, the kill switch, and the in-worker recursion guard; both call sites sit after
the captureEnabled check so the worker's own pi-judge subprocess can't re-fire.

- esbuild.config.mjs: build skillopt-worker into pi/bundle
- install-pi.ts: install skillopt-worker.js into ~/.pi/agent/hivemind (cleanup already covers it)
- pi extension: skilloptArm (tool_result) + skilloptReact (input) + detached worker spawn
- tests: 5 source-level guards for the wiring (runtime-loaded → source is the bundle guard)

NOTE: codex still blocked — its exec read of SKILL.md isn't captured (0.136.0 PostToolUse gap).
NOTE: pi wiring needs a LIVE pi-session E2E before merge (runtime-loaded; only source-tested here).
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Jun 7, 2026

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Detects SKILL.md loads to arm a per-session SkillOpt window, persists pending state in the pi extension, reacts on next user input by spawning a detached skillopt-worker, wires codex/hermes/pi hooks, updates bundling/install, replaces worker shell resolution with PATH-walking, and adds tests.

Changes

SkillOpt integration into pi extension

Layer / File(s) Summary
Shared hook & invocation detection
src/hooks/shared/skillopt-hook.ts, src/skillify/skill-invocations.ts, tests/shared/skillopt-hook.test.ts, tests/shared/skill-invocations.test.ts
Adds skillRefFromSkillFileRead and pathToSkillRef, expands invokedSkillRef to detect SKILL.md reads and command patterns, and updates armSkillOptOnSkillUse to honor a DISABLED flag and arm via the helper.
Pi extension lifecycle wiring
pi/extension-source/hivemind.ts, tests/pi/pi-extension-source.test.ts, tests/shared/skillopt-trigger.test.ts
Adds per-session pending judgment window storage and functions, calls skilloptArm after successful tool_result capture, and calls skilloptReact after persisting user input to spawn the detached skillopt-worker; includes integration tests for spawn env contract, budget/mutation, and gating.
Codex hooks: capture & pre-tool-use wiring
src/hooks/codex/capture.ts, src/hooks/codex/pre-tool-use.ts, tests/shared/agent-model.test.ts
Calls reactSkillOpt after capture and armSkillOptOnSkillUse early in pre-tool-use main flow; adds test verifying -m model arg is passed when configured.
Hermes hooks: capture & pre-tool-use wiring
src/hooks/hermes/capture.ts, src/hooks/hermes/pre-tool-use.ts
Captures reactPrompt in Hermes pre_llm_call and calls reactSkillOpt after capture; invokes armSkillOptOnSkillUse early in Hermes pre-tool-use main flow.
SkillOpt worker bundling & pi installer
esbuild.config.mjs, src/cli/install-pi.ts
Adds skillopt-worker entries to Codex/Hermes/Pi bundles and updates installPi() to copy skillopt-worker.js into the pi agent hivemind directory when present, with a log message.
SkillOpt worker PATH-walking binary resolver
src/skillify/skillopt-worker.ts
Replaces shell-based command -v/execSync resolution with a PATH-walking resolver that checks whitelisted agent binaries via fs.accessSync(..., X_OK) and returns the first executable or undefined.
SkillOpt tests & meta
tests/shared/*, tests/pi/*
Adds and extends tests for parsing, arming, trigger spawn behavior, improve/judge flows, priorEditSummaries formatting, and guard cases (session mismatch, empty reactions, proposer/no-edits, dedup, credential gating).

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested reviewers

  • efenocchi

Poem

🐰 I hopped through paths where SKILL.md lay,
I marked a window for judgment today,
A worker wakes when the user replies,
PATH walks tidy — no shell in disguise. 🥕

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Description check ⚠️ Warning The description covers the three main objectives with technical detail, but lacks a version bump specification as required by the template. Add a version bump section specifying the new version in package.json. For new features (SkillOpt pi integration), use a minor version bump (e.g., 1.3.0).
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The PR title accurately summarizes the three main changes: pi agent wiring, generalized skill detection, and CodeQL security fix.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/skillopt-multiagent-hardening

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 7, 2026

Coverage Report

Scope: files changed in this PR. Enforced threshold: 90% per metric (per file via vitest.config.ts).

Status Category Percentage Covered / Total
🔴 Lines 89.92% (🎯 90%) 473 / 526
🔴 Statements 87.18% (🎯 90%) 544 / 624
🔴 Functions 80.00% (🎯 90%) 40 / 50
🔴 Branches 81.78% (🎯 90%) 377 / 461
File Coverage — 8 files changed
File Stmts Branches Functions Lines
src/cli/install-pi.ts 🟢 100.0% 🟢 100.0% 🟢 100.0% 🟢 100.0%
src/hooks/codex/capture.ts 🟢 100.0% 🔴 85.3% 🟢 100.0% 🟢 100.0%
src/hooks/codex/pre-tool-use.ts 🟢 96.8% 🔴 84.0% 🔴 80.0% 🟢 98.0%
src/hooks/hermes/capture.ts 🔴 86.8% 🔴 83.0% 🟢 100.0% 🔴 86.8%
src/hooks/hermes/pre-tool-use.ts 🟢 92.1% 🔴 87.5% 🟢 100.0% 🟢 90.0%
src/hooks/shared/skillopt-hook.ts 🟢 100.0% 🟢 100.0% 🟢 100.0% 🟢 100.0%
src/skillify/skill-invocations.ts 🟢 95.8% 🔴 87.1% 🟢 100.0% 🟢 100.0%
src/skillify/skillopt-worker.ts 🔴 0.0% 🔴 0.0% 🔴 0.0% 🔴 0.0%

Generated for commit 576fba6.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
tests/pi/pi-extension-source.test.ts (1)

197-204: ⚡ Quick win

Make SkillOpt wiring assertions specific (not substring-based).

The current checks can pass with incorrect logic. For this block, assert the exact arming condition and path shape instead of broad substrings.

Proposed test tightening
-    expect(PI_SRC).toContain("/skills/"); // the ref-from-path matcher targets …/skills/<ref>/SKILL.md
-    expect(PI_SRC).toContain("SKILL");
+    expect(PI_SRC).toMatch(/if\s*\(\s*event\.isError\s*!==\s*true\s*\)\s*skilloptArm\(sessionId,\s*event\.input,\s*event\.toolCallId\)/);
+    expect(PI_SRC).toMatch(/\/skills\/\(\[\^\/\]\+\)\/SKILL\\\.md\$/);

As per coding guidelines: tests/** should prefer specific values/patterns over generic substrings.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/pi/pi-extension-source.test.ts` around lines 197 - 204, The test uses
broad substring checks on PI_SRC which can give false positives; update the
assertions to match the exact arming condition and path shape instead of loose
substrings: assert that PI_SRC contains the exact call expression
"skilloptArm(sessionId, event.input, event.toolCallId)" guarded by the exact
condition "event.isError !== true" together in the same conditional, and assert
the ref-from-path matches the precise pattern for SKILL.md under a skills folder
(e.g. a regex like /\/skills\/[^\/]+\/SKILL\.md/); use exact string equality or
a targeted regex match on PI_SRC rather than simple expect(...).toContain checks
to ensure the wiring is correct.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pi/extension-source/hivemind.ts`:
- Line 480: Replace the hard-coded SKILLOPT_PENDING_DIR that points to
join(homedir(), ".deeplake", "state", "skillify", "skillopt", "pending") with
the shared state-dir contract: call getStateDir() (which must treat
empty/whitespace HIVEMIND_STATE_DIR as unset and otherwise honor the override)
and build the pending path relative to that result (e.g., join(getStateDir(),
"skillify", "skillopt", "pending")); update any references to
SKILLOPT_PENDING_DIR to use this new computed path so HIVEMIND_STATE_DIR
overrides are respected.
- Around line 501-505: Restrict arming to read-only tool results: inside
skilloptArm, add a guard that returns unless the incoming tool result explicitly
represents a "read" operation (check the appropriate field on toolInput such as
toolInput.toolName / toolInput.name / toolInput.action or equivalent used in
callers) before computing ref via skilloptRefFromPath; and update the other call
site that invokes skilloptArm for successful tool results so it only calls
skilloptArm when that tool result is a read operation as well.

---

Nitpick comments:
In `@tests/pi/pi-extension-source.test.ts`:
- Around line 197-204: The test uses broad substring checks on PI_SRC which can
give false positives; update the assertions to match the exact arming condition
and path shape instead of loose substrings: assert that PI_SRC contains the
exact call expression "skilloptArm(sessionId, event.input, event.toolCallId)"
guarded by the exact condition "event.isError !== true" together in the same
conditional, and assert the ref-from-path matches the precise pattern for
SKILL.md under a skills folder (e.g. a regex like
/\/skills\/[^\/]+\/SKILL\.md/); use exact string equality or a targeted regex
match on PI_SRC rather than simple expect(...).toContain checks to ensure the
wiring is correct.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: b970a107-71da-4386-9c21-d9cfb67b94a8

📥 Commits

Reviewing files that changed from the base of the PR and between 0b35aec and 7649d47.

📒 Files selected for processing (7)
  • esbuild.config.mjs
  • pi/extension-source/hivemind.ts
  • src/cli/install-pi.ts
  • src/hooks/shared/skillopt-hook.ts
  • src/skillify/skillopt-worker.ts
  • tests/pi/pi-extension-source.test.ts
  • tests/shared/skillopt-hook.test.ts

Comment thread pi/extension-source/hivemind.ts Outdated
Comment thread pi/extension-source/hivemind.ts Outdated
kaghni added 2 commits June 7, 2026 20:06
codex has no Skill/read tool — it SHELLS a read of …/skills/<name--author>/SKILL.md, so the
path lives in the command string. Generalize detection to match a SKILL.md path inside a shell
`command` (pathToSkillRef, shared by the hook + the worker's invokedSkillRef so the window
finds it), and wire codex's hooks: pre-tool-use arms on the Bash command, capture reacts on
UserPromptSubmit (agent=codex — judging runs on the codex CLI, already verified). esbuild builds
skillopt-worker into codex's bundle (install-codex copyDir already ships it).

- skillopt-hook: arm on a shell command reading SKILL.md (codex/hermes), via shared pathToSkillRef
- skill-invocations: invokedSkillRef recognizes read-path OR shell-command SKILL.md loads
- codex pre-tool-use: armSkillOptOnSkillUse in main(); codex capture: reactSkillOpt on UserPromptSubmit
- tests: shell-command arming + invocation recognition

NOTE: needs a live codex E2E (build + install the codex bundle, then a real codex session).
codex reads org skills via a Bash command like `sed -n '1,220p' ~/.codex/skills/.system/<name--author>/SKILL.md`
(confirmed from a real codex session). The detector regex only matched a DIRECT …/skills/<ref>/SKILL.md,
missing the intermediate `.system/` segment. Allow intermediate dirs, capturing the dir right before
SKILL.md; markSkillPending still gates org-shape so `.system`/bare system-skill dirs are rejected.
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/hooks/codex/pre-tool-use.ts`:
- Around line 372-375: Wrap the armSkillOptOnSkillUse call in a local try/catch
inside main so it cannot throw and short‑circuit the flow: call
armSkillOptOnSkillUse(...) inside try, on error catch and log the error (using
the existing logging facility) and set a local failClosed flag or skillOptArmed
= false to ensure the system treats the skill as unarmed/unsafe; then always
await processCodexPreToolUse(...) afterward (never skip it), and ensure no
exception from armSkillOptOnSkillUse bubbles out of main. Reference:
armSkillOptOnSkillUse, processCodexPreToolUse, main.

In `@src/skillify/skill-invocations.ts`:
- Around line 68-77: The SQL prefilter that currently only selects rows where
tool_name = "Skill" must be broadened so SKILL.md-based invocations aren't
dropped before invokedSkillRef() runs; update the query used where rows are
fetched (the code that narrows results before calling invokedSkillRef) to also
include tool calls whose tool_input/tool_path/tool_command reference "SKILL.md"
(or otherwise match pathToSkillRef semantics), e.g., change the WHERE clause to
accept tool_name = 'Skill' OR the tool_input/path/command contains the SKILL.md
marker so pathToSkillRef(io?.path|command) can detect them; adjust the fetching
logic in the same module (skill-invocations.ts) where the coarse SQL prefilter
is implemented.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: c4e719ed-1acd-44cc-9e83-2341e62fec74

📥 Commits

Reviewing files that changed from the base of the PR and between 7649d47 and 9c6de81.

📒 Files selected for processing (7)
  • esbuild.config.mjs
  • src/hooks/codex/capture.ts
  • src/hooks/codex/pre-tool-use.ts
  • src/hooks/shared/skillopt-hook.ts
  • src/skillify/skill-invocations.ts
  • tests/shared/skill-invocations.test.ts
  • tests/shared/skillopt-hook.test.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • tests/shared/skillopt-hook.test.ts
  • src/hooks/shared/skillopt-hook.ts

Comment thread src/hooks/codex/pre-tool-use.ts
Comment thread src/skillify/skill-invocations.ts
…ompt

hermes uses an org skill the same way codex does — by shelling a read of its SKILL.md via the
'terminal' tool (path in tool_input.command). Reuse the shared shell-command detection:
pre-tool-use arms via armSkillOptOnSkillUse on the terminal command, capture reacts on the
pre_llm_call user prompt (agent=hermes — judging runs on the hermes CLI). esbuild builds
skillopt-worker into hermes's bundle (install-hermes copyDir already ships it).

No new detection code — hermes rides the codex generalization (pathToSkillRef on a shell command).
Source-tested; like pi, needs a live hermes session E2E before merge.
@kaghni
Copy link
Copy Markdown
Collaborator Author

kaghni commented Jun 8, 2026

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Jun 8, 2026

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

kaghni added 3 commits June 8, 2026 02:18
…ools (coderabbit)

CodeRabbit flagged two issues in the pi wiring: (1) the pending dir hard-coded
~/.deeplake/state/skillify/... and ignored HIVEMIND_STATE_DIR, splitting pi's state from the
rest of Skillify (and from test-isolation overrides); now mirrors getStateDir()'s contract.
(2) skilloptArm fired on ANY tool_result with a matching path, so an edit/write of a SKILL.md
could open a judgment window; now passes the toolName and arms only on read tools. Test updated.
…ool gate (coderabbit)

armSkillOptOnSkillUse is internally swallowed, but its call in main() was unguarded — a throw
would skip processCodexPreToolUse and the top-level catch exits 0 (fail-open for the memory-path
gate). Wrap the call so the SkillOpt side-effect is fail-closed and can't affect the tool decision.
…ads (coderabbit)

The coarse SQL prefilter matched only '"Skill"' tool_calls, dropping the newly-supported
SKILL.md read/shell invocations (pi/codex/hermes) before invokedSkillRef could evaluate them.
Prefilter now matches a Skill tool_call OR a /SKILL.md path. Test asserts both branches.
@kaghni
Copy link
Copy Markdown
Collaborator Author

kaghni commented Jun 8, 2026

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Jun 8, 2026

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Add unit coverage for paths the dependency-injected tests never exercised: the REAL fileStore +
spawnWorker + defaultIsOrgSkill/defaultHasCreds in skillopt-trigger (was 27% func/61% branch →
91%/86%); readCurrentSkillRow column-parsing fallbacks (install/scope/version ternaries,
contributors as array vs JSON-string vs non-JSON vs object); the improve flow's empty-reaction,
proposer-no-change, dedup, and session_id-mismatch branches; agent-model's codex model-override;
parseMessage non-object input; summarizeEdit's no-content/no-target halves; and windowing's
session-collision + empty-turn drops. 86 skillopt unit tests, all green.
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (5)
tests/shared/skillopt-trigger.test.ts (2)

121-122: ⚡ Quick win

Restore the previous HIVEMIND_STATE_DIR instead of always deleting it.

Current cleanup can leak global env mutation when a value existed before this suite.

Proposed change
 describe("default fileStore + spawnWorker (real implementations)", () => {
   let tmp: string;
-  beforeEach(() => { tmp = mkdtempSync(join(tmpdir(), "skillopt-trig-")); process.env.HIVEMIND_STATE_DIR = tmp; spawnMock.mockClear(); });
-  afterEach(() => { delete process.env.HIVEMIND_STATE_DIR; });
+  let prevStateDir: string | undefined;
+  beforeEach(() => {
+    prevStateDir = process.env.HIVEMIND_STATE_DIR;
+    tmp = mkdtempSync(join(tmpdir(), "skillopt-trig-"));
+    process.env.HIVEMIND_STATE_DIR = tmp;
+    spawnMock.mockClear();
+  });
+  afterEach(() => {
+    if (prevStateDir === undefined) delete process.env.HIVEMIND_STATE_DIR;
+    else process.env.HIVEMIND_STATE_DIR = prevStateDir;
+  });
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/shared/skillopt-trigger.test.ts` around lines 121 - 122, The test suite
currently overwrites process.env.HIVEMIND_STATE_DIR in beforeEach and
unconditionally deletes it in afterEach, which can leak global env mutation;
modify the hooks so beforeEach saves the original value (e.g., const
prevHivemindStateDir = process.env.HIVEMIND_STATE_DIR) before assigning tmp
(created via mkdtempSync and stored in tmp) and calling spawnMock.mockClear(),
and then have afterEach restore process.env.HIVEMIND_STATE_DIR to that saved
prevHivemindStateDir (or delete it only if prev was undefined) instead of always
deleting it.

134-145: ⚡ Quick win

Make the worker script assertion path-specific (not substring-based).

Use basename and exact match so path-shape changes don’t mask regressions.

Proposed change
-import { join } from "node:path";
+import { join, basename } from "node:path";
@@
-    expect(args[0]).toContain("skillopt-worker.js");
+    expect(basename(args[0])).toBe("skillopt-worker.js");

As per coding guidelines, tests/** should prefer asserting on specific values (paths, messages) over generic substrings.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/shared/skillopt-trigger.test.ts` around lines 134 - 145, Replace the
substring assertion on the worker script with a path-specific exact match: in
tests/shared/skillopt-trigger.test.ts where spawnWorker is exercised via
runEventTrigger and you inspect spawnMock.mock.calls (args[0]), call
path.basename(args[0]) and assert it equals "skillopt-worker.js"
(expect(path.basename(args[0])).toBe("skillopt-worker.js")) so path-shape
changes won’t hide regressions; ensure you import/require the path module in the
test file.

Source: Coding guidelines

tests/shared/skill-invocations.test.ts (1)

123-135: ⚡ Quick win

Strengthen this with an exact transcript assertion.

This can pass while order/format regresses. Assert the full output string instead of partial toContain checks.

Proposed change
   it("drops rows from a different session_id (path-LIKE collision) and empty-content turns", async () => {
@@
     const out = await windowAroundInvocation(fn, TABLE, inv, { before: 5, after: 5 });
-    expect(out).not.toContain("another session"); // OTHER session_id dropped
-    expect(out).toContain("real");
-    expect(out).toContain("pushback");
+    expect(out).toBe("USER: real\n\nUSER: pushback");
   });

As per coding guidelines, tests/** should prefer asserting on specific values (paths, messages) over generic substrings.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/shared/skill-invocations.test.ts` around lines 123 - 135, The test uses
loose partial assertions (expect(...).toContain) which allows ordering/format
regressions; update the test in tests/shared/skill-invocations.test.ts that
calls mockQuery and windowAroundInvocation (symbols: mockQuery,
windowAroundInvocation, TABLE, inv) to assert the exact transcript output
instead of substrings — replace the three toContain checks with a single precise
assertion (e.g., expect(out).toEqual(...) or
expect(out).toMatchInlineSnapshot(...)) that includes the full expected output
string/format including the retained "real" and "pushback" lines and the exact
representation for the tool_call, ensuring the dropped OTHER session and skipped
empty assistant message are absent.

Source: Coding guidelines

tests/shared/skillopt-improve.test.ts (1)

88-93: ⚡ Quick win

Assert the “no reaction appended” behavior directly.

This test currently proves improve/publish still happens, but not the claimed window behavior. Spy on judge input and assert the whitespace-reaction path passes the same window as reaction: undefined.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/shared/skillopt-improve.test.ts` around lines 88 - 93, The test should
additionally verify that a whitespace reaction is treated like undefined by
spying on the input passed to judge: call improveSkillIfFailed(base(query, {
reaction: "   " })) and capture the arguments given to the judge function (or
the helper that forwards to judge) and assert that the window/body passed equals
the window/body when calling improveSkillIfFailed(base(query, { reaction:
undefined })); update the test to create a judge spy (or mock) and compare its
received window for the two calls so you directly assert the "no reaction
appended" behavior alongside the existing assertions about
judged/failed/improved and inserts length.
tests/shared/skillopt-meta.test.ts (1)

35-41: ⚡ Quick win

Use exact summaries instead of regex on joined output.

Prefer asserting the concrete prior entries directly (e.g., exact array/elements) rather than joined-string regex matches, so this test fails only on real formatting regressions.

Proposed assertion tightening
-    expect(prior.join(" ")).toMatch(/delete @"old rule"/);
-    expect(prior.join(" ")).toMatch(/append: x/);
+    expect(prior).toEqual(['delete @"old rule"', "append: x"]);

As per coding guidelines, tests/** should prefer asserting on specific values over generic substrings.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/shared/skillopt-meta.test.ts` around lines 35 - 41, Replace the loose
regex assertions that inspect prior.join(" ") with exact-value assertions on the
prior array produced by priorEditSummaries (created via metaEntryFor) so the
test fails only on real output changes: locate the test case using
metaEntryFor("p","k",...) and the prior = priorEditSummaries(...) call, then
assert prior equals the expected array (or assert specific prior[0] and prior[1]
strings) that contain the exact "delete @\"old rule\"" and "append: x" entries
instead of using toMatch on the joined string.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@tests/shared/skill-invocations.test.ts`:
- Around line 123-135: The test uses loose partial assertions
(expect(...).toContain) which allows ordering/format regressions; update the
test in tests/shared/skill-invocations.test.ts that calls mockQuery and
windowAroundInvocation (symbols: mockQuery, windowAroundInvocation, TABLE, inv)
to assert the exact transcript output instead of substrings — replace the three
toContain checks with a single precise assertion (e.g., expect(out).toEqual(...)
or expect(out).toMatchInlineSnapshot(...)) that includes the full expected
output string/format including the retained "real" and "pushback" lines and the
exact representation for the tool_call, ensuring the dropped OTHER session and
skipped empty assistant message are absent.

In `@tests/shared/skillopt-improve.test.ts`:
- Around line 88-93: The test should additionally verify that a whitespace
reaction is treated like undefined by spying on the input passed to judge: call
improveSkillIfFailed(base(query, { reaction: "   " })) and capture the arguments
given to the judge function (or the helper that forwards to judge) and assert
that the window/body passed equals the window/body when calling
improveSkillIfFailed(base(query, { reaction: undefined })); update the test to
create a judge spy (or mock) and compare its received window for the two calls
so you directly assert the "no reaction appended" behavior alongside the
existing assertions about judged/failed/improved and inserts length.

In `@tests/shared/skillopt-meta.test.ts`:
- Around line 35-41: Replace the loose regex assertions that inspect
prior.join(" ") with exact-value assertions on the prior array produced by
priorEditSummaries (created via metaEntryFor) so the test fails only on real
output changes: locate the test case using metaEntryFor("p","k",...) and the
prior = priorEditSummaries(...) call, then assert prior equals the expected
array (or assert specific prior[0] and prior[1] strings) that contain the exact
"delete @\"old rule\"" and "append: x" entries instead of using toMatch on the
joined string.

In `@tests/shared/skillopt-trigger.test.ts`:
- Around line 121-122: The test suite currently overwrites
process.env.HIVEMIND_STATE_DIR in beforeEach and unconditionally deletes it in
afterEach, which can leak global env mutation; modify the hooks so beforeEach
saves the original value (e.g., const prevHivemindStateDir =
process.env.HIVEMIND_STATE_DIR) before assigning tmp (created via mkdtempSync
and stored in tmp) and calling spawnMock.mockClear(), and then have afterEach
restore process.env.HIVEMIND_STATE_DIR to that saved prevHivemindStateDir (or
delete it only if prev was undefined) instead of always deleting it.
- Around line 134-145: Replace the substring assertion on the worker script with
a path-specific exact match: in tests/shared/skillopt-trigger.test.ts where
spawnWorker is exercised via runEventTrigger and you inspect
spawnMock.mock.calls (args[0]), call path.basename(args[0]) and assert it equals
"skillopt-worker.js" (expect(path.basename(args[0])).toBe("skillopt-worker.js"))
so path-shape changes won’t hide regressions; ensure you import/require the path
module in the test file.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 562b933c-1584-404e-955c-9ab0c56ef902

📥 Commits

Reviewing files that changed from the base of the PR and between 0bde430 and 53a4952.

📒 Files selected for processing (6)
  • tests/shared/agent-model.test.ts
  • tests/shared/skill-invocations.test.ts
  • tests/shared/skill-org-publish.test.ts
  • tests/shared/skillopt-improve.test.ts
  • tests/shared/skillopt-meta.test.ts
  • tests/shared/skillopt-trigger.test.ts

@kaghni kaghni merged commit ed7f7c2 into main Jun 8, 2026
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants