feat(cli): persistent skill cache to skip prpm install on repeat launches by khaliqgant · Pull Request #124 · AgentWorkforce/workforce

khaliqgant · 2026-05-15T19:25:19Z

Summary

Two layers of caching for agentworkforce agent <persona> skill installs, so repeat launches stop re-running npx prpm install / npx skills add.

Layer 1 — persistent source-keyed cache. Content-addressed dir under ~/.agentworkforce/workforce/cache/plugins/<fingerprint>/, keyed by (harness, sorted skill sources, local-file SHA). On a hit the install subprocess is skipped entirely — claude reuses it as --plugin-dir; opencode/codex mirror it into the relayfile mount. Source-string or local-file edits rotate the fingerprint automatically.

Layer 2 — opt-in upstream drift detection. Because a floating ref (@org/skill, github.com/org/repo#x) doesn't change the fingerprint when upstream publishes, the marker (now schema v2) records each remote skill's upstream identity at install time and re-probes it on a TTL:

Source	Recorded	Probe
prpm	resolved `version` from `prpm.lock`	`GET registry.prpm.dev/api/v1/packages/<pkg>` → `latest_version.version`
skill.sh	GitHub blob SHA of `SKILL.md` (path from `skills-lock.json`)	GitHub Contents API w/ `If-None-Match` → 304 when unchanged
local `.md`	—	covered by the fingerprint's content hash

If now - lastUpstreamCheckAt > interval (default 24h), parallel probes run; any drift downgrades the cache hit to a miss and the reinstall self-heals the marker. Every probe fails open — a network error, timeout, 4xx/5xx, or malformed body is treated as "no drift" so a flaky registry never blocks or slows a launch.

Control surface

Flag / env	Effect
(default)	cache on; upstream check every 24h
`--refresh-skills`	unconditional reinstall into the cache dir
`--no-skill-cache` / `AGENTWORKFORCE_NO_SKILL_CACHE=1`	bypass the cache entirely (per-session install)
`--check-upstream`	force a drift probe this launch (ignore TTL)
`--no-check-upstream`	skip the drift probe this launch
`AGENTWORKFORCE_SKILL_CACHE_CHECK_INTERVAL`	`24h` default; `0`=always, `never`/`off`=disable; accepts `ms/s/m/h/d`, bare number = hours
`--install-in-repo`	disengages caching (installs land in the repo, by design)

Tests

persona-kit: 209 tests (# fail 0) — incl. 12 fingerprint/marker tests, 22 new drift/probe tests (mocked HTTP: prpm & github success, 304, 404/5xx fail-open, timeout, mixed sets, v1→v2 migration, TTL math, marker round-trip).
cli: 188 tests (# fail 0) — incl. flag-parse coverage for all four new flags.
Full repo: 590/590 passing across 8 packages.

End-to-end (real services, not mocked)

Verified against live prpm.dev + api.github.com:

Probe layer: recorded 1.0.0 → detects 1.0.0 → 1.1.3; recorded current → no drift; GitHub stale SHA → drift; GitHub real SHA → 304 "unchanged".
Full CLI cycle with a real prpm skill (@agent-relay/choosing-swarm-patterns):
1. cache miss → real prpm install → marker records resolved 1.1.3
2. in-TTL relaunch → plain cache hit, no probe (fast path)
3. --check-upstream + current → "Skills up to date — reusing cache"
4. tampered marker (1.0.0) + --check-upstream → live-detected drift → reinstall → marker self-heals to 1.1.3
5. tampered marker + --no-check-upstream → probe skipped, plain cache hit
Layer-1 e2e (claude --plugin-dir, opencode mount mirror, --refresh-skills, --no-skill-cache, local-file edit invalidation) all confirmed.

Decision trail

A trail trajectory documenting the design decisions (cache shape, never-auto-invalidate fingerprint, all-harness scope, drift-probe choice, precise blob SHA vs. repo-HEAD, schema-v2 migration) is committed at .trajectories/completed/2026-05/traj_47ulsb0rwbid.{json,md}.

🤖 Generated with Claude Code

…ches Every `agentworkforce agent <persona>` previously spawned a fresh `npx prpm install …` per declared skill, even when the skill set hadn't changed — `npx` resolution + registry lookups + tarball fetches added several seconds of latency to every launch. Introduce a content-addressed cache under `~/.agentworkforce/workforce/cache/plugins/<fingerprint>/` keyed by a SHA-256 of `(harness, sorted skill sources, local file contents)`. On a hit, the install subprocess is skipped entirely: - claude reuses the cache dir directly as `--plugin-dir <cacheDir>`. - opencode / codex mirror the cache contents into the relayfile mount before launch (mount-ignored patterns keep this from syncing back to the user's repo). Invalidation is conservative: changing a skill source string (or editing a local `.md` skill) rotates the fingerprint and produces a fresh cache entry. To force-refresh past unchanged remote sources, pass `--refresh-skills`; to bypass the cache entirely, pass `--no-skill-cache` or set `AGENTWORKFORCE_NO_SKILL_CACHE=1`. `--install-in-repo` disengages caching — the user explicitly opted into having installs land in their working tree, so mirroring a cache back into the repo would silently re-introduce the leakage that flag is designed to surface. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-05-15T19:25:33Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 2a7f1126-e80b-43e3-bc8e-fe8e62a8950a

📥 Commits

Reviewing files that changed from the base of the PR and between 2afd176 and 9ab2ba7.

📒 Files selected for processing (2)

packages/cli/src/cli.test.ts
packages/cli/src/cli.ts

🚧 Files skipped from review as they are similar to previous changes (1)

packages/cli/src/cli.ts

📝 Walkthrough

Walkthrough

Adds a persistent, content-addressed skill-install cache and upstream drift probing; exposes cache/probe APIs via persona-kit; adds CLI flags (--no-skill-cache, --refresh-skills, upstream check controls) and integrates cache-aware install/mirror logic, locking, and TTL-gated upstream re-checks into runInteractive and mount/non-mount flows.

Changes

Persistent skill-cache system

Layer / File(s)	Summary
Skill-cache module and comprehensive tests `packages/persona-kit/src/skill-cache.ts`, `packages/persona-kit/src/skill-cache.test.ts`	Core module implements canonical SHA-256 fingerprinting (harness + stable-sorted skills, folding in local `.md` file hashes), marker persistence (`.aw-skill-cache.json`, schema v2, v1 read-compatible), marker read/write/update, and `isSkillCacheValid`. Tests cover fingerprint determinism, order-independence, local-content sensitivity, marker round-trip, schema upgrade, upstream round-trip, and `updateSkillCacheMarkerUpstream` behavior.
Upstream drift probe module and tests `packages/persona-kit/src/skill-upstream-probe.ts`, `packages/persona-kit/src/skill-upstream-probe.test.ts`	Install-time builders extract prpm/GitHub identities from cache lockfiles; launch-time probe re-checks registry/GitHub blob SHAs (with conditional GET/ETag) in parallel using injectable fetch with timeout and fail-open per-skill. Scheduling helpers parse intervals and decide due checks. Tests validate parsing, due logic, record building, conditional GET behavior, drift detection, and fail-open/timeouts.
Persona-kit barrel exports `packages/persona-kit/src/index.ts`	Re-exports skill-cache APIs/types (`computeSkillCacheFingerprint`, `skillCacheRoot`, `resolveSkillCacheDir`, `readSkillCacheMarker`, `writeSkillCacheMarker`, `isSkillCacheValid`, `updateSkillCacheMarkerUpstream`, types) and upstream-probe APIs/types (`buildUpstreamRecordsFromCacheDir`, `detectSkillUpstreamDrift`, `parseCheckInterval`, `isUpstreamCheckDue`, types).
CLI imports, helpers, flags, and runInteractive wiring `packages/cli/src/cli.ts`	Adds `fs.cpSync` import and persona-kit cache/probe imports; documents `--no-skill-cache` and `--refresh-skills` in agent help; introduces advisory lock + mirror/write helpers; extends `runInteractive` options; computes fingerprint/cacheDir, checks marker (and optional upstream TTL-based re-check), and on hit mirrors cache or on miss/refresh installs into persistent cache, writes marker, then mirrors. Suppresses installer cleanup when caching is enabled; exports `resolveEnvCheckIntervalMs` and `acquireSkillCacheLock`.
CLI flag parsing & lock tests `packages/cli/src/cli.test.ts`	Adds `parseAgentArgs` test(s) verifying `--no-skill-cache` and `--refresh-skills` parse into `AgentFlags` and preserve persona selector positional arguments; adds `resolveEnvCheckIntervalMs` tests and concurrency tests for `acquireSkillCacheLock` (acquire/block/release and stale/dead-PID stealing).
Trajectories docs & index `.trajectories/completed/*`, `.trajectories/index.json`	Adds a completed trajectory JSON and markdown documenting the persistent cache and upstream probe design and updates the trajectories index `lastUpdated` and entries.

Sequence Diagram

sequenceDiagram
  participant User
  participant CLI as runInteractive
  participant PersonaKit as persona-kit
  participant Installer
  participant CacheFS as cache dir (~/.agentworkforce/...)
  User->>CLI: invoke agent (flags)
  CLI->>PersonaKit: computeSkillCacheFingerprint(harness, skills)
  CLI->>PersonaKit: resolveSkillCacheDir(fingerprint)
  CLI->>PersonaKit: readSkillCacheMarker(cacheDir)
  alt marker valid and not refresh
    CLI->>CacheFS: mirrorSkillCacheInto(cacheDir, target)
  else marker missing/stale or refresh requested
    CLI->>Installer: run installer into cacheDir
    Installer->>PersonaKit: writeSkillCacheMarker(cacheDir, marker)
    CLI->>CacheFS: mirrorSkillCacheInto(cacheDir, target)
  end
  alt marker TTL expired & check-upstream enabled
    CLI->>PersonaKit: detectSkillUpstreamDrift(marker)
    PersonaKit-->>CLI: drift result
    CLI->>Installer: possibly reinstall if drifted
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested reviewers

willwashburn

Poem

🐰 I stitched a cache of skills, neat and small,
Fingerprints hum so installs need not sprawl,
Mirrors copy blossoms when the hashes align,
Or refresh the spring when the flags draw a line,
The workforce naps safe in a cached little hall ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 51.22% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately summarizes the main objective: introducing a persistent skill cache for CLI launches to skip unnecessary installs, which aligns with the core changeset across multiple files.
Description check	✅ Passed	The description is highly detailed and directly related to the changeset, explaining both the two-layer caching system (persistent source-keyed cache and upstream drift detection) and the control surface, with test results and end-to-end verification.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/persistent-skill-cache

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/cli/src/cli.test.ts`:
- Around line 142-163: The test relies on parseAgentArgs reading
process.env.AGENTWORKFORCE_NO_SKILL_CACHE which can make it flaky; modify the
test (the 'parseAgentArgs: --no-skill-cache and --refresh-skills set flags'
case) to explicitly isolate that env var by saving the original
process.env.AGENTWORKFORCE_NO_SKILL_CACHE, then unset or set it to a known value
before calling parseAgentArgs and restore the original value after the
assertions; ensure this isolation is applied around each call that expects the
default (false) so parseAgentArgs sees a controlled environment.

In `@packages/cli/src/cli.ts`:
- Around line 1378-1390: When the user requested a refresh, the install
currently runs into the existing cache dir which can preserve
lockfiles/artifacts; update the install flow around the block that checks
install.commandString, deferInstallToMount, and skillCacheHit so that if the
refresh-skills flag is set you delete/clean skillCacheDir (when
skillCacheFingerprint && skillCacheDir are present) before calling
runInstall(install.command, installLabel); apply the same change to the later
analogous install block (the second runInstall/writeSkillCacheMarker block) so
both flows clear the cache dir prior to reinstalling.
- Around line 1299-1303: The cache validity check using skillCacheDir,
skillCacheFingerprint, isSkillCacheValid and options.refreshSkills must be
protected by a per-fingerprint inter-process lock to prevent two processes from
simultaneously installing into the same cache; wrap the
read-check-and-possible-install sequence in an exclusive lock keyed by the
fingerprint (e.g., create/acquire a fingerprint-specific lock file like
`${skillCacheFingerprint}.lock` in the cache dir or use an OS/file-locking
library) before calling isSkillCacheValid and before performing the install path
that writes into the cache, ensure the lock is always released (use try/finally)
and fall back to re-checking isSkillCacheValid after acquiring the lock so
redundant installs are avoided.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 2bc42043-8182-4578-81dd-7e002bdf8268

📥 Commits

Reviewing files that changed from the base of the PR and between 1a9352e and 5232fbd.

📒 Files selected for processing (5)

packages/cli/src/cli.test.ts
packages/cli/src/cli.ts
packages/persona-kit/src/index.ts
packages/persona-kit/src/skill-cache.test.ts
packages/persona-kit/src/skill-cache.ts

devin-ai-integration

Devin Review found 1 potential issue.

View 5 additional findings in Devin Review.

devin-ai-integration · 2026-05-15T19:34:17Z

+        if (skillCacheHit && skillCacheDir) {
+          mirrorSkillCacheInto(skillCacheDir, handle.mountDir);
+        } else if (skillCachingEnabled && skillCacheDir && skillCacheFingerprint) {
+          mkdirSync(skillCacheDir, { recursive: true });
+          await runInstallOrThrow(install.command, installLabel, skillCacheDir);
+          writeSkillCacheMarker(skillCacheDir, {
+            fingerprint: skillCacheFingerprint,
+            harness,
+            skills: effectiveSelection.skills.map((s) => ({
+              id: s.id,
+              source: s.source
+            }))
+          });
+          mirrorSkillCacheInto(skillCacheDir, handle.mountDir);
+        } else {
+          await runInstallOrThrow(install.command, installLabel, handle.mountDir);
+        }


🔴 --refresh-skills does not remove old cache dir before reinstalling, leaving stale artifacts and risking corrupted cache on failure

When --refresh-skills is used, the code sets skillCacheHit = false at cli.ts:1302 to bypass the cache, but never removes or cleans the existing cache directory before running the install into it. The help text at line 142 says "overwriting any existing cache dir" but the install simply runs on top of the existing contents.

This has two consequences:

Stale artifacts on success: If an upstream skill package changed structure between versions (removed files), those old files persist in the cache dir and get served to the harness via --plugin-dir (claude) or mirrorSkillCacheInto (mount harnesses at cli.ts:1611).

Corrupted cache after partial failure: If the refresh install partially completes then fails (runInstallOrThrow throws at cli.ts:1602), the old cache marker from the previous successful install remains on disk. On the next run without --refresh-skills, isSkillCacheValid at cli.ts:1303 matches the old marker's fingerprint and returns a false cache hit, causing the harness to see a mix of old and partially-updated skill artifacts.

Prompt for agents

The --refresh-skills flag needs to clean the existing cache directory before reinstalling to ensure a truly fresh install. There are two code paths that need this fix: 1. The mount-deferred install path (cli.ts around line 1598-1614): Before the else-if branch that runs runInstallOrThrow into skillCacheDir, the old cache dir should be removed (rmSync with recursive+force) so that stale files dont persist and the old marker is cleared. 2. The claude non-mount install path (cli.ts around line 1378-1391): Similarly, when refreshSkills is true, the cache dir should be cleaned before runInstall runs. A minimal fix would be to add something like: if (options.refreshSkills && skillCacheDir && existsSync(skillCacheDir)) { rmSync(skillCacheDir, { recursive: true, force: true }); } before the install runs in both paths. This ensures (a) stale files are removed on successful refresh, and (b) the old marker is gone if the refresh fails, so subsequent runs correctly see a cache miss instead of a false hit.

Was this helpful? React with 👍 or 👎 to provide feedback.

cubic-dev-ai

No issues found across 5 files

_{Re-trigger cubic}

…che hit The pure cache-hit path is a near-instant cpSync that doesn't print its own spinner — but the surrounding setupSpinner.stop()/start() was redrawing the "Setting up sandbox mount…" line a second time. Only stop/start the setup spinner around branches that actually run an install spinner of their own. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The source-key fingerprint never rotates on an upstream publish when the persona's `source` string is unchanged (floating refs like `@org/skill` or `github.com/org/repo#x`). This adds a cheap, opt-in second check so a new upstream version is actually consumed without a manual refresh. Marker schema bumped to v2 (v1 stays read-compatible, upgraded in place with no upstream records). At install time each remote skill's upstream identity is recorded from the installer lockfiles: - prpm → resolved `version` from prpm.lock - skill.sh → GitHub blob SHA of the installed SKILL.md, located via skills-lock.json `skillPath` On launch, if the marker's `lastUpstreamCheckAt` is older than the interval (default 24h), parallel lightweight probes compare recorded vs. current — prpm registry GET, GitHub Contents API with `If-None-Match` (304, no body, when unchanged). Any drift downgrades the cache hit to a miss so the reinstall picks up the new content; the marker then self-heals. Every probe fails OPEN: a network error, timeout, 4xx/5xx, or malformed body counts as "no drift" so a flaky registry never blocks or slows a launch beyond the timeout. Control surface: - `--check-upstream` force a probe this launch (ignore the TTL) - `--no-check-upstream` skip the probe this launch - `AGENTWORKFORCE_SKILL_CACHE_CHECK_INTERVAL` (24h default; `0`=always, `never`/`off`=disable; accepts ms/s/m/h/d, bare number = hours) Tests: 22 new persona-kit unit tests (mocked HTTP — prpm/github success, 304, 404/5xx fail-open, timeout, mixed sets, v1→v2 migration, marker round-trip, TTL math) + 1 CLI flag-parse test. Verified end-to-end against live prpm.dev and api.github.com: install records the resolved version, in-TTL launches skip probing, a tampered stale version triggers a live-detected reinstall (1.0.0 → 1.1.3), the marker self-heals, and --no-check-upstream bypasses. A decision trajectory for this work is recorded under .trajectories/completed/2026-05/traj_47ulsb0rwbid. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai

🧹 Nitpick comments (1)

packages/persona-kit/src/skill-cache.test.ts (1)

100-124: 💤 Low value

Minor: Test setup could be simplified.

Lines 103-105 attempt to write to a non-existent path, catch the error silently, then create the directory. This works but is confusing. You could simplify by just doing the mkdir with { recursive: true } first.

♻️ Suggested simplification

 test('computeSkillCacheFingerprint: local skill content changes invalidate', async () => {
   await withTmpDir(async (dir) => {
     const skillPath = 'skills/foo.md';
-    await writeFile(join(dir, 'skills', 'foo.md').replace('foo.md', ''), '', 'utf8').catch(
-      () => undefined
-    );
-    // Ensure the dir exists, then write content.
-    await import('node:fs/promises').then((fs) =>
-      fs.mkdir(join(dir, 'skills'), { recursive: true })
-    );
+    const { mkdir } = await import('node:fs/promises');
+    await mkdir(join(dir, 'skills'), { recursive: true });
     await writeFile(join(dir, skillPath), '# original', 'utf8');

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/persona-kit/src/skill-cache.test.ts` around lines 100 - 124, In the
test computeSkillCacheFingerprint: local skill content changes invalidate,
simplify setup by creating the skills directory before writing files: replace
the initial writeFile(...).catch(...) with an explicit await mkdir(join(dir,
'skills'), { recursive: true }) (you already call fs.mkdir later) so that the
skills directory exists prior to writeFile; adjust/remove the redundant
import/second mkdir call and ensure the test still writes join(dir, skillPath)
before computing fingerprints in this test function.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@packages/persona-kit/src/skill-cache.test.ts`:
- Around line 100-124: In the test computeSkillCacheFingerprint: local skill
content changes invalidate, simplify setup by creating the skills directory
before writing files: replace the initial writeFile(...).catch(...) with an
explicit await mkdir(join(dir, 'skills'), { recursive: true }) (you already call
fs.mkdir later) so that the skills directory exists prior to writeFile;
adjust/remove the redundant import/second mkdir call and ensure the test still
writes join(dir, skillPath) before computing fingerprints in this test function.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 73177675-b733-4221-8fb5-cb8a5e00890c

📥 Commits

Reviewing files that changed from the base of the PR and between 09caf5b and 2afd176.

📒 Files selected for processing (10)

.trajectories/completed/2026-05/traj_47ulsb0rwbid.json
.trajectories/completed/2026-05/traj_47ulsb0rwbid.md
.trajectories/index.json
packages/cli/src/cli.test.ts
packages/cli/src/cli.ts
packages/persona-kit/src/index.ts
packages/persona-kit/src/skill-cache.test.ts
packages/persona-kit/src/skill-cache.ts
packages/persona-kit/src/skill-upstream-probe.test.ts
packages/persona-kit/src/skill-upstream-probe.ts

✅ Files skipped from review due to trivial changes (1)

.trajectories/completed/2026-05/traj_47ulsb0rwbid.json

🚧 Files skipped from review as they are similar to previous changes (2)

packages/cli/src/cli.test.ts
packages/cli/src/cli.ts

cubic-dev-ai

3 issues found across 10 files (changes from recent commits).

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name=".trajectories/completed/2026-05/traj_47ulsb0rwbid.json">

<violation number="1" location=".trajectories/completed/2026-05/traj_47ulsb0rwbid.json:132">
P2: Avoid committing a user-specific absolute filesystem path in `projectId`; use a stable, repo-agnostic identifier instead.</violation>
</file>

<file name="packages/persona-kit/src/skill-upstream-probe.ts">

<violation number="1" location="packages/persona-kit/src/skill-upstream-probe.ts:344">
P2: Missing upstream metadata for remote skills is forced to `drifted: true`, so transient install-time probe failures can repeatedly trigger unnecessary reinstalls and negate cache reuse.</violation>
</file>

<file name="packages/cli/src/cli.ts">

<violation number="1" location="packages/cli/src/cli.ts:1406">
P2: `AGENTWORKFORCE_SKILL_CACHE_CHECK_INTERVAL=never` is currently ignored because `null` from `parseCheckInterval` is coalesced to the default interval. This causes upstream drift checks to run when the user explicitly asked to disable them.</violation>
</file>

_{Tip: Review your code locally with the cubic CLI to iterate faster.
Fix all with cubic
Re-trigger cubic}

cubic-dev-ai · 2026-05-15T21:03:54Z

+  ],
+  "commits": [],
+  "filesChanged": [],
+  "projectId": "/Users/khaliqgant/Projects/AgentWorkforce/workforce",


P2: Avoid committing a user-specific absolute filesystem path in projectId; use a stable, repo-agnostic identifier instead.

Prompt for AI agents

Check if this issue is valid — if so, understand the root cause and fix it. At .trajectories/completed/2026-05/traj_47ulsb0rwbid.json, line 132: <comment>Avoid committing a user-specific absolute filesystem path in `projectId`; use a stable, repo-agnostic identifier instead.</comment> <file context> @@ -0,0 +1,144 @@ + ], + "commits": [], + "filesChanged": [], + "projectId": "/Users/khaliqgant/Projects/AgentWorkforce/workforce", + "tags": [], + "_trace": { </file context>

_{Tip: Review your code locally with the cubic CLI to iterate faster.}

cubic-dev-ai · 2026-05-15T21:03:54Z

+          skillId: skill.id,
+          source: skill.source,
+          kind: 'prpm',
+          drifted: true,


P2: Missing upstream metadata for remote skills is forced to drifted: true, so transient install-time probe failures can repeatedly trigger unnecessary reinstalls and negate cache reuse.

Prompt for AI agents

Check if this issue is valid — if so, understand the root cause and fix it. At packages/persona-kit/src/skill-upstream-probe.ts, line 344: <comment>Missing upstream metadata for remote skills is forced to `drifted: true`, so transient install-time probe failures can repeatedly trigger unnecessary reinstalls and negate cache reuse.</comment> <file context> @@ -0,0 +1,464 @@ + skillId: skill.id, + source: skill.source, + kind: 'prpm', + drifted: true, + note: 'no upstream record (pre-v2 marker) — reinstall to capture identity' + }; </file context>

…-interval Four reviewer findings (CodeRabbit / Devin / cubic): 1. `AGENTWORKFORCE_SKILL_CACHE_CHECK_INTERVAL=never` was ignored: `parseCheckInterval` returns `null` for never/off/false, but the `?? DEFAULT` coalesced `null` back to the 24h default, silently re-enabling drift checks the user disabled. New `resolveEnvCheckIntervalMs()` only falls back on `undefined`. 2. `--refresh-skills` now wipes the cache dir before reinstalling, so a refresh is a true rebuild — stale files from a prior skill version no longer linger, and a partial-failure reinstall can't leave the old marker behind to fake a later cache hit. Applied to both the claude pre-mount path and the mount-deferred path. 3. Added a per-fingerprint advisory install lock (`<cacheDir>.lock`, sibling so a refresh-wipe can't delete a live lock). Two concurrent launches of the same persona no longer both miss and install into the same dir. Stale/dead-pid locks are stolen; if the wait budget is exceeded we proceed unlocked (never hang a launch). After acquiring, validity is re-checked so a peer's just-finished install is reused instead of redundantly reinstalled. 4. Test isolation: the parseAgentArgs flag test now saves/clears/ restores AGENTWORKFORCE_NO_SKILL_CACHE so a machine with that env set can't flake the default-false assertions. Tests: +5 (resolveEnvCheckIntervalMs never/default/0; lock acquire- block-release, stale-steal, dead-pid-steal). Full repo 595/595 green. E2E re-verified: a planted stale file is gone after --refresh-skills and no lock file is left behind. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

khaliqgant · 2026-05-15T21:23:01Z

Addressed all review findings in 9ab2ba7:

never interval ignored (cubic, cli.ts) — fixed. parseCheckInterval returns null for never/off/false, but ?? DEFAULT coalesced null back to 24h. Replaced with resolveEnvCheckIntervalMs() which only falls back to the default on undefined (unset/unparseable); null now correctly disables drift checks. Regression test added.
--refresh-skills left stale artifacts (CodeRabbit + Devin) — fixed. The cache dir is now rmSync'd before the reinstall in both install paths (claude pre-mount + mount-deferred). This both removes stale files from a prior skill version and clears the old marker so a partial-failure reinstall can't fake a later cache hit. The wipe runs under the install lock so a peer's in-progress install isn't clobbered. E2E-verified (planted stale file is gone after refresh).
Concurrent installs into the same cache dir (CodeRabbit) — fixed. Added a per-fingerprint advisory lock at <cacheDir>.lock (sibling file, so a refresh-wipe of the dir can't delete a live lock). Created atomically with wx/O_EXCL. A held lock is stolen only if the holder pid is dead or its heartbeat is >15min stale; if the 10min wait budget is exceeded we proceed unlocked rather than hang a launch. After acquiring, isSkillCacheValid is re-checked so a peer's just-completed install is reused instead of redundantly reinstalled. Tests cover acquire/block/release, stale-steal, and dead-pid-steal.
Env-dependent test flake (CodeRabbit, cli.test.ts) — fixed. The parseAgentArgs flag test now saves/clears/restores AGENTWORKFORCE_NO_SKILL_CACHE.

Full repo: 595/595 tests green (+5 new).

cubic-dev-ai

1 issue found across 2 files (changes from recent commits).

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/cli/src/cli.ts">

<violation number="1" location="packages/cli/src/cli.ts:724">
P1: The stale-lock check can steal a live lock after 15 minutes because it returns stale on timestamp age before validating PID liveness, which can re-enable concurrent installs into the same cache dir.</violation>
</file>

_{Tip: Review your code locally with the cubic CLI to iterate faster.
Fix all with cubic
Re-trigger cubic}

cubic-dev-ai · 2026-05-15T21:26:50Z

+    if (Number.isFinite(ts) && Date.now() - ts > SKILL_CACHE_LOCK_STALE_MS) {
+      return true;
+    }
+    const pid = Number(pidRaw);
+    if (Number.isInteger(pid) && pid > 0 && !skillCachePidAlive(pid)) {
+      return true;
+    }


P1: The stale-lock check can steal a live lock after 15 minutes because it returns stale on timestamp age before validating PID liveness, which can re-enable concurrent installs into the same cache dir.

Prompt for AI agents

Check if this issue is valid — if so, understand the root cause and fix it. At packages/cli/src/cli.ts, line 724: <comment>The stale-lock check can steal a live lock after 15 minutes because it returns stale on timestamp age before validating PID liveness, which can re-enable concurrent installs into the same cache dir.</comment> <file context> @@ -677,6 +677,122 @@ function sessionMountDir(sessionRoot: string): string { + try { + const [pidRaw, tsRaw] = readFileSync(lockPath, 'utf8').split('\n'); + const ts = Number(tsRaw); + if (Number.isFinite(ts) && Date.now() - ts > SKILL_CACHE_LOCK_STALE_MS) { + return true; + } </file context>

Suggested change

if (Number.isFinite(ts) && Date.now() - ts > SKILL_CACHE_LOCK_STALE_MS) {

return true;

}

const pid = Number(pidRaw);

if (Number.isInteger(pid) && pid > 0 && !skillCachePidAlive(pid)) {

return true;

}

const pid = Number(pidRaw);

if (Number.isFinite(ts) && Date.now() - ts > SKILL_CACHE_LOCK_STALE_MS) {

if (!Number.isInteger(pid) || pid <= 0 || !skillCachePidAlive(pid)) {

return true;

}

}

if (Number.isInteger(pid) && pid > 0 && !skillCachePidAlive(pid)) {

return true;

}

_{Tip: Review your code locally with the cubic CLI to iterate faster.}

coderabbitai Bot reviewed May 15, 2026

View reviewed changes

Comment thread packages/cli/src/cli.test.ts

Comment thread packages/cli/src/cli.ts Outdated

Comment thread packages/cli/src/cli.ts

devin-ai-integration Bot reviewed May 15, 2026

View reviewed changes

cubic-dev-ai Bot reviewed May 15, 2026

View reviewed changes

Ricky Schema Cascade and others added 2 commits May 15, 2026 22:14

coderabbitai Bot reviewed May 15, 2026

View reviewed changes

cubic-dev-ai Bot reviewed May 15, 2026

View reviewed changes

Conversation

khaliqgant commented May 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Control surface

Tests

End-to-end (real services, not mocked)

Decision trail

Uh oh!

coderabbitai Bot commented May 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration Bot May 15, 2026

Choose a reason for hiding this comment

Uh oh!

cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

cubic-dev-ai Bot May 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

cubic-dev-ai Bot May 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

khaliqgant commented May 15, 2026

Uh oh!

cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

cubic-dev-ai Bot May 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

khaliqgant commented May 15, 2026 •

edited

Loading

coderabbitai Bot commented May 15, 2026 •

edited

Loading

cubic-dev-ai Bot May 15, 2026 •

edited

Loading

cubic-dev-ai Bot May 15, 2026 •

edited

Loading

cubic-dev-ai Bot May 15, 2026 •

edited

Loading