Skip to content

feat(sync-state): per-skill content hashes for drift detection#382

Merged
christso merged 1 commit into
mainfrom
feat/374-content-hashes
May 12, 2026
Merged

feat(sync-state): per-skill content hashes for drift detection#382
christso merged 1 commit into
mainfrom
feat/374-content-hashes

Conversation

@christso
Copy link
Copy Markdown
Contributor

Summary

Adds per-skill contentHash to sync-state under a new sources map keyed by owner/repo. Each entry records the resolved ref + commit SHA and a hash for every installed skill. This is the lockfile-shaped foundation allagents skill update --dry-run (#375) needs to detect drift without re-fetching.

  • Hash algorithm: sha256 over a sorted list of per-file sha256(content) digests. No mtimes; reproducible across machines and re-clones.
  • Field names chosen as a strict subset of gh-skill's .skill-lock.json (contentHashskillFolderHash) so a future migration to that file format is mechanical.
  • Backward-compatible: a pre-existing sync-state.json without sources loads without error.

Also fixes a pre-existing project-scope marketplace resolution bug in addPlugin (resolvePluginSpecWithAutoRegister was being called without workspacePath, hiding freshly-registered project marketplaces). Without that fix the issue's skills add brainstorming --from obra/superpowers gate fails.

Test plan

  • bun run build
  • bun test — 1191 pass / 0 fail
  • Verification gate from feat: skill-level content hashes in sync-state for drift detection (lockfile interop) #374 (using stable obra/superpowers per the issue's "Substitute the example repo if needed" note):
    • After install, .sources["obra/superpowers"].resolvedSha ≥ 7 chars and .skills.brainstorming.contentHash is sha256:-prefixed.
    • Remove + re-install produces the same hash (deterministic).
    • A local edit to .claude/skills/brainstorming/SKILL.md changes the on-disk sha vs. the recorded hash (drift detectable).
    • A pre-existing sync-state.json without sources still loads (workspace status exits 0).

Closes #374

@cloudflare-workers-and-pages
Copy link
Copy Markdown

cloudflare-workers-and-pages Bot commented May 12, 2026

Deploying allagents with  Cloudflare Pages  Cloudflare Pages

Latest commit: bb93391
Status:⚡️  Build in progress...

View logs

@christso
Copy link
Copy Markdown
Contributor Author

Code Review: ⚠️ APPROVE WITH SIGNIFICANT NON-BLOCKING CONCERNS

Functional design

Algorithm choice is good: per-file sha256(content), sorted by POSIX relative path, folded into an outer sha256. No mtimes. Reproducible. Schema field names (contentHash, pluginSpec, resolvedRef, resolvedSha, pinnedRef) are a strict subset of gh-skill's .skill-lock.json as recommended by the issue — future migration would be mechanical at the rename level (the hash values would still differ, since the gh-skill folding algorithm isn't matched, but the issue scopes interop to field names).

Verification gate (#374)

  • (1) .sources[...].resolvedSha ≥ 7 chars; .skills.<name>.contentHash is sha256:-prefixed ✓
  • (2) Re-install → same hash (deterministic) ✓
  • (3) Local edit changes the on-disk hash (drift detectable) ✓
  • (4) Pre-existing sync-state.json without sources loads (schema field is .optional()) ✓

CLAUDE.md compliance

  • Conventional commit ✓
  • No Co-Authored-By
  • Test coverage thin — see below.

Non-blocking issues

1. Major overlap with #381 — these are not independent PRs.

Both PRs add:

Whichever lands first will create a substantial rebase for the other. Recommend: merge #381 first (smaller-scoped, version-pinning is the prerequisite mental model), then rebase #382 to extend SyncStateSourceSchema with skills and replace recordSourceProvenance with recordContentProvenance (which would be the superset). The PR descriptions of both already acknowledge they're "designed to land together"; making the dependency explicit in merge order avoids losing fields.

2. Test coverage is thin for a feature this central (CLAUDE.md: "one test per distinct code path")

Zero unit tests for new code:

  • computeSkillFolderHashthe algorithm of this PR. Determinism, sort-order independence (intentional), and the "mtimes excluded" property should each be pinned by a test. A simple two-file fixture with shuffled readdir order would lock in the sort behavior.
  • upsertSyncStateSource and upsertSyncStateSkill — non-atomic read-modify-write, multiple branches (existing source vs. placeholder creation).
  • resolveSkillFolder — three candidate path strategies; each branch is a distinct code path.
  • recordContentProvenance (skip on non-GitHub source, skip on missing fetch result, full happy path).

The PR's "1191 pass / 0 fail" is the pre-existing test count; the only new tests appear to be the verification-gate E2E checks run manually.

3. upsertSyncStateSkill creates a placeholder source on missing-key.

const currentSource: SyncStateSource = currentSources[sourceKey] ?? {
  pluginSpec: sourceKey,
  resolvedRef: 'HEAD',
  resolvedSha: '',
};

If a caller calls upsertSyncStateSkill without first calling upsertSyncStateSource, the sync-state file ends up with resolvedSha: '' — invalid as a SHA but still passing the schema (which is just z.string()). The doc comment hand-waves this as "callers usually call upsertSyncStateSource first." Relying on call ordering is fragile; either:

  • Require the caller to pass the source (no fallback), or
  • Tighten the schema with z.string().regex(/^[0-9a-f]{40}$/).or(z.literal('')) and document the placeholder semantic explicitly.

recordContentProvenance does happen to call both in the right order, so today this is latent. Worth fixing before another caller is added.

4. Drive-by fix in addPlugin.

-    const resolved = await resolvePluginSpecWithAutoRegister(plugin);
+    const resolved = await resolvePluginSpecWithAutoRegister(plugin, { workspacePath });

This is described in the PR body as fixing "a pre-existing project-scope marketplace resolution bug." It's a legitimate one-liner, but:

  • It belongs in its own commit (the PR is one squash commit covering both).
  • No unit test added for the fixed behavior.

If you keep the squash, please add at least one line to the squash commit message: Also fixes: addPlugin missed project-scope marketplaces when resolving plugin@market specs. so it's discoverable in git log.

5. walk() ignores symlinked directories.

entry.isDirectory() returns false for symlinks. A skill folder containing a symlinked subdirectory would silently omit those files from the hash. Probably not a real concern for skills (which are flat-ish markdown), but worth a sentence in the doc comment so future readers don't assume full tree traversal.


Suggested merge order

Approves on functional merit. If addressing the above:

Extends sync-state with a `sources` map keyed by `owner/repo`, each carrying
the resolved git ref + commit SHA and a per-skill `contentHash` derived
deterministically from the on-disk skill folder contents. Lays the groundwork
for `allagents skill update --dry-run` (issue #375) to diff installed vs.
upstream without re-fetching.

- src/models/sync-state.ts: add SyncStateSkillSchema + SyncStateSourceSchema
  with optional `skills` map. Field names are a strict subset of gh-skill's
  `.skill-lock.json` (`contentHash` ↔ `skillFolderHash`) so a future migration
  is mechanical.
- src/core/sync-state.ts: add `computeSkillFolderHash` — sha256 over a sorted
  list of per-file sha256 digests, no mtimes, so the hash is reproducible
  across machines and re-clones. Also `upsertSyncStateSource` /
  `upsertSyncStateSkill` for atomic field updates that preserve the rest of
  the sync-state file.
- src/core/plugin.ts: `fetchPlugin` returns `resolvedRef` and `resolvedSha`
  (via a private `resolveHeadSha` helper using `git rev-parse HEAD`).
- src/cli/commands/plugin-skills.ts: after a successful `skills add --from`
  install (single or `--all`), record the source provenance and per-skill
  content hash via the new helpers.
- src/core/workspace-modify.ts: pre-existing bug — `addPlugin` was calling
  `resolvePluginSpecWithAutoRegister` without `workspacePath`, hiding
  project-scope marketplaces from the resolver. Plumbed through so a
  newly-registered project marketplace is discoverable on the same call.

Verified end-to-end against `obra/superpowers`:
- contentHash present and `sha256:`-prefixed after install.
- Re-install (after `skills remove`) produces the same hash.
- A local edit to the installed SKILL.md changes its on-disk sha vs. the
  recorded sync-state hash (drift detectable).
- Old sync-state.json without `sources` loads without error.

Closes #374
@christso christso force-pushed the feat/374-content-hashes branch from 7c296cd to bb93391 Compare May 12, 2026 14:48
@christso christso merged commit 8411e42 into main May 12, 2026
1 check was pending
@christso christso deleted the feat/374-content-hashes branch May 12, 2026 14:48
christso added a commit that referenced this pull request May 12, 2026
#389)

Reverts the per-skill content-hash machinery added in #382. For git-based
plugins, identity is already given for free by `url + ref` — `git rev-parse
HEAD` after each fetch returns the commit SHA, which is what `resolvedSha`
already records on the source entry. Computing a per-file sha256 tree over
every installed skill folder on every install adds CPU work and code
complexity without giving us anything `resolvedSha` doesn't.

- src/models/sync-state.ts: remove `SyncStateSkillSchema` and the `skills`
  map on `SyncStateSourceSchema`. Each lock entry is now just
  `{ pluginSpec, resolvedRef, resolvedSha, pinnedRef? }`. The top-level
  `lastSync` already records the timestamp.
- src/core/sync-state.ts: drop `computeSkillFolderHash`, `upsertSyncStateSkill`,
  and the unused `crypto.createHash` / `readdir` / `relative` imports. The
  private `persistMergedSyncState` helper folds back into
  `upsertSyncStateSource`.
- src/cli/commands/plugin-skills.ts: rename `recordContentProvenance` →
  `recordSourceProvenance` and drop the now-unused `skills` param + skill
  folder walk. The two callers (`--from <skill>` install and `--all`)
  no longer need to thread `skills` through.

Verified end-to-end: `skill add brainstorming --from obra/superpowers --pin
v3.1.0` writes a `sources["obra/superpowers"]` entry with `pluginSpec`,
`resolvedRef`, `resolvedSha`, `pinnedRef` only — no `skills` map.

`bun test`: 1209 pass / 0 fail.

Closes #388
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: skill-level content hashes in sync-state for drift detection (lockfile interop)

1 participant