feat(pack): roadmap Moves 1-4 — Sonnet-5 lane, CycloneDX 1.7, pack --prove, --cache-channel #278

Merged
theagenticguy merged 4 commits into main from release/moves-1-4 on Jul 1, 2026

@theagenticguy

Owner

Ships the four OpenCodeHub roadmap moves from the 2026-07-01 M-W-F planning run. Each was developed and gated green on its own branch (move/1..4), then cherry-picked onto this integration branch with the shared-file conflicts (cli/index.ts, code-pack.ts, variance-probe.ts, pack/index.ts) resolved by hand and the full gate suite re-run on the merged result.

Moves

Move 1 — Sonnet-5 tokenizer-provenance lane + variance-probe --pack-tokenizer (packages/pack, packages/eval, packages/cli)
Adds SONNET5_TOKENIZER_ID = anthropic:claude-sonnet-5@2026-06-30 (the anthropic: prefix inherits best_effort determinism). A new --pack-tokenizer <id> flag threads the chosen tokenizer lane into the variance-probe's with-pack arm; the resolved lane is recorded on VarianceReport.packTokenizerId so Finding 0001 v2 can attribute token results to a tokenizer. The default tokenizer and Claude model are unchanged. Motivated by Sonnet 5's tokenizer, which inflates the same bytes ~30-35% vs prior Claude tokenizers.
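
A minimal sketch of how a lane ID of this shape could be split and mapped to a determinism class. Only the ID string and the `best_effort` class for the `anthropic:` prefix come from this PR; `parseTokenizerId`, `TokenizerLane`, and the `unknown` fallback class are hypothetical names for illustration, not the package's API.

```typescript
// The ID value is from the PR description; everything else here is illustrative.
const SONNET5_TOKENIZER_ID = "anthropic:claude-sonnet-5@2026-06-30";

// "best_effort" is named in the PR; "unknown" is a placeholder for any other prefix.
type DeterminismClass = "best_effort" | "unknown";

interface TokenizerLane {
  provider: string;
  model: string;
  version: string | undefined;
  determinism: DeterminismClass;
}

// Split "<provider>:<model>[@<version>]" and derive the class from the prefix.
function parseTokenizerId(id: string): TokenizerLane {
  const colon = id.indexOf(":");
  if (colon <= 0) {
    throw new Error(`tokenizer id needs a provider prefix: ${id}`);
  }
  const provider = id.slice(0, colon);
  const rest = id.slice(colon + 1);
  const at = rest.lastIndexOf("@");
  const model = at === -1 ? rest : rest.slice(0, at);
  const version = at === -1 ? undefined : rest.slice(at + 1);
  const determinism: DeterminismClass =
    provider === "anthropic" ? "best_effort" : "unknown";
  return { provider, model, version, determinism };
}

const sonnet5Lane = parseTokenizerId(SONNET5_TOKENIZER_ID);
```

Recording the parsed lane alongside a report (as `VarianceReport.packTokenizerId` does with the raw ID) keeps the attribution independent of any runtime encoder.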

Move 2 — context-BOM to CycloneDX 1.7 + per-file provenance citation (packages/pack)
Bumps the context-BOM specVersion 1.6→1.7 and $schema to bom-1.7.schema.json (1.7 is backward-compatible: additive optional fields only). Each file component gains externalReferences:[{type:"vcs",url}] (when an origin URL is present) plus an opencodehub:commit property, binding each indexed file to its (repoOriginUrl, commit, path) triple. Deterministic (sorted, canonical-JSON hashed, omitted-not-null). Procurement-shaped for AIBOM / EU CRA reviews.
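
As a rough illustration of the per-file citation, the sketch below builds a CycloneDX-style file component with the `vcs` external reference and the `opencodehub:commit` property, omitting each field when its input is absent. The function and interface names are hypothetical, and the real component may carry more fields than shown.

```typescript
interface CdxProperty {
  name: string;
  value: string;
}

interface CdxExternalReference {
  type: "vcs";
  url: string;
}

interface CdxFileComponent {
  type: "file";
  name: string;
  externalReferences?: CdxExternalReference[];
  properties?: CdxProperty[];
}

// Hypothetical helper: one component per indexed file path.
function fileComponent(
  path: string,
  repoOriginUrl?: string,
  commit?: string,
): CdxFileComponent {
  const component: CdxFileComponent = { type: "file", name: path };
  // Omitted-not-null: the key is only set when a value exists.
  if (repoOriginUrl) {
    component.externalReferences = [{ type: "vcs", url: repoOriginUrl }];
  }
  if (commit) {
    component.properties = [{ name: "opencodehub:commit", value: commit }];
  }
  return component;
}

// Sort by path so the component list does not depend on input order.
function fileComponents(
  paths: string[],
  repoOriginUrl?: string,
  commit?: string,
): CdxFileComponent[] {
  return [...paths].sort().map((p) => fileComponent(p, repoOriginUrl, commit));
}
```

Because absent fields are dropped rather than serialized as `null`, a pack without an origin URL serializes to the same bytes regardless of how the caller passes the missing value.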

Move 3 — pack --prove emits an in-toto context-attestation predicate (packages/pack, packages/cli)
New --prove flag writes attestation.intoto.json: an in-toto Statement v1 whose subject digest is {sha256: packHash} and whose minted predicateType (https://opencodehub.dev/attestation/context/v0.1) records what the agent read (packHash, contextBomHash, commit, origin, tokenizer, budget, determinism class, sorted BOM items). Canonical + clockless, so the attestation is itself re-derivable. Composes beneath the CI SLSA build provenance; unsigned by design (cosign stays a CI concern).
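
A sketch of the Statement shape, assuming the predicate field names listed in the Move 3 commit message. The `_type`, `subject`, `predicateType`, and `predicate` keys follow the in-toto Statement v1 layout; the subject `name` value, the `items` key, and `buildStatement` itself are assumptions made for illustration.

```typescript
const CONTEXT_PREDICATE_TYPE =
  "https://opencodehub.dev/attestation/context/v0.1";

interface ContextPredicate {
  packHash: string;
  contextBomHash: string;
  commit: string;
  repoOriginUrl: string;
  tokenizerId: string;
  budgetTokens: number;
  determinismClass: string;
  items: { path: string }[]; // hypothetical key for the BOM item list
}

interface ContextStatement {
  _type: "https://in-toto.io/Statement/v1";
  subject: { name: string; digest: { sha256: string } }[];
  predicateType: string;
  predicate: ContextPredicate;
}

// Pure function of its input: no clock, uuid, or run id is consulted.
function buildStatement(p: ContextPredicate): ContextStatement {
  const sortedItems = [...p.items].sort((a, b) =>
    a.path < b.path ? -1 : a.path > b.path ? 1 : 0,
  );
  return {
    _type: "https://in-toto.io/Statement/v1",
    // "pack" is a placeholder subject name, not taken from the PR.
    subject: [{ name: "pack", digest: { sha256: p.packHash } }],
    predicateType: CONTEXT_PREDICATE_TYPE,
    predicate: { ...p, items: sortedItems },
  };
}
```

Since the output depends only on the manifest-derived input, two runs over the same pack produce the same Statement, which is what makes the attestation re-derivable.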

Move 4 — --cache-channel gates cachePoint markers to opt-in channels (packages/pack, packages/cli)
Auto prompt-caching is now default on Anthropic-direct / Claude-on-AWS / MS Foundry but opt-in on classic Bedrock/Vertex. New --cache-channel <channel> inserts a deterministic cache-breakpoint sentinel at the stable prefix boundary only on the opt-in channels. The default value (auto) and the channels that cache automatically emit no marker, so pack output stays byte-identical to today. cacheChannel is kept OUT of the packHash preimage, so packHash is undisturbed.
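
The gating can be pictured roughly as below. The channel identifiers, the sentinel text, and `renderPack` are all hypothetical stand-ins; only the behavior (marker on opt-in channels, nothing otherwise) is taken from the description above.

```typescript
// Hypothetical channel names; the real flag values may differ.
type CacheChannel = "auto" | "anthropic-direct" | "bedrock" | "vertex";

// Channels where prompt caching must be requested explicitly.
const OPT_IN_CHANNELS: ReadonlySet<CacheChannel> = new Set<CacheChannel>([
  "bedrock",
  "vertex",
]);

// Placeholder sentinel; a fixed string keeps the output deterministic.
const CACHE_SENTINEL = "<!-- cachePoint -->\n";

function renderPack(
  stablePrefix: string,
  volatileSuffix: string,
  channel: CacheChannel = "auto",
): string {
  // Only opt-in channels get a marker at the stable prefix boundary.
  const marker = OPT_IN_CHANNELS.has(channel) ? CACHE_SENTINEL : "";
  return stablePrefix + marker + volatileSuffix;
}
```

Keeping the channel as a render-time argument, rather than a manifest field, is one way to leave the packHash preimage untouched.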

Verification (local, on the merged branch)

  • mise run build — pass
  • mise run typecheck — pass
  • pnpm --filter @opencodehub/pack test — 150 pass / 0 fail
  • pnpm --filter @opencodehub/eval test — 73 pass / 0 fail
  • pnpm --filter @opencodehub/cli test — 369 pass / 0 fail (11 platform-lane skips)
  • pnpm exec biome ci . — clean
  • scripts/check-banned-strings.sh — pass

🤖 Generated with Claude Code

theagenticguy and others added 4 commits July 1, 2026 22:53
…k-tokenizer (Move 1)

Sonnet 5 (2026-06-30) ships a tokenizer that inflates the same bytes
~30-35% vs prior Claude tokenizers, so a budget authored under
openai:o200k_base under-provisions when the consuming agent is Sonnet 5
(the pack's budgetTokens->chunkSize map is 1:1).

Add SONNET5_TOKENIZER_ID = "anthropic:claude-sonnet-5@2026-06-30" (the
anthropic: prefix inherits best_effort determinism). Wire an optional
--pack-tokenizer flag through runVarianceProbe to the with-pack assemble
call, and record the lane on VarianceReport.packTokenizerId so Finding
0001 v2 attributes results to a tokenizer. Provenance only -- no runtime
encoder, default lane unchanged.
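
To make the under-provisioning concrete, here is a small arithmetic sketch. It assumes the inflation percentage can be applied directly to a token count from the lane the budget was authored in, which is a simplification; the function names are hypothetical.

```typescript
// Tokens the same bytes would cost after inflation (percent as an integer,
// e.g. 30 for 30%), rounded up.
function inflatedTokens(tokens: number, inflationPercent: number): number {
  return Math.ceil((tokens * (100 + inflationPercent)) / 100);
}

// Largest authored-lane budget whose inflated cost still fits the target.
function budgetToFit(targetTokens: number, inflationPercent: number): number {
  return Math.floor((targetTokens * 100) / (100 + inflationPercent));
}

const authoredBudget = 10_000;
const costAtThirty = inflatedTokens(authoredBudget, 30); // 13000
const safeBudget = budgetToFit(authoredBudget, 35);
```

Under this assumption, a pack sized to 10,000 tokens would cost about 13,000 at 30% inflation, which is why the tokenizer lane needs to be recorded next to the token results.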

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…on (Move 2)

Bump the context read-receipt from CycloneDX 1.6 to 1.7 and bind each
indexed file to its source of record: a CDX-native externalReferences[vcs]
entry citing repoOriginUrl plus an opencodehub:commit property. This makes
the signed re-derivable receipt procurement-shaped for AIBOM / EU CRA
review — a reviewer can re-derive the exact (repoOriginUrl, commit, path)
triple behind the pack.

1.7 is fully backward compatible with 1.4-1.6 (only adds optional fields),
so the bump plus the new externalReferences cannot break a 1.6-shaped
consumer. The citation is a per-pack constant, canonical-JSON hashed and
sorted, so the receipt stays byte-deterministic; fields are omitted (never
null) when origin/commit are absent. indexTime is never cited.
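
The byte-determinism claim rests on serializing the receipt the same way regardless of key order. A simplified sketch of such a serializer follows; it sorts object keys recursively and is not claimed to match the canonical-JSON scheme the package actually uses.

```typescript
type Json =
  | null
  | boolean
  | number
  | string
  | Json[]
  | { [key: string]: Json };

// Serialize with object keys sorted at every level and no whitespace.
function canonicalJson(value: Json): string {
  if (value === null || typeof value !== "object") {
    return JSON.stringify(value);
  }
  if (Array.isArray(value)) {
    // Array order is meaningful, so it is preserved as given.
    return "[" + value.map(canonicalJson).join(",") + "]";
  }
  const keys = Object.keys(value).sort();
  const members = keys.map(
    (k) => JSON.stringify(k) + ":" + canonicalJson(value[k] as Json),
  );
  return "{" + members.join(",") + "}";
}
```

Hashing the output of a serializer like this, instead of raw `JSON.stringify`, means two receipts with the same content but different key insertion order hash identically.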

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…(Move 3)

Adds a --prove flag to `code-pack` that emits an unsigned in-toto Statement v1
(attestation.intoto.json) whose SUBJECT is the pack's packHash
({ sha256: packHash }) and whose predicate records the context provenance
(packHash, contextBomHash, commit, repoOriginUrl, tokenizerId, budgetTokens,
determinismClass, and the path-sorted BOM item list).

The Statement is a pure function of the manifest (no clock/uuid/run-id), so it
is byte-deterministic and re-derivable. Minted predicateType
https://opencodehub.dev/attestation/context/v0.1. Composable beneath the SLSA
build provenance CI already attests: same envelope shape, distinct
predicateType, subject keyed to the packHash. Signing stays a CI concern.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ls (Move 4)

Auto prompt-caching is now default+free on Anthropic-direct, Claude-on-AWS,
and MS Foundry, but stays opt-in on classic Bedrock/Vertex. --cache-channel
emits a deterministic cache-breakpoint sentinel at the stable prefix boundary
only on the opt-in channels; auto (default) + automatic channels emit no
marker, so the pack stays byte-identical to pre-Move-4. cacheChannel is kept
out of the manifest/packHash preimage, so packHash is undisturbed.
@theagenticguy theagenticguy merged commit 981563e into main Jul 1, 2026
38 checks passed
@theagenticguy theagenticguy deleted the release/moves-1-4 branch July 1, 2026 23:08
@github-actions github-actions Bot mentioned this pull request Jul 1, 2026
theagenticguy pushed a commit that referenced this pull request Jul 1, 2026
🤖 Automated release via release-please
---


<details><summary>root: 0.10.7</summary>

## [0.10.7](root-v0.10.6...root-v0.10.7) (2026-07-01)


### Features

* **pack:** roadmap Moves 1-4 — Sonnet-5 lane, CycloneDX 1.7, pack --prove, --cache-channel ([#278](#278)) ([981563e](981563e))
</details>

<details><summary>cli: 0.10.7</summary>

## [0.10.7](cli-v0.10.6...cli-v0.10.7) (2026-07-01)


### Features

* **pack:** roadmap Moves 1-4 — Sonnet-5 lane, CycloneDX 1.7, pack --prove, --cache-channel ([#278](#278)) ([981563e](981563e))
</details>

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>