feat(pack): roadmap Moves 1-4 — Sonnet-5 lane, CycloneDX 1.7, pack --prove, --cache-channel#278
Merged
…k-tokenizer (Move 1)

Sonnet 5 (2026-06-30) ships a tokenizer that inflates the same bytes ~30-35% vs prior Claude tokenizers, so a budget authored under openai:o200k_base under-provisions when the consuming agent is Sonnet 5 (the pack's budgetTokens->chunkSize map is 1:1).

Add SONNET5_TOKENIZER_ID = "anthropic:claude-sonnet-5@2026-06-30" (the anthropic: prefix inherits best_effort determinism). Wire an optional --pack-tokenizer flag through runVarianceProbe to the with-pack assemble call, and record the lane on VarianceReport.packTokenizerId so Finding 0001 v2 attributes results to a tokenizer.

Provenance only: no runtime encoder, default lane unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
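The under-provisioning argument is simple arithmetic, and a small sketch may make it concrete. The helpers `projectedTokens` and `safeBaselineBudget` are hypothetical names introduced for illustration and are not part of the PR. The 30 and 35 percent figures are the approximate range quoted in the commit message, not measured constants.

```typescript
// Tokenizer lane ID quoted from the commit message.
const SONNET5_TOKENIZER_ID = "anthropic:claude-sonnet-5@2026-06-30";

// Hypothetical helper: projects a token count measured under the baseline
// tokenizer onto a lane that inflates the same bytes by `inflationPct` percent.
// Integer percentages keep the arithmetic exact for whole-number inputs.
function projectedTokens(baselineTokens: number, inflationPct: number): number {
  return Math.ceil((baselineTokens * (100 + inflationPct)) / 100);
}

// Hypothetical helper: the largest baseline-tokenizer budget whose projection
// still fits inside the consuming model's token limit.
function safeBaselineBudget(consumerLimit: number, inflationPct: number): number {
  return Math.floor((consumerLimit * 100) / (100 + inflationPct));
}

// A budget authored under the baseline tokenizer (illustrative number).
const authoredBudget = 100_000;

// Assumed inflation range (~30-35%) taken from the commit message.
const projection = {
  packTokenizerId: SONNET5_TOKENIZER_ID,
  low: projectedTokens(authoredBudget, 30),
  high: projectedTokens(authoredBudget, 35),
  safeBudgetAtHigh: safeBaselineBudget(authoredBudget, 35),
};
```

Under these assumptions, a pack filled to 100,000 baseline tokens would occupy roughly 130,000 to 135,000 tokens on the inflated lane, which is why attributing results to a tokenizer lane matters even though no runtime encoder ships.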
…on (Move 2)

Bump the context read-receipt from CycloneDX 1.6 to 1.7 and bind each indexed file to its source of record: a CDX-native externalReferences[vcs] entry citing repoOriginUrl plus an opencodehub:commit property. This makes the signed re-derivable receipt procurement-shaped for AIBOM / EU CRA review: a reviewer can re-derive the exact (repoOriginUrl, commit, path) triple behind the pack.

1.7 is fully backward compatible with 1.4-1.6 (only adds optional fields), so the bump plus the new externalReferences cannot break a 1.6-shaped consumer. The citation is a per-pack constant, canonical-JSON hashed and sorted, so the receipt stays byte-deterministic; fields are omitted (never null) when origin/commit are absent. indexTime is never cited.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
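A minimal sketch of the omitted-not-null rule for the per-file citation follows. The function name `fileComponent`, the `type: "file"` choice, and the interface shapes are assumptions for illustration; only the `externalReferences` entry of type `vcs` and the `opencodehub:commit` property name come from the commit message.

```typescript
// Subset of CycloneDX component fields used here (illustrative, not the full schema).
interface ExternalReference {
  type: "vcs";
  url: string;
}

interface Property {
  name: string;
  value: string;
}

interface FileComponent {
  type: "file"; // assumed component type for an indexed file
  name: string; // repository-relative path
  externalReferences?: ExternalReference[];
  properties?: Property[];
}

// Hypothetical builder: attaches the citation only when the data exists.
// Absent origin or commit means the key is left out entirely, never set to null,
// so two packs with the same inputs serialize to the same bytes.
function fileComponent(path: string, repoOriginUrl?: string, commit?: string): FileComponent {
  const component: FileComponent = { type: "file", name: path };
  if (repoOriginUrl) {
    component.externalReferences = [{ type: "vcs", url: repoOriginUrl }];
  }
  if (commit) {
    component.properties = [{ name: "opencodehub:commit", value: commit }];
  }
  return component;
}

// Example inputs are placeholders, not values from the PR.
const cited = fileComponent("src/index.ts", "https://example.com/org/repo.git", "0123abc");
const uncited = fileComponent("src/index.ts");
```

Here `uncited` serializes without `externalReferences` or `properties` keys at all, which is the behavior a 1.6-shaped consumer would see for a pack built outside a git checkout.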
…(Move 3)
Adds a --prove flag to `code-pack` that emits an unsigned in-toto Statement v1
(attestation.intoto.json) whose SUBJECT is the pack's packHash
({ sha256: packHash }) and whose predicate records the context provenance
(packHash, contextBomHash, commit, repoOriginUrl, tokenizerId, budgetTokens,
determinismClass, and the path-sorted BOM item list).
The Statement is a pure function of the manifest (no clock/uuid/run-id), so it
is byte-deterministic and re-derivable. Minted predicateType
https://opencodehub.dev/attestation/context/v0.1. Composable beneath the SLSA
build provenance CI already attests: same envelope shape, distinct
predicateType, subject keyed to the packHash. Signing stays a CI concern.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
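To illustrate why a Statement that is a pure function of the manifest is byte-deterministic, here is a minimal sketch. The `PackManifest` and `BomItem` shapes, the function names, and the simple sorted-key serializer are assumptions for illustration; the PR's actual canonicalization and predicate layout may differ. The `_type` URI is the standard in-toto Statement v1 identifier, and the `predicateType` is the one minted in the commit message.

```typescript
// Hypothetical manifest shape; field names follow the commit message.
interface BomItem {
  path: string;
  sha256: string;
}

interface PackManifest {
  packHash: string;
  contextBomHash: string;
  commit?: string;
  repoOriginUrl?: string;
  tokenizerId: string;
  budgetTokens: number;
  determinismClass: string;
  items: BomItem[];
}

// Simple canonical serializer (an assumption, not the PR's implementation):
// object keys are sorted and undefined values are dropped.
function canonicalJson(value: unknown): string {
  if (Array.isArray(value)) {
    return "[" + value.map(canonicalJson).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const keys = Object.keys(obj).filter((k) => obj[k] !== undefined).sort();
    return "{" + keys.map((k) => JSON.stringify(k) + ":" + canonicalJson(obj[k])).join(",") + "}";
  }
  return JSON.stringify(value);
}

// Builds the unsigned Statement. No clock, uuid, or run-id is read,
// so equal manifests always yield equal bytes.
function contextStatement(manifest: PackManifest): string {
  const items = manifest.items
    .slice()
    .sort((a, b) => (a.path < b.path ? -1 : a.path > b.path ? 1 : 0));
  return canonicalJson({
    _type: "https://in-toto.io/Statement/v1",
    subject: [{ digest: { sha256: manifest.packHash } }],
    predicateType: "https://opencodehub.dev/attestation/context/v0.1",
    predicate: { ...manifest, items },
  });
}
```

Because the item list is sorted by path and keys are emitted in sorted order, two runs over the same manifest (even with items supplied in a different order) should produce identical bytes, which is what lets a CI step sign the file later without re-deriving it.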
…ls (Move 4)

Auto prompt-caching is now default+free on Anthropic-direct, Claude-on-AWS, and MS Foundry, but stays opt-in on classic Bedrock/Vertex.

--cache-channel emits a deterministic cache-breakpoint sentinel at the stable prefix boundary only on the opt-in channels; auto (default) + automatic channels emit no marker, so the pack stays byte-identical to pre-Move-4. cacheChannel is kept out of the manifest/packHash preimage, so packHash is undisturbed.
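The gating rule can be sketched in a few lines. The channel names, the sentinel string, and the `renderPack` function are all hypothetical, since the commit message does not spell out the accepted flag values or the marker text. The sketch only shows the shape of the rule: a marker on opt-in channels, nothing otherwise.

```typescript
// Hypothetical channel identifiers; the real flag values are not given in the PR text.
type CacheChannel = "auto" | "anthropic" | "aws" | "foundry" | "bedrock" | "vertex";

// Hypothetical sentinel text marking the stable prefix boundary.
const CACHE_BREAKPOINT_SENTINEL = "<!-- cache-breakpoint -->";

// Only the channels where caching is opt-in need an explicit marker.
function needsMarker(channel: CacheChannel): boolean {
  return channel === "bedrock" || channel === "vertex";
}

// Joins the stable prefix and the volatile remainder, inserting the sentinel
// at the boundary only for opt-in channels. With the default channel the
// output is exactly prefix + remainder, i.e. unchanged from the pre-flag output.
function renderPack(stablePrefix: string, remainder: string, channel: CacheChannel = "auto"): string {
  const marker = needsMarker(channel) ? CACHE_BREAKPOINT_SENTINEL : "";
  return stablePrefix + marker + remainder;
}
```

Note that in this sketch the channel is an argument to rendering only and never touches the data that would be hashed, mirroring the commit's statement that cacheChannel stays out of the packHash preimage.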
theagenticguy pushed a commit that referenced this pull request on Jul 1, 2026
🤖 Automated release via release-please

---

<details><summary>root: 0.10.7</summary>

## [0.10.7](root-v0.10.6...root-v0.10.7) (2026-07-01)

### Features

* **pack:** roadmap Moves 1-4 — Sonnet-5 lane, CycloneDX 1.7, pack --prove, --cache-channel ([#278](#278)) ([981563e](981563e))

</details>

<details><summary>cli: 0.10.7</summary>

## [0.10.7](cli-v0.10.6...cli-v0.10.7) (2026-07-01)

### Features

* **pack:** roadmap Moves 1-4 — Sonnet-5 lane, CycloneDX 1.7, pack --prove, --cache-channel ([#278](#278)) ([981563e](981563e))

</details>

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Ships the four OpenCodeHub roadmap moves from the 2026-07-01 M-W-F planning run. Each was developed and gated green on its own branch (`move/1..4`), then cherry-picked onto this integration branch, with the shared-file conflicts (`cli/index.ts`, `code-pack.ts`, `variance-probe.ts`, `pack/index.ts`) resolved by hand and the full gate suite re-run on the merged result.

## Moves

### Move 1: Sonnet-5 tokenizer-provenance lane + variance-probe `--pack-tokenizer` (`packages/pack`, `packages/eval`, `packages/cli`)

Adds `SONNET5_TOKENIZER_ID = anthropic:claude-sonnet-5@2026-06-30` (the `anthropic:` prefix inherits `best_effort` determinism). New `--pack-tokenizer <id>` flag threads a chosen tokenizer lane into the variance-probe's with-pack arm; the resolved lane is recorded on `VarianceReport.packTokenizerId` so Finding 0001 v2 can attribute token results to a tokenizer. Default tokenizer + Claude model unchanged. Grounded in Sonnet 5's ~30-35% tokenizer inflation vs prior Claude models.

### Move 2: context-BOM to CycloneDX 1.7 + per-file provenance citation (`packages/pack`)

Bumps the context-BOM `specVersion` from 1.6 to 1.7 and `$schema` to `bom-1.7.schema.json` (1.7 is backward-compatible: additive optional fields only). Each file component gains `externalReferences: [{type: "vcs", url}]` (when an origin URL is present) plus an `opencodehub:commit` property, binding each indexed file to its `(repoOriginUrl, commit, path)` triple. Deterministic (sorted, canonical-JSON hashed, omitted-not-null). Procurement-shaped for AIBOM / EU CRA reviews.

### Move 3: `pack --prove` emits an in-toto context-attestation predicate (`packages/pack`, `packages/cli`)

New `--prove` flag writes `attestation.intoto.json`: an in-toto Statement v1 whose subject digest is `{sha256: packHash}` and whose minted `predicateType` (`https://opencodehub.dev/attestation/context/v0.1`) records what the agent read (packHash, contextBomHash, commit, origin, tokenizer, budget, determinism class, sorted BOM items). Canonical + clockless, so the attestation is itself re-derivable. Composes beneath the CI SLSA build provenance; unsigned by design (cosign stays a CI concern).

### Move 4: `--cache-channel` gates cachePoint markers to opt-in channels (`packages/pack`, `packages/cli`)

Auto prompt-caching is now default on Anthropic-direct / Claude-on-AWS / MS Foundry but opt-in on classic Bedrock/Vertex. New `--cache-channel <channel>` inserts a deterministic cache-breakpoint sentinel at the stable prefix boundary only on the opt-in channels; `auto` (default) + automatic channels emit no marker, so pack output stays byte-identical to today. `cacheChannel` is kept OUT of the packHash preimage, so packHash is undisturbed.

## Verification (local, on the merged branch)

- `mise run build`: pass
- `mise run typecheck`: pass
- `pnpm --filter @opencodehub/pack test`: 150 pass / 0 fail
- `pnpm --filter @opencodehub/eval test`: 73 pass / 0 fail
- `pnpm --filter @opencodehub/cli test`: 369 pass / 0 fail (11 platform-lane skips)
- `pnpm exec biome ci .`: clean
- `scripts/check-banned-strings.sh`: pass

🤖 Generated with Claude Code