v0.8.9
[0.8.9] - 2026-05-29
Added
- occ code map and occ code pack CLI subcommands (src/code/command.ts): token-budgeted, importance-ranked repository maps. map emits the top symbol signatures per file (Aider-style, cheap structural context); pack emits file content, optionally compressed to the architecturally-significant section. Both rank the graph (see below), then greedily admit the highest-ranked files until the token budget is hit, shrinking content or shedding low-rank symbols (partial admission) instead of dropping a file that nearly fits. Flags: --map-tokens <n> (alias --token-budget, default 4096), --mode map|pack, --map-format markdown|xml|json|plain, --compress (pack mode), --no-bias-exports, --max-symbols <n>, and --tokenizer heuristic|o200k_base|cl100k_base
- buildRepoMap() and the RepoMapResult / RepoMapEntry / RepoMapSymbol types (src/code/map.ts): the programmatic core behind occ code map/pack, with an injectable countTokens so a real tokenizer can enforce an exact budget. Exposed as createOcc().code.map and re-exported from @cesarandreslopez/occ
- rankNodes() (src/code/rank.ts): weighted PageRank over imports / calls / inherits / implements edges, with per-pair call-weight capping and an exported-symbol seed bias, rolling per-symbol scores up into file scores (RankResult / RankedNode / RankedFile). Exposed as createOcc().code.rank and re-exported from @cesarandreslopez/occ
- Pluggable token counting (src/tokens.ts): a Tokenizer interface with HeuristicTokenizer (zero-dependency, language-aware default) and BpeTokenizer (lazily loads gpt-tokenizer, caches encoding tables per encoding, no global state). createTokenizer(name) and resolveTokenizerName(value) resolve heuristic / o200k_base / cl100k_base. New runtime dependency gpt-tokenizer@^3.4.0
- renderRepoMap() (src/code/output.ts): renders a RepoMapResult as markdown, xml, json, or plain
- test/code-map.test.ts and test/tokenizer.test.ts: cover PageRank convergence/normalization/ranking order/export bias, greedy budget enforcement and truncation, repo-map format rendering, and the heuristic/BPE tokenizers (including an exact o200k_base budget case)
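For readers who want a feel for the ranking step, here is a minimal sketch of weighted PageRank with per-pair call-weight capping, an exported-symbol seed bias, and a roll-up from symbol scores to file scores. It is not the code in src/code/rank.ts: the types, the rankSketch() name, the cap of 3, the export bias of 2, the damping factor, and the iteration count are all assumptions chosen for illustration.

```typescript
// Illustrative sketch only: these types, constants, and rankSketch() are hypothetical,
// not the RankResult / RankedNode / RankedFile shapes exported by the library.
type EdgeKind = "imports" | "calls" | "inherits" | "implements";

interface SymbolNode { id: string; file: string; exported: boolean; }
interface Edge { from: string; to: string; kind: EdgeKind; weight: number; }
interface PairWeight { calls: number; other: number; }

const CALL_WEIGHT_CAP = 3; // assumed cap on summed call weight per (from, to) pair
const EXPORT_BIAS = 2;     // assumed seed multiplier for exported symbols

function rankSketch(
  nodes: SymbolNode[],
  edges: Edge[],
  damping = 0.85,
  iterations = 50,
): { symbols: Map<string, number>; files: Map<string, number> } {
  const ids = new Set(nodes.map((n) => n.id));

  // Accumulate weights per (from, to) pair, keeping call weight separate so it can be capped.
  const pairs = new Map<string, Map<string, PairWeight>>();
  for (const e of edges) {
    if (!ids.has(e.from) || !ids.has(e.to)) continue;
    const targets = pairs.get(e.from) ?? new Map<string, PairWeight>();
    const w = targets.get(e.to) ?? { calls: 0, other: 0 };
    if (e.kind === "calls") w.calls += e.weight;
    else w.other += e.weight;
    targets.set(e.to, w);
    pairs.set(e.from, targets);
  }
  const pairTotal = (w: PairWeight) => Math.min(w.calls, CALL_WEIGHT_CAP) + w.other;

  // Seed (teleport) distribution: exported symbols get a larger share; it sums to 1.
  const seed = new Map<string, number>();
  let seedSum = 0;
  for (const n of nodes) seedSum += n.exported ? EXPORT_BIAS : 1;
  for (const n of nodes) seed.set(n.id, (n.exported ? EXPORT_BIAS : 1) / seedSum);

  let score = new Map<string, number>(seed);
  for (let i = 0; i < iterations; i++) {
    const next = new Map<string, number>();
    for (const n of nodes) next.set(n.id, (1 - damping) * seed.get(n.id)!);
    let dangling = 0; // mass held by nodes with no usable outgoing edges
    for (const n of nodes) {
      const s = score.get(n.id)!;
      const targets = pairs.get(n.id);
      let total = 0;
      targets?.forEach((w) => { total += pairTotal(w); });
      if (!targets || total <= 0) { dangling += s; continue; }
      targets.forEach((w, to) => {
        next.set(to, next.get(to)! + (damping * s * pairTotal(w)) / total);
      });
    }
    // Redistribute dangling mass along the seed so the scores keep summing to 1.
    for (const n of nodes) {
      next.set(n.id, next.get(n.id)! + damping * dangling * seed.get(n.id)!);
    }
    score = next;
  }

  // Roll per-symbol scores up into per-file scores.
  const files = new Map<string, number>();
  for (const n of nodes) files.set(n.file, (files.get(n.file) ?? 0) + score.get(n.id)!);
  return { symbols: score, files };
}
```

Capping the summed call weight per pair keeps one hot call site from dominating a node's outgoing flow, and seeding by export status nudges public API symbols upward even when few in-repo edges point at them. The sketch assumes node ids are unique and edge weights are non-negative.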
Changed
- The default-scan token budgeter (applyTokenBudget in src/cli.ts) is now async and tokenizer-pluggable, and the default scan gained a top-level --tokenizer heuristic|o200k_base|cl100k_base flag so --token-budget truncation can use an exact BPE count instead of the heuristic
- The import-DAG checker (scripts/check-imports.mjs) now permits the Layer 3 code module to depend on the Layer 0 tokens module, keeping the architecture invariant satisfied for budget-accurate repo maps
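To illustrate what "async and tokenizer-pluggable" can look like in practice, the sketch below admits ranked files greedily against a token budget using an injected counter that may be synchronous or asynchronous. It is an assumption-laden illustration, not applyTokenBudget itself: TokenCounter, roughCounter, admitWithinBudget, and the four-characters-per-token estimate are hypothetical, and the real Tokenizer interface in src/tokens.ts may have a different shape.

```typescript
// Illustrative sketch only: TokenCounter, roughCounter, and admitWithinBudget are
// hypothetical names, not the Tokenizer interface or applyTokenBudget from the library.
interface TokenCounter {
  count(text: string): number | Promise<number>;
}

// Crude stand-in for a heuristic counter: assumes about four characters per token.
const roughCounter: TokenCounter = {
  count: (text) => Math.ceil(text.length / 4),
};

interface ScoredFile { path: string; score: number; content: string; }
interface Admitted { path: string; content: string; tokens: number; truncated: boolean; }

async function admitWithinBudget(
  files: ScoredFile[],
  budget: number,
  counter: TokenCounter,
): Promise<Admitted[]> {
  // Highest score first; slice() avoids mutating the caller's array.
  const ranked = files.slice().sort((a, b) => b.score - a.score);
  const admitted: Admitted[] = [];
  let remaining = budget;

  for (const file of ranked) {
    if (remaining <= 0) break;
    const tokens = await counter.count(file.content);
    if (tokens <= remaining) {
      admitted.push({ path: file.path, content: file.content, tokens, truncated: false });
      remaining -= tokens;
      continue;
    }
    // Partial admission: keep the longest prefix of whole lines that still fits, then stop.
    const lines = file.content.split("\n");
    let kept = "";
    let keptTokens = 0;
    for (let n = 1; n <= lines.length; n++) {
      const candidate = lines.slice(0, n).join("\n");
      const t = await counter.count(candidate);
      if (t > remaining) break;
      kept = candidate;
      keptTokens = t;
    }
    if (keptTokens > 0) {
      admitted.push({ path: file.path, content: kept, tokens: keptTokens, truncated: true });
    }
    break;
  }
  return admitted;
}
```

Because the counter is awaited at every call, an exact BPE counter that loads its encoding lazily could be swapped in for roughCounter without changing the admission loop. The line-prefix search assumes the token count does not decrease as lines are appended.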
Migration notes
- Purely additive: existing callers see no behavior change. The new commands, flags, and facade methods (createOcc().code.map, createOcc().code.rank) are opt-in, and the default tokenizer remains the zero-dependency heuristic. Pass --tokenizer o200k_base (or cl100k_base) only when you want an exact BPE-budgeted map
Install
Global install (requires Node.js 18+):
npm i -g @cesarandreslopez/occ
No-install usage:
npx @cesarandreslopez/occ [directories...]