
v0.8.9


github-actions released this 30 May 00:19

[0.8.9] - 2026-05-29

Added

  • occ code map and occ code pack CLI subcommands (src/code/command.ts) — token-budgeted, importance-ranked repository maps. map emits the top symbol signatures per file (Aider-style, cheap structural context); pack emits file content, optionally compressed to the architecturally-significant section. Both rank the graph (see below), then greedily admit the highest-ranked files until the token budget is hit, shrinking content or shedding low-rank symbols (partial admission) instead of dropping a file that nearly fits. Flags: --map-tokens <n> (alias --token-budget, default 4096), --mode map|pack, --map-format markdown|xml|json|plain, --compress (pack mode), --no-bias-exports, --max-symbols <n>, and --tokenizer heuristic|o200k_base|cl100k_base
  • buildRepoMap() and the RepoMapResult / RepoMapEntry / RepoMapSymbol types (src/code/map.ts) — the programmatic core behind occ code map/pack, with an injectable countTokens so a real tokenizer can enforce an exact budget. Exposed as createOcc().code.map and re-exported from @cesarandreslopez/occ
  • rankNodes() (src/code/rank.ts) — weighted PageRank over imports/calls/inherits/implements edges, with per-pair call-weight capping and an exported-symbol seed bias, rolling per-symbol scores up into file scores (RankResult / RankedNode / RankedFile). Exposed as createOcc().code.rank and re-exported from @cesarandreslopez/occ
  • Pluggable token counting (src/tokens.ts) — a Tokenizer interface with HeuristicTokenizer (zero-dependency, language-aware default) and BpeTokenizer (lazily loads gpt-tokenizer, caches encoding tables per encoding, no global state). createTokenizer(name) and resolveTokenizerName(value) resolve heuristic / o200k_base / cl100k_base. New runtime dependency gpt-tokenizer@^3.4.0
  • renderRepoMap() (src/code/output.ts) — renders a RepoMapResult as markdown, xml, json, or plain
  • test/code-map.test.ts and test/tokenizer.test.ts — cover PageRank convergence/normalization/ranking order/export bias, greedy budget enforcement and truncation, repo-map format rendering, and the heuristic/BPE tokenizers (including an exact o200k_base budget case)
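
The ranking step described for rankNodes() can be pictured with a small sketch. This is not the code in src/code/rank.ts: the edge weights, the CALL_CAP value, the damping factor, and the names pageRank and rollUpToFiles are all assumptions made for illustration. It shows a weighted PageRank in which repeated call edges between one pair are capped, a seed bias tilts the teleport vector toward chosen (for example exported) symbols, and per-symbol scores are summed into per-file scores.

```typescript
// Hypothetical shapes for illustration; occ's real RankResult / RankedNode types may differ.
type EdgeKind = "imports" | "calls" | "inherits" | "implements";
interface Edge { from: string; to: string; kind: EdgeKind }

// Assumed weights and cap; the release notes do not state the real values.
const KIND_WEIGHT: Record<EdgeKind, number> = { imports: 1, calls: 0.5, inherits: 2, implements: 2 };
const CALL_CAP = 1.5; // maximum total "calls" weight between one (from, to) pair

function pageRank(
  nodes: string[],
  edges: Edge[],
  seedBias: Map<string, number> = new Map(),
  damping = 0.85,
  iterations = 50,
): Map<string, number> {
  // Teleport vector: uniform mass plus the seed bias, normalized to sum to 1.
  const teleport = new Map<string, number>();
  let biasTotal = 0;
  for (const id of nodes) biasTotal += 1 + (seedBias.get(id) ?? 0);
  for (const id of nodes) teleport.set(id, (1 + (seedBias.get(id) ?? 0)) / biasTotal);

  // Outgoing weights per node, with call weight capped per (from, to) pair.
  const out = new Map<string, Map<string, number>>();
  const callWeight = new Map<string, number>();
  for (const e of edges) {
    if (!teleport.has(e.from) || !teleport.has(e.to)) continue;
    let w = KIND_WEIGHT[e.kind];
    if (e.kind === "calls") {
      const key = e.from + "\u0000" + e.to;
      const used = callWeight.get(key) ?? 0;
      w = Math.min(w, CALL_CAP - used);
      callWeight.set(key, used + w);
    }
    if (w <= 0) continue;
    const row = out.get(e.from) ?? new Map<string, number>();
    row.set(e.to, (row.get(e.to) ?? 0) + w);
    out.set(e.from, row);
  }

  let rank = new Map<string, number>(teleport);
  for (let i = 0; i < iterations; i++) {
    const next = new Map<string, number>();
    for (const id of nodes) next.set(id, 0);
    let dangling = 0; // mass held by nodes with no outgoing edges
    for (const id of nodes) {
      const r = rank.get(id) ?? 0;
      const row = out.get(id);
      if (!row) { dangling += r; continue; }
      let total = 0;
      row.forEach((w) => { total += w; });
      row.forEach((w, to) => next.set(to, (next.get(to) ?? 0) + (r * w) / total));
    }
    // Dangling mass and the (1 - damping) share are redistributed via the teleport vector.
    for (const id of nodes) {
      const t = teleport.get(id) ?? 0;
      next.set(id, (1 - damping) * t + damping * ((next.get(id) ?? 0) + dangling * t));
    }
    rank = next;
  }
  return rank;
}

// Sum per-symbol scores into per-file scores; fileOf maps a symbol id to its file.
function rollUpToFiles(rank: Map<string, number>, fileOf: (id: string) => string): Map<string, number> {
  const files = new Map<string, number>();
  rank.forEach((score, id) => {
    const f = fileOf(id);
    files.set(f, (files.get(f) ?? 0) + score);
  });
  return files;
}
```

Because every iteration redistributes all of the mass, the scores keep summing to 1, which is presumably the kind of normalization property the new tests check. The fixed iteration count is a simplification; a real implementation would more likely stop on a convergence threshold.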

Changed

  • The default-scan token budgeter (applyTokenBudget in src/cli.ts) is now async and tokenizer-pluggable, and the default scan gained a top-level --tokenizer heuristic|o200k_base|cl100k_base flag so --token-budget truncation can use an exact BPE count instead of the heuristic
  • The import-DAG checker (scripts/check-imports.mjs) now permits the Layer 3 code module to depend on the Layer 0 tokens module, so repo maps can count tokens for an accurate budget without violating the architecture invariant
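
As a rough picture of what an async, tokenizer-pluggable budgeter with partial admission might look like, here is a minimal sketch. It is not applyTokenBudget or buildRepoMap: the Tokenizer shape, the four-characters-per-token stand-in, and the names admitWithinBudget, RankedFile, and Admitted are hypothetical. Files are taken in rank order, and a file that does not fit sheds its lowest-ranked symbols until it does, rather than being dropped outright.

```typescript
// Hypothetical interface; occ's real Tokenizer in src/tokens.ts may have a different shape.
interface Tokenizer { count(text: string): number | Promise<number> }

// Assumed stand-in for a zero-dependency heuristic: roughly four characters per token.
const roughTokenizer: Tokenizer = { count: (text) => Math.ceil(text.length / 4) };

interface RankedFile { path: string; score: number; symbols: string[] } // symbols: highest rank first
interface Admitted { path: string; symbols: string[]; tokens: number; partial: boolean }

// Render one map entry as plain text: the path, then one indented line per symbol.
function render(path: string, symbols: string[]): string {
  return [path + ":"].concat(symbols.map((s) => "  " + s)).join("\n");
}

async function admitWithinBudget(
  files: RankedFile[],
  budget: number,
  tokenizer: Tokenizer,
): Promise<Admitted[]> {
  const ordered = files.slice().sort((a, b) => b.score - a.score);
  const admitted: Admitted[] = [];
  let remaining = budget;
  for (const file of ordered) {
    if (remaining <= 0) break;
    // Try the full symbol list first, then shed the lowest-ranked symbols until the entry fits.
    for (let keep = file.symbols.length; keep > 0; keep--) {
      const symbols = file.symbols.slice(0, keep);
      // Entries are counted one at a time, so the total is a per-entry sum,
      // not a single count over the final joined output.
      const tokens = await tokenizer.count(render(file.path, symbols));
      if (tokens <= remaining) {
        admitted.push({ path: file.path, symbols, tokens, partial: keep < file.symbols.length });
        remaining -= tokens;
        break;
      }
    }
  }
  return admitted;
}
```

Awaiting the count is what lets a lazily loaded BPE tokenizer and a synchronous heuristic share one call site. Swapping roughTokenizer for an exact tokenizer changes only the counts, not the admission logic.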

Migration notes

  • Purely additive: existing callers see no behavior change. The new commands, flags, and facade methods (createOcc().code.map, createOcc().code.rank) are opt-in, and the default tokenizer remains the zero-dependency heuristic — pass --tokenizer o200k_base (or cl100k_base) only when you want an exact BPE-budgeted map

Install

Global install (requires Node.js 18+):

npm i -g @cesarandreslopez/occ

No-install usage:

npx @cesarandreslopez/occ [directories...]