Add token-aware prompt budgeting as an opt-in path by FuJacob · Pull Request #531 · FuJacob/cotabby

FuJacob · 2026-06-02T03:46:35Z

Summary

The base-model prompt is budgeted in characters as a deliberate ~4-chars-per-token approximation (see PromptSectionBudget's own comment). That ratio is far off for code and non-Latin text. This adds a token-aware budgeting path — the swap the comment anticipated — without paying for the runtime tokenizer on the main-actor prompt path.

- TokenCountEstimator (new, pure, tested): a cheap word-aware heuristic (roughly four characters per token within a word, every word at least one token), closer to subword tokenization than a single global ratio and deterministic for tests.
- PromptSectionBudget.allocate(_:totalTokens:estimate:) (new, additive): fills by priority against an estimated-token budget, converting each section's token cap to a character cap via that content's own density so the existing character-based truncate is reused unchanged. The character allocate is untouched.
- BaseCompletionPromptRenderer takes an optional tokenBudget; nil keeps the character path.
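
As a rough illustration of the heuristic described above, here is a minimal sketch of a word-aware estimate. It is not the PR's source: the name `SketchTokenEstimator` is hypothetical, and dropping punctuation from the count (rather than charging a token for it) is an assumption. It splits on whitespace and punctuation, as the review summary later in this page describes, and charges each word `ceil(length / 4)` tokens.

```swift
/// Hypothetical sketch of a word-aware token estimate; not the PR's source.
enum SketchTokenEstimator {
    /// Assumed average characters per token inside one word.
    static let charsPerToken = 4

    static func estimate(_ text: String) -> Int {
        // Split on whitespace and punctuation; split drops empty pieces by default.
        let words = text.split(whereSeparator: { $0.isWhitespace || $0.isPunctuation })
        // ceil(length / 4) per word, so every non-empty word costs at least 1 token.
        return words.reduce(0) { total, word in
            total + (word.count + charsPerToken - 1) / charsPerToken
        }
    }
}

print(SketchTokenEstimator.estimate("hello world"))  // 4: two 5-character words, 2 tokens each
print(SketchTokenEstimator.estimate("foo.bar"))      // 2: split at the dot
```

Because the function is pure and integer-only, the same input always yields the same estimate, which is what makes relational tests (empty is 0, longer text estimates more) practical.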

Validation

xcodebuild ... test ... CODE_SIGNING_ALLOWED=NO CODE_SIGNING_REQUIRED=NO \
  -only-testing:CotabbyTests/TokenCountEstimatorTests \
  -only-testing:CotabbyTests/PromptSectionBudgetTests \
  -only-testing:CotabbyTests/BaseCompletionPromptRendererTests
# ** TEST SUCCEEDED **
#   estimator: empty=0, every word >=1 token, longer text estimates more, scales with word count
#   token allocate: priority fill, drops low priority when tight, respects the token budget
#   renderer: the caret prefix stays un-starved under a tight token budget; char-path tests unchanged

swiftlint --strict   # exit 0 (CI-equivalent)
xcodegen generate    # registered the new source + test file

Linked issues

None. Prompting parity: token-aware (vs flat character) section budgeting.

Risk / rollout notes

- Opt-in, no behavior change. tokenBudget defaults to nil, so the character path is taken and shipped behavior is byte-for-byte unchanged; the existing budget and renderer tests pass untouched.
- This lands the pure, tested estimator and token allocator. Wiring a caller to pass a real token budget is the follow-up: the right budget value and the quality delta over the character approximation need on-device validation, so it stays opt-in until then. The estimator is intentionally approximate and used only for relative budgeting, never a hard token limit.
- project.pbxproj regenerated by XcodeGen for the two new files.
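
The opt-in shape described in these notes can be sketched as follows. The types `SketchBudgetPath` and `SketchPromptRenderer` and the `budgetPath()` method are hypothetical stand-ins, not the real BaseCompletionPromptRenderer API; only the nil-default `tokenBudget` and the `contextBudget` name come from this page.

```swift
/// Hypothetical: which allocator the renderer would route to.
enum SketchBudgetPath: Equatable {
    case characters(total: Int)
    case tokens(total: Int)
}

/// Hypothetical stand-in for the renderer's budgeting switch.
struct SketchPromptRenderer {
    /// Character budget used by the existing path.
    var contextBudget: Int
    /// nil (the default) keeps the character path, so existing callers are unaffected.
    var tokenBudget: Int? = nil

    func budgetPath() -> SketchBudgetPath {
        // Only a caller that explicitly supplies a budget reaches the token path.
        if let budget = tokenBudget {
            return .tokens(total: budget)
        }
        return .characters(total: contextBudget)
    }
}
```

A caller that never mentions `tokenBudget` gets the character path, which is why adding the parameter by itself cannot change shipped behavior.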

Greptile Summary

This PR introduces an opt-in token-aware prompt budgeting path as a drop-in complement to the existing character-based allocator. The tokenBudget parameter defaults to nil so no shipped behaviour changes until a caller is wired in a follow-up.

- TokenCountEstimator (new): a pure word-aware heuristic that splits on both whitespace and punctuation, giving closer approximations for code and punctuation-heavy prose without a real tokenizer on the main-actor path.
- PromptSectionBudget.allocate(_:totalTokens:estimate:) (new): fills sections by priority against an estimated-token budget, converting each section's token cap to chars via that section's own density so the existing truncate helper is reused unchanged; a max(0,…) clamp prevents density-inverted truncated slices from blocking subsequent sections.
- BaseCompletionPromptRenderer: routes to the new token allocator only when tokenBudget is non-nil, leaving the character path byte-for-byte identical.

Confidence Score: 5/5

Safe to merge: the new token path is fully opt-in (nil default), all existing tests pass unchanged, and the two issues raised in the prior review round have been addressed.

The change is additive and isolated behind a nil-default parameter, so no existing behaviour can regress. The new allocator correctly handles priority ordering, the density-inversion clamp, and empty/whitespace content. The estimator now splits on punctuation as well as whitespace, matching what real subword tokenizers do. No caller yet passes a non-nil tokenBudget, so the new path has zero production exposure until explicitly wired.

No files require special attention. The one observation is a test-quality note in PromptSectionBudgetTests.swift around an assertion that holds only for uniform-density data.

Important Files Changed

Filename	Overview
Cotabby/Support/TokenCountEstimator.swift	New pure estimator; splits on both whitespace and punctuation (addressing the previous-thread finding), correctly floors at 1 token per word, returns 0 for empty/whitespace-only input.
Cotabby/Support/PromptSectionBudget.swift	Adds token-aware allocate overload; correctly converts remaining tokens to chars via per-section density, clamps to 0 on over-deduction (addressing prior thread). The max(0,…) clamp means a density-inverted truncated slice can push total token usage over totalTokens, but the relaxation is design-intentional.
Cotabby/Support/BaseCompletionPromptRenderer.swift	Clean opt-in: tokenBudget defaults to nil, preserving existing char-path behavior byte-for-byte; the if-let branch routes to the new token allocate only when a budget is supplied.
CotabbyTests/PromptSectionBudgetTests.swift	Three new token-allocate tests; priority fill and budget-drop are well-covered. The test_tokenAllocate_respectsTokenBudget assertion holds only because the test data is uniform density — non-uniform data can produce used > totalTokens by design.
CotabbyTests/TokenCountEstimatorTests.swift	Good relational test coverage (empty=0, minimum 1 token/word, monotone growth, punctuation boundary splitting); locks behaviour without over-specifying exact counts.
CotabbyTests/BaseCompletionPromptRendererTests.swift	New test verifies the highest-priority caret prefix survives a tight token budget (8 tokens); existing char-path tests left unchanged as claimed.
Cotabby.xcodeproj/project.pbxproj	XcodeGen-regenerated; correctly registers TokenCountEstimator.swift in the main target and TokenCountEstimatorTests.swift in the test target.
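
The density-inversion case mentioned for PromptSectionBudget.swift and its test can be worked through with small numbers. This is an illustrative sketch only: `sketchWordEstimate` is a hypothetical word-aware estimate (`ceil(length / 4)` per word), and taking a prefix stands in for the real truncate helper, whose behavior is not shown on this page.

```swift
// Hypothetical word-aware estimate: ceil(word length / 4) per word.
func sketchWordEstimate(_ text: String) -> Int {
    text.split(whereSeparator: { $0.isWhitespace || $0.isPunctuation })
        .reduce(0) { $0 + ($1.count + 3) / 4 }
}

// Four 1-character words (token-dense) followed by one 24-character word (token-sparse).
let content = "a b c d " + String(repeating: "m", count: 24)
let budgetTokens = 2

let totalEstimate = sketchWordEstimate(content)              // 4 + 6 = 10 tokens for 32 characters
let charCap = budgetTokens * content.count / totalEstimate   // 2 * 32 / 10 = 6 characters
let slice = String(content.prefix(charCap))                  // "a b c " (prefix as a stand-in truncate)
let used = sketchWordEstimate(slice)                         // 3 tokens: denser than the section average

let unclamped = budgetTokens - used                          // -1
let clamped = max(0, unclamped)                              // 0
print(used, unclamped, clamped)                              // 3 -1 0
```

The slice costs 3 estimated tokens against a 2-token budget, so `used > totalTokens` is possible whenever density varies within a section; the clamp keeps the remaining budget at zero instead of letting it go negative.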

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[BaseCompletionPromptRenderer.prompt] --> B{tokenBudget != nil?}
    B -- Yes --> C[PromptSectionBudget.allocate\ntotalTokens: tokenBudget\nestimate: TokenCountEstimator.estimate]
    B -- No --> D[PromptSectionBudget.allocate\ntotalChars: contextBudget]
    C --> E[Sort sections by priority descending]
    D --> F[Sort sections by priority descending]
    E --> G[For each section\ncompute charsPerToken density\nconvert remainingTokens to remainingChars\ncap = min maxChars, content.count, remainingChars]
    F --> H[For each section\ncap = min maxChars, content.count, remaining]
    G --> I{cap >= minChars?}
    H --> I
    I -- No --> J[Drop section, continue]
    I -- Yes --> K[truncate + trim]
    K --> L{truncated empty?}
    L -- Yes --> J
    L -- No --> M[Keep section\ndeduct from budget\nclamp to 0]
    M --> N[Return sections in original order]
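
The token branch of the flowchart above can be sketched in Swift roughly as follows. This is a sketch under stated assumptions, not the merged implementation: `SketchSection`, its field names, and `sketchAllocate` are hypothetical, the prefix-plus-trim step stands in for the existing truncate helper, and integer division is used for the density conversion.

```swift
import Foundation

/// Hypothetical section shape; field names follow the flowchart labels.
struct SketchSection {
    let name: String
    let content: String
    let priority: Int
    let minChars: Int
    let maxChars: Int
}

func sketchAllocate(
    _ sections: [SketchSection],
    totalTokens: Int,
    estimate: (String) -> Int
) -> [(name: String, text: String)] {
    var remainingTokens = max(0, totalTokens)
    var kept: [Int: String] = [:]  // original index -> truncated text

    // Visit sections from highest to lowest priority, remembering original positions.
    let order = sections.indices.sorted { sections[$0].priority > sections[$1].priority }
    for index in order {
        let section = sections[index]
        let tokens = estimate(section.content)
        guard tokens > 0, remainingTokens > 0 else { continue }

        // Convert the remaining token budget to characters using this section's own density.
        let remainingChars = remainingTokens * section.content.count / tokens
        let cap = min(section.maxChars, section.content.count, remainingChars)
        guard cap >= section.minChars else { continue }

        // Stand-in for the existing character-based truncate: keep a prefix, then trim.
        let text = String(section.content.prefix(cap))
            .trimmingCharacters(in: .whitespacesAndNewlines)
        guard !text.isEmpty else { continue }

        kept[index] = text
        // Clamp at zero so a token-dense slice cannot drive the budget negative.
        remainingTokens = max(0, remainingTokens - estimate(text))
    }

    // Return the surviving sections in their original order.
    var result: [(name: String, text: String)] = []
    for index in sections.indices {
        if let text = kept[index] {
            result.append((name: sections[index].name, text: text))
        }
    }
    return result
}
```

Passing `estimate` as a closure keeps the allocator independent of any particular estimator, which matches the `allocate(_:totalTokens:estimate:)` signature named in this PR.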

Reviews (2): Last reviewed commit: "Address review feedback on token budgeti..."

The base-model prompt is budgeted in characters as a deliberate ~4-chars-per-token approximation. That ratio is far off for code and non-Latin text, where it can under- or over-fill the real context window. This adds a token-aware path that swaps in an estimated token count, exactly as PromptSectionBudget's own comment anticipated, without paying for the runtime tokenizer on the main-actor prompt path.

- TokenCountEstimator is a pure, cheap, word-aware heuristic (roughly four characters per token within a word, every word at least one token): closer to real subword tokenization than a single global ratio, deterministic for tests.
- PromptSectionBudget gains an additive allocate(_:totalTokens:estimate:) that fills by priority against an estimated-token budget, converting each section's token cap to a character cap via that content's own density so the existing character-based truncate is reused unchanged. The character allocate is untouched.
- BaseCompletionPromptRenderer takes an optional tokenBudget; nil keeps the character path, so shipped behavior is unchanged.

The estimator, the token allocator, and the renderer's token path are all unit-tested (the caret prefix stays un-starved under a tight token budget). Wiring a caller to pass a real token budget is the follow-up: the right budget value and the quality delta need on-device validation, so it stays opt-in until then.

- PromptSectionBudget: clamp remainingTokens at zero. A truncated slice can be token-denser than the section average, so deducting its estimate could drive the remaining budget negative and wrongly drop the next section even when it fits.
- TokenCountEstimator: split on punctuation as well as whitespace, so contractions ("can't") and punctuation-joined identifiers ("foo.bar") aren't undercounted as a single word.

FuJacob force-pushed the feat/token-budgeting branch from cc1990f to f865fdc Compare June 2, 2026 03:52

greptile-apps Bot reviewed Jun 2, 2026

Comment thread Cotabby/Support/PromptSectionBudget.swift

Comment thread Cotabby/Support/TokenCountEstimator.swift

FuJacob merged commit 1913ad0 into main Jun 2, 2026
4 checks passed

FuJacob deleted the feat/token-budgeting branch June 2, 2026 04:44

FuJacob mentioned this pull request Jun 2, 2026

gif #540

Merged

FuJacob merged 2 commits into main from feat/token-budgeting
