Skip to content

F094: Token Counting — Unify via Tokenizer Port #339

@pocky

Description

@pocky

F094: Token Counting — Unify via Tokenizer Port

Scope

In Scope

  • Inject ports.Tokenizer into baseCLIProvider and wire it through all CLI provider constructors
  • Replace estimateTokens() and estimateInputTokens() calls in execute() and executeConversation() with ports.Tokenizer method calls
  • Remove dead helper functions (estimateTokens, estimateInputTokens) from helpers.go
  • Eliminate mutation side-effect on shared turn state in estimateInputTokens()
  • Create and inject ApproximationTokenizer at the interfaces/cli/ layer where providers are instantiated

Out of Scope

  • Parsing real token counts from provider stream output (NDJSON events)
  • Adding cost tracking (CostUSD, total_cost_usd)
  • Changing OpenAICompatibleProvider token handling (uses resp.Usage directly, which is correct)
  • Per-provider approximation ratios
  • Modifying ports.Tokenizer interface signature
  • Modifying ApproximationTokenizer or TiktokenTokenizer implementations

Deferred

Item Rationale Follow-up
StreamTokenizer (parse real tokens from Claude NDJSON result event) Requires stream parsing infrastructure not yet built future
CostUSD field on results Depends on StreamTokenizer to extract total_cost_usd future
Per-provider approximation ratio (NewApproximationTokenizerWithRatio()) Current uniform len/4 is acceptable; real data needed to calibrate future
Evaluate removal of TiktokenTokenizer May still be useful for OpenAI-model providers; needs usage analysis future

User Stories

US1: Unified Token Counting Path (P1 - Must Have)

As a workflow author,
I want all CLI provider token counts to flow through the ports.Tokenizer interface,
So that states.step.TokensUsed reflects a consistent counting method and I can trust token metrics across providers.

Why this priority: Without this, the Tokenizer port is dead code, token counts bypass the domain architecture, and there is no single point to swap in a more accurate implementation later. This is the core value of the feature.

Acceptance Scenarios:

  1. Given a workflow step using the Claude provider, When the step executes via execute(), Then the Tokens field in AgentResult is computed by ports.Tokenizer.CountTokens() instead of the inline len(output)/4 helper.
  2. Given a conversation workflow using the Gemini provider, When executeConversation() runs, Then both assistant output tokens and input turn tokens are computed via ports.Tokenizer methods.
  3. Given any CLI provider (Claude, Gemini, Copilot, Codex, OpenCode), When instantiated, Then it receives a ports.Tokenizer dependency and stores it in baseCLIProvider.tokenizer.
  4. Given the OpenAICompatibleProvider, When it executes, Then it continues using resp.Usage directly and is not affected by this change.

Independent Test: Run a workflow with awf run using any CLI provider; verify states.<step>.TokensUsed is populated and the value matches len(output)/4 (since ApproximationTokenizer is the default). Confirm via unit test that baseCLIProvider.execute() calls the injected tokenizer mock.

US2: Eliminate Input Token Mutation Side-Effect (P2 - Should Have)

As a developer maintaining conversation execution,
I want input token counting to not mutate turns[i].Tokens in the shared state,
So that conversation state remains immutable during token estimation and no downstream code observes unexpected side-effects.

Why this priority: The current estimateInputTokens() mutates turns in place, which is a correctness hazard for any code reading turn state concurrently or after estimation. Fixing this alongside the port wiring is low-cost and prevents a class of subtle bugs.

Acceptance Scenarios:

  1. Given a conversation with 5 turns, When input tokens are counted for the latest assistant response, Then no Turn.Tokens field on any previous turn is modified as a side-effect.
  2. Given a conversation turn with Tokens already set from a prior step, When executeConversation() counts input tokens, Then the pre-existing Tokens value is preserved unchanged.

Independent Test: Unit test that captures turn state before and after executeConversation() and asserts no mutation on prior turns' Tokens fields.

US3: Dead Code Removal (P3 - Nice to Have)

As a project maintainer,
I want the orphaned estimateTokens() and estimateInputTokens() functions removed from helpers.go,
So that there is no dead code creating maintenance confusion or suggesting an alternative counting path exists.

Why this priority: Strictly a cleanup concern. The functions become unreferenced after US1 is complete, so removal is mechanical and low-risk, but not blocking.

Acceptance Scenarios:

  1. Given the codebase after US1 is implemented, When searching for estimateTokens or estimateInputTokens, Then zero references exist outside of git history.
  2. Given the codebase, When running make build and make lint, Then both pass with zero violations related to unused functions.

Independent Test: grep -r "estimateTokens\|estimateInputTokens" internal/ returns no matches. make lint passes cleanly.

Edge Cases

  • What happens when CountTokens receives an empty string? The approximation tokenizer returns 0 — verify no division-by-zero or negative values propagate.
  • What happens when a provider produces no output (empty outputStr)? Tokens should be 0, not an error.
  • How does the system handle a nil tokenizer? Constructors must require a non-nil ports.Tokenizer; return an explicit error rather than panic at call time.
  • What happens when CountTurnsTokens receives an empty slice of turn contents? Should return 0 tokens without error.

Requirements

Functional Requirements

  • FR-001: System MUST inject a ports.Tokenizer instance into baseCLIProvider at construction time.
  • FR-002: baseCLIProvider.execute() MUST use b.tokenizer.CountTokens(outputStr) to compute output tokens instead of the inline len(output)/4 formula.
  • FR-003: baseCLIProvider.executeConversation() MUST use b.tokenizer.CountTokens() for assistant output tokens and b.tokenizer.CountTurnsTokens() (or equivalent repeated CountTokens calls) for input token estimation.
  • FR-004: Input token counting MUST NOT mutate any field on existing Turn structs in the shared conversation state.
  • FR-005: All CLI provider constructors (NewClaudeProvider, NewGeminiProvider, NewCopilotProvider, NewCodexProvider, NewOpenCodeProvider) MUST accept and forward a ports.Tokenizer to baseCLIProvider.
  • FR-006: The interfaces/cli/ layer MUST create a single ApproximationTokenizer instance and pass it to all CLI provider constructors.
  • FR-007: estimateTokens() and estimateInputTokens() MUST be deleted from helpers.go after all call sites are migrated.
  • FR-008: OpenAICompatibleProvider MUST NOT be modified; it continues using resp.Usage directly.

Non-Functional Requirements

  • NFR-001: Token counting overhead must be negligible — ApproximationTokenizer.CountTokens() must complete in under 1μs for a 10KB string (it is a single division).
  • NFR-002: No new external dependencies introduced; ApproximationTokenizer uses only standard library operations.
  • NFR-003: make build, make lint, and make test must pass with zero violations after implementation.

Success Criteria

  • SC-001: Zero call sites reference estimateTokens or estimateInputTokens in the codebase after implementation.
  • SC-002: All CLI providers produce identical TokensUsed values as before (behavioral equivalence since ApproximationTokenizer uses the same len/4 formula).
  • SC-003: Conversation execution does not mutate any pre-existing turn Tokens values as verified by unit tests.
  • SC-004: Swapping ApproximationTokenizer for TiktokenTokenizer at the injection site changes token counts for all CLI providers simultaneously — verified by a unit test with a mock tokenizer.

Key Entities

Entity Description Key Attributes
ports.Tokenizer Domain port interface for token counting CountTokens(text) (int, error), CountTurnsTokens(contents) (int, error), IsEstimate() bool
ApproximationTokenizer Infrastructure implementation using len(text)/4 heuristic IsEstimate() returns true
baseCLIProvider Shared base struct for all CLI agent providers Gains tokenizer ports.Tokenizer field

Assumptions

  • The ports.Tokenizer interface already has CountTokens, CountTurnsTokens, and IsEstimate methods — no interface changes needed.
  • ApproximationTokenizer already implements ports.Tokenizer correctly and returns len(text)/4 rounded to int.
  • The existing estimateTokens() in helpers.go uses exactly len(output)/4, so switching to ApproximationTokenizer is a behavioral no-op.
  • All CLI provider constructors currently accept executor and logger parameters; adding tokenizer is a consistent extension of the same pattern.
  • TokensEstimated or equivalent field exists on result types, or IsEstimate() is informational only for now.

Metadata

  • Status: backlog
  • Version: v0.9.0
  • Priority: medium
  • Estimation: S

Dependencies

  • Blocked by: none
  • Unblocks: StreamTokenizer (future real-token extraction), CostUSD tracking (future)

Clarifications

Section populated during clarify step with resolved ambiguities.

Notes

  • C057 (migration TokensTokensUsed) is already complete; field names are stabilized.
  • CountTurnsTokens may need to accept []string of turn contents rather than turn structs, to avoid coupling the port to conversation domain types — verify interface signature before implementation.
  • The TiktokenTokenizer implementation should be kept for now; it serves OpenAI-model use cases and validates the port abstraction supports multiple backends.
  • ~100 lines modified across 8 files; no new logic, strictly wiring and cleanup.

Metadata

Metadata

Assignees

No one assigned

    Labels

    featureFeature specificationv0.9.0Target version

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions