feat(tokens): unify counting via ports.Tokenizer port #340
Merged
pocky merged 1 commit on May 13, 2026
- `CHANGELOG.md`: Document F094 unified token counting changes
- `CLAUDE.md`: Add nolint:errcheck replication rule; remove stale pitfall
- `docs/development/architecture.md`: Update tokenizer/ description with ports.Tokenizer detail
- `docs/development/creating-agent-provider.md`: Add new provider creation guide (1004 lines)
- `docs/development/project-structure.md`: Update tokenizer/ directory description
- `docs/reference/interpolation.md`: Document TokensInput, TokensOutput, TokensEstimated fields
- `docs/user-guide/agent-steps.md`: Add token tracking table with new fields and provider matrix
- `go.mod`: Remove tiktoken-go and glamour dependencies
- `go.sum`: Remove checksums for removed dependencies
- `internal/application/execution_service.go`: Propagate TokensInput, TokensOutput, TokensEstimated into step state
- `internal/application/interpolation_helpers.go`: Map new token fields into interpolation context
- `internal/domain/workflow/context.go`: Add TokensInput, TokensOutput, TokensEstimated to StepState
- `internal/domain/workflow/reference.go`: Register new token properties in ValidStateProperties and alias map
- `internal/infrastructure/agents/base_cli_provider.go`: Inject ports.Tokenizer; add extractTokenUsage hook; use real tokens when available, fall back to tokenizer estimate
- `internal/infrastructure/agents/base_cli_provider_tokenizer_test.go`: Add 390-line tokenizer integration tests for execute and conversation paths
- `internal/infrastructure/agents/claude_provider.go`: Wire extractTokenUsage hook from claude result event usage field
- `internal/infrastructure/agents/codex_provider.go`: Wire extractTokenUsage hook from turn.completed event usage field
- `internal/infrastructure/agents/copilot_provider.go`: Wire extractTokenUsage hook from assistant.message outputTokens field
- `internal/infrastructure/agents/gemini_provider.go`: Wire extractTokenUsage hook from result event stats field
- `internal/infrastructure/agents/helpers.go`: Remove dead estimateTokens and estimateInputTokens helpers
- `internal/infrastructure/agents/helpers_test.go`: Remove tests for deleted estimation helpers
- `internal/infrastructure/agents/opencode_provider.go`: Wire extractTokenUsage hook from step_finish part.tokens field
- `internal/infrastructure/agents/options.go`: Add SetTokenizer option for baseCLIProvider injection
- `internal/infrastructure/agents/provider_options_test.go`: Add tokenizer injection tests
- `internal/infrastructure/tokenizer/tiktoken_tokenizer.go`: Delete TiktokenTokenizer (tiktoken dep removed)
- `internal/infrastructure/tokenizer/tiktoken_tokenizer_test.go`: Delete tiktoken tokenizer tests
- `pkg/interpolation/reference.go`: Register TokensInput, TokensOutput, TokensEstimated in ValidStateProperties
- `pkg/interpolation/reference_json_field_test.go`: Update tests for new token property names
- `pkg/interpolation/reference_test.go`: Update reference validation tests
- `pkg/interpolation/resolver.go`: Handle TokensEstimated bool type in template resolver

Closes #339
Summary
- Unifies token counting behind the `ports.Tokenizer` interface, replacing scattered `estimateTokens`/`estimateInputTokens` inline helpers
- Gives each provider an `extractTokenUsage` hook that pulls real token counts from its JSON output; a `fallbackTokenizer` (len/4) is used only when the provider does not emit token data, with `TokensEstimated=true` to signal approximation
- Adds `TokensInput`, `TokensOutput`, and `TokensEstimated` fields to step state, making them accessible as interpolation variables in workflow YAML
- Removes the `tiktoken-go` dependency entirely and deletes the `tiktoken_tokenizer` implementation; the `ports.Tokenizer` port is the single injection point for future real tokenizer swaps

Changes
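The behavior described in the Summary (prefer provider-reported counts, otherwise fall back to a len/4 estimate and flag it) can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `Tokenizer` method set, the `TokenUsage` struct, and `resolveUsage` are hypothetical names chosen for the example, since the real `ports.Tokenizer` signature is not shown here.

```go
package main

import "fmt"

// Tokenizer is a hypothetical stand-in for the ports.Tokenizer port.
// The real interface's method set is an assumption of this sketch.
type Tokenizer interface {
	CountTokens(text string) (count int, isEstimate bool)
}

// fallbackTokenizer approximates token counts as len(text)/4,
// matching the heuristic named in the Summary.
type fallbackTokenizer struct{}

func (fallbackTokenizer) CountTokens(text string) (int, bool) {
	return len(text) / 4, true
}

// TokenUsage is a hypothetical result type mirroring the new step-state fields.
type TokenUsage struct {
	Input     int
	Output    int
	Estimated bool
}

// resolveUsage returns the provider-reported usage when one exists and
// otherwise estimates both sides with the injected tokenizer.
func resolveUsage(reported *TokenUsage, tok Tokenizer, prompt, output string) TokenUsage {
	if reported != nil {
		return *reported
	}
	in, inEst := tok.CountTokens(prompt)
	out, outEst := tok.CountTokens(output)
	return TokenUsage{Input: in, Output: out, Estimated: inEst || outEst}
}

func main() {
	tok := fallbackTokenizer{}

	// Provider emitted real counts: they are passed through untouched.
	fmt.Printf("%+v\n", resolveUsage(&TokenUsage{Input: 120, Output: 45}, tok, "prompt", "reply"))

	// No provider data: 16 and 8 bytes become 4 and 2 estimated tokens.
	fmt.Printf("%+v\n", resolveUsage(nil, tok, "0123456789abcdef", "01234567"))
}
```

Keeping the estimate flag on the tokenizer's return value means a future exact tokenizer could be injected without touching the call site.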
Domain
- `internal/domain/workflow/context.go`: Add `TokensInput`, `TokensOutput`, `TokensEstimated` fields to `StepState`
- `internal/domain/workflow/reference.go`: Register new token fields in `ValidStateProperties` and `lowercaseToUppercase` alias map

Application
- `internal/application/execution_service.go`: Propagate `TokensInput`, `TokensOutput`, `TokensEstimated` from conversation and single-turn results into step state
- `internal/application/interpolation_helpers.go`: Include new token fields when building interpolation context from step state

Infrastructure: base provider
- `internal/infrastructure/agents/base_cli_provider.go`: Add `tokenizer ports.Tokenizer` field to `baseCLIProvider`; default to `fallbackTokenizer`; add `extractTokenUsage` hook to `cliProviderHooks`; replace `estimateTokens`/`estimateInputTokens` calls with tokenizer calls in both `execute` and `executeConversation`
- `internal/infrastructure/agents/base_cli_provider_tokenizer_test.go`: New 390-line test suite covering tokenizer injection, `IsEstimate` propagation, `CountTurnsTokens` slicing, no-mutation guarantee on prior turns, and error-path guard

Infrastructure: per-provider hooks
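As a rough illustration of what a per-provider `extractTokenUsage` hook does, the sketch below parses one JSON event line and reads counts from a `usage` object on a `result` event. The function names, the `intFromMap` signature, and the `input_tokens`/`output_tokens` key names are assumptions made for this example; the actual hooks and event schemas live in the provider files listed in this section.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// intFromMap reads a numeric field from decoded JSON and returns 0 when it is
// absent. encoding/json decodes numbers into float64 for map[string]any.
// The signature is hypothetical; the PR only names the helper.
func intFromMap(m map[string]any, key string) int {
	if f, ok := m[key].(float64); ok {
		return int(f)
	}
	return 0
}

// extractUsage inspects a single JSON event line. ok is false when the line is
// not valid JSON, is not a "result" event, or carries no usage object, which
// is the case where a caller would fall back to an estimate.
func extractUsage(line []byte) (input, output int, ok bool) {
	var ev map[string]any
	if err := json.Unmarshal(line, &ev); err != nil {
		return 0, 0, false
	}
	if ev["type"] != "result" {
		return 0, 0, false
	}
	usage, isMap := ev["usage"].(map[string]any)
	if !isMap {
		return 0, 0, false
	}
	return intFromMap(usage, "input_tokens"), intFromMap(usage, "output_tokens"), true
}

func main() {
	// Assumed event shape, for illustration only.
	line := []byte(`{"type":"result","usage":{"input_tokens":120,"output_tokens":45}}`)
	in, out, ok := extractUsage(line)
	fmt.Println(in, out, ok) // prints: 120 45 true

	_, _, ok = extractUsage([]byte(`{"type":"message","text":"hi"}`))
	fmt.Println(ok) // prints: false
}
```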
- `internal/infrastructure/agents/claude_provider.go`: Add `tokenizer` field; wire `extractClaudeTokenUsage` hook (parses `result` event `usage`, including cache tokens and `total_cost_usd`)
- `internal/infrastructure/agents/gemini_provider.go`: Add `tokenizer` field; wire `extractGeminiTokenUsage` hook (parses `result` event `stats`)
- `internal/infrastructure/agents/codex_provider.go`: Add `tokenizer` field; wire `extractCodexTokenUsage` hook (parses `turn.completed` event `usage`)
- `internal/infrastructure/agents/copilot_provider.go`: Add `tokenizer` field; wire `extractCopilotTokenUsage` hook (parses `assistant.message` event `outputTokens`)
- `internal/infrastructure/agents/opencode_provider.go`: Add `tokenizer` field; wire `extractOpenCodeTokenUsage` hook (parses `step_finish` event `part.tokens`)
- `internal/infrastructure/agents/helpers.go`: Remove dead `estimateTokens` and `estimateInputTokens` helpers; add `intFromMap` utility used by extraction hooks
- `internal/infrastructure/agents/helpers_test.go`: Remove tests for deleted helpers
- `internal/infrastructure/agents/options.go`: Add tokenizer-related provider option constants
- `internal/infrastructure/agents/provider_options_test.go`: Expand option tests to cover tokenizer injection

Removed
- `internal/infrastructure/tokenizer/tiktoken_tokenizer.go`: Deleted; tiktoken implementation removed, `ports.Tokenizer` is now the extension point
- `internal/infrastructure/tokenizer/tiktoken_tokenizer_test.go`: Deleted; accompanying tests

Interpolation
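To show how the new fields might be consumed downstream, here is a small sketch that resolves `{{.states.step.TokensInput}}`-style references with Go's standard `text/template`. That the project's resolver behaves like `text/template` is an assumption, and `stepState` and `render` are hypothetical names; the point is only that the two integer fields and the boolean `TokensEstimated` all render as plain text.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// stepState is a hypothetical subset of the step state exposed to templates.
type stepState struct {
	TokensInput     int
	TokensOutput    int
	TokensEstimated bool
}

// render executes a template against the given data and returns the result.
func render(text string, data any) (string, error) {
	tmpl, err := template.New("interp").Parse(text)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, data); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	// "step" is the name of the upstream step, as in the test plan.
	data := map[string]any{
		"states": map[string]any{
			"step": stepState{TokensInput: 120, TokensOutput: 45, TokensEstimated: false},
		},
	}
	out, err := render(
		"in={{.states.step.TokensInput}} out={{.states.step.TokensOutput}} estimated={{.states.step.TokensEstimated}}",
		data,
	)
	if err != nil {
		fmt.Println("render error:", err)
		return
	}
	fmt.Println(out) // prints: in=120 out=45 estimated=false
}
```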
- `pkg/interpolation/reference.go`: Register `TokensInput`, `TokensOutput`, `TokensEstimated` in `ValidStateProperties`
- `pkg/interpolation/resolver.go`: Map new token fields into `StepStateData` during resolution
- `pkg/interpolation/reference_json_field_test.go`: Update fixture expectations for new fields
- `pkg/interpolation/reference_test.go`: Update reference validation tests

Dependencies
- `go.mod`: Remove `tiktoken-go`, `glamour`, and several transitive dependencies pulled in by tiktoken
- `go.sum`: Remove corresponding checksums

Docs
- `docs/development/creating-agent-provider.md`: New comprehensive guide for implementing a new agent provider (hooks, token extraction, testing)
- `docs/reference/interpolation.md`: Document `TokensInput`, `TokensOutput`, `TokensEstimated` variables; add per-provider source table
- `docs/user-guide/agent-steps.md`: Update token tracking section with new fields, `TokensEstimated` semantics, and per-provider source table
- `docs/development/architecture.md`: Update tokenizer package description
- `docs/development/project-structure.md`: Update tokenizer package annotation

Project config
- `CHANGELOG.md`: Add F094 entry under Unreleased
- `CLAUDE.md`: Add `nolint:errcheck` replication rule; remove stale pitfall entry

Test plan
- `make build`: binary compiles with tiktoken dependency removed
- `make test`: all unit and integration tests pass, including new tokenizer tests in `base_cli_provider_tokenizer_test.go`
- `make lint`: zero violations; `nolint:errcheck` directives present with matching comments across all providers
- `{{.states.step.TokensInput}}`, `{{.states.step.TokensOutput}}`, and `{{.states.step.TokensEstimated}}` interpolate correctly in a downstream step

Closes #339
Generated with awf commit workflow