feat: replace placeholder token estimates with real counts from agentic-kit#1210
Merged
Conversation
…ic-kit v1.2.1 - Bump @agentic-kit/ollama from ^1.0.3 to ^1.2.1 - Add LlmUsage and ChatResult types to graphile-llm - Update ChatFunction to return ChatResult (content + usage) instead of string - Rewrite chat.ts to use OllamaAdapter.stream() for real token counts - Update metering.ts to extract real usage from ChatResult - Rewrite llm-api.ts streaming + non-streaming paths to use OllamaAdapter - Update rag-plugin to use chatResult.content and chatResult.usage.totalTokens - Remove all placeholderAmountTokens usage - Update test mocks to match new OllamaAdapter interface Embeddings keep ~4 chars/token estimation (Ollama embedding API has no token counts).
Contributor
🤖 Devin AI EngineerI'll be helping with this pull request! Here's what you should know: ✅ I will automatically:
Note: I can only respond to comments from users who have write access to this repository. ⚙️ Control Options:
|
- Bump @agentic-kit/ollama to ^2.0.0 (breaking: generateEmbedding returns EmbeddingResult)
- Add EmbeddingResult type { embedding: number[], promptTokens: number }
- Update EmbedderFunction return type to Promise<EmbeddingResult>
- Replace ~4 chars/token placeholder in metering.ts with real promptTokens
- Update all embedding consumers to destructure { embedding, promptTokens }
- CLI codegen template extracts .embedding internally (keeps Promise<number[]> API)
- Update tests for new return type shape
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Replaces all placeholder token estimates with real provider counts from
@agentic-kit/ollama@2.0.0.Chat tokens (from earlier commits):
OllamaAdapter.stream()returnsUsagewith realprompt_tokens,completion_tokens,reasoning_tokens,cache_read_tokens,cache_write_tokens.ChatFunctionnow returnsChatResult { content, usage }instead ofstring.Embedding tokens (new commit):
generateEmbedding()now returnsEmbeddingResult { embedding, promptTokens }from Ollama's/api/embedendpoint (prompt_eval_count). Removes the ~4 chars/token placeholder estimate inmetering.ts.Changes
@agentic-kit/ollamafrom^1.2.1to^2.0.0(breaking:generateEmbeddingreturnsEmbeddingResult)EmbeddingResulttype tographile-llm/types.ts, updateEmbedderFunctionreturn typemetering.ts: replaceMath.ceil(text.length / 4)with realpromptTokensfrom providermetering.ts: recordprompt_tokensin usage metadata andrawUsage{ embedding, promptTokens }.embeddinginternally (keepsPromise<number[]>API for generated code)EmbeddingResultshapeReview & Testing Checklist for Human
EmbeddingResulttype flows correctly through metering → billing pipeline (checkrecord_usagereceives realpromptTokensnot estimated values)buildEmbedder({ provider: 'ollama' })should return{ embedding: number[], promptTokens: number }withpromptTokens > 0--auto-embedstill works (codegen template returnsnumber[]notEmbeddingResult)metering.tslogsinputTokens: 0(no estimate since embedding wasn't called)Notes
@agentic-kit/ollama@2.0.0(published, PR Feat/plan #11 merged in agentic-kit)as unknown as EmbedderFunctioncast inmetering-plugin.tsis intentional — the metered wrapper returnsnumber[] | null(stripping token count after recording it), whilellmEmbedderonBuildis typed asEmbedderFunction. Downstream plugins (text-search, text-mutation) already cast to(text: string) => Promise<number[] | null>.Link to Devin session: https://app.devin.ai/sessions/2b5a29d83d3f478e8d3d972653b4879c
Requested by: @pyramation