Agents/Usage: estimate Ollama usage for the dashboard #39112

Closed
TylonHH wants to merge 4 commits into openclaw:main from TylonHH:codex/ollama-usage-dashboard

Conversation

@TylonHH TylonHH commented Mar 7, 2026

Summary

  • estimate native Ollama usage when the final /api/chat chunk omits prompt_eval_count / eval_count
  • keep exact Ollama counters when they are present
  • add regression coverage for both the message builder and streamed Ollama responses

Why

The existing usage dashboard already reads assistant message.usage from session transcripts. Native Ollama sometimes omits its eval counters, which leaves local runs at zero usage and makes them disappear from the current dashboard.

This change keeps the existing dashboard path intact and backfills a chars-based estimate only when Ollama does not return token counts.

Scope

  • fixes native Ollama visibility in the current usage dashboard
  • does not add a new JSONL usage sink, new dashboard API, or generic fallback for every OpenAI-compatible local provider

Testing

  • pnpm test -- src/agents/ollama-stream.test.ts

Refs #38726

@openclaw-barnacle openclaw-barnacle Bot added the agents (Agent runtime and tooling) and size: S labels on Mar 7, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ee425487bd

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you:

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".


greptile-apps Bot commented Mar 7, 2026

Greptile Summary

This PR adds a characters-based token-count estimate for native Ollama responses that omit prompt_eval_count / eval_count, keeping local-model runs visible in the existing usage dashboard without changing how exact counters are handled when they are present.

Changes:

  • Adds estimateOllamaPromptTokens and estimateOllamaCompletionTokens helpers that compute a rough chars / 4 estimate from the serialised messages and response text.
  • Adds resolveUsageCount to prefer the real Ollama counter when it is a positive finite number, falling back to the estimate otherwise (all three helpers are sketched after this list).
  • Extends buildAssistantMessage with an optional usageFallback parameter — the public signature is backwards-compatible.
  • Computes the fallback in createOllamaStreamFn and passes it to buildAssistantMessage, correctly reading finalResponse.message.content after it has been populated with the accumulated stream content.
  • Adds regression coverage for both the unit (buildAssistantMessage) and integration (createOllamaStreamFn) paths.
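
Based on the bullets above, a minimal sketch of what those helpers could look like. The function names come from this summary; the message shape, serialisation, and rounding are assumptions for illustration, not the actual patch:

```ts
// Rough chars/4 heuristic, a common approximation for BPE-style tokenisers.
const CHARS_PER_TOKEN = 4;

// Hypothetical message shape, assumed for illustration.
interface OllamaChatMessage {
  role: string;
  content: string;
}

function estimateOllamaPromptTokens(messages: OllamaChatMessage[]): number {
  // Serialise the prompt payload and divide by ~4 characters per token.
  const chars = messages.reduce((sum, m) => sum + JSON.stringify(m).length, 0);
  return Math.ceil(chars / CHARS_PER_TOKEN);
}

function estimateOllamaCompletionTokens(responseText: string): number {
  return Math.ceil(responseText.length / CHARS_PER_TOKEN);
}

// Prefer the exact Ollama counter when it is a positive finite number;
// fall back to the estimate otherwise. This is the guard flagged below.
function resolveUsageCount(value: number | undefined, estimate: number): number {
  return typeof value === "number" && Number.isFinite(value) && value > 0
    ? value
    : estimate;
}
```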

Notable concern:

  • resolveUsageCount guards with value > 0, so an explicit prompt_eval_count: 0 (returned by some Ollama builds for fully-cached prompts) is treated as absent and replaced by the estimate, which would over-report input tokens for cached sessions.

Confidence Score: 4/5

  • Safe to merge — targeted, backwards-compatible change to the Ollama streaming path with no impact on other providers.
  • The implementation is well-scoped and correct for the stated goal. The only behavioural edge case is that an explicit prompt_eval_count: 0 (KV-cache hit) is overridden by the estimate rather than preserved, which could inflate usage numbers for cached prompts. This is a minor accuracy concern, not a correctness bug, and the rest of the logic (accumulation order, fallback wiring, test coverage) is sound.
  • src/agents/ollama-stream.ts — specifically the resolveUsageCount function and its value > 0 guard; a zero-preserving variant is sketched below.
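
A zero-preserving variant of that guard, as a sketch of what this review is asking for (not code from the PR):

```ts
function resolveUsageCount(value: number | undefined, estimate: number): number {
  // Accept any explicit non-negative finite counter, including 0 for
  // fully-cached prompts; only fall back when the counter is truly absent.
  return typeof value === "number" && Number.isFinite(value) && value >= 0
    ? value
    : estimate;
}
```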

Last reviewed commit: ee42548

@TylonHH TylonHH force-pushed the codex/ollama-usage-dashboard branch from 0bf0edb to fcea3e4 on March 8, 2026 at 00:04
@TylonHH TylonHH force-pushed the codex/ollama-usage-dashboard branch from fcea3e4 to 73a8c2c on March 13, 2026 at 15:49

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 74142cb653

@openclaw-barnacle

This pull request has been automatically marked as stale due to inactivity.
Please add updates or it will be closed.

@openclaw-barnacle openclaw-barnacle Bot added the stale (Marked as stale due to inactivity) label on Apr 23, 2026

clawsweeper Bot commented Apr 26, 2026

Codex automated review: keeping this open.

Keep PR #39112 open. Current main still zero-fills omitted native Ollama usage counters, the dashboard/session usage path consumes persisted transcript usage rather than estimating later, and the PR’s underlying fix remains useful even though its patch targets the old src/agents/ollama-stream.* layout and needs porting to the bundled Ollama plugin.

Best possible solution:

Keep this PR open or replace it with an equivalent current-main fix in extensions/ollama/src/stream.ts: estimate native Ollama input/output usage only when counters are absent, preserve exact nonnegative counters including explicit zero, estimate from converted message/image/tool/tool-schema payloads and accumulated streamed output, and add focused regression coverage in the current Ollama plugin tests.

Remaining risks / open questions:

  • The submitted PR should not merge as-is because it targets the removed src/agents/ollama-stream.* layout.
  • The eventual estimator needs to preserve explicit nonnegative Ollama counters, including 0, and include prompt payloads such as images, tool calls, tool results, and tool schemas.

Codex Review notes: model gpt-5.5, reasoning high; reviewed against 406ae72fd278.

@openclaw-barnacle openclaw-barnacle Bot removed the stale (Marked as stale due to inactivity) label on Apr 26, 2026
@steipete
Contributor

Thanks for this. I ported the fix to the current Ollama plugin runtime on main in 5d5c377.

The original patch targeted the removed src/agents/ollama-stream.* layout, but the active code now lives in extensions/ollama/src/stream.ts. The landed version keeps the same behavioral intent (a rough sketch follows this list):

  • preserve exact Ollama prompt_eval_count / eval_count values when present, including explicit 0
  • estimate input/output usage only when native /api/chat omits those counters
  • include converted prompt payloads such as images and tool schemas in the prompt estimate
  • persist the estimated usage on the assistant message so existing session/dashboard usage surfaces can see local Ollama runs
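
A rough sketch of that resolution order, assuming the final /api/chat chunk arrives as a plain object. The prompt_eval_count / eval_count field names follow the native Ollama API; the surrounding function is illustrative, not the merged code:

```ts
interface OllamaFinalChunk {
  prompt_eval_count?: number; // exact input tokens; may be omitted
  eval_count?: number;        // exact output tokens; may be omitted
}

function resolveUsage(
  chunk: OllamaFinalChunk,
  promptEstimate: number,
  completionEstimate: number,
) {
  // Accept any explicit non-negative finite counter, including 0 for
  // fully-cached prompts; estimate only when the counter is truly absent.
  const exact = (v: number | undefined) =>
    typeof v === "number" && Number.isFinite(v) && v >= 0 ? v : undefined;

  const input = exact(chunk.prompt_eval_count) ?? promptEstimate;
  const output = exact(chunk.eval_count) ?? completionEstimate;
  return { input, output, totalTokens: input + output };
}
```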

Verification:

  • pnpm test extensions/ollama/src/stream-runtime.test.ts extensions/ollama/src/stream.test.ts
  • pnpm check:changed
  • a live local Ollama run (llama3.2:latest) through a localhost proxy that stripped prompt_eval_count and eval_count; OpenClaw still produced fallback usage (input=8, output=1, totalTokens=9) and completed the stream with done

Closing this PR since the fix has been carried forward on current main.

@steipete steipete closed this Apr 27, 2026