Agents/Usage: estimate Ollama usage for the dashboard #39112

Closed
TylonHH wants to merge 4 commits into openclaw:main from TylonHH:codex/ollama-usage-dashboard

Conversation

@TylonHH TylonHH commented Mar 7, 2026

Summary

  • estimate native Ollama usage when the final /api/chat chunk omits prompt_eval_count / eval_count
  • keep exact Ollama counters when they are present
  • add regression coverage for both the message builder and streamed Ollama responses

Why

The existing usage dashboard already reads assistant message.usage from session transcripts. Native Ollama sometimes omits its eval counters, which leaves local runs at zero usage and makes them disappear from the current dashboard.

This change keeps the existing dashboard path intact and backfills a chars-based estimate only when Ollama does not return token counts.

Scope

  • fixes native Ollama visibility in the current usage dashboard
  • does not add a new JSONL usage sink, new dashboard API, or generic fallback for every OpenAI-compatible local provider

Testing

  • pnpm test -- src/agents/ollama-stream.test.ts

Refs #38726

@openclaw-barnacle openclaw-barnacle Bot added the agents (Agent runtime and tooling) and size: S labels on Mar 7, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ee425487bd

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you:

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".


greptile-apps Bot commented Mar 7, 2026

Greptile Summary

This PR adds a characters-based token-count estimate for native Ollama responses that omit prompt_eval_count / eval_count, keeping local-model runs visible in the existing usage dashboard without changing how exact counters are handled when they are present.

Changes:

  • Adds estimateOllamaPromptTokens and estimateOllamaCompletionTokens helpers that compute a rough chars / 4 estimate from the serialised messages and response text.
  • Adds resolveUsageCount to prefer the real Ollama counter when it is a positive finite number, falling back to the estimate otherwise (all three helpers are sketched after this list).
  • Extends buildAssistantMessage with an optional usageFallback parameter — the public signature is backwards-compatible.
  • Computes the fallback in createOllamaStreamFn and passes it to buildAssistantMessage, correctly reading finalResponse.message.content after it has been populated with the accumulated stream content.
  • Adds regression coverage for both the unit (buildAssistantMessage) and integration (createOllamaStreamFn) paths.
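
Based on the bullets above, a minimal sketch of what those helpers could look like. The function names come from this summary; the message shape, serialisation, and rounding are assumptions for illustration, not the actual patch:

```ts
// Rough chars/4 heuristic, a common approximation for BPE-style tokenisers.
const CHARS_PER_TOKEN = 4;

// Hypothetical message shape, assumed for illustration.
interface OllamaChatMessage {
  role: string;
  content: string;
}

function estimateOllamaPromptTokens(messages: OllamaChatMessage[]): number {
  // Serialise the prompt payload and divide by ~4 characters per token.
  const chars = messages.reduce((sum, m) => sum + JSON.stringify(m).length, 0);
  return Math.ceil(chars / CHARS_PER_TOKEN);
}

function estimateOllamaCompletionTokens(responseText: string): number {
  return Math.ceil(responseText.length / CHARS_PER_TOKEN);
}

// Prefer the exact Ollama counter when it is a positive finite number;
// fall back to the estimate otherwise. This is the guard flagged below.
function resolveUsageCount(value: number | undefined, estimate: number): number {
  return typeof value === "number" && Number.isFinite(value) && value > 0
    ? value
    : estimate;
}
```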

Notable concern:

  • resolveUsageCount guards with value > 0, so an explicit prompt_eval_count: 0 (returned by some Ollama builds for fully-cached prompts) is treated as absent and replaced by the estimate, which would over-report input tokens for cached sessions.

Confidence Score: 4/5

  • Safe to merge — targeted, backwards-compatible change to the Ollama streaming path with no impact on other providers.
  • The implementation is well-scoped and correct for the stated goal. The only behavioural edge case is that an explicit prompt_eval_count: 0 (KV-cache hit) is overridden by the estimate rather than preserved, which could inflate usage numbers for cached prompts. This is a minor accuracy concern, not a correctness bug, and the rest of the logic (accumulation order, fallback wiring, test coverage) is sound.
  • src/agents/ollama-stream.ts — specifically the resolveUsageCount function and its value > 0 guard; a zero-preserving variant is sketched below.
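
A zero-preserving variant of that guard, as a sketch of what this review is asking for (not code from the PR):

```ts
function resolveUsageCount(value: number | undefined, estimate: number): number {
  // Accept any explicit non-negative finite counter, including 0 for
  // fully-cached prompts; only fall back when the counter is truly absent.
  return typeof value === "number" && Number.isFinite(value) && value >= 0
    ? value
    : estimate;
}
```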

Last reviewed commit: ee42548

@TylonHH TylonHH force-pushed the codex/ollama-usage-dashboard branch from 0bf0edb to fcea3e4 on March 8, 2026 at 00:04
@TylonHH TylonHH force-pushed the codex/ollama-usage-dashboard branch from fcea3e4 to 73a8c2c on March 13, 2026 at 15:49

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 74142cb653

@openclaw-barnacle

This pull request has been automatically marked as stale due to inactivity.
Please add updates or it will be closed.

@openclaw-barnacle openclaw-barnacle Bot added the stale (Marked as stale due to inactivity) label on Apr 23, 2026

clawsweeper Bot commented Apr 26, 2026

Codex automated review: keeping this open.

Keep PR #39112 open. Current main still zero-fills omitted native Ollama usage counters, the dashboard/session usage path consumes persisted transcript usage rather than estimating later, and the PR’s underlying fix remains useful even though its patch targets the old src/agents/ollama-stream.* layout and needs porting to the bundled Ollama plugin.

Best possible solution:

Keep this PR open or replace it with an equivalent current-main fix in extensions/ollama/src/stream.ts: estimate native Ollama input/output usage only when counters are absent, preserve exact nonnegative counters including explicit zero, estimate from converted message/image/tool/tool-schema payloads and accumulated streamed output, and add focused regression coverage in the current Ollama plugin tests.

Remaining risks / open questions:

  • The submitted PR should not merge as-is because it targets the removed src/agents/ollama-stream.* layout.
  • The eventual estimator needs to preserve explicit nonnegative Ollama counters, including 0, and include prompt payloads such as images, tool calls, tool results, and tool schemas.

Codex Review notes: model gpt-5.5, reasoning high; reviewed against 406ae72fd278.

@openclaw-barnacle openclaw-barnacle Bot removed the stale (Marked as stale due to inactivity) label on Apr 26, 2026
@steipete
Contributor

Thanks for this. I ported the fix to the current Ollama plugin runtime on main in 5d5c377.

The original patch targeted the removed src/agents/ollama-stream.* layout, but the active code now lives in extensions/ollama/src/stream.ts. The landed version keeps the same behavioral intent (a rough sketch follows this list):

  • preserve exact Ollama prompt_eval_count / eval_count values when present, including explicit 0
  • estimate input/output usage only when native /api/chat omits those counters
  • include converted prompt payloads such as images and tool schemas in the prompt estimate
  • persist the estimated usage on the assistant message so existing session/dashboard usage surfaces can see local Ollama runs
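
A rough sketch of that resolution order, assuming the final /api/chat chunk arrives as a plain object. The prompt_eval_count / eval_count field names follow the native Ollama API; the surrounding function is illustrative, not the merged code:

```ts
interface OllamaFinalChunk {
  prompt_eval_count?: number; // exact input tokens; may be omitted
  eval_count?: number;        // exact output tokens; may be omitted
}

function resolveUsage(
  chunk: OllamaFinalChunk,
  promptEstimate: number,
  completionEstimate: number,
) {
  // Accept any explicit non-negative finite counter, including 0 for
  // fully-cached prompts; estimate only when the counter is truly absent.
  const exact = (v: number | undefined) =>
    typeof v === "number" && Number.isFinite(v) && v >= 0 ? v : undefined;

  const input = exact(chunk.prompt_eval_count) ?? promptEstimate;
  const output = exact(chunk.eval_count) ?? completionEstimate;
  return { input, output, totalTokens: input + output };
}
```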

Verification:

  • pnpm test extensions/ollama/src/stream-runtime.test.ts extensions/ollama/src/stream.test.ts
  • pnpm check:changed
  • a live local Ollama run (llama3.2:latest) through a localhost proxy that stripped prompt_eval_count and eval_count; OpenClaw still produced fallback usage (input=8, output=1, totalTokens=9) and completed the stream with done

Closing this PR since the fix has been carried forward on current main.

@steipete steipete closed this Apr 27, 2026