Fix multiple cost/tokens related issues#1795
Conversation
Stop accumulating cost on sess.Cost during streaming. Instead, compute it on the fly from per-message costs via two new Session methods: - TotalCost(): recursively walks messages, sub-sessions, and item costs - OwnCost(): same but excludes sub-sessions (for live events where sub-sessions report their own costs separately) This fixes several cost display bugs: - Sub-session costs were never aggregated into the parent session, causing the sidebar and cost dialog to undercount for multi-agent sessions. - Compaction cost was lost entirely (separate runtime/session, never recorded). Now stored on the summary Item.Cost field so TotalCost() discovers it. - Past sessions showed 0% context and missing costs because LoadFromSession did not set ContextLength and did not aggregate sub-session costs. - Sub-session token usage events were persisted under the parent session ID, temporarily corrupting the parent's DB row. - contextPercent() returned "0%" instead of hiding the indicator when context limit is unknown (past sessions). - cloneSessionItem dropped compaction cost when branching. - recalculateSessionTotals manually summed cost from messages, now redundant with TotalCost(). Assisted-By: cagent
Replace TokenUsage and TokenUsageWithMessage (7-8 positional args) with: - NewTokenUsageEvent(sessionID, agentName, usage): single constructor taking a Usage struct directly - SessionUsage(sess, contextLimit): builds Usage from session fields, eliminating the repeated sess.InputTokens, sess.OutputTokens, sess.InputTokens+sess.OutputTokens pattern at every call site Assisted-By: cagent
The method was defined and tested but never called from anywhere in the codebase. MessageUsageHistory (which IS used for remote mode) and per-message Usage set at creation time cover all cost tracking needs. Also extract contextLength local in the compaction threshold check for readability. Assisted-By: cagent
When reopening a past session, the sidebar context usage percentage was missing because ContextLimit (a model property from models.dev) is never persisted in the session and was not emitted on restore. Make EmitStartupInfo session-aware: when the session has existing token data, look up the model's context limit and emit a TokenUsageEvent so the sidebar can compute and display the context percentage. Assisted-By: cagent
Summarize now emits a TokenUsageEvent after Compact returns, which updates the sidebar cost to include the compaction summary generation. The duplicate emission in the automatic compaction path (RunStream) is removed since Summarize now handles it. Token counts and context % still reflect the last model call; they will correct on the next user message when the reduced context is measured. Assisted-By: cagent
The cost dialog was missing compaction costs because gatherCostData() only iterated over messages from GetAllMessages(), which skips summary items. Compaction costs are stored on session items with Summary set and Cost > 0. Add a pass over session.Messages to pick up these compaction entries and display them in both the total and the per-message breakdown. Assisted-By: cagent
contextPercent() only returned a value when sessionUsage had exactly one entry. During transfer_task, a child session added a second entry, hiding the percentage. After returning to the root agent, both entries remained so the percentage stayed hidden. Look up the session belonging to the current agent via the sessionAgent map so the correct context percentage is shown regardless of how many sessions exist. Assisted-By: cagent
Signed-off-by: David Gageot <david.gageot@docker.com>
There was a problem hiding this comment.
Code Review Summary
Found 1 issue that should be addressed:
Medium Severity
- RAG token usage events are unintentionally filtered out and not persisted to the database
Analysis
This PR makes important improvements to cost/token tracking by introducing TotalCost() and OwnCost() methods, fixing session ID filtering during task transfers, and properly handling compaction costs. However, the session ID filtering inadvertently affects RAG events.
Overall, the refactoring looks solid, but the RAG event filtering needs to be addressed to ensure accurate cost tracking.
| // Only persist token usage for the current session. | ||
| // During task transfers, sub-session events flow through but should | ||
| // not overwrite the parent session's token counts. | ||
| if e.Usage != nil && e.SessionID == sess.ID { |
There was a problem hiding this comment.
Medium: RAG token events are filtered out and not persisted
The condition e.SessionID == sess.ID filters out RAG token usage events because RAG events use an empty SessionID (see runtime.go line ~388: NewTokenUsageEvent("", agentName, ...)).
When RAG operations occur during a session:
- RAG events have
SessionID = "" - The current session has
sess.ID = "some-uuid" - The comparison
"" == "some-uuid"is FALSE - Result: RAG token usage is never persisted via
UpdateSessionTokens()
This means RAG costs are displayed in real-time but not saved to the database, causing session cost reports to undercount actual usage.
Recommendation: Either:
- Pass the actual session ID when creating RAG events, or
- Add special handling for empty SessionID:
if e.Usage != nil && (e.SessionID == sess.ID || e.SessionID == "") {or
3. Accumulate RAG costs directly in the session instead of relying on event-based persistence
No description provided.