feat: add real-time token output speed chip to footer #1756

Closed
LIYUE1918 wants to merge 5 commits into Hmbown:main from LIYUE1918:feat/token-output-speed

@LIYUE1918

Adds a new StatusItem::OutputSpeed variant that shows estimated tokens-per-second in the footer during streaming, and persists the average speed after the turn completes.

Changes

  • App: new stream_output_chars, stream_started_at, last_turn_output_speed fields
  • ui.rs: counts chars in MessageDelta / ThinkingDelta, computes avg from actual usage.output_tokens at turn end
  • footer_ui.rs: footer_output_speed_spans() with real-time estimate during streaming, stored avg after turn
  • config.rs / config_ui.rs: new StatusItem::OutputSpeed variant
  • tests.rs: 3 unit tests covering hidden/streaming/persisted states

Behaviour

| State | Footer shows |
| --- | --- |
| Idle | hidden |
| Streaming | 12 tok/s (estimated, chars/4) |
| Turn complete | 12 tok/s avg (from API output_tokens) |
| Next turn starts | hidden until streaming resumes |
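
The two rates described above can be sketched as a pair of small functions. This is a minimal illustration and not the PR's actual code: the function names, the `Option` return type, and the zero-guard behaviour are assumptions, while the 4-characters-per-token heuristic and the use of `output_tokens` at turn end come from the description.

```rust
use std::time::Duration;

/// Heuristic from the PR description: roughly 4 characters per token.
const CHARS_PER_TOKEN: f64 = 4.0;

/// Hypothetical helper: real-time estimate while a turn is streaming.
/// Returns `None` when nothing has streamed yet or no time has elapsed,
/// which a footer could treat as "hide the chip".
pub fn estimated_tok_per_sec(stream_output_chars: u64, elapsed: Duration) -> Option<f64> {
    let secs = elapsed.as_secs_f64();
    if stream_output_chars == 0 || secs <= 0.0 {
        return None;
    }
    Some(stream_output_chars as f64 / CHARS_PER_TOKEN / secs)
}

/// Hypothetical helper: average at turn end, using the exact token count
/// that the API reports once the turn completes.
pub fn average_tok_per_sec(output_tokens: u64, elapsed: Duration) -> Option<f64> {
    let secs = elapsed.as_secs_f64();
    if output_tokens == 0 || secs <= 0.0 {
        return None;
    }
    Some(output_tokens as f64 / secs)
}
```

With these definitions, 96 streamed characters over 2 seconds gives an estimate of 12 tok/s, and 24 reported output tokens over the same 2 seconds gives the same average.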

LIYUE1918 added 4 commits May 18, 2026 12:40
Adds a new StatusItem::OutputSpeed variant that renders estimated
tokens-per-second in the footer during streaming. Uses a 1:4
token:character ratio to approximate real-time speed from stream
deltas, since the API only reports exact token counts at turn end.

- App: tracks stream_output_chars + stream_started_at
- ui.rs: counts chars in MessageDelta and ThinkingDelta handlers
- config.rs / config_ui.rs: new StatusItem variant with all methods
- footer_ui.rs: footer_output_speed_spans() with color gradient
- resets on TurnStarted and turn completion/interruption/failure

- default_footer_includes_output_speed
- footer_output_speed_hidden_when_not_streaming
- footer_output_speed_renders_rate_during_streaming

- App: new last_turn_output_speed field stores formatted avg speed
- footer_output_speed_spans: real-time estimate during streaming,
  then fallback to stored avg after turn ends
- turn completion: compute avg from actual usage.output_tokens + turn_elapsed
- TurnStarted: clear stored avg so chip hides until next streaming

Replaced turn_started_at (includes request prep + TTFB) with
stream_started_at (first content delta) for a more accurate
measurement of generation throughput.
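
The switch from turn start to first content delta can be illustrated with a small timing tracker. This is a sketch under assumptions: the `StreamTiming` type and its method names are hypothetical, and only the `stream_started_at` field name and the reset-on-TurnStarted behaviour come from the commit messages. The clock is passed in as a parameter so the logic can be tested deterministically.

```rust
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the timing state the commits add to `App`.
#[derive(Debug, Default)]
pub struct StreamTiming {
    /// Set on the first content delta of a turn, not when the turn starts,
    /// so request preparation and time-to-first-byte are excluded.
    stream_started_at: Option<Instant>,
}

impl StreamTiming {
    /// Clears the timestamp when a new turn starts.
    pub fn on_turn_started(&mut self) {
        self.stream_started_at = None;
    }

    /// Records the first delta's arrival time; later deltas leave it unchanged.
    pub fn on_content_delta(&mut self, now: Instant) {
        if self.stream_started_at.is_none() {
            self.stream_started_at = Some(now);
        }
    }

    /// Time spent generating so far, or `None` before the first delta.
    pub fn stream_elapsed(&self, now: Instant) -> Option<Duration> {
        self.stream_started_at
            .map(|started| now.saturating_duration_since(started))
    }
}
```

If the first delta arrives 500 ms after the turn starts, a reading taken at 1500 ms reports 1000 ms of generation time instead of 1500 ms, which avoids understating throughput for short responses.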
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a real-time token output speed indicator to the TUI footer, tracking character output during streaming to estimate tokens per second and displaying a final average upon turn completion. The implementation includes new state tracking in the App struct, updates to the event loop for timing and volume capture, and a new footer rendering function. Feedback from the reviewer suggests simplifying the character accumulation logic and refactoring duplicated tracking and formatting code into shared helper methods to improve maintainability.

Comment thread crates/tui/src/tui/ui.rs Outdated
Comment on lines +967 to +969
app.stream_output_chars = app
    .stream_output_chars
    .saturating_add(u64::try_from(sanitized.len()).unwrap_or(0));

medium

The accumulation of stream_output_chars can be simplified. Using as u64 is safe here as sanitized.len() (representing a single delta) will not exceed u64::MAX. Additionally, this tracking logic is duplicated in the ThinkingDelta handler; consider refactoring it into a helper method on App.

app.stream_output_chars = app.stream_output_chars.saturating_add(sanitized.len() as u64);

Comment thread crates/tui/src/tui/ui.rs Outdated
Comment on lines +1093 to +1095
app.stream_output_chars = app
    .stream_output_chars
    .saturating_add(u64::try_from(sanitized.len()).unwrap_or(0));

medium

The accumulation of stream_output_chars can be simplified here as well, matching the logic in MessageDelta.

app.stream_output_chars = app.stream_output_chars.saturating_add(sanitized.len() as u64);
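
The helper method suggested in the first comment might look like the sketch below. The `App` struct here is a minimal stand-in with a single field, and the method name `record_stream_output` is hypothetical; only the `stream_output_chars` field and the `saturating_add(sanitized.len() as u64)` expression come from the review. Note that `str::len()` returns the length in bytes, so multi-byte UTF-8 text counts for more than one unit per character; `sanitized.chars().count()` would count Unicode scalar values at the cost of a scan.

```rust
/// Minimal stand-in for the real `App`, reduced to the one field this sketch needs.
#[derive(Debug, Default)]
pub struct App {
    pub stream_output_chars: u64,
}

impl App {
    /// Hypothetical helper shared by the MessageDelta and ThinkingDelta handlers.
    /// Adds the delta's byte length, saturating instead of overflowing.
    pub fn record_stream_output(&mut self, sanitized: &str) {
        self.stream_output_chars = self
            .stream_output_chars
            .saturating_add(sanitized.len() as u64);
    }
}
```

Both handlers could then call `app.record_stream_output(&sanitized)` rather than repeating the accumulation expression.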

Comment thread crates/tui/src/tui/ui.rs Outdated
Comment on lines +1379 to +1383
let label = if avg_tps >= 10.0 {
    format!("{:.0} tok/s avg", avg_tps)
} else {
    format!("{:.1} tok/s avg", avg_tps)
};

medium

The logic for formatting the tokens-per-second label (rounding to 0 or 1 decimal places based on the value) is duplicated here and in footer_output_speed_spans (lines 684-688 in footer_ui.rs). Consider extracting this into a shared utility function to ensure consistent formatting across the real-time estimate and the final average.

@LIYUE1918
Author

Addressed both suggestions in 6b77976:

  • Replaced u64::try_from(sanitized.len()).unwrap_or(0) with sanitized.len() as u64 in both MessageDelta and ThinkingDelta handlers.
  • Extracted shared format_tok_s_label() helper into footer_ui.rs and used it in both footer_output_speed_spans() and the turn-completion avg computation.
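
A shared formatter along these lines would cover both call sites. The name `format_tok_s_label` comes from the reply above, but its signature and the choice to let the caller append the "avg" suffix are assumptions, since the actual commit is not shown here. The threshold and precision match the snippet quoted in the review.

```rust
/// Sketch of a shared label formatter (signature assumed, not taken from the PR).
/// Rates of 10 tok/s or more are shown without decimals; slower rates keep one.
pub fn format_tok_s_label(tps: f64) -> String {
    if tps >= 10.0 {
        format!("{:.0} tok/s", tps)
    } else {
        format!("{:.1} tok/s", tps)
    }
}
```

The streaming chip could use the label directly, while the turn-completion path could build its text with `format!("{} avg", format_tok_s_label(avg_tps))`, so the rounding rule lives in one place.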

@Hmbown Hmbown added this to the v0.8.48 milestone May 21, 2026
@Hmbown
Owner

Hmbown commented May 23, 2026

This PR was opened before the v0.8.41 rebrand and is now stale. Feel free to rebase onto current main and reopen. The whale brothers are waiting for you 🐋

@Hmbown Hmbown closed this May 23, 2026