feat: Add token processing and generation speed to usage report#328

Merged
dwash96 merged 2 commits into cecli-dev:v0.91.5 from sannysanoff:feature/show-token-speed
Jan 1, 2026
Conversation

@sannysanoff

Summary

Adds token speed metrics to the usage report when --show-speed flag is enabled.

Features

  • --show-speed flag: Enable display of LLM timing and speed metrics (disabled by default)
  • Time tracking: Measures total LLM response time and time to first token (TtFT)
  • Speed calculation:
    • Prompt processing speed: prompt tokens / time to first token
    • Token generation speed: completion tokens / (total time - time to first token)
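The two rates above follow directly from the tracked timings. A minimal sketch of the calculation (function and variable names here are illustrative, not the PR's actual internals):

```python
def compute_speeds(prompt_tokens, completion_tokens, total_time, ttft):
    """Derive the two speed metrics from raw token counts and timings.

    Illustrative sketch only; names do not match the merged code.
    """
    # Prompt processing: all prompt tokens are consumed before the
    # first output token arrives, so divide by time to first token.
    prompt_speed = prompt_tokens / ttft if ttft > 0 else 0.0

    # Generation: completion tokens are produced between the first
    # token and the end of the response.
    gen_time = total_time - ttft
    gen_speed = completion_tokens / gen_time if gen_time > 0 else 0.0
    return prompt_speed, gen_speed
```

For instance, 24,000 prompt tokens with a 2.5 s TtFT works out to 9600 prompt tokens/sec.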

Output Example

With --show-speed enabled:

Tokens: 24k sent, 1.4k received.
LLM elapsed time: 35.00 seconds (TtFT: 2.50s)
Speed: 9600 prompt tokens/sec, 57 output tokens/sec

Without the flag, no timing information is shown.

Changes

  • aider/args.py: Added --show-speed argument
  • aider/coders/base_coder.py:
    • Track LLM response time after streaming completes
    • Track time to first token for streaming responses
    • Added _add_speed_info() helper method
    • Speed info only shown when --show-speed is enabled

🤖 Generated with Claude Code

- Add --show-speed flag to enable speed display (disabled by default)
- Track LLM response time (llm_elapsed) after streaming completes
- Track time to first token (TtFT) for streaming responses
- Calculate and display prompt processing speed (tokens/sec)
- Calculate and display token generation speed (output tokens/sec)

Output with --show-speed:
  LLM elapsed time: X.XX seconds (TtFT: X.XXs)
  Speed: XXX prompt tokens/sec, XXX output tokens/sec

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@dwash96
Collaborator

dwash96 commented Dec 30, 2025

I'm not really into this structure of just sort of adding random if statements to an already long and dense file, instead of, say, a more general helpers/profiler.py with something like start(self, type), log(self, type), elapsed(self, type) or similar, where the profiler handles gathering and restoring the state of these types of timers. _add_speed_info() makes sense, but I want to drive towards general utilities that handle discrete and useful operations.
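A minimal sketch of the kind of helpers/profiler.py utility being suggested here, using the start/log/elapsed API named in the comment (this is an illustration of the idea, not the code that was merged):

```python
import time


class Profiler:
    """Named-timer utility along the lines suggested above.

    Illustrative sketch; the merged code was later renamed to
    self.token_profiler and may differ in detail.
    """

    def __init__(self):
        self._starts = {}
        self._log = {}

    def start(self, type):
        # Begin (or restart) the timer for this type of event.
        self._starts[type] = time.monotonic()

    def elapsed(self, type):
        # Seconds since start() for this type; 0.0 if never started.
        if type not in self._starts:
            return 0.0
        return time.monotonic() - self._starts[type]

    def log(self, type):
        # Snapshot the current elapsed time for later reporting.
        self._log[type] = self.elapsed(type)
        return self._log[type]
```

The caller would then do `profiler.start("ttft")` before the request, `profiler.log("ttft")` on the first streamed token, and so on, keeping the timing bookkeeping out of base_coder.py.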

@sannysanoff
Author

sannysanoff commented Dec 31, 2025

Hi, thanks for taking the time to review my slop. I tried to address your concerns and make it composable instead of adding complexity to already dense logic. Now the base_coder.py diff is as minimal as possible, while keeping all functionality. Please advise if I can improve it further.

@dwash96
Collaborator

dwash96 commented Dec 31, 2025

No problem. Time to first token is a good metric to have on hand, and I think it's good as is. I'll change self.profiler to self.token_profiler after the merge, but that's one regex replace lol

@dwash96 dwash96 changed the base branch from main to v0.91.5 December 31, 2025 23:43
@dwash96
Collaborator

dwash96 commented Dec 31, 2025

Oh, but can you set up and run the pre-commit hook stuff for the formatting? I want to be a bit more focused on making sure PRs are well formatted, just as a community standard.

@dwash96 dwash96 merged commit f9b165e into cecli-dev:v0.91.5 Jan 1, 2026
7 of 8 checks passed
@dwash96 dwash96 mentioned this pull request Jan 1, 2026
@sannysanoff
Author

Thanks, cool!
