Skip to content

Token Count Support for the Completions and Experiment#24

Closed
jacekzimonski wants to merge 6 commits intomainfrom
jacek/token-count-support-for-the-completions-and-experiment
Closed

Token Count Support for the Completions and Experiment#24
jacekzimonski wants to merge 6 commits intomainfrom
jacek/token-count-support-for-the-completions-and-experiment

Conversation

@jacekzimonski
Copy link
Copy Markdown
Collaborator

@jacekzimonski jacekzimonski commented Aug 19, 2025

Request from Slack:
do we show the reasoning tokens COUNT in another AI? for models where the reasoning text isn’t available, it would help to show the reasoning tokens [@jacek.zimonski](https://workflowaihq.slack.com/team/U05SZFUQU75)

To get those values needed to:

  • modify the backend endpoints
  • add the calculated token count to the SQL (Clerk)

It would work without the SQL (Clerk), but we would loose the display for the Experiment Page and have it only in the CompletionModal. Reason for that is that when returning the experiment explicitly calculating of the traces information (where we store the token costs) was blocked.

In the frontend we show the Token Count for the Reasoning Models is it's more then 0 in:

  • Experiment Page Completion Cell, below the Cost and Duration
  • In the optional third column in the CompletionModal, next to Duration and Cost
  • In the Completions Page, as one of the new properties
Screenshot 2025-08-19 at 12 51 16 PM Screenshot 2025-08-19 at 12 51 22 PM Screenshot 2025-08-19 at 12 51 34 PM Screenshot 2025-08-19 at 12 55 34 PM

@vercel
Copy link
Copy Markdown

vercel Bot commented Aug 19, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Preview Comments Updated (UTC)
anotherai-docs Ready Ready Preview Comment Aug 19, 2025 0:46am

@jacekzimonski
Copy link
Copy Markdown
Collaborator Author

@anyacherniss
To test it docker compose will need to be restarted

@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Aug 19, 2025

Claude finished @jacekzimonski's task —— View job


📋 Code Review: Token Count Support for Completions and Experiment

Overall Assessment: Excellent Implementation

This PR successfully implements reasoning token count tracking with high-quality code, comprehensive testing, and proper database design. The implementation is clean, well-documented, and follows established patterns throughout the codebase.


🔧 Backend Implementation - Excellent ⭐⭐⭐⭐⭐

Domain Model Changes (backend/core/domain/agent_completion.py:38-42)

  • ✅ Excellent: Clear, comprehensive documentation explaining field semantics
  • ✅ Excellent: Proper nullable type handling (int | None) with clear meaning distinction
  • ✅ Excellent: Follows established patterns in the codebase

Business Logic (backend/core/runners/agent_completion_builder.py:54-74)

  • ✅ Excellent: Clean aggregation logic with proper null handling
  • ✅ Excellent: Distinguishes between None (no reasoning tokens) and 0 (zero tokens used)
  • ✅ Excellent: Efficient single-pass aggregation algorithm
  • ✅ Excellent: Proper type casting with int() for safety

Database Design ⭐⭐⭐⭐⭐

  • ✅ Excellent: Proper nullable column design (Nullable(UInt32))
  • ✅ Excellent: Well-designed minmax index for efficient filtering/sorting
  • ✅ Excellent: Clear migration documentation explaining use cases
  • ✅ Excellent: Follows ClickHouse best practices

API Layer (backend/protocol/api/_api_models.py:320-323)

  • ✅ Excellent: Consistent field definitions across all models
  • ✅ Excellent: Proper type annotations throughout
  • ✅ Excellent: Clean data flow from domain to API models

🧪 Test Coverage - Outstanding ⭐⭐⭐⭐⭐

Comprehensive Test Suite (backend/core/runners/agent_completion_builder_test.py)

  • ✅ Outstanding: 8 comprehensive test scenarios covering all edge cases
  • ✅ Outstanding: Tests single trace, multiple traces, mixed traces, zero tokens
  • ✅ Outstanding: Proper edge case handling (no completions, no usage data)
  • ✅ Outstanding: Clear test names and documentation
  • ✅ Outstanding: Follows established testing patterns

🎨 Frontend Implementation - Excellent ⭐⭐⭐⭐⭐

UI Components (web/src/components/PriceAndLatencyDisplay.tsx)

  • ✅ Excellent: Clean conditional rendering with hasReasoningTokens check
  • ✅ Excellent: Consistent styling with existing cost/duration displays
  • ✅ Excellent: Proper token formatting with formatTokens() utility
  • ✅ Excellent: Responsive design maintained

Type Definitions (web/src/types/models.ts)

  • ✅ Excellent: Consistent field definitions across all interfaces
  • ✅ Excellent: Proper optional field handling
  • ✅ Excellent: Type safety maintained throughout

Integration Points

  • ✅ Excellent: Proper integration in Experiment page, Completion modal, and Completions list
  • ✅ Excellent: Data flows correctly from backend to UI
  • ✅ Excellent: Maintains existing UI patterns

🔒 Security Assessment - Secure

  • ✅ Secure: No sensitive data exposure in token counts
  • ✅ Secure: Proper input validation and type checking
  • ✅ Secure: No injection vulnerabilities identified
  • ✅ Secure: Follows established security patterns

Performance Assessment - Optimized

Database Performance

  • ✅ Optimized: Minmax index enables efficient queries on reasoning_token_count
  • ✅ Optimized: Nullable column design minimizes storage overhead
  • ✅ Optimized: Single-pass aggregation algorithm in business logic

Frontend Performance

  • ✅ Optimized: Conditional rendering avoids unnecessary DOM updates
  • ✅ Optimized: Efficient token formatting utility
  • ✅ Optimized: No performance regressions identified

🏗️ Code Quality - Excellent ⭐⭐⭐⭐⭐

Best Practices

  • ✅ Excellent: Consistent error handling patterns
  • ✅ Excellent: Proper separation of concerns
  • ✅ Excellent: Clear naming conventions throughout
  • ✅ Excellent: Comprehensive code documentation

Architecture

  • ✅ Excellent: Clean data flow from traces → domain → API → frontend
  • ✅ Excellent: Proper abstraction layers maintained
  • ✅ Excellent: Follows established codebase patterns
  • ✅ Excellent: No architectural debt introduced

Maintainability

  • ✅ Excellent: Well-documented field semantics
  • ✅ Excellent: Clear test coverage for future modifications
  • ✅ Excellent: Extensible design for future token types

📊 Implementation Highlights

  1. Robust Null Handling: Perfect distinction between None (no reasoning) vs 0 (zero tokens)
  2. Performance-First Database Design: Efficient indexing strategy for analytics queries
  3. Comprehensive Edge Case Testing: All scenarios covered with clear assertions
  4. Consistent UI Integration: Seamless integration across all interfaces
  5. Production-Ready Code: Follows all established patterns and best practices

🎯 Final Recommendation: APPROVE

This PR demonstrates excellent engineering practices with:

  • High-quality implementation across backend and frontend
  • Outstanding test coverage with comprehensive edge cases
  • Proper database design with performance optimization
  • Security-conscious approach with no vulnerabilities
  • Clean, maintainable code following established patterns

The implementation is production-ready and ready for merge.


@jacekzimonski
Copy link
Copy Markdown
Collaborator Author

Minor Suggestions

@claude fixed all the Minor Suggestions, please look and confirm

@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Aug 19, 2025

Claude finished @jacekzimonski's task —— View job


✅ Minor Suggestions Review Complete

All previously identified minor suggestions have been successfully addressed:

Todo List:

  • Check for database index on reasoning_token_count
  • Verify unit tests for token aggregation logic
  • Verify unit tests for UI conditional rendering behavior
  • Check for added documentation/comments in domain model
  • Provide final confirmation

✅ CONFIRMED: All Minor Suggestions Addressed

1. Database Index Added

  • File: backend/core/storage/clickhouse/migrations/5_reasoning_tokens_index.sql
  • Implementation: Added minmax index on reasoning_token_count column
  • Benefits: Enables efficient filtering and sorting operations as documented in the migration

2. Unit Tests Added

  • File: backend/core/runners/agent_completion_builder_test.py (193 lines of comprehensive tests)
  • Coverage: Excellent test coverage including:
    • Single trace reasoning token aggregation
    • Multiple trace aggregation (summing tokens)
    • Mixed traces (some with/without reasoning tokens)
    • Edge cases (zero tokens, no traces, no usage data)
    • Proper null handling when no reasoning tokens exist

3. Documentation Added

  • File: backend/core/domain/agent_completion.py:38-42
  • Implementation: Comprehensive docstring explaining:
    • When the field is populated (summing from LLMTrace reasoning tokens)
    • Difference between None (no reasoning tokens used) vs 0 (reasoning model used but generated zero tokens)
    • Source of the data (InferenceUsage.prompt)

4. UI Implementation Verified

  • File: web/src/components/PriceAndLatencyDisplay.tsx:26,38-43
  • Implementation: Proper conditional rendering with hasReasoningTokens check
  • Logic: Only displays when reasoningTokens != null && reasoningTokens > 0

Quality Assessment

The fixes demonstrate excellent attention to detail:

  • Database: Proper indexing strategy for performance
  • Testing: Comprehensive edge case coverage with 8 test scenarios
  • Documentation: Clear, technical documentation explaining field semantics
  • UI: Clean conditional rendering avoiding UI clutter

No remaining issues found. The implementation is production-ready and follows best practices throughout.


Copy link
Copy Markdown
Collaborator

@anyacherniss anyacherniss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This passes QA for me, but will not merge until Guillaume reviews

@jacekzimonski
Copy link
Copy Markdown
Collaborator Author

jacekzimonski commented Aug 20, 2025

@anyacherniss
Yes agree, too much change on backend to merge it without @guillaq review

@jacekzimonski
Copy link
Copy Markdown
Collaborator Author

The backend changed a lot since this proposed change, so a different implementation if we move with this functionality will be needed. Becasue of that I will be closing the PR

@jacekzimonski
Copy link
Copy Markdown
Collaborator Author

@anyacherniss
Don't have anything in my queue, so will create a new implementation for this one

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants