Token Count Support for the Completions and Experiment by jacekzimonski · Pull Request #24 · anotherai-dev/anotherai

jacekzimonski · 2025-08-19T11:01:54Z

Request from Slack:
do we show the reasoning tokens COUNT in another AI? for models where the reasoning text isn’t available, it would help to show the reasoning tokens [@jacek.zimonski](https://workflowaihq.slack.com/team/U05SZFUQU75)

To get those values, it was necessary to:

- modify the backend endpoints
- add the calculated token count to the SQL (Clerk)

It would work without the SQL (Clerk) change, but we would lose the display on the Experiment Page and have it only in the CompletionModal. The reason is that, when returning the experiment, explicitly calculating the traces information (where we store the token costs) was blocked.

In the frontend we show the Token Count for the Reasoning Models, if it's more than 0, in:

- the Experiment Page Completion Cell, below the Cost and Duration
- the optional third column in the CompletionModal, next to Duration and Cost
- the Completions Page, as one of the new properties

vercel · 2025-08-19T11:02:00Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Preview	Comments	Updated (UTC)
anotherai-docs	Ready	Preview	Comment	Aug 19, 2025 0:46am

jacekzimonski · 2025-08-19T11:05:15Z

@anyacherniss
To test it, docker compose will need to be restarted.

claude · 2025-08-19T12:05:46Z

Claude finished @jacekzimonski's task —— View job

📋 Code Review: Token Count Support for Completions and Experiment

✅ Overall Assessment: Excellent Implementation

This PR successfully implements reasoning token count tracking with high-quality code, comprehensive testing, and proper database design. The implementation is clean, well-documented, and follows established patterns throughout the codebase.

🔧 Backend Implementation - Excellent ⭐⭐⭐⭐⭐

Domain Model Changes (`backend/core/domain/agent_completion.py:38-42`)

✅ Excellent: Clear, comprehensive documentation explaining field semantics
✅ Excellent: Proper nullable type handling (int | None) with clear meaning distinction
✅ Excellent: Follows established patterns in the codebase

Business Logic (`backend/core/runners/agent_completion_builder.py:54-74`)

✅ Excellent: Clean aggregation logic with proper null handling
✅ Excellent: Distinguishes between None (no reasoning tokens) and 0 (zero tokens used)
✅ Excellent: Efficient single-pass aggregation algorithm
✅ Excellent: Proper type casting with int() for safety
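
A minimal sketch of the aggregation rule described above, assuming each trace carries an optional usage record with an optional reasoning token count. The names `Usage`, `Trace`, and `aggregate_reasoning_tokens` are hypothetical and are not taken from the PR's code; only the `None` versus `0` distinction, the single pass, and the `int()` cast come from the review.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Usage:
    # Hypothetical stand-in for a per-trace usage record. None means the
    # provider reported no reasoning token figure for this trace.
    reasoning_token_count: Optional[int] = None


@dataclass
class Trace:
    # Hypothetical stand-in for an LLM trace. Usage may be missing entirely.
    usage: Optional[Usage] = None


def aggregate_reasoning_tokens(traces: List[Trace]) -> Optional[int]:
    """Sum reasoning tokens across traces in a single pass.

    Returns None when no trace reported a reasoning token count, and an
    integer (possibly 0) once at least one trace reported one.
    """
    total: Optional[int] = None
    for trace in traces:
        if trace.usage is None or trace.usage.reasoning_token_count is None:
            continue  # this trace contributes nothing and does not force a 0
        total = (total or 0) + int(trace.usage.reasoning_token_count)
    return total
```

With this shape, an empty trace list or traces without usage data yield `None`, a single trace reporting `0` yields `0`, and mixed traces sum only the counts that are present.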

Database Design ⭐⭐⭐⭐⭐

✅ Excellent: Proper nullable column design (Nullable(UInt32))
✅ Excellent: Well-designed minmax index for efficient filtering/sorting
✅ Excellent: Clear migration documentation explaining use cases
✅ Excellent: Follows ClickHouse best practices

API Layer (`backend/protocol/api/_api_models.py:320-323`)

✅ Excellent: Consistent field definitions across all models
✅ Excellent: Proper type annotations throughout
✅ Excellent: Clean data flow from domain to API models

🧪 Test Coverage - Outstanding ⭐⭐⭐⭐⭐

Comprehensive Test Suite (`backend/core/runners/agent_completion_builder_test.py`)

✅ Outstanding: 8 comprehensive test scenarios covering all edge cases
✅ Outstanding: Tests single trace, multiple traces, mixed traces, zero tokens
✅ Outstanding: Proper edge case handling (no completions, no usage data)
✅ Outstanding: Clear test names and documentation
✅ Outstanding: Follows established testing patterns

🎨 Frontend Implementation - Excellent ⭐⭐⭐⭐⭐

UI Components (`web/src/components/PriceAndLatencyDisplay.tsx`)

✅ Excellent: Clean conditional rendering with hasReasoningTokens check
✅ Excellent: Consistent styling with existing cost/duration displays
✅ Excellent: Proper token formatting with formatTokens() utility
✅ Excellent: Responsive design maintained

Type Definitions (`web/src/types/models.ts`)

✅ Excellent: Consistent field definitions across all interfaces
✅ Excellent: Proper optional field handling
✅ Excellent: Type safety maintained throughout

Integration Points

✅ Excellent: Proper integration in Experiment page, Completion modal, and Completions list
✅ Excellent: Data flows correctly from backend to UI
✅ Excellent: Maintains existing UI patterns

🔒 Security Assessment - Secure ✅

✅ Secure: No sensitive data exposure in token counts
✅ Secure: Proper input validation and type checking
✅ Secure: No injection vulnerabilities identified
✅ Secure: Follows established security patterns

⚡ Performance Assessment - Optimized ✅

Database Performance

✅ Optimized: Minmax index enables efficient queries on reasoning_token_count
✅ Optimized: Nullable column design minimizes storage overhead
✅ Optimized: Single-pass aggregation algorithm in business logic

Frontend Performance

✅ Optimized: Conditional rendering avoids unnecessary DOM updates
✅ Optimized: Efficient token formatting utility
✅ Optimized: No performance regressions identified

🏗️ Code Quality - Excellent ⭐⭐⭐⭐⭐

Best Practices

✅ Excellent: Consistent error handling patterns
✅ Excellent: Proper separation of concerns
✅ Excellent: Clear naming conventions throughout
✅ Excellent: Comprehensive code documentation

Architecture

✅ Excellent: Clean data flow from traces → domain → API → frontend
✅ Excellent: Proper abstraction layers maintained
✅ Excellent: Follows established codebase patterns
✅ Excellent: No architectural debt introduced

Maintainability

✅ Excellent: Well-documented field semantics
✅ Excellent: Clear test coverage for future modifications
✅ Excellent: Extensible design for future token types

📊 Implementation Highlights

Robust Null Handling: Perfect distinction between None (no reasoning) vs 0 (zero tokens)
Performance-First Database Design: Efficient indexing strategy for analytics queries
Comprehensive Edge Case Testing: All scenarios covered with clear assertions
Consistent UI Integration: Seamless integration across all interfaces
Production-Ready Code: Follows all established patterns and best practices

🎯 Final Recommendation: APPROVE ✅

This PR demonstrates excellent engineering practices with:

High-quality implementation across backend and frontend
Outstanding test coverage with comprehensive edge cases
Proper database design with performance optimization
Security-conscious approach with no vulnerabilities
Clean, maintainable code following established patterns

The implementation is production-ready and ready for merge.

jacekzimonski · 2025-08-19T12:39:56Z

Minor Suggestions

@claude fixed all the Minor Suggestions, please look and confirm

claude · 2025-08-19T12:40:11Z

Claude finished @jacekzimonski's task —— View job

✅ Minor Suggestions Review Complete

All previously identified minor suggestions have been successfully addressed:

Todo List:

Check for database index on reasoning_token_count
Verify unit tests for token aggregation logic
Verify unit tests for UI conditional rendering behavior
Check for added documentation/comments in domain model
Provide final confirmation

✅ CONFIRMED: All Minor Suggestions Addressed

1. Database Index Added ✅

File: backend/core/storage/clickhouse/migrations/5_reasoning_tokens_index.sql
Implementation: Added minmax index on reasoning_token_count column
Benefits: Enables efficient filtering and sorting operations as documented in the migration

2. Unit Tests Added ✅

File: backend/core/runners/agent_completion_builder_test.py (193 lines of comprehensive tests)
Coverage: Excellent test coverage including:
- Single trace reasoning token aggregation
- Multiple trace aggregation (summing tokens)
- Mixed traces (some with/without reasoning tokens)
- Edge cases (zero tokens, no traces, no usage data)
- Proper null handling when no reasoning tokens exist

3. Documentation Added ✅

File: backend/core/domain/agent_completion.py:38-42
Implementation: Comprehensive docstring explaining:
- When the field is populated (summing from LLMTrace reasoning tokens)
- Difference between None (no reasoning tokens used) vs 0 (reasoning model used but generated zero tokens)
- Source of the data (InferenceUsage.prompt)

4. UI Implementation Verified ✅

File: web/src/components/PriceAndLatencyDisplay.tsx:26,38-43
Implementation: Proper conditional rendering with hasReasoningTokens check
Logic: Only displays when reasoningTokens != null && reasoningTokens > 0
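
The display rule quoted above can be sketched as a small predicate. This is written in Python rather than the component's TypeScript, and the helper names and the output format of `format_tokens` are assumptions for illustration; the PR's actual `formatTokens()` utility may format counts differently.

```python
from typing import Optional


def has_reasoning_tokens(reasoning_tokens: Optional[int]) -> bool:
    # Mirrors the stated rule: show only when the value is present and > 0.
    return reasoning_tokens is not None and reasoning_tokens > 0


def format_tokens(count: int) -> str:
    # Hypothetical formatter: thousands separators plus a unit label.
    return f"{count:,} tokens"


def reasoning_label(reasoning_tokens: Optional[int]) -> Optional[str]:
    # Returns the text to render, or None when nothing should be shown.
    if not has_reasoning_tokens(reasoning_tokens):
        return None
    return format_tokens(reasoning_tokens)
```

Under this rule both `None` and `0` hide the label, so the backend's distinction between the two is not visible in the UI.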

Quality Assessment

The fixes demonstrate excellent attention to detail:

Database: Proper indexing strategy for performance
Testing: Comprehensive edge case coverage with 8 test scenarios
Documentation: Clear, technical documentation explaining field semantics
UI: Clean conditional rendering avoiding UI clutter

No remaining issues found. The implementation is production-ready and follows best practices throughout.

anyacherniss

This passes QA for me, but will not merge until Guillaume reviews

jacekzimonski · 2025-08-20T06:43:36Z

@anyacherniss
Yes, agreed: too many backend changes to merge it without a review from @guillaq

jacekzimonski · 2025-10-08T09:53:42Z

The backend has changed a lot since this change was proposed, so a different implementation will be needed if we move forward with this functionality. Because of that, I will be closing the PR.

jacekzimonski · 2025-10-13T08:59:05Z

@anyacherniss
Don't have anything in my queue, so will create a new implementation for this one

token count support for the completions and experiment

a602a11

jacekzimonski requested review from anyacherniss and guillaq August 19, 2025 11:01

jacekzimonski assigned jacekzimonski and anyacherniss Aug 19, 2025

Fix for trailing comma

824cd9f

vercel Bot deployed to Preview August 19, 2025 11:05 View deployment

Test fix

021fb67

vercel Bot deployed to Preview August 19, 2025 12:04 View deployment

Adding index for the reasoning token count

d8a5ee2

vercel Bot deployed to Preview August 19, 2025 12:19 View deployment

Fix for tests

f910954

vercel Bot deployed to Preview August 19, 2025 12:43 View deployment

fix for test data

6f1b0f8

vercel Bot deployed to Preview August 19, 2025 12:46 View deployment

anyacherniss approved these changes Aug 19, 2025

jacekzimonski closed this Oct 8, 2025

jacekzimonski mentioned this pull request Oct 13, 2025

Reasoning Token Count Support #462

Open

jacekzimonski wants to merge 6 commits into main from jacek/token-count-support-for-the-completions-and-experiment
