feat: generate stats for llms used #127
Walkthrough

Adds persistent usage statistics: new types and a StatsManager, store integration to record/query GenerationEvent data, instrumentation in message generation (cache hit/miss, timing, tokens, cost), and a new stats command.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant CreateMsg
    participant Store
    participant StatsManager
    participant Disk
    User->>CreateMsg: Run create-commit-msg
    CreateMsg->>CreateMsg: startTime = now()
    CreateMsg->>Store: Check cache (only on first attempt)
    alt Cache Hit (first attempt)
        CreateMsg->>Store: RecordGenerationEvent(CacheHit=true, CacheChecked=true)
        Store->>StatsManager: RecordGeneration(event)
        StatsManager->>Disk: Save stats
        CreateMsg-->>User: Return cached message
    else Cache Miss
        CreateMsg->>CreateMsg: Generate message
        CreateMsg->>CreateMsg: Compute generationTime, tokens, cost
        CreateMsg->>Store: RecordGenerationEvent(Success=..., details...)
        Store->>StatsManager: RecordGeneration(event)
        StatsManager->>Disk: Save stats
        alt isFirstAttempt
            CreateMsg->>Store: Cache message (with computed cost)
        end
        CreateMsg-->>User: Return generated message
    end
```

```mermaid
sequenceDiagram
    participant User
    participant StatsCLI
    participant Store
    participant StatsManager
    participant Disk
    User->>StatsCLI: Run stats command (--detailed / --reset)
    alt Reset flag
        StatsCLI->>User: Prompt confirmation
        User->>StatsCLI: Confirm
        StatsCLI->>Store: ResetUsageStats()
        Store->>StatsManager: ResetStats()
        StatsManager->>Disk: Write cleared stats
        StatsCLI-->>User: Confirm reset complete
    else Show stats
        StatsCLI->>Store: GetUsageStats()
        Store->>StatsManager: GetStats()
        StatsManager->>Disk: Read stats file (if present)
        StatsManager-->>Store: Return stats copy
        Store-->>StatsCLI: Return stats
        StatsCLI-->>User: Render overall and provider stats (detailed optional)
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
cmd/cli/createMsg.go (1)
389-759: Update hard-coded LLM pricing constants to Oct 2025 rates

Pricing is stale; current Oct 2025 rates show significant changes:
- OpenAI GPT-4o: $5.00/$20.00/1M (input/output), code has $2.50/$10.00 (+100%)
- Gemini 2.5 Flash: $0.30/$2.50/1M, code has $0.15/$0.60 (input +2x, output +4x)
- Claude Sonnet: $3.00/$15.00/1M (matches code)
- Grok 4: $3.00/$15.00/1M, code has $5.00/$15.00
- Groq: $0.075–$1.00/$0.30–$3.00/1M (model-dependent), code uses flat $2.50/$10.00
Update `estimateCost()` in cmd/cli/createMsg.go with official rates. For Groq, consider defaulting to a conservative middle estimate or documenting the variance.
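To make the suggestion concrete, here is a hedged sketch of a table-driven cost estimator. The provider keys, the rate values (taken from the figures quoted above), and the flat Groq fallback are illustrative assumptions, not the repository's actual constants.

```go
package main

import "fmt"

// pricing holds USD per 1M tokens for one provider. The figures below are
// the reviewer-quoted Oct 2025 rates and are assumptions; verify against
// each provider's official pricing page before use.
type pricing struct {
	inputPerM  float64
	outputPerM float64
}

var rates = map[string]pricing{
	"openai": {5.00, 20.00}, // GPT-4o
	"gemini": {0.30, 2.50},  // Gemini 2.5 Flash
	"claude": {3.00, 15.00}, // Claude Sonnet
	"grok":   {3.00, 15.00}, // Grok 4
	"groq":   {0.50, 1.50},  // model-dependent; conservative middle estimate
}

// estimateCost converts token counts into an approximate dollar cost,
// returning 0 for unknown providers.
func estimateCost(provider string, inputTokens, outputTokens int) float64 {
	p, ok := rates[provider]
	if !ok {
		return 0
	}
	return float64(inputTokens)/1e6*p.inputPerM +
		float64(outputTokens)/1e6*p.outputPerM
}

func main() {
	fmt.Printf("%.6f\n", estimateCost("openai", 1200, 100))
}
```

Keeping the rates in a single table like this makes the next pricing refresh a one-place change.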
🧹 Nitpick comments (4)
cmd/cli/stats.go (1)
26-41: Reuse global Store when available; fall back to creating a new one

Avoid opening keyring/cache twice and keep state consistent with the rest of the CLI.
```diff
-	Store, err := store.NewStoreMethods()
-	if err != nil {
-		return fmt.Errorf("failed to initialize store: %w", err)
-	}
+	var s *store.StoreMethods
+	if Store != nil {
+		s = Store
+	} else {
+		var err error
+		s, err = store.NewStoreMethods()
+		if err != nil {
+			return fmt.Errorf("failed to initialize store: %w", err)
+		}
+	}

 	reset, _ := cmd.Flags().GetBool("reset")
 	if reset {
-		if err := resetStatistics(Store); err != nil {
+		if err := resetStatistics(s); err != nil {
 			return err
 		}
 		return nil
 	}

 	detailed, _ := cmd.Flags().GetBool("detailed")
-	return displayStatistics(Store, detailed)
+	return displayStatistics(s, detailed)
```

internal/usage/usage.go (3)
56-73: Honor event timestamp when provided

Use the event's timestamp if set; fall back to now. This preserves chronology if events are ever replayed.
```diff
-	now := time.Now().UTC().Format(time.RFC3339)
+	now := time.Now().UTC().Format(time.RFC3339)
+	ts := event.Timestamp
+	if strings.TrimSpace(ts) == "" {
+		ts = now
+	}
 	...
-	sm.stats.LastUse = now
+	sm.stats.LastUse = ts
 	...
-	if sm.stats.FirstUse == "" {
-		sm.stats.FirstUse = now
+	if sm.stats.FirstUse == "" {
+		sm.stats.FirstUse = ts
```

Note: add the `strings` import at the top.
259-273: Use sort.Slice for clarity and O(n log n) complexity

Bubble-sort is fine for ≤6 providers, but sort.Slice is simpler and idiomatic.
```diff
-	// Sort by usage count (descending)
-	for i := 0; i < len(rankings)-1; i++ {
-		for j := i + 1; j < len(rankings); j++ {
-			if rankings[j].uses > rankings[i].uses {
-				rankings[i], rankings[j] = rankings[j], rankings[i]
-			}
-		}
-	}
+	// Sort by usage count (descending)
+	sort.Slice(rankings, func(i, j int) bool {
+		return rankings[i].uses > rankings[j].uses
+	})
```

Note: add `sort` to the imports.
227-239: Optional: write file atomically to prevent partial writes

Write to a temp file and rename to reduce corruption risk on crashes.
```diff
-	data, err := json.MarshalIndent(sm.stats, "", "  ")
+	data, err := json.MarshalIndent(sm.stats, "", "  ")
 	if err != nil {
 		return fmt.Errorf("failed to marshal stats: %w", err)
 	}
-
-	return os.WriteFile(sm.filePath, data, 0600)
+	tmp := sm.filePath + ".tmp"
+	if err := os.WriteFile(tmp, data, 0600); err != nil {
+		return fmt.Errorf("failed to write temp stats file: %w", err)
+	}
+	return os.Rename(tmp, sm.filePath)
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
- cmd/cli/createMsg.go (1 hunks)
- cmd/cli/root.go (1 hunks)
- cmd/cli/stats.go (1 hunks)
- cmd/cli/store/store.go (3 hunks)
- internal/usage/usage.go (1 hunks)
- pkg/types/types.go (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (5)
- pkg/types/types.go (2)
  - cmd/cli/store/store.go (1): LLMProvider (58-61)
  - internal/llm/provider.go (1): Provider (24-29)
- cmd/cli/stats.go (3)
  - cmd/cli/root.go (1): Store (11-11)
  - cmd/cli/store/store.go (3): NewStoreMethods (26-49), StoreMethods (19-23), LLMProvider (58-61)
  - pkg/types/types.go (2): ProviderStats (161-172), LLMProvider (5-5)
- internal/usage/usage.go (2)
  - pkg/types/types.go (4): UsageStats (146-158), ProviderStats (161-172), LLMProvider (5-5), GenerationEvent (175-184)
  - utils/storeUtils.go (1): GetConfigPath (30-68)
- cmd/cli/createMsg.go (3)
  - pkg/types/types.go (2): GenerationEvent (175-184), Message (80-83)
  - internal/llm/provider.go (1): Provider (24-29)
  - pkg/types/prompt.go (1): BuildCommitPrompt (30-51)
- cmd/cli/store/store.go (2)
  - internal/usage/usage.go (2): StatsManager (16-20), NewStatsManager (23-49)
  - pkg/types/types.go (3): GenerationEvent (175-184), UsageStats (146-158), LLMProvider (5-5)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Build Go Binary (windows-latest)
- GitHub Check: Build Go Binary (macos-latest)
- GitHub Check: Test (1.21)
🔇 Additional comments (3)
cmd/cli/root.go (1)
144-144: Stats command wiring looks good

statsCmd is correctly added to the root command.
cmd/cli/store/store.go (1)
39-48: Good integration of usage stats manager

StatsManager is initialized once and attached to StoreMethods; delegate methods keep the cmd layer simple.
cmd/cli/stats.go (1)
70-76: Guard against divide-by-zero (future-proofing)

You already return early when TotalGenerations==0, so this is safe. Keep as-is.
Please add a tiny unit test covering the zero-stats path to prevent regressions later. I can draft it if helpful.
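A minimal sketch of what that zero-stats guard looks like in isolation; the struct here is a trimmed stand-in (the real UsageStats lives in pkg/types, so these field names are assumptions), and the point is only that the division is skipped when nothing has been recorded.

```go
package main

import "fmt"

// UsageStats is a stand-in for the real pkg/types struct; only the two
// fields needed to illustrate the guard are included.
type UsageStats struct {
	TotalGenerations int
	SuccessfulRuns   int
}

// successRate mirrors the guarded division the stats command performs:
// it must return 0 (not NaN, not a panic) when no generations exist.
func successRate(s UsageStats) float64 {
	if s.TotalGenerations == 0 {
		return 0
	}
	return float64(s.SuccessfulRuns) / float64(s.TotalGenerations) * 100
}

func main() {
	fmt.Println(successRate(UsageStats{})) // zero-stats path
	fmt.Println(successRate(UsageStats{TotalGenerations: 4, SuccessfulRuns: 3}))
}
```

A table-driven `_test.go` asserting both branches would lock this behavior in.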
```go
// GenerationEvent represents a single commit message generation event for tracking.
type GenerationEvent struct {
	Provider       LLMProvider `json:"provider"`
	Success        bool        `json:"success"`
	GenerationTime float64     `json:"generation_time_ms"`
	TokensUsed     int         `json:"tokens_used"`
	Cost           float64     `json:"cost"`
	CacheHit       bool        `json:"cache_hit"`
	Timestamp      string      `json:"timestamp"`
	ErrorMessage   string      `json:"error_message,omitempty"`
}
```
Add a “cache attempted” signal to avoid skewed hit-rate
Today, any non-hit gets counted as a miss, including regenerations where the cache wasn’t even checked. This inflates misses and depresses hit-rate.
Add a CacheChecked flag to GenerationEvent and only update cache hit/miss when it’s true.
```diff
 type GenerationEvent struct {
 	Provider       LLMProvider `json:"provider"`
 	Success        bool        `json:"success"`
 	GenerationTime float64     `json:"generation_time_ms"`
 	TokensUsed     int         `json:"tokens_used"`
 	Cost           float64     `json:"cost"`
-	CacheHit       bool        `json:"cache_hit"`
+	CacheHit       bool        `json:"cache_hit"`
+	CacheChecked   bool        `json:"cache_checked"`
 	Timestamp      string      `json:"timestamp"`
 	ErrorMessage   string      `json:"error_message,omitempty"`
 }
```

🤖 Prompt for AI Agents
In pkg/types/types.go around lines 174 to 184, add a new boolean field
CacheChecked (json:"cache_checked") to the GenerationEvent struct so you can
distinguish between cache misses and cases where the cache was never queried;
then update any code paths that record cache metrics to only treat
CacheHit/Cache miss when CacheChecked is true (i.e., set CacheChecked=true when
the cache was actually queried and only count/update hit-rate when that flag is
true).
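The recording side of this suggestion can be sketched as follows; the types are trimmed stand-ins for the real pkg/types and internal/usage definitions, so field and method names are illustrative.

```go
package main

import "fmt"

// GenerationEvent is a stand-in carrying only the two cache flags
// relevant to this suggestion.
type GenerationEvent struct {
	CacheHit     bool
	CacheChecked bool
}

type cacheCounters struct {
	hits, misses int
}

// record updates hit/miss counts only when the cache was actually
// queried, so regenerations (which skip the cache) never count as misses.
func (c *cacheCounters) record(e GenerationEvent) {
	if !e.CacheChecked {
		return
	}
	if e.CacheHit {
		c.hits++
	} else {
		c.misses++
	}
}

func main() {
	var c cacheCounters
	c.record(GenerationEvent{CacheHit: true, CacheChecked: true}) // hit
	c.record(GenerationEvent{CacheChecked: true})                 // miss
	c.record(GenerationEvent{})                                   // cache skipped: ignored
	fmt.Println(c.hits, c.misses)
}
```

With this guard, the hit-rate denominator is "cache lookups", not "all generations".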
hey @aqsaaqeel can u review the coderabbit suggestions also
Actionable comments posted: 0
♻️ Duplicate comments (2)
pkg/types/types.go (1)
174-185: Good: added CacheChecked to disambiguate cache misses

This unblocks accurate hit-rate accounting and fixes the prior skew.
internal/usage/usage.go (1)
74-81: Correct: count hits/misses only when cache was checked

This fixes the prior miscounting of regenerations as misses.
🧹 Nitpick comments (5)
cmd/cli/createMsg.go (2)
398-407: Prefer provider-call latency; zero tokens/cost on failures
- GenerationTime currently includes pre-call overhead. Measure just provider latency for accuracy.
- When err != nil, set TokensUsed and Cost to 0 to avoid inflating totals.
Apply:
```diff
-	// Generate new message
-	message, err := provider.Generate(ctx, changes, opts)
-	generationTime := float64(time.Since(startTime).Nanoseconds()) / 1e6 // Convert to milliseconds
+	// Generate new message and measure provider latency only
+	genStart := time.Now()
+	message, err := provider.Generate(ctx, changes, opts)
+	generationTime := float64(time.Since(genStart).Nanoseconds()) / 1e6 // ms
@@
-	// Estimate tokens and cost
-	inputTokens := estimateTokens(types.BuildCommitPrompt(changes, opts))
-	outputTokens := 100 // Estimate output tokens
-	cost := estimateCost(providerType, inputTokens, outputTokens)
+	// Estimate tokens and cost (only on success)
+	var inputTokens, outputTokens int
+	var cost float64
+	if err == nil {
+		inputTokens = estimateTokens(types.BuildCommitPrompt(changes, opts))
+		outputTokens = 100 // rough estimate
+		cost = estimateCost(providerType, inputTokens, outputTokens)
+	}
@@
-	TokensUsed: inputTokens + outputTokens,
-	Cost:       cost,
+	TokensUsed: inputTokens + outputTokens,
+	Cost:       cost,
```

Also applies to: 409-417
379-383: Unify warning logs with pterm.Warning for consistency

Use pterm's warning style instead of fmt.Printf.
```diff
-	fmt.Printf("Warning: Failed to record usage statistics: %v\n", err)
+	pterm.Warning.Printf("Failed to record usage statistics: %v\n", err)
@@
-	fmt.Printf("Warning: Failed to record usage statistics: %v\n", statsErr)
+	pterm.Warning.Printf("Failed to record usage statistics: %v\n", statsErr)
@@
-	fmt.Printf("Warning: Failed to cache message: %v\n", cacheErr)
+	pterm.Warning.Printf("Failed to cache message: %v\n", cacheErr)
```

Also applies to: 414-417, 426-429
internal/usage/usage.go (3)
56-73: Honor event.Timestamp if provided (fall back to now)

Keeps persisted first/last usage aligned with the event source time, not the local clock.
```diff
-	now := time.Now().UTC().Format(time.RFC3339)
+	now := time.Now().UTC().Format(time.RFC3339)
+	// Prefer the event timestamp when valid
+	ts := now
+	if t := strings.TrimSpace(event.Timestamp); t != "" {
+		if _, err := time.Parse(time.RFC3339, t); err == nil {
+			ts = t
+		}
+	}
@@
-	sm.stats.LastUse = now
+	sm.stats.LastUse = ts
@@
-	if sm.stats.FirstUse == "" {
-		sm.stats.FirstUse = now
+	if sm.stats.FirstUse == "" {
+		sm.stats.FirstUse = ts
@@
-	sm.stats.ProviderStats[event.Provider] = &types.ProviderStats{
-		Name:      event.Provider,
-		FirstUsed: now,
-	}
+	sm.stats.ProviderStats[event.Provider] = &types.ProviderStats{
+		Name:      event.Provider,
+		FirstUsed: ts,
+	}
@@
-	providerStats.LastUsed = now
+	providerStats.LastUsed = ts
```

Add `import "strings"` to the top import block (to support TrimSpace) if not already present.
261-268: Use sort.Slice for simplicity and readability

Current nested loops are fine for small N, but sort.Slice is clearer.
```diff
-	// Sort by usage count (descending)
-	for i := 0; i < len(rankings)-1; i++ {
-		for j := i + 1; j < len(rankings); j++ {
-			if rankings[j].uses > rankings[i].uses {
-				rankings[i], rankings[j] = rankings[j], rankings[i]
-			}
-		}
-	}
+	// Sort by usage count (descending)
+	sort.Slice(rankings, func(i, j int) bool { return rankings[i].uses > rankings[j].uses })
```

Don't forget to add `import "sort"`.

Also applies to: 270-276
83-86: Averages include cached events; consider excluding them for latency metrics

Including cached hits (often ~0 ms) can skew AverageGenerationTime down. Optionally compute a separate ProviderLatency average excluding cached events (e.g., when event.TokensUsed > 0).
Would you like a follow-up patch to add a ProviderLatencyMs metric derived only from non-cached events?
Also applies to: 107-110
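A possible shape for that follow-up, sketched under the assumption that cached events are identifiable by a flag (the real events carry more fields, and the name avgProviderLatency is hypothetical):

```go
package main

import "fmt"

// event is a trimmed stand-in for the real GenerationEvent.
type event struct {
	cacheHit bool
	latency  float64 // milliseconds
}

// avgProviderLatency averages generation time over non-cached events
// only, so near-zero cache hits don't drag the mean down. Returns 0
// when there are no provider calls to average.
func avgProviderLatency(events []event) float64 {
	var sum float64
	var n int
	for _, e := range events {
		if e.cacheHit {
			continue
		}
		sum += e.latency
		n++
	}
	if n == 0 {
		return 0
	}
	return sum / float64(n)
}

func main() {
	evts := []event{{true, 0.2}, {false, 120}, {false, 80}}
	fmt.Println(avgProviderLatency(evts)) // the cached event is ignored
}
```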
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- cmd/cli/createMsg.go (1 hunks)
- internal/usage/usage.go (1 hunks)
- pkg/types/types.go (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
- pkg/types/types.go (2)
  - cmd/cli/store/store.go (1): LLMProvider (58-61)
  - internal/llm/provider.go (1): Provider (24-29)
- cmd/cli/createMsg.go (3)
  - pkg/types/types.go (2): GenerationEvent (175-185), Message (80-83)
  - internal/llm/provider.go (1): Provider (24-29)
  - pkg/types/prompt.go (1): BuildCommitPrompt (30-51)
- internal/usage/usage.go (4)
  - pkg/types/types.go (4): UsageStats (146-158), ProviderStats (161-172), LLMProvider (5-5), GenerationEvent (175-185)
  - utils/storeUtils.go (1): GetConfigPath (30-68)
  - cmd/cli/store/store.go (1): LLMProvider (58-61)
  - internal/llm/provider.go (1): Provider (24-29)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Test Build
- GitHub Check: Build Go Binary (windows-latest)
🔇 Additional comments (4)
pkg/types/types.go (2)
145-158: UsageStats schema looks solid

Fields match the aggregator logic; JSON tags are consistent.
160-173: ProviderStats design is coherent

Counters plus averaged timing and success rate are appropriate for per-provider rollups.
cmd/cli/createMsg.go (2)
367-377: Instrumentation on cache hit is correct

Recording CacheHit=true and CacheChecked=true with a timestamp is exactly what's needed.
404-406: Good: CacheChecked reflects actual lookup attempts

Setting CacheChecked to isFirstAttempt avoids counting regenerations as misses.
Approved 🎊
Description
Added comprehensive usage statistics tracking system to monitor LLM provider performance, generation metrics, and user patterns. This enhancement provides insights into which providers are most used, their success rates, average generation times, and token usage patterns.
Type of Change
Related Issue
Fixes #93
Testing
Checklist
For Hacktoberfest Participants
Thank you for your contribution! 🎉
Summary by CodeRabbit