Fix 100% one-shot rate for Gemini/Mistral/Kiro/Goose #353

Closed
iamtoruk wants to merge 3 commits into main from fix/oneshot-rate-detection

@iamtoruk
Member

Summary

  • Root cause: Gemini, Mistral Vibe, Kiro, and Goose providers aggregated all tool calls into a single ParsedProviderCall per session, deduplicating via new Set(). The retry detector (countRetries) needs to see the sequential Edit→Bash→Edit pattern across multiple calls — with only 1 call, it always returned 0 retries → 100% one-shot rate.
  • Fix: Added toolSequence?: string[][] field that preserves per-assistant-message tool ordering through the full pipeline (ParsedProviderCall → CachedCall → ParsedApiCall). The classifier expands toolSequence steps alongside regular assistantCalls, so retry patterns are detected even from aggregated provider calls.
  • Also: Insight pill switcher now scrolls horizontally instead of wrapping/truncating labels.
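
To illustrate the root cause and the fix described above, here is a minimal sketch, not the code in this PR. The `ParsedCall` shape, the `EDIT_TOOLS`/`BASH_TOOLS` sets, the `expandSteps` helper, and the exact retry heuristic are assumptions made for the example; only the `toolSequence?: string[][]` field and the `countRetries` name come from the summary.

```typescript
// Hypothetical, simplified call shape; the real types carry more fields.
interface ParsedCall {
  tools: string[];           // tool names for the call (deduplicated when aggregated)
  toolSequence?: string[][]; // optional: ordered tools per assistant message
}

// Assumed tool categories, for illustration only.
const EDIT_TOOLS = new Set<string>(["Edit", "Write"]);
const BASH_TOOLS = new Set<string>(["Bash"]);

// Expand calls into ordered steps: an aggregated call contributes one step per
// toolSequence entry, a regular call contributes a single step.
function expandSteps(calls: ParsedCall[]): string[][] {
  const steps: string[][] = [];
  for (const call of calls) {
    if (call.toolSequence && call.toolSequence.length > 0) {
      for (const step of call.toolSequence) steps.push(step);
    } else {
      steps.push(call.tools);
    }
  }
  return steps;
}

// Assumed heuristic: an edit that follows an earlier Edit -> Bash counts as one retry.
function countRetries(calls: ParsedCall[]): number {
  let sawEdit = false;
  let sawBashAfterEdit = false;
  let retries = 0;
  for (const step of expandSteps(calls)) {
    const hasEdit = step.some((t) => EDIT_TOOLS.has(t));
    const hasBash = step.some((t) => BASH_TOOLS.has(t));
    if (hasEdit) {
      if (sawBashAfterEdit) retries++;
      sawEdit = true;
      sawBashAfterEdit = false;
    }
    if (hasBash && sawEdit) sawBashAfterEdit = true;
  }
  return retries;
}

// One aggregated call whose tools went through new Set(): ordering is gone.
const aggregated: ParsedCall = {
  tools: Array.from(new Set(["Edit", "Bash", "Edit"])),
};

// The same session with per-message ordering preserved.
const withSequence: ParsedCall = {
  tools: aggregated.tools,
  toolSequence: [["Edit"], ["Bash"], ["Edit"]],
};
```

Under these assumptions, `countRetries([aggregated])` cannot see a retry because the single step contains each tool only once, while `countRetries([withSequence])` sees the Edit→Bash→Edit order and reports one.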

Changed files

| Area | Files |
| --- | --- |
| Types | providers/types.ts, session-cache.ts, types.ts |
| Pipeline | parser.ts (carry-through) |
| Classifier | classifier.ts (expand toolSequence in countRetries) |
| Providers | gemini.ts, mistral-vibe.ts, kiro.ts, goose.ts |
| UI | HeatmapSection.swift (ScrollView pills) |
| Tests | classifier.test.ts (+5 new retry tests) |

Test plan

  • 865 tests pass (60 files), including 5 new retry-detection tests
  • Swift builds cleanly
  • Verify Gemini/Mistral one-shot rates change for users with actual retry sessions
  • Verify insight pills scroll without wrapping on narrow popovers

iamtoruk added 3 commits May 18, 2026 05:49
SwiftUI MenuBarExtra with litellm-snapshot pricing, Claude/Codex/Copilot
parsers, session discovery, auto-refresh timer, and dashboard UI matching
the real menubar design.
Private Mac App Store build, not for the public repo.
These providers aggregated all tool calls into a single ParsedProviderCall
per session, losing the sequential Edit→Bash→Edit signal that countRetries
needs. Added toolSequence field that preserves per-message tool ordering
through the pipeline. Also makes insight pills horizontally scrollable.
@ozymandiashh
Contributor

Thanks for jumping on this. I opened #352 from the same #336/#351 thread, so we now have two overlapping approaches.

The path in #352 keeps Gemini’s per-assistant-message calls from #340 and groups them back under the user turn for retry classification. The behavior proof there is provider-level: a synthetic Gemini Edit -> Bash -> Edit sequence now parses as one turn with retries = 1 and oneShotTurns = 0. For Vibe it also uses meta.json.stats.session_cost when present, since current Vibe logs do not expose cache token fields.
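
The turn-grouping idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from #352: the `MessageCall` shape, the `turnId` field name (borrowed from the closing comment on this PR), the helper names, and the definition of a one-shot turn as "has edits and zero retries" are all hypothetical.

```typescript
// Hypothetical per-assistant-message call, tagged with the user turn it belongs to.
interface MessageCall {
  turnId: string;
  tools: string[];
}

interface TurnSummary {
  turns: number;
  retries: number;
  oneShotTurns: number;
}

// Group per-message calls under their user turn, preserving message order.
function groupByTurn(calls: MessageCall[]): Map<string, MessageCall[]> {
  const turns = new Map<string, MessageCall[]>();
  for (const call of calls) {
    const bucket = turns.get(call.turnId);
    if (bucket) bucket.push(call);
    else turns.set(call.turnId, [call]);
  }
  return turns;
}

// Assumed heuristic: an Edit that follows an earlier Edit -> Bash is one retry.
function retriesInTurn(calls: MessageCall[]): number {
  let sawEdit = false;
  let sawBashAfterEdit = false;
  let retries = 0;
  for (const call of calls) {
    const hasEdit = call.tools.indexOf("Edit") !== -1;
    const hasBash = call.tools.indexOf("Bash") !== -1;
    if (hasEdit) {
      if (sawBashAfterEdit) retries++;
      sawEdit = true;
      sawBashAfterEdit = false;
    }
    if (hasBash && sawEdit) sawBashAfterEdit = true;
  }
  return retries;
}

// Classify each turn: a turn with edits and no retries counts as one-shot.
function summarizeTurns(calls: MessageCall[]): TurnSummary {
  const summary: TurnSummary = { turns: 0, retries: 0, oneShotTurns: 0 };
  groupByTurn(calls).forEach((turnCalls) => {
    const retries = retriesInTurn(turnCalls);
    const hasEdit = turnCalls.some((c) => c.tools.indexOf("Edit") !== -1);
    summary.turns += 1;
    summary.retries += retries;
    if (hasEdit && retries === 0) summary.oneShotTurns += 1;
  });
  return summary;
}

// Synthetic sequence matching the behavior proof described above.
const syntheticTurn: MessageCall[] = [
  { turnId: "t1", tools: ["Edit"] },
  { turnId: "t1", tools: ["Bash"] },
  { turnId: "t1", tools: ["Edit"] },
];
```

With this sketch, `summarizeTurns(syntheticTurn)` treats the three messages as one turn with one retry and no one-shot turns, which matches the provider-level result quoted above.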

Your toolSequence approach looks useful for providers that still only have aggregate session-level calls, especially Kiro/Goose. I think the cleanest path may be:

  • keep the Gemini/Vibe turn-grouping fix from #352 (Fix Gemini and Vibe one-shot rates)
  • extract or adapt the Kiro/Goose toolSequence part from this PR as a focused follow-up
  • keep the menubar pill scrolling change separate from the parser fix

Happy to rebase/adapt #352 or help split this PR, whichever direction you prefer.

@iamtoruk
Member Author

Superseded by #355, which combines the toolSequence approach from this PR (for Kiro/Goose) with the turnId grouping from #352 (for Gemini/Vibe).

@iamtoruk iamtoruk closed this May 18, 2026
2 participants