fix: perf batch — model inference bug + 5 perf improvements by microsasa · Pull Request #972 · microsasa/cli-tools

microsasa · 2026-04-18T04:16:39Z

Summary

Fixes a correctness bug in model inference for multi-shutdown sessions, plus five performance improvements identified by the perf-analysis agent.

Changes

Bug fix

[aw][perf] parser: _first_pass calls _infer_model_from_metrics O(N) times for N shutdown events but only the last result is [Content truncated due to length] #931: _first_pass inferred the session model from the last shutdown's metrics ("last wins"), which gave the wrong model for both display and pricing when a session spanned multiple models across resume cycles. Inference is now deferred to _build_completed_summary, where merged metrics across all cycles are available. An explicit currentModel still takes priority.

Performance

[aw][perf] render_detail: _render_shutdown_cycles iterates sd.modelMetrics.values() twice per cycle #903 — Merged two separate sum() passes over sd.modelMetrics.values() into a single loop in _render_shutdown_cycles.
[aw][perf] vscode_parser: _finalize_summary creates redundant dict copies that VSCodeLogSummary.__post_init__ immediately re [Content truncated due to length] #904 — Removed redundant dict() copies in _finalize_summary — was creating dicts from defaultdicts then __post_init__ immediately called dict() again. Now passes defaultdicts directly.
[aw][perf] vscode_parser.discover_vscode_logs: public API bypasses module-level discovery cache #906 — Made public discover_vscode_logs delegate to _cached_discover_vscode_logs instead of running an uncached glob on every call. Updated all 4 platform tests to mock stat() instead of is_dir().
[aw][perf] parser: get_all_sessions rebuilds fingerprint and re-sorts on plan.md-only name changes #911 — Added plan-only fast path in get_all_sessions: when only plan.md changed (no events.jsonl modifications), substitutes updated names into existing sorted order instead of rebuilding the O(n) fingerprint and O(n log n) sort.
[aw][perf] vscode_parser: _update_vscode_summary leaves four accumulator fields as attribute accesses inside the per-request h [Content truncated due to length] #912 — Hoisted total_requests, total_duration_ms, first_timestamp, last_timestamp to locals before the per-request loop in _update_vscode_summary, matching the pattern already used for dict fields.

Won't-fix (closed with explanation)

[aw][perf] vscode_parser: get_vscode_summary stats every log file on every call before checking the global summary cache #898 — get_vscode_summary stats every log file before checking cache. Benchmarked at 0.1–0.2ms for typical file counts. Skipping stats would risk serving stale data when files are modified — freshness over sub-millisecond savings. Added regression test to guard this.
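The freshness trade-off above can be illustrated with a small sketch. This is not the project's code: `file_signature` and `SummaryCache` are hypothetical names, and the line-count "summary" is a placeholder. It only shows why stat-ing every file before consulting the cache lets an append be detected even when the set of discovered files is unchanged.

```python
import tempfile
from pathlib import Path
from typing import Iterable, Optional


def file_signature(paths: Iterable[Path]) -> tuple:
    """Cheap fingerprint: one stat() per file, no file contents read."""
    sig = []
    for p in paths:
        st = p.stat()
        sig.append((str(p), st.st_mtime_ns, st.st_size))
    return tuple(sig)


class SummaryCache:
    """Hypothetical cache that rebuilds only when the signature changes."""

    def __init__(self) -> None:
        self._sig: Optional[tuple] = None
        self._value: int = 0
        self.rebuilds = 0

    def get(self, paths: list[Path]) -> int:
        sig = file_signature(paths)  # always stat first, then consult the cache
        if sig != self._sig:
            # Placeholder "summary": total number of lines across all logs.
            self._value = sum(len(p.read_text().splitlines()) for p in paths)
            self._sig = sig
            self.rebuilds += 1
        return self._value


with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "a.log"
    log.write_text("one\n")
    cache = SummaryCache()
    first = cache.get([log])
    with log.open("a") as fh:  # same file list, new content
        fh.write("two\n")
    second = cache.get([log])
```

Skipping the `stat()` calls would return the cached value for `second`, which is the stale-data risk the won't-fix explanation describes.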

Testing

make check passes: lint ✅, types ✅, security ✅, 99% coverage ✅, 86 e2e ✅
New tests for [aw][perf] parser: _first_pass calls _infer_model_from_metrics O(N) times for N shutdown events but only the last result is [Content truncated due to length] #931: test_dominant_model_wins_over_last_shutdown, test_explicit_current_model_takes_priority
New test for [aw][perf] vscode_parser: get_vscode_summary stats every log file on every call before checking the global summary cache #898 freshness: test_appended_file_detected_without_discovery_change
Updated test for [aw][perf] parser: get_all_sessions rebuilds fingerprint and re-sorts on plan.md-only name changes #911: test_sort_runs_after_plan_change (asserts sort is now skipped)
Updated 4 platform tests for [aw][perf] vscode_parser.discover_vscode_logs: public API bypasses module-level discovery cache #906: mock stat() instead of is_dir()

Review

Reviewed by three adversarial agents (Codex, Opus 4.6, Sonnet 4.6). Sonnet found 3 stale is_dir test mocks — fixed and squashed into #906.

Closes #931
Closes #903
Closes #904
Closes #906
Closes #911
Closes #912

When a session spans multiple shutdown cycles with different models, the session-level model was set to whichever shutdown happened last. Now model inference uses merged metrics across all cycles, picking the model with the highest request count — the actual dominant model. Explicit currentModel from events still takes priority over inference. Removes the per-shutdown _infer_model_from_metrics call from _first_pass (the O(N×M) → O(M) optimization noted in the issue). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
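A minimal sketch of the "dominant model" rule described in this commit, assuming each shutdown cycle can be reduced to a mapping of model name to request count. The function names and the data shape are hypothetical and are not taken from parser.py.

```python
from __future__ import annotations

from collections import Counter
from typing import Iterable, Mapping, Optional


def merge_request_counts(cycles: Iterable[Mapping[str, int]]) -> Counter[str]:
    """Sum per-model request counts across every shutdown cycle."""
    merged: Counter[str] = Counter()
    for cycle in cycles:
        merged.update(cycle)  # Counter.update adds counts for a mapping
    return merged


def infer_session_model(
    cycles: Iterable[Mapping[str, int]],
    current_model: Optional[str] = None,
) -> Optional[str]:
    """Explicit model first; otherwise the model with the most requests."""
    if current_model:
        return current_model
    merged = merge_request_counts(cycles)
    if not merged:
        return None
    # The maximum is taken over merged totals, so the last cycle
    # carries no special weight.
    return max(merged, key=merged.__getitem__)


cycles = [
    {"model-a": 40},
    {"model-a": 25, "model-b": 5},
    {"model-b": 10},  # the last shutdown used model-b only
]
inferred = infer_session_model(cycles)
explicit = infer_session_model(cycles, current_model="model-c")
```

Under "last wins" the example session would have been labelled model-b; with merged totals (65 requests against 15) it is labelled model-a, and an explicit value overrides both.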

_finalize_summary was calling dict() on each accumulator defaultdict before passing to VSCodeLogSummary, whose __post_init__ immediately called dict() again to wrap in MappingProxyType. Pass the defaultdicts directly — __post_init__ handles the single copy. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
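The single-copy pattern can be sketched as follows. `LogSummary` is a hypothetical stand-in for VSCodeLogSummary with one illustrative field; only the `dict()` plus `MappingProxyType` step in `__post_init__` comes from the commit message.

```python
from collections import defaultdict
from dataclasses import dataclass
from types import MappingProxyType
from typing import Mapping


@dataclass(frozen=True)
class LogSummary:  # hypothetical stand-in for VSCodeLogSummary
    requests_by_model: Mapping[str, int]

    def __post_init__(self) -> None:
        # The only copy: snapshot the mapping, then wrap it read-only.
        frozen = MappingProxyType(dict(self.requests_by_model))
        # frozen=True blocks normal assignment, so bypass it here.
        object.__setattr__(self, "requests_by_model", frozen)


acc = defaultdict(int)
acc["model-a"] += 2
acc["model-b"] += 1

summary = LogSummary(requests_by_model=acc)  # no dict(acc) at the call site
acc["model-a"] += 100  # later changes to the accumulator do not leak in
```

Because `__post_init__` already snapshots the mapping into a plain dict, a second `dict()` at the call site would only add an extra allocation per field.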

Merge two separate sum() passes over sd.modelMetrics.values() into one loop that accumulates both total_requests and total_output. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
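For illustration, a single-pass version might look like the sketch below. The `ModelMetrics` shape and its field names are assumptions made for this example; only the idea of accumulating both totals in one loop comes from the commit.

```python
from dataclasses import dataclass


@dataclass
class ModelMetrics:  # hypothetical shape; field names are assumptions
    requests: int
    output_tokens: int


def cycle_totals(model_metrics: dict[str, ModelMetrics]) -> tuple[int, int]:
    """Accumulate both totals while walking the values once."""
    total_requests = 0
    total_output = 0
    for metrics in model_metrics.values():
        total_requests += metrics.requests
        total_output += metrics.output_tokens
    return total_requests, total_output


totals = cycle_totals(
    {
        "model-a": ModelMetrics(requests=3, output_tokens=1200),
        "model-b": ModelMetrics(requests=4, output_tokens=300),
    }
)
```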

The public discover_vscode_logs ran a full multi-level glob on every call. The private _cached_discover_vscode_logs already implemented identical logic with O(1) steady-state caching. Make the public function delegate to the cached version — one function, one code path. Updated test_default_windows_no_appdata to assert on stat() instead of is_dir() since the cached path uses stat + S_ISDIR. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
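A simplified sketch of the delegation pattern, under stated assumptions: the cache here is keyed on the directory's mtime and uses a single-level `*.log` glob, both chosen for brevity. The real function uses a multi-level glob and its cache key is not described in this PR, so the names and the invalidation rule below are hypothetical.

```python
import stat
import tempfile
from pathlib import Path

_discovery_cache: dict[Path, tuple[int, list[Path]]] = {}
glob_runs = {"count": 0}  # instrumentation for the demo only


def _cached_discover(root: Path) -> list[Path]:
    try:
        st = root.stat()  # one call answers both "exists?" and "is a dir?"
    except OSError:
        return []
    if not stat.S_ISDIR(st.st_mode):
        return []
    hit = _discovery_cache.get(root)
    if hit is not None and hit[0] == st.st_mtime_ns:
        return list(hit[1])  # steady state: no glob
    glob_runs["count"] += 1
    found = sorted(root.glob("*.log"))
    _discovery_cache[root] = (st.st_mtime_ns, found)
    return list(found)


def discover(root: Path) -> list[Path]:
    """Public entry point: delegates, so there is a single code path."""
    return _cached_discover(root)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.log").write_text("x")
    first = discover(root)
    second = discover(root)  # served from the cache
    missing = discover(root / "does-not-exist")
```

Since the directory check goes through `stat()` and `S_ISDIR` rather than `is_dir()`, tests that mock the filesystem need to patch `stat()`, which matches the test update mentioned in the commit.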

When only plan.md changes (no events.jsonl modifications), the sort key (start_time) is unaffected. Add a plan-only fast path that substitutes updated session names into the existing sorted order, skipping both the O(n) fingerprint allocation and O(n log n) sort. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
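The fast path can be sketched as a name substitution over an already-sorted list. `Session` is a hypothetical stand-in for SessionSummary and `apply_plan_renames` is an illustrative helper, not a function from parser.py.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Session:  # hypothetical stand-in for SessionSummary
    session_id: str
    name: str
    start_time: datetime


def apply_plan_renames(
    sorted_sessions: list[Session], new_names: dict[str, str]
) -> list[Session]:
    """Keep the existing order; only swap in updated names (no sort call)."""
    return [
        replace(s, name=new_names[s.session_id]) if s.session_id in new_names else s
        for s in sorted_sessions
    ]


base = datetime(2026, 4, 17)
ordered = [  # assume this list was sorted by start_time on an earlier call
    Session("s2", "old-b", base + timedelta(hours=2)),
    Session("s1", "old-a", base + timedelta(hours=1)),
]
renamed = apply_plan_renames(ordered, {"s1": "new-a"})
```

This is safe only because the sort key (start_time) does not depend on the name, which is the condition the commit message states.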

…ary (#912) total_requests, total_duration_ms, first_timestamp, and last_timestamp were accessed as acc.field inside the per-request loop (LOAD_ATTR). Hoist to locals before the loop and write back after, matching the pattern already used for the dict fields. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
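The hoist-and-write-back pattern looks roughly like this. The four field names come from the commit message; the `Accumulator` class, the `update` function, and the `(timestamp, duration_ms)` request shape are assumptions for the sketch.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Accumulator:  # hypothetical; field names follow the commit message
    total_requests: int = 0
    total_duration_ms: int = 0
    first_timestamp: Optional[float] = None
    last_timestamp: Optional[float] = None


def update(acc: Accumulator, requests: Iterable[tuple[float, int]]) -> None:
    # Hoist: read each attribute once, before the loop.
    total_requests = acc.total_requests
    total_duration_ms = acc.total_duration_ms
    first_ts = acc.first_timestamp
    last_ts = acc.last_timestamp

    for ts, duration_ms in requests:
        total_requests += 1
        total_duration_ms += duration_ms
        if first_ts is None or ts < first_ts:
            first_ts = ts
        if last_ts is None or ts > last_ts:
            last_ts = ts

    # Write back: store each attribute once, after the loop.
    acc.total_requests = total_requests
    acc.total_duration_ms = total_duration_ms
    acc.first_timestamp = first_ts
    acc.last_timestamp = last_ts


acc = Accumulator()
update(acc, [(10.0, 5), (3.0, 7), (12.0, 1)])
```

One caveat of this pattern: if the loop body can raise, the write-back is skipped and the accumulator keeps its previous values, so it fits best where a partial update is not needed.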

Copilot

Pull request overview

Fixes incorrect session model inference for multi-shutdown/resumed sessions and applies several targeted performance optimizations in hot paths (session parsing, VS Code log discovery/summarization, and detail rendering).

Changes:

Fix model inference for multi-shutdown sessions by deferring inference to merged shutdown metrics in _build_completed_summary (while keeping explicit currentModel highest priority).
Reduce per-call overhead via multiple micro-optimizations (single-pass shutdown-cycle totals, fewer dict copies when finalizing VS Code summaries, local-variable hoisting in the VS Code aggregation loop).
Improve caching behavior (public discover_vscode_logs now uses the discovery cache; add a plan.md-only fast path in get_all_sessions to avoid unnecessary sort/fingerprint work).

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

File	Description
`src/copilot_usage/parser.py`	Fixes multi-shutdown model inference and adds plan-only cache fast path in `get_all_sessions`.
`src/copilot_usage/vscode_parser.py`	Makes public discovery cached; reduces overhead in summary aggregation and finalization.
`src/copilot_usage/render_detail.py`	Avoids double iteration over per-cycle model metrics when rendering shutdown cycles.
`tests/copilot_usage/test_parser.py`	Adds/updates tests for multi-shutdown model inference and plan-only sort skipping.
`tests/copilot_usage/test_vscode_parser.py`	Updates platform discovery mocks to `stat()` and adds regression test for cache freshness on file append.

Comments suppressed due to low confidence (1)

tests/copilot_usage/test_parser.py:9130

The docstring for test_sort_runs_after_plan_change still describes a plan.md change forcing a fresh sort via the not deferred_sessions guard, but the updated assertion expects session_sort_key to NOT be called. Please update the docstring and/or rename the test to reflect the new plan-only fast-path behavior.


        sort_key_calls: list[int] = []

        def tracking_key(session: SessionSummary) -> datetime:

Copilot AI review requested due to automatic review settings April 18, 2026 04:16

Copilot started reviewing on behalf of microsasa April 18, 2026 04:17

Sasa Junuzovic and others added 6 commits April 17, 2026 21:17

fix: single-pass modelMetrics iteration in _render_shutdown_cycles (#903)

0e6db81

microsasa force-pushed the fix/perf-batch branch from 863f8aa to 0641706 April 18, 2026 04:17

microsasa merged commit a516f29 into main Apr 18, 2026
4 checks passed

microsasa deleted the fix/perf-batch branch April 18, 2026 04:19

Copilot AI reviewed Apr 18, 2026
