fix: perf batch — model inference bug + 5 perf improvements#972
Merged
fix: perf batch — model inference bug + 5 perf improvements#972
Conversation
When a session spans multiple shutdown cycles with different models, the session-level model was set to whichever shutdown happened last. Now model inference uses merged metrics across all cycles, picking the model with the highest request count — the actual dominant model. Explicit currentModel from events still takes priority over inference. Removes the per-shutdown _infer_model_from_metrics call from _first_pass (the O(N×M) → O(M) optimization noted in the issue). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
_finalize_summary was calling dict() on each accumulator defaultdict before passing to VSCodeLogSummary, whose __post_init__ immediately called dict() again to wrap in MappingProxyType. Pass the defaultdicts directly — __post_init__ handles the single copy. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The public discover_vscode_logs ran a full multi-level glob on every call. The private _cached_discover_vscode_logs already implemented identical logic with O(1) steady-state caching. Make the public function delegate to the cached version — one function, one code path. Updated test_default_windows_no_appdata to assert on stat() instead of is_dir() since the cached path uses stat + S_ISDIR. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When only plan.md changes (no events.jsonl modifications), the sort key (start_time) is unaffected. Add a plan-only fast path that substitutes updated session names into the existing sorted order, skipping both the O(n) fingerprint allocation and O(n log n) sort. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ary (#912) total_requests, total_duration_ms, first_timestamp, and last_timestamp were accessed as acc.field inside the per-request loop (LOAD_ATTR). Hoist to locals before the loop and write back after, matching the pattern already used for the dict fields. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
863f8aa to
0641706
Compare
There was a problem hiding this comment.
Pull request overview
Fixes incorrect session model inference for multi-shutdown/resumed sessions and applies several targeted performance optimizations in hot paths (session parsing, VS Code log discovery/summarization, and detail rendering).
Changes:
- Fix model inference for multi-shutdown sessions by deferring inference to merged shutdown metrics in
_build_completed_summary(while keeping explicitcurrentModelhighest priority). - Reduce per-call overhead via multiple micro-optimizations (single-pass shutdown-cycle totals, fewer dict copies when finalizing VS Code summaries, local-variable hoisting in the VS Code aggregation loop).
- Improve caching behavior (public
discover_vscode_logsnow uses the discovery cache; add a plan.md-only fast path inget_all_sessionsto avoid unnecessary sort/fingerprint work).
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
src/copilot_usage/parser.py |
Fixes multi-shutdown model inference and adds plan-only cache fast path in get_all_sessions. |
src/copilot_usage/vscode_parser.py |
Makes public discovery cached; reduces overhead in summary aggregation and finalization. |
src/copilot_usage/render_detail.py |
Avoids double iteration over per-cycle model metrics when rendering shutdown cycles. |
tests/copilot_usage/test_parser.py |
Adds/updates tests for multi-shutdown model inference and plan-only sort skipping. |
tests/copilot_usage/test_vscode_parser.py |
Updates platform discovery mocks to stat() and adds regression test for cache freshness on file append. |
Comments suppressed due to low confidence (1)
tests/copilot_usage/test_parser.py:9130
- The docstring for
test_sort_runs_after_plan_changestill describes a plan.md change forcing a fresh sort via thenot deferred_sessionsguard, but the updated assertion expectssession_sort_keyto NOT be called. Please update the docstring and/or rename the test to reflect the new plan-only fast-path behavior.
sort_key_calls: list[int] = []
def tracking_key(session: SessionSummary) -> datetime:
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes a correctness bug in model inference for multi-shutdown sessions, plus five performance improvements identified by the perf-analysis agent.
Changes
Bug fix
_first_passcalls_infer_model_from_metricsO(N) times for N shutdown events but only the last result is [Content truncated due to length] #931 —_first_passinferred session model from the last shutdown's metrics ("last wins"), which gave wrong model for display AND pricing when a session spanned multiple models across resume cycles. Now defers to_build_completed_summarywhere merged metrics across all cycles are available. ExplicitcurrentModelstill takes priority.Performance
_render_shutdown_cyclesiteratessd.modelMetrics.values()twice per cycle #903 — Merged two separatesum()passes oversd.modelMetrics.values()into a single loop in_render_shutdown_cycles._finalize_summarycreates redundant dict copies thatVSCodeLogSummary.__post_init__immediately re [Content truncated due to length] #904 — Removed redundantdict()copies in_finalize_summary— was creating dicts from defaultdicts then__post_init__immediately calleddict()again. Now passes defaultdicts directly.discover_vscode_logsdelegate to_cached_discover_vscode_logsinstead of running an uncached glob on every call. Updated all 4 platform tests to mockstat()instead ofis_dir().get_all_sessionsrebuilds fingerprint and re-sorts on plan.md-only name changes #911 — Added plan-only fast path inget_all_sessions: when onlyplan.mdchanged (noevents.jsonlmodifications), substitutes updated names into existing sorted order instead of rebuilding the O(n) fingerprint and O(n log n) sort._update_vscode_summaryleaves four accumulator fields as attribute accesses inside the per-request h [Content truncated due to length] #912 — Hoistedtotal_requests,total_duration_ms,first_timestamp,last_timestampto locals before the per-request loop in_update_vscode_summary, matching the pattern already used for dict fields.Won't-fix (closed with explanation)
get_vscode_summarystats every log file on every call before checking the global summary cache #898 —get_vscode_summarystats every log file before checking cache. Benchmarked at 0.1–0.2ms for typical file counts. Skipping stats would risk serving stale data when files are modified — freshness over sub-millisecond savings. Added regression test to guard this.Testing
make checkpasses: lint ✅, types ✅, security ✅, 99% coverage ✅, 86 e2e ✅_first_passcalls_infer_model_from_metricsO(N) times for N shutdown events but only the last result is [Content truncated due to length] #931:test_dominant_model_wins_over_last_shutdown,test_explicit_current_model_takes_priorityget_vscode_summarystats every log file on every call before checking the global summary cache #898 freshness:test_appended_file_detected_without_discovery_changeget_all_sessionsrebuilds fingerprint and re-sorts on plan.md-only name changes #911:test_sort_runs_after_plan_change(asserts sort is now skipped)stat()instead ofis_dir()Review
Reviewed by three adversarial agents (Codex, Opus 4.6, Sonnet 4.6). Sonnet found 3 stale
is_dirtest mocks — fixed and squashed into #906.Closes #931
Closes #903
Closes #904
Closes #906
Closes #911
Closes #912