[aw][code health] session command does O(n) full parse to find one session by ID prefix #138

@microsasa

Description

Root Cause

The session command in cli.py iterates over every session path returned by discover_sessions(), fully parsing each events.jsonl and building a SessionSummary, only to check whether s.session_id.startswith(session_id):

# cli.py ~lines 335–345
for events_path in event_paths:
    events = parse_events(events_path)        # full parse
    if not events:
        continue
    s = build_session_summary(events, ...)    # full summary build
    if s.session_id.startswith(session_id):   # only then check prefix
        render_session_detail(events, s)
        return
    if s.session_id:
        available.append(s.session_id[:8])

In real usage, session directories are named with the session UUID (e.g. 0faecbdf-b889-4bca-a51a-5254f5488cb6/events.jsonl), so the directory name is the session ID. A user looking up session 0faecbdf triggers a full parse of every session file that precedes the match in iteration order, even though the directory name alone already identifies the match.

events.jsonl files grow with every interaction — sessions with hundreds of tool calls can be hundreds of KB. A user with 50 sessions doing copilot-usage session (prefix) incurs O(n) parse work instead of O(1).

Spec

In the session command body, add a fast pre-filter based on directory name:

for events_path in event_paths:
    # Fast path: session dirs are UUID-named in real usage, so the
    # directory name can rule out a match before any parsing happens.
    dir_name = events_path.parent.name
    if not dir_name.startswith(session_id):
        # Caveat: non-UUID dirs (e.g. fixtures) may hold a session ID that
        # differs from the directory name, so they still need a full scan.
        # Skipping is only safe when dir_name itself rules out a match.
        pass  # first draft: still parses below; refined in the variants that follow

Cleaner implementation: skip parsing any session whose directory name does NOT start with the given prefix AND whose directory name looks like a UUID (i.e. contains - and is 36 chars). For non-UUID directory names, fall through to the existing full-parse path.
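The UUID-shape check described above could be sketched as follows. The helper names looks_like_uuid and can_skip are hypothetical (they do not exist in cli.py), and the shape test is only the heuristic stated in this issue (36 characters, contains a hyphen), not a strict UUID validation.

```python
from pathlib import Path

UUID_LENGTH = 36  # canonical 8-4-4-4-12 form, hyphens included


def looks_like_uuid(name: str) -> bool:
    """Heuristic from this issue: 36 characters and contains a hyphen."""
    return len(name) == UUID_LENGTH and "-" in name


def can_skip(events_path: Path, session_id: str) -> bool:
    """Return True when the directory name alone rules out a prefix match.

    Non-UUID-shaped directories are never skipped, so they fall through
    to the existing full-parse path.
    """
    dir_name = events_path.parent.name
    return looks_like_uuid(dir_name) and not dir_name.startswith(session_id)
```

In the loop, `if can_skip(events_path, session_id): continue` would sit directly above the parse_events call.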

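Alternatively (simpler): skip a path immediately when len(session_id) >= 4 and events_path.parent.name.startswith(session_id) is False. This never skips a legitimate match as long as each directory name equals the session ID stored in its events.jsonl. A directory whose name differs from its session ID (for example a fixture) would be skipped even if its session ID matches, so the fixture tests need to confirm this case does not arise.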

for events_path in event_paths:
    # Skip directories that clearly cannot match the given prefix.
    if len(session_id) >= 4 and not events_path.parent.name.startswith(session_id):
        continue  # fast pre-filter; dirs whose name starts with the prefix are still parsed
    events = parse_events(events_path)
    ...

The len(session_id) >= 4 guard disables the pre-filter for very short prefixes, so those lookups fall back to the full scan instead of risking a wrong skip because of how a directory happens to be named.

Note: the available list used in the error path must still cover all sessions, including those the pre-filter skipped. If the loop finds no match, run a second (lightweight) pass to collect the available IDs for the error message, or populate the list alongside the pre-filter.
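One way to keep the error message complete is sketched below. It is not the actual cli.py code: find_session and load_session_id are hypothetical names, with load_session_id standing in for the expensive parse_events plus build_session_summary step. The second pass takes short IDs from directory names, which assumes the directory name is the session ID.

```python
from pathlib import Path
from typing import Callable, Iterable, Optional


def find_session(
    event_paths: Iterable[Path],
    session_id: str,
    load_session_id: Callable[[Path], Optional[str]],
) -> tuple[Optional[Path], list[str]]:
    """Return (matching path, short IDs to show when nothing matched).

    load_session_id is a hypothetical stand-in for the full parse; it
    returns None for empty or unparseable sessions.
    """
    skipped: list[Path] = []
    available: list[str] = []

    # Pass 1: parse only directories the pre-filter cannot rule out.
    for path in event_paths:
        if len(session_id) >= 4 and not path.parent.name.startswith(session_id):
            skipped.append(path)
            continue
        sid = load_session_id(path)
        if sid and sid.startswith(session_id):
            return path, []
        if sid:
            available.append(sid[:8])

    # Pass 2 (error path only): no match was found, so add the skipped
    # sessions without parsing them. Assumes directory name == session ID.
    available.extend(path.parent.name[:8] for path in skipped)
    return None, available
```

The lightweight pass runs only when the lookup fails, so a successful lookup parses just the directories the pre-filter lets through.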

Acceptance Criteria

  • copilot-usage session (prefix) resolves correctly for UUID-named session directories (most common case)
  • No regression for non-UUID directory names (e.g. empty-session, corrupt-session fixtures)
  • Unit test: mock parse_events and verify it is only called for the directory matching the prefix when the prefix is ≥ 4 chars and there are ≥ 5 sessions
  • Existing TestSessionE2E tests still pass
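The unit-test criterion above might look roughly like this. Because the real command is not reproduced in this issue, the loop is mirrored in a local stand-in (session_lookup, a hypothetical name) that takes parse_events as a parameter. A real test would instead patch parse_events where cli.py looks it up. All UUIDs except the first are made-up placeholders.

```python
from pathlib import Path
from unittest.mock import Mock


def session_lookup(event_paths, session_id, parse_events):
    """Hypothetical stand-in for the loop in the session command."""
    for events_path in event_paths:
        if len(session_id) >= 4 and not events_path.parent.name.startswith(session_id):
            continue  # pre-filter: directory name rules out a match
        events = parse_events(events_path)
        if events:
            return events_path
    return None


def test_parse_events_called_only_for_matching_dir():
    dir_names = [
        "11111111-1111-4111-8111-111111111111",
        "22222222-2222-4222-8222-222222222222",
        "0faecbdf-b889-4bca-a51a-5254f5488cb6",  # the session being looked up
        "33333333-3333-4333-8333-333333333333",
        "44444444-4444-4444-8444-444444444444",
    ]
    paths = [Path("sessions") / name / "events.jsonl" for name in dir_names]
    parse_events = Mock(return_value=["placeholder-event"])

    result = session_lookup(paths, "0faecbdf", parse_events)

    assert result == paths[2]
    # Five sessions exist, but only the matching directory is parsed.
    parse_events.assert_called_once_with(paths[2])
```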

Generated by Code Health Analysis

Labels: aw (Created by agentic workflow), aw-dispatched (Issue has been dispatched to implementer), code-health (Code cleanup and maintenance)
