Slow derive for large traces: ~335ms for a 1737-step Claude session #53

@eliothedeman

Description

Observation

On the desktop app (toolpath-desktop), deriving a single ~1737-step / 609-turn Claude session takes ~335ms of Rust-side work, dominating total click-to-painted latency.

Measured end-to-end using the perf tracer (frontend/src/lib/perf.svelte.ts) in the main window's Select → flow:

derive claude  (total 472ms)
  dispatch                0.0ms  (+0.0ms)
  invoke-start            0.0ms  (+0.0ms)
  invoke-end            335.0ms  (+335.0ms)   ← Rust derive
  model-updated         335.0ms  (+0.0ms)
  buildTree             360.0ms  (+25.0ms)
  buildTree cache-hit   376.0ms  (+16.0ms)
  flattenChatHead       428.0ms  (+52.0ms)
  preview-mounted       470.0ms  (+42.0ms)
  dom-painted           471.0ms  (+1.0ms)

The JS side is ~136ms (already optimized with WeakMap memos for buildTree / flattenChatHead). The Rust derive is 71% of total latency at ~193µs/step.

Scope

The Tauri IPC command is derive_claude, which calls:

  1. toolpath_claude::ClaudeConvo::read_conversation(project, session_id) — reads + merges JSONL segments
  2. toolpath_claude::derive::derive_path(&convo, &config) — maps entries → steps

We don't know yet which of those two dominates. First step is to add profiling marks inside each to see whether time is spent on JSONL parsing, ConversationView assembly, or the step-construction mapping itself.
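Those phase marks could look something like the sketch below (a minimal illustration — `PhaseTimer` and the phase labels are hypothetical, not an existing toolpath API; the real version would want to feed into whatever tracing the desktop app already uses):

```rust
use std::time::Instant;

// Hypothetical helper for splitting derive_claude into timed phases.
// Each mark() records the time elapsed since the previous mark.
struct PhaseTimer {
    start: Instant,
    last: Instant,
    marks: Vec<(String, u128)>,
}

impl PhaseTimer {
    fn new() -> Self {
        let now = Instant::now();
        Self { start: now, last: now, marks: Vec::new() }
    }

    fn mark(&mut self, label: &str) {
        let now = Instant::now();
        self.marks.push((label.to_string(), (now - self.last).as_micros()));
        self.last = now;
    }

    fn report(&self) -> String {
        let total = (self.last - self.start).as_micros();
        let mut out = format!("derive total {}us\n", total);
        for (label, us) in &self.marks {
            out.push_str(&format!("  {:<16} {}us\n", label, us));
        }
        out
    }
}
```

In `derive_claude` this would be `mark("read")` after `read_conversation`, `mark("view_build")` after assembling the `ConversationView`, `mark("step_map")` after `derive_path`, with the report logged or returned alongside the result.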

Likely suspects worth checking once we have sub-step timing:

  • Per-step allocations (actor strings, change-artifact keys) — lots of small String ownership transitions
  • git_head_content shell-outs for file-diff before-state (one per file artifact on a step that touches files; could batch or cache)
  • Diff generation for raw perspectives
  • Markdown rendering of text content (if any happens Rust-side)
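On the first suspect: if each derived step currently takes ownership of fresh `String`s for actor names and artifact keys, interning them as `Arc<str>` turns per-step clones into refcount bumps. A sketch of the idea (the `Step` struct and `derive_steps` here are illustrative, not the real toolpath types):

```rust
use std::sync::Arc;

// Illustrative step type: the actor name is shared rather than
// cloned into an owned String per step.
#[derive(Clone)]
struct Step {
    actor: Arc<str>,
}

// Intern the actor name once; every step shares the same allocation.
fn derive_steps(actor_name: &str, n: usize) -> Vec<Step> {
    let actor: Arc<str> = Arc::from(actor_name);
    (0..n).map(|_| Step { actor: actor.clone() }).collect()
}
```

Whether this matters depends on what profiling shows — at ~193µs/step there is room for allocation churn, but it could equally be dominated by I/O.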

Possible approaches (to evaluate once profiled)

  • Profile first. Add perf_mark-equivalent timing inside derive_path to split by phase (read, view build, step mapping, file-diff generation).
  • Streaming derive. Return steps incrementally via Tauri events so the UI can start rendering head-path turns before the full derive completes. Good for perceived latency even if total work stays the same.
  • Parallelize per-step work. File-diff generation looks parallelisable if it's a non-trivial portion of the time.
  • Batch git show calls. If a session touches many files, one git cat-file --batch pipe beats N spawn calls.
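The batching idea in the last bullet could be sketched like this — one long-lived `git cat-file --batch` child fed all `<rev>:<path>` requests over stdin, instead of one `git show` spawn per file. The function names and the idea of returning raw batch output are assumptions; the real version would parse the `<oid> <type> <size>` header preceding each blob:

```rust
use std::io::Write;
use std::process::{Command, Stdio};

// Build the request stream for `git cat-file --batch`:
// one "<rev>:<path>" line per blob we want.
fn batch_requests(rev: &str, paths: &[&str]) -> String {
    paths.iter().map(|p| format!("{rev}:{p}\n")).collect()
}

// Hypothetical batch reader: spawns git once, writes all requests,
// and returns the raw batch output (headers + blob contents) for
// the caller to parse.
fn read_blobs(repo_dir: &str, rev: &str, paths: &[&str]) -> std::io::Result<Vec<u8>> {
    let mut child = Command::new("git")
        .args(["cat-file", "--batch"])
        .current_dir(repo_dir)
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .spawn()?;
    // Dropping stdin after the write closes the pipe, so git sees EOF
    // and exits once it has answered every request.
    child
        .stdin
        .take()
        .expect("stdin was piped")
        .write_all(batch_requests(rev, paths).as_bytes())?;
    let out = child.wait_with_output()?;
    Ok(out.stdout)
}
```

For a session touching N files this is one process spawn instead of N, which is usually where the per-file `git show` cost lives.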

Out of scope

  • Pre-derive caching (tried and reverted — added complexity for a smaller win than optimising the derive itself).

Acceptance

Sub-100ms derive for a 1737-step session on a typical laptop, or a streaming-derive path that paints the first visible turn in under ~100ms.
