
[diffshub] stream tree paths incrementally to fix O(N²) publish slowdown #700

Merged
necolas merged 1 commit into beta-1.2 from nicolas/diffshub-tree-incremental-add on May 18, 2026

necolas commented May 18, 2026

Streaming a large compare (e.g. torvalds/linux v6.0..v7.0) made the Diff Stats counter tick more slowly the longer the stream ran, and made scrolling stutter on every tick. Each streamed batch published a fresh CodeViewFileTreeSource, and CodeViewFileTree responded by calling `model.resetPaths(source.paths)`, which constructs a brand new PathStore over the full accumulated path list. Per publish that is O(N log N), where N is the total number of paths streamed so far. Because the number of publishes grows with the length of the stream, across ~80 publishes the total work is roughly O(N² log N) on the main thread, and each publish blocks input for a time proportional to N. The trees model already exposes a localized mutation fast path via `model.batch` / `model.add`; diffshub just wasn't using it.
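To make the scaling concrete, here is a rough cost model that counts how many paths each strategy touches over a stream. It is a sketch, not a measurement: `pathsTouched` is a hypothetical helper, the 80,000-path total and 1,000-path batch size are illustrative numbers chosen to give about 80 publishes, and the log factor and constant costs of the real PathStore are ignored.

```typescript
// Hypothetical cost model: counts paths touched per strategy.
// It ignores the log factor and the constant costs of the real PathStore.
function pathsTouched(totalPaths: number, batchSize: number) {
  let resetWork = 0; // reset strategy: rebuild over every accumulated path
  let incrementalWork = 0; // incremental strategy: only the new tail
  let accumulated = 0;
  let publishes = 0;

  while (accumulated < totalPaths) {
    const delta = Math.min(batchSize, totalPaths - accumulated);
    accumulated += delta;
    publishes += 1;
    resetWork += accumulated; // grows with everything streamed so far
    incrementalWork += delta; // stays proportional to the batch
  }

  return { publishes, resetWork, incrementalWork };
}

// Illustrative numbers only (assumed, not taken from the PR).
const estimate = pathsTouched(80000, 1000);
console.log(estimate);
```

With these assumed numbers the reset strategy touches 3,240,000 paths across 80 publishes, against 80,000 for the incremental one, which is about a 40x gap before the log factor is counted.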

Link each tree-source snapshot to the prior one through a new `previousSource` field. The accumulator now also owns a persistent `rankByPath` map and a stable sort comparator over it, so we stop rebuilding the comparator's rank map on every snapshot. When CodeViewFileTree sees a snapshot whose `previousSource` matches the one it last applied, it batches the tail of new paths into the model with `{ type: 'add', path }` ops instead of resetting. `resetPaths` is now reserved for the initial mount and any non-append change (e.g. a new request).
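The linking and the append-or-reset decision can be sketched roughly as follows. This is not the code from the PR: `TreeSource`, `TreeModel`, `TreeSourceAccumulator`, `TreeSourceApplier`, and `compareByRank` are hypothetical names, the `batch(ops)` signature is an assumption about the trees model, and the sketch assumes a path's rank is simply the order in which it was first seen. Only `previousSource`, `rankByPath`, `resetPaths`, and the `{ type: 'add', path }` op shape come from the description above.

```typescript
// Hypothetical snapshot shape: each snapshot points at the one before it.
interface TreeSource {
  readonly paths: readonly string[];
  readonly previousSource: TreeSource | null;
}

type AddOp = { type: 'add'; path: string };

// Assumed minimal model surface; the real trees model may differ.
interface TreeModel {
  resetPaths(paths: readonly string[]): void;
  batch(ops: readonly AddOp[]): void;
}

class TreeSourceAccumulator {
  private paths: string[] = [];
  private latest: TreeSource | null = null;

  // Persistent across snapshots, so it is never rebuilt per publish.
  // Assumption: rank is the order in which a path was first seen.
  readonly rankByPath = new Map<string, number>();

  // Stable comparator over the persistent rank map.
  readonly compareByRank = (a: string, b: string): number =>
    (this.rankByPath.get(a) ?? Number.MAX_SAFE_INTEGER) -
    (this.rankByPath.get(b) ?? Number.MAX_SAFE_INTEGER);

  publish(newPaths: readonly string[]): TreeSource {
    for (const path of newPaths) {
      if (this.rankByPath.has(path)) continue; // already streamed
      this.rankByPath.set(path, this.rankByPath.size);
      this.paths.push(path);
    }
    // Copying keeps the sketch simple; a real implementation would
    // likely want to avoid this O(N) copy on every publish.
    const snapshot: TreeSource = {
      paths: this.paths.slice(),
      previousSource: this.latest,
    };
    this.latest = snapshot;
    return snapshot;
  }
}

class TreeSourceApplier {
  private lastApplied: TreeSource | null = null;
  private readonly model: TreeModel;

  constructor(model: TreeModel) {
    this.model = model;
  }

  apply(source: TreeSource): 'reset' | 'append' | 'noop' {
    const previous = this.lastApplied;
    if (source === previous) return 'noop';
    this.lastApplied = source;

    // Fast path: the snapshot extends the one applied last time.
    if (previous !== null && source.previousSource === previous) {
      const ops = source.paths
        .slice(previous.paths.length)
        .map((path): AddOp => ({ type: 'add', path }));
      if (ops.length > 0) this.model.batch(ops);
      return 'append';
    }

    // Initial mount or any non-append change: full reset.
    this.model.resetPaths(source.paths);
    return 'reset';
  }
}

// Usage with a recording stand-in for the model.
const calls: string[] = [];
const model: TreeModel = {
  resetPaths: (paths) => {
    calls.push(`reset:${paths.length}`);
  },
  batch: (ops) => {
    calls.push(`batch:${ops.length}`);
  },
};

const accumulator = new TreeSourceAccumulator();
const applier = new TreeSourceApplier(model);
applier.apply(accumulator.publish(['a/1.ts', 'a/2.ts'])); // initial mount
applier.apply(accumulator.publish(['b/3.ts'])); // linked snapshot
applier.apply(new TreeSourceAccumulator().publish(['c/4.ts'])); // new request
console.log(calls); // [ 'reset:2', 'batch:1', 'reset:1' ]
```

Comparing `previousSource` by identity keeps the check O(1): the consumer never diffs path lists, and anything it cannot prove to be an append falls back to `resetPaths`.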

This drops per-publish tree work from O(N log N) to O(delta), keeps the counter incrementing at a steady rate as the stream progresses, and removes the recurring long task that was starving scroll during streaming.

necolas requested a review from amadeus May 18, 2026 22:34

amadeus requested a review from SlexAxton May 18, 2026 22:37

amadeus commented May 18, 2026

Also this fixes the folders getting uncollapsed while streaming in results! 🙏

necolas merged commit ca77495 into beta-1.2 May 18, 2026
12 checks passed
necolas deleted the nicolas/diffshub-tree-incremental-add branch May 18, 2026 22:54
amadeus pushed a commit that referenced this pull request May 20, 2026
…own (#700)

amadeus pushed a commit that referenced this pull request May 20, 2026
…own (#700)
