[diffshub] stream tree paths incrementally to fix O(N²) publish slowdown (#700)
Merged
Streaming a large compare (e.g. torvalds/linux v6.0..v7.0) made the Diff
Stats counter tick more slowly the longer the stream ran, and scrolling
stuttered on every tick. Each streamed batch published a fresh
CodeViewFileTreeSource, and CodeViewFileTree responded by calling
`model.resetPaths(source.paths)` — which constructs a brand new
PathStore over the full accumulated path list. Per publish that is
O(N log N) where N is total paths streamed so far, so across ~80
publishes the total work is roughly O(N² log N) on the main thread and
each publish blocks input proportionally to N. The trees model already
exposes a localized mutation fast path via `model.batch` / `model.add`;
diffshub just wasn't using it.
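
To make the cost argument concrete, here is a rough sketch that counts how many path entries each strategy touches over a stream, ignoring the log factor from sorting. The function name, the fixed batch size, and the numbers are illustrative assumptions, not measurements from diffshub.

```typescript
// Count path entries touched across a stream of publishes.
// Assumption: every publish delivers exactly `batchSize` new paths.
function pathsTouched(publishes: number, batchSize: number) {
  let rebuild = 0; // reset on every publish: reprocess all paths seen so far
  let append = 0; // incremental: process only the newly streamed tail
  for (let k = 1; k <= publishes; k++) {
    rebuild += k * batchSize;
    append += batchSize;
  }
  return { rebuild, append };
}

const { rebuild, append } = pathsTouched(80, 1000);
console.log(rebuild, append); // 3240000 80000
```

With 80 publishes of 1,000 paths each, rebuilding touches about 40 times as many entries as appending, and the gap grows with the number of publishes.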
Link each tree-source snapshot to the prior one through a new
`previousSource` field. The accumulator now also owns a persistent
`rankByPath` map and a stable sort comparator over it, so we stop
rebuilding the comparator's rank map on every snapshot. When
CodeViewFileTree sees a snapshot whose `previousSource` matches the one
it last applied, it batches the tail of new paths into the model with
`{ type: 'add', path }` ops instead of resetting. `resetPaths` is now
reserved for the initial mount and any non-append change (e.g. a new
request).
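
The append detection described above could look roughly like the sketch below. Only `previousSource`, `resetPaths`, `batch`, and the `{ type: 'add', path }` op come from this description; the interfaces, the `applySource` helper, and the exact signatures are assumptions made for illustration and may not match the real trees model.

```typescript
// Hypothetical shapes, not the actual diffshub or trees-model types.
interface TreeSource {
  paths: readonly string[];
  previousSource?: TreeSource;
}

type AddOp = { type: 'add'; path: string };

interface TreeModel {
  resetPaths(paths: readonly string[]): void;
  batch(ops: AddOp[]): void;
}

// Applies `next` to the model and returns it, so the caller can pass it
// back in as `lastApplied` on the following publish.
function applySource(
  model: TreeModel,
  next: TreeSource,
  lastApplied: TreeSource | undefined,
): TreeSource {
  if (
    lastApplied !== undefined &&
    next.previousSource === lastApplied &&
    next.paths.length >= lastApplied.paths.length
  ) {
    // Fast path: only the tail past the previously applied length is new.
    const tail = next.paths.slice(lastApplied.paths.length);
    if (tail.length > 0) {
      model.batch(tail.map((path): AddOp => ({ type: 'add', path })));
    }
  } else {
    // Initial mount or a non-append change: rebuild from scratch.
    model.resetPaths(next.paths);
  }
  return next;
}

// A recording model, used only to show which branch each call takes.
const calls: string[] = [];
const model: TreeModel = {
  resetPaths(paths) {
    calls.push(`reset:${paths.length}`);
  },
  batch(ops) {
    calls.push(`add:${ops.map((op) => op.path).join(',')}`);
  },
};

const first: TreeSource = { paths: ['a.ts'] };
const second: TreeSource = {
  paths: ['a.ts', 'b.ts', 'c.ts'],
  previousSource: first,
};

let applied = applySource(model, first, undefined); // initial mount
applied = applySource(model, second, applied); // linked snapshot
console.log(calls); // [ 'reset:1', 'add:b.ts,c.ts' ]
```

Comparing `previousSource` by identity keeps the check O(1); a snapshot that is not linked to the last applied one simply falls through to the reset branch.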
This drops per-publish tree work from O(N log N) to O(delta), where delta
is the number of paths added by that publish. It keeps the counter
incrementing at a steady rate as the stream progresses and removes the
recurring long task that was starving scroll during streaming.
amadeus (Member) approved these changes on May 18, 2026
Also, this fixes the folders getting uncollapsed while streaming in results! 🙏
amadeus pushed a commit that referenced this pull request on May 20, 2026
…own (#700)