Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Minimal document edits for file content change #157723

Merged
merged 5 commits into from Aug 10, 2022
Merged

Conversation

rebornix
Copy link
Member

Re #157695. Previously, we would replace the whole notebook with the new content whenever there is file change event, even if there is no real change in the file event.

In this PR, we attempt to compute minimal edits required for updating the NotebookTextModel to latest. The content change is translated to

  • If cells at the same position have the same language, content, cellKind, internal metadata, and mime, then we treat them as unchanged, we will not do a replacement
    • Instead, we update its metadata (object) and outputs (buffers) accordingly. To ensure it's performant, we don't compare metadata (as it can be big and nested) or outputs (comparing buffers can take long time)
  • Otherwise, we do a simple replacement, which means the cell is removed and then inserted with new ones, this will lose the cell view states.

@rebornix rebornix self-assigned this Aug 10, 2022
@rebornix
Copy link
Member Author

rebornix commented Aug 10, 2022

On a relatively large document, resetting the document is much more costly than computing the edits first and then applying minimal changes.

Before

image

After
image

From above screenshots, we can see that when we do a big reset (replace all cells with new data), every cell would rebuild their text buffer and trigger outline re-computation (we might need to look into this separately). Now since our changes are minimal, we don't have cell content change anymore but only metadata and output changes.

However, there are still some unexpected computations like below

image

Though we don't have content change, we are still repeatedly calculating estimated editor height and in every call it fetches cell execution and validate if it should include the status bar height. This can be avoided.

@rebornix
Copy link
Member Author

Pushed two optimizations:

  • Avoid broadcast layoutChange when we splice outputs that are not rendered at all. A splice on outputs that are not rendered yet will not trigger any real height change so we can skip them.
  • Batch notebookHasOutputs calculation. Previously we update this whenever there is an ouput change on any cell, now we make sure that if we have multiple cell output changes, we only calculate once.

With above, the applyEdits time went down to ~60 from ~220 with a notebook that contains 1800 cells

image

@rebornix rebornix marked this pull request as ready for review August 10, 2022 18:31
@rebornix rebornix added this to the August 2022 milestone Aug 10, 2022
mjbvz
mjbvz previously approved these changes Aug 10, 2022
@rebornix rebornix merged commit ed3c693 into main Aug 10, 2022
@rebornix rebornix deleted the rebornix/nb-document-hash branch August 10, 2022 22:22
@github-actions github-actions bot locked and limited conversation to collaborators Sep 24, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants