Optimize IPC traffic when saving notebooks (large notebooks slow over Remote-SSH?) #172345

Open
roblourens opened this issue Jan 25, 2023 · 16 comments
Assignees
Labels
bug Issue identified by VS Code Team member as probable bug notebook-perf on-testplan
Milestone

Comments

@roblourens
Member

#138784 (comment)

@roblourens roblourens added bug Issue identified by VS Code Team member as probable bug notebook-perf labels Jan 25, 2023
@roblourens roblourens added this to the Backlog milestone Jan 25, 2023
@roblourens roblourens self-assigned this Jan 25, 2023
@rebornix
Member

A few ideas from discussions with Kai and a few others:

  • If we can run the serializer in the web worker extension host (EH), then once we have SharedArrayBuffer (SAB) support we don't have to send buffers back and forth, which can save a lot of time. It does require the serializer to be a UI extension rather than a workspace extension.
  • Or, we can probably do a fast "notebookToData" by asking whether the EH already has the latest notebook document.
    • The EH that runs the serializer has a copy of the NotebookDocument. We ask whether its version id matches the one in the renderer; if so, the EH serializes its copy and returns the buffer. If not, we fall back to the traditional route.
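
The second idea above can be sketched roughly as follows. This is a minimal Python illustration of the version-check logic only, not VS Code's actual implementation: `ExtensionHostCopy`, `serialize`, and `save_notebook` are hypothetical names, and the real code would be TypeScript talking across an IPC boundary.

```python
import json
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ExtensionHostCopy:
    """Hypothetical stand-in for the NotebookDocument copy held by the EH."""
    version_id: int
    cells: list


def serialize(cells: list) -> bytes:
    # Placeholder serializer (assumption); a real one would emit .ipynb content.
    return json.dumps({"cells": cells}).encode("utf-8")


def save_notebook(renderer_version_id: int, renderer_cells: list,
                  eh_copy: ExtensionHostCopy) -> Tuple[bytes, str]:
    """Return the serialized bytes and the route that was taken."""
    if eh_copy.version_id == renderer_version_id:
        # Fast path: the EH copy is current, so only the version id needs to
        # cross the process boundary, not the full notebook contents.
        return serialize(eh_copy.cells), "fast"
    # Traditional route: the renderer's notebook data is shipped to the EH.
    return serialize(renderer_cells), "traditional"
```

The point of the design is that the comparison costs a single integer per save, while the fallback keeps today's behavior whenever the two copies have diverged.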

@jrich100

jrich100 commented Apr 7, 2023

Any updates on this?

We have seen situations where large notebooks not only become unusable, but the rest of the UI becomes very slow.

This seems to happen specifically when using tools that embed charts and graphs as JS in the cell output/metadata (e.g. bokeh).

@baiguoname

Any updates on this?

@spinicist

I am experiencing this issue too. I accept that my current notebook is pathological (a large number of plots), but VS Code is unusable like this. I'd be very happy with "slow"!

@rebornix
Member

rebornix commented Jun 8, 2023

This can be reproduced easily by connecting over Remote-SSH to another machine on the same local network and running the following code:

for i in range(0,1000000):
    print(i)

With auto save turned on (with a 1 second delay), this generates upload/download traffic at 10MB/s for several minutes. When the VM is truly remote, this means the bandwidth is fully consumed by file saving and output updating.
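
For a rough sense of scale, the raw text produced by the loop above can be computed directly. This is a back-of-the-envelope sketch: it counts only the printed characters (the serialized notebook adds JSON overhead on top), and the link speed passed to `transfer_seconds` is an illustrative assumption, not a measurement.

```python
def printed_bytes(n: int) -> int:
    """Bytes written by `for i in range(0, n): print(i)` (ASCII digits plus newline)."""
    return sum(len(str(i)) + 1 for i in range(n))


def transfer_seconds(num_bytes: int, megabits_per_second: float) -> float:
    """Time to send num_bytes once over a link of the given speed."""
    return num_bytes * 8 / (megabits_per_second * 1_000_000)


size = printed_bytes(1_000_000)
print(size)  # 6888890 bytes, roughly 6.9 MB of raw output
print(transfer_seconds(size, 5))  # seconds for one full copy at an assumed 5 Mbps
```

If each save re-sends the whole document, every autosave would move at least this much data, which would be consistent with the sustained traffic described above.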


bpasero added a commit that referenced this issue Jun 23, 2023
) (#185988)

* files - allow more file operations to run in the extension host (#172345)

* fix tests

* tests

* tests

* tests
bpasero added a commit that referenced this issue Jun 23, 2023
* working copy - allow to override backup delay (#172345)

* working copy - also use `backupDelay` for untitled
@jrich100

jrich100 commented Jul 3, 2023

Hi! @rebornix

I was able to test this out using
Version: 1.80.0-insider (Universal)
Commit: 5b74404

Jupyter: v2023.6.1001821100
Remote - SSH: v0.103.2023062115

The setting "notebook.experimental.remoteSave" does not appear to be registered, so I am not positive that I have applied this change. That being said, the scenario I have been using does appear to be improved (generate large cell output, then immediately run a print() statement).

@jrich100

jrich100 commented Jul 20, 2023

@rebornix Any updates here? I am happy to continue testing any changes

@rebornix rebornix modified the milestones: July 2023, August 2023 Jul 24, 2023
@rebornix
Member

@jrich100 thanks for testing and sharing updates. I have more changes coming to improve the perf while a cell is running, and will share here once I have a build for it.

@bpasero bpasero removed their assignment Aug 30, 2023
@rebornix rebornix modified the milestones: August 2023, September 2023 Aug 30, 2023
@rebornix rebornix modified the milestones: September 2023, October 2023 Sep 26, 2023
@tbenthompson

tbenthompson commented Oct 18, 2023

I think I ran into this issue today: I was on a slow-ish connection. A speed test said I was getting 3-5 Mbps. VSCode and all extensions were on the latest stable version (not insiders). I had 3-5 seconds of lag before notebook cells started running regardless of the contents of the cell. After I cleared all the output (plots) from the notebook, there was no longer any perceptible lag.
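
To check whether a notebook is in this situation before clearing anything, one option is to measure how much of the file is output. The sketch below assumes the standard nbformat v4 JSON layout (a top-level `cells` list whose code cells carry an `outputs` list); `output_bytes_per_cell` is a hypothetical helper name, and the sizes are approximations obtained by re-serializing each cell's outputs.

```python
import json


def output_bytes_per_cell(notebook: dict) -> list:
    """Approximate serialized size, in bytes, of each cell's outputs."""
    sizes = []
    for cell in notebook.get("cells", []):
        outputs = cell.get("outputs", [])  # markdown cells have no outputs
        sizes.append(len(json.dumps(outputs).encode("utf-8")) if outputs else 0)
    return sizes


# Tiny in-memory example; for a real file, load the .ipynb with json.load first.
example = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Title"]},
        {"cell_type": "code", "source": ["print('hi')"],
         "outputs": [{"output_type": "stream", "name": "stdout", "text": ["hi\n"]}]},
    ]
}
sizes = output_bytes_per_cell(example)
print(sizes)
```

Cells with embedded plots typically dominate the total, so sorting the result shows which outputs are worth clearing first.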

@rebornix rebornix modified the milestones: October 2023, November 2023 Oct 24, 2023
@rebornix rebornix modified the milestones: November 2023, December 2023 Nov 28, 2023
@rebornix
Member

@tbenthompson does setting "notebook.experimental.remoteSave": true make the experience any better, or do you not see any change at all?

@jrich100

Generally, I think the experience with large notebooks has improved (though in a way that is hard to quantify)!

However, I am still able to create a scenario that feels very, very slow:

  1. Run a cell that creates ~50 large matplotlib plots in a loop
  2. Once the above cell completes, immediately run a cell with a simple print() statement

In this scenario, the notebook hangs for over 90 seconds before executing the print() cell. Presumably, the cell execution is blocking on the notebook file saving/syncing?

@xiaoxuchn

xiaoxuchn commented Jan 29, 2024

I have pretty much the same experience as @jrich100. I plotted 50 images in a for loop, and it took an extremely long time (about 1 hour) to save the output. I saw my Wi-Fi running at full speed, uploading at 5Mbps. I turned the experimental remote save setting on, but it doesn't seem to help. Even after I turned auto save off, I still needed to wait that long to run the next cell. I have no idea what it was uploading.

@rebornix rebornix modified the milestones: February 2024, March 2024 Feb 21, 2024
@rebornix rebornix modified the milestones: March 2024, April 2024 Mar 26, 2024
@fratorgano

fratorgano commented Mar 27, 2024

I'm having the same problem: whenever I try to save a big notebook, even if I changed only a single line of code, it takes a long time. This kills performance, especially when working over a mobile connection.
I tried the "notebook.experimental.remoteSave": true setting in the non-Insiders version of VS Code, since it was also available there, but that didn't make any noticeable difference for me.
I then downloaded the VS Code Insiders build and tried it there, and it was saving much faster.

@rebornix rebornix modified the milestones: April 2024, May 2024 Apr 23, 2024

10 participants