TL;DR: I think that with some (mostly) smaller design changes in Jupyter, we can reduce typical pageload times by 75%. The exact number will vary with many factors, but I hope what is proposed here helps (or at least does not hurt) all setups: locally hosted, hosted on a fast or slow network, with or without HTTP/2/SSL.
While I am not 100% sure about all of the recommendations, I thought a thread would be a good place to debate the tradeoffs in one place.
Experiment
Using a combination of Chrome's network tab (with caching disabled), performance insights, and network throttling (set to 30 MB/s + 2 ms RTT to simulate a cross-machine setup):
python3 -m venv myvenv
source myvenv/bin/activate
pip install jupyterlab # As of this post, it's 3.6.1
Timing shows ~3s to load, with ~110 network calls for 7 MB of data.
Now add in many common extensions (substitute whatever you use). I did this one by one to show that no single extension is the problem by itself; rather, each one incrementally makes things worse.
Now pageload takes ~7s and requires ~170 network calls for 12 MB of data.
What I noticed
Every call requires the Python server to do some amount of work. These assets are loaded from disk, so depending on how fast the server can read them, the cost can really add up: at 10 ms per file, ~170 files adds an extra ~1.7s of pageload time.
Themes are loaded after the JS, which forces the page to recalculate all styles (I'm not sure how big a problem this is, but Chrome complained).
Incorrect headers are used. As pointed out in Initial page first loading improvements #10273, we have hashed assets, which means they can be aggressively cached. All the remoteEntry.* files get a Cache-Control: no-cache header. The other assets from extensions get a max-age, but nothing ever sets Cache-Control: immutable (ref).
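The caching policy being suggested could be sketched roughly like this (the helper name and the hash-detecting regex are mine, not anything in JupyterLab; real logic would need to match the actual asset naming scheme):

```python
import re

# Content-hashed assets get a new URL whenever their content changes, so they
# can be cached forever with `immutable`. Anything whose name is not
# content-hashed must be revalidated on every load.
HASHED = re.compile(r"\.[0-9a-f]{8,}\.(js|css)$")

def cache_control_for(filename):
    if HASHED.search(filename):
        # URL changes on every content change, so the cached copy never goes stale
        return "public, max-age=31536000, immutable"
    return "no-cache"

print(cache_control_for("jlab_core.7c5b6cd91a5f80581896.js"))
print(cache_control_for("index.css"))
```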
Suggested Fixes
All JS/CSS assets above some size threshold should be pre-compressed. Extensions should by default ship both .js and .js.br files, and the server should honor Accept-Encoding to decide which asset to send. This adds some release-build cost for extension developers, but it should be fairly minor. [PoC] Build and serve brotli-compressed assets #14040 started this work.
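The server-side negotiation could be sketched as follows (the helper name is hypothetical, not jupyter_server API; a real implementation would live in the static file handler):

```python
import os

def pick_encoding(path, accept_encoding):
    """Return (file_to_serve, content_encoding_or_None).

    If the client advertises brotli (or gzip) support and a pre-compressed
    sibling exists on disk next to the asset, serve that instead.
    """
    # Accept-Encoding looks like "gzip, deflate, br;q=1.0"; strip q-values
    offered = {token.split(";")[0].strip() for token in accept_encoding.split(",")}
    for ext, name in ((".br", "br"), (".gz", "gzip")):
        if name in offered and os.path.exists(path + ext):
            return path + ext, name
    return path, None  # fall back to the uncompressed asset
```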
Minimize chunking for anything required on pageload; it's likely better for all users (webpack config ref).
HTTP/1.1 - given head-of-line blocking and limits on how many resources can be downloaded at once, the guidance in the HTTP/1.1 days was to use fewer, larger assets. Limiting chunking does exactly that.
HTTP/2 - jupyterlab by default makes 70+ calls, while with a reasonable setup (see Host a JupyterLab metapackage jupyterlab-contrib/jupyterlab-contrib.github.io#23 (comment)) you can have 200+ calls. This is far more than typical HTTP/2 recommendations of no more than ~50 files (ref, ref, great ref). By limiting the number of chunks, HTTP/2 users should see benefits too, as I would expect them to also use extensions, pushing them closer to the goldilocks number of requests.
Bonus! By compressing larger files you get higher compression ratios, so you may end up shipping fewer bytes (plus less of the boilerplate that webpack includes in each file).
Warning: you can't force everything into a single chunk. Anything using web workers will require at least one chunk per worker.
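The bonus effect above is easy to demonstrate with the standard library (zlib standing in for brotli; the module contents are made up). Compressing one concatenated bundle beats compressing many small chunks separately, because the compression window can exploit redundancy across module boundaries:

```python
import zlib

# 50 made-up JS "modules" that share a lot of boilerplate, like real bundles do
modules = [
    (b'function module%d(){ return "some repeated boilerplate string"; }\n' % i) * 5
    for i in range(50)
]

# compress each chunk on its own vs. compress one concatenated bundle
separate = sum(len(zlib.compress(m, 9)) for m in modules)
bundled = len(zlib.compress(b"".join(modules), 9))

print(f"50 separate chunks: {separate} bytes compressed")
print(f"1 bundled file:     {bundled} bytes compressed")
```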
Find ways to ship only the bytes needed for pageload on pageload, and delay all other calls. Even a call that does not block pageload still goes through the Tornado server and can potentially block one that does. (I'm a bit less sure this is the problem; it's hard to tell. jupyter_server has some nice ideas on how not to block here, but I think most of these calls are fast enough that the thread approach does not help.)
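For reference, the non-blocking idea being alluded to can be sketched with plain asyncio (all names here are hypothetical stand-ins, and as noted above it may not help much when the calls themselves are fast):

```python
import asyncio

def read_settings_from_disk():
    # stand-in for a slow, blocking filesystem scan (e.g. walking extension dirs)
    return {"theme": "JupyterLab Light"}

async def handle_settings_request():
    loop = asyncio.get_running_loop()
    # Offload the blocking read to the default thread pool so the event loop
    # stays free to answer requests that do block pageload.
    return await loop.run_in_executor(None, read_settings_from_disk)

print(asyncio.run(handle_settings_request()))
```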
Don't use the FS so often; jupyterlab could (optionally?) cache all the data it needs at startup. Loading everything ahead of time and (maybe) watching for changes could lead to faster API replies (Improving page load time jupyterlab_server#380).
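A minimal sketch of that caching idea (all names hypothetical; a real version would need invalidation policy and probably a file watcher rather than a stat per request):

```python
import os

# path -> (mtime_ns, contents); re-read a file only when its mtime changes,
# so repeated API calls never hit the disk twice for the same unchanged file
_cache = {}

def cached_read(path):
    mtime = os.stat(path).st_mtime_ns
    hit = _cache.get(path)
    if hit is not None and hit[0] == mtime:
        return hit[1]  # cache hit: no disk read
    with open(path, "rb") as f:
        data = f.read()
    _cache[path] = (mtime, data)
    return data
```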
Of course we could consider larger changes, like finding a way to do SSR/Remix-style rendering, but I suspect those are further-off ideas.
What do people think?
CC @afshin