TL;DR: I think that with some (mostly) smaller design changes in Jupyter, we can reduce typical pageload times by 75%. The exact number will vary with many factors, but I hope what is proposed here helps (or at least does not hurt) all setups: locally hosted, hosted on a fast or slow network, with or without HTTP/2/SSL.
While I am not 100% sure about all of the recommendations, I thought a thread would be a good place to debate the tradeoffs in one place.
Experiment
Using a combination of Chrome's network tab (with caching disabled), performance insights, and network throttling (set to 30 MB/s + 2 ms RTT to simulate a cross-machine setup):
python3 -m venv myvenv
source myvenv/bin/activate
pip install jupyterlab # As of this post, it's 3.6.1
Timing shows ~3s to load, with ~110 network calls for 7 MB of data.
Now add in many common extensions (substitute whatever you use). I did this one by one to show that no single extension is the problem by itself; rather, each one incrementally makes things worse.
Now pageload takes ~7s and requires ~170 network calls for 12 MB of data.
What I noticed
Every call requires the Python server to do some amount of work. These assets are loaded from disk, so depending on how fast the server can read them, the cost can really add up: at 10 ms per file, ~170 files adds an extra ~1.7s of pageload time.
Themes are loaded after the JS, which forces the page to recalculate all styles (I'm not sure how big a problem this is, but Chrome complained).
Incorrect headers are used. As pointed out in Initial page first loading improvements #10273, we have hashed assets, which means they can be aggressively cached. All the remoteEntry.* files get a Cache-Control: no-cache header. The other assets from extensions get a max-age, but nothing ever sets Cache-Control: immutable (ref).
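The caching policy being suggested could be sketched roughly like this (the helper name and the hash-detecting regex are mine, not anything in JupyterLab; real logic would need to match the actual asset naming scheme):

```python
import re

# Content-hashed assets get a new URL whenever their content changes, so they
# can be cached forever with `immutable`. Anything whose name is not
# content-hashed must be revalidated on every load.
HASHED = re.compile(r"\.[0-9a-f]{8,}\.(js|css)$")

def cache_control_for(filename):
    if HASHED.search(filename):
        # URL changes on every content change, so the cached copy never goes stale
        return "public, max-age=31536000, immutable"
    return "no-cache"

print(cache_control_for("jlab_core.7c5b6cd91a5f80581896.js"))
print(cache_control_for("index.css"))
```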
Suggested Fixes
All JS/CSS assets above some size threshold should be pre-compressed. Extensions should by default ship both .js and .js.br files, and the server should honor Accept-Encoding to decide which asset to send. This adds some release-build cost for extension developers, but it should be fairly minor. [PoC] Build and serve brotli-compressed assets #14040 started this work.
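The server-side negotiation could be sketched as follows (the helper name is hypothetical, not jupyter_server API; a real implementation would live in the static file handler):

```python
import os

def pick_encoding(path, accept_encoding):
    """Return (file_to_serve, content_encoding_or_None).

    If the client advertises brotli (or gzip) support and a pre-compressed
    sibling exists on disk next to the asset, serve that instead.
    """
    # Accept-Encoding looks like "gzip, deflate, br;q=1.0"; strip q-values
    offered = {token.split(";")[0].strip() for token in accept_encoding.split(",")}
    for ext, name in ((".br", "br"), (".gz", "gzip")):
        if name in offered and os.path.exists(path + ext):
            return path + ext, name
    return path, None  # fall back to the uncompressed asset
```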
Minimize chunking for anything required on pageload; it's likely better for all users (webpack config ref).
HTTP/1.1 - given head-of-line blocking and limits on how many resources can be downloaded at once, the guidance in the HTTP/1.1 days was to use fewer, larger assets. Limiting chunking does exactly that.
HTTP/2 - jupyterlab by default makes 70+ calls, while with a reasonable setup (see Host a JupyterLab metapackage jupyterlab-contrib/jupyterlab-contrib.github.io#23 (comment)) you can have 200+ calls. This is far more than typical HTTP/2 recommendations of no more than ~50 files (ref, ref, great ref). By limiting the number of chunks, HTTP/2 users should see benefits too, as I would expect them to also use extensions, pushing them closer to the goldilocks number of requests.
Bonus! By compressing larger files you get higher compression ratios, so you may end up shipping fewer bytes (plus less of the boilerplate that webpack includes in each file).
Warning: you can't force everything into a single chunk. Anything using web workers will require at least one chunk per worker.
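The bonus effect above is easy to demonstrate with the standard library (zlib standing in for brotli; the module contents are made up). Compressing one concatenated bundle beats compressing many small chunks separately, because the compression window can exploit redundancy across module boundaries:

```python
import zlib

# 50 made-up JS "modules" that share a lot of boilerplate, like real bundles do
modules = [
    (b'function module%d(){ return "some repeated boilerplate string"; }\n' % i) * 5
    for i in range(50)
]

# compress each chunk on its own vs. compress one concatenated bundle
separate = sum(len(zlib.compress(m, 9)) for m in modules)
bundled = len(zlib.compress(b"".join(modules), 9))

print(f"50 separate chunks: {separate} bytes compressed")
print(f"1 bundled file:     {bundled} bytes compressed")
```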
Find ways to ship only the bytes needed for pageload on pageload, and delay all other calls. Even a call that does not block pageload still goes through the Tornado server and can potentially block one that does. (I'm a bit less sure this is the problem; it's hard to tell. jupyter_server has some nice ideas on how not to block here, but I think most of these calls are fast enough that the thread approach does not help.)
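For reference, the non-blocking idea being alluded to can be sketched with plain asyncio (all names here are hypothetical stand-ins, and as noted above it may not help much when the calls themselves are fast):

```python
import asyncio

def read_settings_from_disk():
    # stand-in for a slow, blocking filesystem scan (e.g. walking extension dirs)
    return {"theme": "JupyterLab Light"}

async def handle_settings_request():
    loop = asyncio.get_running_loop()
    # Offload the blocking read to the default thread pool so the event loop
    # stays free to answer requests that do block pageload.
    return await loop.run_in_executor(None, read_settings_from_disk)

print(asyncio.run(handle_settings_request()))
```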
Don't use the FS so often; jupyterlab could (optionally?) cache all the data it needs at startup. Loading everything ahead of time and (maybe) watching for changes could lead to faster API replies (Improving page load time jupyterlab_server#380).
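A minimal sketch of that caching idea (all names hypothetical; a real version would need invalidation policy and probably a file watcher rather than a stat per request):

```python
import os

# path -> (mtime_ns, contents); re-read a file only when its mtime changes,
# so repeated API calls never hit the disk twice for the same unchanged file
_cache = {}

def cached_read(path):
    mtime = os.stat(path).st_mtime_ns
    hit = _cache.get(path)
    if hit is not None and hit[0] == mtime:
        return hit[1]  # cache hit: no disk read
    with open(path, "rb") as f:
        data = f.read()
    _cache[path] = (mtime, data)
    return data
```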
Of course we could consider larger changes, like finding a way to do SSR/Remix-style rendering, but I suspect those are further-off ideas.
What do people think?
CC @afshin