Ideas on Vastly Improving Pageload Time #14124

Open
mlucool opened this issue Mar 3, 2023 · 0 comments

mlucool (Contributor) commented Mar 3, 2023

TL;DR: I think that with some (mostly) small design changes in Jupyter, we can reduce typical pageload times by 75%. This will vary depending on many factors, but I hope what is proposed here helps (or at least does not hurt) all setups (locally hosted, hosted on a fast/slow network, with/without HTTP/2/SSL).

While I am not 100% sure about all of the recommendations, I thought starting a thread would be a good way to debate the tradeoffs in one place.

Experiment

Using a combination of Chrome's network tab (with caching disabled), performance insights, and network throttling (to simulate a cross-machine setup; I set it to 30MB/s + 2ms RTT):

python3 -m venv myvenv
source myvenv/bin/activate
pip install jupyterlab # As of this post, it's 3.6.1

Timing shows ~3s to load and ~110 network calls for 7MB of data.

Now add in many common extensions. I did this one by one to show that no single extension is the problem by itself; rather, each one incrementally makes things worse (substitute whatever you use):

pip install jupyterlab_git plotly jupyterlab-tour jupyterlab-recents jupyterlab-favorites jupyterlab_templates jupyterlab-execute-time jupyterlab-spellchecker jupyterlab_vim ipywidgets jupyterlab-lsp

Now pageload takes ~7s and requires ~170 network calls for 12MB of data.
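
For anyone who wants to reproduce these numbers without clicking through DevTools by hand, something like the following works. This is only a rough sketch: it assumes Playwright is installed, a JupyterLab server is already running at the URL below (with auth disabled or a token baked into the URL), and it applies roughly the same throttling via CDP before reporting load time and request count.

from playwright.sync_api import sync_playwright

LAB_URL = "http://localhost:8888/lab"  # adjust to your running server

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    # Emulate the same conditions as above: ~30MB/s throughput, 2ms RTT.
    cdp = page.context.new_cdp_session(page)
    cdp.send("Network.enable")
    cdp.send("Network.emulateNetworkConditions", {
        "offline": False,
        "latency": 2,
        "downloadThroughput": 30 * 1024 * 1024,
        "uploadThroughput": 30 * 1024 * 1024,
    })
    requests = []
    page.on("request", lambda req: requests.append(req.url))
    page.goto(LAB_URL, wait_until="networkidle")
    load_ms = page.evaluate(
        "() => performance.timing.loadEventEnd - performance.timing.navigationStart"
    )
    print(f"{load_ms}ms to load, {len(requests)} requests")
    browser.close()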

What I noticed

  1. No assets are compressed (@bollwyvl has a good attempt in [PoC] Build and serve brotli-compressed assets #14040)
  2. Each plugin adds on average 6 more network calls
  3. Every call requires the Python server to do some amount of work. These assets are loaded from disk, so depending on how fast the server can read from disk, this can really add up: at 10ms per file, the ~170 requests above add an extra ~1.7s of pageload time.
  4. Themes are loaded after the JS, which forces the page to recalculate all styles (not sure how big of a problem this is, but Chrome complained).
  5. Incorrect caching headers are used. As pointed out in Initial page first loading improvements #10273, we have content-hashed assets, which means they can be aggressively cached. Yet the remoteEntry.* files get a Cache-Control: no-cache header, and the other assets from extensions get a max-age, but nothing ever sets Cache-Control: immutable (ref). A sketch of the kind of header fix I mean follows this list.
  6. The JupyterLab server should gather everything required for pageload in one go and not use the FS so much (more details in Improving page load time jupyterlab_server#380).
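
For (5), here is a rough sketch of the kind of header change I have in mind, using tornado's set_extra_headers hook. The handler name and the hash pattern are purely illustrative, not the actual jupyterlab_server code:

import re
from tornado.web import StaticFileHandler

# Illustrative pattern for content-hashed bundles (e.g. vendors.abc123def456.js);
# the real naming scheme may differ.
HASHED_ASSET = re.compile(r"\.[0-9a-f]{8,}\.(js|css)$")

class ImmutableStaticHandler(StaticFileHandler):
    def set_extra_headers(self, path):
        if HASHED_ASSET.search(path):
            # The hash changes whenever the content does, so the browser never
            # needs to revalidate this URL.
            self.set_header("Cache-Control", "public, max-age=31536000, immutable")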

Suggested Fixes

  1. All JS/CSS assets above some threshold should be pre-compressed. Generally, extensions should by default include both .js and .js.br files, and the server should honor Accept-Encoding to determine which asset to send. This adds to release build time for extension developers, but the cost should be fairly minor. [PoC] Build and serve brotli-compressed assets #14040 started this work (see the sketches after this list).
  2. Minimize chunking for anything required on pageload; it's likely better for all users (webpack config ref).
    1. HTTP/1.1 - given head-of-line blocking plus limits on how many resources can be downloaded at once, the guidance in the HTTP/1.1 days was to use fewer, larger assets. Limiting chunking does exactly this.
    2. HTTP/2 - jupyterlab by default makes 70+ calls, and with a reasonable setup (see Host a JupyterLab metapackage jupyterlab-contrib/jupyterlab-contrib.github.io#23 (comment)) you can end up with 200+ calls. This is much more than typical HTTP/2 recommendations of no more than ~50 files (ref, ref, great ref). By limiting the number of chunks, HTTP/2 users should also see benefits, since I would expect them to use extensions too, pushing them back toward the Goldilocks number of requests.
    3. Bonus! By compressing larger files you get higher compression ratios, so you may end up shipping fewer bytes (plus less of the boilerplate that webpack includes in each file).
    4. Warning: you can't force everything into a single chunk. Anything with web workers will require at least one chunk per worker.
  3. Find ways to ship only the bytes needed for pageload on pageload and delay all other calls. Even a call that does not block pageload still calls into the tornado server and can potentially block one that does. (I'm a bit less sure this is the problem, but it's hard to tell; jupyter_server has some nice ideas on how not to block here, but I think most of these calls are fast enough that the thread approach does not help.)
  4. Don't use the FS so often; jupyterlab can (optionally?) cache all the data it needs at load time. Loading everything ahead of time and (maybe) watching for changes could lead to faster API replies (Improving page load time jupyterlab_server#380). The second sketch after this list combines this with the pre-compression idea.
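
To make (1) and (4) a bit more concrete, here are two rough sketches. Neither is real jupyterlab_server code; the names, paths, and thresholds are made up for illustration.

Pre-compressing at build time could be as simple as the following (assuming the brotli Python package; in practice extensions would more likely do this as part of their webpack build):

import pathlib
import brotli

for asset in pathlib.Path("static").rglob("*.js"):
    data = asset.read_bytes()
    if len(data) > 10_000:  # only bother above some size threshold
        asset.with_name(asset.name + ".br").write_bytes(brotli.compress(data))

On the serving side, a handler could honor Accept-Encoding and keep assets in memory so repeat requests never touch the FS (real code would also need to sanitize the path and handle missing files):

import mimetypes
import os
from functools import lru_cache
from tornado import web

STATIC_ROOT = "/path/to/share/jupyter/labextensions"  # illustrative location

@lru_cache(maxsize=None)
def load_asset(relpath, want_brotli):
    # Read the asset (or its pre-built .br sibling) from disk once; hashed assets
    # never change, so later requests can be served straight from memory.
    if want_brotli:
        try:
            with open(os.path.join(STATIC_ROOT, relpath) + ".br", "rb") as f:
                return f.read(), True
        except FileNotFoundError:
            pass  # no pre-compressed sibling; fall back to the plain file
    with open(os.path.join(STATIC_ROOT, relpath), "rb") as f:
        return f.read(), False

class CachedAssetHandler(web.RequestHandler):
    def get(self, relpath):
        wants_br = "br" in self.request.headers.get("Accept-Encoding", "")
        body, is_brotli = load_asset(relpath, wants_br)
        if is_brotli:
            self.set_header("Content-Encoding", "br")
        content_type, _ = mimetypes.guess_type(relpath)
        self.set_header("Content-Type", content_type or "application/octet-stream")
        self.set_header("Vary", "Accept-Encoding")
        self.write(body)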

Of course, we could think about larger changes, like finding a way to do SSR/Remix-style rendering, but I suspect those are more far-off ideas.

What do people think?

CC @afshin
