
Compress static assets served by Jupyter Lab #13189

Open
suganya-sk opened this issue Oct 5, 2022 · 8 comments

@suganya-sk

Problem

JupyterLab's static assets are flagged by Chrome's Lighthouse audits for missing text compression, even when no extensions are installed. Given the ongoing push in the JupyterLab community to improve performance, this seems like a candidate worth looking into.

I have been trying to compress the static assets associated with a few custom extensions of mine, which are pretty large, and I noticed that the assets served from static/lab/ are flagged even without any extensions installed. I'm wondering if compression might help here.

Proposed Solution

Consider compressing the static assets served by JupyterLab.

I dug through JupyterLab's build process a bit and noticed that compression is set to false here. Is there a specific reason for this, or does compression occur somewhere else? Any pointers to relevant discussions are helpful! I have a high-level understanding of the build process; please redirect me if I should be looking elsewhere.

@jupyterlab-probot bot added the status:Needs Triage label Oct 5, 2022
@suganya-sk
Author

I played around with this a bit more, setting:

compress: {
    defaults: true
}

in jupyterlab/staging/webpack.prod.minimize.js and running pip install . and jlpm install before starting a JupyterLab instance in the venv. However, this does not make any difference to the size of the assets.

Could I be looking at the wrong spot for the webpack config? I added a few simple log statements to the file mentioned above and confirmed that it is executed during pip install.

Any pointers here would be helpful.

@krassowski
Member

I can reproduce this; it indeed includes assets such as JavaScript or settings JSON. It seems really important, as Lighthouse claims we could save several seconds on page load here, though the question is how much time the extra compression itself will take.

[Screenshot: Lighthouse report, 2022-10-06]

This can be addressed in jupyter-server by enabling compress_response in the tornado settings. I will check it out locally and open a PR if it works well. I don't know if there is a reason it isn't currently enabled, but we can ask over at jupyter-server.
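
For reference, a minimal sketch of what that setting means at the tornado level (this is not jupyter-server code; the handler list is elided):

import tornado.web

# With compress_response enabled, tornado gzips any response body on the fly
# for requests that advertise Accept-Encoding: gzip.
app = tornado.web.Application(
    handlers=[],  # jupyter-server registers its real handlers here
    compress_response=True,
)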

@krassowski removed the status:Needs Triage label Oct 6, 2022
@mlucool
Contributor

mlucool commented Oct 6, 2022

FWIW, we have also thought about jupyter-server/jupyter_server#312 (comment). It may be better not to compress in Jupyter Server, and instead give us a way to "eject" and put a high-performance web server in front (e.g. nginx, caddy, Apache). That server would take care of compression and serving.

Alternatively, we can precompress assets and, if a client accepts a compression type, serve that with the right headers. That is, we gzip/brotli everything at a high compression level and store the compressed file in memory/on disk. Then, if a client accepts this type, we use it; if not, we serve the uncompressed asset or perhaps compress on the fly. This can be done for all extensions.
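
To illustrate the precompression half of that idea, a sketch using only the standard library and plain gzip (the directory argument and extension list are just examples):

import gzip
import pathlib

def precompress_assets(static_dir: str, level: int = 9) -> None:
    """Write a .gz sibling next to every compressible asset."""
    for path in pathlib.Path(static_dir).rglob("*"):
        if path.is_file() and path.suffix in {".js", ".css", ".json", ".svg", ".html"}:
            data = path.read_bytes()
            gz = gzip.compress(data, compresslevel=level)
            if len(gz) < len(data):  # keep it only if it actually helps
                path.with_name(path.name + ".gz").write_bytes(gz)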

@suganya-sk
Author

I should have added this in my first update, apologies.

I did try starting JupyterLab with --LabApp.tornado_settings="{'compress_response': True}" locally. This reduced the size of the assets significantly but did not make much of a difference to loading time. I still think compression is worth the effort as long as we are on HTTP/1.1, though.
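
For anyone else trying this, the same setting can also be made persistent in a config file rather than passed on the command line; a sketch assuming jupyter_server's ServerApp trait name:

# in e.g. ~/.jupyter/jupyter_server_config.py; `c` is injected by traitlets
c.ServerApp.tornado_settings = {"compress_response": True}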

@mlucool
Contributor

mlucool commented Oct 7, 2022

Alternatively, we can precompress assets and, if a client accepts a compression type, serve that with the right headers

If tornado compresses on the fly, most of the time savings could be negated by the compression itself (depending on a variety of factors specific to one's environment). What we care about most here is page load time. Since JupyterLab (and most plugins) hash their asset filenames, we could compress them at a very high compression level and serve them (assuming the request accepts that encoding) with much better caching headers. This is similar to nginx's gzip_static.
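
The caching-headers half is easy to express with tornado's set_extra_headers hook; a sketch only, where the handler name is hypothetical and it assumes every file it serves has a content hash in its name:

import tornado.web

class HashedAssetHandler(tornado.web.StaticFileHandler):
    def set_extra_headers(self, path: str) -> None:
        # A hashed filename changes whenever its content does, so the
        # browser may cache it indefinitely and never revalidate.
        self.set_header("Cache-Control", "public, max-age=31536000, immutable")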

I'll note that the many small assets generated by lab actually interact badly with HTTP/1.1's default settings and only really benefit from HTTP/2 (which requires another server in front, due to lack of support in tornado). An interesting article about this. FWIW, I don't think we should optimize for HTTP/1.1, but rather make it easy to run a server that supports HTTP/2.

@suganya-sk
Author

To make sure we've considered all angles here, I've logged jupyter-server/jupyter_server#1016. If the concern is time spent compressing on the fly, it might be worthwhile to explore static (ahead-of-time) compression as well, just to evaluate all options.

I also like Marc's alternative of caching pre-compressed assets.

On a related note,

FWIW, I don't think we should optimize for HTTP/1.1 but make it easy to run a server that supports HTTP/2.

Regardless of where/how compression occurs, FYI, I'm trying out the nginx + symlink setup Marc brought up in jupyter-server/jupyter_server#312 (comment) to understand how HTTP/2 helps with overall page load time. If there are any insights on compression along the way, I'll update here.

@krassowski
Member

I looked into the highlighted compress setting and realised that it is not what we assumed: compress in terser-webpack-plugin does not perform gzip compression, but transforms JavaScript to reduce the character count, e.g. {a: a} to {a}, as listed in the Terser documentation. This also suggests a likely reason it was disabled: some of the Terser compress transformations are unsafe (although currently all of those are disabled by default).

To compress static assets we could use compression-webpack-plugin. I got it working locally, but this is only half of the work, as we still don't have a way to serve the pre-compressed files (the plugin just adds a .gz copy next to the static files). We would need to modify jupyter-server to serve the contents of the .gz file (with the appropriate headers) when the browser's request indicates support for it. There is no easy way to do that in tornado (tornadoweb/tornado#2957), but it is definitely possible. I will take another look tomorrow to see how much overall performance benefit we would get with a simple tweak in FileFindHandler.
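
For the serving side, a rough sketch of the kind of tweak I mean, expressed against tornado's plain StaticFileHandler rather than jupyter-server's FileFindHandler (the class name and details are illustrative only):

import mimetypes
import os
import tornado.web

class PrecompressedStaticHandler(tornado.web.StaticFileHandler):
    """Serve foo.js.gz in place of foo.js when the client accepts gzip."""

    def validate_absolute_path(self, root, absolute_path):
        absolute_path = super().validate_absolute_path(root, absolute_path)
        if (
            absolute_path is not None
            and "gzip" in self.request.headers.get("Accept-Encoding", "")
            and os.path.exists(absolute_path + ".gz")
        ):
            self.set_header("Content-Encoding", "gzip")
            self.set_header("Vary", "Accept-Encoding")
            return absolute_path + ".gz"
        return absolute_path

    def get_content_type(self):
        # Guess the type from the original name, so foo.js.gz is served
        # as JavaScript rather than application/gzip.
        path = self.absolute_path
        if path.endswith(".gz"):
            path = path[:-3]
        return mimetypes.guess_type(path)[0] or "application/octet-stream"

Range requests and ETags would then apply to the compressed bytes, which is usually acceptable for hashed, immutable assets.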

@bollwyvl
Contributor

bollwyvl commented Feb 19, 2023

Yep, this is still a good idea, and something I brought up on #14038 (ironically, Binder is one of the places where we could guarantee HTTP/2 and SSL).

I don't think the general case of HTTP/2 on the desktop is reasonable, as it requires acquiring an SSL certificate, which is still a bridge too far for most end users, even if Let's Encrypt makes it very easy. From a privacy perspective, that would be yet another third-party system that security-minded folks would have to disable in the name of "performance" for some theoretical enterprise user.

The work is really probably "just" overriding StaticFileHandler.get. There's no reason JupyterLab couldn't claim /static/lab/ (and /static/labextensions/) with another handler subclass, so nothing would even have to change upstream.
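
Claiming those routes would just be a matter of registering the subclass, along the lines of the following sketch (the handler class and directory variables are hypothetical):

# hypothetical wiring for a subclass like the one sketched earlier
handlers = [
    (r"/static/lab/(.*)", PrecompressedStaticHandler, {"path": lab_static_dir}),
    (r"/static/labextensions/(.*)", PrecompressedStaticHandler, {"path": labextensions_dir}),
]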

As to compressing with gzip: we can do better. Brotli is built into Node.js, has 96% browser availability, and is directly supported by compression-webpack-plugin. As the built assets would already be compressed, the Python-based backend wouldn't need a dedicated brotli dependency. The gains are pretty real: on that Binder PR, the whole dev_mode/static would naively go down to 8.4 MB (vs 34 MB) when maximally compressed, though of course each release might end up having to tune the behavior. Among the most promising wins: the huge core style bundle, which blocks the main page load, would be an order of magnitude less data on the wire:

699K Feb 19 15:29 6660.f480ae3a17d71fc749b1.js
 65K Feb 19 15:29 6660.f480ae3a17d71fc749b1.js.br

But indeed: all of this is just wind if it's not measured, in real browsers, over time.
