
Compress static assets served by Jupyter Lab #13189

Open
suganya-sk opened this issue Oct 5, 2022 · 8 comments

@suganya-sk

Problem

JupyterLab's static assets are flagged by Chrome's Lighthouse audits for missing text compression, even when no extensions are installed. Given the ongoing push in the JupyterLab community to improve performance, this seems like a candidate worth looking into.

I have been trying to compress the static assets associated with a few custom extensions of mine, which are pretty large, and I noticed that the assets served from static/lab/ are flagged even without any extensions installed. I'm wondering if compression might help here.

Proposed Solution

Consider compressing the static assets served by JupyterLab.

I dug through JupyterLab's build process a bit and noticed that compression is set to false here. Is there a specific reason for this, or does compression occur somewhere else? Any pointers to relevant discussions are helpful! I have a high-level understanding of the build process; please redirect me if I should be looking elsewhere.

@jupyterlab-probot bot added the status:Needs Triage label Oct 5, 2022
@suganya-sk
Author

I played around with this a bit more, setting:

compress: {
    defaults: true
}

in jupyterlab/staging/webpack.prod.minimize.js and running pip install . and jlpm install before starting a JupyterLab instance in the venv. However, this does not make any difference to the size of the assets.

Could I be looking at the wrong spot for the webpack config? I added a few simple log statements to the file mentioned above and confirmed that it is executed during pip install.

Any pointers here would be helpful.

@krassowski
Member

I can reproduce this; it indeed includes assets such as JavaScript or settings JSON. It seems really important, as Lighthouse claims we could save several seconds on page load here, though the question is how much time the extra compression itself will take.

[Screenshot: Lighthouse report, 2022-10-06]

This can be addressed in jupyter-server by enabling compress_response in the tornado settings. I will check it out locally and open a PR if it works well. I don't know if there is a reason it isn't currently enabled, but we can ask over at jupyter-server.
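
For reference, a minimal sketch of what that setting means at the tornado level (this is not jupyter-server code; the handler list is elided):

import tornado.web

# With compress_response enabled, tornado gzips any response body on the fly
# for requests that advertise Accept-Encoding: gzip.
app = tornado.web.Application(
    handlers=[],  # jupyter-server registers its real handlers here
    compress_response=True,
)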

@krassowski removed the status:Needs Triage label Oct 6, 2022
@mlucool
Contributor

mlucool commented Oct 6, 2022

FWIW, we have also thought about jupyter-server/jupyter_server#312 (comment). It may be better not to compress in Jupyter Server, and instead give us a way to "eject" and put a high-performance web server in front (e.g. nginx, caddy, Apache). That server would take care of compression and serving.

Alternatively, we can precompress assets and, if a client accepts a compression type, serve that with the right headers. That is, we gzip/brotli everything at a high compression level and store the compressed file in memory/on disk. Then, if a client accepts this type, we use it; if not, we serve the uncompressed asset or perhaps compress on the fly. This can be done for all extensions.
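
To illustrate the precompression half of that idea, a sketch using only the standard library and plain gzip (the directory argument and extension list are just examples):

import gzip
import pathlib

def precompress_assets(static_dir: str, level: int = 9) -> None:
    """Write a .gz sibling next to every compressible asset."""
    for path in pathlib.Path(static_dir).rglob("*"):
        if path.is_file() and path.suffix in {".js", ".css", ".json", ".svg", ".html"}:
            data = path.read_bytes()
            gz = gzip.compress(data, compresslevel=level)
            if len(gz) < len(data):  # keep it only if it actually helps
                path.with_name(path.name + ".gz").write_bytes(gz)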

@suganya-sk
Author

I should have added this in my first update, apologies.

I did try starting JupyterLab with --LabApp.tornado_settings="{'compress_response': True}" locally. This reduced the size of the assets significantly but did not make much of a difference to loading time. I still think compression is worth the effort as long as we are on HTTP/1.1, though.
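
For anyone else trying this, the same setting can also be made persistent in a config file rather than passed on the command line; a sketch assuming jupyter_server's ServerApp trait name:

# in e.g. ~/.jupyter/jupyter_server_config.py; `c` is injected by traitlets
c.ServerApp.tornado_settings = {"compress_response": True}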

@mlucool
Contributor

mlucool commented Oct 7, 2022

Alternatively, we can precompress assets and, if a client accepts a compression type, serve that with the right headers

If tornado compresses on the fly, most of the time savings could be negated by the compression itself (depending on a variety of factors specific to one's environment). What we care about most here is page load time. Since JupyterLab (and most plugins) hash their asset filenames, we could compress them at a very high compression level and serve them (assuming the request accepts that encoding) with much better caching headers. This is similar to nginx's gzip_static.
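
The caching-headers half is easy to express with tornado's set_extra_headers hook; a sketch only, where the handler name is hypothetical and it assumes every file it serves has a content hash in its name:

import tornado.web

class HashedAssetHandler(tornado.web.StaticFileHandler):
    def set_extra_headers(self, path: str) -> None:
        # A hashed filename changes whenever its content does, so the
        # browser may cache it indefinitely and never revalidate.
        self.set_header("Cache-Control", "public, max-age=31536000, immutable")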

I'll note that the many small assets generated by lab actually interact badly with HTTP/1.1's default settings and only really benefit from HTTP/2 (which requires another server in front, due to lack of support in tornado). An interesting article about this. FWIW, I don't think we should optimize for HTTP/1.1, but rather make it easy to run a server that supports HTTP/2.

@suganya-sk
Author

To make sure we've considered all angles here, I've logged jupyter-server/jupyter_server#1016. If the concern is time spent compressing on the fly, it might be worthwhile to explore static (ahead-of-time) compression as well, just to evaluate all options.

I also like Marc's alternative of caching pre-compressed assets.

On a related note,

FWIW, I don't think we should optimize for HTTP/1.1 but make it easy to run a server that supports HTTP/2.

Regardless of where/how compression occurs, FYI, I'm trying out the nginx + symlink setup Marc brought up in jupyter-server/jupyter_server#312 (comment) to understand how HTTP/2 helps with overall page load time. If there are any insights on compression along the way, I'll update here.

@krassowski
Member

I looked into the highlighted compress setting and realised that it is not what we assumed: compress in terser-webpack-plugin does not perform gzip compression, but transforms JavaScript to reduce the character count, e.g. {a: a} to {a}, as listed in the Terser documentation. This also suggests a likely reason it was disabled: some of the Terser compress transformations are unsafe (although currently all of those are disabled by default).

To compress static assets we could use compression-webpack-plugin. I got it working locally, but this is only half of the work, as we still don't have a way to serve the pre-compressed files (the plugin just adds a .gz copy next to the static files). We would need to modify jupyter-server to serve the contents of the .gz file (with the appropriate headers) when the browser's request indicates support for it. There is no easy way to do that in tornado (tornadoweb/tornado#2957), but it is definitely possible. I will take another look tomorrow to see how much overall performance benefit we would get with a simple tweak in FileFindHandler.
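
For the serving side, a rough sketch of the kind of tweak I mean, expressed against tornado's plain StaticFileHandler rather than jupyter-server's FileFindHandler (the class name and details are illustrative only):

import mimetypes
import os
import tornado.web

class PrecompressedStaticHandler(tornado.web.StaticFileHandler):
    """Serve foo.js.gz in place of foo.js when the client accepts gzip."""

    def validate_absolute_path(self, root, absolute_path):
        absolute_path = super().validate_absolute_path(root, absolute_path)
        if (
            absolute_path is not None
            and "gzip" in self.request.headers.get("Accept-Encoding", "")
            and os.path.exists(absolute_path + ".gz")
        ):
            self.set_header("Content-Encoding", "gzip")
            self.set_header("Vary", "Accept-Encoding")
            return absolute_path + ".gz"
        return absolute_path

    def get_content_type(self):
        # Guess the type from the original name, so foo.js.gz is served
        # as JavaScript rather than application/gzip.
        path = self.absolute_path
        if path.endswith(".gz"):
            path = path[:-3]
        return mimetypes.guess_type(path)[0] or "application/octet-stream"

Range requests and ETags would then apply to the compressed bytes, which is usually acceptable for hashed, immutable assets.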

@bollwyvl
Contributor

bollwyvl commented Feb 19, 2023

Yep, this is still a good idea, and something I brought up on #14038 (ironically, Binder is one of the places where we could guarantee HTTP/2 and SSL).

I don't think the general case of HTTP/2 on the desktop is reasonable, as it requires acquiring an SSL certificate, which is still a bridge too far for most end users, even if Let's Encrypt makes it very easy. From a privacy perspective, that would be yet another third-party system that security-minded folks would have to disable in the name of "performance" for some theoretical enterprise user.

The work is really probably "just" overriding StaticFileHandler.get. There's no reason JupyterLab couldn't claim /static/lab/ (and /static/labextensions/) with another handler subclass, so nothing would even have to change upstream.
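
Claiming those routes would just be a matter of registering the subclass, along the lines of the following sketch (the handler class and directory variables are hypothetical):

# hypothetical wiring for a subclass like the one sketched earlier
handlers = [
    (r"/static/lab/(.*)", PrecompressedStaticHandler, {"path": lab_static_dir}),
    (r"/static/labextensions/(.*)", PrecompressedStaticHandler, {"path": labextensions_dir}),
]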

As to compressing with gzip: we can do better. Brotli is built into Node.js, has 96% browser availability, and is directly supported by compression-webpack-plugin. As the built assets would already be compressed, the Python-based backend wouldn't need a dedicated brotli dependency. The gains are pretty real: on that Binder PR, the whole dev_mode/static would naively go down to 8.4 MB (vs 34 MB) when maximally compressed, though of course each release might end up having to tune the behavior. Among the most promising wins: the huge core style bundle, which blocks the main page load, would be an order of magnitude less data on the wire:

699K Feb 19 15:29 6660.f480ae3a17d71fc749b1.js
 65K Feb 19 15:29 6660.f480ae3a17d71fc749b1.js.br

But indeed: all of this is just wind if it's not measured, in real browsers, over time.
