Support other loader mechanisms? #1

Open
bollwyvl opened this issue Jan 22, 2020 · 15 comments
Comments

@bollwyvl

Cool stuff!

We've had a lot of fun with requirejs over the years, and there's a lot out there that uses it (sometimes due to us fighting for upstreams to publish UMD), but ES modules are supported in all browsers where jupyterlab will otherwise work, work really well, and don't require (har!) learning a different pattern than all the other stuff out there. I want the hours of my life back dealing with requirejs shims, and paths, modules, etc.

Anyhow, in the interim, perhaps a jlab-dynext-core which handled the Token resolution, a jlab-dynext-requirejs, and a jlab-dynext-esmodules for The Future (Today). Indeed, once Lab 2.0 drops, looking to 3.0 it would be interesting indeed to see if https://github.com/pikapkg/snowpack could provide a standards-forward approach to get the whole shooting match out of bundlerizing on the end user's machine, and likely making local development more pleasant, to boot.

Anyhow: of note on the extension editor feature: we're still plugging along on https://github.com/krassowski/jupyterlab-lsp and would very much like to see a typescript/javascript language server be part of that editor experience. In this particular case, we may well want a language server wired right into Lab so that we can override file-discovery-based things (eek, node_modules) to pull stuff directly off the working lab, rather than delegating to node over a websocket...

@wolfv
Member

wolfv commented Jan 22, 2020

Hey,

thanks for commenting! As far as I know, ES6 modules can't be loaded from an external URL (e.g. jsdelivr)?
I am referencing this stackoverflow post: https://stackoverflow.com/questions/34607252/es6-import-module-from-url . This would be pretty key for this extension, e.g. to dynamically load widget libraries from a CDN or internal repo.

I agree that all the different ways of bundling + loading JS are about as confusing as they can be, and the new module syntax seems nicer.
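
For reference, a minimal sketch of what URL-based loading can look like. Browsers accept a full URL in a dynamic `import()` call, provided the server sends a JavaScript MIME type and CORS headers. The helper names `cdnUrl` and `loadFromCdn` and the `https://esm.sh` default are assumptions made for illustration; they are not part of this extension.

```typescript
// Hypothetical helper: build a URL for an npm package served as an ES module.
// The base URL is an assumption; any CDN that serves ES modules would do.
function cdnUrl(pkg: string, version?: string, base = "https://esm.sh"): string {
  const name = version ? `${pkg}@${version}` : pkg;
  // Drop trailing slashes so the join never produces "//".
  return `${base.replace(/\/+$/, "")}/${name}`;
}

// Hypothetical loader: in a browser, dynamic import() takes an absolute URL.
// The webpackIgnore comment asks webpack to leave the call alone at build time.
// The result is typed loosely because nothing is known about the remote module.
function loadFromCdn(pkg: string, version?: string): Promise<Record<string, unknown>> {
  return import(/* webpackIgnore: true */ cdnUrl(pkg, version));
}
```

Pinning a version in the URL, as in `cdnUrl("react", "17.0.2")`, keeps the loaded code from changing silently, though it does not by itself prove the content is what was expected.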

@bollwyvl
Author

bollwyvl commented Jan 22, 2020 via email

@wolfv
Member

wolfv commented Jan 22, 2020

Yeah, one obvious idea is to make a companion jupyter-server extension which serves jlab-extensions from some location like ~/.local/share/jupyter/jlab_extensions/..., so that, indeed, they could be conda / pip installed (or installed through the snippet manager jlab-extension that still needs to be developed).

I am not sure whether attaching extensions to a notebook is a very good idea. There are also huge security implications, as JS extensions in the JLab context can do anything to the host system (basically run arbitrary code with user permissions...). So that should be scrutinized!

@bollwyvl
Author

> attaching extensions to a notebook is a very good idea

Not to be flippant, but how is the approach in this extension more secure? It brings back the "normal" of loading random stuff from a CDN (which will eventually disappear, be firewalled, or worse, subverted), with full access to the Lab instance and all the plugins, so I don't think the security-conscious would install any of this anyway.

Anyhow: my point is: People understand putting (individual) files in places. We have a container format that could provide a virtual file system. An "extension notebook" (which maybe might only allow markdown cells) would be its own documentation, and a little wee Word-like icon could be clicked and pop a dialog saying:

|--------------------|
| Extension Notebook |__________________________________
|                                                       |
| Untitled.ipynb contains the following extensions:     |
|                                                       |
| |---------------------------------------------|       |
| | super duper not malicious extension   v1.0  |       |
| |                                             |       |
| | I promise this isn't evil                   |       |
| |                                             |       |
| | [ view 47 files ]                           |       |
| |                                             |       |
| | SHA: abcedf  Signed by: <foo> [check key]   |       |
| |---------------------------------------------|       |
|                                                       |
| To install, copy it to your application directory and |
| reload the JupyterLab application                     |
|                                                       |
|-------------------------------------------------------|

Viewing the files could pop a new filebrowser, and you could copy/paste stuff into it, make new files, and then when you're ready, you could sign it.

@wolfv
Member

wolfv commented Jan 22, 2020

My point is that when you install something like bqplot from a CDN you can be somewhat sure about what you're getting, and you're making a (somewhat) conscious decision.

When you execute a notebook that you downloaded from somewhere and it auto-loads some extension, it would be less transparent to me what's going on. But if we make it visible enough and show the user what kind of extension gets loaded from the notebook ... why not?

Anyway, I have been advocating for a while for jailing conda environments more strongly, e.g. to allow, at the system level, a jupyterlab instance only access to the current work directory and the read-only container file system.

I have some / a lot of ideas in this direction but lack the time to execute them :)

@bollwyvl
Author

I guess it's about defense in depth, and magic trusted hosts and containers can't be the endgame. People need to be able to build their own tools without hot access to The Internet, or perhaps The Internet last year.

> from a CDN you can be somewhat sure

https://justi.cz/security/2018/05/23/cdn-tar-oops.html

The only way I'd feel safe about that is by hard-pinning to a specific, non-broken crypto hash and caching it locally, but then what if their upstreams (because require/system) can pull random stuff? Then there's the JS-VM-in-JS/iframe/worker/whatever approach. There are some good writeups from Figma about that somewhere.

> auto-loads some extension,

I'm not suggesting it get auto-loaded, but that, like a tarball of random stuff from npm, a notebook is a container that could be used to distribute things for lab, except people could build/test them in lab, and they would be made of local sources. People already run random notebooks from the internet, and kernels can do a sight more damage than JS, usually: there's no private browsing unless, to your next point...

> only access to the current work directory and the read-only container file system.

At some point, and I'd say this is ongoing, I would imagine kernel management needs to go the way of the hub spawners, and lightweight solutions need to be found to the local sandboxed execution story, reasonably, on many platforms, without getting in the way of people's ability to Do Work. Snaps on most Linux distros, Sandboxie on Windows, probably some special thing on OSX, as they continue to ratchet down on your freedom. Even if containers are part of the story, any of the above might still need to be wrapped in another VM, and even those hypervisors still have vulnerabilities (USB seems to be a favorite).

Super off topic at this point, and again, I really like where the extension is going!

@wolfv
Member

wolfv commented Jan 22, 2020

Yep, I am aware of the Figma approach. I think the Figma or VSCode approach of having very narrow interfaces and a permission system, and sandboxing the JS worker so that it only communicates via IPC or something like that, is a good option for untrusted extensions, but ... that's major work :)

Sandboxing would still be great to have, it would probably give me some peace of mind. :)
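
To make the "narrow interface plus permission system" idea concrete, here is a rough sketch of the host side of such a message channel. Everything in it is hypothetical: the permission names, the command names, and `handleRequest` are invented for illustration and do not correspond to any existing JupyterLab, Figma, or VSCode API. The transport (worker messages, IPC) is left out entirely.

```typescript
// Hypothetical permissions a user could grant to an untrusted extension.
type Permission = "notebook:read" | "notebook:write" | "ui:notify";

// Hypothetical message an extension sends to the host.
interface ExtensionRequest {
  command: string;
  payload?: unknown;
}

interface HostCommand {
  needs: Permission;
  run: (payload: unknown) => unknown;
}

// The complete host surface: anything not listed here cannot be called.
const COMMANDS: Record<string, HostCommand> = {
  "notebook.cellCount": { needs: "notebook:read", run: () => 3 },
  "ui.notify": { needs: "ui:notify", run: (text) => `notified: ${String(text)}` },
};

// Execute a request only if the command is known and its permission was granted.
function handleRequest(req: ExtensionRequest, granted: readonly Permission[]): unknown {
  const known = Object.prototype.hasOwnProperty.call(COMMANDS, req.command);
  if (!known) {
    throw new Error(`unknown command: ${req.command}`);
  }
  const entry = COMMANDS[req.command];
  if (granted.indexOf(entry.needs) === -1) {
    throw new Error(`permission denied: ${entry.needs}`);
  }
  return entry.run(req.payload);
}
```

The point of the design is that the extension never holds a reference to host objects; it can only name a command, and the host decides whether to run it.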

@krassowski
Member

Two years later:

SystemJS together with the SystemJS Babel Extension and systemjs-webpack-interop could allow fully dynamic, module-type-independent loading of external modules and re-use of what webpack already loaded. The latter would require modifications to the webpack entry point, which makes it difficult to add in a prebuilt extension (or maybe even impossible without modifying the JupyterLab core?); I also don't know how it interacts with things loaded via module federation.

But even without webpack integration, SystemJS + Babel looks interesting when it comes to the capabilities they could bring (loading basically anything useful which is out there on npm). I am not convinced that it improves security by much, though.

@krassowski
Member

In #28 I am building on the existing approach of encapsulating the user code in a function call to expand the capability of this extension, thinking mostly about the "novice extension author" experience, where the ability to execute code exactly as it would work in a standalone extension is important. I am achieving that by transpilation via the TypeScript compiler and making the requirejs calls an implementation detail (for the most part). I am also exposing the @jupyterlab and @lumino modules loaded by webpack using a custom TypeScript transformer (which also allows loading user code via a function). The requirejs loading could be swapped for SystemJS + Babel to expand the range of modules which can be loaded, and in that case we might replace the TypeScript transformers with Babel transformers; my understanding is that both would work equally well.
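
As a rough illustration of "encapsulating the user code in a function call", the sketch below wraps a source string in a function that only receives `require` and `exports`. It is not the implementation from #28: `runUserCode` and `ModuleRegistry` are made-up names, the transpilation step is skipped (the input is assumed to be CommonJS-style JavaScript already), and the registry is a plain object standing in for the modules webpack has loaded.

```typescript
// Hypothetical registry of modules the host application already has in memory.
type ModuleRegistry = Record<string, unknown>;

// Wrap the user's source in a function whose only inputs are `require` and
// `exports`, then call it. This is encapsulation, not isolation: the code
// still runs with the full privileges of the page.
function runUserCode(source: string, registry: ModuleRegistry): Record<string, unknown> {
  const exportsObj: Record<string, unknown> = {};
  const requireFn = (name: string): unknown => {
    if (!Object.prototype.hasOwnProperty.call(registry, name)) {
      throw new Error(`module not available: ${name}`);
    }
    return registry[name];
  };
  // The Function constructor compiles the source with the two named parameters.
  const wrapped = new Function("require", "exports", `"use strict";\n${source}`);
  wrapped(requireFn, exportsObj);
  return exportsObj;
}
```

With a registry such as `{ "@lumino/widgets": someModule }`, user code calling `require("@lumino/widgets")` gets the very object the host passed in, which is the property the comment above is after when it talks about re-using what webpack loaded.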

@krassowski
Member

krassowski commented Jan 2, 2022

The need for sandboxing seems to be a consensus, but it looks like there are now two separate scenarios which need different degrees of isolation (or none at all):

  • "novice extension developer who wants to modify the UI and have full control over lab" - they may be running on Binder and don't care about sandboxing; they want full access to JupyterLab API and want the external modules to just work.
  • "advanced user who wants to add a module for visualisation which is only needed in a specific notebook" - they don't want their real work environment to be at risk ever, they don't care about JupyterLab API apart from things directly relating to notebook, or maybe even widget outputs.

This sounds like two different extensions to me. Or one extension with very different settings to accommodate both use cases.

@krassowski
Member

So there is also ESM.sh, which compiles NPM packages to ES6 modules (against a chosen target) and also provides TypeScript types via X-TypeScript-Types for the LSP (we would need to do something to support them though). It looks popular (maybe due to lack of alternatives) and well thought through, but it is not a major player like jsdelivr, so there is a question about its long-term prospects (and also where their main server is located).

The bigger issue with ESM is that there is no support for SRI. Even requirejs has it, but ESM.sh does not; we could fetch the resource and manually compute the integrity, but this does not solve the fundamental issue of them not guaranteeing that the builds are reproducible (though we could pin the build number to alleviate the issue).
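
A minimal sketch of "fetch the resource and manually compute the integrity", assuming a Node-style `crypto` module is available (in a browser, Web Crypto would play the same role). The fetching itself is left out; `computeIntegrity` and `matchesPinned` are hypothetical helper names. As the comment above notes, this only detects that content changed; it does not make the upstream build reproducible.

```typescript
import { createHash } from "crypto";

type SriAlgorithm = "sha256" | "sha384" | "sha512";
const SRI_ALGORITHMS: SriAlgorithm[] = ["sha256", "sha384", "sha512"];

// Compute an SRI-style string ("<algorithm>-<base64 digest>") for content
// that has already been fetched.
function computeIntegrity(content: string, algorithm: SriAlgorithm = "sha384"): string {
  const digest = createHash(algorithm).update(content, "utf8").digest("base64");
  return `${algorithm}-${digest}`;
}

// Check fetched content against a pinned integrity string before using it.
function matchesPinned(content: string, pinned: string): boolean {
  const prefix = pinned.split("-", 1)[0];
  const algorithm = SRI_ALGORITHMS.filter((a) => a === prefix)[0];
  if (!algorithm) {
    return false; // unknown or missing algorithm prefix
  }
  return computeIntegrity(content, algorithm) === pinned;
}
```

A caller would record the output of `computeIntegrity` once, at the time a module is reviewed, and refuse to evaluate later downloads for which `matchesPinned` returns false.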

@krassowski
Member

krassowski commented Jan 8, 2022

On the issue of SRI and providing multiple loaders, I was looking into whether we could structure the settings around the unofficial import maps standard. It does not currently support integrity, but there are some discussions around this topic: WICG/import-maps#221. SystemJS adopted the integrity extension to import maps.
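
To show what settings structured around an import map might carry, here is a deliberately simplified sketch. The `imports` field follows the import maps proposal; the top-level `integrity` field keyed by resolved URL is an assumption modelled on the integrity extension mentioned above, and its exact shape should be checked against the loader actually used. `resolveSpecifier` is a hypothetical name and handles exact matches only (no scopes, no trailing-slash prefixes).

```typescript
// Simplified import-map shape; the `integrity` field is an assumed extension.
interface ImportMapWithIntegrity {
  imports: Record<string, string>;
  integrity?: Record<string, string>;
}

interface ResolvedModule {
  url: string;
  integrity?: string;
}

// Own-property lookup that never picks up inherited keys such as "toString".
function lookup(table: Record<string, string> | undefined, key: string): string | undefined {
  if (table && Object.prototype.hasOwnProperty.call(table, key)) {
    return table[key];
  }
  return undefined;
}

// Resolve a bare specifier to a URL and, if one is pinned, its integrity string.
function resolveSpecifier(map: ImportMapWithIntegrity, specifier: string): ResolvedModule {
  const url = lookup(map.imports, specifier);
  if (url === undefined) {
    throw new Error(`unmapped specifier: ${specifier}`);
  }
  const integrity = lookup(map.integrity, url);
  return integrity === undefined ? { url } : { url, integrity };
}
```

Keeping the integrity table keyed by URL rather than by specifier means two specifiers that resolve to the same file share one pin.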

Still, adding multiple loaders on top of the import map, integrity and scopes makes it overly complicated - the opposite of what this extension was meant to be. At this point running npm install would be easier for the user than figuring out which loader they should use and how to configure it.

One reasonable solution would be to require that all import statements (other than the ones that import builtins/tokens) include a prefix which defines what kind of a loader should be used e.g.:

// loads from esm.sh or other CDN configured in the settings
import React from 'es:react'
// loads using require.js from a CDN configured in the settings 
import React from 'amd:react'
// takes the module from webpack to use exact same object as the JupyterLab
import React from 'builtin:react'   // or jupyterlab:

Or, this could be defined in the settings as a loadMap: {"react": "es"}.
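
A small sketch of how the prefix scheme and the `loadMap` setting could be combined when deciding which loader handles an import. The function name `pickLoader`, the result shape, and the `"amd"` fallback are assumptions for illustration; only the `es:`, `amd:`, `builtin:` and `jupyterlab:` prefixes and the `loadMap` idea come from the proposal above.

```typescript
type LoaderKind = "es" | "amd" | "builtin";

interface LoaderTarget {
  loader: LoaderKind;
  name: string;
}

// Recognised prefixes; "jupyterlab:" is treated as an alias for "builtin:".
const PREFIXES: Record<string, LoaderKind> = {
  es: "es",
  amd: "amd",
  builtin: "builtin",
  jupyterlab: "builtin",
};

// Decide which loader handles a specifier: an explicit prefix wins, then the
// loadMap setting, then the fallback (assumed here to be "amd").
function pickLoader(
  specifier: string,
  loadMap: Record<string, LoaderKind> = {},
  fallback: LoaderKind = "amd"
): LoaderTarget {
  const colon = specifier.indexOf(":");
  if (colon > 0) {
    const prefix = specifier.slice(0, colon);
    if (Object.prototype.hasOwnProperty.call(PREFIXES, prefix)) {
      return { loader: PREFIXES[prefix], name: specifier.slice(colon + 1) };
    }
  }
  const mapped = Object.prototype.hasOwnProperty.call(loadMap, specifier)
    ? loadMap[specifier]
    : undefined;
  return { loader: mapped ?? fallback, name: specifier };
}
```

Unknown prefixes (for example a full `https:` URL) fall through untouched, so the scheme does not collide with ordinary URL specifiers.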

@krassowski
Member

Or am I overthinking and there should be just a single setting "loader": "requirejs" or "loader": "es" and mixing of the two should not be allowed? This would simplify it for the user I think...

@bollwyvl
Author

bollwyvl commented Jan 8, 2022

Yeah, as the linked jupyterlite thing suggests, at the end of the day the dependency resolution mechanism (if not the actual syntax) eventually defines what a "kernel-like" runtime really is, and expands to become its own distribution. Hence the original suggestion of pluggable loaders, where you would select/document the one your extension uses, and would need to ensure it gets installed by others to be portable, and won't work if it's not installed.

Making the dependencies declarative, outside of the code itself (wherever that loader gets set), might provide a way to do this before the code actually gets parsed... and would get someone 80% of the way to the hard parts of filling in a cookiecutter-generated package.json... which I hope is still a maturation target for someone using this. This is not to say one couldn't suggest this from the typing import 'a<TAB>, but having an at-rest state that doesn't require anticipating how these things work would open up other avenues.

@meeseeksmachine

This issue has been mentioned on Jupyter Community Forum. There might be relevant details there:

https://discourse.jupyter.org/t/what-is-the-2022-way-to-display-javascript-in-a-python-notebook-in-jupyter-lab/12318/5


4 participants