Proof point: Show Jupyter Notebook launching kernels remotely #16
Comments
I'll take a look at this. |
It looks like there would be backend changes necessary to make this work. The sessionmanager.py starts a kernel on the local notebook server using the configured kernel manager. We would have to write our own kernel manager that talks REST to a kernel_gateway to manage kernels remotely. In addition, we would need to make sure that the web socket URL option is set on the frontend, so kernel.js will use that for the web socket connection instead of
|
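To make the "kernel manager that talks REST" idea above concrete, here is a minimal sketch using only the standard library. It assumes the gateway exposes the usual Jupyter kernel endpoints (`POST /api/kernels`, `DELETE /api/kernels/<id>`, and a Websocket at `/api/kernels/<id>/channels`). `RemoteKernelClient`, `Transport`, and `urllib_transport` are hypothetical names for illustration, not part of the notebook or kernel_gateway code.

```python
import json
from typing import Callable, Optional, Tuple
from urllib import request as urlrequest

# Hypothetical seam: (method, url, body) -> (status, response body).
# Lets the client be exercised without a running gateway.
Transport = Callable[[str, str, Optional[bytes]], Tuple[int, bytes]]


def urllib_transport(method: str, url: str, body: Optional[bytes]) -> Tuple[int, bytes]:
    """Send one HTTP request with the standard library."""
    req = urlrequest.Request(
        url, data=body, method=method,
        headers={"Content-Type": "application/json"},
    )
    with urlrequest.urlopen(req) as resp:
        return resp.status, resp.read()


class RemoteKernelClient:
    """Hypothetical helper that manages kernels on a remote gateway over REST."""

    def __init__(self, base_url: str, transport: Transport = urllib_transport):
        if not base_url.startswith(("http://", "https://")):
            raise ValueError("base_url must start with http:// or https://")
        self.base_url = base_url.rstrip("/")
        self.transport = transport

    def start_kernel(self, kernel_name: str = "python3") -> str:
        """POST to the gateway and return the new kernel's id."""
        body = json.dumps({"name": kernel_name}).encode("utf-8")
        status, data = self.transport("POST", f"{self.base_url}/api/kernels", body)
        if status not in (200, 201):
            raise RuntimeError(f"kernel start failed with HTTP {status}")
        return json.loads(data)["id"]

    def shutdown_kernel(self, kernel_id: str) -> None:
        """DELETE the kernel on the gateway."""
        status, _ = self.transport(
            "DELETE", f"{self.base_url}/api/kernels/{kernel_id}", None
        )
        if status != 204:
            raise RuntimeError(f"kernel shutdown failed with HTTP {status}")

    def ws_url(self, kernel_id: str) -> str:
        """Websocket URL the frontend would need (http -> ws, https -> wss)."""
        ws_base = "ws" + self.base_url[len("http"):]
        return f"{ws_base}/api/kernels/{kernel_id}/channels"
```

A real kernel manager would also have to cover listing, interrupt, and restart, and plug into the notebook's configurable kernel manager class; this only shows the shape of the REST calls.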
It turns out that the websocket URL can be set as an option to the Notebook app, so no changes to the frontend JS are needed. I just ran my notebook server using
I'm also using a
|
Turns out the Websocket Origin check is supported by Tornado as an optional way to prevent XSS, even though it's not part of the Websocket standard. The implementation in the notebook handlers doesn't provide security against non-notebook clients, which can set Origin to anything. For cross-domain security we need more than Origin checks anyway, which is one reason for the auth token support that went in. That said, we're still going to need to think security through in the long run for a notebook server requesting a remote kernel. Putting the kernel gateway key in the frontend for the client JS to pass to the remote server is a bad, bad idea. (We may wind up proxying Websockets from the notebook server to the kernel gateway after all for this very reason.) |
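For anyone following along, the kind of comparison an Origin check performs can be sketched in a few lines. This is an illustration only, not Tornado's or the notebook's actual code; `is_same_origin` is a hypothetical name, and rejecting a missing Origin is a policy choice made here for the sketch.

```python
from typing import Optional
from urllib.parse import urlparse


def is_same_origin(origin_header: Optional[str], host_header: str) -> bool:
    """Return True when the Origin header's host:port matches the Host header."""
    if not origin_header:
        # Policy choice for this sketch: no Origin header means reject.
        return False
    origin_netloc = urlparse(origin_header).netloc.lower()
    return origin_netloc == host_header.lower()


# A browser on the notebook's own page passes the check.
print(is_same_origin("http://nb.example:8888", "nb.example:8888"))
# A page served from another site does not.
print(is_same_origin("http://other.example", "nb.example:8888"))
```

Note that a non-browser client can send whatever Origin value it likes, so a forged matching header passes this check. That is why it can't stand in for real authentication.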
Another related problem: there's no way in the browser Websocket API to pass through additional headers. So a direct-from-browser websocket connection is not going to be able to take advantage of token auth via headers down the line. It'll need to switch to passing a token through the URL as a query param or some such. But, as noted above, we don't want a single shared token appearing in the HTML / JS sent down to a browser. It's gotta be the equivalent of a one-time CSRF token. I'm hesitant to hack in changes for this quick proof of concept. I can push a branch that turns off the origin check on Websocket for now. After this exploration, once we have something working, we need to sit down and think through how to do this securely. Options on the table that I see:
|
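As a side note on the "one-time token in the URL" idea from the comment above, here is a rough sketch of what issuing and redeeming such tokens could look like. Everything here is hypothetical (`OneTimeTokens`, the 30 second lifetime, the `?token=` parameter name), and it assumes the component that issues the token and the one that checks it can share state, which is itself an open design question.

```python
import secrets
import time
from typing import Callable, Dict


class OneTimeTokens:
    """Hypothetical store of short-lived tokens that can each be used once."""

    def __init__(self, ttl_seconds: float = 30.0,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._expiry: Dict[str, float] = {}

    def issue(self) -> str:
        """Create a random token and remember when it expires."""
        token = secrets.token_urlsafe(32)
        self._expiry[token] = self._clock() + self._ttl
        return token

    def redeem(self, token: str) -> bool:
        """Accept a token at most once, and only before it expires."""
        expires_at = self._expiry.pop(token, None)
        return expires_at is not None and self._clock() <= expires_at
```

The page would then open something like `ws://.../channels?token=<issued token>` (parameter name assumed), and the server side would call `redeem` during the Websocket handshake. A token leaked from the page after use is worthless, unlike a single shared key.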
Working from @parente's branch to disable WS CORS on the kernel_gateway, I have remote kernels pretty much working. I can launch new kernels when notebooks are opened or created, and delete them on notebook close or nbserver shutdown. The biggest issue is restarting kernels. The SessionManager creates a kernel when a session is created, and stores the kernel_id as part of the session (kernel_id is part of the session SQL schema), which means kernel_id gets out of sync unless the kernel manager updates the session with a new kernel_id on a kernel restart (which DELETEs the old kernel_id, then POSTs to get a new kernel_id). |
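To illustrate the restart problem described above: because a remote restart is a DELETE followed by a POST, the kernel id changes, and whatever maps sessions to kernels has to be told. A minimal in-memory sketch, with `FakeGateway` and `SessionTable` as hypothetical stand-ins for the kernel gateway and the notebook's SQL-backed session store:

```python
import itertools
from typing import Dict, Set


class FakeGateway:
    """Stand-in for the remote gateway; every POST yields a fresh kernel id."""

    def __init__(self) -> None:
        self._counter = itertools.count(1)
        self.running: Set[str] = set()

    def post_kernel(self) -> str:
        kernel_id = f"kernel-{next(self._counter)}"
        self.running.add(kernel_id)
        return kernel_id

    def delete_kernel(self, kernel_id: str) -> None:
        self.running.remove(kernel_id)


class SessionTable:
    """Stand-in for the session store, which keeps a kernel_id per session."""

    def __init__(self) -> None:
        self._kernel_for_session: Dict[str, str] = {}

    def create(self, session_id: str, kernel_id: str) -> None:
        self._kernel_for_session[session_id] = kernel_id

    def kernel_id(self, session_id: str) -> str:
        return self._kernel_for_session[session_id]

    def replace_kernel(self, old_id: str, new_id: str) -> None:
        """Point every session that used old_id at new_id."""
        for session_id, kernel_id in list(self._kernel_for_session.items()):
            if kernel_id == old_id:
                self._kernel_for_session[session_id] = new_id


def restart_kernel(gateway: FakeGateway, sessions: SessionTable, old_id: str) -> str:
    """Restart as DELETE + POST, then resync the session store."""
    gateway.delete_kernel(old_id)
    new_id = gateway.post_kernel()
    sessions.replace_kernel(old_id, new_id)  # without this, sessions go stale
    return new_id
```

Dropping the `replace_kernel` call reproduces the out-of-sync state: the session still points at a kernel id the gateway has already deleted.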
Sorry I had not followed up on this one yet. How are we doing auth on websockets now? |
Token auth in headers, which doesn't square with the capabilities of the browser WebSocket APIs (but works fine in all other clients). We need to sit and think through the options in #16 (comment) or other options for how to design this properly for the long haul. At the moment, @jtyberg is just working on a quick hack to flush out these issues. |
With regard to kernel restarts, interrupts, etc. from the notebook browser UI, kernel.js is hitting the kernel REST endpoints directly, and assumes the endpoints are on the same server as the notebook server. So it looks like frontend JS changes will also be necessary to support kernels that are remote from the notebook server. Also, as expected, we have to set Access-Control-Allow-Origin on the kernel_gateway to enable this to work. |
In prep for a convo about this with others, maybe a small wiki page attached to this project with the three options for how to proceed with a real impl with their pros/cons is in order. |
Well, I got the remote kernel restarts working, but I had to muck with the JS a bit. There doesn't appear to be a NotebookApp option similar to I created a branch here: https://github.com/jtyberg/notebook/tree/remote_kernels I'll take a stab at a wiki page to summarize decision points. |
This seems like something that @zischwartz had to patch in Thebe. |
Aforementioned wiki page: https://github.com/jupyter-incubator/kernel_gateway/wiki/notebook_kernel_gateway I think the experiment here fleshed out pain points and options for remote kernels in the notebook if we want to go there one day. We're not the only ones, btw, and certainly not the only approach (e.g., https://github.com/danielballan/remotekernel). I don't think there's anything else to do here for now. |
@jtyberg has implemented a demo flavor of option 3 from the wiki page from way-back-when over in jupyter/kernel_gateway_demos#21 Going to close this issue out. We can continue the convo over in the PR. |
To prove out the concept, try the following in a personal fork somewhere and let's see how it goes. The changes are something like so:
It sounds straightforward, but there are unknowns around how tightly coupled sessions are to kernels in the local notebook backend. Can the two be divorced so that the sessions are maintained locally in the notebook server and the kernels remotely in the kernel gateway?
/cc @jtyberg