dask in pyscript #6257
Comments
I'd be excited to see this (but don't know yet how useful it will be). I
think that the first question is really just about comms. We can remove
tornado if we need to, but we'll need to find some real and installable
websockets library that plays well with wasm. My guess is that that is the
crux of the problem, and I would encourage anyone interested in this
problem to start there.
On Mon, May 2, 2022, 11:50 AM Martin Durant wrote:
For those that didn't hear the news, pyscript <https://pyscript.net/> is
a new project from Anaconda, to run a complete CPython runtime in the
browser via wasm. I will post the pycon keynote when it becomes available.
There are already impressive demos interacting with existing pydata tools
(numpy, pandas) as well as hooks into JS display tools (d3...) *without
any server at all*.
One thing pyscript does not have is a good data story, because the browser
environment doesn't support raw (TCP) sockets, only HTTP(S) as provided by
the browser runtime and limited by CORS. That means that fsspec has a really
hard time doing anything useful.
On the other hand, dask is able to talk to a distributed cluster over
websockets - this already works today. My question is, is there any
interest in pushing dask forward as an in-browser, async-mode client?
Interestingly, this would enable full fsspec operation by doing all
operations via a worker. From the point of view of Coiled (or others that
run a remote scheduler), it would get around the tricky/jhub issue of where
to run the python kernel.
Progress would require a complex build chain to wasm-ify dask's
dependencies, but I think that so long as we steer away from explicit
socket-level stuff by relying on websockets, it should be doable. (noting
that the current python websocket stack relies on python's builtin
httplib/sockets, so would need rewriting to use the browser's internal
JS-facing interface).
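The idea of doing all fsspec operations via a worker can be sketched roughly as follows. This is only an illustration under assumptions: `read_head` and `fetch_via_worker` are hypothetical names, a plain `open` stands in for what would presumably be an fsspec call on the worker, and `ThreadPoolExecutor` stands in for `distributed.Client`, since both expose `submit` and return futures.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def read_head(path, nbytes):
    """Runs on the worker: open the file there and return only the first bytes."""
    # Assumption: on a real cluster this would be an fsspec call on the worker.
    with open(path, "rb") as f:
        return f.read(nbytes)


def fetch_via_worker(client, path, nbytes=5):
    """Submit the read to `client`; only the small result travels back."""
    future = client.submit(read_head, path, nbytes)
    return future.result()


def demo():
    # Write a small file so the example is self-contained.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"hello world")
    try:
        # ThreadPoolExecutor is a local stand-in for a remote dask client.
        with ThreadPoolExecutor(max_workers=1) as pool:
            return fetch_via_worker(pool, tmp.name)
    finally:
        os.unlink(tmp.name)
```

In a browser the client would have to run in async mode, so the blocking `future.result()` would presumably become an `await` on the future instead.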
I believe it is as simple as wrapping the browser's native JS WebSocket in Python, ideally with the same interface as the existing websocket used by the ws:// comm. The pyscript examples already show (async) HTTP fetching using this method in lieu of Python builtins.
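To make the wrapping idea concrete, here is a minimal sketch, not a working comm. `AsyncWebSocketAdapter` is a hypothetical name, and `EchoSocket` and `FakeEvent` are stand-ins for the browser's WebSocket and its message event so the example runs outside a browser. The only assumption about the JS side is the callback-style shape: an `onmessage` handler plus `send` and `close` methods.

```python
import asyncio


class AsyncWebSocketAdapter:
    """Hypothetical adapter: awaitable send/recv over a callback-style socket."""

    def __init__(self, js_socket):
        self._sock = js_socket
        self._incoming = asyncio.Queue()
        # The socket delivers data through an `onmessage` callback;
        # each event's `.data` is pushed onto an asyncio queue.
        js_socket.onmessage = lambda event: self._incoming.put_nowait(event.data)

    async def send(self, payload):
        self._sock.send(payload)

    async def recv(self):
        return await self._incoming.get()

    def close(self):
        self._sock.close()


class FakeEvent:
    """Stand-in for a JS message event carrying a `.data` attribute."""

    def __init__(self, data):
        self.data = data


class EchoSocket:
    """Stand-in for a browser WebSocket: echoes every sent payload back."""

    onmessage = None
    closed = False

    def send(self, payload):
        self.onmessage(FakeEvent(payload))

    def close(self):
        self.closed = True


async def demo():
    # Build the adapter inside the running loop so the queue belongs to it.
    ws = AsyncWebSocketAdapter(EchoSocket())
    await ws.send(b"ping")
    reply = await ws.recv()
    ws.close()
    return reply
```

In a real Pyodide build the callback would probably need to be wrapped in a JS proxy, and binary frames converted from JS buffers to `bytes`. Both are left out here.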
I should have answered this bit too:
From Coiled's and dask's POV in general, it removes the need for the python kernel as a separate piece of the infrastructure. So you can run those nice panel rendering apps or other front-end stuff with just a remote dask cluster: no jupyter/lab/hub, etc. Of course, those technologies come with nice benefits too (persistent file storage!), but this model is much simpler. Anyone with credentials can kick off compute without any local python installation at all, and without an always-on hub; also, the browser is really good at storing local state and auth information.
Also related: dask/dask#7764
This is a small step towards #7764 and dask/distributed#6257. It's basically just defensively importing `threading` and `multiprocessing` and defaulting to the synchronous scheduler if those fail. So this is currently mostly useful for demos and training around the dask collections API. But it *does* work. This is distinct from actually getting a `distributed.Client` working and talking to a remote cluster, which will require some actual networking work.
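A rough sketch of that fallback, not the actual patch: `pick_default_scheduler` and `broken_import` are hypothetical names, and the injectable `import_module` argument exists only so the failure path can be exercised on a normal CPython.

```python
import importlib


def pick_default_scheduler(import_module=importlib.import_module):
    """Return "threads" if threading and multiprocessing import, else "synchronous"."""
    for name in ("threading", "multiprocessing"):
        try:
            import_module(name)
        except ImportError:
            # Assumption: a wasm build may lack these modules; fall back.
            return "synchronous"
    return "threads"


def broken_import(name):
    """Simulates a runtime where the modules cannot be imported."""
    raise ImportError(name)


# On a runtime without the modules, the synchronous scheduler is chosen.
fallback = pick_default_scheduler(broken_import)
```

The resulting name could then be handed to dask, for example via `dask.config.set(scheduler=...)` or the `scheduler=` argument of `compute`.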