Invoke pyodide from python / sandboxing #869
Comments
(I looked at https://cdn.jsdelivr.net/pyodide/v0.15.0/full/pyodide.asm.wasm to see how easy it is to wrap, but based on its list of imports it seems to require a significant amount of support from the embedder, i.e. the JS code in https://cdn.jsdelivr.net/pyodide/v0.15.0/full/pyodide.asm.js. Are there alternatives to that?) |
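For anyone who wants to reproduce the "list of imports" check without a JavaScript runtime, here is a minimal sketch. It assumes only the standard WebAssembly binary layout (8-byte header, sections with LEB128-encoded sizes, import section id 2). The names `list_imports`, `read_leb_u`, `read_name` and `skip_limits` are hypothetical helpers written for this illustration; they are not part of Pyodide or wasmtime-py.

```python
# Sketch: list the imports a .wasm file expects from its embedder.
# Pure standard library; it does not validate the module.

KINDS = {0: "func", 1: "table", 2: "memory", 3: "global", 4: "tag"}


def read_leb_u(data, pos):
    """Decode an unsigned LEB128 integer; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def read_name(data, pos):
    """Decode a length-prefixed UTF-8 name; return (name, next_pos)."""
    length, pos = read_leb_u(data, pos)
    return data[pos:pos + length].decode("utf-8"), pos + length


def skip_limits(data, pos):
    """Skip a limits record: flags byte, minimum, optional maximum."""
    flags = data[pos]
    _, pos = read_leb_u(data, pos + 1)
    if flags & 1:  # bit 0 set means a maximum follows
        _, pos = read_leb_u(data, pos)
    return pos


def list_imports(data):
    """Return [(module, field, kind), ...] from the import section."""
    if data[:4] != b"\x00asm":
        raise ValueError("not a WebAssembly binary")
    pos = 8  # skip magic and version
    imports = []
    while pos < len(data):
        section_id = data[pos]
        size, pos = read_leb_u(data, pos + 1)
        end = pos + size
        if section_id != 2:  # not the import section, skip it
            pos = end
            continue
        count, pos = read_leb_u(data, pos)
        for _ in range(count):
            module, pos = read_name(data, pos)
            field, pos = read_name(data, pos)
            kind = data[pos]
            pos += 1
            if kind == 0:    # function: type index
                _, pos = read_leb_u(data, pos)
            elif kind == 1:  # table: element type byte, then limits
                pos = skip_limits(data, pos + 1)
            elif kind == 2:  # memory: limits
                pos = skip_limits(data, pos)
            elif kind == 3:  # global: value type byte, mutability byte
                pos += 2
            elif kind == 4:  # tag: attribute byte, type index
                _, pos = read_leb_u(data, pos + 1)
            else:
                raise ValueError(f"unknown import kind {kind}")
            imports.append((module, field, KINDS[kind]))
        break  # the import section occurs at most once
    return imports
```

Calling `list_imports(open("pyodide.asm.wasm", "rb").read())` on a downloaded copy should give a rough idea of how many functions the embedder has to provide.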
See also #558 |
@nomeata Wouldn't it be easier just to use a throwaway docker instance? |
Dunno, that’s a lot of complexity, compared to an in-process sandbox. Docker wasn’t created for security isolation; wasm was. But maybe RustPython might be more suitable for this; another project uses it for that: https://github.com/robot-rumble/logic/ |
Since C-based python modules are based on dynamic linking, it would be fairly hard to use pyodide without a javascript runtime. If you are willing to use, say, node, #972 should get us a long way. It makes it fairly easy to not expose all javascript functions to the wasm sandbox. You would still have to audit the rest of the code to ensure that one cannot somehow get their hands on, say, eval by cleverly exploiting existing imported functions (which again are necessary for python to run properly). |
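The auditing step described above could start from an allow-list check over the module's imports. This is only a sketch: `find_unexpected_imports` is a hypothetical helper, and the import names in the comment are illustrative assumptions, not the actual import list of any Pyodide build.

```python
def find_unexpected_imports(imports, allowed):
    """Return the (module, field) pairs that are not on the allow-list.

    imports: iterable of (module, field) pairs the wasm module asks for
    allowed: set of (module, field) pairs the embedder agrees to expose
    """
    return [pair for pair in imports if pair not in allowed]


# Illustrative data only (assumed names, not taken from Pyodide):
# requested = [("wasi_snapshot_preview1", "fd_write"),
#              ("env", "emscripten_run_script")]
# allowed = {("wasi_snapshot_preview1", "fd_write")}
# find_unexpected_imports(requested, allowed) flags the second entry.
```

An empty result only shows that nothing outside the allow-list is requested directly. It does not cover the indirect route mentioned above, where an allowed function is abused to reach something like eval.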
@dalcde I've recently wanted to run pyodide outside the browser (server side), and was looking into running it in wasmer. I want to be able to treat browser pyodide as "local dev", and then deploy the same python to be run server-side (which ideally would be sandboxed). Wasmer does have a sample repo compiling for an older version of pyodide (0.12), but it no longer runs/compiles, since pyodide references an older version of emsdk. I assume #558 from a year ago is still true. So following the thread here, it seems like you need javascript to run pyodide. How would it be run with node.js? I didn't understand what was written in #972. |
On Mon, Jan 11, 2021 at 01:16:00AM -0800, Wil Chung wrote:
> @dalcde I've recently wanted to run pyodide outside the browser (server side), and was looking into running in wasmer. I want to be able to treat browser pyodide as "local dev", and then deploy the same python to be run server-side (which ideally would be sandboxed)
What exactly do you want out of pyodide? I see pyodide as comprising several pieces:
1. Patches to emscripten to make compilation work. We hope to upstream all the patches and use stock emscripten soon.
2. Patches to cpython to make cpython compile
3. Patches to various packages to make them work with wasm (mostly for C-based packages)
4. A build system to build packages (mostly for C-based packages again)
5. A bunch of proxies for javascript/python interop.
If you don't need 5, and only need a select subset of packages, it might be easier to cherry-pick the bits you need and do a custom build.
> So following the thread here, it seems like you need javascript to run pyodide. How would it be run with node.js? I didn't understand what was written in #972.
This was a typo. I wanted to refer to #792.
|
This is a nice explanation. Is it in the docs somewhere? Maybe it'd be good to expand on this a bit and put it somewhere early on in the docs, maybe with a title like "What is Pyodide?". It would probably be helpful to other people. |
I'd like to be able to run pyodide browser-side to write a program. And then when I'm done, I can deploy the python program to a server, ideally using the same pyodide runtime server-side. Based on what I understand (just started looking into WASM), you should be able to compile python into WASM to run in the browser and on the server. The only reasons I'm looking into WASM on the server are:
I could set up a docker image running the same stack as pyodide, but I was hoping that I could ideally just compile pyodide for the server and have it just work, so I don't have separate moving pieces to keep in sync. Originally, I was thinking I could just use wasmer to run pyodide. But their example repo is outdated, doesn't work anymore, and the team is unresponsive. Hence, I'm trying to find a way to run pyodide on a server. I'm open to leaving the js in, if it's required to run pyodide on the server. It's just that I don't know if it's possible, and what might be involved in making it run on node.js. If it's a no go, then I can either try to compile RustPython to WASM and run the same wasm file in the browser and in wasmer (or some other runtime), and deal with an incomplete implementation, or just abandon python as a language for this altogether. Hopefully that gives you an idea of what I'm trying to do. Any advice on how to achieve the above is welcome. |
What I'm looking for (and I think many others are interested in) is the capability to run python in wasmer (or equivalent) as a sandboxed environment. It would have the advantage of fast startup, bulletproof sandboxing and no extra (virtual) infrastructure. Having common packages work too would be an extra plus. Is there any appetite to split the cpython patches out into a separate package to make it easier to use? |
@rth said there were a couple of blocking issues for this in #558:
Yesterday @joemarshall opened PR #1102 to resolve #531, so there is some movement here. I looked at the emscripten thread linked and it has not been marked resolved. (I don't know much about these things so someone with more experience should maybe give a better status report.) |
There was a demo in #183 (comment) so it's definitely possible, and would be great (#160). We just need to integrate #792 first, then see how we could make it work without maintaining a fork of
It could be worth pursuing. Analyzing what was done in https://github.com/wapm-packages/pyodide is probably a start. I haven't really followed the Wasmer side of things; if there is anything we could do to make it easier, please let us know. I'm not sure that splitting cpython patches into a separate repo is really necessary at this point (they didn't do it in wapm-packages/pyodide, and it would increase our maintenance burden) -- if there is a working prototype with motivation for why it's necessary, we could certainly re-discuss that. It could be an interesting project, but if usage in a browser is any indication, be prepared to encounter some occasional errors, and get 0 results in a search engine when you search for them :) So as for sandboxing CPython in a way that is easier or more reliable than Docker, at the moment I'm a bit skeptical, maybe in some number of years... |
Pyodide is still a prototype; do not use it right now if you need ease or reliability. We are working hard to improve it though =) |
I'm really interested in this. My use-case is sandboxing: I want to be able to run user-provided Python code safely on my server, with robust memory and CPU limits and with zero chance that malicious code could "break out" and access my filesystem or network or perform other malicious actions. Here's one example of something I'd like to build: My software lets users upload CSV files to create database tables. I want to provide advanced tools for "transforming" those tables in some way - convert a column to lowercase, extract a zip code from an addresses column, that kind of thing. One option I'd like to offer is to enter some Python code to be run against every value in a column - a web application equivalent of this CLI tool I built: https://simonwillison.net/2021/Aug/6/sqlite-utils-convert/ Running their Python snippet in WASM via Pyodide feels like a lightweight, safe way that I could build this. |
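To make the use case concrete, here is a rough sketch of the transform itself: a user-supplied expression applied to every value in a column. `make_converter` and `apply_to_column` are hypothetical names for illustration, and the snippet is assumed to be a single Python expression over a variable called `value`. Note that this code performs no sandboxing at all; the `eval` call is exactly the part that would need to run inside something like a WASM sandbox.

```python
def make_converter(expression):
    """Compile a user-supplied expression over `value` into a function.

    WARNING: eval() on untrusted input is unsafe. This only shows the
    logic that a sandbox would have to host.
    """
    code = compile(expression, "<user-snippet>", "eval")

    def convert(value):
        return eval(code, {}, {"value": value})

    return convert


def apply_to_column(rows, column, convert):
    """Return new rows with `convert` applied to one column of each row."""
    return [{**row, column: convert(row[column])} for row in rows]


# Example: lowercase a column, then pull the last token out of an address.
rows = [{"city": "PARIS", "address": "1 Main St 94107"}]
rows = apply_to_column(rows, "city", make_converter("value.lower()"))
rows = apply_to_column(rows, "address", make_converter("value.split()[-1]"))
```

Memory and CPU limits, which the comment above also asks for, are not addressed by this sketch.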
@simonw you may be interested in https://github.com/gristlabs/grist-core. It's an open source web application that lets you build spreadsheet/database hybrid documents, backed by SQLite, with formula columns that run Python in a secure sandbox. |
@simonw I agree that sandboxing is an important use case, that we could maybe explore more. As far as I know, the options are currently,
Personally I think, 3 or 2 might be the most promising. |
Given the great sandboxing properties of wasm, I wonder if pyodide could be used as a safe sandboxed execution environment, e.g. for educational websites that want to execute untrustworthy code.
What would it take to be able to do something like
https://pyodide.readthedocs.io/en/latest/using_pyodide_from_javascript.html
but directly from Python (maybe using https://github.com/bytecodealliance/wasmtime-py to execute the wasm)?