SSCSM execution #13046

TurkeyMcMac · 2022-12-15T15:30:55Z

SSCSM code is executed in a separate process. Process sandboxing is done on Linux with seccomp and on Mac with Seatbelt. Windows uses the Windows Integrity Mechanism, but I don't know if it's working. No sandboxing is done for other platforms yet. The SSCSM process is designed to be restartable in the event of an error.

SSCSM TODO list (not all in this PR)

FossFanatic · 2022-12-15T18:46:20Z

Is this a draft for the SSCSM?

I have heard many good things about SSCSM. What are some examples of the things that become possible or easier with it?

TurkeyMcMac · 2022-12-15T18:49:46Z

This is just a proof of concept for SSCSM (yes, the SSCSM) execution. There's a lot more work to do. This proof of concept may even be discarded.

Once implemented, SSCSMs will let mods do client-side entity movement prediction, for example.

sfan5 · 2022-12-15T21:05:03Z

👍 Good work

Worth noting that there's no reason to only sandbox SSCSM and not the "normal" CSM too.

SSCSM code can block the main program indefinitely. This can probably be fixed with a Lua instruction count hook.

Can't you fix this by adding timeouts to IPC calls? That'd be much cleaner.

How is the "sscsm asks a question back to the client" direction handled, are these processed synchronously while waiting for the current callback to end or does that run in a separate thread? The latter is bound to cause problems and doesn't fit our model.

Lastly, for IPC I think it'd save us a lot of work if we didn't reinvent the wheel but use an existing library that hopefully can make exchanging classes and entire structures seamless.

TurkeyMcMac · 2022-12-15T21:38:51Z

A timeout could probably be set with setitimer to interrupt the read. Perhaps on Windows SetCommTimeouts could be used?

Here is how callbacks are called:

The controlling process (C) sends an initiating message to the script process (S), e.g. C2S_RUN_STEP.
Receiving this message, S runs its callbacks:
When a callback in S calls a function such as get_node, S sends S2C_GET_NODE to C with a position as data and waits.
Receiving this message, C gets the node and sends C2S_GET_NODE to S with the node as data.
S resumes and get_node returns.
When all callbacks have finished executing on S, S sends S2C_DONE to C and stops.
Receiving this message, C resumes executing other parts of the program.

I haven't used IPC libraries before, so I'm up for suggestions. Ideally there shouldn't be a limit on message length.

TurkeyMcMac · 2022-12-15T21:58:32Z

I think the implementation of regular CSMs should be kept separate, as they have a different API. For example, it doesn't make sense to give SSCSMs access to the same mod storage. Best to avoid rewriting the regular CSM API, IMO.

sfan5 · 2022-12-15T22:52:31Z

A timeout could probably be set with setitimer to interrupt the read. Perhaps on Windows SetCommTimeouts could be used?

select should work with a pipe just fine (at least on Unix), so no need for global timers.

Here is how callbacks are called: [...]

Okay that sounds good.

I haven't used IPC libraries before, so I'm up for suggestions.

I don't have any. Worth looking into.

I think the implementation of regular CSMs should be kept separate, as they have a different API. [...] Best to avoid rewriting the regular CSM API, IMO.

CSM isn't the goal, it's a byproduct of SSCSM. CSM has to serve the needs of games/mods to offload suitable tasks to the client (including some which only make sense on the client), and then the parts of that that can be safely exposed to client mods will be.
If we keep the regular CSM API we have to maintain the same stuff twice, nobody wants to do that.

TurkeyMcMac · 2022-12-15T23:31:56Z

Right now pipe inputs are wrapped in a FILE, but I suppose that can be changed. I will wait to see what IPC solution is chosen.

TurkeyMcMac · 2022-12-16T00:48:56Z

I guess I can use fileno.

TurkeyMcMac · 2022-12-16T15:09:12Z

How about having two processes, one for SSCSM and one for client-provided CSM. That would allow for sharing most of the code.

Desour · 2022-12-16T22:08:00Z

CPCSM is only as risky as server side modding. It's not worth putting effort in sandboxing that too.

The only relevant performance issues that we get is the latency of the IPC calls, right?

Some suggestions here:

Make the API asynchronous where it doesn't make it more difficult to use. I.e. functions that return nothing (like set_node) should not wait until they are finished.
(Not important now: As an optimization, it might be worth doing some caching (e.g. of get_node).)
Don't always use pipes. Most of the times the data will be of constant or very limited size, so shared memory works fine. And if a message really contains really much data, you can send just that data over the pipe.
Busy waiting can sometimes make sense for our use-case, I think. I.e. for waiting on a function that does very little, like get_node.

Btw. I've once tried different IPC mechanisms to get IPC calls done with as low latency as possible. The best I was able to come up with was using futexes and doing busy waiting (I've just adjusted the example from the futex man page), resulting in an average latency of about ~~500 ns~~ 200 ns per call+wait. In comparison, with luajit a lua to C call takes ca. 30 ns, c to lua 80 ns, and a c to c call (without PLT) (plus other things like looping) 4 ns.
Obviously take these numbers with grains of salt. My point here is that IPC will probably always be a magnitude slower than lua calls.

You might want to put the sscsm process code into a separate, smaller executable, so there's fewer (dead) code, globals and other stuff that an attacker could use.

Do we want an option to opt-out of the process sandbox, i.e. for more performance in singleplayer?

((Another use-case: Maybe we will move some code from C++ to SSCSM client builtin (SSCSMCB) for dehardcoding (i.e. player physics, or many things from game.cpp). If then SSCSM is turned off, SSCSMCB would still need to run, but it doesn't need sandboxing.))

TurkeyMcMac · 2022-12-16T22:42:16Z

I've thought about using shared memory for communication. However, the SSCSM process is not trusted. One advantage of a pipe is that the process on one end doesn't have to trust the process on the other. With shared memory, this would have to verified by looking at the communication implementation. As far as I know, this concern is not addressed by most synchronization libraries, including pthreads, so Minetest might have to provide its own implementation for each platform based on the raw system calls. This would be easy to get wrong.

TurkeyMcMac · 2022-12-16T22:44:09Z

By the way, when you were benchmarking IPC, did you benchmark pipes?

Desour · 2022-12-17T00:21:13Z

By the way, when you were benchmarking IPC, did you benchmark pipes?

No. (I've read somewhere that they had hight latency (well, obviously, if you block on every wait), and therefore didn't try.)

TurkeyMcMac · 2022-12-17T01:59:28Z

Here's a benchmark that tests both pipes and shared memory: https://stackoverflow.com/questions/1235958/ipc-performance-named-pipe-vs-socket. I don't know the exact setup, but shared memory is indeed much faster.

sfan5 · 2022-12-17T15:09:56Z

How about having two processes, one for SSCSM and one for client-provided CSM. That would allow for sharing most of the code.

Sure, whether one or two processes does not matter much. The important point is that SSCSM and CPCSM share the parts of the code they can.

Make the API asynchronous where it doesn't make it more difficult to use. I.e. functions that return nothing (like set_node) should not wait until they are finished.

Too complex, you immediately run into problems because you still need to serialize the calls that have interdependent side-effects (what if someone calls get_node after set_node?)

You might want to put the sscsm process code into a separate, smaller executable, so there's fewer (dead) code, globals and other stuff that an attacker could use.

While you could argue that this is defense-in-depth this is pretty much irrelevant if we sandbox the process in any reasonable way.

Do we want an option to opt-out of the process sandbox, i.e. for more performance in singleplayer?

For platforms like Android we need to think of an alternative anyway since you can't launch processes there (e.g. running in a thread, unsandboxed) and we'll probably have a development option for that, but this must never be marketed to users as a performance option.

I've thought about using shared memory for communication. However, the SSCSM process is not trusted. [...] With shared memory, this would have to verified by looking at the communication implementation.

FYI there are mechanisms to do this safely on Linux even with an untrusted client. Wayland uses them, some material: https://lwn.net/Articles/591108/

TurkeyMcMac · 2022-12-17T15:33:53Z

Is it actually impossible to create a new process in Android? It seems like posix_spawn is not available, but is the same true of fork/exec?

rubenwardy · 2022-12-17T16:04:41Z

Android apps can have multiple processes (in Java at least). Not sure how it works

Desour · 2022-12-17T16:35:58Z

Make the API asynchronous where it doesn't make it more difficult to use. I.e. functions that return nothing (like set_node) should not wait until they are finished.

Too complex, you immediately run into problems because you still need to serialize the calls that have interdependent side-effects (what if someone calls get_node after set_node?)

My thought was that the main process always first completes all asynchronous messages before doing a synchronous one.
(There's always a queue of asynchronous calls and at most one synchronous call. And during a synchronous call, the sscsm process waits (and hence makes no new calls).)
So, if there's a set_node and afterwards a get_node, it needs to wait once (at get_node) on both calls.

set_node(pos, node1) -- lua calls to c++, c++ enqueues the `set_node` msg (and signals main thread) and directly returns 
set_node(pos, node2) -- same as above
get_node(pos) -- lua calls to c++, c++ writes the `get_node` into shared mem, signals main thread, and waits for all pending calls to finish, then it returns

I guess I should've written "implement the API asynchronously", as the API still looks and works as if it was synchronous.

FYI there are mechanisms to do this safely on Linux even with an untrusted client. Wayland uses them, some material: https://lwn.net/Articles/591108/

It looks to me like sealing is useful if one wants to transfer large amounts of data with zero-copy. But in our case, copying the data is much cheaper than doing syscalls to create a file, seal it, send it, and close it.

sfan5 · 2022-12-17T16:36:42Z

Is it actually impossible to create a new process in Android? It seems like posix_spawn is not available, but is the same true of fork/exec?

postix_spawn can be implemented through fork & exec but there's no Minetest binary you could execute plus you can't extract and run anything either.
Just forking might work.

But in our case, copying the data is much cheaper than doing syscalls to create a file, seal it, send it, and close it.

You'd only do that once and then keep the shared buffer.

I looked for suitable libraries a bit.

IPC:

Zeromq
- supports platform native IPC (not sure if shared memory)
- supports in-process if we need it
- can only use TCP on windows: https://stackoverflow.com/questions/15386121/does-zeromq-support-ipc-as-a-transport-channel-on-windows
Boost.Interprocess
- seems to be comprehensive
- 👎 it's boost
own solution with pipes
- easy and portable
- maybe not performant

That's just for pushing bytes to another process. For serialization I had hoped there was a library that can use template magic to make serializing structs/classes painless but this doesn't seem to exist (I didn't look very hard).

Since this is quite complex affair IMO we should be thinking about the kinds of objects we want to transfer/access from the other process and how that could work:

map (single access & vmanips)
objects
formspec state
- this is going to be a problem for Formspec/HUD Replacement #12926
input state

Most of it will have to be transferred on demand anew each step, maybe we can think of some caching mechanisms.

TurkeyMcMac · 2022-12-17T17:05:22Z

Oh yeah, I hadn't thought about the lack of a native executable on Android. I suppose C++ could call to Java which could launch the process. That can be tackled later.

For objects, I imagine a solution similar to file descriptors: the SSCSM process keeps track of objects with numeric handles and the main process keeps a mapping from these handles to the objects. VoxelManips, on the other hand, could probably be kept entirely within the SSCSM process.

TurkeyMcMac · 2022-12-17T17:41:05Z

I think that file sealing thing is not necessary for us. The SSCSM process can just map the file into memory then close the file descriptor before relinquishing its privileges.

TurkeyMcMac · 2022-12-17T20:05:15Z

Apparently ZeroMQ uses Unix domain sockets for IPC: http://api.zeromq.org/4-3:zmq-ipc. Therefore, it probably won't be faster than regular pipes.

It might be best to use a custom ring buffer implementation. I think the pthreads API will suffice. There may be some risk of a malicious program breaking stuff, but probably not a lot.

Desour · 2022-12-17T21:07:56Z

But in our case, copying the data is much cheaper than doing syscalls to create a file, seal it, send it, and close it.

You'd only do that once and then keep the shared buffer.

~~Isn't it then the same as just using mmap?~~ Edit: I'm stupid, I forgot that we're execing, so mmaps won't be kept.

Implementing the IPC stuff ourselves isn't actually too hard. Libraries are opaque, too generic, include stuff that we don't need, may contain bugs, and are possibly incompatible with our threat model (sandboxed arbitrary code execution).
I'd therefore propose to use pipes on most platforms, and use optimized backends on common platforms, like linux.

For serialization, we have similar security constraints as with our network packets. I'd therefor be careful with serialization libs.

I think that file sealing thing is not necessary for us. The SSCSM process can just map the file into memory then close the file descriptor before relinquishing its privileges.

~~You don't need a file.~~

Btw. I've prettified my futex code stuff enough that you can take a look: https://gist.github.com/Desour/5351f1fe74a88f539ab909b6a7a254ed

To decide what our IPC thing needs, we also need to look at what messages we have:

Calls from main to sscsm are always callbacks. It probably makes sense to coalesce multiple callbacks into one call (same holds for server modding actually), so we have some queue where we insert callback data (can be large), and we then need to signal to the sscsm process that it can continue. Callbacks don't necessarily need return values.
Calls from sscsm to main can either be synchronous or asynchronous (as suggested above). Both can have very small or large argument and return data.

(Apart from asynchronous calls) I think we can satisfy our needs with one synchronous call channel and two queues for large data:

At sscsm process spawn, it is in its first callback. The sscsm callback can then do a synchronous call end_callback to return control flow to main. The next callback is then started by "returning" from this synchronous call. This way we don't need separate channels and stuff for callback, as it's all just synchronous calls.
The synchronous call channel can be something like in my gist above (or pipes as fallback). If we want to send large argument data, we can include in the small message that there's something in the pipe to main. For returned data, we can do the same, but with the other pipe.
(If we have really large data that we don't want to copy, we can later do an optimization to send it with zero-copy via sealing.)

Asynchronous calls probably need a ringbuffer. It can also have its own pipe for its large data. If the ringbuffer is filled, we can wait for it to become empty by doing a synchronous call.

About transfer/serialization of specific things:
Found nothing that would influence IPC interface. (Things that can be referred to need ids, that's all.)

map (single-access):
Simply transfer pos and node. (Maybe cache nearby nodes.)
(Actually, a cpu-cache like cache might be suitable here.)
map (meta):
If the whole meta is of small size, transfer the whole thing. If it's large, fetch on :get(). Cache on write and get. Maybe add API to fetch multiple keys.
For very common strings we could at some point try our if a string pool is beneficial.
map (vmanip):
Send the node data (via bulk data channel).
objects:
Objects are a server-side thing. SSCSM can't freely mess with it without causing inconsistencies of the CAO and SAO, causing headache. IMO SSCSM should have thing like 3d scene nodes, collision box objs, selection box objs, and similar. CAO implementation should then be moved to SSCSM builtin, which can use these new concepts like 3d scene nodes. (The reason we need objs server-side is mainly to have interpolation, and keep collision box, visual stuff and other things attached to each other.) => We don't need to transfer objects, as they don't exist in the main process.
things like collision boxes, sounds and other things in the 3d scene:
These things are persistent, so we should use ids. When a id times out, that should be communicated with a callback.
input state:
Cache on get.
Or keep state on sscsm process. We want key events anyway.

All (or most) caches need to be cleared at callback exit.

TurkeyMcMac · 2022-12-18T01:57:22Z

I think it would be possible to avoid the fallback pipes in synchronous communication like so: When one end needs to send large amounts of data, it sets some flag in the message (indicating that there is more data to send) and waits for a synchronous acknowledgment from the other end. At this point it can send another message with more data. This process repeats until all the data is sent.

IMO the initial implementation should use only synchronous communication. The performance should be OK as long as the message passing is fast. Afterward, asynchronous communication can be added. It is possible that the speedup will not be great if synchronous and asynchronous calls are interspersed.

Desour · 2022-12-18T23:10:46Z

I think it would be possible to avoid the fallback pipes in synchronous communication like so: [...]

Nice idea.

Btw. I've tried doing IPC channels with memfd (for execve): https://github.com/Desour/mt_futex_ipc/blob/master/mt_futex_ipc.cpp (Feel free to copy stuff.)
It's slower than the thing in my gist, I guess due to more decoding and copying.

Just doing synchronous calls (and doing potential asynchronousable calls synchronously) is probably fine for now.

sfan5 · 2022-12-20T09:55:39Z

I'd therefore propose to use pipes on most platforms, and use optimized backends on common platforms, like linux.

(Windows is also a common platform)

Calls from main to sscsm are always callbacks. It probably makes sense to coalesce multiple callbacks into one call (same holds for server modding actually), so we have some queue where we insert callback data (can be large), and we then need to signal to the sscsm process that it can continue. Callbacks don't necessarily need return values.

This sounds flawed to me again. Callbacks are strictly ordered because whatever things can happen inbetween them.
Almost all callbacks will want to return something too (except on_step) so not much to coalesce either.

I think we should start simple (callback-reply, like how the PR is now) and see if adding different queues or asynchronicity can be beneficial later.

Log timestamps aren't working.

TurkeyMcMac · 2023-01-28T21:48:17Z

Good news: Android is basically working. That means the first step is done, though the code has not been reviewed. I put some bug fixes in later steps.

Hopefully I got everything correct.

This matches how posix_spawn is used.

TurkeyMcMac · 2023-01-30T00:26:15Z

At this point the code so far is ready for review/testing (if people want to review this huge PR in smaller parts.) Here are some tips:

CSMs interface with the rest of the code through script/csm/csm_controller.cpp.
The CSM main function is in script/csm/csm_script_process.cpp.
Communication is implemented in threading/ipc_channel.cpp.
Serialization for communication is done with util/struct_serialize.h.
Process sandboxing is implemented in process_sandbox.cpp.
Before trying to load client-side mods, create ~/.minetest/client/csmbuiltin as a symlink to builtin.
To load client-side mods, put them in ~/.minetest/client/csm/ with their names prefixed with five-digit numbers like so: ~/.minetest/client/csm/00000mod1/. The number specifies a place in the load order.

Desour · 2023-04-16T18:37:50Z

I've made a small diagram on how we could split this PR up, and implement the other missing things needed for SSCSM:
(A box can be a PR or commit. Arrows depict dependencies. Note: It may be inaccurate.)

When I find time, I'll adopt the ipc and/or process launching stuff.

MisterE123 · 2023-04-18T15:57:40Z

could someone pick this up, please?

TurkeyMcMac added WIP The PR is still being worked on by its author and not ready yet. @ Client Script API Discussion Issues meant for discussion of one or more proposals labels Dec 15, 2022

TurkeyMcMac force-pushed the sscsm-execution branch from 4c7d017 to cb78d6a Compare December 20, 2022 00:31

TurkeyMcMac added Feature ✨ PRs that add or enhance a feature and removed Help needed labels Jan 28, 2023

TurkeyMcMac added 2 commits January 28, 2023 16:19

Fix CSM metadata messages

fdb4d03

Get SSCSM mostly working on Android

21c2496

Log timestamps aren't working.

TurkeyMcMac force-pushed the sscsm-execution branch from 863902b to 21c2496 Compare January 28, 2023 21:19

TurkeyMcMac added 6 commits January 28, 2023 18:45

Install linux headers in dockerfile

6ae5852

Update README instructions

3ec9f5f

Hopefully I got everything correct.

Allow setting of debug.getinfo

906e515

Fix __linux__ being written as __linux

106e4fb

Expand android self exec receive buffer

cbae673

Try to fix GCC 5 build

6f1fc1a

TurkeyMcMac force-pushed the sscsm-execution branch from 7518097 to 6f1fc1a Compare January 29, 2023 00:59

TurkeyMcMac added 10 commits January 29, 2023 07:24

Add self exec mutex

10b8070

Allow more time functions in sandbox

90c20c8

Set timezone in CSM process

494641e

Clear android self exec environment

740cb40

This matches how posix_spawn is used.

Make extra time syscall checks conditional

e8f0c2e

Add include

02f194a

Add IPC test

ff496fe

Add struct serialize tests

dc2df0b

Limit CSM run time

28926e8

Simplify code

895b954

SmallJoker added the Adoption needed The pull request needs someone to adopt it. Adoption welcomed! label Feb 5, 2023

Zughy mentioned this pull request Apr 17, 2023

Roadmap brainstorm (5.8+) #12746

Closed

Zughy closed this May 14, 2023

Desour self-assigned this Jan 21, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

SSCSM execution #13046

SSCSM execution #13046

TurkeyMcMac commented Dec 15, 2022 •

edited

FossFanatic commented Dec 15, 2022

TurkeyMcMac commented Dec 15, 2022

sfan5 commented Dec 15, 2022 •

edited

TurkeyMcMac commented Dec 15, 2022

TurkeyMcMac commented Dec 15, 2022

sfan5 commented Dec 15, 2022

TurkeyMcMac commented Dec 15, 2022

TurkeyMcMac commented Dec 16, 2022

TurkeyMcMac commented Dec 16, 2022

Desour commented Dec 16, 2022 •

edited

TurkeyMcMac commented Dec 16, 2022

TurkeyMcMac commented Dec 16, 2022

Desour commented Dec 17, 2022

TurkeyMcMac commented Dec 17, 2022

sfan5 commented Dec 17, 2022

TurkeyMcMac commented Dec 17, 2022

rubenwardy commented Dec 17, 2022

Desour commented Dec 17, 2022

sfan5 commented Dec 17, 2022 •

edited

TurkeyMcMac commented Dec 17, 2022

TurkeyMcMac commented Dec 17, 2022

TurkeyMcMac commented Dec 17, 2022

Desour commented Dec 17, 2022 •

edited

TurkeyMcMac commented Dec 18, 2022

Desour commented Dec 18, 2022 •

edited

sfan5 commented Dec 20, 2022

TurkeyMcMac commented Jan 28, 2023

TurkeyMcMac commented Jan 30, 2023 •

edited

Desour commented Apr 16, 2023

MisterE123 commented Apr 18, 2023

SSCSM execution #13046

SSCSM execution #13046

Conversation

TurkeyMcMac commented Dec 15, 2022 • edited

SSCSM TODO list (not all in this PR)

FossFanatic commented Dec 15, 2022

TurkeyMcMac commented Dec 15, 2022

sfan5 commented Dec 15, 2022 • edited

TurkeyMcMac commented Dec 15, 2022

TurkeyMcMac commented Dec 15, 2022

sfan5 commented Dec 15, 2022

TurkeyMcMac commented Dec 15, 2022

TurkeyMcMac commented Dec 16, 2022

TurkeyMcMac commented Dec 16, 2022

Desour commented Dec 16, 2022 • edited

TurkeyMcMac commented Dec 16, 2022

TurkeyMcMac commented Dec 16, 2022

Desour commented Dec 17, 2022

TurkeyMcMac commented Dec 17, 2022

sfan5 commented Dec 17, 2022

TurkeyMcMac commented Dec 17, 2022

rubenwardy commented Dec 17, 2022

Desour commented Dec 17, 2022

sfan5 commented Dec 17, 2022 • edited

TurkeyMcMac commented Dec 17, 2022

TurkeyMcMac commented Dec 17, 2022

TurkeyMcMac commented Dec 17, 2022

Desour commented Dec 17, 2022 • edited

TurkeyMcMac commented Dec 18, 2022

Desour commented Dec 18, 2022 • edited

sfan5 commented Dec 20, 2022

TurkeyMcMac commented Jan 28, 2023

TurkeyMcMac commented Jan 30, 2023 • edited

Desour commented Apr 16, 2023

MisterE123 commented Apr 18, 2023

TurkeyMcMac commented Dec 15, 2022 •

edited

sfan5 commented Dec 15, 2022 •

edited

Desour commented Dec 16, 2022 •

edited

sfan5 commented Dec 17, 2022 •

edited

Desour commented Dec 17, 2022 •

edited

Desour commented Dec 18, 2022 •

edited

TurkeyMcMac commented Jan 30, 2023 •

edited