If you have two separate Rust programs that are both running under an OS, it seems pretty clear they are two separate instances of the Rust AM (Abstract Machine).
If you have two threads in the same program, my understanding is that there is a single instance of the AM.
However, what about two separate programs with shared memory between them (e.g. using mmap for IPC)? They have to be two separate instances for this not to be UB, according to this reply by Ralf Jung on IRLO. The shared memory will usually not be at the same address in both processes. Since we presumably want shared memory to NOT be UB, they must be separate instances of the AM.
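To make the "not at the same address" point concrete, here is a minimal sketch of why data placed in such a region usually links to other data by offset rather than by raw pointer. It does not map any real shared memory: the `View` type and the `writer_view`, `reader_view` and `follow_link` functions are hypothetical names, and the second process is only simulated by a byte-for-byte copy living at a different base address.

```rust
/// Size of the simulated region in bytes (arbitrary, for illustration only).
const REGION_LEN: usize = 64;

/// Hypothetical stand-in for one process's view of a shared region.
/// A real program would get this memory from the OS (e.g. via mmap);
/// here it is just a heap buffer.
struct View {
    bytes: Vec<u8>,
}

impl View {
    /// Address at which this view happens to start.
    fn base_addr(&self) -> usize {
        self.bytes.as_ptr() as usize
    }

    /// Bounds-checked little-endian write at a byte offset from the base.
    fn write_u32(&mut self, offset: usize, value: u32) -> Option<()> {
        let end = offset.checked_add(4)?;
        self.bytes
            .get_mut(offset..end)?
            .copy_from_slice(&value.to_le_bytes());
        Some(())
    }

    /// Bounds-checked little-endian read at a byte offset from the base.
    fn read_u32(&self, offset: usize) -> Option<u32> {
        let end = offset.checked_add(4)?;
        let mut buf = [0u8; 4];
        buf.copy_from_slice(self.bytes.get(offset..end)?);
        Some(u32::from_le_bytes(buf))
    }
}

/// The "writer" stores a payload at offset 16 and, at offset 0, a link to
/// that payload expressed as an offset, not as an absolute address.
fn writer_view() -> View {
    let mut view = View { bytes: vec![0; REGION_LEN] };
    view.write_u32(16, 0xC0FFEE).expect("offset 16 is in bounds");
    view.write_u32(0, 16).expect("offset 0 is in bounds");
    view
}

/// Simulates the other process: the same bytes, at a different base address.
fn reader_view(writer: &View) -> View {
    View { bytes: writer.bytes.clone() }
}

/// Follows the link stored at offset 0, relative to this view's own base.
fn follow_link(view: &View) -> Option<u32> {
    let target = view.read_u32(0)? as usize;
    view.read_u32(target)
}
```

Calling `follow_link` on both `writer_view()` and a `reader_view` of it resolves to the same payload even though the two views report different `base_addr()` values. A raw pointer stored in the region would only have been meaningful in the view that wrote it.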
So the "border" is somewhere between "threads sharing all of memory" and "memory mapped between processes". But where exactly?
Let's consider some hypothetical scenarios to see where they would land:
1. Two processes with shared memory, but now I start a thread set up to have its stack in the shared memory.
2. The above scenario (1), but I then use a hypothetical IPC-based scoped-threads mechanism to start threads in the other process that borrow data from my stack in the shared memory (in this case both processes would have to map the shared memory at the same virtual address for the pointers to be valid in both programs).
3. A kernel written in Rust, reading/writing to user memory of a program written in Rust. According to the same reply by Ralf Jung linked above, we cannot have one AM, as the mappings aren't the same. The kernel has a superset of the user space mappings available to it.
4. An embedded system where you have an MPU (memory protection unit) but not an MMU. Here you can protect memory regions, but you always have physical addresses. This can be used to implement a basic OS, but all code is built and distributed together as one program image. Are there multiple instances of the AM here, given that the memory protection changes to let you access different parts of memory depending on which thread you are currently executing in? A concrete example is Hubris by Oxide Computer.
5. A multi-core kernel written in Rust (for a normal desktop computer). As it executes syscalls on behalf of user space, it will have user space memory mappings corresponding to whichever user space program is currently executing on that specific CPU core. That means there can concurrently be two system calls executing on different cores, each with their own user space mappings. An example of this is the Linux kernel, but I would expect this to be a universal thing. Is the kernel code on each core a separate instance of the AM then? Since the kernel stack lives in kernel space (which is all shared), case 2 above also applies.
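For reference, the ordinary in-process form of the scoped-threads pattern that the second scenario imagines stretching across a process boundary looks roughly like this. It is a sketch using `std::thread::scope` from the standard library (Rust 1.63 or later), and the function names `sum_halves` and `sum_stack_array` are made up for illustration. Everything here happens inside a single process, so nothing about the cross-process case is demonstrated.

```rust
use std::thread;

/// Sums a slice by letting two scoped threads each borrow one half of it.
/// The borrows are sound because `thread::scope` joins every spawned
/// thread before it returns, so `data` outlives both threads.
fn sum_halves(data: &[u32]) -> u32 {
    let (left, right) = data.split_at(data.len() / 2);
    thread::scope(|s| {
        let l = s.spawn(|| left.iter().sum::<u32>());
        let r = s.spawn(|| right.iter().sum::<u32>());
        l.join().unwrap() + r.join().unwrap()
    })
}

/// The borrowed data is a local array, i.e. it lives on the calling
/// thread's stack, which is the situation the scenario is about.
fn sum_stack_array() -> u32 {
    let local = [1u32, 2, 3, 4, 5, 6];
    sum_halves(&local)
}
```

Here the spawned threads read the caller's stack through plain references. In the hypothetical IPC variant those references would have to stay valid in another process, which is why the stack would need to sit in shared memory mapped at the same virtual address on both sides.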
There are many other edge cases that you could come up with. I would ideally like a clear definition of what exactly constitutes an instance of the Rust Abstract Machine vs several, as this heavily impacts what sort of virtual memory tricks are valid in Rust, in particular around mmap-accelerated ring buffers, embedded systems, and kernels. Or is this still an open question, and if so what parts are decided?
I looked through the Rust reference but wasn't able to find any definition.