Composite memory maps currently impossible #103

Closed
mthom opened this issue Dec 5, 2023 · 7 comments

mthom commented Dec 5, 2023

I want to create a thread-local memory map (it could be file-based or anonymous; I'd prefer anonymous) in each of N threads, where the first M bytes of each thread-local mapping are mapped to the same file (which is itself mapped into memory). I know this allocation scheme is possible with plain mmap on Linux, but I can't find a way to do it in memmap2-rs. That is, there is no way to map the buffer of a MmapMut into the buffer of another Mmap* struct.

Am I wrong to believe this, or is it truly impossible under the current API?

@adamreichold

Can you please illustrate the C, or at least the plain mmap, call sequence you would use?

From your description, I wonder why you don't just pass the byte slice itself to the other threads? And what kind of synchronization exists between them, as multiple threads writing into the same mapping without synchronization is likely to imply a data race.

@adamreichold

(It is also not clear to me what "thread-local memory map" means here, as a memory mapping is by design process-wide: all threads share a single address space in which all memory maps are contained.)


mthom commented Dec 5, 2023

Here's some code from the slice_deque crate that calls POSIX mmap directly:

    // Map the whole file once, at an address chosen by the kernel.
    let ptr = mmap(
        ptr::null_mut(),
        size,
        PROT_READ | PROT_WRITE,
        MAP_SHARED,
        fd,
        0,
    );
    if ptr == MAP_FAILED {
        print_error("@first: mmap failed");
        if close(fd) == -1 {
            print_error("@first: close failed");
        }
        return Err(AllocError::Oom);
    }

    // Map the same file a second time, at a fixed address halfway into
    // the first mapping; MAP_FIXED replaces the pages already there.
    let ptr2 = mmap(
        (ptr as *mut u8).offset(half_size as isize) as *mut c_void,
        half_size,
        PROT_READ | PROT_WRITE,
        MAP_SHARED | MAP_FIXED,
        fd,
        0,
    );

For ptr2, a second mapping of the same file is created at a fixed address obtained by offsetting into the first mapping at ptr.

The reason is that I'm implementing a concurrent Prolog runtime in Rust with a consolidated heap. The first section of the heap (the memory map shared among all threads) is the atom table. The second section, which comes immediately after it, is the Prolog heap. Cells in the Prolog heap compose Prolog terms, and Prolog terms can point down into the atom table.
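
To make the layout concrete, here is roughly the per-thread call sequence I have in mind, written against the libc crate directly rather than memmap2 (a minimal sketch; atom_table_fd and the two lengths are placeholders, and error handling is simplified):

    use std::io;
    use std::os::unix::io::RawFd;

    const ATOM_TABLE_LEN: usize = 1 << 20; // the shared M bytes (placeholder size)
    const HEAP_LEN: usize = 1 << 24;       // the thread-local Prolog heap (placeholder size)

    /// Reserve one contiguous region, then overlay the shared atom table
    /// onto its first ATOM_TABLE_LEN bytes.
    unsafe fn composite_map(atom_table_fd: RawFd) -> io::Result<*mut u8> {
        // Reserve a single contiguous region covering both sections.
        let base = libc::mmap(
            std::ptr::null_mut(),
            ATOM_TABLE_LEN + HEAP_LEN,
            libc::PROT_READ | libc::PROT_WRITE,
            libc::MAP_PRIVATE | libc::MAP_ANONYMOUS,
            -1,
            0,
        );
        if base == libc::MAP_FAILED {
            return Err(io::Error::last_os_error());
        }
        // MAP_FIXED replaces the anonymous pages we just reserved at `base`
        // with a shared mapping of the atom table file.
        let table = libc::mmap(
            base,
            ATOM_TABLE_LEN,
            libc::PROT_READ | libc::PROT_WRITE,
            libc::MAP_SHARED | libc::MAP_FIXED,
            atom_table_fd,
            0,
        );
        if table == libc::MAP_FAILED {
            let err = io::Error::last_os_error();
            libc::munmap(base, ATOM_TABLE_LEN + HEAP_LEN);
            return Err(err);
        }
        Ok(base as *mut u8)
    }

The bytes in [base, base + ATOM_TABLE_LEN) are then shared among all VM threads, while the remainder is an anonymous region used only by the mapping thread as its Prolog heap.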

> From your description, I wonder why you don't just pass the byte slice itself to the other threads?

Because of the Prolog VM design, it's most convenient to map the atom table contiguously in memory next to the thread-local Prolog heap. In particular, converting atoms to strings (lists of characters) is very efficient this way.

> (It is also not clear to me what "thread-local memory map" means here, as a memory mapping is by design process-wide: all threads share a single address space in which all memory maps are contained.)

By that I mean that every VM thread will have its own local MmapMut object that will never cross thread boundaries, but that contains the shared atom table Mmap* as a sub-map.

> And what kind of synchronization exists between them, as multiple threads writing into the same mapping without synchronization is likely to imply a data race.

Absolutely; I plan to use RCU to manage that, but that's outside the scope of the memmap2 library.

Also, I'm aware that memory mapping is process-wide, not per-thread. I mention threads just to give context for my use case. I don't expect memmap2 to be aware of threading and synchronization.

@adamreichold

So this appears to be a duplicate of #35, i.e. support for MAP_FIXED.

To be honest, using MAP_FIXED safely appears even harder than using plain non-overlapping mappings, since it silently replaces whatever already occupies the target range, and it does not appear to be supported on Windows. So I am unsure whether this crate would be particularly helpful here compared to lower-level wrappers like libc, nix or rustix.

Finally, though this is beside the questions about this crate's API: why could the terms not just point into a global atom table wherever it happens to be allocated, i.e. why must the two sections of the heap be contiguous?


mthom commented Dec 5, 2023

Heap cells are tagged 64-bit pointers, so the addresses they store are relative rather than absolute. If the atom table and the Prolog heap were not contiguous, further tags would be needed to distinguish which region a relative address points into, which would complicate the VM design.
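
As a rough illustration (the cell layout here is invented for the example, not the VM's actual format), contiguity lets every relative address resolve through a single base pointer:

    const TAG_BITS: u64 = 8;

    // A heap cell: the low 8 bits hold the tag, the high 56 bits hold an
    // offset relative to the base of the combined mapping.
    #[derive(Clone, Copy)]
    struct Cell(u64);

    impl Cell {
        fn tag(self) -> u8 {
            (self.0 & 0xff) as u8
        }

        // Resolve the cell against the single mapping base. The result can
        // land in the atom table or in the Prolog heap; no extra tag is
        // needed to say which, because both live in one contiguous region.
        unsafe fn resolve(self, base: *const u8) -> *const u8 {
            base.add((self.0 >> TAG_BITS) as usize)
        }
    }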


mthom commented Dec 5, 2023

If PRs are welcome for this, I'll try to implement MAP_FIXED. I need it to work on Windows too, so I'll be looking at that as well.


mthom commented Dec 8, 2023

You are right: mapping at fixed addresses is not possible on Windows. Closing this issue.

mthom closed this as completed Dec 8, 2023
mthom reopened this Dec 8, 2023
mthom closed this as not planned Dec 8, 2023