Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Safe API for RO memmap construction #73

Closed
Plecra opened this issue May 7, 2023 · 18 comments · Fixed by #76
Closed

Safe API for RO memmap construction #73

Plecra opened this issue May 7, 2023 · 18 comments · Fixed by #76

Comments

@Plecra
Copy link

Plecra commented May 7, 2023

I'd like to be able to call a safe version of MmapRaw::from(unsafe { Mmap::map(&file) }), as a sibling API for map_raw.

Maybe it should be possible to pass flags for read/write/exec to MmapOptions::map_raw?

@RazrFalcon
Copy link
Owner

I'm not sure I understood your question. MmapRaw already implements from Mmap.

@Plecra
Copy link
Author

Plecra commented May 7, 2023

Yep, the code I wrote works, however it requires that unsafe block, and I was hoping we could get a safe API. The API would be safe for the same reason the current map_rawis safe.

@RazrFalcon
Copy link
Owner

Can you provide a code example?

@Plecra
Copy link
Author

Plecra commented May 8, 2023

Sure thing :)

// this is some (abridged) code in my project
pub struct FileContent(MmapRaw);
impl FileContent {
    pub fn from_file(file: &std::fs::File) -> std::io::Result<Self> {
        // I would like this code to be forbid_unsafe, but currently need an unsafe block to call `Mmap::map`
        Ok(Self(unsafe { memmap2::Mmap::map(file) }?.into()))
        // Instead, I'd rather write:
        Ok(Self(memmap2::MmapRaw::map_raw_read_only(file)?))
    }
}

@RazrFalcon
Copy link
Owner

memmap is inherently unsafe. Even in a read-only mode, because other process can modify the file content, which violates Rust safety rules. It cannot be made safe.

@Plecra
Copy link
Author

Plecra commented May 8, 2023

If you believe that, you're going to need to fix the https://docs.rs/memmap2/latest/memmap2/struct.MmapRaw.html#method.map_raw API 😉

However I think that API is perfectly valid, because it doesn't allow any UB and is perfectly sound to call within Rust. At least, I'd be incapable of triggering UB with it.

@adamreichold
Copy link

// I would like this code to be forbid_unsafe, but currently need an unsafe block to call Mmap::map

But what would you do with the raw pointer access enabled via MmapRaw?

@RazrFalcon
Copy link
Owner

You would need unsafe to access MmapRaw anyway, so there is no point in making the constructor "safe", even assuming that this is theoretically possible.

If you believe that, you're going to need to fix

Fix what? Everything works as expected.

@Plecra
Copy link
Author

Plecra commented May 8, 2023

You have an API MmapRaw::map_raw, which is safe for justified reasons. I would like to use an equivalent API that doesn't include the WRITE flag - I'm comparing them because "You would need unsafe to access MmapRaw anyway, so there is no point in making the constructor "safe"" equally applies to the MmapRaw::map_raw API. I would prefer to avoid writing unsafe blocks where there is no safety justification necessary, because I believe that unnecessarily makes it harder to review the soundness of my crate.

That's all I came to ask :) If you feel the additional API in memmap2 would be superfluous, you can close the issue

@Plecra
Copy link
Author

Plecra commented May 8, 2023

To maybe clarify the original issue, I would like an extra parameter in the MmapRaw::map_raw API, which allows me to control the configuration flags for the memory map set here...:

libc::PROT_READ | libc::PROT_WRITE,

...without changing the safety properties of the API

@adamreichold
Copy link

adamreichold commented May 8, 2023

I would prefer to avoid writing unsafe blocks where there is no safety justification necessary, because I believe that unnecessarily makes it harder to review the soundness of my crate.

I think the aim is understandable, but sadly and especially with mmap, the effects of unsafe code are usually not localized to single unsafe blocks. Sometimes, there are not even localized to single modules.

For example, let's you safely create a read-only MmapRaw which means you now have access to raw pointers which can read the underlying memory using e.g. std::ptr::read. However, for these operations to be defined, you need to ensure that no other process is concurrently modifying the mapped file so that e.g. std::ptr::read tries to load something that is not a properly initialized value of its type.

So in this case, the safety requirements are not just not localized in the same process as the unsafe call to std::ptr::read, they even span multiple processes accessing the same files. Hence, the justification for your unsafe block with have to rely on global reasoning in any case and one could argue that the constructor is a better place to centralize that reasoning about the operating system environment that the individual accesses.

So in summary, while technically the API you envision can be added and would be sound as far I can see, I don't think it will really benefit your initial aim for introducing it.

@Plecra
Copy link
Author

Plecra commented May 8, 2023

I'm specifically working with shared memory and pointer reads are reasoned about in the presence of it, so I definitely don't agree there 😄 I still think you might as well be talking about the API that's already present in the crate (do you think it should be deprecated?) but I can't add any more to this discussion.

@adamreichold
Copy link

adamreichold commented May 8, 2023

I still think you might as well be talking about the API that's already present in the crate (do you think it should be deprecated?) but I can't add any more to this discussion.

Yes, the existing MmapRaw::map_raw and MmapOptions::map_raw could have be left out for the From implementations. But it is a much harder proposition to remove something that might not carry its weight, than to not add something new entirely. And not being overly useful does not seem to warrant the ecosystem-wide churn of deprecating (and eventually removing) them.

EDIT: Add a missing "not" before "add".

@adamreichold
Copy link

@RazrFalcon We do have MmapOptions::map_copy_read_only which I think is similarly questionable w.r.t global invariants. I think the implementation of MmapOptions::map_raw_read_only would be as simple as adding

pub fn map_raw_read_only<T: MmapAsRawDesc>(&self, file: T) -> Result<MmapRaw> {
    let desc = file.as_raw_desc();

    MmapInner::map(self.get_len(&file)?, desc.0, self.offset, self.populate)
        .map(|inner| MmapRaw { inner })
}

and some documentation. Meaning the implementation is basically free. So maybe we just add it?

@RazrFalcon
Copy link
Owner

@adamreichold But map_copy_read_only is unsafe, so it's fine. No?

I honestly do not have any input here. I personally use this crate just to read files faster. This is the only thing I care about. I have zero knowledge and experience with any other use cases.

@Plecra I'm sorry, but I'm genuinely have no idea what problem are you having. If this all just about a slightly simpler API, then I think it can be ignored, since memmap is inherently unsafe. There is no point in pretending it is. It would just confuse the user.
If not, then please explain your request better.

@Plecra
Copy link
Author

Plecra commented May 10, 2023

Nope, that's it :) It's about extending the safe map_raw API to avoid creating the unsound Mmap instance in my code. In principle it's unsound in my case, because another process might be writing to the backing data. My immediate downcast into MmapRaw makes it sound in practice though, because the invalid Mmap doesn't ever actually break Rust's invariants themselves.

I want to use map_raw for my situation because my usage is inherently safe, and using the current API forces me to write unsound code w.r.t. the memmap2 crate's API invariants. I want the more appropriate api

@Plecra Plecra closed this as completed May 10, 2023
@adamreichold
Copy link

because the invalid Mmap doesn't ever actually break Rust's invariants themselves.

Note that this is not unsound: It is common for unsafe code to temporarily lift invariants so that even calling safe methods of the involved types would be UB (the Deref impl of Mmap in this case) which is fine as long as these "broken" objects are not exposed to safe code in that state. So your FileContents::from_file is sound.

@RazrFalcon
Copy link
Owner

@Plecra Published. Hopefully this is what you wanted.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants