
Creating a slice of uninit memory is unsound (equivalent to mem::uninitialized) #220

Closed
Shnatsel opened this issue Nov 1, 2019 · 21 comments · Fixed by #373
@Shnatsel
Member

Shnatsel commented Nov 1, 2019

flate2 creates slices of uninitialized memory in several places. The only two places where it happens when using the Rust backend are here and here.

This is equivalent to the use of the now-deprecated mem::uninitialized. Instead, either a slice of MaybeUninit<T> should be constructed and passed to the backends, or the backends should receive a structure that does not expose uninitialized memory by design such as Vec or a Vec-like fixed-capacity view of memory.

If the backend does not overwrite the entire slice, this can become an exploitable security vulnerability.

@fintelia

fintelia commented May 8, 2021

What is the status of this? Are these concerns purely hypothetical, or is there potentially unsoundness in the library?

@Manishearth
Member

This is potential unsoundness; the current Rust memory model (and the current LLVM memory model) treats uninits specially.

@Shnatsel
Member Author

You can find more details about the way uninitialized memory is treated by C and Rust here: https://www.ralfj.de/blog/2019/07/14/uninit.html

flate2 itself does not read from the backing slice, but the compiler is allowed to insert spurious reads for optimization purposes. Some examples (although for different kinds of UB) can be found here.

The nightly-only BorrowedCursor would make fixing this easier, but alas, it's nightly-only. The reasonable fix I can see is calling .assume_init() only after the slice is filled and only on the filled parts for C libraries, and using miniz_oxide's output to Vec to avoid going through unsafe in the pure-rust codepath.

@oyvindln
Contributor

oyvindln commented Aug 29, 2023

For the Rust back-end we shouldn't really need unsafe code unless there is a very good reason for it. I'm not sure that, say, avoiding zero-initializing a vector is worth it. miniz_oxide already avoids unsafe without any drastic performance implications, so I'm not sure why flate2 would need it either.

anforowicz added a commit to anforowicz/flate2-rs that referenced this issue Aug 29, 2023
@anforowicz
Contributor

FWIW I've attempted to address this by switching to naive safe code (using vec.resize(capacity, 0) to fill the spare capacity with zeros). See the PR here: #373

Going via &mut [MaybeUninit<u8>] may also be possible, but it would require some extra considerations:

@Shnatsel
Member Author

If you are creating a vector from scratch, using vec![0; capacity] is faster than vec.resize(capacity, 0) because it requests already-zeroed memory from the OS.

@anforowicz
Contributor

anforowicz commented Aug 30, 2023

If you are creating a vector from scratch, using vec![0; capacity] is faster than vec.resize(capacity, 0) because it requests already-zeroed memory from the OS.

Ack.

Here (e.g. in compress_vec) we are not creating a Vec from scratch, but instead we are working with the extra space in the spare capacity of an existing Vec. (OTOH, I guess that the caller of compress_vec may very well call Vec::with_capacity to create a Vec from scratch.)

Still, maybe using vec.resize(capacity, 0) is unnecessarily pessimizing the performance here. One alternative would be to use Vec::spare_capacity_mut and then (somehow... waving hands...) use unsafe code to fill the spare capacity with zeros and treat it as &mut [u8].

So, maybe I should change the PR to use a little bit of unsafe as follows?:

fn write_to_spare_capacity_of_vec&lt;T&gt;(
    output: &mut Vec&lt;u8&gt;,
    writer: impl FnOnce(&mut [u8]) -> (usize, T),
) -> T {
    let zeroed_spare_space: &mut [u8] = {
        let uninited_spare_space = output.spare_capacity_mut();
        let ptr = uninited_spare_space.as_mut_ptr() as *mut u8;
        let len = uninited_spare_space.len();
        unsafe {
            // Safety of `write_bytes`:
            // * `ptr` is non-null and points to `len` bytes within a single allocated object
            //   (guaranteed by `spare_capacity_mut` returning `&mut [MaybeUninit<u8>]`).
            // * Alignment of `u8` is 1 (so, not a concern).
            core::ptr::write_bytes(ptr, 0, len);

            // Safety of `from_raw_parts_mut`:
            // * `ptr` points to `len` consecutive properly initialized values of type `u8`
            //   (guaranteed by our call to `ptr::write_bytes` above).
            // * The memory referenced by the returned slice is not accessed through any other
            //   pointer (not derived from the return value) for the duration of its lifetime
            //   (guaranteed by exclusivity of the `&mut [MaybeUninit<u8>]` returned by
            //   `spare_capacity_mut`).
            core::slice::from_raw_parts_mut(ptr, len)
        }
    };

    let (bytes_written, ret) = writer(zeroed_spare_space);

    // Safety: `core::cmp::min` clamps `bytes_written` (making `set_len` safe even if `writer`
    // misbehaves and returns an arbitrary `bytes_written`). Note that above we have already
    // made sure that all spare-capacity bytes are initialized via `ptr::write_bytes`.
    let new_len = core::cmp::min(output.len() + bytes_written, output.capacity());
    unsafe { output.set_len(new_len); }

    ret
}

WDYT?

@Shnatsel
Member Author

I don't see why any of those solutions would be better than vec.resize(capacity, 0).

We could avoid re-initializing the same part over and over by keeping track of what part of the Vec has already been initialized in previous iterations, but that relies on no other code running mem::replace on our Vec, and on the Vec never reallocating, which is very fragile.

BorrowedBuf + BorrowedCursor could help here, but that's nightly-only.

@Shnatsel
Member Author

Shnatsel commented Aug 30, 2023

The approach of writing to the spare capacity without zeroing it first sounds good, provided that the C library wrappers are adjusted to write to a &mut [MaybeUninit<u8>].

@anforowicz
Contributor

The approach of writing to the spare capacity without zeroing it first sounds good, provided that the C library wrappers are adjusted to write to a &mut [MaybeUninit<u8>].

Yes, that would work for the C backend.

What about the Rust backend though? The Rust backend would also need to be adjusted to write to a &mut [MaybeUninit<u8>], right? If so, then a prerequisite for this approach would be to also change the miniz_oxide crate as outlined in #220 (comment). Do you think that we should pursue such changes to the miniz_oxide crate?

@Shnatsel
Member Author

miniz_oxide always zeroes the output buffer for safety even when writing to a Vec: https://docs.rs/miniz_oxide/0.7.1/src/miniz_oxide/deflate/mod.rs.html#121-157

I think it's best to just zero the buffer when using miniz_oxide instead of revamping its internals to use MaybeUninit.

@anforowicz
Contributor

Using MaybeUninit requires changing the public API of https://docs.rs/flate2/latest/flate2/struct.Compress.html, right? I see two options:

  • Option1: Add a new pub fn compress_uninit(..., output: &mut [MaybeUninit<u8>], ...)
  • Option2: Change the type of output in the existing pub fn compress

I guess option1 sounds better / less disruptive? Can you please provide feedback on this?


And then let me also mention that we would use MaybeUninit in the C/Rust-agnostic InflateBackend::decompress (and in DeflateBackend::compress). And C would rely on C-side initialization, while Rust would zero-initialize before calling into miniz_oxide. Please shout if this doesn't sound right.

I note that this means that Rust may repeatedly initialize the same bytes over and over again as you point out in #220 (comment). Is this okay?

@oyvindln
Contributor

oyvindln commented Aug 30, 2023

The compress_vec and decompress_vec functions are really just convenience wrappers around compress/decompress (which operate on a slice), plus some unsafe code to use the allocated unused capacity of the Vec. So I'm not sure there is strictly a need for them in their current form either. If zero-initialization in compress_vec is too much of an impact, it may be better to leave output-buffer handling to the caller, or to suggest using compress/decompress directly for callers that need maximum flexibility, since those write straight to a slice.

Whether a new pub fn compress_uninit(..., output: &mut [MaybeUninit<u8>], ...) would have a benefit beyond the current functions is a separate question.

@anforowicz
Contributor

Let me try to summarize the options we have identified so far (focusing on compress_vec but the notes should also carry over to decompress_vec):

The performance impact of Fix1 and breaking impact of Fix3 depend on how flate2 is used in practice. The impact may not be that great based on the top 10 crates from https://crates.io/crates/flate2/reverse_dependencies which are listed below:

  • syn - no calls to compress_vec nor decompress_vec - see the search query here
  • object - calls decompress_vec on a vector created with Vec::with_capacity (see here)
  • rustix - no hits - see here
  • tonic - no hits
  • zip - no hits
  • png - no hits
  • tower-http - no hits
  • async-compression - no hits
  • actix-http - no hits
  • cxxbridge-macro - no hits

@Shnatsel
Member Author

If calls to compress_vec and decompress_vec are uncommon, then I'd rather make them slightly slower through zero-initialization than remove these functions outright.

@anforowicz
Contributor

anforowicz commented Sep 1, 2023

If calls to compress_vec and decompress_vec are uncommon, then I'd rather make them slightly slower through zero-initialization than remove these functions outright.

Ack. Both Fix1 and Fix3 are easy ways to avoid UB. I can put up a PR for either one. OTOH, I have a slight preference for Fix3 (removal), because I don't see the benefit of keeping these functions. It seems that making them safe (Fix1) will make them somewhat performance-hostile (i.e. they can lead to repeatedly reinitializing a slice of memory, which means that using compress_vec can regress, rather than improve, performance compared to compress). Maybe I missed some other benefits of keeping these functions around?

@anforowicz
Contributor

/cc @Byron

Can you please provide your feedback on this issue? Which of the fix options discussed above would you prefer? (e.g. out of the ones in #220 (comment))

It may be worth continuing the community discussion on this issue, but I also wonder who the decision maker here is (i.e. who can review, approve, and merge PRs)? I see that you've merged most of the recent PRs, so I hope you won't mind being CC-ed / summoned into this issue :-).

@Byron
Member

Byron commented Sep 2, 2023

Thanks for bringing me in - I was too busy in the past days, but am glad to join now, since all the information has been gathered so neatly. It took an hour to catch up on all that was said, with a considerable portion spent on reading Ralf Jung's excellent articles (again, probably, I keep forgetting).

With that said, here is my amateur perspective on uninitialized memory:

  • it's like Option<u8> and hence observing it is forbidden until it becomes Some(v). If it's observed anyway, it's UB.
  • Creating a slice from uninitialized memory is UB because it allows reading. Here it's less about whether it's actually read, but that it can be read, and strange optimizations can happen if this is done. The compiler can also insert spurious reads for optimization, which may cause said memory to be read even though it looks like it will only be written to.

So for me it's also established that the current behaviour of compress_vec needs fixing.

I think it's worth spelling out that correctness always trumps performance, so performance considerations shouldn't hold back a simple fix - optimizations, possibly like those discussed, can be made later if there is demand.

As a general stance, we will avoid breaking changes, as these ripple through the ecosystem with a lot of churn, so any fix here must be non-breaking.

From the above it seems clear that #373 is a suitable and welcome fix.

Regarding performance

gitoxide is an avid user of flate2. The most performance-sensitive code is stream decompression, which happens to use decompress() with its own buffer management. I spent some time profiling the code to arrive there, and it appears that those who truly depend on the fastest possible implementation wouldn't use (de)compress_vec() anyway. gitoxide settled on reusing allocated buffers.

It's notable that gitoxide still doesn't re-use inflate state which it truly should for even more performance, as this code is called a whole lot, but I digress.

Conclusion

Yes, let's merge the fix and not worry about performance as those with performance concerns will have suitable alternatives. Correctness is more important.

Maybe there are more spots in the code with soundness issues like this? I'd love to see them rooted out. Thanks everyone for your help and engagement, it's much appreciated. I am grateful too, as I feel I became a little more aware and could adjust my mental model of uninitialized memory. I think gitoxide also has these issues and I will revisit them (i.e. v = Vec::with_capacity(x); v.set_len(x); &mut v).

@Byron Byron closed this as completed in #373 Sep 2, 2023
@anforowicz
Contributor

Any plans for releasing a new version of the crate? (If so, it would be greatly appreciated, as it would help me get approval to import the crate into my project without having to carry a patch on top of the last released version.)

@Manishearth
Member

Manishearth commented Oct 11, 2023

@Byron

it's like Option<u8> and hence observing it is forbidden until it becomes Some(v). If it's observed anyway, it's UB.

This is incorrect, and a common misapprehension. Having uninitialized memory is instantly UB even if you never read from it. I.e. MaybeUninit::uninit().assume_init() is always UB even if you ignore the value.

The exception to this is MaybeUninit itself and certain aggregates of it (like arrays of MaybeUninit), which are allowed to inhabit the uninit state.

Similarly,

Creating a slice from uninitialized memory is UB because it allows reading

it's UB automatically, regardless of what it allows

@Byron
Member

Byron commented Oct 11, 2023

it's UB automatically, regardless of what it allows

Uninitialised memory is even meaner than I thought 😅.

I will prepare a new release now.
