Skip to content

Switch BackingBuffer::VirtualMemory to Safe Box<[u8]> Backing #392

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

Barca545
Copy link

@Barca545 Barca545 commented Jun 2, 2025

Motivation

This PR addresses concerns raised in #359 about the safety and maintainability of manually managed virtual memory using NonNull<u8> and pointer arithmetic. By switching to Box<[u8]>, we maintain similar memory semantics while gaining safety and reducing code complexity.

Changes

Main Changes

  • Used a Box<[u8]> for the contents of BackingBuffer::VirtualMemory instead of a NonNull.
  • Unsafe blocks are now limited to system-level allocation/deallocation and bounds-checked access is the default.
  • Removed most manual pointer arithmetic and unsafe dereferencing in favor of slice indexing.
  • Switching to Box<[u8]> also allowed me to remove the text field from GapBuffer which reduces the size of the struct.

Miscellaneous changes

  • Pinned down the toolchain to nightly-2025-04-15 as not all features required to build this crate are available on all versions of nightly.
  • Added line to clarify virtual_release returns memory which will error if accessed before committed.

Results

Benchmarks were obtained by running cargo bench --bench lib -- --save-baseline before on main and cargo bench --bench lib -- --save-baseline box-raw-alloc on my branch. Results were then exported to JSON and compared using critcmp before.json box-raw-alloc.json --list. See before.json and box-raw-alloc.json.

  • Benchmarks show no statistically significant regression in performance, despite the shift to safe abstractions.
  • Binary size is 1kb smaller than the current release build on my machine and 20kb smaller than the download.
  • All existing tests pass. No regressions were found during local testing.

# Motivation
This PR addresses concerns raised in [microsoft#359](microsoft#359) about the safety and maintainability of manually managed virtual memory using `NonNull<u8>` and pointer arithmetic. By switching to `Box<[u8]>`, we maintain similar memory semantics while gaining safety and reducing code complexity.

## Main Changes
- Used a `Box<[u8]>` for the contents of `BackingBuffer::VirtualMemory` instead of a `NonNull`.
- Unsafe blocks are now limited to system-level allocation/deallocation and bounds-checked access is the default.
- Removed most manual pointer arithmetic and unsafe dereferencing in favor of slice indexing.
- Switching to `Box<[u8]>` also allowed me to remove the `text` field from `GapBuffer` which reduces the size of the struct.

## Miscellaneous changes
- Pinned down the toolchain to `nightly-2025-04-15` as not all features required to build this crate are available on all versions of nightly.
- Added line to clarify `virtual_release` returns memory which will error if accessed before [committed](https://learn.microsoft.com/en-us/windows/win32/Memory/page-state).

# Results
Benchmarks were obtained by running `cargo bench --bench lib -- --save-baseline before` on main and `cargo bench --bench lib -- --save-baseline box-raw-alloc` on my branch. Results were then exported to JSON and compared using `critcmp before.json box-raw-alloc.json --list`. See [before.json (https://drive.google.com/file/d/1krzmBTmidI8jgXu08EAo_FelmFmKP2i-/view?usp=drive_link) and [box-raw-alloc.json](https://drive.google.com/file/d/1rwud47HCDTiGTh9jqEPtWSChIXiVJqst/view?usp=sharing).

- Benchmarks show no statistically significant regression in performance, despite the shift to safe abstractions.
- Binary size is 1kb smaller than the current release build on [my machine](https://drive.google.com/file/d/1Ey73e8h9G1L4fmR0cyWgGshcr7QfyGWX/view?usp=drive_link) and 20kb smaller than the download.
- All existing tests pass. No regressions were found during local testing.
@Barca545
Copy link
Author

Barca545 commented Jun 2, 2025

@microsoft-github-policy-service agree

Copy link
Member

@lhecker lhecker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm rather skeptical of this change. The largest benefit seems to stem from the match &self.buffer blocks which are responsible for removing most of the unsafe calls. I think it may be worth exploring that further.

...but doing it like that is not safe. The gap in the gap buffer is akin to the uninitialized end in a Vec. While accessing it is safe (the memory is committed in the entire buf range), this has to be done carefully. It's completely uninitialized memory in release builds and contains previous buffer contents. This seems like an improvement, but I'm skeptical if it's a safe improvement.

That aside, there's a bug in the implementation. This PR prompted me to add benchmarks for the GapBuffer which should help in the future. BTW: There are no buffer benchmarks and there is a 3% perf regression in my testing. But a 3% change is completely fine, because this is still 600% faster than ropey.

@@ -1,2 +1,3 @@
[toolchain]
channel = "nightly"
# Doesn't have to be this toolchain but I pinned it down to a concrete one because not all features required to build this crate are availible on all versions of nightly
channel = "nightly-2025-04-15"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'd prefer not pinning the nightly version as of yet.

src/lib.rs Outdated
Comment on lines 13 to 15
os_string_truncate,
box_vec_non_null,
ptr_metadata
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you opened the issue, Linux package maintainers have come forward asking us to build on stable Rust. This means we need to remove all nightly features sooner or later and it would be preferable if this PR didn't introduce new dependencies.

Copy link
Author

@Barca545 Barca545 Jun 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can easily remove box_vec_non_null and ptr_metadata. However, removing os_string_truncate causes the path to error because os_string_truncate is required for the truncate method of OS string.

I can remove it from my final submission but I am unsure if this issue is present in other versions of the repo/for other people.

I was also unaware it constituted a new dependency since it is present in the binary.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant it as in "taking dependency on a nightly feature", not as an actual cargo dependency. 😅

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah thanks for clarifying. In this case, I was referring to this feature being enabled in the binary's root.

At any rate, I am happy to remove the dependency from my next update as my code has no reliance on it. I only included it because otherwise the build fails for me due to path's reliance on the feature.

Comment on lines 260 to 269
BackingBuffer::VirtualMemory(buf) => {
for i in start..end {
buf[i] = 0;
}
}
BackingBuffer::Vec(buf) => {
for i in start..end {
buf[i] = 0;
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is unacceptable from a debug performance perspective. It needs to continue using a memset fill. It also doesn't set the contents to 0xCD.

Comment on lines 63 to 68
let ptr = unsafe {
VALLOCATOR
.allocate(Layout::from_size_align_unchecked(LARGE_CAPACITY, align_of::<u8>()))
.unwrap()
};
let buf = unsafe { Box::from_non_null_in(ptr, VALLOCATOR) };
Copy link
Member

@lhecker lhecker Jun 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Box<[T]> only provides new_uninit_slice too... I wonder how you're supposed to do this. It seems kind of sketchy if I'm honest.

Copy link
Author

@Barca545 Barca545 Jun 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, this is a good point. As I understand it, Box provides uninit_slice because it has no guarantees the memory will be initialized. But since by the time this memory is accessed it has been committed (so it is at least zero initialized this should be ok). MaybeUninit is mostly for the compiler so it knows not to do certain optimizations but as long as the memory is initialized to some value it should be fine. From the 'nomicon:

Safe Rust doesn't permit you to partially initialize an array. When you initialize an array, you can either set every value to the same thing with let x = [val; N], or you can specify each member individually with let x = [val1, val2, val3]. Unfortunately this is pretty rigid, especially if you need to initialize your array in a more incremental or dynamic way.

My only question/concern is if it needs to be initialized when the memory is allocated or when the array is initialized. As I understand it this array is not really initialized or accessed until the underlying allocation is committed so it should be ok to do this because we never actually access data which is possibly uninitialized in the sense the Rust abstract machine means?

This concern could be alleviated by committing the page alongside allocation as this would guarantee it was initialized at the time the [u8] was created however my understanding was this was avoided for resource management/performance reasons.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I understand it this array is not really initialized or accessed until the underlying allocation is committed so it should be ok to do this because we never actually access data which is possibly uninitialized in the sense the Rust abstract machine means?

In my reading this means the array is partially initialized, because the uncommitted parts have undetermined value.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an example, where this may theoretically break (even if contrived), is if Rust implements copy_within by reading beyond the bounds of the specified range, as long as it ensures the reads are within the slice bounds. Since the slice bounds are partially uncommitted, that may segfault.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, my mistake, I was under the impression the memory for the entire buffer was committed. Yes this change would break if the array is only partially committed and an attempt was made to access beyond the committed bounds.

In that case, I could implement an additional check the memory is committed before attempts to access it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we added checks, I believe that would defeat the purpose of the gap buffer managing its own gap. Rather than trying to build it around slices and Box, it's possible that an entirely different approach may yield a better abstraction.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, this makes sense, I will think on this to see if there is a better solution more guaranteed to be sound. Very likely, I can't come up with something better than you had originally.

In the meantime, I'm experimenting with a smart pointer similar to Box<[u8]> but with bounds checking altered to check against self.committed instead of self.len.

Is there a way to access the GapBuffer bench you mentioned further up?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just opened #403 which should merge some time soon.

unsafe { slice::from_raw_parts_mut(self.text.add(self.gap_off).as_ptr(), self.gap_len) }

match &mut self.buffer {
BackingBuffer::VirtualMemory(buf) => &mut buf[off..off + self.gap_len],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to use self.gap_off here.

Barca545 added 6 commits June 3, 2025 13:27
# Motivation
This PR addresses concerns raised in [microsoft#359](microsoft#359) about the safety and maintainability of manually managed virtual memory using `NonNull<u8>` and pointer arithmetic. By switching to `Box<[u8]>`, we maintain similar memory semantics while gaining safety and reducing code complexity.

## Main Changes
- Used a `Box<[u8]>` for the contents of `BackingBuffer::VirtualMemory` instead of a `NonNull`.
- Unsafe blocks are now limited to system-level allocation/deallocation and bounds-checked access is the default.
- Removed most manual pointer arithmetic and unsafe dereferencing in favor of slice indexing.
- Switching to `Box<[u8]>` also allowed me to remove the `text` field from `GapBuffer` which reduces the size of the struct.

## Miscellaneous changes
- Pinned down the toolchain to `nightly-2025-04-15` as not all features required to build this crate are available on all versions of nightly.
- Added line to clarify `virtual_release` returns memory which will error if accessed before [committed](https://learn.microsoft.com/en-us/windows/win32/Memory/page-state).

# Results
Benchmarks were obtained by running `cargo bench --bench lib -- --save-baseline before` on main and `cargo bench --bench lib -- --save-baseline box-raw-alloc` on my branch. Results were then exported to JSON and compared using `critcmp before.json box-raw-alloc.json --list`. See [before.json (https://drive.google.com/file/d/1krzmBTmidI8jgXu08EAo_FelmFmKP2i-/view?usp=drive_link) and [box-raw-alloc.json](https://drive.google.com/file/d/1rwud47HCDTiGTh9jqEPtWSChIXiVJqst/view?usp=sharing).

- Benchmarks show no statistically significant regression in performance, despite the shift to safe abstractions.
- Binary size is 1kb smaller than the current release build on [my machine](https://drive.google.com/file/d/1Ey73e8h9G1L4fmR0cyWgGshcr7QfyGWX/view?usp=drive_link) and 20kb smaller than the download.
- All existing tests pass. No regressions were found during local testing.
@Barca545
Copy link
Author

Barca545 commented Jun 4, 2025

The only relevant changes for my PR are in gap_buffer and virtual_memory. I accidentally uploaded more files than I intended. My apologies. I will fix the commits later today.

I created a VirtualMemory type that exposes bounds checked methods for reading and writing to the allocation. If this is a good approach there are some ergonomic tweaks I can make to improve the final implementation.

Barca545 added 4 commits June 5, 2025 15:27
Only changes which needed to be tracked were these 3.

os_string_truncate was removed from the `lib.rs` file on `main` but building fails on this commit because it was forked from an earlier version of `main`.
@lhecker lhecker marked this pull request as draft June 17, 2025 14:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants