-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use zeroed allocations in the mir interpreter instead eagerly touching the memory #87777
Conversation
Currently there is no API that allows fallible zero-allocation of a Vec. Vec.try_reserve is not appropriate for this job since it doesn't know whether it should zero or arbitrary uninitialized memory is fine. Since Box currently holds most of the zeroing/uninit/slice allocation APIs it's the best place to add yet another entry into this feature matrix.
Some changes occured to the CTFE / Miri engine cc @rust-lang/miri |
(rust-highfive has picked a reviewer for you, use r? to override) |
@bors try @rust-timer queue |
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf |
⌛ Trying commit 83b01b9 with merge 03a355da4d3c576621f95121eda74d06b5d46623... |
This comment has been minimized.
This comment has been minimized.
☀️ Try build successful - checks-actions |
This comment has been minimized.
This comment has been minimized.
@@ -589,6 +589,38 @@ impl<T> Box<[T]> { | |||
pub fn new_zeroed_slice(len: usize) -> Box<[mem::MaybeUninit<T>]> { | |||
unsafe { RawVec::with_capacity_zeroed(len).into_box(len) } | |||
} | |||
|
|||
/// Constructs a new boxed slice with uninitialized contents, with the memory |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Saying 'uninitialized contents' sounds confusing when we also state that the memory is zeroed. I think it would be better to write out MaybeUninit<T>
, to make it clear that the allocated memory is initialized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I copied that from another _zeroed
function, should I fix those too?
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Oh wait, I've used the wrong conversion API to turn it into a Vec. |
@bors try @rust-timer queue |
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf |
This comment has been minimized.
This comment has been minimized.
⌛ Trying commit f408d4b with merge 59d8565f8dd0e5f33615fac8de3d585ceb47b7fe... |
☀️ Try build successful - checks-actions |
Queued 59d8565f8dd0e5f33615fac8de3d585ceb47b7fe with parent 2ddb65c, future comparison URL. |
Finished benchmarking try commit (59d8565f8dd0e5f33615fac8de3d585ceb47b7fe): comparison url. Summary: This benchmark run did not return any significant changes. If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. @bors rollup=never |
That yields the expected improvements in page faults and max-rss. The max-rss results have a question mark next to them but the results are larger than the normal variance range, so it should be a non-spurious improvement. |
CI is failing in a fulldeps test due to cfged out imports. Other than that, lgtm. Independently: maybe the also cc @rust-lang/wg-const-eval the allocation scheme of |
Just wanted to get a perf report first. Should be fixed now.
Done. |
lgtm now, but someone from libs should look at the allocator API addition r? rust-lang/libs-api |
looks like highfive doesn't like @rust-lang/libs-api sorry to ping y'all, but can someone take a quick look a the (unstable) box API changes? |
josh is part of the libs team ;) |
👍 for the changes to unstable allocator APIs. We can decide later if we want to stabilize those APIs, but adding them to nightly for use in the compiler itself seems fine here. |
@bors r=oli-obk,joshtriplett |
📌 Commit 1c21373 has been approved by |
☀️ Test successful - checks-actions |
@@ -28,7 +28,7 @@ use crate::ty; | |||
pub struct Allocation<Tag = AllocId, Extra = ()> { | |||
/// The actual bytes of the allocation. | |||
/// Note that the bytes of a pointer represent the offset of the pointer. | |||
bytes: Vec<u8>, | |||
bytes: Box<[u8]>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Turns out changing this from Vec
to Box
introduced UB into the interpreter itself... also see Zulip
…li-obk interpret/allocation: fix aliasing issue in interpreter and refactor getters a bit That new raw getter will be needed to let Miri pass pointers to natively executed FFI code ("extern-so" mode). While doing that I realized our get_bytes_mut are named less scary than get_bytes_unchecked so I rectified that. Also I realized `mem_copy_repeatedly` would break if we called it for multiple overlapping copies so I made sure this does not happen. And I realized that we are actually [violating Stacked Borrows in the interpreter](https://rust-lang.zulipchat.com/#narrow/stream/136281-t-opsem/topic/I.20think.20Miri.20violates.20Stacked.20Borrows.20.F0.9F.99.88).^^ That was introduced in rust-lang#87777. r? `@oli-obk`
…li-obk interpret/allocation: fix aliasing issue in interpreter and refactor getters a bit That new raw getter will be needed to let Miri pass pointers to natively executed FFI code ("extern-so" mode). While doing that I realized our get_bytes_mut are named less scary than get_bytes_unchecked so I rectified that. Also I realized `mem_copy_repeatedly` would break if we called it for multiple overlapping copies so I made sure this does not happen. And I realized that we are actually [violating Stacked Borrows in the interpreter](https://rust-lang.zulipchat.com/#narrow/stream/136281-t-opsem/topic/I.20think.20Miri.20violates.20Stacked.20Borrows.20.F0.9F.99.88).^^ That was introduced in rust-lang#87777. r? ``@oli-obk``
…li-obk interpret/allocation: fix aliasing issue in interpreter and refactor getters a bit That new raw getter will be needed to let Miri pass pointers to natively executed FFI code ("extern-so" mode). While doing that I realized our get_bytes_mut are named less scary than get_bytes_unchecked so I rectified that. Also I realized `mem_copy_repeatedly` would break if we called it for multiple overlapping copies so I made sure this does not happen. And I realized that we are actually [violating Stacked Borrows in the interpreter](https://rust-lang.zulipchat.com/#narrow/stream/136281-t-opsem/topic/I.20think.20Miri.20violates.20Stacked.20Borrows.20.F0.9F.99.88).^^ That was introduced in rust-lang#87777. r? ```@oli-obk```
Rollup merge of rust-lang#122537 - RalfJung:interpret-allocation, r=oli-obk interpret/allocation: fix aliasing issue in interpreter and refactor getters a bit That new raw getter will be needed to let Miri pass pointers to natively executed FFI code ("extern-so" mode). While doing that I realized our get_bytes_mut are named less scary than get_bytes_unchecked so I rectified that. Also I realized `mem_copy_repeatedly` would break if we called it for multiple overlapping copies so I made sure this does not happen. And I realized that we are actually [violating Stacked Borrows in the interpreter](https://rust-lang.zulipchat.com/#narrow/stream/136281-t-opsem/topic/I.20think.20Miri.20violates.20Stacked.20Borrows.20.F0.9F.99.88).^^ That was introduced in rust-lang#87777. r? ```@oli-obk```
#86255 introduced a 30% regression in page faults and a 3% regression in max-rss in the ctfe-stress benchmarks.
That's most likely happened because it separated allocation from initialization of the vec which defeats the zero-optimization.
Currently there's no allocation API that is fallible, zeroing and returns a slice, so this PR introduces one and then uses that to solve the problem. In principle
vec.resize(len, 0)
could be optimized to usealloc::grow_zeroed
where appropriate but that would require new specializations and new plumbing inRawVec
.