Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

vec/api: fix insert for index == len #104

Merged
merged 3 commits into from
Feb 13, 2021

Conversation

ordian
Copy link
Contributor

@ordian ordian commented Feb 9, 2021

The docs says "Panics if index > len", not >=.

@ordian ordian force-pushed the fix-insert-edge-case branch 2 times, most recently from ad3a368 to bb94cc0 Compare February 9, 2021 08:29
@ordian ordian reopened this Feb 9, 2021
@@ -1265,7 +1271,7 @@ where
&mut self,
mid: usize,
) -> (&mut BitSlice<O, T::Alias>, &mut BitSlice<O, T::Alias>) {
self.assert_in_bounds(mid, 0..self.len());
self.assert_in_bounds(mid, 0..=self.len());
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm actually not sure whether split_at_unchecked_mut is UB-free in case mid == self.len()

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not willing to commit firmly to an answer in either direction but here's my current understanding of address validity rules:

  • Due to the half-open iterator semantics required by C++, LLVM and therefore Rust permit that for any allocation of base B and length N bytes, the integer address value B + N is considered "within" the allocation at B *when computed by B + N. This pointer is not equivalent to a hypothetical address-equivalent C allocation base address that the allocator places immediately after the last live byte (B + N - 1) of the B allocation
  • The address production rules within bitvec always compute from the base address of the allocation. A BitSlice which extends up to the last bit of the last byte in an allocation will thus cause the arithmetic B + N for a base element address B and byte-count N. This new slice address points to the one-past byte that the C++ (and therefore LLVM (and therefore Rust)) require to be "within bounds", even though it is not dereferencable. The produced bit-slice has length zero, forbidding dereference.

Currently, Miri does not complain about accessing bytes out of bounds of allocations on any of my tests. I am unwilling to take Miri's silence as an affirmative proof of being fully defined; however, I am willing to take Miri's silence as a "good-enough" and not bother writing more defensively until either it complains or the Unsafe Code Guidelines working group cites me a chapter and verse about why my code will become undefined in the future.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case the tone is unclear: I absolutely appreciate double-checks on any part of my code, especially involving pointer arithmetic. I am writing that much detail because I have already found myself needing to recreate my own work from scratch after forgetting what I knew when I wrote it, so I prefer to have written records of my thought processes wherever it seems useful to find them in the future.

Thank you for commenting; please feel free to do so on any other sites that strike you as suspect.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I appreciate your elaborate answer!

I agree with the "the one-past byte" argument, but the reason I asked is because the docs for get_unchecked_mut says

/// Calling this method with an
/// out-of-bounds index is [undefined behavior] even if the resulting
/// reference is not used.

However, it seems to be talking about discrete indexes, not range, and I missed that detail.

Copy link
Collaborator

@myrrlyn myrrlyn Feb 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could speculate or I could ask the originals.

The definition of [T]::split_at_mut does the same pointer arithmetic, so .split_at_mut(.len()) produces (self, &mut []) where the specific empty slice returned is at the one-past-last address, not the dangling address.

I used to actually produce the dangling-pointer empty slice rather than the one-past-last empty slice, but somebody filed a bug saying that my behavior diverged from std.

It's still probably UB to use .get_unchecked_mut to produce a live slice reference beyond bounds, but empty slices are not live references.

@myrrlyn myrrlyn merged commit c71ea23 into ferrilab:develop Feb 13, 2021
@ordian ordian deleted the fix-insert-edge-case branch February 13, 2021 10:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants