Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stabilize copy_within #61398

Merged
merged 3 commits into from
Jun 13, 2019
Merged

Stabilize copy_within #61398

merged 3 commits into from
Jun 13, 2019

Conversation

kennytm
Copy link
Member

@kennytm kennytm commented May 31, 2019

Closes #54236.

@kennytm kennytm added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. S-blocked Status: Marked as blocked ❌ on something else such as an RFC or other implementation work. needs-fcp This change is insta-stable, so needs a completed FCP to proceed. labels May 31, 2019
@rust-highfive
Copy link
Collaborator

r? @joshtriplett

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 31, 2019
@kennytm kennytm added I-nominated S-waiting-on-team Status: Awaiting decision from the relevant subteam (see the T-<team> label). and removed S-blocked Status: Marked as blocked ❌ on something else such as an RFC or other implementation work. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 31, 2019
@kennytm
Copy link
Member Author

kennytm commented May 31, 2019

Stabilization report

The copy_within function is a safe wrapper of std::ptr::copy within the slice's bound. This is an important building block in writing a fast and safe code involving moving multiple chunks of bytes in a Vec<u8>.

let mut sl = [0, 1, 2, 3, 4, 5, 600, 700, 800, 900, 1000];
sl.copy_within(7..=9, 2);
assert_eq!(sl, [0, 1, 700, 800, 900, 5, 600, 700, 800, 900, 1000]);

The copy_within method was introduced in #53652, merged 8 months ago. Several APIs are explored in that PR, and the final form remains the same as today. This function is also available on crates.io as the copy_in_place package, and its API also did not see any change after #53652 was merged. This shows an implicit consensus that the current function signature is good enough to stabilize as-is.

Test cases can be found in src/libcore/tests/slice.rs. It tests the normal behavior and error reporting when given invalid indices.

#[test]
fn test_copy_within() {
// Start to end, with a RangeTo.
let mut bytes = *b"Hello, World!";
bytes.copy_within(..3, 10);
assert_eq!(&bytes, b"Hello, WorHel");
// End to start, with a RangeFrom.
let mut bytes = *b"Hello, World!";
bytes.copy_within(10.., 0);
assert_eq!(&bytes, b"ld!lo, World!");
// Overlapping, with a RangeInclusive.
let mut bytes = *b"Hello, World!";
bytes.copy_within(0..=11, 1);
assert_eq!(&bytes, b"HHello, World");
// Whole slice, with a RangeFull.
let mut bytes = *b"Hello, World!";
bytes.copy_within(.., 0);
assert_eq!(&bytes, b"Hello, World!");
}
#[test]
#[should_panic(expected = "src is out of bounds")]
fn test_copy_within_panics_src_too_long() {
let mut bytes = *b"Hello, World!";
// The length is only 13, so 14 is out of bounds.
bytes.copy_within(10..14, 0);
}
#[test]
#[should_panic(expected = "dest is out of bounds")]
fn test_copy_within_panics_dest_too_long() {
let mut bytes = *b"Hello, World!";
// The length is only 13, so a slice of length 4 starting at index 10 is out of bounds.
bytes.copy_within(0..4, 10);
}
#[test]
#[should_panic(expected = "src end is before src start")]
fn test_copy_within_panics_src_inverted() {
let mut bytes = *b"Hello, World!";
// 2 is greater than 1, so this range is invalid.
bytes.copy_within(2..1, 0);
}

There is a minor suggestion for a non-panicking version. However, all other slice methods like rotate_left and split_at_mut having panicking versions only, so this PR is not going to address that. If a non-panicking version is truly desired, we could introduce a try_copy_within method in the future.

@kennytm kennytm mentioned this pull request May 31, 2019
4 tasks
@SimonSapin
Copy link
Contributor

@rfcbot fcp merge

@rfcbot
Copy link

rfcbot commented May 31, 2019

Team member @SimonSapin has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels May 31, 2019
@SimonSapin
Copy link
Contributor

Thanks for the stabilization report @kennytm !

@Centril Centril added the relnotes Marks issues that should be documented in the release notes of the next release. label May 31, 2019
@Centril Centril added this to the 1.37 milestone May 31, 2019
@oconnor663
Copy link
Contributor

This shows an implicit consensus that the current function signature is good enough to stabilize as-is.

I'd caution that my crate has seen very little usage since it was published. I only use it in a couple places, and I only see one other calling crate on crates.io. So if someone did have other ideas about how the API should look, I wouldn't feel like we have a ton of usage evidence to argue either way. But on the other hand, it's a pretty simple API :)

I also notice that we might want to add tests for the overflow-panic cases that are possible with nonstandard ranges.

@rfcbot rfcbot added the final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. label Jun 1, 2019
@rfcbot
Copy link

rfcbot commented Jun 1, 2019

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot removed the proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. label Jun 1, 2019
/// let mut bytes = *b"Hello, World!";
///
/// bytes.copy_within(1..5, 8);
///
/// assert_eq!(&bytes, b"Hello, Wello!");
/// ```
#[unstable(feature = "copy_within", issue = "54236")]
#[stable(feature = "copy_within", since = "1.37.0")]
pub fn copy_within<R: ops::RangeBounds<usize>>(&mut self, src: R, dest: usize)
where
T: Copy,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

drive-by unrelated to stabilization: seems like we shouldn't be monomorphizing this whole function on R, given that we only need .start_bound() and .end_bound(), which we can call first and pass into a non-generic version.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn’t the method still be generic over T?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LLVM will deduplicate identical functions anyway. Hard to see this coming at a compilation time cost and if it ever does in practice, it would be great to fix it in general rather than implementing hacks for specific functions in libcore.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nagisa the point is that they wouldn't be identical functions because they'd have different types for R. So we'd generate separate copies for Range, RangeFrom, RangeTo, RangeInclusive, etc...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cramertj if the generated IR after inlining is identical, it will get merged. LLVM does not care about the type system.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My point is that I don't think the generated IR would be identical. RangeFrom<T> and Range<T> don't have the same representation, so converting them to start and end bounds would need to be split out into a separate function in order for the inner body of the functions to be unified.

@kennytm
Copy link
Member Author

kennytm commented Jun 2, 2019

Before the end of FCP I'd like to confirm one thing: would restricting the bound from RangeBounds<usize> to SliceIndex<[T], Output = [T]> be better?

I don't think we're interested in supporting code like

slice.copy_within((Bound::Excluded(1), Bound::Excluded(3)), 6);

The implementation could then also be shortened as:

pub fn copy_within<R: SliceIndex<[T], Output=[T]>>(&mut self, src: R, dest: usize) 
where
    T: Copy,
{
    let src = &self[src];
    let count = src.len();
    assert!(dest <= self.len() - count, "dest is out of bounds");
    unsafe {
        ptr::copy(
            src.as_ptr(),
            self.get_unchecked_mut(dest),
            count,
        );
    }
}

@SimonSapin
Copy link
Contributor

This shortened code looks like it might have UB in case the source and destination ranges have overlap, since &[T] and &mut [T] exist at the same time. I’m not sure.

Regardless, being used for arguments like this is the point of RangeBounds. Even if we’re not interested in some of what it supports, there’s no need to actively exclude it.

@kennytm
Copy link
Member Author

kennytm commented Jun 2, 2019

I've tried checking with Miri but no errors were reported about overlapping ranges (unlike &mut self[dest] which was rejected).

The code should be changed to self.as_mut_ptr().add(dest) anyway, since both the existing code and the replacement above failed this in Miri:

    let mut x = [1, 2, 3];
    x.copy_within(1..1, 3);

This ensures we won't accidentally read *src or *dest even when count = 0.
Co-Authored-By: Jack O'Connor <oconnor663@gmail.com>
@oconnor663
Copy link
Contributor

I agree with @SimonSapin that putting RangeBounds in the public API is nice, since that's the stable trait that 3rd party types are allowed to implement. But I wish there was a helper function somewhere that converted from a T: RangeBounds<usize> to a &[T], so that copy_within didn't need to do so many checks itself. Would it be worth factoring out such a function? Would it be worth exposing such a function from the standard library in a separate PR?

@oconnor663
Copy link
Contributor

This doesn't block stabilization, but something like this refactoring (example branch in the copy_in_place repo) could simplify things. In particular, it leaves us with just a single assert in front of the unsafe block. Here it is tweaked to be a method on slices:

pub fn copy_within<R: ops::RangeBounds<usize>>(&mut self, src: R, dest: usize)
where
    T: Copy,
{
    let (src_ptr, src_len) = {
        let src_slice = self.get_range(src); // <-- here's where a lot gets factored out
        (src_slice.as_ptr(), src_slice.len())
    };
    assert!(dest <= self.len() - src_len, "dest is out of bounds");
    unsafe {
        let dest_ptr = self.as_mut_ptr().add(dest);
        ptr::copy(src_ptr, dest_ptr, src_len);
    }
}

Here's a godbolt output suggesting that the optimizer is still happy.

@SimonSapin
Copy link
Contributor

I wish there was a helper function somewhere that converted from a T: RangeBounds<usize> to a &[T]

Is the get method of slices not that function?


Maybe forcing early conversion to raw pointers would could be enough to avoid UB?

let src: *const [T] = self.get(src);
let dest: *mut T = self.get_mut(dest);

@oconnor663
Copy link
Contributor

Is the get method of slices not that function?

I don't think there's a blanket impl of SliceIndex for R: RangeBounds<usize>, no. Unless I'm doing it wrong, I get "the type [T] cannot be indexed by R."

@Centril Centril removed the needs-fcp This change is insta-stable, so needs a completed FCP to proceed. label Jun 5, 2019
@rfcbot rfcbot added finished-final-comment-period The final comment period is finished for this PR / Issue. and removed final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. labels Jun 11, 2019
@rfcbot
Copy link

rfcbot commented Jun 11, 2019

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

The RFC will be merged soon.

@SimonSapin
Copy link
Contributor

I assume the nominated label was a request for FCP, which has happened.

@bors r+

@bors
Copy link
Contributor

bors commented Jun 12, 2019

📌 Commit 427f1a4 has been approved by SimonSapin

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-team Status: Awaiting decision from the relevant subteam (see the T-<team> label). labels Jun 12, 2019
Centril added a commit to Centril/rust that referenced this pull request Jun 12, 2019
Centril added a commit to Centril/rust that referenced this pull request Jun 12, 2019
bors added a commit that referenced this pull request Jun 13, 2019
Rollup of 9 pull requests

Successful merges:

 - #60376 (Stabilize Option::xor)
 - #61398 (Stabilize copy_within)
 - #61629 (Hygienize macros in the standard library)
 - #61675 (Include frame pointer for bare metal RISC-V targets)
 - #61750 (Fix x.py install)
 - #61761 (Add an alias for x86_64-sun-solaris target tuple)
 - #61762 (rustbuild: fix libtest_stamp)
 - #61763 (ci: fix ci stats upload condition)
 - #61776 (Fix typos in error_codes)

Failed merges:

r? @ghost
@bors bors merged commit 427f1a4 into rust-lang:master Jun 13, 2019
@kennytm kennytm deleted the stabilize-copy-within branch June 13, 2019 16:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this PR / Issue. relnotes Marks issues that should be documented in the release notes of the next release. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Tracking issue for copy_within
10 participants