reuse RHS allocation for vec.extend(vec.into_iter()) when they do not fit into the LHS #77496

Closed
wants to merge 12 commits
library/alloc/src/vec.rs: 75 additions & 14 deletions

@@ -2199,19 +2199,12 @@ impl<T> SpecFromIter<T, IntoIter<T>> for Vec<T> {
         // But it is a conservative choice.
         let has_advanced = iterator.buf.as_ptr() as *const _ != iterator.ptr;
         if !has_advanced || iterator.len() >= iterator.cap / 2 {
-            unsafe {
-                let it = ManuallyDrop::new(iterator);
-                if has_advanced {
-                    ptr::copy(it.ptr, it.buf.as_ptr(), it.len());
-                }
-                return Vec::from_raw_parts(it.buf.as_ptr(), it.len(), it.cap);
-            }
+            // Safety: passing 0 is always valid
+            return unsafe { iterator.into_vec_with_uninit_prefix(0) };
Contributor: I think it might be clearer to add another helper method that just converts the iterator into a Vec without talking about an uninitialized prefix here.

Contributor: Looking at the implementation of into_vec_with_uninit_prefix I think it would be better if we disallowed a prefix of 0 and added a separate into_vec method.

Member Author: As commented below, shifting is still needed even if the prefix is 0, so the methods would be identical except for their safety and one less addition.

Contributor: I personally think that's enough reason to make the change.

         }

         let mut vec = Vec::new();
-        // must delegate to spec_extend() since extend() itself delegates
-        // to spec_from for empty Vecs
-        vec.spec_extend(iterator);
+        iterator.move_into(&mut vec);
         vec
     }
 }
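For context, the effect of this fast path as seen from calling code. A minimal sketch; the pointer assertion reflects what the specialization does internally, not a documented guarantee:

    fn main() {
        let v: Vec<u32> = (0..100).collect();
        let buf = v.as_ptr();
        let mut it = v.into_iter();
        it.next(); // advanced, but len() (99) is still >= cap / 2 (50)
        let v2: Vec<u32> = it.collect(); // elements shift down, allocation is reused
        assert_eq!(v2, (1..100).collect::<Vec<u32>>());
        assert_eq!(v2.as_ptr(), buf); // same buffer, no reallocation
    }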
@@ -2391,11 +2384,48 @@ where
}

 impl<T> SpecExtend<T, IntoIter<T>> for Vec<T> {
-    fn spec_extend(&mut self, mut iterator: IntoIter<T>) {
-        unsafe {
-            self.append_elements(iterator.as_slice() as _);
-        }
+    fn spec_extend(&mut self, iterator: IntoIter<T>) {
+        // Avoid reallocation if we can use iterator's storage instead. This requires 1 memcpy and 0-1 memmove
+        // while reallocation would require 1 alloc, 1-2 memcpy, 1-2 free.
+        //
+        // ## non-empty self, partially consumed iterator
+        //
+        // == step ==    == memory ==                      == self ==       == iter / v ==
+        //               0123456789abcdef0123456789abcdef
+        //               0---------------1---------------
+        //
+        // [initial]     AAAA_-----__BBB___--------------   Vec(0x00, 4, 5)  IntoIter(0x0a, 0x0c, 0x0f, 8)
+        // into_vec      AAAA_-----____BBB_--------------   Vec(0x00, 4, 5)  Vec(0x0a, 7, 8)
+        // prepend       _____-----AAAABBB_--------------   Vec(0x00, 0, 5)  Vec(0x0a, 7, 8)
+        // *self = v     ----------AAAABBB_--------------   Vec(0x0a, 7, 8)
+        //
+        // ## empty self, partially consumed iterator
+        //
+        // [initial]     ____------__BBBB__--------------   Vec(0x00, 0, 4)  IntoIter(0x0a, 0x0c, 0x10, 8)
+        // into_vec      ____------BBBB____--------------   Vec(0x00, 0, 4)  Vec(0x0a, 4, 8)
+        // *self = v     ----------BBBB____--------------   Vec(0x0a, 4, 8)
+        //
+        // ## empty self, pristine iterator
+        //
+        // [initial]     ----------BBBB____--------------   Vec(0x00, 0, 0)  IntoIter(0x0a, 0x0a, 0x0e, 8)
+        // *self = v     ----------BBBB____--------------   Vec(0x0a, 4, 8)
+        //
Contributor: Suggested change (drop the trailing comment line):
-        //

Member Author: That line was intentionally left blank.

Contributor: I think following the rest of the codebase convention here would have this blank line not carry the // comment marker, and just be empty.

+        if mem::size_of::<T>() > 0
+            && self.capacity() - self.len() < iterator.len()
+            && iterator.cap - iterator.len() >= self.len()
Comment on lines +2428 to +2429

Contributor (@pickfire, Oct 5, 2020): Can this be simplified as?

Suggested change:
-            && self.capacity() - self.len() < iterator.len()
-            && iterator.cap - iterator.len() >= self.len()
+            && iterator.len() - self.len() < iterator.cap - self.capacity()

I wonder if this will always grow the vec if the iterator is larger.

Member Author: The proposed change would underflow if self is larger than iterator.

Contributor: Underflow?

Contributor: This is technically still an overflow, but that iterator.len() - self.len() would panic or wrap if, say, self.capacity() == 20_000, self.len() == 19_950, and iterator.len() == 100.

Member Author: I meant the case where self.capacity() > iterator.cap. The subtraction would underflow the usize result and thus lead to the inequality unexpectedly evaluating to true, which would then violate the safety constraints of into_vec_with_uninit_prefix.

Contributor: Then how about?

Suggested change:
-            && self.capacity() - self.len() < iterator.len()
-            && iterator.cap - iterator.len() >= self.len()
+            && iterator.len().saturating_sub(self.len()) < iterator.cap.saturating_sub(self.capacity())

Member Author: For a self with len == 2 && cap == 2 and an iterator with len == 2 && cap == 3, that would evaluate to true and attempt to store 4 elements into an allocation of 3. 💣💥

+        {
+            // Safety: we just checked that IntoIter has sufficient capacity to prepend our elements.
+            // Prepending will then fill the uninitialized prefix.
+            let v = unsafe {
+                let mut v = iterator.into_vec_with_uninit_prefix(self.len() as isize);
+                ptr::copy_nonoverlapping(self.as_ptr(), v.as_mut_ptr(), self.len);
+                self.set_len(0);
+                v
+            };
+            *self = v;
+            return;
Contributor: Can we turn this into an else block instead of early returning?

Member Author: We can, but I prefer early returns, since the later part indicates the default approach in contrast to the special case above.

Contributor: I mentioned the same thing above: #77496 (comment)

Contributor: I find splitting the branches here a bit easier to follow, so you can tell that there are two paths we can take and no way to fall through this one.

+        }
-        iterator.ptr = iterator.end;
+        iterator.move_into(self);
Contributor: It would be cooler if the code were linked to the cool diagram.

Suggested change:
-        iterator.move_into(self);
+        // Insufficient capacity
+        iterator.move_into(self);

Member Author: Well, yes and no.

With the optimization present that is indeed all it covers. But it is the general codepath that also works without the optimization. So I don't want to give the impression that it can only handle that case.

     }
 }
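A minimal sketch of the case this branch targets, assuming the optimization is in place (the final pointer check observes internal behavior, not an API guarantee):

    fn main() {
        let mut lhs: Vec<u8> = (0..4).collect(); // len 4, no spare capacity
        let mut rhs: Vec<u8> = Vec::with_capacity(16);
        rhs.extend(4..12); // len 8, cap 16: enough spare room in front for lhs
        let rhs_ptr = rhs.as_ptr();
        lhs.extend(rhs.into_iter()); // takes the reuse branch above
        assert_eq!(lhs, (0..12).collect::<Vec<u8>>());
        assert_eq!(lhs.as_ptr(), rhs_ptr); // lhs adopted rhs's allocation
    }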

@@ -2928,6 +2958,37 @@ impl<T> IntoIter<T> {
         self.ptr = self.buf.as_ptr();
         self.end = self.buf.as_ptr();
     }

+    /// Shifts the remaining elements to `offset` and then converts the whole allocation into a Vec
+    /// with `vec.len() == offset + self.len()`
+    ///
+    /// # Safety
+    ///
+    /// When a non-zero offset is passed the resulting Vec will have an uninitialized prefix
+    /// that needs to be filled before the Vec is valid again.
+    ///
+    /// * `offset + self.len()` must not exceed `self.cap`
+    /// * `offset == 0` is always valid
+    /// * `offset` must be positive
+    unsafe fn into_vec_with_uninit_prefix(self, offset: isize) -> Vec<T> {
+        debug_assert!(offset > 0);
+        debug_assert!(offset as usize + self.len() <= self.cap);
+        let dst = unsafe { self.buf.as_ptr().offset(offset) };
Contributor: We could probably use add instead of offset for usize, but I don't know if the compiler will optimize out the offset if it is 0? Does it, or should we have two functions, one with offset and one without?

Member Author: 0 is a constant in the other callsite, so it should be easy for LLVM to optimize.

Contributor: cc @lzutao to see if he's interested in checking this out

+        if self.ptr != dst as *const _ {
+            unsafe { ptr::copy(self.ptr, dst, self.len()) }
+        }
+
+        let iter = ManuallyDrop::new(self);
+        unsafe { Vec::from_raw_parts(iter.buf.as_ptr(), offset as usize + iter.len(), iter.cap) }
+    }

+    /// Move remaining elements to the end of `dest`.
+    fn move_into(mut self, dest: &mut Vec<T>) {
+        unsafe {
+            dest.append_elements(self.as_slice() as _);
+        }
+        self.ptr = self.end;
+    }
}
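For intuition, the same end state is reachable with safe, public APIs at the cost of an extra pass over the elements; prepend_demo below is a hypothetical illustration, not part of this patch:

    // Append lhs behind the iterator's elements, rotate lhs to the front,
    // then let lhs adopt rhs's allocation. The unsafe helper above avoids
    // the rotate by shifting B exactly once and memcpy-ing A into the
    // uninitialized prefix.
    fn prepend_demo<T>(lhs: &mut Vec<T>, mut rhs: Vec<T>) {
        let lhs_len = lhs.len();
        rhs.append(lhs);           // rhs: [B..., A...]; lhs is now empty
        rhs.rotate_right(lhs_len); // rhs: [A..., B...]
        *lhs = rhs;                // previous lhs allocation is dropped
    }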

#[stable(feature = "vec_intoiter_as_ref", since = "1.46.0")]