Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

reuse RHS allocation for vec.extend(vec.into_iter()) when they do not fit into the LHS #77496

Closed
wants to merge 12 commits into from
63 changes: 49 additions & 14 deletions library/alloc/src/vec.rs
Original file line number Diff line number Diff line change
Expand Up @@ -2199,19 +2199,12 @@ impl<T> SpecFromIter<T, IntoIter<T>> for Vec<T> {
// But it is a conservative choice.
let has_advanced = iterator.buf.as_ptr() as *const _ != iterator.ptr;
if !has_advanced || iterator.len() >= iterator.cap / 2 {
unsafe {
let it = ManuallyDrop::new(iterator);
if has_advanced {
ptr::copy(it.ptr, it.buf.as_ptr(), it.len());
}
return Vec::from_raw_parts(it.buf.as_ptr(), it.len(), it.cap);
}
// Safety: passing 0 is always valid
return unsafe { iterator.into_vec(0) };
}

let mut vec = Vec::new();
// must delegate to spec_extend() since extend() itself delegates
// to spec_from for empty Vecs
vec.spec_extend(iterator);
iterator.move_to(&mut vec);
vec
}
}
Expand Down Expand Up @@ -2391,11 +2384,24 @@ where
}

impl<T> SpecExtend<T, IntoIter<T>> for Vec<T> {
fn spec_extend(&mut self, mut iterator: IntoIter<T>) {
unsafe {
self.append_elements(iterator.as_slice() as _);
fn spec_extend(&mut self, iterator: IntoIter<T>) {
// Avoid reallocation if we can use iterator's storage instead. This requires 1 memcpy and 0-1 memmove
the8472 marked this conversation as resolved.
Show resolved Hide resolved
// while reallocation would require 1 alloc, 1-2 memcpy, 1-2 free
if mem::size_of::<T>() > 0
&& self.capacity() - self.len() < iterator.len()
&& iterator.cap - iterator.len() >= self.len()
Comment on lines +2428 to +2429
Copy link
Contributor

@pickfire pickfire Oct 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be simplified as?

Suggested change
&& self.capacity() - self.len() < iterator.len()
&& iterator.cap - iterator.len() >= self.len()
&& iterator.len() - self.len() < iterator.cap - self.capacity()

I wonder if this will always grow the vec if the iterator is larger.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The proposed change would underflow if self is larger than iterator

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Underflow?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is technically still an overflow, but that iterator.len() - self.len() would panic or wrap if, say, self.capacity() == 20_000 and self.len() == 19_950, and iterator.len() == 100.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant the case where self.capacity() > iterator.cap . The subtraction would underflow the usize result and thus lead to the inequality unexpectedly evaluating to true which would then violate the safety constraints of into_vec_with_uninit_prefix

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then how about?

Suggested change
&& self.capacity() - self.len() < iterator.len()
&& iterator.cap - iterator.len() >= self.len()
&& iterator.len().saturating_sub(self.len()) < iterator.cap.saturating_sub(self.capacity())

I

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a self with len == 2 && cap == 2 and an iterator len == 2 && cap == 3 that would evaluate to true and attempt to store 4 elements into an allocation of 3. 💣💥

{
// Safety: we just checked that IntoIter has sufficient capacity to prepend our elements.
// Prepending will then fill the uninitialized prefix.
*self = unsafe {
let mut v = iterator.into_vec(self.len() as isize);
the8472 marked this conversation as resolved.
Show resolved Hide resolved
ptr::copy_nonoverlapping(self.as_ptr(), v.as_mut_ptr(), self.len);
self.set_len(0);
the8472 marked this conversation as resolved.
Show resolved Hide resolved
v
};
return;
the8472 marked this conversation as resolved.
Show resolved Hide resolved
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we turn this into an else block instead of early returning?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can, but I prefer early returns since the later part indicates the default approach in contrast to the special case above.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mentioned the same thing above #77496 (comment)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find splitting the branches here a bit easier to follow so you can tell that there's two paths we can take and don't have any way to fall through this one.

}
iterator.ptr = iterator.end;
iterator.move_to(self);
}
}

Expand Down Expand Up @@ -2928,6 +2934,35 @@ impl<T> IntoIter<T> {
self.ptr = self.buf.as_ptr();
self.end = self.buf.as_ptr();
}

/// Shifts the remaining elements to `offset` and then converts the whole allocation into a Vec
/// with `vec.len() == offset + self.len()`
///
/// # Safety
///
/// When a non-zero offset is passed the resulting Vec will have an uninitialized prefix
/// that needs to be filled before the Vec is valid again.
///
/// * `offset + self.len()` must not exceed `self.cap`
/// * `offset == 0` is always valid
/// * `offset` must be positive
the8472 marked this conversation as resolved.
Show resolved Hide resolved
unsafe fn into_vec(self, offset: isize) -> Vec<T> {
the8472 marked this conversation as resolved.
Show resolved Hide resolved
let dst = unsafe { self.buf.as_ptr().offset(offset) };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could probably use add instead of offset for usize, but I don't know if the compiler will optimize out the offset if it is 0? Does it or should we have two functions, one with offset and one without?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0 is a constant in the other callsite, so it should be easy to optimize for llvm.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @lzutao to see if he's interested to check this out

if self.ptr != dst as *const _ {
the8472 marked this conversation as resolved.
Show resolved Hide resolved
unsafe { ptr::copy(self.ptr, dst, self.len()) }
}

let iter = ManuallyDrop::new(self);
unsafe { Vec::from_raw_parts(iter.buf.as_ptr(), offset as usize + iter.len(), iter.cap) }
}

/// Move remaining elements to the end of `dest`.
fn move_to(mut self, dest: &mut Vec<T>) {
the8472 marked this conversation as resolved.
Show resolved Hide resolved
the8472 marked this conversation as resolved.
Show resolved Hide resolved
unsafe {
dest.append_elements(self.as_slice() as _);
}
self.ptr = self.end;
}
}

#[stable(feature = "vec_intoiter_as_ref", since = "1.46.0")]
Expand Down