Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Copy clone semantics #1521

Merged
merged 6 commits into from May 4, 2016

Conversation

Projects
None yet
@ubsan
Copy link
Contributor

ubsan commented Mar 1, 2016

Specifying that <T as Clone>::clone(&t) where T: Copy should be equivalent to ptr::read(&t).

@durka

This comment has been minimized.

Copy link

durka commented on text/0000-copy-clone-semantics.md in d69cd0a Mar 1, 2016

Specialization would allow [...] in the case of T: Copy

@durka

This comment has been minimized.

Copy link

durka commented on text/0000-copy-clone-semantics.md in d69cd0a Mar 1, 2016

Shouldn't we also have a note like "Manual implementors of Clone must be careful to uphold this."?

@Aatch

This comment has been minimized.

Copy link

Aatch commented on text/0000-copy-clone-semantics.md in d69cd0a Mar 1, 2016

I think this section should probably also include a "Explicitly define the behaviour/semantics in this case" aspect. Rather than just "add X to the documentation". The documentation reflects the specification, but it probably shouldn't be the specification.

@ubsan

This comment has been minimized.

Copy link
Contributor Author

ubsan commented Mar 1, 2016

Note that these comments were originally line comments, but that fact seems to have been lost by github.

@briansmith

This comment has been minimized.

Copy link

briansmith commented Mar 1, 2016

I originally thought this was a good idea, and even suggested something very much like this on the internals mailing list, but I've changed my mind. Every type that implements Copy and Clone should derive Clone and the derived implementation of Clone should be implemented in terms of Copy. Further, the optimizer should be able to notice that a hierarchy of copies is equivalent to a single flattened copy and optimize it to a memcpy. If the optimizer doesn't do that, then it is going to generate terrible code for lots of common cases that aren't calls to Clone::clone such as record updates that modify a single field.

@ubsan

This comment has been minimized.

Copy link
Contributor Author

ubsan commented Mar 1, 2016

@briansmith The issue isn't derived Clone implementations, it's that we allow you to not derive Clone. If your T: Copy, and you manually implement Clone, this would make it invalid to not make Clone::clone(&t) and ptr::read(&t) equivalent.

The optimizer is also not turned on in debug mode. Allowing us to turn Vec::clone into a memcpy, even in debug mode, would make for much faster code.

Edit: not invalid like undefined behavior or anything, just to be clear.

@ubsan

This comment has been minimized.

Copy link
Contributor Author

ubsan commented Mar 1, 2016

Note, we've already made breaking changes based on this assumption.

The change in #[derive(Copy, Clone)] to fn clone(&self) -> Self { *self } would break any code that relies on the fact that each of the inner Clone::clones were called.

@briansmith

This comment has been minimized.

Copy link

briansmith commented Mar 1, 2016

it's that we allow you to not derive Clone

IMO, it is OK to say that if you want Clone to be fast for your Copy type, you need to derive Clone.

The change in #[derive(Copy, Clone)] to fn clone(&self) -> Self { *self } would break any code that relies on the fact that each of the inner Clone::clones were called.

Yes, but IMO that was also a short-term mistake. Once the optimizer is doing the right thing, that shortcut won't be necessary and we can return to the simpler semantics.

In particular, the optimizer can already notice that a loop is equivalent to memcpy. It just needs to get smarter to recognize non-loop equivalents, including recursive invocations of cone/copy, are equivalent to memcpy are equivalent to memcpy too.

Consider let x = y { u32_field: new_value }. How would your proposed optimization improve this situation? It seems it wouldn't, but this kind of pattern is also important to optimize to memcpy + 32-bit memory write.

@ubsan

This comment has been minimized.

Copy link
Contributor Author

ubsan commented Mar 1, 2016

@briansmith This isn't an optimization. This is specification. It's saying that the standard library (and anyone else) may rely on the fact that a properly implemented Clone::clone for T: Copy == ptr::read. This allows for optimization through specialization, for example, the #[derive(Copy, Clone)] thing, and many other optimizations based around memcpy.

@briansmith

This comment has been minimized.

Copy link

briansmith commented Mar 1, 2016

Yes, but I'm saying that if the optimizer is working well, then you don't need to specify anything special here. It would be useful to see an example where this rule is needed that can't be done without the rule by simply improving the optimizer.

@Amanieu

This comment has been minimized.

Copy link
Contributor

Amanieu commented Mar 1, 2016

The RFC should specify what the consequences are for types that violate this rule. In particular this must not result in memory unsafety.

@ubsan

This comment has been minimized.

Copy link
Contributor Author

ubsan commented Mar 1, 2016

@briansmith The issue isn't where the optimizer is working. It's where it isn't. Optimization is not always on, like in debug mode. Also, building a language for a sufficiently smart compiler doesn't seem like the correct way of doing things. Often we will have to specifically specialize.

@ubsan

This comment has been minimized.

Copy link
Contributor Author

ubsan commented Mar 1, 2016

@Amanieu yes, that is the intention, I will write it down.

@briansmith

This comment has been minimized.

Copy link

briansmith commented Mar 1, 2016

Why not specify that even in debug mode, the copy coalescing optimizations are done? Then there is no semantic change to the language needed and debug build results are faster.

@ubsan

This comment has been minimized.

Copy link
Contributor Author

ubsan commented Mar 1, 2016

@briansmith The semantics would still change, and then you would have different semantics between debug and release.

@briansmith

This comment has been minimized.

Copy link

briansmith commented Mar 1, 2016

To be clear, there's nothing that would change semantically in my definition. The only thing that would change is that debug builds would be specified to have an optimization pass enabled, for at least Clone::clone that they currently don't have enabled, and that optimization pass would be made smarter. The language definition would stay exactly the same.

@ubsan

This comment has been minimized.

Copy link
Contributor Author

ubsan commented Mar 1, 2016

@briansmith Ah. You can't really do that. The necessary optimizations to realize that:

assert!(src.len() == dst.len());
let len = dst.len();
let src = &src[..len];
for i in 0..len {
    self[i].clone_from(&src[i]);
}

is equivalent to

assert!(src.len() == dst.len());
memcpy(src.as_ptr(), dst.as_mut_ptr(), dst.len());

is... most of them.

# Unresolved questions
[unresolved]: #unresolved-questions

What the exact wording should be.

This comment has been minimized.

@petrochenkov

petrochenkov Mar 1, 2016

Contributor

Alternative wording: "When Copy type has user-defined implementation of Clone it's unspecified whether clone or memcpy is called when cloning a value of this type." or (shamelessly stolen from somewhere) "<...if Copy...> An implementation is allowed to omit the call to clone even if is has side effects. Such elision of calls to clone is called clone elision."

This comment has been minimized.

@ubsan

ubsan Mar 1, 2016

Author Contributor

@petrochenkov it's not that it's unspecified whether clone or ptr::read is called when you actually call .clone(). It's just that libraries (including the standard library) are able to assume that Clone::clone == ptr::read for T: Copy; and, if that requirement is not met, libraries are allowed to exhibit unexpected (but defined) behavior. (Just like if an Ord type doesn't actually have an absolute ordering, or if an Eq type doesn't actually have transitive equality).

This comment has been minimized.

@petrochenkov

petrochenkov Mar 1, 2016

Contributor

it's not that it's unspecified whether clone or ptr::read is called when you actually call .clone()

Why not? If we go this route anyway, let's give the compiler/libraries all the freedom.
I'd even write "user-defined implementations of Clone for Copy types are ignored", but I'm not sure about generics:

fn f<T: Clone>(arg: T) -> T { arg.clone() } // It's not known if `T` is `Copy` or not and I'm not sure it's convenient to turn `clone`s into `memcpy`s during monomorphisation.

This comment has been minimized.

@petrochenkov

petrochenkov Mar 1, 2016

Contributor

It's just that libraries (including the standard library) are able to assume that Clone::clone == ptr::read for T: Copy

derive is not a library, so the language already can omit clones as well.

This comment has been minimized.

@petrochenkov

petrochenkov Mar 1, 2016

Contributor

Also, "clone is always ignored" is better than "clone is sometimes ignored" from pedagogical point of view.

This comment has been minimized.

@ubsan

ubsan Mar 1, 2016

Author Contributor

hmm...

At first I thought this was a bad idea, but then I started thinking. Not "always ignored", but if your clone has different semantics from a ptr::read then it should be implementation defined whether it is called.

This comment has been minimized.

@arielb1

arielb1 Mar 1, 2016

Contributor

I really don't like the "optimizer magic overriding of clone".

This comment has been minimized.

@ubsan

ubsan Mar 2, 2016

Author Contributor

@arielb1 Yeah, I guess. I do think it would be better if it's just kept to libraries and language constructs like #[derive]

@glaebhoerl

This comment has been minimized.

Copy link
Contributor

glaebhoerl commented Mar 1, 2016

Is there any reason to write a manual Clone impl for a type that's Copy in the first place, instead of deriving it? If we want to depend on the two being equivalent at the library level (as proposed), we should lint against it. (If we were to want to depend on it at the language level (as not proposed), then we should just make it an error outright.)

@comex

This comment has been minimized.

Copy link

comex commented Mar 1, 2016

@glaebhoerl Unfortunately, derived impls on generic structs sometimes have the wrong trait bounds, so you have to implement them manually.

@oli-obk

This comment has been minimized.

Copy link
Contributor

oli-obk commented Mar 1, 2016

@comex: isn't that "just a bug"?

@ubsan

This comment has been minimized.

Copy link
Contributor Author

ubsan commented Mar 1, 2016

Also, it's possible to implement manually, so it's possible to implement manually wrong. This RFC isn't about all those people who #[derive(Copy, Clone)]. It's about anyone who manually implements Clone, for whatever reason (including the reasons @comex stated, or just reasons of style, or to rely on Copy for older compilers, or even really silly reasons).

@retep998

This comment has been minimized.

Copy link
Member

retep998 commented Mar 1, 2016

@ubsan I manually implement Clone in winapi to force the implementation to rely on Copy and thereby improve compile times.

@ubsan

This comment has been minimized.

Copy link
Contributor Author

ubsan commented Mar 1, 2016

Some people do have to implement manually, like @retep998.

@arielb1

This comment has been minimized.

Copy link
Contributor

arielb1 commented Mar 1, 2016

I am not sure how much teeth this RFC has.

We already have both Clone::clone and Clone::clone_from, which may be implemented separately (and allocate differently, even). I don't think that we ever intended to specify which of them is called by standard library functions.

@ubsan

This comment has been minimized.

Copy link
Contributor Author

ubsan commented Mar 1, 2016

@arielb1 Yes, because it's specified that Clone::clone_from(x, y) shall be equivalent to *x = Clone::clone(y). We have to specify how cloning Copy types should work, or else we must do exactly what we say we are doing (i.e., cloning each individual value).

@aturon aturon self-assigned this Mar 2, 2016

and may not always be optimized (although it often will be). Specialization
would allow us to simply `memcpy` the values from the old `Vec` to the new
`Vec` in the case of `T: Copy`. However, if we don't specify this, we will not
be able to, and we will be stuck looping over every value.

This comment has been minimized.

@bluss

bluss Mar 3, 2016

We will be stuck relying on the optimizer to figure it out. Which it can do and insert memcpy, typically for Copy values of 8 bytes or smaller per element, it looks like.

This comment has been minimized.

@ubsan

ubsan Mar 3, 2016

Author Contributor

@bluss exactly.

@bluss

This comment has been minimized.

Copy link

bluss commented Mar 3, 2016

What the memcpy equivalent exactly is, is not specified. A derived Clone never copies struct padding, while Copy and memcpy do, for example.

I think I agree with the intention of this RFC build I would like it to simply say two things:

  • If a type implements Copy, it is always ok to replace general Clones with copies.
  • Code that relies on its specific behavior of Clone while implementing Copy is buggy; the trait implementations are simply wrong, just like if you implemented Ord incorrectly.
@ubsan

This comment has been minimized.

Copy link
Contributor Author

ubsan commented Mar 3, 2016

@bluss ptr::read is specified in the document. It's not defined whether memcpy copies padding bytes.

@Kimundi

This comment has been minimized.

Copy link
Member

Kimundi commented Mar 3, 2016

@ubsan My point was not that there should be Clone specific compiler optimizations, but that there could be.

As in, if Clone is specified to not do more than Copy, then it would not matter if the compiler did this since there would be no observable behavior difference.

@arielb1

This comment has been minimized.

Copy link
Contributor

arielb1 commented Mar 3, 2016

@Kimundi

I see this as pretty similar to the situation with PartialEq and co. The compiler will not assume that your implementation of PartialEq is symmetric in any way (calls to a == b will not become calls to PartialEq::eq(&b, &a)). However, library functions might get confused.

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Apr 15, 2016

🔔 This RFC is now entering its week-long final comment period 🔔

@e-oz

This comment has been minimized.

Copy link

e-oz commented Apr 17, 2016

Hate breaking changes. Have you checked how many cargo crates will be affected? If yes - please share the numbers. From discussion it looks like breaking change just to make debug mode faster, but I think it can't be true.

@retep998

This comment has been minimized.

Copy link
Member

retep998 commented Apr 17, 2016

@e-oz This won't cause anything to stop compiling. This merely makes an assumption that the standard library and compiler can optimize based on. If you can find any crates in the wild that would be affected negatively by this change, please do bring them up as there isn't really any way to find out in an automated fashion.

@ticki

This comment has been minimized.

Copy link
Contributor

ticki commented Apr 17, 2016

I think it is very unlikely that such a change will break anything at all.

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Apr 17, 2016

(I also don't consider this a change, merely formally stating an existing assumption.)

Pre-emptive update: I don't really want to get into dickering over what is or is not a breaking change. What I meant here was that there is surely existing code that would exhibit bugs if clone and copy are not aligned (think generics), and that we've used the assumption that copy and clone align in their semantics elsewhere, but failed to document that assumption.

@e-oz

This comment has been minimized.

Copy link

e-oz commented Apr 17, 2016

Committed code contains words "this is a breaking change", that's why I asked. Sorry for being suspicious:)

@aturon

This comment has been minimized.

Copy link
Member

aturon commented Apr 19, 2016

I'm in favor of documenting these expectations for the semantics of Clone -- clearly, any other behavior for Copy data would be incredibly surprising and likely to lead to bugs.

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Apr 22, 2016

Huzzah! The @rust-lang/lang team has decided to accept this RFC. However, since the libs team also claims jurisdiction, I'll let them merge it. :)

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented May 4, 2016

The libs team discussed this RFC at triage today and the decision was to merge. It was brought up that the user-facing documentation probably wants to indicate *self instead of ptr::read as it's a bit more understandable, but that doesn't need to change the body of this RFC.

Thanks again for the RFC @ubsan!

Tracking issue: rust-lang/rust#33416

@bluss

This comment has been minimized.

Copy link

bluss commented Jan 5, 2017

Vec::clone specialization for T: Copy is realized since rust-lang/rust/pull/38182 (Through extend_from_slice).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.