Join GitHub today
GitHub is home to over 31 million developers working together to host and review code, manage projects, and build software together.
Sign upAutoreferencing `Copy` Types #2111
Conversation
cramertj
added some commits
Aug 7, 2017
cramertj
added
T-lang
Ergonomics Initiative
labels
Aug 16, 2017
oli-obk
reviewed
Aug 16, 2017
| fn main() { | ||
| let mut x = 5; | ||
| let y = u8Ref(x); |
This comment has been minimized.
This comment has been minimized.
oli-obk
Aug 16, 2017
•
Contributor
I find this rather unexpected. I thought this RFC meant that Copy types are copied at the site and create a reference to the copy. That is how such a function call looks to me, I'm moving the value into the function, but since it's Copy, it will be copied and I retain the original.
This comment has been minimized.
This comment has been minimized.
repax
Aug 16, 2017
•
As the owner of x, you would have to trust u8Ref(x) not to mutate x.
(sorry for the misunderstanding)
That x is borrowed is indeed surprising.
This comment has been minimized.
This comment has been minimized.
cramertj
Aug 16, 2017
Author
Member
@repax: this RFC only allows coercion from T -> &T, not &mut T, so u8Ref can't mutate x.
This comment has been minimized.
This comment has been minimized.
repax
Aug 16, 2017
If the lifetime of the borrow was limited to a function call I think I would be less confused:
let mut x = 5;
let y = foo(x);
// ^ autoreference of `x` occurs here
x = 7; // `y` is unrelated to `x` so this is ok
This comment has been minimized.
This comment has been minimized.
est31
Aug 16, 2017
Contributor
@cramertj is there a guarantee that we'll never get a Copy-able UnsafeCell like wrapper?
This comment has been minimized.
This comment has been minimized.
Ixrec
Aug 16, 2017
Contributor
rust-lang/rust#25053 was active not that long ago, seems there's at least some support for making UnsafeCell implement Copy
This comment has been minimized.
This comment has been minimized.
nikomatsakis
Aug 17, 2017
Contributor
I think that the copy-ability of UnsafeCell is a somewhat distinct question. The last time we discussed it in depth, the conclusion was that we ought to address with a lint. i.e., we should have the ability to make things copy, but warn if they are actually copied. =) I still think this makes sense, and it would apply here.
This comment has been minimized.
This comment has been minimized.
nikomatsakis
Aug 17, 2017
Contributor
I am torn about whether to treat this auto-ref as borrowing the actual variable. I too find it somewhat surprising; I think I might prefer that we always create a temporary (with the standard lifetime). It would certainly suffice for the cases that are motivating the RFC; it seems like those cases where it would not suffice are also kind of confusing and unexpected to me (i.e., a reference that lives longer than the current statement).
It also seems more compatible with possibly extending this "auto-ref" to types beyond Copy types. At least the way we had previously talked about that, we wanted foo(x) to be loosely equivalent to { let y = x; foo(&y) }, but with better error messages indicating that one can add an &x if one must retain ownership. I am still interested in thinking that over, but it seems incompatible with the approach of this RFC.
This comment has been minimized.
This comment has been minimized.
Ixrec
Aug 17, 2017
Contributor
I'm currently also leaning towards making a copy every time instead of borrowing. That seems like:
- the only way to prevent the combination of Copy+UnsafeCell and this RFC from allowing "implicit mutability", which seems like a terrible footgun, and
- the only way to prevent the "found T, expected &T" errors we get today from turning into borrow checker errors (which are typically much harder to understand and fix, no matter how good the error message)
If that extra copy is somehow undesirable, iiuc you can always add the explicit & even after this RFC.
This comment has been minimized.
This comment has been minimized.
joshtriplett
Aug 23, 2017
Member
Can we make an implicit copy, but promise that the optimizer will almost always eliminate that copy in practice, except in cases like UnsafeCell?
petrochenkov
reviewed
Aug 16, 2017
| `Copy` types will autoreference: at | ||
| [coercion sites](https://github.com/rust-lang/rfcs/blob/master/text/0401-coercions.md#coercions), | ||
| when a reference is expected, but an owned | ||
| `Copy` type is found, an `&` will be automatically inserted. |
This comment has been minimized.
This comment has been minimized.
petrochenkov
Aug 16, 2017
Contributor
So, this will work?
trait Trait {}
impl Trait for u8 {}
let x: &Trait = 0u8;
// A reference to `Trait` is expected and `u8: Copy` is provided (but `u8 != Trait`) => a `&` is still inserted
// let x: &Trait = &0u8;
// Now coercion `&u8 -> &Trait` can apply.
// Nice.
This comment has been minimized.
This comment has been minimized.
cramertj
Aug 16, 2017
Author
Member
Similar to my answer below, I would expect this to work since coercions are supposed to apply transitively.
This comment has been minimized.
This comment has been minimized.
|
The question about multiple references again. Does this work? fn f(arg: &&&u8) {}
f(0u8); // Desugared into `f(&&&u8)`
// Note that reference types themselves `&T` are `Copy`.This needs to be clarified in the RFC text. |
This comment has been minimized.
This comment has been minimized.
|
This comment has been minimized.
This comment has been minimized.
Yes, but not all currently-existing coercions are applied transitively. The original coercions RFC says that they should be, though, so that is the behavior I would expect. |
This comment has been minimized.
This comment has been minimized.
|
Any use cases for wanting to pass a Copy value by reference instead of by value? The example given is not really convincing; an u8 is one byte and gets most likely passed via one register anyway. Its also inconsistent with non Copy types which really should not get autoreferencing. |
This comment has been minimized.
This comment has been minimized.
|
@est31 The concern here isn't the performance implications-- the concern is making it easier and more ergonomic to work with Although it's more of a long-term goal, and I've brought it up in the unresolved questions, I think that it would be good to provide some more explanation of this process in the RFC itself. I'll amend the RFC soon to add more details. |
This comment has been minimized.
This comment has been minimized.
I know that the RFC intends to make passing Copy types per reference easier. My question was: Why would you want to pass a Copy type per reference instead of per value?
I'm not sure whether its good to allow comparing &T with T. In C this means something much different. |
This comment has been minimized.
This comment has been minimized.
Usually, you wouldn't. However, cases often show up in generic or autogenerated code where a function or type expects
We already implement operators for |
cramertj
added some commits
Aug 17, 2017
This comment has been minimized.
This comment has been minimized.
|
Is it possible to design this feature in a general manner, not limited in "auto-reference"? I would expect a trait in std library which exposes this feature to be user-extendable. Such as:
I wouldn't mind if it is a lang item. Whenever we need an "implicit type conversion", compiler will call this method. For Maybe it is dangerous to allow users to define implicit type conversions. |
nikomatsakis
reviewed
Aug 17, 2017
| `Copy` type is found, an `&` will be automatically inserted. | ||
|
|
||
| If the `Copy` value is a temporary, its lifetime will be promoted to meet the | ||
| required lifetime of the reference (where possible). |
This comment has been minimized.
This comment has been minimized.
nikomatsakis
Aug 17, 2017
Contributor
I find this sentence at odds with the behavior specific in RFC 66 etc. Specifically, the sentence as written suggests to me that we will infer the appropriate lifetime for the temporary, which would be quite at odds with the rest of the language. I think I would say it more like this:
If the value to be borrowed is an rvalue (e.g., a constant like 1), then a temporary will be created for it, just as if one explicitly wrote the & (e.g., &1). This temporary will have the same lifetime that would result from an explicit &, which generally means that it lives until the end of the innermost enclosing statement (until the next ;) -- but in some cases it may live longer. See RFC 66 and @nikomatsakis's amendment for more details.
Note that a consequence of this is that in some cases there will be implicit borrows that might be surprising. An example:
fn foo(x: &i32) -> &i32 { x }
let mut x = 22
let p = foo(x); // implicitly: borrows `x`
x += 1; // ERROR -- `x` is borrowed
print(p);
This comment has been minimized.
This comment has been minimized.
leodasvacas
commented
Aug 18, 2017
|
As has been mentioned before, we should consider the compatibility with the discarding ownership idea. |
This comment has been minimized.
This comment has been minimized.
|
I think if we interpret this RFC as "make a copy and borrow the copy" (which I think we should anyway) then the discarding ownership idea is actually a strict superset of this proposal, since that idea is basically "do a move and borrow the move" for all types (and it's already the case that "moving" a Copy type means copying it). So I'd imagine we can easily do that later. Hopefully I'm not missing some important complication there. |
This comment has been minimized.
This comment has been minimized.
rpjohnst
commented
Aug 19, 2017
•
|
In addition to solving the |
RalfJung
reviewed
Aug 21, 2017
| `T: Copy`: | ||
| - `T` to `&mut T`: This conversion has the potential to introduce hidden | ||
| mutations. With this change, passing a variable to a function could allow | ||
| the function to change the value of the variable, even if no `&mut` is present. |
This comment has been minimized.
This comment has been minimized.
RalfJung
Aug 21, 2017
Member
As already mentioned elsewhere I think, using & does not exclude this. For example, Cell could well be Copy, and it is possible for a user crate to soundly implement a Copy Cell today.
I think this RFC should take the possibility of Copy interior mutable types into account.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
RalfJung
Aug 21, 2017
•
Member
It's not Copy yet, but AFAIK there is any good reason for that. So this RFC should work under the assumption that UnsafeCell can be Copy.
EDIT: Also see rust-lang/rust#25053
This comment has been minimized.
This comment has been minimized.
nikomatsakis
Aug 23, 2017
Contributor
I find @RalfJung's logic compelling -- we should consider interior mutability. That said, when we last discussed the Copy question, we had thought to add a lint to reflect the "can copy" vs "should copy" distinction (similar to "must use"). I thought this was a good plan, but it never happened. If it had -- and of course it still can! -- it could certainly apply here.
RalfJung
reviewed
Aug 21, 2017
| mutations. With this change, passing a variable to a function could allow | ||
| the function to change the value of the variable, even if no `&mut` is present. | ||
| - `&mut T` and `&T` to `T`: This conversion would cause extra copies to occur | ||
| which could be difficult to identify when attempting to optimize code. |
This comment has been minimized.
This comment has been minimized.
RalfJung
Aug 21, 2017
Member
Given the fact that copying is already implicit in many places, I don't think this is a strong concern. I would argue that if copying a type is actually noticeable in cost, it should not be Copy -- then the clone calls will make it obvious where the cost is incurred.
This comment has been minimized.
This comment has been minimized.
scottmcm
Aug 21, 2017
Member
While I agree with everything you said, it seems like the PoR is for [i128; 1_000_000] to be Copy...
This comment has been minimized.
This comment has been minimized.
We've generally tried to avoid "open-ended" magic, preferring instead that overloaded operators be constrained by the type signature to do fairly harmless things. For example, the |
This comment has been minimized.
This comment has been minimized.
|
There have been a lot of interesting responses to this RFC. I'll do my best to summarize and respond here. Copying values firstMany people reading the RFC expected to see values copied before being autoreferenced. Initially, I was opposed to this idea since it seems counter-intuitive to some of the RFC's goals, such as making it easier to work with (potentially large) Additionally, introducing removable copies could make it hard for users to optimize their code. If a user wanted to prevent excess copies, they would need to search each function call to check where items have been passed by-value that could have been passed by-reference. This seems like a frustrating and error-prone process. That said, copying values first would avoid the problem of unexpected borrows. Personally, i don't find this to be too motivating, as I think such cases would be rare in practice, and good error messages would make them easy to fix. Interaction with
|
This comment has been minimized.
This comment has been minimized.
This already happens now. If there's a generic function taking a |
This comment has been minimized.
This comment has been minimized.
Uhm, no, |
This comment has been minimized.
This comment has been minimized.
|
Agree with @oli-obk . C++ has copy by default everywhere, and it sucks bad for performance. Rust therefore has move by default everywhere. But there is one exception for Rust, types that implement If you pass copy types per-value they get copied, not moved. This is a decision done in a trade off between ergonomics and performance. The proposed copy-then-autoref behaviour is the same behaviour we have right now for by-value passing of arguments: fn bar(_v: [u32; 64]) {
}
fn foo() {
let a = [0; 64];
bar(a); // Creates a copy of a
bar(a); // Creates a second copy of a
}With copy-then-reference, this would look like: fn bar(_v: &[u32; 64]) {
}
fn foo() {
let a = [0; 64];
bar(a); // Creates a copy of a, then references it
bar(a); // Creates a second copy of a, then references it
}I definitely support a guaranteed optimisation or a lint that warns you if no optimisation could be performed, its a great idea. But I think its not really neccessary, mainly because the main motivation for this RFC seems to be to make generic interfaces that take references more ergonomic to use for Copy types, e.g. the usual "indexing HashMap<u32,T>" case. |
This comment has been minimized.
This comment has been minimized.
|
So @eddyb was talking to me on IRC this morning. He was advocating for a more general approach to autoref -- in particular, I have long been against this, in part stemming from prior experience where I found it helpful to have an idea of where moves occur without having to look at the definition of callees (this was back in the days of "cross-borrows", in which we would silently coerce However, people often bring it up, and talking to @eddyb I started to wonder if I was being too harsh. It would be nice, after all, if I could do I do think that this starts to erode a certain clarity that we have around moves and borrows. It might impede learnability -- this is a bit hard for me to determine. I know that when I teach about moves, I almost always get questions about why (I've also not thought through how this interactions with generics and other coercions. Certainly not knowing, in general, what is a move and what is not might make lots of stuff more complicated.) Regardless, it's worth considering as another way to build on this RFC. If we wanted to go this way, then it would suggest some value towards not making |
This comment has been minimized.
This comment has been minimized.
|
@nikomatsakis I'm mostly supportive of this RFC as it limits auto-referencing to a small subset of cases, but I think extending this beyond Copy types and creating references without asking for all types except a small minority would be really bad. Being explicit about the difference between references and non-references seems part of what a systems language is for me. As someone with a C/C++ background, I think it was one of the mistakes of C++ to have introduced references without making creation of them happen with any discernible syntax. That being said, I do definitely favour adding a |
This comment has been minimized.
This comment has been minimized.
I think the reason this was a mistake is that it lets the callee change the value of my variable. That's not the case here, it would only be done for shared references and types with no interior mutability. For me, that makes all the difference. |
This comment has been minimized.
This comment has been minimized.
leodasvacas
commented
Sep 11, 2017
|
Operators are traits, coercing on them would be doing the same thing we do with method receivers which is quite a slippery slope. I'd rather wait for a better specialization that allows us to make impl<T: Borrow<i32>, U: Borrow<i32>> Add<T> for U |
This comment has been minimized.
This comment has been minimized.
That's exactly why we should coerce on them - they are similar enough to method receivers, where we already do coercions.
Transitive traits are far off in the future - I'm not aware of any plans to implement them. |
This comment has been minimized.
This comment has been minimized.
leodasvacas
commented
Sep 11, 2017
•
Thanks for the correction, that was a misconception on my part. If we can't specify it in the impl then auto-derefing and auto-refing for operators is an interesting idea. I see it as complementing this rfc since there is no intersection between what that solves and what this rfc solves. EDIT: But with intersection impls we could have EDIT2: Also if we start coercing operators then we can now add two |
This comment has been minimized.
This comment has been minimized.
|
One motivation for this RFC was that if/when we add coercion to operators, copy types will autoref in operators. I addressed this in the unresolved questions section of the RFC. |
This comment has been minimized.
This comment has been minimized.
|
The reason the LHS of operators isn't a coercion site is because it breaks expected type propagation (it's actually very close to a standard "generics break coercions" case). The RHS of operators is currently a normal coercion site, but we could change that somewhat. |
This comment has been minimized.
This comment has been minimized.
|
I somewhat agree with @arielb1, and I'm worried we may not have taken a full view of all the trade offs in this decision. There are four obvious points we could sit on in the space of "automatically referencing arguments":
All of these have downsides. Its not obvious to me that this RFC is the best position (right now, any of them but the last seems potentially viable to me). I believe we went with 2 instead of 3 here because of this concern: Let's say I do That's a valid concern, but its also a concern to me that we're going to exacerbate the existing pedagogical difference between |
This comment has been minimized.
This comment has been minimized.
leodasvacas
commented
Sep 11, 2017
|
@arielb1 Woah, that's a surprise for me. I noticed that if you add another impl for the same type, the compiler will no longer coerce and instead error with "trait not implemented", as in this example. Is this all by design? @withoutboats Again the point here is not to make an absolute decision but to experiment with a minimal version so that we may be better informed to decide where we want to sit. This RFC is not simply option two, it's precisely the intersection of the last two options. |
This comment has been minimized.
This comment has been minimized.
|
@leodasvacas I believe each of those is actually forward compatible with all subsequent items (that is, we could also do 3, and relax it to 4, without breaking any code). |
This comment has been minimized.
This comment has been minimized.
leodasvacas
commented
Sep 11, 2017
•
|
@withoutboats I agree that semantically moving is forwards compatible with not semantically moving. However semantically copying is forwards incompatible with not semantically copying. Consider that |
leodasvacas
referenced this pull request
Sep 11, 2017
Closed
Allow autoderef and autoref in operators #2147
This comment has been minimized.
This comment has been minimized.
|
I don't like this (because I don't like implicitness). |
This comment has been minimized.
This comment has been minimized.
tomwhoiscontrary
commented
Sep 14, 2017
|
Reading these comments has been really interesting. Great stuff, everyone! This feature is clearly useful. However, it's also clearly not completely thought through to its final ramifications, including interactions with other potential changes. Postpone? eRFC? Extended period behind a feature gate? |
This comment has been minimized.
This comment has been minimized.
I too am wary. I think that there are obvious pain-points being addressed here, but it's also not obvious that this is the most elegant fix. Also, interestingly, I've been observing the type errors I get lately, and it seems that most of the time the error I get is actually when I have a That said, I did propose to merge, and I did so primarily because I felt like this is the kind of thing where we need to gain some "real-time" experience. Now I'm rethinking that and wondering if this is a case where an eRFC would be appropriate, but one with a broader scope: that is, basically, some kind of experimentation aiming to reduce the problem of "&T vs T" mismatches writ large. That would probably cover all of the above:
The goal would be to try out different models (on a strictly opt-in, experimental basis, obviously) and explore the tradeoffs. Ideally we would have @joshtriplett actively involved =) since he seems to have lots of scenarios in mind where the difference between ref and not-ref is vital to performance (I'd like to see examples and try to run various proposals across them and see what happens). The eRFC process seems suitable because we could come back with a proposed set of rules, once we have more data and a better feeling for what they ought to be.
Well, the code will compile, but I still don't think all those variants are forwards compatible. In particular, there are subtle changes with respect to when destructors run. For example, if we say that |
This comment has been minimized.
This comment has been minimized.
@nikomatsakis I think that's entirely the wrong direction to head into. Sure, experimentation is great, but I already know now that I don't want these things in the language, ever. If you are trying to teach Rust to people who don't even know the difference between T and &T then removing that difference in a few subtle cases where its barely noticeable won't help with those people learning the language! And for autoclone... that's just a tiny little papercut and its definitely not worth the special case and added complexity. |
This comment has been minimized.
This comment has been minimized.
|
@nikomatsakis I wholeheartedly approve of this plan for experimentation -- it's essentially what I was hoping to do for coercion/generics interaction as well. I think it's clear that there are some real problems here and some plausible solutions (not just in this RFC), but hands-on experience would be helpful in pinning down the tradeoffs. There is a procedural question here: the RFCs this would apply to currently specify substantial technical detail. I'm not sure that we'd want to merge them as-is, but I also don't think it makes sense to write a fresh eRFC at this late date. I wonder if the right approach is "Closed as experiment-first" or some such? |
This comment has been minimized.
This comment has been minimized.
leodasvacas
commented
Sep 14, 2017
|
@est31 The difference between In my experience teaching rust, people understand |
This comment has been minimized.
This comment has been minimized.
I would be ok with closing the RFCs as "approved for experimentation" and opening a tracking issue that has links to both of them for posterity (or, maybe even better, opening up a GH repo -- perhaps in rust-lang-nursery -- that includes their text, and where future plans can be laid, but I'm not sure about that). (One thing i'm not clear about, incidentally-- if @arielb1 or @cramertj were to delete their GH accounts, presumably, it would mean the text of these RFCs disappears?) |
This comment has been minimized.
This comment has been minimized.
Storyyeller
commented
Sep 14, 2017
|
&i32 vs i32 doesn't make much difference semantically, but it does have performance implications. Apart from that, it's useful to be able to build a mental model of how things work, even if it doesn't capture every detail of the actual implementation. For example, in Python, my mental model is that every value is a pointer to a structure that contains a hashmap of properties and methods. In Java, everything is a pointer except for the primitive types. I think when you try to confuse these, it could cause problems. |
This comment has been minimized.
This comment has been minimized.
repax
commented
Sep 14, 2017
I wonder if not LLVM is smart enough to optimise away references to primitive types in local expressions (e.g. iterators). If so, then this isn't so much of a problem. For complex |
This comment has been minimized.
This comment has been minimized.
|
Difference between i32 and &i32 is only really “meaningless” in scenarios
where the reference is pointing at a regular memory. It could also very
well point to MMIO region or even some unmapped pages that get mapped on
demand/access.
In cases like these auto-deref becomes implicit auto-code-execution, and
with this RFC people wanting to keep the explicitness would have to resort
to raw pointers and therefore -- littering their code with unsafe blocks
all over.
…On Sep 14, 2017 10:52 PM, "repax" ***@***.***> wrote:
&i32 vs i32 doesn't make much difference semantically, but it does have
performance implications.
I wonder if not LLVM is smart enough to optimise away references to
primitive types in local expressions (e.g. iterators). If so, then this
isn't so much of a problem. For complex T types references are probably
more performant anyhow.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#2111 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AApc0p8TnlHQRBKGFPCmxnFCAC6NziTkks5siYQBgaJpZM4O4hPT>
.
|
This comment has been minimized.
This comment has been minimized.
|
@nagisa Rust already assumes that successful reads are side-effect-free. The compiler is free to insert spurious reads or reorder reads around other effectful operations, and it tells LLVM so. In corner cases like you describe, you already have to use more explicit operations. |
This comment has been minimized.
This comment has been minimized.
|
Well yes, you would use the volatile read and write functions, but that
does not prevent such a reference from being automatically dereferenced by
user mistake.
I mean, sure, I don't disagree that there would be better and less
footgunny ways to write down the same thing (a wrapper type perhaps) -- I'm
just pointing out that the difference between two types is not always...
Meaningless.
On Sep 15, 2017 10:26, "Ralf Jung" <notifications@github.com> wrote:
@nagisa <https://github.com/nagisa> Rust already assumes that successful
reads are side-effect-free. The compiler is free to insert spurious reads
or reorder reads around other effectful operations, and it tells LLVM so.
In corner cases like you describe, you already have to use more explicit
operations.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#2111 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AApc0i810pJXWqV8mRT4Jn3edRvnoQj5ks5siiafgaJpZM4O4hPT>
.
|
This comment has been minimized.
This comment has been minimized.
The compiler is free to insert spurious reads. For example, if you write fn foo(x: &i32) {
if some_condition { let x = *x; ... }
}then it is fine for this code to be translated into fn foo(x: &i32) {
let x = *x;
if some_condition { ... }
}So, the guarantee you are trying to protect against is already non-existent. (Or, if LLVM doesn't do this despite all the annotations we give it, the guarantee is non-existent in many of the Rust semantics that are being proposed.) |
This comment has been minimized.
This comment has been minimized.
|
@rfcbot fcp postpone OK, having thought more on this, I feel better and better about moving towards an experimentation model here, as I described earlier. Therefore -- although FCP completed, I am going to make an alternate proposal, and move to postpone this RFC, with the intention of immediate experimentation. In other words, the intention is to do the following:
|
This comment has been minimized.
This comment has been minimized.
|
@nikomatsakis you have to cancel the existing FCP before proposing a new one. That said: given that this decision effects multiple in-flight RFCs, and that we don't plan to open a separate eRFC (rather, just a direct tracking issue), I'm going to go ahead and close out the relevant RFCs and get the tracking issue set up. Thanks, all, for the illuminating discussion so far! Hopefully trying out some of these ideas behind a flag will give us additional clarity on the tradeoffs, before we come back with a fresh RFC. |
cramertj commentedAug 16, 2017
This RFC allows autoreferencing of T where T: Copy.
Example:
Rendered