Autoreferencing Copy Types #2111

Closed
wants to merge 5 commits into from
Closed

Conversation

cramertj
Member

This RFC allows autoreferencing of T where T: Copy.

Example:

#[derive(Copy, Clone)]
struct S(u8);

fn by_val(_: S) {}
fn by_ref(_: &S) {}
fn by_mut_ref(_: &mut S) {}

fn main() {
    let s = S(5);
    by_val(s);
    by_ref(s); // This works after this RFC.
    // by_mut_ref(s); // ERROR -- expected `&mut S`, found `S`
}

Rendered

@cramertj added the T-lang and Ergonomics Initiative labels on Aug 16, 2017

fn main() {
let mut x = 5;
let y = u8Ref(x);
Contributor

@oli-obk oli-obk Aug 16, 2017

I find this rather unexpected. I thought this RFC meant that Copy types are copied at the site and create a reference to the copy. That is how such a function call looks to me, I'm moving the value into the function, but since it's Copy, it will be copied and I retain the original.

@repax repax Aug 16, 2017

As the owner of x, you would have to trust u8Ref(x) not to mutate x.
(sorry for the misunderstanding)

That x is borrowed is indeed surprising.

Member Author

@repax: this RFC only allows coercion from T -> &T, not &mut T, so u8Ref can't mutate x.

If the lifetime of the borrow was limited to a function call I think I would be less confused:

    let mut x = 5;
    let y = foo(x);
    //          ^ autoreference of `x` occurs here
    x = 7; // `y` is unrelated to `x` so this is ok

Member

@cramertj is there a guarantee that we'll never get a Copy-able UnsafeCell like wrapper?

Contributor

rust-lang/rust#25053 was active not that long ago, seems there's at least some support for making UnsafeCell implement Copy

Contributor

I think that the copy-ability of UnsafeCell is a somewhat distinct question. The last time we discussed it in depth, the conclusion was that we ought to address it with a lint, i.e., we should have the ability to make things copy, but warn if they are actually copied. =) I still think this makes sense, and it would apply here.

Contributor

I am torn about whether to treat this auto-ref as borrowing the actual variable. I too find it somewhat surprising; I think I might prefer that we always create a temporary (with the standard lifetime). It would certainly suffice for the cases that are motivating the RFC; it seems like those cases where it would not suffice are also kind of confusing and unexpected to me (i.e., a reference that lives longer than the current statement).

It also seems more compatible with possibly extending this "auto-ref" to types beyond Copy types. At least the way we had previously talked about that, we wanted foo(x) to be loosely equivalent to { let y = x; foo(&y) }, but with better error messages indicating that one can add an &x if one must retain ownership. I am still interested in thinking that over, but it seems incompatible with the approach of this RFC.
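A minimal sketch of the two readings, written in today's Rust with the `&` spelled out (the implicit form proposed by the RFC does not compile today). The helper names `copy_then_borrow` and `borrow_in_place` are illustrative and not taken from the RFC.

```rust
fn foo(x: &i32) -> &i32 {
    x
}

/// "Copy, then borrow the copy": roughly `{ let y = x; foo(&y) }`.
/// The reference points at the copy, so `x` is never borrowed.
fn copy_then_borrow() -> (i32, i32) {
    let mut x = 22;
    let seen = {
        let y = x; // explicit copy, standing in for the implicit one
        *foo(&y)
    };
    x += 1; // fine: nothing borrows `x`
    (x, seen)
}

/// "Borrow in place": the reference points at `x` itself, so `x` stays
/// borrowed for as long as the returned reference is in use.
fn borrow_in_place() -> (i32, i32) {
    let mut x = 22;
    let p = foo(&x);
    // x += 1; // rejected at this point: `x` is still borrowed through `p`
    let seen = *p; // last use of `p`, the borrow ends here
    x += 1;
    (x, seen)
}

fn main() {
    assert_eq!(copy_then_borrow(), (23, 22));
    assert_eq!(borrow_in_place(), (23, 22));
}
```

Both functions compute the same values; the difference is only in where the borrow checker lets `x` be mutated.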

Contributor

I'm currently also leaning towards making a copy every time instead of borrowing. That seems like:

  • the only way to prevent the combination of Copy+UnsafeCell and this RFC from allowing "implicit mutability", which seems like a terrible footgun, and
  • the only way to prevent the "found T, expected &T" errors we get today from turning into borrow checker errors (which are typically much harder to understand and fix, no matter how good the error message)

If that extra copy is somehow undesirable, iiuc you can always add the explicit & even after this RFC.

Member

Can we make an implicit copy, but promise that the optimizer will almost always eliminate that copy in practice, except in cases like UnsafeCell?

`Copy` types will autoreference: at
[coercion sites](https://github.com/rust-lang/rfcs/blob/master/text/0401-coercions.md#coercions),
when a reference is expected, but an owned
`Copy` type is found, an `&` will be automatically inserted.
Contributor

So, this will work?

trait Trait {}
impl Trait for u8 {}

let x: &Trait = 0u8;
// A reference to `Trait` is expected and `u8: Copy` is provided (but `u8 != Trait`) => a `&` is still inserted
// let x: &Trait = &0u8;
// Now coercion `&u8 -> &Trait` can apply.
// Nice.

Member Author

Similar to my answer below, I would expect this to work since coercions are supposed to apply transitively.

@petrochenkov
Contributor

petrochenkov commented Aug 16, 2017

The question about multiple references again. Does this work?

fn f(arg: &&&u8) {}

f(0u8); // Desugared into `f(&&&0u8)`
// Note that reference types themselves `&T` are `Copy`.

This needs to be clarified in the RFC text.

@oli-obk
Contributor

oli-obk commented Aug 16, 2017

Does this work?

&u8 is Copy. So this should work automatically, right?

@cramertj
Member Author

cramertj commented Aug 16, 2017

@oli-obk @petrochenkov

&u8 is Copy. So [coercing u8 to &&u8] should work automatically, right?

Yes, but not all currently-existing coercions are applied transitively. The original coercions RFC says that they should be, though, so that is the behavior I would expect.
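For comparison, a small sketch of what both questions look like in today's Rust, where every `&` has to be written out and only the unsizing step is implicit. The `id` method is added purely so the example has something to call, and `&dyn Trait` is the current spelling of the `&Trait` used earlier in the thread.

```rust
trait Trait {
    fn id(&self) -> u8;
}

impl Trait for u8 {
    fn id(&self) -> u8 {
        *self
    }
}

fn f(arg: &&&u8) -> u8 {
    ***arg
}

fn main() {
    // Three explicit references; under the RFC these would all be inserted.
    assert_eq!(f(&&&0u8), 0);

    // One explicit `&`, then the existing `&u8 -> &dyn Trait` coercion applies.
    let x: &dyn Trait = &7u8;
    assert_eq!(x.id(), 7);
}
```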

@est31
Member

est31 commented Aug 16, 2017

Any use cases for wanting to pass a Copy value by reference instead of by value? The example given is not really convincing; an u8 is one byte and gets most likely passed via one register anyway.

It's also inconsistent with non-Copy types, which really should not get autoreferencing.

@cramertj
Member Author

cramertj commented Aug 16, 2017

@est31 The concern here isn't the performance implications-- the concern is making it easier and more ergonomic to work with Copy types. While I agree this particular RFC is fairly minimal and doesn't fix all of the examples outlined in the motivation section, the hope is to eventually augment this RFC with smarter generics-aware coercions. With that, cases like &&3 > 1 (the "filter" problem) could be solved using the coercion in this RFC (1 -> &1) and the existing AsRef coercions which allow &&T -> &T (see this playground example).

Although it's more of a long-term goal, and I've brought it up in the unresolved questions, I think that it would be good to provide some more explanation of this process in the RFC itself. I'll amend the RFC soon to add more details.

@est31
Member

est31 commented Aug 16, 2017

making it easier and more ergonomic to work with Copy types

I know that the RFC intends to make passing Copy types per reference easier.

My question was: Why would you want to pass a Copy type per reference instead of per value?

cases like &&3 > 1 (the "filter" problem)

I'm not sure whether it's good to allow comparing &T with T. In C this means something much different.

@cramertj
Member Author

@est31

Why would you want to pass a Copy type per reference instead of per value?

Usually, you wouldn't. However, cases often show up in generic or autogenerated code where a function or type expects &T rather than T.

I'm not sure whether it's good to allow comparing &T with T. In C this means something much different.

We already implement operators for &i32 and &i32 (example). In general, I think the intuition here is that &T shouldn't be thought of as a pointer to T, but as a "shared" T. Comparing pointers in C is more like comparing *mut and *const pointers in Rust.
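A short sketch of what the standard library already accepts for references to integers. The function names are made up for illustration; the operator impls themselves are the existing `std` ones.

```rust
/// `&i32 + &i32` is implemented in std and yields an `i32`.
fn sum_refs(a: &i32, b: &i32) -> i32 {
    a + b
}

/// Comparing two references compares the pointed-to values, not addresses.
fn less(a: &i32, b: &i32) -> bool {
    a < b
}

fn main() {
    let a = 2;
    let b = 3;
    assert_eq!(sum_refs(&a, &b), 5);
    assert_eq!(a + &b, 5); // mixed `i32 + &i32` also has an impl
    assert!(less(&a, &b));
    // assert!(&a < b); // does not compile today: `&i32` cannot be compared with `i32`
}
```

The last, commented-out line is the kind of mixed comparison the thread is debating.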

@F001
Contributor

F001 commented Aug 17, 2017

Is it possible to design this feature in a general manner, not limited to "auto-reference"?

I would expect a trait in std library which exposes this feature to be user-extendable. Such as:

trait AutoConvert<T> {
    fn convert(&self) -> T;
}

I wouldn't mind if it is a lang item. Whenever we need an "implicit type conversion", compiler will call this method.

For T: Copy types, we can convert it to &T type implicitly; For &str type, we can convert it to Cow<'a, str> type implicitly; For T type, we can convert it to Option<T> type implicitly; and so on.

Maybe it is dangerous to allow users to define implicit type conversions.
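To make the idea concrete, here is a hedged sketch of the hypothetical `AutoConvert` trait with two of the conversions mentioned above. Nothing here is a lang item, so `.convert()` is called by hand where the proposal imagines the compiler inserting it; `shout` and `or_zero` are made-up consumers.

```rust
use std::borrow::Cow;

// Hypothetical trait from the comment above; not part of std.
trait AutoConvert<T> {
    fn convert(&self) -> T;
}

// `&str -> Cow<str>`: borrows, no allocation.
impl<'a> AutoConvert<Cow<'a, str>> for &'a str {
    fn convert(&self) -> Cow<'a, str> {
        Cow::Borrowed(*self)
    }
}

// `T -> Option<T>`: wraps a clone of the value.
impl<T: Clone> AutoConvert<Option<T>> for T {
    fn convert(&self) -> Option<T> {
        Some(self.clone())
    }
}

fn shout(text: Cow<str>) -> String {
    text.to_uppercase()
}

fn or_zero(value: Option<u8>) -> u8 {
    value.unwrap_or(0)
}

fn main() {
    // The expected parameter type selects which conversion runs.
    assert_eq!(shout("hi".convert()), "HI");
    assert_eq!(or_zero(5u8.convert()), 5);
}
```

Note that the call site gives no hint which conversion is chosen; that is decided entirely by the callee's signature, which is the openness the comment itself calls potentially dangerous.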

`Copy` type is found, an `&` will be automatically inserted.

If the `Copy` value is a temporary, its lifetime will be promoted to meet the
required lifetime of the reference (where possible).
Contributor

I find this sentence at odds with the behavior specified in RFC 66 etc. Specifically, the sentence as written suggests to me that we will infer the appropriate lifetime for the temporary, which would be quite at odds with the rest of the language. I think I would say it more like this:

If the value to be borrowed is an rvalue (e.g., a constant like 1), then a temporary will be created for it, just as if one explicitly wrote the & (e.g., &1). This temporary will have the same lifetime that would result from an explicit &, which generally means that it lives until the end of the innermost enclosing statement (until the next ;) -- but in some cases it may live longer. See RFC 66 and @nikomatsakis's amendment for more details.

Note that a consequence of this is that in some cases there will be implicit borrows that might be surprising. An example:

fn foo(x: &i32) -> &i32 { x }

let mut x = 22;
let p = foo(x); // implicitly: borrows `x`
x += 1; // ERROR -- `x` is borrowed
println!("{}", p);

@leoyvens

As has been mentioned before, we should consider the compatibility with the discarding ownership idea.

@Ixrec
Contributor

Ixrec commented Aug 18, 2017

I think if we interpret this RFC as "make a copy and borrow the copy" (which I think we should anyway) then the discarding ownership idea is actually a strict superset of this proposal, since that idea is basically "do a move and borrow the move" for all types (and it's already the case that "moving" a Copy type means copying it). So I'd imagine we can easily do that later. Hopefully I'm not missing some important complication there.

@rpjohnst

rpjohnst commented Aug 19, 2017

In addition to solving the UnsafeCell: Copy issue, and making this into a subset of the "discarding ownership" idea, the implicit copy idea also preserves the current performance characteristics of "I'm passing this value without a &, it should be copied." Further, it enables this to work not only with &T but also with &mut T without surprises.

`T: Copy`:
- `T` to `&mut T`: This conversion has the potential to introduce hidden
mutations. With this change, passing a variable to a function could allow
the function to change the value of the variable, even if no `&mut` is present.
Member

As already mentioned elsewhere I think, using & does not exclude this. For example, Cell could well be Copy, and it is possible for a user crate to soundly implement a Copy Cell today.

I think this RFC should take the possibility of Copy interior mutable types into account.

technically, this is not possible, because UnsafeCell is not Copy.

Member

Didn't this get fixed recently? Or am I imagining it?

Member

@RalfJung RalfJung Aug 21, 2017

It's not Copy yet, but AFAIK there isn't any good reason for that. So this RFC should work under the assumption that UnsafeCell can be Copy.

EDIT: Also see rust-lang/rust#25053

Contributor

I find @RalfJung's logic compelling -- we should consider interior mutability. That said, when we last discussed the Copy question, we had thought to add a lint to reflect the "can copy" vs "should copy" distinction (similar to "must use"). I thought this was a good plan, but it never happened. If it had -- and of course it still can! -- it could certainly apply here.

mutations. With this change, passing a variable to a function could allow
the function to change the value of the variable, even if no `&mut` is present.
- `&mut T` and `&T` to `T`: This conversion would cause extra copies to occur
which could be difficult to identify when attempting to optimize code.
Member

Given the fact that copying is already implicit in many places, I don't think this is a strong concern. I would argue that if copying a type is actually noticeable in cost, it should not be Copy -- then the clone calls will make it obvious where the cost is incurred.

Member

While I agree with everything you said, it seems like the PoR is for [i128; 1_000_000] to be Copy...

@nikomatsakis
Contributor

@F001

Is it possible to design this feature in a general manner, not limited to "auto-reference"?

We've generally tried to avoid "open-ended" magic, preferring instead that overloaded operators be constrained by the type signature to do fairly harmless things. For example, the Deref trait requires you to return a reference, etc.

@cramertj
Member Author

cramertj commented Aug 24, 2017

There have been a lot of interesting responses to this RFC. I'll do my best to summarize and respond here.

Copying values first

Many people reading the RFC expected to see values copied before being autoreferenced. Initially, I was opposed to this idea since it seems counter-intuitive to some of the RFC's goals, such as making it easier to work with (potentially large) Copy types and functions that take them by reference. Functions that take slices (&[T]) or BigNum types often take their arguments by reference just to avoid a memcpy. This RFC would make such APIs easier to work with as it would allow the user to avoid having to sprinkle & throughout their code.

Additionally, introducing removable copies could make it hard for users to optimize their code. If a user wanted to prevent excess copies, they would need to search each function call to check where items have been passed by-value that could have been passed by-reference. This seems like a frustrating and error-prone process.

That said, copying values first would avoid the problem of unexpected borrows. Personally, I don't find this to be too motivating, as I think such cases would be rare in practice, and good error messages would make them easy to fix.

Interaction with UnsafeCell

One thing I didn't consider when writing this RFC was the possibility that UnsafeCell could become Copy. This would allow Copy types to have interior mutability. This would put a hole in this RFC's assertion that &T is functionally equivalent to T when T: Copy. In particular, this case provides strong motivation for introducing a copy prior to autoreferencing values.
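A sketch of the hazard using `std::cell::Cell`. `Cell` is not `Copy` today, so this is only an approximation: the `&` is explicit and the "copy" is built by hand with `Cell::new(counter.get())`. The function names are illustrative.

```rust
use std::cell::Cell;

/// A callee that mutates through a shared reference.
fn bump(counter: &Cell<i32>) {
    counter.set(counter.get() + 1);
}

/// Borrowing in place: the caller's value changes even though
/// only a shared reference was handed out.
fn borrowed_in_place() -> i32 {
    let counter = Cell::new(5);
    bump(&counter);
    counter.get()
}

/// Copying first: the callee only ever sees the copy.
fn copied_first() -> i32 {
    let counter = Cell::new(5);
    let copy = Cell::new(counter.get()); // manual stand-in for an implicit copy
    bump(&copy);
    counter.get()
}

fn main() {
    assert_eq!(borrowed_in_place(), 6);
    assert_eq!(copied_first(), 5);
}
```

If a `Copy` type with interior mutability existed, an implicit borrow-in-place autoref would behave like the first function without any `&` visible at the call site.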

Guaranteed Optimization

One proposed solution is to provide some sort of guarantee that the copy is optimized out where possible. I believe this should be possible if and only if

  • the value does not implement Drop and
  • the value does not contain an UnsafeCell
  • a borrow can be introduced without either:
    • extending the lifetime of the value or
    • overlapping with an existing mutation or mutable borrow

I'm interested in what we can do here-- is it possible to guarantee these types of optimizations and to provide a warning whenever the optimization isn't possible? I think that would be the best of both worlds. How do you all feel about this idea? Does anyone have a use case that would be notably worse or more confusing under this change?

Interop with more liberal "discarding ownership" proposals

There have been other proposals to allow autoref of non-Copy values. Under this solution, non-Copy values would be moved when passed to a function and would be unusable afterwards, even if the function itself only needed a reference. Personally, I'm opposed to this idea as it would make it harder to determine where a move could be prevented, and users would wind up having to sort through documentation or sprinkle around & in order to see if they could prevent a move from occurring. It's possible we could introduce a warning to help users fix their code, but introducing a new feature and then warning against its use feels dubious to me. It's possible this would help with the initial edit-compile-debug cycle by making time-to-first-compilation faster, but eventually users are going to have to go back and fix up the warnings.

Please let me know if I've missed something, and I'll do my best to follow up as soon as possible.

@oli-obk
Contributor

oli-obk commented Aug 24, 2017

Additionally, introducing removable copies could make it hard for users to optimize their code. If a user wanted to prevent excess copies, they would need to search each function call to check where items have been passed by-value that could have been passed by-reference. This seems like a frustrating and error-prone process.

This already happens now. If there's a generic function taking a Deref<Target = [T]> (edit: AsRef<[T]>) and you pass it an array of Copy elements by value, then the array is copied. I think it would be totally sensible to add some configurable lint (maybe in clippy) that complains when you copy objects bigger than the word size of your target. You could then turn on this lint in the performance critical regions of your code. There could also be a restriction lint which complains about every situation where an autoreferencing is happening (basically reverting this RFC on a by-crate basis with a lint).
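A minimal sketch of the `AsRef<[T]>` case in today's Rust; the function name `total` is made up for the example.

```rust
/// Generic over anything that can be viewed as a `[u32]` slice.
fn total<A: AsRef<[u32]>>(values: A) -> u32 {
    values.as_ref().iter().sum()
}

fn main() {
    let a = [1u32; 64];

    // By value: `[u32; 64]` is `Copy`, so the whole array is semantically
    // copied into the call and `a` stays usable afterwards.
    assert_eq!(total(a), 64);

    // By reference: `&[u32; 64]` also implements `AsRef<[u32]>`,
    // so only a pointer is passed.
    assert_eq!(total(&a), 64);
}
```

Both calls compile without any diagnostic, which is why a size-based lint is suggested above.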

@eddyb
Member

eddyb commented Aug 24, 2017

If there's a generic function taking a Deref<Target = [T]> and you pass it an array of Copy elements by value, then the array is copied

Uhm, no, Deref has nothing to do with array -> slice unsizing. That's Unsize<[T]>.

@est31
Member

est31 commented Aug 24, 2017

Agree with @oli-obk . C++ has copy by default everywhere, and it sucks bad for performance. Rust therefore has move by default everywhere. But there is one exception for Rust, types that implement Copy.

If you pass copy types per-value they get copied, not moved. This is a decision done in a trade off between ergonomics and performance.

The proposed copy-then-autoref behaviour is the same behaviour we have right now for by-value passing of arguments:

fn bar(_v: [u32; 64]) {
}
fn foo() {
    let a = [0; 64];
    bar(a); // Creates a copy of a
    bar(a); // Creates a second copy of a
}

With copy-then-reference, this would look like:

fn bar(_v: &[u32; 64]) {
}
fn foo() {
    let a = [0; 64];
    bar(a); // Creates a copy of a, then references it
    bar(a); // Creates a second copy of a, then references it
}

I definitely support a guaranteed optimisation or a lint that warns you if no optimisation could be performed; it's a great idea. But I think it's not really necessary, mainly because the main motivation for this RFC seems to be to make generic interfaces that take references more ergonomic to use for Copy types, e.g. the usual "indexing HashMap<u32,T>" case.

@nikomatsakis
Contributor

nikomatsakis commented Aug 24, 2017

So @eddyb was talking to me on IRC this morning. He was advocating for a more general approach to autoref -- in particular, foo(x) could be transformed to foo(&x) if fn foo(x: &T) and the type of T is Freeze (meaning roughly "no unsafe cell"). The idea is essentially that you can do an (implicit) shared borrow so long as no values will be changed through that borrow.

I have long been against this, in part stemming from prior experience where I found it helpful to have an idea of where moves occur without having to look at the definition of callees (this was back in the days of "cross-borrows", in which we would silently coerce ~T (now Box<T>) to &T).

However, people often bring it up, and talking to @eddyb I started to wonder if I was being too harsh. It would be nice, after all, if I could do hashmap.get(some_string), just as I would like to be able to do hashmap.get(22). And it would be nice if I could use that string afterwards. =)
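For reference, this is how the two lookups have to be written today, with the `&` explicit in both cases. The map contents and the `lookups` helper are invented for the example.

```rust
use std::collections::HashMap;

fn lookups() -> (Option<&'static str>, Option<u32>, String) {
    let mut by_id: HashMap<u32, &'static str> = HashMap::new();
    by_id.insert(22, "twenty-two");

    let mut by_name: HashMap<String, u32> = HashMap::new();
    by_name.insert(String::from("answer"), 42);

    let some_string = String::from("answer");

    // `get` takes the key by reference, even for a `Copy` key like `u32`.
    let a = by_id.get(&22).copied();
    // Borrowing the `String` leaves it usable afterwards.
    let b = by_name.get(&some_string).copied();

    (a, b, some_string)
}

fn main() {
    let (a, b, some_string) = lookups();
    assert_eq!(a, Some("twenty-two"));
    assert_eq!(b, Some(42));
    assert_eq!(some_string, "answer");
}
```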

I do think that this starts to erode a certain clarity that we have around moves and borrows. It might impede learnability -- this is a bit hard for me to determine. I know that when I teach about moves, I almost always get questions about why format!("{}", foo) does not move foo -- which is evidence that people are recognizing the pattern right away. Right now, I tell them "it's a macro and expands to wacky code", but maybe saying "when we learn about borrows, you'll see why" would be a suitable answer as well.

(I've also not thought through how this interacts with generics and other coercions. Certainly not knowing, in general, what is a move and what is not might make lots of stuff more complicated.)

Regardless, it's worth considering as another way to build on this RFC. If we wanted to go this way, then it would suggest some value towards not making foo(x) be equivalent to { let y = x; foo(&y) } -- and that we would want to impose some rules around UnsafeCell.

@est31
Member

est31 commented Aug 24, 2017

@nikomatsakis I'm mostly supportive of this RFC as it limits auto-referencing to a small subset of cases, but I think extending this beyond Copy types and creating references without asking for all types except a small minority would be really bad. Being explicit about the difference between references and non-references seems part of what a systems language is for me. As someone with a C/C++ background, I think it was one of the mistakes of C++ to have introduced references without making creation of them happen with any discernible syntax.

That being said, I do definitely favour adding a Freeze trait to the language and letting people play with it :).

@RalfJung
Member

I think it was one of the mistakes of C++ to have introduced references without making creation of them

I think the reason this was a mistake is that it lets the callee change the value of my variable. That's not the case here, it would only be done for shared references and types with no interior mutability. For me, that makes all the difference.

@leoyvens

Operators are traits, coercing on them would be doing the same thing we do with method receivers which is quite a slippery slope.

I'd rather wait for a better specialization that allows us to make Borrow both transitive and reflexive, and then write:

impl<T: Borrow<i32>, U: Borrow<i32>> Add<T> for U

@arielb1
Contributor

arielb1 commented Sep 11, 2017

Operators are traits, coercing on them would be doing the same thing we do with method receivers which is quite a slippery slope.

That's exactly why we should coerce on them - they are similar enough to method receivers, where we already do coercions.

I'd rather wait for a better specialization that allows us to make Borrow both transitive and reflexive, and then write:

Transitive traits are far off in the future - I'm not aware of any plans to implement them.

@leoyvens

leoyvens commented Sep 11, 2017

Transitive traits are far off in the future - I'm not aware of any plans to implement them.

Thanks for the correction, that was a misconception on my part. If we can't specify it in the impl then auto-derefing and auto-refing for operators is an interesting idea. I see it as complementing this rfc since there is no intersection between what that solves and what this rfc solves.

EDIT: But with intersection impls we could have Borrow be reflexive and have that same impl we have for AsRef that makes it "pass through" references, right? This with impls like the one I mentioned above would solve operators for the common cases of up to two indirections, seems like a reasonable compromise.

EDIT2: Also if we start coercing operators then we can now add two Strings which is a controversy by itself.

@cramertj
Member Author

One motivation for this RFC was that if/when we add coercion to operators, copy types will autoref in operators. I addressed this in the unresolved questions section of the RFC.

@arielb1
Contributor

arielb1 commented Sep 11, 2017

@cramertj

The reason the LHS of operators isn't a coercion site is because it breaks expected type propagation (it's actually very close to a standard "generics break coercions" case). The RHS of operators is currently a normal coercion site, but we could change that somewhat.

@withoutboats
Contributor

withoutboats commented Sep 11, 2017

I somewhat agree with @arielb1, and I'm worried we may not have taken a full view of all the trade offs in this decision.

There are four obvious points we could sit on in the space of "automatically referencing arguments":

  1. We never do it (today)
  2. We do it for Copy types only (this RFC)
  3. We do it for all types, semantically moving them
  4. We do it for all types, not semantically moving them

All of these have downsides. It's not obvious to me that this RFC is the best position (right now, any of them but the last seems potentially viable to me).

I believe we went with 2 instead of 3 here because of this concern: Let's say I do map.get(string);, but now I want to use string. The concern is I won't realize I can do map.get(&string);, and instead I'll think I have to map.get(string.clone());. Copy types don't have this problem.

That's a valid concern, but it's also a concern to me that we're going to exacerbate the existing pedagogical difference between Copy types and non-Copy types.

@leoyvens

@arielb1 Woah, that's a surprise for me. I noticed that if you add another impl for the same type, the compiler will no longer coerce and instead error with "trait not implemented", as in this example. Is this all by design?

@withoutboats Again the point here is not to make an absolute decision but to experiment with a minimal version so that we may be better informed to decide where we want to sit. This RFC is not simply option two, it's precisely the intersection of the last two options.

@withoutboats
Contributor

@leodasvacas I believe each of those is actually forward compatible with all subsequent items (that is, we could also do 3, and relax it to 4, without breaking any code).

@leoyvens

leoyvens commented Sep 11, 2017

@withoutboats I agree that semantically moving is forwards compatible with not semantically moving. However semantically copying is forwards incompatible with not semantically copying. Consider that foo(&{copy}, &mut copy) will compile but foo(&copy, &mut copy) will not compile. Not semantically copying with the restrictions specified in the RFC is forwards compatible with semantically copying.
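The `&{copy}` versus `&copy` difference can be checked in today's Rust. In this sketch `add_into` and `double_via_copy` are made-up names; the block expression `{ x }` produces a temporary copy of `x`.

```rust
fn add_into(src: &i32, dst: &mut i32) {
    *dst += *src;
}

fn double_via_copy(mut x: i32) -> i32 {
    // `&{ x }` borrows a temporary copy, so `x` itself is free
    // to be mutably borrowed in the same call.
    add_into(&{ x }, &mut x);

    // add_into(&x, &mut x); // rejected: `x` would be borrowed as shared and mutable at once

    x
}

fn main() {
    assert_eq!(double_via_copy(5), 10);
}
```

Code that relies on the first form would stop compiling if the implicit autoref later switched from copying to borrowing in place.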

@GuillaumeGomez
Member

GuillaumeGomez commented Sep 13, 2017

I don't like this (because I don't like implicitness).

@tomwhoiscontrary

Reading these comments has been really interesting. Great stuff, everyone!

This feature is clearly useful. However, it's also clearly not completely thought through to its final ramifications, including interactions with other potential changes. Postpone? eRFC? Extended period behind a feature gate?

@nikomatsakis
Contributor

@withoutboats

I somewhat agree with @arielb1, and I'm worried we may not have taken a full view of all the trade offs in this decision.

I too am wary. I think that there are obvious pain-points being addressed here, but it's also not obvious that this is the most elegant fix. Also, interestingly, I've been observing the type errors I get lately, and it seems that most of the time the error I get is actually when I have a &T but the function wants a T (and T is something lightweight to copy). That is, the opposite of the problem that this RFC solves. Not to say that I don't get errors around "extra ref needed" too.

That said, I did propose to merge, and I did so primarily because I felt like this is the kind of thing where we need to gain some "real-time" experience. Now I'm rethinking that and wondering if this is a case where an eRFC would be appropriate, but one with a broader scope: that is, basically, some kind of experimentation aiming to reduce the problem of "&T vs T" mismatches writ large.

That would probably cover all of the above:

  • Some kind of lint to avoid accidental copies of types, even copy types (e.g., I probably didn't want to copy that [u8; 1024]).
  • Autoderef/autoref in operators (Allow autoderef and autoref in operators #2147).
  • Autoref in function arguments (this RFC but more general).
  • Autoderef in function arguments (e.g. allowing foo(&2) when fn foo(x: usize)).
  • Maybe even autoclone (e.g., particularly with Rc and Arc, it's annoying to have to type clone sometimes, but sometimes you really want to actively avoid those ref-count adjustments).

The goal would be to try out different models (on a strictly opt-in, experimental basis, obviously) and explore the tradeoffs. Ideally we would have @joshtriplett actively involved =) since he seems to have lots of scenarios in mind where the difference between ref and not-ref is vital to performance (I'd like to see examples and try to run various proposals across them and see what happens).

The eRFC process seems suitable because we could come back with a proposed set of rules, once we have more data and a better feeling for what they ought to be.

There are four obvious points we could sit on in the space of "automatically referencing arguments" ... I believe each of those is actually forward compatible with all subsequent items (that is, we could also do 3, and relax it to 4, without breaking any code).

Well, the code will compile, but I still don't think all those variants are forwards compatible. In particular, there are subtle changes with respect to when destructors run. For example, if we say that foo(x) is "semantically moving" (what you called number 3), then changing that to semantically not moving means that the destructor will run at a different time.
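To make the destructor point concrete, here is a small sketch using today's explicit syntax. It assumes one of the more general variants that would also apply to non-`Copy` types (a `Copy` type cannot implement `Drop`), and `Noisy`, `consume`, and `borrow` are hypothetical names. The only thing it shows is that the same call site drops the value at a different point depending on whether the call moves or borrows.

```rust
use std::cell::RefCell;

// Hypothetical type that records when its destructor runs.
struct Noisy<'a> {
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push("drop");
    }
}

fn consume(_x: Noisy<'_>) {}
fn borrow(_x: &Noisy<'_>) {}

// The call moves `x`, so the destructor runs before the call returns.
fn moving_call() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let x = Noisy { log: &log };
        consume(x);
        log.borrow_mut().push("after call");
    }
    log.into_inner()
}

// The call borrows `x`, so the destructor runs at the end of the scope.
fn borrowing_call() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let x = Noisy { log: &log };
        borrow(&x);
        log.borrow_mut().push("after call");
    }
    log.into_inner()
}

fn main() {
    assert_eq!(moving_call(), vec!["drop", "after call"]);
    assert_eq!(borrowing_call(), vec!["after call", "drop"]);
}
```

If `foo(x)` silently switched from the first behaviour to the second, the code would still compile, but anything that relies on the drop order (locks, guards, logging) would observe the change.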

@est31
Member

est31 commented Sep 14, 2017

this RFC but more general
Maybe even autoclone

@nikomatsakis I think that's entirely the wrong direction to head into. Sure, experimentation is great, but I already know now that I don't want these things in the language, ever.

If you are trying to teach Rust to people who don't even know the difference between T and &T, then removing that difference in a few subtle cases where it's barely noticeable won't help those people learn the language! And as for autoclone... that's just a tiny little papercut, and it's definitely not worth the special case and added complexity.

@aturon
Member

aturon commented Sep 14, 2017

@nikomatsakis I wholeheartedly approve of this plan for experimentation -- it's essentially what I was hoping to do for coercion/generics interaction as well. I think it's clear that there are some real problems here and some plausible solutions (not just in this RFC), but hands-on experience would be helpful in pinning down the tradeoffs.

There is a procedural question here: the RFCs this would apply to currently specify substantial technical detail. I'm not sure that we'd want to merge them as-is, but I also don't think it makes sense to write a fresh eRFC at this late date. I wonder if the right approach is "Closed as experiment-first" or some such?

@leoyvens

@est31 The difference between T and &T is sometimes meaningful and sometimes meaningless. The difference between &i32 and i32 is mostly meaningless, but if the code is generic, taking and passing &T is the right thing to do. Then we end up with the iterator API, which does the right thing but is annoying when iterating over simple integers. And how do you teach iterators? With integers, of course. Rust makes a bad first impression when you type **x > 0 or 2.pow(**x).
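As a concrete illustration of the double dereference, here is a minimal sketch. The function names are invented for the example, and the literal is written as `2u32` because a bare `2.pow(..)` does not type-check on its own.

```rust
// `iter()` on a slice yields `&i32`, and `filter` passes its closure a
// reference to each item, so the closure sees `&&i32`.
fn positives(values: &[i32]) -> Vec<i32> {
    values.iter().filter(|x| **x > 0).copied().collect()
}

// Same shape with `pow`: the exponent has to be dereferenced twice.
fn large_powers(exponents: &[u32]) -> Vec<u32> {
    exponents
        .iter()
        .filter(|x| 2u32.pow(**x) > 8)
        .copied()
        .collect()
}

fn main() {
    assert_eq!(positives(&[-1, 0, 3, 7]), vec![3, 7]);
    assert_eq!(large_powers(&[1, 3, 4, 5]), vec![4, 5]);
}
```

The usual workaround today is to destructure in the closure pattern, for example `filter(|&&x| x > 0)`, which is shorter but arguably no easier to explain to a beginner.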

In my experience teaching Rust, people understand T vs &T, but it's hard to justify the juggling when the difference is not useful. Easing that would certainly help early beginners. If they stick with the language the difference may come back to bite them, but at least the learning curve will be smoother.

@nikomatsakis
Contributor

@aturon

There is a procedural question here: the RFCs this would apply to currently specify substantial technical detail. I'm not sure that we'd want to merge them as-is, but I also don't think it makes sense to write a fresh eRFC at this late date. I wonder if the right approach is "Closed as experiment-first" or some such?

I would be ok with closing the RFCs as "approved for experimentation" and opening a tracking issue that has links to both of them for posterity (or, maybe even better, opening up a GH repo, perhaps in rust-lang-nursery, that includes their text and where future plans can be laid, but I'm not sure about that). (One thing I'm not clear about, incidentally: if @arielb1 or @cramertj were to delete their GH accounts, presumably it would mean the text of these RFCs disappears?)

@Storyyeller

@leodasvacas

&i32 vs i32 doesn't make much difference semantically, but it does have performance implications. Apart from that, it's useful to be able to build a mental model of how things work, even if it doesn't capture every detail of the actual implementation. For example, in Python, my mental model is that every value is a pointer to a structure that contains a hashmap of properties and methods. In Java, everything is a pointer except for the primitive types. I think when you try to confuse these, it could cause problems.

@repax

repax commented Sep 14, 2017

&i32 vs i32 doesn't make much difference semantically, but it does have performance implications.

I wonder whether LLVM is smart enough to optimise away references to primitive types in local expressions (e.g. iterators). If so, then this isn't so much of a problem. For complex types T, references are probably more performant anyhow.

@nagisa
Member

nagisa commented Sep 15, 2017 via email

@RalfJung
Member

@nagisa Rust already assumes that successful reads are side-effect-free. The compiler is free to insert spurious reads or reorder reads around other effectful operations, and it tells LLVM so.

In corner cases like you describe, you already have to use more explicit operations.

@nagisa
Member

nagisa commented Sep 15, 2017 via email

@RalfJung
Member

RalfJung commented Sep 15, 2017

that does not prevent such a reference from being automatically dereferenced by user mistake.

The compiler is free to insert spurious reads. For example, if you write

fn foo(x: &i32) {
  if some_condition { let x = *x; ... }
}

then it is fine for this code to be translated into

fn foo(x: &i32) {
  let x = *x;
  if some_condition { ... }
}

So, the guarantee you are trying to protect against is already non-existent. (Or, if LLVM doesn't do this despite all the annotations we give it, the guarantee is non-existent in many of the Rust semantics that are being proposed.)

@nikomatsakis
Contributor

@rfcbot fcp postpone

OK, having thought more on this, I feel better and better about moving towards an experimentation model here, as I described earlier. Therefore, although FCP has completed, I am going to make an alternate proposal and move to postpone this RFC, with the intention of immediate experimentation.

In other words, the intention is to do the following:

  • Close this RFC for now; to preserve its contents, we can move the current draft into a gist.
  • Open a tracking issue for the experimentation phase on the compiler website
    • Link this RFC along with others that describe potential experiments (e.g., operator autoref)
  • Let people implement those under a feature-gate (or multiple feature-gates)
  • Gain experience and in particular seek out troublesome code examples where we fear it may be too easy to produce inefficient or broken code

@aturon
Member

aturon commented Sep 15, 2017

@nikomatsakis you have to cancel the existing FCP before proposing a new one.

That said: given that this decision affects multiple in-flight RFCs, and that we don't plan to open a separate eRFC (rather, just a direct tracking issue), I'm going to go ahead and close out the relevant RFCs and get the tracking issue set up.

Thanks, all, for the illuminating discussion so far! Hopefully trying out some of these ideas behind a flag will give us additional clarity on the tradeoffs, before we come back with a fresh RFC.

@aturon
Member

aturon commented Sep 15, 2017

Experiment tracking issue.

Labels
Ergonomics Initiative: Part of the ergonomics initiative
final-comment-period: Will be merged/postponed/closed in ~10 calendar days unless new substantial objections are raised.
T-lang: Relevant to the language team, which will review and decide on the RFC.