New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Generalized Type Ascription #2522

Open
wants to merge 49 commits into
base: master
from

Conversation

@Centril
Copy link
Contributor

Centril commented Aug 10, 2018

🖼️ Rendered

📝 Summary

This RFC supersedes and subsumes RFC 803. It generalizes existing type ascription in let bindings to everywhere a pattern can occur and makes some changes to ascription in expressions. You may now for example write:

let x = (0..10)
    .map(some_computation)
    .collect() : Result<Vec<_>, _>
    .unwrap()
    .map(other_computation) : Vec<usize>
    .into() : Rc<[_]>;

let alpha: u8 = expr;
    ^^^^^^^^^

let [x: u8, y, z] = stuff();
    ^^^^^^^^^^^^^

if let Some(beta: u8) = expr { .. }
            ^^^^^^^^

for x: i8 in 0..100 { .. }
    ^^^^^

fn foo(Wrapping(alpha: usize): Wrapping<usize>) {}
       ^^^^^^^^^^^^^^^^^^^^^^

Here, the underlined bits are patterns.

Finally, when a user writes Foo { $field: $pat : $type }, and when $pat and $type are syntactically α-equivalent, the compiler emits a warn-by-default lint suggesting: Foo { $field: ($pat : $type) }.

💖 Thanks

To @nrc, @kennytm, @varkor for reviewing the draft version.
To @scottmcm in particular for reviewing and being my rubber duck wrt. type inference.

Edit: Direct link to the pFCP checkboxes: #2522 (comment)

Centril added some commits Aug 6, 2018

Centril added some commits Jan 3, 2019

Merge pull request #20 from haslersn/rfc/generalized-type-ascription-…
…if-let-example

rfc, generalized-type-ascription: add `if let` example.
### Expressions

The operational semantics and type checking rules for type ascription in
expression contexts is *exactly* as specified in [RFC 803].

This comment has been minimized.

@haslersn

haslersn Jan 5, 2019

Note: This is (IMO positively) affected by #2623.

@graydon

This comment has been minimized.

Copy link

graydon commented Jan 12, 2019

Approve, assuming nothing breaks (devil is always in details) this is just finishing a thing that ought to have been finished years ago and, as the doc mentions, many people and error messages already assume works this way.

@@ -1067,57 +701,6 @@ to:
let : LET pat maybe_init_expr ';' ;`
```

Finally, the grammar of function definitions is changed:

This comment has been minimized.

@scottmcm

scottmcm Jan 15, 2019

Member

Seems like it's at least worth keeping some of this, right? Because the grammar is no longer pat:ty, it's pat, with a semantic requirement that it be ascribed at the top level?

This comment has been minimized.

@Centril

Centril Jan 15, 2019

Contributor

I changed it back to pat: ty. I'd love for the actual grammar to be pat and then do a semantic check since that makes it possible to improve error messages (by running full global inference) and it makes it possible to use it for proc macro attributes (this would be nice for #[proptest]).

Is @joshtriplett OK with that tho?

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Jan 15, 2019

@rfcbot concern ref-binding-pat

So I finally made time to read the RFC. Thanks @Centril for your attention to detail etc. Beautiful as ever. I'm going to post a few comments/concerns. Let me start with a nit:

The discussion talks about the relative precedence of "ref patterns" and type ascription and gives a few examples that don't, I think, quite match for Rust works today. The way I understand ref bindings and so forth, anyhow, is that the ref keyword is part of the binding. That is, the RFC discusses ref (x: T) as if it were a possibility, which implies that you have a grammar production like Pat = 'ref' Pat, but actually I think it is more like Pat = 'ref' IDENT. So ref (x: T) just cannot be, and instead ref x: T must be parsed as (ref x): T. Note that this implies that the type of x is in fact &T. This matches code like let ref x: u32 = 5, which is legal (and produces a x of type &u32).

@Centril

This comment has been minimized.

Copy link
Contributor

Centril commented Jan 15, 2019

@nikomatsakis I think you are right; @haslersn helpfully pointed out the same error here. Happy to see we've come to the same conclusion; I'll revise the RFC in a bit... :)

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Jan 15, 2019

@rfcbot concern where-does-coercion-occur

Next concern: the RFC is mostly silent, from what I can tell, on the topic of where coercions can occur. But we know it's very important. I think we should get to the bottom of this. In general, I'm open to an argument that this need not block the RFC and we can work it out during implementation, but I wanted to raise a few questions and see them noted somewhere.

In the previous coercion RFC, we uncovered a soundness concern around references and coercions. I didn't find any direct mention of this concern in this RFC, though it does include a mention of temporaries and a reference to the old RFC that might've been alluding to the problem. But I think we should resummarize it.

In a case like &(x: T), it is very important to be clear about whether this is producing a reference to the variable x or to a temporary containing the value of x coerced to T (the old RFC said that it is producing a reference to the variable x, which implies that -- in "borrowing" contexts like this -- : must act as "type equality" and not permit coercions). We need to work out the full details of this (e.g., consider (x: T).foo() -- here, there could be auto-refs, so what do we do?), but this may be best done through implementation and after the fact documentation.

Similarly, this sort of case triggers coercions today:

let x: T = ...

But it's not clear to me if we intend for a case like this to trigger a coercion, I presume not:

let Foo(x: T) = ...

Allowing coercions on "inner type ascriptions" like this feels like a big "shift" to me in terms of how pattern matching works, since it would mean coercing "mid-match" -- I'm not even sure how this would work for matches with multiple arms:

match foo {
    Some(x: T) if ... => { /* do we coerce or not?? */ }
    Some(x)=> { .. }
    None => { ... }
}

UPDATE: It occurs to me that another distinction of top-level type ascriptions from "inner ones" might have to do with rust's bidirectional type checking. We use top-level type ascriptions today as a kind of "hint" when checking the initializer expression for a let. Then we go and use the type of that expression when checking patterns. We would have to make a big shift here is we were going to use the "inner" type ascriptions in pattern matching in the same way -- and in fact I think it is incompatible with how the "binding mode" stuff works, since that relies on knowing the type of the value being matched to know when to insert implied & patterns.

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Jan 15, 2019

@rfcbot concern what-do-ascripted-types-constrain

It should be noted that we've actually implemented, as part of the NLL work, a somewhat generalized form of pattern type ascription similar to this. Doing so raised a number of interesting questions about just what these ascripted types actually mean.

To start, a given pat: T ascription T seems to play two roles. First, it acts as a kind of upper bound on the type of the value being matched (here I am assuming that any coercions take place long before). So if you have let pat: T_pat = , then basically the type T_exprof the expression (after coercion) must be a subtype ofT_pat: T_expr <: T_pat`.

But the ascription also serves as a declaration of the type of bindings within pat. So let x: &'static u32 = ... means that the type of x is &'static u32, not "some supertype of &'static u32". Similarly let (x,): (&'static u32,) = .. seems to imply that the type of x is &'static u32.

So the ascribed kind of propagates two ways: first, it bounds the expression, but then it propagates into the pattern as well, where it serves as a precise bound on the type of any bindings we encounter.

Note that in some patterns, you may not have any bindings at all (e.g., let _: T = ...), and so this doesn't come into play. In other cases, not all parts of the type may be relevant to bindings -- e.g., in let (x, _) = (u32, f32), the f32 part is irrelevant.

But this raises some questions. Honestly I forget some of them, but here is an obvious one: What does it mean if there are two nested ascriptions? For example, let (x: &'a u32): &'static u32 = ...? This isn't really a problem for the "supertype" part of the bound (i.e., the bound on the expression), but it's less clear what the type of x should be here -- the innermost ascription?

I guess the second case is whether or not there are cases where we cannot determine the type of bindings given the outer ascription. I know we have some FIXMEs left in the code in this area.

UPDATE: As noted in this comment, we also need to indicate how default binding modes affect things here.

@cramertj

This comment has been minimized.

Copy link
Member

cramertj commented Jan 15, 2019

Note that the ref-binding issue also applies to default-binding-modes, leading to things like this:

fn take_u32(_: u32) {}

let (x: u32, y: u32) = &(1, 2);
take_u32(x); // ERROR: expected `u32`, found `&{integer}`, consider dereferencing

I don't think this is bad enough to convince me that default-binding-modes was a bad decision or that we shouldn't also have type ascription, but it's definitely a pretty confusing and unfortunate combination of features.

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Jan 15, 2019

@cramertj

I guess this is another indication of something that wasn't very well specified in the RFC. I had sort of assumed that

let (x: u32, y: u32) = &(1, 2)

would be an error, because it would imply that the type of x must be u32 (when in fact it is &u32).

But now that you mention it, it seems pretty important to clarify how the "binding modes" interact with both parts of the type ascription. I can't find the text now, but I feel like the RFC implied to me that the type ascription expected the & from the a binding mode to be made explicit.

@cramertj

This comment has been minimized.

Copy link
Member

cramertj commented Jan 15, 2019

@nikomatsakis

I feel like the RFC implied to me that the type ascription expected the & from the a binding mode to be made explicit.

To be sure I understand, you'd expect that all of the following would work?

fn u32_refs(_: &[&u32]) {}

let (x: &u32, y: &u32): &(u32, u32) = &(1, 2);
u32_refs(&[x,y]);
let &(ref x: u32, ref y: u32): &(u32, u32) = &(1, 2);
u32_refs(&[x,y]);

struct T { x: u32 }
let T { x: &u32 }: &T = &T { x: 5 };
u32_refs(&[x]);
let &T { ref x: u32 }: &T = &T { x: 5 };
u32_refs(&[x]);
@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Jan 18, 2019

@cramertj yes. specifically, when I see x: &u32, I expect that to mean "the type of x will be &u32". But I'm mildly discomforted by the idea that the type of the value being matched is sort of "modified" to account for the &, in part because it feels somewhat inconsistent with how (e.g.) the type of foo.bar is the type of the field bar, even if Foo has type &Foo (but that inconsistency is basically because we don't have a similar "default binding mode" on field access)

@cramertj

This comment has been minimized.

Copy link
Member

cramertj commented Jan 18, 2019

That seems at odds to me with ref x: u32 binding x to &u32, but I suppose that's just one more reason for folks to stop using ref, and one more reason for me to open an RFC to make it so that let (&x: &u32): &(u32,) = &(5,); // typeof(x) == u32 works.

@withoutboats

This comment has been minimized.

Copy link
Contributor

withoutboats commented Jan 20, 2019

@rfcbot concern confusing-code
@rfcbot concern conflict-with-named-arguments

I basically think all of the drawbacks this RFC enumerates are very compelling, and overwhelm the motivation in my opinion. I am not in favor of allowing type ascription (at least, not with this syntax) in arbitrary expression and pattern positions. I would be in favor of removing the feature from nightly and "rescinding" the previous RFC.

I'd add that in addition to the confusion around struct syntax that the RFC lists (as "sub-optimal experience in named fields"), I think the confusion that can arise when used in the middle of method chaining is also troubling:

foo.iter().map(|x| x.bar()).collect(): Vec<_>.as_ref()

I think everyone agrees that this code should be formatted differently, but I see it as a draw back that this RFC makes it valid at all.

I would be in favor of extending type ascriptions to more specific locations that have these qualifications:

  1. Like the current use of type ascription on function arguments and let bindings, these positions do not strongly imply (which is not to say they could not imply at all) that type ascription is generalized. For example, I would consider allowing type ascription after any function call to strongly imply that type ascription can be in any expression, because function calls do not differentiate themselves strongly from other expressions. In contrast, the current positions of type ascription do not strongly imply (in my opinion) that any pattern or expression can be type ascribed, at best they softly imply it.
  2. There's no potential for confusing code by applying type ascriptions in these locations.
  3. There's no conflict with potential syntaxes for named arguments.

That is, I am in favor of adding type ascriptions in positions that don't have the drawbacks of generalized type ascriptions but also don't lead users to be very surprised that type ascription is not permitted in every position.

Some positions that I think this applies to:

  1. if let $pat: **$ty** = $expr { }
  2. while let $pat: **$ty** = $expr { }
  3. match $expr: **$ty**
  4. for $pat: **$ty** in $expr: **$ty**

The headers of control flow constructs, in other words.

@withoutboats

This comment has been minimized.

Copy link
Contributor

withoutboats commented Jan 20, 2019

I also don't think this RFC is of high enough priority to the Rust road map to devote a lot of attention to reaching consensus. If people don't feel devastated by it, I would propose accepting a subset like I just proposed under a new feature flag which we can probably reach consensus on quickly to stabilize and postponing further action.

@mark-i-m

This comment has been minimized.

Copy link
Contributor

mark-i-m commented Jan 21, 2019

Unfortunately, my primary use cases for this are (a) “type debugging” and (b) disambiguation for generic function calls. And I can’t recall any time when I’ve wanted ascription in the positions you’ve mentioned.

@burdges

This comment has been minimized.

Copy link

burdges commented Jan 21, 2019

I'd think the identity function #2306 already provides this ala x.identity::<Foo>().bar() right?

Also, I'd think "trait ascription" might help address additional concerns, so let f : impl FnOnce(Foo) -> Bar = |x| { .. }; but not sure if it makes sense.

@scottmcm

This comment has been minimized.

Copy link
Member

scottmcm commented Jan 21, 2019

@burdges identity is a function, not a method. And identity::<Vec<T>>(x.collect()) is definitely not better than x.collect::<Vec<T>>(), so I don't see it as helping with (b).

@withoutboats Can you elaborate on whether your feelings are any different between patterns and expressions? I was pondering another possible subset where the bindings in patterns could be ascribed, since those are already slightly special (see mut), to keep the Ok(x: i32) => arms.

@comex

This comment has been minimized.

Copy link

comex commented Jan 21, 2019

foo.iter().map(|x| x.bar()).collect(): Vec<_>.as_ref()

I think everyone agrees that this code should be formatted differently, but I see it as a draw back that this RFC makes it valid at all.

I agree, but I think the solution is to remove the paragraph about making it valid; I don't think it's essential to this RFC.

@valff valff referenced this pull request Jan 21, 2019

Open

Resolve `await` syntax #57640

@withoutboats

This comment has been minimized.

Copy link
Contributor

withoutboats commented Jan 21, 2019

Can you elaborate on whether your feelings are any different between patterns and expressions?

My concern with patterns is with struct patterns, where users will get weird behavior like trying to write Foo { field: i32 } and getting a field bound to i32. Otherwise, I'd agree that allowing a type ascription on bindings does not seem problematic.

disambiguation for generic function calls.

The reason you can't use turbofish is that the parameter is on the trait, right? That's been my experience, and its essentially always been Into (maybe sometimes AsRef). And I find that I always do the same thing: I pull that part of the expression into its own binding. In my opinion, this has improved my code: if the compiler can't figure out the types of an intermediate into conversion, this expression is probably doing too much.

@mark-i-m

This comment has been minimized.

Copy link
Contributor

mark-i-m commented Jan 21, 2019

The reason you can't use turbofish is that the parameter is on the trait, right? That's been my experience, and its essentially always been Into (maybe sometimes AsRef). And I find that I always do the same thing: I pull that part of the expression into its own binding. In my opinion, this has improved my code: if the compiler can't figure out the types of an intermediate into conversion, this expression is probably doing too much.

Yep, I think that describes my experience as well, butIterator::collect is the method that I bump into most often, and I think we might disagree that creating a new binding is a good solution.

  • Creating a new binding sometimes persists a borrow or literal value longer than I would like, which means creating even more bindings or adding nested blocks.
  • Creating a new binding means there is a variable in scope that has no meaningful interpretation; it’s just the state of computation at the point where the compiler needed a type hint.
  • The type for the conversion is often either (a) obvious from the context (e.g. if I’m using Vec everywhere in the rest of the expression) or (b) irrelevant (I don’t care what type is used for some intermediate state; I just want something iterable that owns its data). In both of these cases, additional bindings don’t help much. They just disrupt the flow of combinators. The reader of the code is not aided by them.
  • Creating new bindings make refactoring/code changes a pain. A common workflow for me is this: I write something simple with Vec but need to add a binding somewhere to help the compiler. Then, I realize I want change the format of input data slightly and I need to insert a new combinator somewhere. Now the original binding I introduced is no longer necessary, but a new one is needed somewhere else.
@burdges

This comment has been minimized.

Copy link

burdges commented Jan 21, 2019

We could easily add a trait

pub Identity { fn identity(self) -> Self { self } }
impl<T> Identity for T {}

that provided this functionality for type debugging. We should really have traits like this for asserts and things anyways.

There is no substantial improvement in writing x.collect(): Vec<T> instead of x.collect::<Vec<T>>() obviously. I agree however that foo(...) : Vec<T> could be far clearer than foo::<_,_,_,Vec<T>,_,_>(...), but actually foo(...).identity::<Vec<T>>() works find in such cases.

@mark-i-m

This comment has been minimized.

Copy link
Contributor

mark-i-m commented Jan 21, 2019

TBH, using identity for this feels like a bit of a hack to me.

@alexreg

This comment has been minimized.

Copy link

alexreg commented Jan 21, 2019

There is no substantial improvement in writing x.collect(): Vec<T> instead of x.collect::<Vec<T>>() obviously. I agree however that foo(...) : Vec<T> could be far clearer than foo::<_,_,_,Vec<T>,_,_>(...), but actually foo(...).identity::<Vec<T>>() works find in such cases.

When partial turbofish finally arrives, this is a moot point though. I'm not sure of the status of this however. @Centril?

@comex

This comment has been minimized.

Copy link

comex commented Jan 22, 2019

We could easily add a trait

pub Identity { fn identity(self) -> Self { self } }
impl<T> Identity for T {}

That still would not allow .identity::<i32>() because turbofish on a method only allows specifying parameters on the method itself, not the trait the method came from. You'd have to do something hacky like

pub trait Is<T> {}
impl<T> Is<T> for T {}
pub trait HackyIdentity: Sized {
    fn identity<T>(self) -> Self where Self: Is<T> {
        self
    }
}
impl<T> HackyIdentity for T {}
@mark-i-m

This comment has been minimized.

Copy link
Contributor

mark-i-m commented Jan 24, 2019

I still prefer generalized ascription, but one alternative that solves part of the motivation of this issue might be to allow turbofish to be used for trait methods. For example something like this (syntax is bikeshedable):

trait Bar {
  fn bar<T, U, V>(args: (T, U, V));
}

foo.bar::<as Bar, T, U, V>(args);
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment