RFC: Permit _ in type aliases #2524

Open
wants to merge 8 commits into
base: master
from

9 participants
@Centril
Contributor

Centril commented Aug 19, 2018

🖼️ Rendered

📝 Summary

Permit type aliases and associated types to have their types be inferred in such a way that their nominal types be transparent as opposed to opaque like so:

type Foo = _;

impl Iterator for Bar {
    type Item = _;
}

You may also optionally constrain a type alias or associated type with a bound by writing type Alias: Bound = <definition>;. This provides the minimum capability of the alias as opposed to the maximum capability (see RFC 2071).
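Since `type Foo = _;` is not accepted by today's compiler, the transparent/opaque distinction can only be approximated on stable Rust. The following is a rough sketch under that assumption: the alias `Doubled` is written out by hand as a stand-in for what inference would fill in, and the names `Doubled`, `double`, `transparent` and `opaque` are hypothetical, not taken from the RFC.

```rust
use std::iter::Map;
use std::ops::Range;

// Hand-written stand-in for what `type Doubled = _;` might infer (assumption).
type Doubled = Map<Range<u32>, fn(u32) -> u32>;

fn double(x: u32) -> u32 {
    x * 2
}

// Transparent: callers see the concrete type and every trait it implements.
fn transparent() -> Doubled {
    (0..4).map(double as fn(u32) -> u32)
}

// Opaque: callers may rely only on the listed `Iterator` bound.
fn opaque() -> impl Iterator<Item = u32> {
    (0..4).map(double)
}

fn main() {
    // `len` (ExactSizeIterator) and `rev` (DoubleEndedIterator) are usable
    // through the alias because the concrete type is visible.
    assert_eq!(transparent().len(), 4);
    assert_eq!(transparent().rev().collect::<Vec<_>>(), vec![6, 4, 2, 0]);

    // `opaque().len()` would be rejected: only `Iterator` is promised.
    assert_eq!(opaque().sum::<u32>(), 12);
}
```

The cost of the transparent version today is having to spell out `Map<Range<u32>, fn(u32) -> u32>` by hand, which is the part the RFC proposes to elide.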

💖 Thanks

To @varkor for reviewing the draft version of this RFC.

@Centril Centril added the T-lang label Aug 19, 2018

@Centril Centril self-assigned this Aug 19, 2018

Consider that we have some sort of custom derive macro for a trait with an
associated type. Also consider that the derive macro allows customization
of the derived implementation via attributes on the type definition.
As an example, we have `#[derive(Arbitrary)]` for [proptest_drive]:

@kennytm

kennytm Aug 19, 2018

Member

proptest_derive ? 🤔

@Centril

Centril Aug 19, 2018

Author Contributor

Thanks, fixed :)

@drXor

drXor commented Aug 19, 2018

Today, @Centril pointed out to me that fn id<T>(x: T) -> T { x } is not the same as move |x| x. The closure has the following inference variables: move |x: ?0| -> ?1 { x }; thus, if ?0 and ?1 are inferred as T and U, where T can be implicitly coerced to U (e.g., T = &i32, U = &dyn Copy), then we no longer have id taking T to T!
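To make the coercion point concrete, here is a small sketch on stable Rust. It swaps in `Box<i32>` and `Box<dyn Debug>` for the pair of types (an assumption made so that the example is a valid trait object and avoids lifetime annotations); the name `widen` is hypothetical.

```rust
use std::fmt::Debug;

// A generic identity function: input and output are the same type `T`.
fn id<T>(x: T) -> T {
    x
}

// A closure whose body is just `x`, yet whose output type differs from its
// input type, because the return position is a coercion site.
fn widen(n: i32) -> String {
    let not_id = |x: Box<i32>| -> Box<dyn Debug> { x };
    format!("{:?}", not_id(Box::new(n)))
}

fn main() {
    // `id` can only ever return the type it was given.
    let same: Box<i32> = id(Box::new(7));
    assert_eq!(*same, 7);

    // The closure took a `Box<i32>` and handed back a `Box<dyn Debug>`.
    assert_eq!(widen(7), "7");
}
```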

Polymorphically (with respect to the type parameters of an enclosing function) asserting that a closure takes the same type in as out is currently not expressible (at least, in a local context). However, this RFC allows a way to do just that:

let id = { type T = _; move |x: T| -> T { x } };

The semantics of type T = _; in function context are pretty clear. We can think of type T = _; as introducing a named inference variable T. This lets us assert that several types are equal without having to actually name the type, while allowing inference to actually choose it.
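For comparison, something close to a named inference variable can be emulated on stable Rust with a helper function whose signature ties the closure's input and output together. This is only a sketch of a workaround, not what the RFC proposes; `same_in_out` and `double_via_closure` are hypothetical names.

```rust
// Hypothetical helper: the bound `Fn(T) -> T` forces the closure's input and
// output to be one and the same type `T`, which inference then picks.
fn same_in_out<T, F: Fn(T) -> T>(f: F) -> F {
    f
}

fn double_via_closure(n: i32) -> i32 {
    // `T` is never named here; it is inferred as `i32` from the call below.
    let id = same_in_out(|x| x);
    id(n) * 2
}

fn main() {
    assert_eq!(double_via_closure(21), 42);
}
```

A closure that coerces its argument to a different output type would be rejected by `same_in_out`, which is the assertion the block expression with `type T = _;` is meant to express inline.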

In general, this feature lets us constrain closures better than we could before. In particular, I believe that the following are equivalent (up to technicality):

run({ fn foo<T1: C1, ..>(x1: U1, ..) -> R { .. }; foo })
run({
    type T1: C1 = _; // ..
    move |x1: U1, ..| -> R { .. }
});

Now, I should point out that this is not the same as higher-ranked types (which is why I said "up to technicality" above). The example { type T = _; move |x: T| -> T { x } } is typed according to whatever T is inferred as (recall that T here is basically an inference variable), so the closure is monomorphized only once per monomorphization of the enclosing function, as opposed to writing something like

let id: for<T> impl Fn(T) -> T = for<T> move |x: T| -> T { x };

where each use of id can trigger a new monomorphization of the closure.
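The difference is observable on stable Rust today: a generic function can be instantiated at several types, whereas a closure's parameter type is fixed by inference once. A minimal sketch, with hypothetical function names:

```rust
// Generic function: every call site may pick its own `T`.
fn id<T>(x: T) -> T {
    x
}

fn generic_twice() -> (i32, &'static str) {
    (id(1), id("a"))
}

fn closure_once() -> i32 {
    // The closure's parameter type is a single inference variable,
    // resolved here to `i32`.
    let id = |x| x;
    // Adding `id("a");` here would be a type error: the closure is not
    // polymorphic.
    id(1)
}

fn main() {
    assert_eq!(generic_twice(), (1, "a"));
    assert_eq!(closure_once(), 1);
}
```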

If I think of more examples where introducing named inference variables is useful, I'll note them.

@mark-i-m

Contributor

mark-i-m commented Aug 20, 2018

I feel that this feature would confuse new users because of its similarity in use to existential types.

Also, given that we don't have that much experience with existential types yet, I'm not convinced that listing all bounds on an existential type is unscalable. Can you provide examples of crates where this proved unscalable?

@Centril

Contributor Author

Centril commented Aug 20, 2018

I feel that this feature would confuse new users because of its similarity in use to existential types.

Can you elaborate on this?

If I were to write the documentation for type Orange = _; and type Orange = impl Lime; I would explain the latter first and the former second. I would then explain terms such as inferred, opaque, transparent and then include the list of dos and don'ts that I've included in the RFC. By the time the user has read this documentation it should be clear what feature should be used for what just as I think I have made clear in this RFC (if I have not, please highlight the sections that need improvement).

Fundamentally, type Orange = impl Lime; is about encapsulation while type Foo = _; is about eliding implementation details and fully naming the unnameable.

Also, given that we don't have that much experience with existential types yet, I'm not convinced that listing all bounds on an existential type is unscalable.

Can you provide examples of crates where this proved unscalable?

This is not possible; existential type just got into nightly and is currently far away from stable.
However, I think the examples in the motivation show how quickly it gets unmanageable if you consider adding more existential types.

Also note that if you have a situation such as:

trait Avocado<T> { ... }

struct Tomato<U>(...);

impl<T: TraitA + TraitB, U> Avocado<T> for Tomato<U> { ... }

and you want to create an existential type for Tomato<U> and name all the traits that Tomato<U> affords, then you will have to write:

existential type RedOnion<T: TraitA + TraitB>: Avocado<T> + all_the_other_traits;

You have to hoist the type parameter of Avocado up to the existential type because there is an unbounded number of bounds that Tomato<U> satisfies. With _ there is no need to mention T as a type parameter.

Worse still, consider that you have:

existential type Salt<T>: Debug;

In this case, it could be that Salt<T> is Pepper for some specific Ts but not others. Here you run into conditional bounds and you'd have to write something akin to:

existential type Salt<T>: Debug + Pepper if T: Lettuce;

However, not even this would suffice in all cases, because there may be no bound such as Lettuce with which to constrain T; it may be, for example, that Salt<T> is Pepper if and only if T is exactly Banana.

With type Orange = _; this would not be a problem.
Also consider the example of the custom derive macro in the motivation where you run into similar problems.
This demonstrates that type Orange = _; adds fundamental capabilities to the language which are inexpressible through other means.
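The conditional-bound problem can be seen on stable Rust with return-position impl Trait standing in for an existential type. The sketch below is an illustration under that assumption; `Shaker`, `opaque` and `transparent` are hypothetical names.

```rust
use std::fmt::Debug;

// `derive(Clone)` produces a conditional impl:
// `Shaker<T>: Clone` holds only when `T: Clone`.
#[derive(Debug, Clone)]
struct Shaker<T>(T);

// Opaque: only `Debug` is promised, so the conditional `Clone` impl is hidden.
fn opaque<T: Debug>(t: T) -> impl Debug {
    Shaker(t)
}

// Transparent: the concrete type is visible, so conditional impls carry over.
fn transparent<T>(t: T) -> Shaker<T> {
    Shaker(t)
}

fn main() {
    let a = transparent(5);
    let b = a.clone(); // fine: `Shaker<i32>: Clone` because `i32: Clone`
    assert_eq!(format!("{:?}", b), "Shaker(5)");

    let o = opaque(5);
    // `o.clone()` would not compile, even though the hidden type is `Clone`.
    assert_eq!(format!("{:?}", o), "Shaker(5)");
}
```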

@oli-obk

Contributor

oli-obk commented Aug 20, 2018

The RFC does not mention the semver problems associated with this feature. The reason impl Trait exists is to hide the real type and offer a guaranteed interface without exposing implementation details of the function body. If this RFC were accepted, changes to a function's body would be able to change its signature significantly without this necessarily being noticed by the compiler or the test suite.

impl Trait specifically permits this kind of body exposure for Send and Sync, but nothing else. This RFC basically proposes to add a feature that exposes the entire type, including inherent methods and all trait implementations. This means that it would be impossible to change a function's body from x.into_iter().filter(foo).map(bar) to x.into_iter().filter_map(|x| if foo(&x) { Some(bar(x)) } else { None }).
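As a sketch of the refactoring in question on stable Rust: behind an opaque return type the two bodies are interchangeable, even though their concrete types (Map<Filter<..>, ..> versus FilterMap<..>) differ. The function names and the even/times-ten logic are hypothetical stand-ins for foo and bar.

```rust
// Version 1: `filter` followed by `map`.
fn evens_times_ten_v1(xs: Vec<i32>) -> impl Iterator<Item = i32> {
    xs.into_iter().filter(|x| x % 2 == 0).map(|x| x * 10)
}

// Version 2: the same behaviour with `filter_map`. The hidden concrete type
// changes, but the opaque signature does not.
fn evens_times_ten_v2(xs: Vec<i32>) -> impl Iterator<Item = i32> {
    xs.into_iter()
        .filter_map(|x| if x % 2 == 0 { Some(x * 10) } else { None })
}

fn main() {
    let input = vec![1, 2, 3, 4];
    let a: Vec<i32> = evens_times_ten_v1(input.clone()).collect();
    let b: Vec<i32> = evens_times_ten_v2(input).collect();
    assert_eq!(a, vec![20, 40]);
    assert_eq!(a, b);
}
```

With a transparent inferred alias as the return type, downstream code could name or depend on the concrete iterator type, so swapping one body for the other could become a breaking change.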

I also think that a viable alternative is to offer a lint which suggests to expose common traits like Clone, Display, ExactSizeIterator, ... for impl Trait return types, if the underlying type implements those traits

@Centril

Contributor Author

Centril commented Aug 20, 2018

@oli-obk

The RFC does not mention the semver problems associated with this feature.

It does. See the last paragraph in the drawbacks.
I can elaborate on the drawbacks using your example.

If this RFC were accepted, changes to a function's body would be able to change its signature significantly without this necessarily being noticed by the compiler or the test suite.

Yep; that's the inevitable consequence. However, a tool such as semverver should be able to capture such breakage. The RFC is clear that this feature is not a tool that should be used for encapsulation.

This means that it would be impossible to change a function's body from x.into_iter().filter(foo).map(bar) to x.into_iter().filter_map(|x| if foo(&x) { Some(bar(x)) } else { None }).

In the general case yes; but this assumes that you have exported the type alias publicly (and that change is somewhat likely). If it is just an internal implementation detail, then you should be able to change such details if you don't rely on the difference between filter_map(..) and filter(..).map(..) internally.

I also think that a viable alternative is to offer a lint which suggests to expose common traits like Clone, Display, ExactSizeIterator, ... for impl Trait return types, if the underlying type implements those traits

That would realistically only work for libstd traits but not for user defined traits.
Furthermore, it does not even work for libstd traits if you have the conditional scenarios, which I've noted above and which RFC 2071 notes as well.
As I've also noted, this would not work at all for proc derive macros.

All in all, I'm not sure I agree that this is a viable alternative.
But I'll note it in any case.

@Nemo157

Contributor

Nemo157 commented Aug 20, 2018

Another possible mitigation strategy is to lint in the case of public aliases which include _ in their definitions. However, this will not help for associated types.

Why could there not be a lint for traits which have an associated type including _ where the trait and implementing type are both public (or potentially public for generic implementations)? I'm in agreement with @oli-obk that the potential for accidental breakage is high enough that, if this is implemented, I would expect the initial implementation to include these lints at least at default-warn level.

Assuming the lints are there, I'm not sure if I'm pro or con this idea. I feel like we should get the experience with existential types first, then consider whether the overhead of using them is too high and something like this is necessary (along with looking at alternatives for existential types that allow for the conditional bounds you mention).

@Centril

Contributor Author

Centril commented Aug 20, 2018

@Nemo157

Why could there not be a lint for traits which have an associated type including _ where the trait and implementing type are both public (or potentially public for generic implementations).

@oli-obk and I discussed this a bit further on Discord, and I think that this lint would be reasonable and I'll change the RFC accordingly. (Do note the unresolved question which touches upon this...).

I feel like we should get the experience with existential types first, then consider whether the overhead of using them is too high and something like this is necessary (along with looking at alternatives for existential types that allow for the conditional bounds you mention).

Again, note that proc macros could not, in the general case, make use of conditional bounds (which also come with their own complexities) because the macro has no way to determine up front what bounds to conditionally add (unless it just lists a bunch of traits, but that would be ad hoc, and then you have more or less regained the semver problems you were trying to avoid). Conditional bounds also have no way to leak inherent implementations, which could be useful. Comparatively, _ in type aliases is dead simple conceptually and also simplifies implementations of proc macros with associated types in as simple a manner.

Wrt. experience, I think that we can experiment with both concurrently.
The design of type Foo = _; also affects the syntax for existential type as using _ to be transparent means that it can't also be opaque (and this idea has been floated in #2515).

@mark-i-m

Contributor

mark-i-m commented Aug 20, 2018

@Centril

If I were to write the documentation for type Orange = _; and type Orange = impl Lime; I would explain the latter first and the former second.

The syntactic difference between the two is not very large, and the semantic difference is IMHO subtle. I don't think there are many people who would intuitively think that the difference is that one is opaque and one is transparent. I think you did a fine job explaining it in the "Guide" section of the RFC, but my worry is about somebody coming across type Foo = _ in some code and just being confused (or worse, attributing the wrong meaning to it).

This is not possible; existential type just got into nightly and is currently far away from stable.
However, I think the examples in the motivation show how quickly it gets unmanageable if you consider adding more existential types.

I'm not really convinced that these cases come up frequently enough in practice. I can only recall one time when I ever had that many bounds on a type. Sure, it was annoying, but given how uncommon it is, having a whole language feature for that seems like overkill. I would much rather wait and see if this turns out to be a problem in real code.

@mark-i-m

Contributor

mark-i-m commented Aug 20, 2018

To elaborate a bit further, I really want to avoid adding lots of subtle features to the type system. Learning Rust's type system is already hard for many people, and I think the bar for adding new features should be rather high: there should be a demonstrated, widespread need for the feature in the ecosystem.

I'm not denying that such a need may come, but I think we should wait a bit and see if people actually run into the problems motivating this RFC.

@Centril

Contributor Author

Centril commented Aug 20, 2018

@mark-i-m

The syntactic difference between the two is not very large, [..]

I have to disagree here. While _ is a single token that you use, impl Trait has both a keyword as well as a bound that the user has to provide. There is nothing to configure for _ but there is for impl Trait.

[..], and the semantic difference is IMHO subtle. I don't think there are many people who would intuitively think that the difference is that one is opaque and one is transparent.

That's fair enough; but I think the difference would be noticeable once you tried to use the constructs.
We also have a book and documentation for a reason, so we can use those to explain how the language works. For this feature, I believe that things can be explained in an intuitive manner and also grouped appropriately (for example, you can talk about uses of _ and impl Trait in different places at the same time).

I think you did a fine job explaining it in the "Guide" section of the RFC, but my worry is about somebody coming across type Foo = _ in some code and just being confused (or worse, attributing the wrong meaning to it).

First, thank you :)

With respect to being confused or attributing the wrong meaning to it, I think such a risk is low. If you already understand what let foo: _ = ...; means, then the step to type Foo = _; is quite small and the meaning of _ is the same in both contexts. The argument for using impl Trait in both type aliases and elsewhere, as proposed in #2515, is the same.

The nice thing about having a stabilization period is that these hypotheses can be tested.
Some questions are just unanswerable without trying things out.

To elaborate a bit further, I really want to avoid adding lots of subtle features to the type system. Learning Rust's type system is already hard for many people, [..]

I don't think of this as adding a new feature to the type system. Rather, we would be extending a feature that already exists (see let foo: _ = <expr>;) to another context. If anything, I argue that this is making the type system simpler and more uniform and eliminates questions such as "I can use _ in let bindings, why can't I use it over there" for one context.
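For reference, this is how `_` already behaves in let bindings on stable Rust: it marks a hole that inference fills in, either for a whole type or for part of one. A minimal sketch; the function name `squares` is hypothetical.

```rust
fn squares() -> Vec<u64> {
    // Partial hole: the container is named, the element type is inferred.
    let v: Vec<_> = (1..=3u64).map(|x| x * x).collect();
    v
}

fn main() {
    // Full hole: the whole type of `total` is inferred (here `u64`).
    let total: _ = squares().iter().sum::<u64>();
    assert_eq!(total, 14);
}
```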

I would much rather wait and see if this turns out to be a problem in real code.
[..]
I'm not denying that such a need may come, but I think we should wait a bit and see if people actually run into the problems motivating this RFC.

I can only speak for myself, but I have a real world need for this today (the custom derive macro in the motivation) and type Foo = impl Trait; is not a good substitute for this because it would deprive the users of the macro of useful properties (such as Clone).

@burdges

burdges commented Aug 20, 2018

I'd use this if it existed. Another idea is controlling the leakage, à la

existential type Alias: ?Trait1 + ?Trait2;

Or strange things like

existential type Alias: ?Crate;

where ?Crate means "Any trait belonging to the current downstream crate leaks here for its own usage".

@dhardy

Contributor

dhardy commented Dec 1, 2018

This proposal would be very useful to me (motivation).

However, I propose the restriction: inferred types and all derived types should not be allowed to be pub; i.e. they should not be allowed to appear in the crate API.

@Centril Centril referenced this pull request Dec 27, 2018

Open

RFC: Elide array size #2545

@graydon

graydon commented Jan 12, 2019

Opposed. Inference has already gone too far in the language with RFC 2071; the things that are impossible in the status quo are ok to remain impossible. They do not warrant further inference. The cognitive load of allowing / encouraging yet more inferred bindings is too high.

@Centril

Contributor Author

Centril commented Jan 12, 2019

@graydon

are ok to remain impossible.

Is there any particular reason you think the use cases (e.g. macros) are not well justified?

The cognitive load of allowing / encouraging yet more inferred bindings is too high.

I understand the semantic versioning concerns people have, they are quite legitimate, wherefore this proposal needs some rework (along the lines of #2524 (comment)).

Cognitive load seems different tho. I think this reduces cognitive load as let foo: _ = ...; is already possible. By making impl Trait and _ work similarly (except for opacity/transparency) and be allowed in similar places, I think cognitive load can be reduced and learning can be made easier. This makes for a system with rules that are less surprising with more transferable knowledge.

@graydon

graydon commented Jan 12, 2019

Macros are an escape hatch that happens at an inopportune phase in the compiler, and as such they always run up against whatever the current limits of the language are. New non-macro language features should very rarely be motivated by making a particular macro easier to write: that way leads to unlimited growth.

The analogy with local inference is incorrect, and users will largely have no idea how impl Trait works, so analogy to aspects of its function (its opacity or lack thereof) cannot be leveraged and is not relevant. It's already beyond the cognitive limits of most users, as clearly evidenced by most discussion around it.

@dhardy

Contributor

dhardy commented Jan 12, 2019

@graydon I would like to be able to type things like the following but make S a concrete type:

fn ident_i32(x: i32) -> i32 { x }
struct S<F: Fn(i32) -> i32> {
    f: F
}
let s: S<_> = S { f: ident_i32 };

Since closure / function types are impossible to express, some type of auto-derivation is needed to do this. typeof(ident_i32) might be enough.
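One stable-Rust workaround, sketched here under the assumption that a little indirection is acceptable, is to store a function pointer instead of the unnameable function item type. That makes the struct type nameable, at the cost of losing the zero-sized, statically dispatched function item. `NamedS` and `make_named` are hypothetical names.

```rust
fn ident_i32(x: i32) -> i32 {
    x
}

struct S<F: Fn(i32) -> i32> {
    f: F,
}

// Nameable because `fn(i32) -> i32` is a function pointer type,
// not the anonymous type of the item `ident_i32`.
type NamedS = S<fn(i32) -> i32>;

fn make_named() -> NamedS {
    S { f: ident_i32 as fn(i32) -> i32 }
}

fn main() {
    // Locally, `_` already lets inference pick the unnameable item type.
    let local: S<_> = S { f: ident_i32 };
    assert_eq!((local.f)(3), 3);

    // Outside a function body, the type has to be written out,
    // which is what the alias provides.
    let named: NamedS = make_named();
    assert_eq!((named.f)(4), 4);
}
```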

Why I want to do this is complicated, and I'm not actually sure this is the best approach. See this example. Here, make_widget constructs a custom struct, implements some stuff for it, and returns an instance; the type of this struct is often not expressible. When this inexpressible type gets placed in another struct field, that struct's type is also inexpressible, and the field's type is reduced to a bound on implemented traits, even though the type is statically known. This means that accessing fields of the inner struct directly is impossible, which is why this example must use a SetText trait instead of simply calling self.display.display.set_text(...);.
