Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implied bounds #2089

Merged
merged 6 commits into from Sep 11, 2017

Conversation

@scalexm
Copy link
Member

scalexm commented Jul 28, 2017

Eliminate the need for “redundant” bounds on functions and impls where those bounds can be inferred from the input types and other trait bounds. For example, in this simple program, the impl would no longer require a bound, because it can be inferred from the Foo<T> type:

struct Foo<T: Debug> { .. }
impl<T: Debug> Foo<T> {
  //    ^^^^^ this bound is redundant
  ...
}

Hence, simply writing impl<T> Foo<T> { ... } would suffice. We currently support implied bounds for lifetime bounds, super traits and projections. This RFC proposes to extend this to all where clauses on traits and types.

Rendered

@camlorn

This comment has been minimized.

Copy link

camlorn commented Jul 28, 2017

When I first learned Rust, having to repeat the constraints was a pain point that I didn't understand, and it wasn't really explained anywhere that I recall. I think the real way to teach this is to just modify all the guide's examples to take advantage, because then they do what you expect.

Then maybe introduce something somewhere that explains that additional ones can sometimes be useful or spells out corner cases where you would need them that you wouldn't expect to.

@cramertj

This comment has been minimized.

Copy link
Member

cramertj commented Jul 31, 2017

Nice job on the RFC! It's clear you've put a lot of careful thought into this feature (and its implementation). I'm really excited at the possibility of not having to repeat type bounds all over the place 😄.

I'm interested in limiting the use of implied bounds to types defined in the current crate, as mentioned in the alternatives. I believe that this would address the areas where this feature is most-needed, while still giving library authors the flexibility to change their type bounds. (This flexibility made changes like RFC 1651 possible.)

@est31

This comment has been minimized.

Copy link
Contributor

est31 commented Jul 31, 2017

I've got one little question. Is this RFC limited to functions and impl blocks? Or does it also allow implying bounds for struct members? As in, is this code allowed with the RFC:

struct Set<H :Hash> { ... }

struct Bar<H> { set_of_stuff :Set<H>, }

struct Baz<H> { bar :Bar<H>, }

The summary only talks about functions and impl blocks, not about structs, so I guess its no, but maybe I've missed something.

@skade

This comment has been minimized.

Copy link

skade commented Jul 31, 2017

@est31 Would you find that desirable?

I'd see a couple of problems there, especially that you could then build huge piles of types where it's hard to find out where the bound was propagated from.

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Jul 31, 2017

@est31

I've got one little question. Is this RFC limited to functions and impl blocks? Or does it also allow implying bounds for struct members?

It does not allow implying bounds for struct members. It is plausible that we could enable that, but the current RFC draws the line around functions and impls. I wouldn't say it's a hard-and-fast rule, but as a rule-of-thumb, we've tended towards saying that struct definitions are more explicit than functions. This ensures that, when reading a function signature, you don't have to do a "deep search" to find out what is implied by (e.g.) Baz<H> -- you just look at the struct Baz<H> { .. } header, and see what where-clauses appear there.

This "rule of thumb" arose in part because, in the past, we didn't require explicit lifetime parameters on structs (they could always be elided, much as they are in functions). We found this quite confusing in practice: you would see struct Baz { x: Bar }, and to know if it had any lifetimes associated with it, you'd have to look into Bar to see that it was declared struct Bar { x: &T }. This kind of sleuthing made understanding the sources of lifetime errors very challenging.

That said, it has been pointed out (but I forgot by whom, maybe @RalfJung?) that sometimes when you make a small struct -- which is sort of a glorified tuple -- it would be convenient to infer bounds on its declaration. That seems true, but it's unclear where to draw the line precisely.

@est31

This comment has been minimized.

Copy link
Contributor

est31 commented Jul 31, 2017

Would you find that desirable?

Nope, quite the contrary in fact. That's why I was asking :).

I've wondered whether this RFC would create the "deep search" problem that @nikomatsakis explained above. Great to have us all agree on this being a bad idea :). The RFC gets my 👍 now ;)

@scalexm

This comment has been minimized.

Copy link
Member Author

scalexm commented Jul 31, 2017

@cramertj

(This flexibility made changes like RFC 1651 possible.)

So actually since the T: Copy bound was not on the Cell<T> type declaration at first but on the impl rather, this change would have been possible anyway :)

In general, I'm not a big fan of writing bounds on a type. But when you really need to (like the std::borrow::Cow example) or really want to do such a thing (e.g. for your own private types this is perfectly fine), this is where you benefit from the RFC. This means that crates which cannot afford a major version bump (like std) could still continue to write bounds on impls instead.

A problem which might arise though is when you want to use a type from another crate which does have bounds declared on it, as e.g. a private field of one of your own types. Then you're forced to have the same bounds on your type.

I'm not against the idea of limiting implied bounds to types in your current crate if it is the general consensus though.

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Aug 1, 2017

So I am very much in favor of this general idea (and have been for some time now...since 2014, apparently...sheesh!). And I am very excited about the work that @scalexm has done to realize it in practice, formalize it, and work through some of the tricky implications.

To my mind, the primary goal of this RFC is to eliminate the need to copy-and-paste redundant sets of bounds when implementing types (a random example). The RFC should also have the effect, however, of generally improving ergonomics, as also described in RFC #1927 and numerous Rust issues (e.g. rust-lang/rust#20671). Further, a nice side-effect of adopting the strategy laid out in this RFC will be fixing some of the various limitations in rustc's implementation (e.g., rust-lang/rust#20775). Overall, I think the "guts" of this RFC are a slam dunk! I'm inclined to move quickly to FCP.

However, there were some interesting "judgement calls" raised in the discussion in the lang team, and I think it would be helpful to get some feedback on these points. I think there were two such concerns raised (both related):

  1. The first is the most obvious. It concerns the fact that -- if this RFC lands -- removing bounds becomes a backwards incompatible thing to do. This might be considered surprising (although it's actually true today, for T: 'x bounds at least). This also means that sometimes to "disclose" some aspects of your private fields in your struct signature which you can then cannot remove, even if those private fields go away (as @scalexm noted).
  2. The other is one of predictability. This RFC definitely enables some kinds of "long-range" inference. For example, if I have fn foo<T>(x: Bar<T>, y: T), where Bar<T: Iterator>, I can then invoke y.next(), even though I don't (directly) have T: Iterator bound. This is natural when I am "close" to the definition of Bar, but perhaps it is surprising elsewhere (another way to look at it: removing the Bar<T> argument makes calls to y.next() stop working).
    • We have tried to limit this to the cases where we already have precedent (supertraits, inferring T: 'x requirements): informing function and impl bodies by their inputs. But this is obviously going further than we currently go.

To be honest, I think both of these things -- but especially the latter! -- are somewhat hard to judge in the abstract. They feel like concerns we might experiment with during the stabilization period. That said, we considered various ways to ameliorate these concerns, all basically focused on limiting the scope of the implied bounds:

  • Crate-local implied bounds. This could be achieved by not adding the "reverse WF" rules for a struct (or a trait) except within the crate where they are defined.
  • Impl or module local implied bounds. We could also go further and only add reverse rules for a type when checking an impl where the type is the self type -- this would achieve the primary objective, but in a very narrow way. It also wouldn't work for "free functions".

Of the two, crate-local feels a bit better to me. We have plenty of precedent for crate-local rules of this kind, but none at all (very little?) for the other kind. Personally, though, I am inclined to start with the full version and see how it feels (e.g., within rustc and elsewhere). I think that if it feels surprising, we will start to notice it. That said, I think when we stabilize, if we still feel we want more experience, it would be reasonable to have a separate feature-flag for "cross-crate implied bounds".

@Ixrec

This comment has been minimized.

Copy link
Contributor

Ixrec commented Aug 1, 2017

I agree the actual decision on "how far" we want implied bounds to go is probably best made after implementation or even during stabilization, but I wanted to suggest a middle ground between crate-local/module-local implications and "arbitrarily long distance" implications.

Outside of the crate where a type Foo is defined, my current preference is for that type's bounds to be implied in any function that explicitly mentions Foo in its signature. For instance, I would want this to work:

fn foo<K, V>(key: K, set: HashSet<K, V>) -> () {
    ... // we can use == or .hash() on key in here
}

But I would not want this to work:

fn bar<K, V>(key: K, value: V) -> () {
    ... // HashSet appears somewhere in here
}

For structs I'm a lot more wary of cross-crate bound implication, probably for the same reasons we don't do lifetime elision on structs today.

I like this idea primarily because it means any implied bound that applies to your code must be coming from a function signature or type definition that you can see and look up, not an implementation detail you're not supposed to know about. But you still get the benefit of not needing the K: Hash + Eq bound every single time you want to operate on a HashSet. You would still need the bound every time you use a HashSet as a private implementation detail, which imo is ideal: It highlights that your choice of implementation is forcing a change in your public API.

But that's all abstract theory we'll need experience to confirm so I'm 100% onboard with this RFC as-is. Just wanted to get that additional idea out there.

Edit: And it turns out this is already exactly what the RFC says. I have no idea how I missed this the first time I read it, sorry! But it's a good sign that we ended up at the same place despite me apparently not paying attention.

@scalexm

This comment has been minimized.

Copy link
Member Author

scalexm commented Aug 1, 2017

@Ixrec

my current preference is for that type's bounds to be implied in any function that explicitly mentions Foo in its signature

So actually this is already the case and your second example indeed does not work under this RFC (even when considering crate local-ness) :)

I give an example in the guide-level explanation:

// `Set<T>` does not appear in the fn signature: we need to explicitly write the bounds.
fn declare_a_set<T: Hash + Eq>() {
    let set = Set::<T>::new();
}

Same for struct fields, see e.g. @nikomatsakis comment.

@aturon aturon referenced this pull request Aug 3, 2017

Open

Language ergonomic/learnability improvements #17

10 of 31 tasks complete
@jonastepe

This comment has been minimized.

Copy link

jonastepe commented Aug 5, 2017

While learning Rust the circumstances when I could and could not elide bounds seemed very arbitrary to me. That made my first experience with Rust's generics a bit frustrating. I could not really reason about the rules and that made the learning considerably harder for me.

Repeating the bounds every time (especially when writing data structures that forced various constrains on their types) cluttered up my code and made it hard to read.

Imo, this change is really needed, streamlining the elision rules for bounds. It will make Rust's generics easier to learn and handle in practice.

I'm all for starting with the long-range, full version and see how it goes during stabilisation.

@skade

This comment has been minimized.

Copy link

skade commented Aug 7, 2017

@jonastepe Could you give some examples? There's almost no places where bounds in Rust are elided, but for example this is legal:

struct Foo<T> {
    field: T
}

impl<T> Foo<T> {
    fn print_out(x: T) where T: Debug {
         println!("{:?}", x);
    }
}

It has nothing to do with elision though, just that bounds are taken into account at multiple places, so they don't need to be symmetric.

@RalfJung

This comment has been minimized.

Copy link
Member

RalfJung commented Aug 7, 2017

@skade when writing fn foo(x: &mut &mut i32), there actually is an elided bound, namely that the inner lifetime outlives the outer one.

@skade

This comment has been minimized.

Copy link

skade commented Aug 7, 2017

@RalfJung Outside of lifetimes, are there any other examples, though?

@RalfJung

This comment has been minimized.

Copy link
Member

RalfJung commented Aug 7, 2017

Well, one could consider supertraits something similar. Given trait Foo: Debug, if you write T: Foo, you also get T: Debug. It's something like implied bounds, a little.

But otherwise, nested lifetimes are the only one. However, fundamentally, the well-formedness condition regarding nested lifetimes is no different than the one regarding a type's trait bounds.

@jonastepe

This comment has been minimized.

Copy link

jonastepe commented Aug 8, 2017

@skade I'm referring specifically to lifetime bounds defined on a type. You could specify them on a type and decide not to include them (having them inferred) on an impl block. I actually asked about that on SO once (also see the comment by @LukasKalbertodt on the answer). This is something that confused me back then. Why could I elide the lifetime bounds and not the other bounds?
For me, specifying bounds on the type itself meant that the type could only ever be used with type arguments that satisfied those bounds. Having to repeat the bounds on every impl and function mentioning that type explicitly, made little sense to me.

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Aug 23, 2017

@rfcbot fcp merge

This RFC has been open for 25 days and so far the comments seem uniformly positive. The one major question is precisely where bounds to permit implied bounds (as I outlined here. There hasn't been much commentary on that point, but honestly I think it's the sort of thing that would be best "discovered" by gaining some real-life experience from the stabilization process.

Therefore, I propose that we merge this RFC, but add an unresolved question indicating whether we should try to limit the range of implied bounds to be crate-local (or module-local, etc). This just means we will revisit this question prior to stabilization and make sure we're happy with the result.

@rfcbot

This comment has been minimized.

Copy link

rfcbot commented Aug 23, 2017

Team member @nikomatsakis has proposed to merge this. The next step is review by the rest of the tagged teams:

No concerns currently listed.

Once these reviewers reach consensus, this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@scottmcm

This comment has been minimized.

Copy link
Member

scottmcm commented Aug 23, 2017

Another possibility for the resolve-before-stabilization stuff: Do implied bounds allow use of those bounds in general, or only on the parameter that implied them (unless they're from Self, probably)?

fn silly<T: Default>(x: std::collections::HashSet<T>) -> bool {
    T::default() == T::default()
              // ^ Can I still use == even though *I* never said PartialEq?      
}
@eddyb

This comment has been minimized.

Copy link
Member

eddyb commented Aug 23, 2017

@scottmcm We don't have a mechanism for restricting it, so it would be implied for all values of T.

@scottmcm

This comment has been minimized.

Copy link
Member

scottmcm commented Sep 4, 2017

Hmm, there's probably a bunch of "wanted a subset but didn't get it" cases, like

fn foo<T: PartialEq>(x: HashSet<T>, y: T)
@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Sep 6, 2017

@scottmcm

Hmm, there's probably a bunch of "wanted a subset but didn't get it" cases, like

Note that this example is illegal today. (To be clear, I am assuming here that you meant for HashSet to be declared with an Eq bound, like struct HashSet<T: Eq + Hash>.)

@frankmcsherry

This comment has been minimized.

Copy link

frankmcsherry commented Sep 6, 2017

Since you all want comments, here we go:

I like this a lot, I think. I have had several pain points related to this, where types that "should have" had associated types took generic parameters instead, because it was too painful to reiterate the constraints on the associated types each time they were used.

I am a bit worried that the "intra-crate only" view wouldn't solve the pain points though, so let me show you them. :) I had a much better version back when I was trying to get this to work (early March? I can't find the comment though =/).

Here is a differential dataflow trait. It has some generic parameters and some associated types. There are requirements about the associated types.

pub trait TraceReader<K, V, T, R> {
    type Batch: BatchReader<K, V, T, R>+Clone+'static;
    // ...
}

That was the read-only version of a trace and batch. There are larger traits for both that allows construction and mutation:

pub trait Trace<K, V, T, R> : TraceReader<K, V, T, R> 
where <Self as TraceReader<K, V, T, R>>::Batch: Batch<K, V, T, R> {
    // ...
}

Elsewhere, perhaps outside the crate, one tries to write code generically with respect to T: Trace< .. >. However, to use T: Trace< .. > one must also everywhere assert that <T as TraceReader< .. >>::Batch : Batch< .. >. Otherwise you get errors like (don't sweat the details):

    the trait `trace::Batch<K, V2, <G as timely::dataflow::ScopeParent>::Timestamp, R2>` is not implemented for `<T2 as trace::TraceReader<K, V2, <G as timely::dataflow::ScopeParent>::Timestamp, R2>>::Batch`
    = help: consider adding a `where <T2 as trace::TraceReader<K, V2, <G as timely::dataflow::ScopeParent>::Timestamp, R2>>::Batch: trace::Batch<K, V2, <G as timely::dataflow::ScopeParent>::Timestamp, R2>` bound
    = note: required by `trace::Trace`

I as user of the trait Trace don't know anything about Batch, and really would prefer not to. I assume anything implementing Trace satisfies these constraints. Because I am a good goat, I will write the bound down, but details of Trace are now scattered throughout my non-crate implementation and it is something of a pain to ever consider changing them (e.g., adding a constraint on Self::Batch to Trace, which I as user would be delighted to accept).

All of those K, V, T, R type parameters used to be associated types, but they ended up having constraints too, and they imposed further constraints on Self::Batch, which has its own associated types with constraints, and zomg it was seriously eight lines of where constraints a full screen wide just to use T: Trace. All for a method that didn't know or care at all about the associated types.

Anyhow, I'd love it if that goes away. If we are voting on the scope of inference I prefer the larger scope, but have limited vision into the pain this might cause for other people.

To demonstrate my lack of vision, I don't understand why the

    type Batch: BatchReader<K, V, T, R>+Clone+'static;

constraint doesn't need to be re-iterated, but the

where <Self as TraceReader<K, V, T, R>>::Batch: Batch<K, V, T, R>

constraint does. This doesn't need to be explained, it is just data for you understanding how confused I am / whether doing things that benefit me would be pearls before swine. :)


Edit: maybe a final comment, possibly of no use: in all cases I've wanted implied trait bounds, it has only been to satisfy other imposed trait bounds. I've not wanted the bounds to gain access to methods (or other associated items) the trait provides. If life were somehow easier only providing the implied bounds to satisfy constraints, rather than driving inference / resolution, great! And, that just hit me so apologies if there are five paragraphs on that in the RFC.

@rfcbot

This comment has been minimized.

Copy link

rfcbot commented Sep 9, 2017

The final comment period is now complete.


I *think* rustc would have the right behavior currently: just dismiss this branch since it only leads to the tautological rule `(u8: Foo) if (u8: Foo)`.

In Chalk we have a more sophisticated cycle detection strategy based on tabling, which basically enables us to correctly answer "multiple solutions", instead of "unique solution" if a simple *error-on-cycle* strategy were used. Would rustc need such a thing?

This comment has been minimized.

@arielb1

arielb1 Sep 10, 2017

Contributor

I think the case tabling comes into play is:

trait Bar {} // not implemented for anything
trait Foo {}
impl<T: Foo + #[cfg(maybe)] Bar> Foo for Vec<T> {}
#[cfg(maybe2)]
impl Foo for () {}

And then we ask whether ?0: Foo. Currently, rustc will answer "ambiguous" for all 4 cases, while tabling might be able to figure out a more precise answer.

This comment has been minimized.

@scalexm

scalexm Sep 10, 2017

Author Member

Yes exactly, there is a test in Chalk for that precise example.

This comment has been minimized.

@arielb1

arielb1 Sep 10, 2017

Contributor

I think that to be "concrete" the example also needs

struct u32 {}
impl Bar for u32 {}

To avoid the prover just seeing that Bar has no possible impls.

@arielb1

This comment has been minimized.

Copy link
Contributor

arielb1 commented Sep 10, 2017

IIRC one problem when we tried that was that we add bounds like &'a T: Sized, which causes a problem with rust-lang/rust#21974

struct Set<K: Hash> { ... }
struct NotHash;
fn foo(arg: Set<NotHash>) { ... }

This comment has been minimized.

@arielb1

arielb1 Sep 10, 2017

Contributor

That's not correct in practice, because

  1. rustc will try to prove that ENV(foo) |- WF(Set<NotHash>), because of one of the "the type of everything within a function must be well-formed" rules.
  2. we theoretically have WF(Set<NotHash>) in our environment. Practically, we don't - the current implementation:
    2.A. does not allow using environment predicates that aren't trait predicates, projection predicates, or "ParamWf" predicates. This is IIRC not done for a good reason, and is not needed.
    2.B. ignores environment predicates that do not contain any parameters. This is done because these predicates either hold (and are therefore unneeded) or don't hold (and therefore can't be used in a valid program). This helps caching.
    2.C. has "anti-getting-lost" rules. The rules "as written" mean, that if one has a trait
    trait Foo { type Bar: Sized; }
    Then there is a trait-system rule
    τ type
    WF(τ)
    τ: Foo
    ----------------
    <τ as Foo>::Bar: Sized
    
    Then in order to prove that there is no impl for [T]: Sized, one has to check that we don't have a type [T] = <some-type as Foo>::Bar. That can require going over all the impls to see that there is no applicable impl, so the current implementation in rustc does not use the trait-system rule unless it sees that the type is already a projection type.

@aturon aturon merged commit 02f1f90 into rust-lang:master Sep 11, 2017

@aturon

This comment has been minimized.

Copy link
Member

aturon commented Sep 11, 2017

This RFC has been merged! Tracking issue.

@ExpHP

This comment has been minimized.

Copy link

ExpHP commented Oct 19, 2017

@ExpHP

This comment has been minimized.

Copy link

ExpHP commented Dec 12, 2017

I've been looking forward to this, but after a recent experiment where I attempted to make use of the existing feature of implied bounds on associated types, I am now concerned.

Is it possible that the potentially large number of implied bounds introduced by this through traits may cause unexpected issues with trait selection in user code?

See
rust-lang/rust#33108 (comment) and following discussion
rust-lang/rust#24066
rust-lang/rust#46697

(The "recent experiment" I've alluded to is this code-genned trait (which I am having difficulty minimizing, so take it for what it is). I find that with the numerous Add bounds implied on associated types, it seems only one is used for type checking, and so changing the order of the bounds affects whether or not compilation succeeds.)

@Centril

This comment has been minimized.

Copy link
Contributor

Centril commented Dec 29, 2017

@ExpHP Fixed rendered link, thanks.

@SoniEx2

This comment has been minimized.

Copy link

SoniEx2 commented Jul 31, 2018

There should be a way to opt-out using something like

struct Foo<T: Clone> {
    v: T
}

impl<T> Foo<T> where T { // errors because T isn't Clone
}

vs

struct Foo<T: Clone> {
    v: T
}

impl<T> Foo<T> { // builds
}

vs

struct Foo<T> {
    v: T
}

impl<T> Foo<T> where T { // builds
}

vs

struct Foo<T: Clone> {
    v: T
}

impl<T> Foo<T> where Foo<T> { // (aka `where Self`) builds
}

(see comments and type bounds, should be self-explanatory.)

@Centril

This comment has been minimized.

Copy link
Contributor

Centril commented Jul 31, 2018

@SoniEx2 It's not at all self-explanatory.

@cramertj

This comment has been minimized.

Copy link
Member

cramertj commented Jul 31, 2018

@Centril I could be wrong, but I think that @SoniEx2 is suggesting that we only introduce implied well-formedness bounds in cases where the type is listed in the where clause. It's an interesting idea that would let you turn the feature on and off, but I personally think I'd find that behavior confusing.

@Centril

This comment has been minimized.

Copy link
Contributor

Centril commented Jul 31, 2018

@cramertj I see, if that is the case then personally I agree that it is confusing both syntactically and in terms of behavior, and I also think that such knobs would lead to decision fatigue.

@SoniEx2

This comment has been minimized.

Copy link

SoniEx2 commented Jul 31, 2018

Actually, explicitly listing a type parameter in where, even with no/empty bounds, would opt-out of implied bounds. For the many cases where you overthink about compatibility (maybe you're writing a library) and don't wanna introduce a breaking change.

Then you could also still have implied bounds with explicit bounds. e.g.

struct Foo<T: Clone> {
    v: T
}
impl<T: Eq> Foo<T> {
    // benefits from implied bounds, T is implied as Clone + Eq
}
impl<T> Foo<T> where T: Clone {
    // doesn't benefit from implied bounds, need explicit T: Clone (as shown)
}

This makes the simple case simple, and the complex case possible. i.e. for some of the more complex type bounds where you'd need type parameters in a where clause(? do these exist? see below), you wouldn't benefit from implied bounds. This is a good thing, IMO.

(Do you ever need where clauses for type parameters? Or do you only need them for like, where T::Bar: Clone and things that don't put bounds on the type parameters directly?)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.