Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify rules for GenericArgs nonterminal and other small improvements and fixes #805

Closed
wants to merge 4 commits into from
Closed

Simplify rules for GenericArgs nonterminal and other small improvements and fixes #805

wants to merge 4 commits into from

Conversation

steffahn
Copy link
Member

I initially only wanted to create a version of the GenericArgs non-terminal that doesn’t need 8 cases. On my way there I stumbled across non-terminals that contain the empty word that could easily be changed (much nicer like this IMO), I noticed inconsistencies around parenthesis spacing and fixed some files, found a few typos and mistakes and finally created a bunch of internal links in the tokens.md file.

If any of these changes is controversial I can split it into a different PR.

…onterminals, especially in items/generics.md; fix spacing around lots of parentheses; fix a few small typos.
@ehuss
Copy link
Contributor

ehuss commented Apr 23, 2020

I think the grammar for GenericArgs has been changed (see #785). Can you update it for the new grammar, and then add a section that describes the order requirements in English?

@steffahn
Copy link
Member Author

steffahn commented Apr 23, 2020

I see, so the rules in the reference are supposed to only specify what rusts parser syntactically accepts and rejects? I would’ve interpreted rust-lang/rust#70261 as just a way to improve error messages.

This would then also apply to Type and Lifetime Parameters as that, too, appears to only check order of type vs lifetime parameters in an ATS check, not in parsing.

@ehuss
Copy link
Contributor

ehuss commented Apr 23, 2020

Yea, the grammar should only document the syntax accepted by the parser.

And, yes, it looks like Generics also needs to be updated (looks like it was changed in 1.34).

@steffahn
Copy link
Member Author

Okay, I'll update both.

@ehuss ehuss self-assigned this Apr 23, 2020
@steffahn
Copy link
Member Author

@ehuss, updated them, feel free to continue reviewing.

@steffahn
Copy link
Member Author

I’ve been trying to understand the story around Type vs TypeNoBounds. I’ve so far come to the conclusion that the way they are currently portrayed in the reference they are telling the opposite story of what they’re meant to do. As far as I can tell, something like impl Fn() -> impl Trait + 'static is currently deliberately ambiguous to not pick one way of parsing it over another. Hence, it is (judging from the error message) currently parsed as impl Fn() -> ⌜impl Trait + 'static⌝ rather than ⌜impl Fn() -> impl Trait⌝ + 'static, however in a later stage an error message about ambiguity is emitted.

Of course explicitly writing impl Fn() -> (impl Trait + 'static) or (impl Fn() -> impl Trait) + 'static is both okay.

Now, the way the reference tries to explain this is by saying that in Fn() -> … the may only be a TypeNoBounds which excludes impl or dyn with more than one bound. If this was really what was going on though, then the parse would be unambiguous: impl Fn() -> impl Trait + 'static would have to be interpreted as (impl Fn() -> impl Trait) + 'static and no error could be emitted.

The reason the compiler is able to disallow the ambiguity is that precisely the opposite from what the reference says is true: The in Fn() -> … is, during parsing, accepting any Type and only afterwards rejects any impl and dyn that has any + with an error message about ambiguity.

I’m just noticing that my example is not optimal since the relevant rule, TypePathFn, currently already uses Type rather than TypeNoBounds. This is probably a mistake though, following the current (incorrect) logic at play in the reference, especially since it differs from fn() -> …. A better example might be something like impl Fn() -> &'static dyn Any + 'static that could mean impl (Fn() -> &'static dyn Any) + 'static or impl Fn() -> &'static dyn (Any + 'static), the latter however only if &'static … accepts Type rather than TypeNoBounds as (the rule is ReferenceType).

@steffahn
Copy link
Member Author

Just as an addition: I don’t really like how this whole story is implemented currently. I would like this code to compile (playground).

use core::ops::Add;

trait MyTrait {}
impl Add<i32> for &dyn MyTrait {
    type Output = ();
    fn add(self, _: i32) {}
}
impl MyTrait for () {}

fn main() {
    let x = &() as &dyn MyTrait + 0;
}

especially since the reference page about expressions has as binding stronger than + and the equivalent with - and Sub does compile.

Another way out of this might be to rework TypeParamBounds (reference and implementation) and disallow any unparenthesized TraitBound ending in something with impl or dyn. I haven’t thought this through 100% yet. A last point, the example in the reference is weird. It it

type T<'a> = &'a (dyn Any + Send);

which is reported by the compiler as “ambiguous” but actually it isn’t. At least I cannot think of any different way to place the parantheses that is syntactically correct.

I guess I’ll make this into an issue over on rust-lang/rust soon.

Copy link
Contributor

@ehuss ehuss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! Thanks for all the fixes!

>
> _TypeBoundWhereClauseItem_ :\
> &nbsp;&nbsp; _ForLifetimes_<sup>?</sup> [_Type_] `:` [_TypeParamBounds_]<sup>?</sup>
>
> _ForLifetimes_ :\
> &nbsp;&nbsp; `for` `<` [_LifetimeParams_](#type-and-lifetime-parameters) `>`
> &nbsp;&nbsp; `for` `<` [_LifetimeParams_](#type-and-lifetime-parameters)<sup>?</sup> `>`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You've removed LifetimeParams, so either this will need to be updated, or LifetimeParams will need to be added back.

Copy link
Member Author

@steffahn steffahn Apr 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah... missed that – good thing I changed them both in an earlier commit in a way that required updating all uses and hence the uses pop out in the diff.

Comment on lines +65 to +69
Generic arguments are passed in the order that they're declared in. In particular,
lifetime arguments must come before type arguments. The last kind of argument,
`Ident = Type`, is for specifying [associated types] in [trait bounds].
These arguments are not positional (i.e. their order does not matter), and they
must come after all the other (positional) arguments.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first sentence seems to be immediately contradicted by the statement that the order of bindings doesn't matter. Perhaps it would help to explicitly name the 3 argument kinds, specify the order they must appear in relation to each other, and then delve into the specifics of how type bindings work?

I'd also lean towards retaining the named nonterminals in the grammar, so there is something to refer to (like GenericArgsBinding).

I'm also a little amused about the idea of specifying how the order of arguments correlates to the parameters. Most specs I've read just ignore this, presumably assuming everyone knows function arguments are passed in the given order to the corresponding parameters.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main point was that if you already know that the argument order matches the declaration, then the fact that lifetime argument come before type arguments directly follows from the fact that function, trait and method declarations must put lifetime arguments first.

I’ll try to get rid of the contradictory statements, and reintroducing (at least some of) the names seems like a very good idea to make things clearer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then the fact that lifetime argument come before type arguments directly follows from […]

The compiler seems to (maybe) apply the same logic (note how it only complains about “associated type bindings” coming before “generic parameters” without knowing the details Foo).

Well, it can also generate weird messages if you do let it know of a Foo.

@ehuss
Copy link
Contributor

ehuss commented Apr 25, 2020

I think the issue with with "no bounds" stuff is that TypePathFn is supposed to be:

TypePathFn :
( TypePathFnInputs? ) (-> [TypeNoBounds])?

Here is where the return type is parsed. See also there is a note here about possibly relaxing this.

This was changed in 1.25 via rust-lang/rust#45294, and the reference never got updated.

@steffahn
Copy link
Member Author

steffahn commented Apr 25, 2020

I think the issue with with "no bounds" stuff is that TypePathFn is supposed to be:

TypePathFn :
( TypePathFnInputs? ) (-> [TypeNoBounds])?

Here is where the return type is parsed. See also there is a note here about possibly relaxing this.

This was changed in 1.25 via rust-lang/rust#45294, and the reference never got updated.

Not surprising that you don’t seem to immediately understand what I was after, I just wrote down my thoughts as they came and didn’t spend any time reworking my explanation to become more clear and detailed. Also thanks for the links!

TypePathFn is supposed to be […]

This is what I was already guessing in my statement:

TypePathFn, currently already uses Type rather than TypeNoBounds. This is probably a mistake though, following the current (incorrect) logic at play in the reference


My point is that this code:

use core::any::Any;
fn foo<T>(_: T)
where T: Fn(&()) -> &dyn Any + Send
{}

produces the error message

error: ambiguous `+` in a type
 --> src/lib.rs:3:22
  |
3 | where T: Fn(&()) -> &dyn Any + Send
  |                      ^^^^^^^^^^^^^^ help: use parentheses to disambiguate: `(dyn Any + Send)`

In which (at least to me) the ^^^^^^... indicates that this thing was parsed such that dyn Any + Send is a subexpression, just that after parsing there is an error generated because this is defined to only be legal with an extra pair of parantheses (a rule made to avoid ambiguity for the reader).

Why I call the current use of TypeNoBounds throughout the reference “incorrect”

If the current grammar (with a “corrected” TypePathFn) was to be followed, parsing the code above after T: a TypeParamBounds would be exspected, i.e. a +-seperated list of TypeParamBound.

So in Fn(&()) -> &dyn Any + Send we would want to parse some (maximal) prefix as our first TypeParamBound, which in this case can only be a TypePath, that’ll be a single TypePathSegment in this case, more specifically the PathIdentSegmentFn” followed by TypePathFn.

So now, we are parsing a prefix of (&()) -> &dyn Any + Send as TypePathFn, that is, as ( TypePathFnInputs? ) ( -> TypeNoBounds )?.

After processing the Fn-inputs, this leads to parsing a prefix of &dyn Any + Send as TypeNoBounds. We find ourselves in the ReferenceType rule, which is & Lifetime? mut? TypeNoBounds.

This means we next parse a prefix of dyn Any + Send as TypeNoBounds. We can only use the TraitObjectTypeOneBound rule, which is dyn? TraitBound.

Hence, we finally come to parsing a prefix of Any + Send as TraitBound, going into the rule ?? ForLifetimes? TypePath, hence parsing a prefix of Any + Send as TypePath.

The TypePath matches just Any. The rest, + Send cannot be consumed anymore, we’re done with the whole first TypeParamBound, we see a +, as allowed next in TypeParamBounds and continue on parsing Send as the next TypeParamBound.

The whole thing process is an 100% unambiguous parse, no reason to emit any error.

This means the reference does not describe the current behavior of rustc.

@ehuss
Copy link
Contributor

ehuss commented May 7, 2020

+ in a type expression requires parenthesis (documented here). Whenever the parser encounters a possibly ambiguous situation, like this situation, it is an error.

I believe the ambiguity here is enforced just to avoid potential confusion. The language designers certainly could have picked one interpretation or the other, but that can be confusing to the reader who may not know the particular disambiguation rules they picked. It's usually safer (from the user understandability perspective) to require parenthesis in that case.

This means the reference does not describe the current behavior of rustc.

Other than the issue with TypePathFn described above, what is it not describing?

@ehuss ehuss added the S-waiting-on-author Status: The marked PR is awaiting some action (such as code changes) from the PR author. label May 12, 2020
@steffahn steffahn closed this by deleting the head repository Oct 8, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-author Status: The marked PR is awaiting some action (such as code changes) from the PR author.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants