New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tracking issue for `?` macro repetition #48075

Open
nikomatsakis opened this Issue Feb 8, 2018 · 88 comments

Comments

Projects
None yet
@nikomatsakis
Contributor

nikomatsakis commented Feb 8, 2018

RFC: rust-lang/rfcs#2298

Status

  • RFC accepted
  • Preliminary implementation started (#47752)
  • Preliminary implementation available
  • Edition-dependent behavior proposed (see #51934)
  • Stabilization proposed
  • Stabilized

Known bugs

None.

Unresolved questions to be answered before stabilization

  • Should the ? Kleene operator accept a separator? Adding a separator is completely meaningless (since we don't accept trailing separators, and ? can accept "at most one" repetition), but allowing it is consistent with + and *. Currently, we allow a separator. We could also make it an error or lint.
@nikomatsakis

This comment has been minimized.

Contributor

nikomatsakis commented Feb 8, 2018

@mark-i-m

This comment has been minimized.

Contributor

mark-i-m commented Feb 17, 2018

@nikomatsakis Could you update the checklist? The preliminary impl is available on nightly.

@Centril Centril closed this Feb 27, 2018

@Centril Centril reopened this Feb 27, 2018

Centril added a commit to rust-lang/rfcs that referenced this issue Feb 27, 2018

@mark-i-m

This comment has been minimized.

Contributor

mark-i-m commented Feb 27, 2018

@Centril I think you need to update the tracking issue in the RFC, too 😛

@Centril

This comment has been minimized.

Contributor

Centril commented Feb 27, 2018

@mark-i-m Already done 🤣

@clarcharr

This comment has been minimized.

Contributor

clarcharr commented Mar 28, 2018

Is there any plan to stabilise this soon?

@mark-i-m

This comment has been minimized.

Contributor

mark-i-m commented Mar 28, 2018

I think it depends on experience if this is the right way to solve the problems involved... Frankly, I haven't really written any macros since I implemented this 😛 ...

I think there is still the one problem that was pointed out in the RFC thread: this doesn't provide a clean solution for trailing commas in * macros, since you would end up accepting foo!{,} which is a bit weird.

We could conceivably stabilize this anyway if people find it really useful. Any thoughts?

@clarcharr

This comment has been minimized.

Contributor

clarcharr commented Mar 28, 2018

I personally see it more ergonomic to have rules for () and $(...),+ $(,)? than $(...),* and $(...,)*, as the former usually requires way less duplication than the latter. The former it also natural for recursive macros, and it basically solves the trailing comma problem nicely.

Considering how it also allows optional arguments as well, I think it's pretty much a win here.

@clarcharr

This comment has been minimized.

Contributor

clarcharr commented Mar 28, 2018

As far as the one unresolved question (following separators) goes, I'm in favour of allowing it for the sake of a simpler grammar, but linting it in rustc (not clippy). That seems pretty ergonomic to me, and the compiler internally would have to accept it with a separator anyways to provide useful error messages.

@alexreg

This comment has been minimized.

Contributor

alexreg commented Apr 5, 2018

I don’t see the point of allowing the trailing separator even with a lint. It’s totally meaningless as you point out, so its existence in code would probably just perplex readers!

@mark-i-m

This comment has been minimized.

Contributor

mark-i-m commented Apr 5, 2018

the compiler internally would have to accept it with a separator anyways to provide useful error messages.

Hmm... I guess this is a good point. It means that probably the code complexity (in the compiler) of keeping or not keeping the separator is about the same. Still, having a simpler grammar is nice, especially if libsyntax2.0 ever actually intends have something auto-generated.

What about a deny-by-default lint?

@durka

This comment has been minimized.

Contributor

durka commented Apr 5, 2018

When would anyone ever allow the lint, though? This kinda sounds like "we can't decide, just make it configurable".

@mark-i-m

This comment has been minimized.

Contributor

mark-i-m commented Apr 5, 2018

When would anyone ever allow the lint, though?

My guess: probably never. On the other hand, when is anyone likely to actually use a separator for a single item? Will anyone ever run into the lint? The tradeoff is between a mildly simpler grammar and mildly simpler UX.

@mark-i-m

This comment has been minimized.

Contributor

mark-i-m commented Apr 5, 2018

The changes if we want disallow it are pretty minimal:

diff --git a/src/libsyntax/ext/tt/quoted.rs b/src/libsyntax/ext/tt/quoted.rs
index f324ede..a6fb782 100644
--- a/src/libsyntax/ext/tt/quoted.rs
+++ b/src/libsyntax/ext/tt/quoted.rs
@@ -470,8 +470,11 @@ where
                         GateIssue::Language,
                         explain,
                     );
+                } else {
+                    sess.span_diagnostic
+                        .span_err(span, "`?` macro repetition does not allow a separator");
                 }
-                return (Some(tok), op);
+                return (None, op);
             }
             Ok(Ok(op)) => return (Some(tok), op),
@alexreg

This comment has been minimized.

Contributor

alexreg commented Apr 5, 2018

My guess: probably never. On the other hand, when is anyone likely to actually use a separator for a single item? Will anyone ever run into the lint? The tradeoff is between a mildly simpler grammar and mildly simpler UX.

I think this is really "designer's/developer's motivation" vs. "user's motivation". And making the user happy should always win, I think. After all, there's a fair bit of ugly stuff in most compilers (including the Rust one), but most people don't care about it, because they don't touch it, and if they do they expect a significant learning curve. A compiler's end goal is to support a language rather than have an elegant internal structure (which is more of a bonus). The fact is, if you want to go messing with the compiler or libsyntax, taking the time to learn the slight difference between ? and */+ is no big deal. If you're a user, especially one new to Rust, that could be a real gotcha.

Now, since from the Rust user's perspective this feature adds literally nothing, and adding an error (as @mark-i-m's diff shows) is very straightforward, I maintain we disallow it. @durka's point and the article he links to is also a good one. Configuration is only a good thing when it's absolutely needed. Convention over it any other time.

@mark-i-m

This comment has been minimized.

Contributor

mark-i-m commented Apr 5, 2018

I went back and re-read part of the RFC thread, and this comment stuck out to me: rust-lang/rfcs#2298 (comment)

There is the question of what this matches: $(a)?+. It could be:

  • a+ (hypothetical)
  • a?a?a (current implementation)

Allowing the separator actually allows us to disambiguate:

  • $(a),?+ matches a+
  • $(a)?+ matches a?a?a

But at the same time, this seems really dubious, since removing the separator seems like it should not change the pattern matched. If we disallow separator on ?, then things are a bit cleaner:

  • $(a),?+ is parsing error
  • $(a)?+ matches a?a?a

Admittedly, this is all a bit contrived, but it does seem to throw the balance in favor of disallowing a separator on ?...

EDIT: corrected typo

@durka

This comment has been minimized.

Contributor

durka commented Apr 5, 2018

Agreed that if removing the separator changes things that seems weird and bad. But it's a shame that in the latter scenario you can't match aa+...

@mark-i-m

This comment has been minimized.

Contributor

mark-i-m commented Apr 5, 2018

One option would be to change the disambiguation strategy. That is, we stop allowing ? as a separator for any of the Kleene operators. This would be a breaking change, but perhaps it might not be that bad?

@durka

This comment has been minimized.

Contributor

durka commented Apr 5, 2018

@alexreg

This comment has been minimized.

Contributor

alexreg commented Apr 5, 2018

Now that you mention that, maybe a two-character ?? operator makes the most sense. No ambiguity, no allowance for specification of meaningless separators, and no prevention of using ? as a separator for the other Kleene operators.

@CAD97

This comment has been minimized.

Contributor

CAD97 commented Aug 21, 2018

Using the exact nightly I was, a cargo check passed but a cargo doc complained that "? is not a macro repetition operator". This no longer happens on the current nightly, so is no longer an issue.

@mark-i-m

This comment has been minimized.

Contributor

mark-i-m commented Aug 21, 2018

Hmm how interesting. It sounds like rustdoc was not getting the correct edition/feature gates or something. That error is emitted only on 2015 or without the feature gate.

Glad it's working now.

@alexreg

This comment has been minimized.

Contributor

alexreg commented Aug 21, 2018

@CAD97 Yeah, it was pretty late when I wrote that; please ignore me!

@Centril

This comment has been minimized.

Contributor

Centril commented Sep 15, 2018

I have to say I'm not super happy with the decision to do edition breakage here...
We found zero regressions for it. For the case of if let true = p && q {..} which also had zero regressions, we did do a in-2015 breakage, so I think the consistent thing to do would be to do this in Rust 2015 as well and thus reduce the technical debt.

@mark-i-m

This comment has been minimized.

Contributor

mark-i-m commented Sep 15, 2018

I feel like the boat has already sailed on that. I don't really want to change the implementation again :p

@mark-i-m

This comment has been minimized.

Contributor

mark-i-m commented Sep 15, 2018

One idea might be to create a new RFC that specifies/discusses what kind of breakage is acceptable in-edition. Personally, I think the standard should be rather higher than a successful crater run + some time on nightly.

@Centril

This comment has been minimized.

Contributor

Centril commented Sep 15, 2018

@mark-i-m

I don't really want to change the implementation again :p

I can sympathize :)

I feel like the boat has already sailed on that.

The FCP was completed yes; (I wasn't on the lang team then...); so I might just have to live with this :(

One idea might be to create a new RFC that specifies/discusses what kind of breakage is acceptable in-edition.

An interesting thought; but sadly that would take too much time for this particular instance.

Personally, I think the standard should be rather higher than a successful crater run + some time on nightly.

I on the other hand think that a crater run that found zero regressions makes it acceptable.

@mark-i-m

This comment has been minimized.

Contributor

mark-i-m commented Sep 16, 2018

An interesting thought; but sadly that would take too much time for this particular instance.

I meant an RFC for setting a general bar for this sort of in-edition breakage.

@mark-i-m

This comment has been minimized.

Contributor

mark-i-m commented Sep 17, 2018

Note to self: this should be documented in the edition guide somewhere... (If someone else is up for a PR on this, it might get done sooner)...

@Centril

This comment has been minimized.

Contributor

Centril commented Sep 17, 2018

I meant an RFC for setting a general bar for this sort of in-edition breakage.

Yeah I understood that =P

I meant an RFC for setting a general bar for this sort of in-edition breakage.

Probably worth making an issue on the guide :)

@mark-i-m

This comment has been minimized.

Contributor

mark-i-m commented Sep 22, 2018

Documentation for when this is eventually stabilized (hopefully):

Reference: rust-lang-nursery/reference#424
Edition Guide: rust-lang-nursery/edition-guide#111

@nikomatsakis

This comment has been minimized.

Contributor

nikomatsakis commented Oct 5, 2018

I would really like to see this stabilized. =) Nominating for lang team discussion.

@Centril

This comment has been minimized.

Contributor

Centril commented Oct 25, 2018

@rfcbot fcp merge

Let's stabilize this :)

But let's also have a stabilization report with links to tests and stuff...

@rfcbot concern report-and-tests

@rfcbot

This comment has been minimized.

rfcbot commented Oct 25, 2018

Team member @Centril has proposed to merge this. The next step is review by the rest of the tagged teams:

Concerns:

Once a majority of reviewers approve (and none object), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@nikomatsakis

This comment has been minimized.

Contributor

nikomatsakis commented Oct 25, 2018

@rfcbot concern stabilization report

I think we need a stabilization report. This should include:

  • brief description of the behavior and any deviations from the RFC
  • which edition(s) are affected and how
  • links to a few tests to show the interesting aspects

@mark-i-m would you be game to produce it?

@nikomatsakis

This comment has been minimized.

Contributor

nikomatsakis commented Oct 25, 2018

@rfcbot resolve stabilization report

Oh, I see I duplicated @Centril's concern.

@mark-i-m

This comment has been minimized.

Contributor

mark-i-m commented Oct 25, 2018

Stabilization Report 🎉

(Let me know if I missed anything)

Description of behavior

  • Edition 2015 ? is a macro separator, not a kleene op, but using it as a separator triggers a migration lint.

    • This represents no change from previous behavior except the migration lint.
  • Edition 2018 ? is not a valid separator. ? is a valid Kleene op.

    • Add a Kleene operator to macros to repeat a pattern at most once: $(pat)?. Here, ? behaves like + or * but represents at most one repetition of pat.
    • The ? Kleene operator does not accept any separator token, unlike + and *. For example, $(a),? is an error. This is because we don't allow trailing separators, so any separator used with the ? Kleene operator would be impossible to use.
    • For example, under this proposal, $(a)?+ matches + and a+ unambiguously.
  • For a longer description of alternatives and behavior, see this FCP.

  • Note, that the stabilization only impacts Edition 2018.

Differences from RFC

Link to RFC

  • The RFC proposed a disambiguation strategy that used lookahead to discern whether a ? token was a separator for + or * OR was a ? Kleene op. We opted instead to remove ? as a separator in Edition 2018 as a breaking change, removing the ambiguity altogether.

  • The RFC proposed making a minor breaking change in Edition 2015 to change the disambiguation as mentioned above. Instead, we opted to only enable ? Kleene operator in Edition 2018. That is, Edition 2015 is unchanged (except for a migration lint), whereas Edition 2018 has the changes described above.

  • The RFC had an unresolved question: "Should the ? Kleene operator accept a separator token?". We decided the answer is "no".

Documentation and Testing

@mark-i-m

This comment has been minimized.

Contributor

mark-i-m commented Oct 25, 2018

@nikomatsakis @Centril done :) Let me know if I missed anything

@mark-i-m

This comment has been minimized.

Contributor

mark-i-m commented Oct 25, 2018

Also, it looks like the checklist in the OP should be updated.

@Centril

This comment has been minimized.

Contributor

Centril commented Oct 26, 2018

@rfcbot resolve report-and-tests

@mark-i-m that's a most excellent report. :)

@alexreg

This comment has been minimized.

Contributor

alexreg commented Oct 26, 2018

Well done on the super-meticulous work @mark-i-m. Glad we can finally get this feature stabilised!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment