range notation (a..b) has weird operator precedence #20811

dgrunwald · 2015-01-09T16:43:25Z

First off, the operator precedence of a..b is extremely low.
struct Range recently got #[derive(Eq)], so people might want to compare a range to an expression in range notation.

r == 1..10 parses as (r == 1)..10. Similarly, 1..10 == r parses as 1..(10 == r). Both result in type errors.
I think there's no good reason for .. to have a precedence lower than comparisons. I'd suggest to put it in-between the comparison operators and the bitwise operators.

But the current implementation gets weirder when we consider the other forms of range notation:
r == 1.. parses as (r == 1).., and is a type error (or maybe a RangeFrom<bool>).
But r == ..1 parses as r == (..1), and is a valid range comparison.

In the other direction, ..1 == r parses as ..(1 == r) and is potentially a RangeTo<bool>.
But 1.. == r is a syntax error (expected one of ) or ,, found ==).
This inconsistent with the prefix form.

If we keep the current precedence of .. (lower than all binops), I think we should disallow r == ..1. I think this can be implemented easily by using RESTRICTION_NO_DOTS when parsing the RHS of any binop.

Also, we should use RESTRICTION_NO_DOTS when parsing the RHS of prefix-version ..expr -- currently ..1..2 is a RangeTo<Range<_>>, while 1..2..3 and 1..2.. both are syntax errors.

If we increase the precedence, we will need to be careful not to allow code like 1 + ..2. I think any binop with precedence smaller than that of .. will have to use RESTRICTION_NO_DOTS on the RHS.

I also think the implementation can be simplified quite a bit if the prefix .. wasn't handled in parse_prefix_expr (as that's only supposed to be used for operators with high precedence), but at the same level as the binary and postfix version (parse_assign_expr, or parse_binops if we change the precedence). This would allow us to get rid of RESTRICTION_NO_DOTS and use the normal operator precedence handling instead.

The text was updated successfully, but these errors were encountered:

dgrunwald · 2015-01-09T16:51:27Z

I am planning to fix this myself, but I need input on whether to increase the operator precedence so that r == 1..2 is valid.

bstrie · 2015-01-09T16:54:59Z

cc @aturon

I think this is a candidate for a nomination as well.

aturon · 2015-01-09T16:55:34Z

cc @nick29581

aturon · 2015-01-09T18:59:04Z

Nominating.

aturon · 2015-01-09T18:59:16Z

cc @nikomatsakis @alexcrichton

alexcrichton · 2015-01-09T22:07:14Z

May be related to #20830

dgrunwald · 2015-01-10T00:04:36Z

@alexcrichton: That looks like a different bug in the same line of code.

nikomatsakis · 2015-01-10T17:19:17Z

I agree about the current precedence being wrong, and think that the suggested position is probably good, but I am very wary of the "restriction no dots" rules. Every restriction is a parser bug until proven otherwise, from my POV, and I am not convinced that we need RESTRICTION_NO_DOTS at all anymore.

dgrunwald · 2015-01-10T17:56:38Z

@nikomatsakis: I think some kind of parser special case will always be necessary if we want to allow v[0..a+b], avoid grammar inconsistencies between ..a and a.., and not have a combinatorial explosion in the complexity of the formal grammar at the same time.

I know this problem from when I tried writing a C# parser using a parser generator. It turns out that the Microsoft C# compiler is a hand-written recursive-descent parser using the shunting yard algorithm to parse binops. But not all C# binops are true binary operators taking two expressions: the as operator takes a type on the RHS, yet has lower precedence than some other binary operators. This results in some strange effects -- a + b as c + d is valid in C# and parses as ((a + b) as c) + d, even though the precedence of as is lower than that of +. As far as I know, there is no way to write a formal grammar that accepts the same expressions as the MS C# compiler without duplicating all expression-related productions.

I'd prefer if rust does not make the same mistake as C#, so any prefix- and suffix- operators must be forbidden from being used in a context which conflicts with their operator precedence, even if operator precedence is unnecessary to disambiguate the expression. As far as I know, .. is currently the only prefix-/suffix- operator where this problem occurs, since as and all unary operators have a higher precedence than any binary operator.

Here's how I imagine the formal grammar:

BitOrExpr = BitXorExpr | BitOrExpr "^" BitXorExpr
RangeExpr = BitOrExpr | ".." BitOrExpr | BitOrExpr ".." | BitOrExpr ".." BitOrExpr
CompExpr = RangeExpr | RangeExpr ("==" | "!=" | "<" | ">" | "<=" | ">=") RangeExpr
AndExpr = CompExpr | AndExpr "&&" CompExpr

This means that "^" and "&&" are left-associative; comparison operators don't have any associativity (since RFC 558); and the range expression doesn't have any associativity either.

I hope that this can be implemented without using a restrictions bitflag in the parser. But if not, I'd prefer special cases in the parser over special cases in the grammar.

dgrunwald · 2015-01-10T21:43:32Z

Changing the precedence level as I suggested actually creates new syntactic ambiguities:
a == b.. && c
currently is... actually a syntax error because can_begin_expr doesn't handle && (#20901), but if that is fixed, it would be:
(a == b)..(&&c)

With my suggestion, it would be ambiguous between
a == (b..(&&c))
and
a == (b..) && c

I'm starting to think we should keep the precedence as-is, so that .. must always be enclosed in parenthesis when used in a more complex expression, and only fix the inconsistencies between the prefix form and the other two forms.

nikomatsakis · 2015-01-11T02:03:08Z

On Sat, Jan 10, 2015 at 09:56:42AM -0800, Daniel Grunwald wrote:

@nikomatsakis: I think some kind of parser special case will always be necessary if we want to allow v[0..a+b], avoid grammar inconsistencies between ..a and a.., and not have a combinatorial explosion in the complexity of the formal grammar at the same time.

Can you spell this out a bit more?

nikomatsakis · 2015-01-11T12:53:03Z

On Sat, Jan 10, 2015 at 01:43:35PM -0800, Daniel Grunwald wrote:

With my suggestion, it would be ambiguous between
a == (b..(&&c))
and
a == (b..) && c

Why is this not just an argument for a more-refined precedence (like,
we pick one interpretation and run with it). Only one is remotely
sensible (b..(&&c)) so that seems to be the one to use. That said,
I've not sat down and looked at the table of precedences to see how
that would fit.

dgrunwald · 2015-01-11T13:27:57Z

Why is this not just an argument for a more-refined precedence (like, we pick one interpretation and run with it). Only one is remotely sensible (b..(&&c)) so that seems to be the one to use.

True, I guess we can just pick one interpretation and use it.
Though I would have picked the other interpretation: if r1 == 1.. && r2 == ..2 {} should probably be if r1 == (1..) && r2 == (..2) {}
The other choice would disambiguate it as if r1 == (1.. (&& r2)) == ..2 {}.

I'm starting to think we should keep the precedence as-is, so that .. must always be enclosed in parenthesis

When I wrote that, I was just concerned about the complexity of having an optional expression in a chain of binops, but I think probably isn't as bad as I initially thought.

The only binops that potentially conflict with the initial token in an expression are: &&, ||, *, & and .. itself. (UFCS will add < and <<)
Only those with a precedence lower than .. are problematic, so that's && and || only.
I don't see much use for ranges of double references or ranges of lambdas, so I'd add the following rule to disambiguate the formal grammar I've given:
If .. is followed by a token that can be interpreted both as start of expression or binary operator, it will be parsed as:

start of expression if the interpretation as binop has a higher precedence than ..
binary operator if it has a lower precedence than ..

This corresponds to the disambiguation of if r1 == 1.. && r2 == ..2 {} as if r1 == (1..) && r2 == (..2) {}, but 1..*i will be 1..(*i), not (1..)*(i).

Gankra · 2015-01-11T15:55:22Z

#20921 points out that for i in 0 .. n {} doesn't even parse correctly right now, so at very least that should be fixed.

dgrunwald · 2015-01-11T16:02:50Z

I have an implementation of my suggested precedence + disambigation. I will create a PR once I have done some testing.

I was able to eliminate RESTRICTION_NO_DOTS at the cost of adding a minprec parameter to parse_binops and parse_prefix_expr.

dgrunwald · 2015-01-11T16:20:59Z

I think we may need a special case to handle for i in 1.. { }. I'd expect that to be an infinite loop, but currently it gets parsed as for i in (1..{}).

Fixes rust-lang#20811 and rust-lang#20241. Note: this changes the semantics of parse_more_binops() to parse binops with precedence >= min_prec. Previously, it would parse binops with precedence > min_prec.

nikomatsakis · 2015-01-13T11:26:29Z

On Sun, Jan 11, 2015 at 08:21:03AM -0800, Daniel Grunwald wrote:

I think we may need a special case to handle for i in 1.. { }. I'd
expect that to be an infinite loop, but currently it gets parsed as
for i in (1..{}).

We do generally have a subset of expression types that appear in
positions terminated by { (e.g., if, for). This seems relevant
here too.

dgrunwald · 2015-01-13T11:34:46Z

Yes, I already handled that case in my implementation: https://github.com/rust-lang/rust/pull/20958/files#diff-da9d34ca1d0f6beee2838cf02e07345cR2942
Makes RESTRICTION_NO_STRUCT_LITERAL a slight misnomer, it should probably be RESTRICTION_NO_OPEN_BRACE_WHERE_EXPRESSION_MIGHT_END except that's too long.

Overview over what my pull request does:

.. precedence changed to between comparison operators and bit or; using the formal grammar:

BitOrExpr = BitXorExpr | BitOrExpr "^" BitXorExpr
RangeExpr = BitOrExpr | ".." BitOrExpr | BitOrExpr ".." | BitOrExpr ".." BitOrExpr
CompExpr = RangeExpr | RangeExpr ("==" | "!=" | "<" | ">" | "<=" | ">=") RangeExpr
AndExpr = CompExpr | AndExpr "&&" CompExpr

Where the above grammar is ambiguous due to collisions between binary operators and prefix operators (e.g. r == 1.. && condition), && and || are parsed as binary operators, other tokens are parsed as prefix operators.
for i in 1.. {} is parsed as for i in (1..) {}

Grammar changes: * allow 'for _ in 1..i {}' (fixes rust-lang#20241) * allow 'for _ in 1.. {}' as infinite loop * prevent use of range notation in contexts where only operators of high precedence are expected (fixes rust-lang#20811) Parser code cleanup: * remove RESTRICTION_NO_DOTS * make AS_PREC const and follow naming convention * make min_prec inclusive

This PR is intended as alternative to #20958. It fixes the same grammar inconsistencies, but does not increase the operator precedence of `..`, leaving it at the same level as the assignment operator. For previous discussion, see #20811 and #20958. Grammar changes: * allow `for _ in 1..i {}` (fixes #20241) * allow `for _ in 1.. {}` as infinite loop * prevent use of range notation in contexts where only operators of high precedence are expected (fixes #20811) Parser code cleanup: * remove `RESTRICTION_NO_DOTS` * make `AS_PREC` const and follow naming convention * make `min_prec` inclusive r? nikomatsakis

Grammar changes: * allow 'for _ in 1..i {}' (fixes rust-lang#20241) * allow 'for _ in 1.. {}' as infinite loop * prevent use of range notation in contexts where only operators of high precedence are expected (fixes rust-lang#20811) Parser code cleanup: * remove RESTRICTION_NO_DOTS * make AS_PREC const and follow naming convention * make min_prec inclusive

aturon added the I-nominated label Jan 9, 2015

csouth3 mentioned this issue Jan 11, 2015

range notation with for expression for i in 0 .. n {} should be parsed as for i in (0 .. n) {} #20921

Closed

dgrunwald mentioned this issue Jan 11, 2015

Fix precedence of '..' range notation and some grammar inconsistencies. #20958

Closed

brson added this to the 1.0 beta milestone Jan 15, 2015

brson added P-backcompat-lang and removed I-nominated labels Jan 15, 2015

alexcrichton mentioned this issue Jan 18, 2015

Utilize range x..y sugar rust-num/num#48

Closed

dgrunwald mentioned this issue Jan 18, 2015

Fix some grammar inconsistencies for the '..' range notation. #21374

Merged

bors closed this as completed in #21374 Jan 23, 2015

rprichard mentioned this issue Oct 1, 2015

Range expressions: discrepancies between rustc and parser-lalr #28785

Open

dgrunwald mentioned this issue Jan 14, 2016

Comparison operators have higher precedence than range operator .. #22877

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

range notation (a..b) has weird operator precedence #20811

range notation (a..b) has weird operator precedence #20811

dgrunwald commented Jan 9, 2015

dgrunwald commented Jan 9, 2015

bstrie commented Jan 9, 2015

aturon commented Jan 9, 2015

aturon commented Jan 9, 2015

aturon commented Jan 9, 2015

alexcrichton commented Jan 9, 2015

dgrunwald commented Jan 10, 2015

nikomatsakis commented Jan 10, 2015

dgrunwald commented Jan 10, 2015

dgrunwald commented Jan 10, 2015

nikomatsakis commented Jan 11, 2015

nikomatsakis commented Jan 11, 2015

dgrunwald commented Jan 11, 2015

Gankra commented Jan 11, 2015

dgrunwald commented Jan 11, 2015

dgrunwald commented Jan 11, 2015

nikomatsakis commented Jan 13, 2015

dgrunwald commented Jan 13, 2015

range notation (a..b) has weird operator precedence #20811

range notation (a..b) has weird operator precedence #20811

Comments

dgrunwald commented Jan 9, 2015

dgrunwald commented Jan 9, 2015

bstrie commented Jan 9, 2015

aturon commented Jan 9, 2015

aturon commented Jan 9, 2015

aturon commented Jan 9, 2015

alexcrichton commented Jan 9, 2015

dgrunwald commented Jan 10, 2015

nikomatsakis commented Jan 10, 2015

dgrunwald commented Jan 10, 2015

dgrunwald commented Jan 10, 2015

nikomatsakis commented Jan 11, 2015

nikomatsakis commented Jan 11, 2015

dgrunwald commented Jan 11, 2015

Gankra commented Jan 11, 2015

dgrunwald commented Jan 11, 2015

dgrunwald commented Jan 11, 2015

nikomatsakis commented Jan 13, 2015

dgrunwald commented Jan 13, 2015