RFC: Add match/in statements #2144
Changes from 2 commits
d61d1f2
73e34ec
daf503d
8f064a9
dee68cc
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,150 @@ | ||
- Feature Name: overlapping_match_statements | ||
- Start Date: 2017-09-08 | ||
- RFC PR: (leave this empty) | ||
- Rust Issue: (leave this empty) | ||
|
||
# Summary | ||
[summary]: #summary | ||
|
||
This RFC proposes a `match ... in` form of `match` in which every branch whose pattern matches | ||
the value is executed, rather than only the first one. An optional `else` block runs when no | ||
branch matches, playing a role similar to the current use of the `_` pattern. | ||
|
||
# Motivation | ||
[motivation]: #motivation | ||
|
||
Repeating a piece of code is widely regarded as poor software engineering practice. If the | ||
repeated code needs to change, it has to be changed in every copy, and it is easy to miss one | ||
and so introduce a bug. For a large block of code the usual remedy is to move it into a helper | ||
function. Overlapping match statements extend the same idea to code that benefits from | ||
pattern matching and from exhaustiveness checks. | ||
|
||
This would support use cases where the code required by several branches overlaps enough that | ||
sharing it would help. One such case is when the outcome of one branch is the same as the | ||
combination of two other branches of a match statement. The expected outcome is that more than | ||
one branch of a match statement can be executed when more than one of them matches the value, | ||
while the branches are still checked for exhaustiveness. | ||
|
||
# Detailed design | ||
[design]: #detailed-design | ||
|
||
Basic Syntax: | ||
```rust | ||
match val in { | ||
    pat | pat => expr, | ||
    pat => expr | ||
} | ||
|
||
match val in { | ||
    pat | pat => expr, | ||
    pat => expr | ||
} else { | ||
    expr | ||
} | ||
``` | ||
Review comment (on `match val in {`): Since the only difference to the existing I also think
Review comment: I chose
Review comment (on `} else {`): This deviation from the
Review comment: It may be a bit unfortunate but if you consider the meaning of |
||
|
||
Benefits of this syntax: | ||
1. No new keywords need to be used. This is a good thing, since it means that a relatively small | ||
addition would not break existing code. | ||
Review comment: "there would be no breaks"? What's that supposed to mean? |
||
2. The `in` suggests that the branches behave somewhat like an "iterator" of statements, each of | ||
which is visited in turn. | ||
|
||
Review comment: Instead of
Review comment: That is fair and I am pretty sure that adding an additional keyword here wouldn't break any code since if a variable was called what ever it was that would be able to be checked but that is definitely something to consider
Review comment: Yes, adding the additional "keyword" should work w/o backwards incompatibility since there's no (iirc) parsing rule "<expr> <expr>".
Review comment: For a correct DFA adding that state would be rather complicated but I agree is do able
Review comment: Hmm... DFA for what? Lexing? The language is not regular...
Review comment: I have not actually looked that deep into Rust I didn't consider it to be that high order, though tbh I don't know why
Review comment: Rust's lexical grammar is already context-sensitive due to Not a single real programming language is regular (not counting assembly and machine code). Fun fact: Even the simple-looking Lua grammar requires infinite lookahead to parse and is thus harder to parse than Rust.
Review comment: More details here: https://stackoverflow.com/a/43693150/1063961 and here https://stackoverflow.com/a/43693194/1063961
Review comment: I think Instead I'd like to add |
||
Meaning of parts: | ||
1. The `else` is used in a similar vein to the `_` pattern in normal matches and | ||
could trigger several of the same warnings. The expression enclosed within it is only executed | ||
Review comment: "several of the same warnings" which warning are you referring to? |
||
if none of the patterns within the `match/in` statement match. | ||
Review comment: Missing "match" verb at the end. |
||
|
||
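To make the role of `else` concrete, here is a minimal sketch in current Rust of how such an expression might behave. It is an illustration only and not part of the RFC text: the `fizzbuzz` function, its guards, and the `matched` flag are all hypothetical, and the `match/in` form it imitates appears only in a comment.

```rust
/// Emulates, in today's Rust, a hypothetical expression of the form
///
///     match n in {
///         _ if n % 3 == 0 => out.push_str("Fizz"),
///         _ if n % 5 == 0 => out.push_str("Buzz"),
///     } else {
///         out = n.to_string()
///     }
///
/// Every matching branch runs, top to bottom; the `else` body runs only
/// when no branch matched.
fn fizzbuzz(n: u32) -> String {
    let mut out = String::new();
    let mut matched = false; // records whether any branch was entered

    if n % 3 == 0 {
        out.push_str("Fizz");
        matched = true;
    }
    if n % 5 == 0 {
        out.push_str("Buzz");
        matched = true;
    }
    if !matched {
        // Stands in for the `else` block.
        out = n.to_string();
    }
    out
}
```

With this sketch, `fizzbuzz(15)` enters both branches, while `fizzbuzz(7)` enters neither and falls through to the `else` body.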
Edge cases: | ||
1. If the `_` pattern is present in any of the contained matches and the `else` block is also | ||
present, then an `unreachable_code` lint is emitted for the code within the `else` block. | ||
Review comment: Another missing verb. Perhaps "is emitted"? Also it feels weird that it's not an unreachable pattern warning, but it's not a pattern, so it seems technically correct... |
||
2. Since the main reason for using a `match` is the exhaustiveness check, the compiler will | ||
output a `non-exhaustive patterns` error when the patterns are not exhaustive and there is no | ||
`else` block. | ||
|
||
Review comment: Another possibility would be to use
Review comment: So would it be the following (for clarification):
    match many x {
        _ if x % 5 == 0 => expr,
        ! => expr
    }
Review comment: Yep.
Review comment: I definitely see the appeal of it, since as you say it is much more terse, but I think that it would be more confusing since it is in the area of pat/expr of the expression but does not follow the same "rules" as the other patterns since it will only be executed if no other patterns are executed sort of how the |
||
Implementation Assumptions: | ||
1. A `match` statement is assumed to be currently implemented similarly to a long chain of | ||
`if/else if` statements. | ||
Review comment: It's not, how would it be implemented in this way?
Review comment: I will rewrite this. I chose to use the word |
||
|
||
Implementation: | ||
1. This can be implemented as if it were a list of `if` statements. | ||
Review comment: No, but it could be implemented as a list of matches.
Review comment: How would that work? I do not see an implementation via matches
Review comment: Take a look at the macro I posted below |
||
2. To cover the `else` case, the location to jump to after checking all the branches | ||
can be stored. It is initially set to the start of the `else` block, and if execution enters | ||
any of the branches it is set to immediately after the `else` block. | ||
Review comment: Just storing a bool is sufficient. |
||
|
||
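The two review suggestions above (a list of matches, plus a single `bool`) can be sketched by hand in current Rust. This is only one possible expansion under those assumptions and does not claim to be how the compiler would lower the construct. The `Level` enum and the `visit` function are invented for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Level {
    Off,
    Low,
    Mid,
    High,
}

/// Hand-written expansion of a hypothetical
///
///     match level in {
///         Low | Mid  => log.push("low-or-mid"),
///         Mid | High => log.push("mid-or-high"),
///     } else {
///         log.push("else")
///     }
///
/// Each branch becomes its own ordinary `match` with a do-nothing fallback.
fn visit(level: Level) -> Vec<&'static str> {
    let mut log = Vec::new();
    let mut matched = false; // the single bool suggested in review

    match level {
        Level::Low | Level::Mid => {
            log.push("low-or-mid");
            matched = true;
        }
        _ => {}
    }
    match level {
        Level::Mid | Level::High => {
            log.push("mid-or-high");
            matched = true;
        }
        _ => {}
    }
    if !matched {
        // No branch ran, so take the `else` body.
        log.push("else");
    }
    log
}
```

In this sketch `Level::Mid` runs both branches in source order, and `Level::Off` reaches only the `else` body. The `_ => {}` fallbacks mean the individual matches no longer provide an exhaustiveness check, which is the part that would need compiler support.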
# How We Teach This | ||
[how-we-teach-this]: #how-we-teach-this | ||
|
||
These should be called `match/in` statements, since that is the combination of keywords | ||
used, similar to `for/in` statements. This idea would be best presented as a continuation of | ||
Review comment: None of these are actually statements, they're expressions. |
||
existing Rust patterns since it expands on the `match` statement. | ||
|
||
This proposal should be introduced to new users right after `match` statements are taught. This | ||
is the best time to teach it, since it is an extension of that syntax and of the ideas behind | ||
`match` statements. | ||
|
||
Within the _Rust Book_, a section after the section on the `_` placeholder could be called | ||
_match/in Control Flow Operator Addition_. This section would outline the syntax and how it | ||
differs from `match`, most notably that multiple branches can be executed. By the end of the | ||
section the reader should understand that multiple branches can be executed while | ||
exhaustiveness is still checked where possible. They should also know that the branches are | ||
checked from top to bottom. | ||
|
||
An example that could be used within the section: | ||
|
||
You can turn this: | ||
```rust | ||
match cmp.compare(&array[left], &array[right]) { | ||
    Less => { | ||
        merged.push(array[left]); | ||
        left += 1; | ||
    }, | ||
    Equal => { | ||
        merged.push(array[left]); | ||
        merged.push(array[right]); | ||
        left += 1; | ||
        right += 1; | ||
    }, | ||
    Greater => { | ||
        merged.push(array[right]); | ||
        right += 1; | ||
    } | ||
} | ||
``` | ||
into | ||
```rust | ||
match cmp.compare(&array[left], &array[right]) in { | ||
    Less | Equal => { | ||
        merged.push(array[left]); | ||
        left += 1; | ||
    }, | ||
    Greater | Equal => { | ||
        merged.push(array[right]); | ||
        right += 1; | ||
    } | ||
} | ||
``` | ||
Review comment (on `match cmp.compare(&array[left], &array[right]) in {`): The semantic equality of these two versions implies that I feel this proposal is too close to the switch statements of C/C++/Java which, as far as I know, are confusing to newcomers. I think this is because intuitively, humans think of choice as exclusive - so "A or B" is parsed as "A xor B", which is why a lot of introductory books in discrete mathematics and logic has to clarify "or vs xor" (example: Logic in Computer Science, Huth & Ryan, p.4). I think the price of repetition is worth the potential unreadability of the proposed |
||
|
||
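For comparison, the overlapping form can be approximated in today's Rust with the standard `matches!` macro. The sketch below rests on several assumptions that are not in the RFC: it merges two separate sorted slices (`a` and `b`) instead of indexing a single `array`, it uses `Ord::cmp` in place of `cmp.compare`, and the `merge` function name is hypothetical.

```rust
use std::cmp::Ordering::{self, Equal, Greater, Less};

/// Merges two sorted slices, keeping both elements when they compare equal.
/// The two `if matches!(..)` tests stand in for the two `match/in` branches.
fn merge(a: &[i32], b: &[i32]) -> Vec<i32> {
    let (mut left, mut right) = (0, 0);
    let mut merged = Vec::with_capacity(a.len() + b.len());

    while left < a.len() && right < b.len() {
        // Evaluate the scrutinee once, before any branch changes the indices.
        let ord: Ordering = a[left].cmp(&b[right]);

        // Branch 1: `Less | Equal`
        if matches!(ord, Less | Equal) {
            merged.push(a[left]);
            left += 1;
        }
        // Branch 2: `Greater | Equal`
        if matches!(ord, Greater | Equal) {
            merged.push(b[right]);
            right += 1;
        }
    }

    // One side is exhausted; append whatever remains of the other.
    merged.extend_from_slice(&a[left..]);
    merged.extend_from_slice(&b[right..]);
    merged
}
```

Unlike the proposed syntax, this emulation gets no exhaustiveness check: deleting the `Greater | Equal` test would still compile.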
# Drawbacks | ||
[drawbacks]: #drawbacks | ||
|
||
A reason not to do this is that it increases the size of the language and might not be used by | ||
everyone. | ||
|
||
Review comment: I'd like to see some discussion of the readability of inclusive top-down multiple-branch |
||
# Alternatives | ||
[alternatives]: #alternatives | ||
|
||
1. Instead of using `match` as a basis, remove patterns from the equation and introduce | ||
some notation that asks the compiler to prove that some value will be set to true by the time | ||
a certain point in the code has been reached. This has some downsides: | ||
    1. It requires the compiler to prove that something is true, which the compiler currently | ||
    does not do, so it would require a lot more work. | ||
    2. There does not seem to be any syntax that makes sense in this case without adding | ||
    a new keyword, and avoiding that is preferable. | ||
2. Do nothing. Since the existing code works and is reasonably usable, this feature is not | ||
strictly necessary, so not implementing it is an option. | ||
|
||
# Unresolved questions | ||
[unresolved]: #unresolved-questions | ||
|
||
Whether or not `match/in` makes sense for this sort of control flow. |
Review comment: "a whole new way" makes it sound like an advertisement and doesn't convey any information the reader might be interested in. It's also grammatically questionable.