match #34

Open
nrc opened this Issue Oct 25, 2016 · 16 comments

Projects

None yet

8 participants

@nrc
Collaborator
nrc commented Oct 25, 2016

basics

match $discr {
    $arm1
    $arm2
    ...
}

Spaces and newlines as shown. In particular, prefer not to line-break the discriminant if possible; each match arm must have its own line; arms are block indented.

simple arms

Short arm:

$pat if $expr => $expr,

Prefer the above form if the arm fits on one line.

Avoid single line blocks:

$pat if $expr => { $expr }   // No

The last match arm should have a comma if the body is not a block:

match $expr {
    $expr => foo,
    $expr => bar,
}

Block body:

$pat if $expr => {
    ...
}

Note no comma if using a block, prefer not to line-break pattern or if clause, block indentation of arm body.

Empty blocks:

$pat if $expr => {}

Note {} not (), no comma, no space in the block.

Question - multi-line single expressions

Should a block be required?

E.g.,

match foo {
    bar => 42,
    baz => a_very_long_expr(dsfsdfsdfa,
                            dsafdfa,
                            dsfsfdsfdsfsd),
}

// vs

match foo {
    bar => 42,
    baz => {
        a_very_long_expr(dsfsdfsdfa,
                         dsafdfa,
                         dsfsfdsfdsfsd)
    }
}

Rustfmt currently does the latter, I found that it sometimes makes it easier to read, helps prevent rightward drift, and gets around the slightly unfortunate issue of expressions with blocks needing a comma:

match foo {
    bar => 42,
    baz => if qux {
        ...
    },  // comma needed here :-(
}

I'm no longer 100% convinced this is worth the extra vertical space however. We might want to leave this question unanswered for a while since the benefit in preventing rightward drift would not be so important if we pick a style which other wise prevents it (e.g., by using two space indent, or using more block indentation):

match foo {
  bar => 42,
  baz => a_very_long_expr(dsfsdfsdfa,
    dsafdfa,
    dsfsfdsfdsfsd),
}

Breaking the arm

Prefer breaking before the body:

$pat =>
    $expr,

or

$pat => {
    ...
}

Then prefer breaking before the if (if present, see below).

Then prefer breaking the pattern at |s (if there are multiple sub-patterns, see below). If breaking the pattern, the rest of the arm should only be broken as necessary. E.g., the following are ok:

pat1 |
pat2 |
pat3 if $expr => {
    ...
}
pat1 |
pat2 |
pat3 => $expr,
pat1 |
pat2 |
pat3 if $expr => $expr,

Finally, break each pattern if necessary.

Breaking before if

If necessary to break before the if, it should be block indented, a block is required for the arm body, and the { of the block should be on a newline. This helps prevent the if clause be distinguished from the arm body. Never break after the if:

$pat
    if $expr =>
{
    ...
}

If $expr does not fit on one line, break it according to the rules for expressions.

large patterns

If using a compound pattern (pat1 | pat2), prefer to put the whole pattern on one line. If it does not fit, put each sub-pattern on a new line, with the line terminated by |:

pat1 |
pat2 |
pat3 => {
    ...
}

If any sub-pattern does not on the line, break it according to the rules for patterns.

Question - blank lines between arms

These are sometimes useful for indicating groups of arms, I propose Rustfmt leaves single blank lines where the user writes them and coalesces multiple blank lines into one. E.g.,

match $expr {
    $expr => foo,

    $expr => bar,
}

Question - consistency between arms

Should we allow a mix of block arms and single expression arms? Or should we use blocks for every arm if any arm requires it? The second alternative is more consistent, but wastes some space. It also looks odd in the relatively common case where there is a single large arm and many small arms.

Question - exhaustiveness

Should we offer any guidance about using _ in match arms? I feel there are strong idioms for using this or not depending on desired backwards compatibility. However, it may be beyond the remit of the style guide/process.

@steveklabnik

Note no comma if using a block,

I pretty much always use commas, I've found this behavior of rustfmt confusing.

Note {} not (), no comma, no space in the block.

I know I use () not {} here, since () is going to be the type/value of the return of {} anyway, so it feels more clear.

@joshtriplett
Member

Regarding the comment about "no single-line blocks", I think single-line blocks are fine for short cases like unsafe { u.x }, per the discussion on #21. Though that effectively turns into an expression, and the block lives inside the expression rather than directly in the match, so the expression/block rules apply. Worth thinking about though.

Also, I don't know what rustfmt currently does, but in the same spirit as writing

let x = long_expr
        + long_expr_2;

(with the + at the start of the second line), I prefer to write:

match expr {
    pattern1
    | pattern2 => ...
}
@nrc
Collaborator
nrc commented Oct 25, 2016

I pretty much always use commas, I've found this behavior of rustfmt confusing.

To expand on my reasoning - I find block + comma noisey and prefer to avoid this kind of duplication where possible (by analogy to e.g., parens in if expressions and braces around an expression in match arms).

I know I use () not {} here, since () is going to be the type/value of the return of {} anyway, so it feels more clear.

My analogy here is that for a function we don't usually write -> () for a void return type, nor do we write

fn foo() {
    ()
}

So, it feels weird to explicitly write the () in a match.

I also think that the intuition of an empty match arm is "do nothing" (i.e., execute an empty block), rather than "reduce to a value which is void", in particular because it is rare to assign the result of a match with void type into a value.

I also argue by analogy to block expressions in statement position - you wouldn't write if { ... };.

Regarding the comment about "no single-line blocks" ...

Yeah, this is specifically about the arm body and whether that should be a block. I don't consider this applying to either if { ... } or unsafe { ... }, etc. I don't think you would ever need a regular block with a single expression, since there is no scoping to be affected.

Also, I don't know what rustfmt currently does, but in the same spirit as writing

Hmm, this might get into a wider discussion so it might be worth peeling off. My opinion is that I prefer binops to be on the previous line rather than the next one. I don't feel too strongly about that, but I do feel strongly that commas should follow that behaviour, and I kind of want | (and by extension all binops) to follow commas. My reasoning here is that this is the behviour of punctuation in written English, and violating this rule makes the code look more alien and slower to parse. I believe the main argument against is better diffs and possibly better scanability.

@joshtriplett
Member
joshtriplett commented Oct 25, 2016 edited

I agree completely regarding commas, and semicolons; those operators should fade into the background as simple terminators, though.

Regarding other binops (and I do think we should split this discussion off): I find that putting a binop at the end of the previous line makes it easier to get lost, and harder to construct the tree in your head, especially when you break an expression in multiple places. If I look at something like this:

if really_complicated_expression
   || another_complicated_expression
   || yet_another_complicated_expression

(where the complicated expressions themselves have piles of operators and calls in them), having the || at the start of the next line makes the tree of operations more obvious. This has nothing to do with diffs; I consider it a standalone readability issue. I don't want those operators to fade into the background the way commas do; I want them in the foreground.

Typically, a comma or semicolon has importance because of its presence or absence, but for most binary operators, I need to know which operator (such as && versus ||).

@steveklabnik

I find block + comma noisey and prefer to avoid this kind of duplication where possible

I find the lack of consistency more noisy than the extra character. Every time I use JSON, it trips me up that I can't include the trailing comma. There's also the usual diff arguments, etc.

I also think that the intuition of an empty match arm is "do nothing" (i.e., execute an empty block), rather than "reduce to a value which is void", in particular because it is rare to assign the result of a match with void type into a value.

This might be the root of it, yeah, I think I do think this way.

I also argue by analogy to block expressions in statement position - you wouldn't write if { ... };.

I do actually write this a lot; it took a lot of habit-breaking to get out of it. I'd probably keep it in, but this isn't an issue about that 😉

@Alex-PK
Alex-PK commented Oct 25, 2016 edited

There is probably a dedicated RFC, but honestly, I don't like this syntax:

match foo {
  bar => 42,
  baz => a_very_long_expr(dsfsdfsdfa,
    dsafdfa,
    dsfsfdsfdsfsd),
}

I much prefer:

match foo {
  bar => 42,
  baz => a_very_long_expr(
    dsfsdfsdfa,
    dsafdfa,
    dsfsfdsfdsfsd
  ),
}

I think it's visually clearer they are all parameters to the function, and where the list ends. I don't really mind the vertical spacing. Sometimes I even leave blank lines between conditions to make them stand out more.

I really dislike the alignment in this:

match foo {
    bar => 42,
    baz => a_very_long_expr(dsfsdfsdfa,
                            dsafdfa,
                            dsfsfdsfdsfsd),
}

@nrc
Collaborator
nrc commented Oct 25, 2016

There is probably a dedicated RFC

Yes, there will be. For the match arm formatting I don't think it matters which of those we choose, but the difference between block and visual indenting might affect our decision.

@azerupi
azerupi commented Oct 25, 2016 edited

Note no comma if using a block,

I pretty much always use commas, I've found this behavior of rustfmt confusing.

I too, prefer the use of commas even for arm blocks. I like the consistency and that you can visually differentiate the closing match arm from any other closing block. The comma is one of the things I use to help distinguish between arms, especially when the arms are relatively long. It would be a shame to loose that IMHO.

@nrc
Collaborator
nrc commented Oct 26, 2016

Some discussion from the Rustfmt repo: rust-lang-nursery/rustfmt#1129 and rust-lang-nursery/rustfmt#490

@jethrogb

I often have match arms like this:

match x {
    long_const1 | long_const2 | long_const3 | long_const4 | long_const5
    | long_const6 | long_const7 | long_const8 | long_const9
    | long_const10 | long_const11 | long_const12 | long_const13
    => { ... }
}

Putting all the individual constants on a single line as proposed above takes up too much space IMO.

@chriskrycho

@jethrogb To me, that seems like a place to extract that into a function to check those conditions, rather than requiring the formatter to support that style.

@jethrogb

@chriskrycho I have an (error) enum with 100 variants, whatever I do, at the end there will be a match that looks like that.

@chriskrycho

I don't want to further derail this thread, but I'd be inclined to write a function that did something like this, instead of that long hairy match, if you need something to handle a bunch of different cases:

fn is_error_blah(&variant: MyError) -> bool {
  let allowed_variants = [MyError:A, MyError:B, MyError:C, ...];
  allowed_variants.into_iter().any(|v| v == variant)
}

But there will inevitably be some long lists in a match, so… ¯_(ツ)_/¯ I see your point.

@nrc
Collaborator
nrc commented Nov 29, 2016

@steveklabnik

There's also the usual diff arguments, etc.

I don't think there is a diff argument in this case - with a block you don't need a comma in any case, so adding or removing an arm has the same size diff with or without the extra comma.

@nrc
Collaborator
nrc commented Nov 29, 2016

I have an (error) enum with 100 variants, whatever I do, at the end there will be a match that looks like that.

My opinion is that there will always be edge cases where mechanical formatting does not work. Either you put such functions in a module by themselves and treat them like generated code, or you mark them with #[rustfmt_skip]. I don't think that we should let such edge cases dictate style for more common code.

@nrc
Collaborator
nrc commented Nov 29, 2016

I think this is ready for FCP. There is maybe a little more room for fine-tuning, but there hasn't been much movement recently.

@ubsan ubsan self-assigned this Dec 21, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment