tracking issue for default binding modes in match (RFC 2005, match_default_bindings) #42640

Open
nikomatsakis opened this Issue Jun 13, 2017 · 123 comments

Contributor

nikomatsakis commented Jun 13, 2017

This is a tracking issue for the "match ergonomics using default bindings mode" RFC (rust-lang/rfcs#2005).

Status: Awaiting stabilization PR and docs PR! Mentoring instructions here.

Steps:

Unresolved questions:

  • How to handle the logic around explicit bindings and coercion? #44848
  • When precisely should constants "autoderef"? #44849
  • Reset to by ref mode? #46688

Contributor

nikomatsakis commented Jun 13, 2017

I'm actually not 100% sure the best way to implement this. It seems like we will need to add adjustments to patterns, and then integrate those into the various bits of the compiler. @eddyb, have any thoughts on that? You usually have some clever ideas when it comes to this sort of thing. =)

Member

eddyb commented Jun 13, 2017

We should be able to treat adjustments more uniformly now, although it'd still be a lot of plumbing.

crazymykl commented Jun 15, 2017

How does this interact with slice patterns (#23121)?

Contributor

nikomatsakis commented Jun 15, 2017

@crazymykl

My expectation would be that, when we encounter a pattern like [a, ..b], we check the default binding mode. If it is "ref", then a becomes &slice[0] and b becomes (effectively) &slice[1..].
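A rough illustration of that expectation, written against slice patterns as they were later stabilized. This is only a sketch: the `rest @ ..` spelling is the stabilized syntax rather than the `..b` form used above, and the function name `split_first_demo` is made up for the example.

```rust
// Matching a `&[i32]` against a slice pattern: the default binding mode
// becomes "ref", so both bindings are references into the slice.
fn split_first_demo(slice: &[i32]) -> Option<(&i32, &[i32])> {
    match slice {
        // `a` is effectively `&slice[0]`, `rest` is effectively `&slice[1..]`.
        [a, rest @ ..] => Some((a, rest)),
        [] => None,
    }
}

fn main() {
    let data = vec![1, 2, 3];
    let (first, tail) = split_first_demo(&data).unwrap();
    assert_eq!(*first, 1);
    assert_eq!(tail, &[2, 3]);
    assert!(split_first_demo(&[]).is_none());
}
```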

Contributor

nikomatsakis commented Jun 15, 2017

@eddyb do we want to use general-purpose adjustments here, or something more limited? (The current RFC, after all, only requires autoderef of & and &mut types.) It seems, though, that if we can use adjustments, that'll lay a nice foundation for the future. But it may introduce a lot of questions (e.g., what does it mean to "unsize" a pattern) that are better left unasked.

Member

eddyb commented Jun 15, 2017

@nikomatsakis Autoderef or autoref? If it's one bit per pattern making it separate for now is fine.

Contributor

nikomatsakis commented Jun 16, 2017

@eddyb

> Autoderef or autoref? If it's one bit per pattern making it separate for now is fine.

autoderef -- that is, where you have a pattern like Some, the type you are matching might now be &Some or &mut Some (or &&Some, etc). I think I agree, I'm inclined to introduce a PatternAdjustment struct that just includes auto-deref and go from there.
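To make the "&Some or &mut Some (or &&Some, etc)" cases concrete, here is a small sketch of how they behave once the RFC is implemented. The function names `describe` and `bump` are invented for illustration, and the example assumes a compiler with default binding modes available.

```rust
// Two layers of `&`: `Some(x)` is a non-reference pattern, so both layers
// are auto-dereferenced and `x` ends up bound by reference (`x: &i32`).
fn describe(opt: &&Option<i32>) -> i32 {
    match opt {
        Some(x) => *x,
        None => 0,
    }
}

// Through `&mut`, the binding becomes `ref mut` (`x: &mut i32`).
fn bump(opt: &mut Option<i32>) {
    if let Some(x) = opt {
        *x += 1;
    }
}

fn main() {
    assert_eq!(describe(&&Some(5)), 5);
    assert_eq!(describe(&&None), 0);

    let mut v = Some(1);
    bump(&mut v);
    assert_eq!(v, Some(2));
}
```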

Contributor

nikomatsakis commented Jul 6, 2017

Here is a rough and incomplete implementation plan for the match ergonomics RFC.

There are a few high-level things we have to do:

  • figure out which patterns need an "auto-deref" -- e.g., if you have match &foo { Some(x) ... }, then we want to record on the Some(x) pattern that there is an automatic "dereference", meaning that it is equivalent to &Some(x)
  • figure out the "binding mode" for patterns, which will no longer always be explicit.

Right now, the code is set up to scrape this information directly from the "HIR" (the compiler's AST). The HIR node for a pattern is hir::Pat, of which the most interesting part is the kind field of type PatKind. If you have a pattern like Some(x), that would be represented as a tree: a TupleStruct node (encoding the Some) containing a Binding node (encoding the x).

We want to make this equivalent to &Some(ref x), which would be encoded as a:

  • Ref pattern, encoding the &
    • TupleStruct, encoding the Some
      • Binding with BindingMode of ref, encoding the ref x
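From the user's side, the equivalence described above can be sketched as two functions that should behave identically. The function names are hypothetical, and the first one assumes a compiler that implements the RFC.

```rust
// Written with default binding modes: no `&` or `ref` in the pattern.
fn with_default_binding(foo: &Option<String>) -> usize {
    match foo {
        Some(x) => x.len(), // x: &String
        None => 0,
    }
}

// The explicit form the compiler treats it as: a Ref pattern wrapping a
// TupleStruct pattern wrapping a by-ref Binding.
fn desugared(foo: &Option<String>) -> usize {
    match foo {
        &Some(ref x) => x.len(), // x: &String
        &None => 0,
    }
}

fn main() {
    let foo = Some(String::from("hello"));
    assert_eq!(with_default_binding(&foo), desugared(&foo));
    assert_eq!(with_default_binding(&None), desugared(&None));
    // `foo` was only borrowed, so it is still usable here.
    assert_eq!(foo.as_deref(), Some("hello"));
}
```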

We don't however have enough information to do this when we construct the HIR, since we need type checking results to do it, so we can't change the HIR itself. The way we typically handle this sort of thing then is to have the typeck encode "side tables" with auxiliary information. These tables are stored in TypeckTables and they encode all kinds of information.

In this case, I think we want two tables:

  • pat_adjustments, sort of roughly analogous to the existing adjustments table. It would have a type like NodeMap<usize>, I think.
    • The key of such a map is the "id" of the Pat -- stored in the id field of the Pat struct.
    • The value would just be a number indicating how many implicit & patterns we insert before this pattern. So, in our example, the Some pattern would wind up with a value of 1. For the other patterns, we'd probably just have no entry, meaning "none". (You could also store 0, but why waste the memory.)
  • pat_binding_modes would have type NodeMap<hir::BindingMode>. It would, for each binding pattern, indicate the actual binding mode -- this may vary from what is found in the HIR, since we may be encoding a ref and so forth. We may want to change the HIR then to have some different type, or perhaps an Option<hir::BindingMode> -- where None means that nothing was explicitly written, which is the normal case -- so as to make it more obvious that when the user writes x this does not mean a "by value" binding mode anymore.
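A toy model of the two side tables may help picture the proposal. Everything here is a stand-in, not compiler code: `NodeId` is a plain integer, `BindingMode` is a two-variant enum, `Tables` plays the role of TypeckTables, and a `HashMap` substitutes for NodeMap.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins; the real compiler types differ.
type NodeId = u32;

#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum BindingMode {
    ByValue,
    ByRef,
}

#[derive(Debug, Default)]
struct Tables {
    // Pattern id -> number of implicit `&` patterns inserted before it.
    pat_adjustments: HashMap<NodeId, usize>,
    // Binding pattern id -> the binding mode actually in effect.
    pat_binding_modes: HashMap<NodeId, BindingMode>,
}

impl Tables {
    // A missing entry means "no implicit dereferences".
    fn implicit_derefs(&self, pat: NodeId) -> usize {
        self.pat_adjustments.get(&pat).copied().unwrap_or(0)
    }

    // Only binding patterns have an entry.
    fn binding_mode(&self, pat: NodeId) -> Option<BindingMode> {
        self.pat_binding_modes.get(&pat).copied()
    }
}

// Records what type checking would conclude for
// `match &foo { Some(x) => ... }`, with `Some(x)` as node 1 and `x` as node 2.
fn tables_for_example() -> Tables {
    let (some_pat, x_pat): (NodeId, NodeId) = (1, 2);
    let mut tables = Tables::default();
    tables.pat_adjustments.insert(some_pat, 1);
    tables.pat_binding_modes.insert(x_pat, BindingMode::ByRef);
    tables
}

fn main() {
    let t = tables_for_example();
    assert_eq!(t.implicit_derefs(1), 1);
    assert_eq!(t.implicit_derefs(2), 0);
    assert_eq!(t.binding_mode(2), Some(BindingMode::ByRef));
    assert_eq!(t.binding_mode(1), None);
}
```

Later passes would then consult these lookups instead of reading the binding mode off the HIR node.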

Probably a decent first PR is just to introduce the second table (pat_binding_modes) and rewrite all the existing code to use it. Right now, code that wants to find the binding mode of a binding extracts the value straight from the HIR, as you can see in the following examples:

  • the check_match code, which enforces various sanity checks.
  • determine_pat_move_mode, which scrapes info from the HIR and also checks the type of the values being matched by value to decide if this is a copy-or-move;
  • local_binding_mode, a helper in borrowck -- this uses the "HIR Map" to lookup, given the id, the node for a given binding, and extracts its binding mode; this would be rewritten to use the new table
  • the HAIR conversion code; this is a precursor to MIR construction, which constructs a lowered form of the patterns called the HAIR. The HAIR is intended to be a copy of the HIR taking into account all of the information encoded in various side-tables, so you don't want to change the HAIR itself, just this code which creates it by scraping the HIR.

This is not a comprehensive list, but it does have the major use-sites. You can get a more comprehensive list by doing rg 'BindByValue|BindByRef', which is what I did.

OK, no time for more, but hopefully this helps somebody get started! Please leave a note if you are interested in taking this on, and feel free to ping me on IRC (nmatsakis) or gitter (nikomatsakis) with any questions (or ask in #rustc).

Contributor

tschottdorf commented Jul 14, 2017

I'll look into the first step:

> Probably a decent first PR is just to introduce the second table (pat_binding_modes) and rewrite all the existing code to use it.

Contributor

nikomatsakis commented Jul 17, 2017

@tschottdorf woohoo!

tschottdorf added a commit to tschottdorf/rust that referenced this issue Jul 21, 2017

WIP: default binding modes: add pat_binding_modes
This PR kicks off the implementation of the [default binding modes RFC][1] by
introducing the `pat_binding_modes` typeck table mentioned in the [mentoring
instructions][2].

It is a WIP because I wasn't able to avoid all uses of the binding modes as
not all call sites are close enough to the typeck tables. I added marker
comments to any line matching `BindByRef|BindByValue` so that reviewers
are aware of all of them.

I will look into changing the HIR (as suggested in [2]) to not carry a
`BindingMode` unless one was explicitly specified, but this PR is good for
a first round of comments.

The actual changes are quite small and CI will fail due to overlong lines
caused by the marker comments.

See #42640.

[1]: rust-lang/rfcs#2005
[2]: rust-lang#42640 (comment)

tschottdorf added a commit to tschottdorf/rust that referenced this issue Jul 28, 2017

default binding modes: add pat_binding_modes
This PR kicks off the implementation of the [default binding modes RFC][1] by
introducing the `pat_binding_modes` typeck table mentioned in the [mentoring
instructions][2].

`pat_binding_modes` is populated in `librustc_typeck/check/_match.rs` and
used wherever the HIR would be scraped prior to this PR. Unfortunately, one
blemish, namely two callers of `contains_explicit_ref_binding`, remains.
This will likely have to be removed when the second part of [1], the
`pat_adjustments` table, is tackled. Appropriate comments have been added.

See #42640.

[1]: rust-lang/rfcs#2005
[2]: rust-lang#42640 (comment)

bors added a commit that referenced this issue Jul 30, 2017

Auto merge of #43399 - tschottdorf:bndmode-pat-adjustments, r=nikomatsakis

default binding modes: add pat_binding_modes

This PR kicks off the implementation of the [default binding modes RFC][1] by
introducing the `pat_binding_modes` typeck table mentioned in the [mentoring
instructions][2].

It is a WIP because I wasn't able to avoid all uses of the binding modes as
not all call sites are close enough to the typeck tables. I added marker
comments to any line matching `BindByRef|BindByValue` so that reviewers
are aware of all of them.

I will look into changing the HIR (as suggested in [2]) to not carry a
`BindingMode` unless one was explicitly specified, but this PR is good for
a first round of comments.

The actual changes are quite small and CI will fail due to overlong lines
caused by the marker comments.

See #42640.

cc @nikomatsakis

[1]: rust-lang/rfcs#2005
[2]: #42640 (comment)

Contributor

tschottdorf commented Aug 19, 2017

Starting to look at this again. What do you think is a good next step, @nikomatsakis? I was considering another "plumbing" PR (introduce pat_adjustments which is always trivial), but that doesn't seem useful since it won't really be exercised. Instead I'll try to get a handle on what the computations that populate the typeck tables look like and where they'll live. I assume you'll have more pointers at some point!

Contributor

tschottdorf commented Aug 19, 2017

Seems that the changes that populate both tables are going to happen in the general vicinity of typeck's match handling, correct? Putting some of that logic in (and testing it) could be a good way to get started, even if it doesn't populate the tables yet.

Am I correctly assuming that the major (additional) place where these tables would be used is during HIR->HAIR lowering so that the HAIR representation spells out the information in the tables explicitly (at which point we're "done" with the tables)?

Contributor

tschottdorf commented Aug 19, 2017

Is the below correct? (I'm not sure it's useful to have this method except for assertions, but it'll help me understand what's a non-referential type.)

```rust
#[allow(dead_code)]
impl PatKind {
    /// Returns true if the pattern is a reference pattern. A reference pattern
    /// is any pattern which can match a reference without coercion. Reference
    /// patterns include bindings, wildcards (_), consts of reference types, and
    /// patterns beginning with & or &mut. All other patterns are non-reference
    /// patterns.
    ///
    /// See https://github.com/rust-lang/rfcs/blob/master/text/2005-match-ergonomics.md#definitions
    /// for rationale.
    fn is_reference_pattern(&self) -> bool {
        // NB: intentionally don't use a catchall arm because it's good to be
        // forced to consider the below when adding/changing `PatKind`.
        //
        // FIXME: is the below correct? In particular, where do "consts of reference types"
        // end up?
        match *self {
            PatKind::Wild |
            PatKind::Binding(..) |
            PatKind::Ref(..) => true,
            PatKind::Struct(..) |
            PatKind::TupleStruct(..) |
            PatKind::Path(_) |
            PatKind::Tuple(..) |
            PatKind::Box(_) |
            PatKind::Lit(_) |
            PatKind::Range(..) |
            PatKind::Slice(..) => false,
        }
    }
}
```

Member

eddyb commented Aug 19, 2017

In Rust, names are usually "paths", because if you can say FOO you can also say std::i32::MIN.
So constants would be PatKind::Path.

Contributor

tschottdorf commented Aug 19, 2017

@eddyb gotcha. So for PatKind::Path(_) we'd have to look into the _ and (after some magic incantation) see if it's a const ref.

I still must be missing something. A "const of reference type" would be const CONST_OF_REF_TYPE: &u8 = &5? I don't even know how to use this in a pattern at all. Is there a simple example?

tschottdorf added a commit to tschottdorf/rust that referenced this issue Aug 20, 2017

Introduce pat_adjustments table
Part of the impl of match default binding modes, see

rust-lang#42640

As of this commit the table exists, but it isn't populated
and also not propagated (in `WritebackCx`).

eddyb Aug 20, 2017

Member

You'd literally use that name in a pattern and it's equivalent to the pattern &5. What you want though is to look at the type, not the shape of the pattern, if you want to know whether it's a reference or not.

For coercions in expressions, we compute a type for an expression, then compare it with the "expected type" coming from the parent expression and if they don't match, we can perform some adjustments. I'd expect patterns to follow a similar route.


tschottdorf Aug 20, 2017

Contributor

I assume you mean the below, but I've tried and failed to get an example that compiles. What am I missing?

const CONST_REF: &i64 = &5;

fn main() {
    print!("{}", CONST_REF);
    let f = &5i64;
    match f {
        CONST_REF => (),
        _ => (),
    };
}

error[E0080]: constant evaluation error
 --> src/main.rs:1:25
  |
1 | const CONST_REF: &i64 = &5;
  |                         ^^ unimplemented constant expression: address operator
  |
note: for pattern here
 --> src/main.rs:7:9
  |
7 |         CONST_REF => (),
  |         ^^^^^^^^^

eddyb Aug 20, 2017

Member

Oh, odd. Well, a &str literal should work nevertheless (or a &[u8; N] one).
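For reference, here is a minimal sketch of a constant of reference type used as a pattern, following the &str suggestion above. The names KEYWORD and is_keyword are made up for illustration. The point is that the pattern is written as a plain name (a path) even though its type is a reference.

```rust
// Hypothetical names, for illustration only.
const KEYWORD: &str = "match";

/// Returns true when `s` is equal to the constant.
fn is_keyword(s: &str) -> bool {
    match s {
        // Written as a path, but the constant's type is `&str`.
        KEYWORD => true,
        _ => false,
    }
}
```

Calling is_keyword("match") takes the first arm, while any other string falls through to the wildcard.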


nikomatsakis May 16, 2018

Contributor

(And I think it is a backwards compatible extension)


eddyb May 16, 2018

Member

I'm not entirely sure this is backwards-compatible, because you can use Some(&x) currently to match &Option<&T> and it results in x: T, whereas the proposal would make x: &T.


rpjohnst May 16, 2018

Contributor

That's deeply unfortunate, how did we wind up with that? 😱


cramertj May 16, 2018

Member

I think we can avoid that by only performing the binding mode switch back to by-move when matching a non-reference with a reference pattern.


cramertj May 16, 2018

Member

Basically, just like what we did with the first iteration of default-match-binding-modes, we only change the case that doesn't compile at all today.


joshtriplett May 16, 2018

Member

@cramertj I'd like to suggest that "changing a case that doesn't compile to now compile" is not a heuristic we can or should apply universally. Sometimes it's the compiler's job to refuse to compile something and complain, so that code gets fixed.

In this particular case, if it works in some cases but not in others, and the demarcation between the two isn't extremely clear, or the behavior isn't consistent (which the case @eddyb mentioned suggests it wouldn't be), then that seems worse than simply giving an error about mismatched types. Just from @eddyb's description, I already feel like I can no longer successfully predict how matching will work in cases that involve references.


leodasvacas May 16, 2018

Contributor

@rpjohnst It's the exact behaviour specified in the RFC.


rpjohnst May 16, 2018

Contributor

@leodasvacas I mean, we already had a fairly large misunderstanding about what exactly the RFC specified here. In fact I'm still not at all sure why it behaves this way. That is, here's my interpretation of the RFC, in which the second case doesn't match the implementation:

match &Some(&0) {
    // starts out as move; looking at &Some(&0)
    Some( // non-reference pattern; deref + switch to ref; now looking at &0
         x // reference pattern; mode stays ref; was looking at &0 so x: &&i32
          ) => ...

    // starts out as move; looking at &Some(&0)
    Some( // non-reference pattern, deref + switch to ref; now looking at &0
         & // reference pattern; mode stays ref; now looking at 0
          x // reference pattern; mode stays ref; was looking at 0 so x: &i32
           ) => ...
}

Or the desugaring:

match &Some(&0) {
    &Some(ref x) => /* x: &&i32 */
    &Some(&ref x) => /* x: &i32 */
}

This desugaring for the second case here (i.e. @eddyb's example) does give x: &i32, so that's clearly not what the implementation is actually doing. Where does the implementation's interpretation of the RFC differ from mine, and why? This looks like a bug to me right now.
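The three readings discussed here can be compared with fully explicit patterns, which involve no default binding mode at all. The sketch below is only an illustration: the function names are invented, and the `let` annotations are there so the compiler checks each binding's type.

```rust
// Every pattern is spelled out in full, so the binding types follow
// from ordinary pattern rules rather than from default binding modes.

/// Desugaring of the first arm: `x` borrows the inner reference.
fn first_case(v: &Option<&i32>) -> i32 {
    match v {
        &Some(ref x) => {
            let r: &&i32 = x;
            **r
        }
        &None => 0,
    }
}

/// The RFC reading of `Some(&x)` given above: `x` borrows the integer.
fn rfc_reading(v: &Option<&i32>) -> i32 {
    match v {
        &Some(&ref x) => {
            let r: &i32 = x;
            *r
        }
        &None => 0,
    }
}

/// What the thread reports for `Some(&x)`: the `&` resets the mode
/// to move, so `x` is the integer itself.
fn reported_behaviour(v: &Option<&i32>) -> i32 {
    match v {
        &Some(&x) => {
            let n: i32 = x;
            n
        }
        &None => 0,
    }
}
```

All three return the same number for the same input; they differ only in the type of the binding, which is what the disagreement in this thread is about.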


sgrif May 16, 2018

Contributor

Is there any way to turn this off within a codebase? For libraries that support <1.26, this is going to be a pain point. CI ensures that we don't accidentally rely on it, but it'd be great if there were a way to make sure that builds fail locally for new contributors regardless of which Rust version they're using. Otherwise we'll just continue to get an increasing number of PRs which depend on this feature only to fail on CI.


leodasvacas May 16, 2018

Contributor

@rpjohnst You've convinced me, the implementation does not match the RFC specification. So we should either specify the implementation and amend the RFC, or fix the implementation, which should be done sooner rather than later, perhaps even taking advantage of the possibility of a point release #50756.

Actually, this was an explicitly noted change from RFC semantics so we should just amend the RFC.


nikomatsakis May 16, 2018

Contributor

@rpjohnst I believe that was the behavior we opted for in #46688


nikomatsakis May 16, 2018

Contributor

@sgrif there is not presently a way to turn this (or other) features off that I know of; interesting idea though.


rpjohnst May 16, 2018

Contributor

Ah, so the extra rule is that & patterns no longer preserve the binding mode, but unconditionally reset it. I'd still rather revert that change- it's an extra rule to remember, and it was never in the RFC so it looks like it didn't get discussed much. :(

Though on the other hand, I'm not actually sure whether it makes @Boscop's proposal backwards-incompatible. Could you expand on that @eddyb? Matching a & against an actual & would have the existing meaning (switch to move and deref) while matching a & against a non-& while in ref mode would just switch to move.

On the gripping hand, I'm also strongly in agreement with @joshtriplett in this case. Giving & both those meanings seems very confusing. So I kind of feel like it should be either-or- do we want the existing behavior or do we want the new behavior? I really like the consistency of the new behavior (e.g. #42640 (comment)), and as I've said it's how I always expected it to work.


eddyb May 17, 2018

Member

@rpjohnst Maybe I inaccurately represented @Boscop's proposal - my version doesn't even have a "binding mode" - just that matching &T by a T pattern results in extra & pushed to sub-patterns.


nikomatsakis May 17, 2018

Contributor

I'm a bit torn on all this. On the one hand, I find @eddyb's idea interesting, though I think there are many variants and I'm not 100% sure what is being proposed. My understanding was roughly "if you have a value of type &T and a ctor for T, you can 'propagate' the & inside". I like that idea, but it has a lot of ramifications that are quite different from the existing proposal, and they go beyond #46688.

For example, if I am matching a value of type &Option<&Option<T>> with the pattern Some(Some(x)), the current master gives x the type &T. If I understood @eddyb's proposal, it would give &&T — or perhaps error, it's not entirely clear to me. In particular, we would begin by propagating the & through the first Some(..) pattern, giving a value of type &&Option<T>. Now we match that with Some(..), which either errors or gives a value of type &&T.

So basically it seems to me that we are really talking about rolling back the whole match default bindings change. This is a much bigger thing than #46688.

(When I thought it was just #46688 that was under discussion, I was considering this to be rather similar to the "course correction" that we took for Object Lifetime Default rules. The reasoning here is slightly different, but ultimately I think that was the right call. But this seems to be a much larger proposal.)

Now, maybe there are other proposals where reverting #46688 would be enough. Perhaps @Boscop's proposal, for example?
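To make the &Option<&Option<T>> case above concrete, here is a small sketch with T = String. The function name is invented for illustration. The `let` annotation shows that only one reference ends up on the binding, even though the pattern looked through two.

```rust
/// Returns the length of the innermost string, if both layers are `Some`.
fn innermost_len(v: &Option<&Option<String>>) -> Option<usize> {
    match v {
        Some(Some(x)) => {
            // Two references were looked through, but `x` is `&String`,
            // not `&&String`.
            let s: &String = x;
            Some(s.len())
        }
        _ => None,
    }
}
```

Under the alternative discussed here, the same pattern would either be rejected or would have to produce a &&String, which is where the "dummy temporaries" concern comes from.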


eddyb May 17, 2018

Member

@nikomatsakis I would expect an error and require Some(&Some(x)), which has always been a valid pattern for the type behind the reference, but the possibility of a valid Some(&&Some(x)) pattern makes me uneasy and I begin to agree with the motivation for a "deeper flattening" of references.


nikomatsakis May 17, 2018

Contributor

@eddyb the need to sometimes introduce dummy temps just to make "doubly indirect" references feels bad to me, if that's what you mean (i.e., if we were going to produce a &&T value for x)


eddyb May 17, 2018

Member

@nikomatsakis No, in my view, you would never add more than one reference to bindings, because you wouldn't add more than one reference to sub-patterns on a constructor matching a single level of reference to the type of the constructor.


leodasvacas May 17, 2018

Contributor

I see nothing wrong with the current rules, and certainly see no reason to roll back stabilized functionality.

In fact I like the extension that makes strictly more things compile, it's a DWIM approach. The interpretation is that a & in a pattern always means by-move. If it's matching a &T, then move T. If it's matching T in ref binding mode, then move T.


eddyb May 17, 2018

Member

@leodasvacas I agree with that sentiment, now that I've seen examples with nested references.
My idea was "interesting" in the very strict sense that it had a "bijective" sugar, but nothing more.


rpjohnst May 17, 2018

Contributor

Ah, I hadn't quite realized the difference between @Boscop's and @eddyb's proposals. I agree we probably want to keep the ability to match multiple non-reference patterns without "accumulating" references. (Those extra "layers" of pointers do exist in the match-ee, but they can't be reused in the &&T, so we'd be inventing them out of thin air anyway.)

I do still think that a) we definitely want something like @Boscop's extension (which should be backwards-compatible) and that b) it interacts weirdly with the change from #46688. That is, I would like a single way to reset the binding mode that applies anywhere; not just at pre-existing references.

The confusion in #46688 comes from the fact that the RFC "hasn't quite made up its mind" on whether &s are part of the "structure" of a type- they become optional rather than invisible, or in other words the new match behavior is more like a coercion than a mode.

So here's an idea to solve both of these problems: once we're in a ref binding mode, always remove &s from the "visible structure" of a type, and let all &s in the pattern work like @Boscop's extension. That is, bindings are now "non-reference patterns" just like constructors, and & is no longer a "reference pattern" that can match against &T values because those values are no longer visible.

This gives the same result as the current implementation for #46688: matching (_, b) against &(1, &2) desugars to &(_, &ref b) rather than &(_, b), giving b: &i32; matching (_, &b) desugars to &(_, &<ref cancelled out by &>b), giving b: i32. But it also gives us the ability to handle @Boscop's #50008: matching (i, foo) against &(usize, Foo) still desugars to &(ref i, ref foo), giving i: &usize; but matching (&i, foo) desugars to &(<ref cancelled out by &>i, ref foo), giving i: usize as desired.

Or in other words, you can no longer match against actual &s, but only against the artificial &s generated by being in ref binding mode.

Edit: Actually the more I think about this version, the more it feels like 50% pedagogical change and 50% just adding @Boscop's extension. Which is great, because that means it's more likely to be compatible and also more likely to offer better error messages!
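For the #50008 case, the two forms can be compared in a short sketch. Foo here is a hypothetical stand-in type, and the function names are invented. The second function writes out by hand the explicit pattern that the proposed (&i, foo) is described as desugaring to; it does not use the proposed syntax itself.

```rust
// `Foo` is a hypothetical non-Copy type, for illustration only.
struct Foo {
    name: String,
}

/// Default binding modes: both bindings become references.
fn with_default_modes(pair: &(usize, Foo)) -> usize {
    let (i, foo) = pair;
    let i: &usize = i;
    let foo: &Foo = foo;
    *i + foo.name.len()
}

/// The explicit form: `i` is copied out, `foo` is still borrowed.
fn with_explicit_pattern(pair: &(usize, Foo)) -> usize {
    let &(i, ref foo) = pair;
    let i: usize = i;
    let foo: &Foo = foo;
    i + foo.name.len()
}
```

The extension would let the first style keep its short pattern while still getting i: usize, rather than forcing a switch to the fully explicit pattern.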


Boscop May 17, 2018

@rpjohnst Just to clarify: You propose that all &s of the constituent's type will be undone, and then the implicit ref will add one level of &, right?
So matching (a, b) against &(1, &&2) will desugar to &(ref a, &&ref b) (which will result in a: &i32 and b: &i32), correct? (Similar with more levels of &)
So one could remove the & introduced by the implicit ref, too, by writing (a, &b).
The rule that all constituents will only have one level of & is easy to remember.

It works well for read-only (&) references because for those it makes no difference how many levels of references the type has.
But for mutable references it makes a difference, e.g.:
Matching (a, b) against &mut (1, &mut &mut 2) will desugar to &mut (ref mut a, &mut &mut ref mut b), resulting in b: &mut i32 (with that rule).
But you can do less with a &mut T than with a &mut &mut T, so in some cases, users might want to keep multiple levels of &mut.

So, if we want to preserve the ability to keep all the &mut-ref levels of the original type, we have to do it for read-only refs, too (if we want to preserve consistency between read-only refs and mutable refs).

So then we can't auto-remove all the & levels of the original type, but we can keep the current rule of match_default_bindings that's equally easy to remember ("each constituent's type will get one additional ref") but allow the user to undo each level by using & in the pattern. E.g. matching (_, &&&b) against &(1, &&2) will result in b: i32 (all & levels undone, including the implicit ref), and matching (_, &mut &mut &mut b) against &mut (1, &mut &mut 2) will also result in b: i32.

(But I think it makes sense to allow undoing a &mut by writing &. E.g. matching (_, &&&b) against &mut (1, &mut &mut 2) to get b: i32.)
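A small sketch of why the extra &mut level can matter, under the current rules. The function name retarget and the tuple shape are made up for illustration. With a &mut &mut i32 binding the inner reference itself can be replaced, which a flattened &mut i32 would not allow.

```rust
/// Points the tuple's second field at `other` instead of its old target.
fn retarget<'a>(pair: &mut (i32, &'a mut i32), other: &'a mut i32) {
    // Under the current rules `b` keeps both levels of `&mut`.
    let (_, b) = pair;
    let b: &mut &'a mut i32 = b;
    // Reseating the inner reference needs the outer `&mut` layer;
    // a flattened `&mut i32` could only write through to the old target.
    *b = other;
}
```

After the call, writes through the tuple's second field land in the new target and leave the old one untouched.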


rpjohnst May 17, 2018

Contributor

So matching (a, b) against &(1, &&2) will desugar to &(ref a, &&ref b) (which will result in a: &i32 and b: &i32), correct? (Similar with more levels of &)

Yes, exactly. Kind of doubling down on the idea that &s aren't part of the type's structure while in ref mode. This is one place that the behavior changes, though, so we may want to consider it for a point-release.

So, if we want to preserve the ability to keep all the &mut-ref levels of the original type, we have to do it for read-only refs, too (if we want to preserve consistency between read-only refs and mutable refs).

IMO we shouldn't do this at all- the move binding mode is sufficient to control the levels of references. If you need that control, just switch back to that mode with a & before you reach the binding. Straightforwardness and consistency for & vs &mut for the ref binding modes, and full control for the move binding mode.

The idea of accumulating extra reference levels, as you propose, is basically the thing that we (mis)interpreted @eddyb's version as, and which has the problem of introducing dummy temporaries, and which is backwards-incompatible with the #46688 fix.


Boscop May 17, 2018

@rpjohnst

The idea of accumulating extra reference levels, as you propose, is basically the thing that we (mis)interpreted @eddyb's version as, and which has the problem of introducing dummy temporaries, and which is backwards-incompatible with the #46688 fix.

Hm, which dummy temporaries would it introduce?

Wouldn't your proposal also be backwards incompatible (because currently, matching (_, b) against &(1, &2) results in b: &&i32 but afterwards, it would result in b: &i32)?
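
To make the current behavior concrete, here is a minimal sketch assuming stable Rust with default binding modes. The function names are illustrative and not from the thread. The first function relies on the default `ref` binding mode and so adds a reference level; the second uses an explicit `&` pattern and so stays in `move` mode.

```rust
// Default binding mode: the tuple pattern `(_, b)` is matched against a
// reference, so `b` is bound by reference to the field. The field is
// already `&i32`, which makes `b: &&i32`.
fn second_field<'a>(pair: &'a (i32, &'a i32)) -> &'a &'a i32 {
    let (_, b) = pair;
    b
}

// Explicit `&` pattern: the binding mode stays `move`, so `b` is a copy
// of the inner `&i32` (shared references are `Copy`).
fn second_field_explicit<'a>(pair: &(i32, &'a i32)) -> &'a i32 {
    let &(_, b) = pair;
    b
}
```

Under the proposal being discussed, the first form would yield `&i32` instead, which is the incompatibility the question is about.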

rpjohnst May 17, 2018

Contributor

It introduces dummy temporaries when you have some structure in between the layers of references. If you're just matching against an &&&i32 or whatever then that's unnecessary.

And yes, my proposal is backwards-incompatible in cases like matching (_, b) against &(1, &2) or .. x .. against .. &&1 .., which is why I mentioned a point release. I'm not sure how else to simultaneously get your proposal working, fix #46688, and not give & multiple meanings.

Boscop Jul 19, 2018

Any update on this issue? :)

rpjohnst Jul 24, 2018

Contributor

So I think the place we're at now is @Boscop's original proposal, summarized by @nikomatsakis in #42640 (comment) as "make &P patterns, when matching a non-reference type and in a 'default binding mode' of ref, go back to a default binding mode of move."

While in a sense this gives & two meanings, as I expressed in #42640 (comment), AFAICT it's the only thing that's backwards-compatible, unlike @eddyb's and my later proposals. So it's probably the best we can do if we want to solve the problem in #50008.

I'd like to solve this; it came up again on Reddit and it's kind of a pain. Is the next step just to write an RFC for it?
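
As a rough illustration of the case the &P proposal targets, as I understand it: one field kept as a reference and another copied out. A pattern like `(name, &count)` was not accepted at the time of this discussion, so this sketch shows two spellings that do compile. The function names and the tuple type are made up for the example.

```rust
// Default binding mode: both bindings are references, so the `Copy`
// field has to be dereferenced by hand in the body.
fn describe(entry: &(String, u32)) -> (usize, u32) {
    let (name, count) = entry; // name: &String, count: &u32
    let count = *count;
    (name.len(), count)
}

// Fully explicit pattern: `&` keeps the binding mode at `move`, `ref`
// borrows the `String`, and `count` is copied out as a plain `u32`.
fn describe_explicit(entry: &(String, u32)) -> (usize, u32) {
    let &(ref name, count) = entry; // name: &String, count: u32
    (name.len(), count)
}
```

Both functions return the same result; the difference is only in where the dereference is written.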

withoutboats Jul 26, 2018

Contributor

@rpjohnst The blocker, I think, is that there are several possible solutions to this problem and considerable dissensus among the lang team (and certainly the community also) about which would be best:

  1. As you said, using the & sigil is one option.
  2. Another option would be to introduce a new pattern syntax, like (a, move b), instead of (a, &b).
  3. Finally, it's been argued that another, currently postponed proposal to insert dereferences to convert &T to T where T: Copy would resolve this problem in practice.

Personally, I lean toward one of the latter two options because I think the use of & with inverted meaning in patterns is, despite its conceptual elegance, very confusing for many users, who don't seem to think of the reference operator as a type constructor/destructurer per se.

rpjohnst Jul 26, 2018

Contributor

@withoutboats Ah, I didn't realize this would be in conflict with the second two options, especially in light of #48448.

So would the next step be to get consensus on which approach to take? Do we need to stick with just one, or could we have both to round out the existing behavior?

withoutboats Jul 27, 2018

Contributor

@rpjohnst Yes, we need consensus-building to move forward here. As far as I know, we don't even have consensus that we shouldn't do more than one of them.

Boscop Jul 27, 2018

@withoutboats

> 3. Finally, it's been argued that another, currently postponed proposal to insert dereferences to convert &T to T where T: Copy would resolve this problem in practice.

Always, unconditionally? But &mut T couldn't be unconditionally dereferenced/moved, because the code in the body might want to mutate the original.

(Also, wouldn't this solution break existing code (on stable) that uses match_default_bindings, where code in the body is now manually dereferencing the reference?)

> 2. Another option would be to introduce a new pattern syntax, like (a, move b), instead of (a, &b).

Please don't forget, it should be possible to mark the moved-out copy as mut locally, like &(mut x) to copy the &T as a T and make it locally mutable. (Using the existing &(mut x) and &mut (mut x) syntax.)

> 1. As you said, using the & sigil is one option.

I really think the best option is to use the & sigil because that's the most consistent/familiar way and what people would expect.
And people would expect the &(mut x) syntax to work with the & sigil in this case, too!
(If we don't use the & sigil, we'd also have to special-case another syntax to make the move-copied value locally mut!)

Btw, what I would also like (in addition to going with the & syntax option) is to be able to use & for move-copying out of a &mut T as well, instead of having to write &mut x. This would also allow using &(mut x) to move-copy out of a &mut T and make the copy mut in the local scope.

> Personally, I lean toward one of the latter two options because I think the use of & with inverted meaning in patterns is, despite its conceptual elegance, very confusing for many users, who don't seem to think of the reference operator as a type constructor/destructurer per se.

I've talked to several people who expected it to already work like this because it also works like this when the reference wasn't created implicitly by match_default_bindings but inherent in the type before.
The reasoning difficulty for this is not the & syntax but recognizing that match_default_bindings implicitly turned the constituents into references. After that, the reader's reasoning can follow the normal rules that apply when matching a &T with a &x pattern (including the &(mut x) syntax), so if we settle on the & syntax, it only requires the reader to recognize that match_default_bindings is active here, but uses familiar and consistent syntax (unlike the other 2 proposed solutions).
(Now you may say that seeing a move keyword there would make it clearer to the reader what's going on, BUT: In many instances where match_default_bindings is used, all constituents will be left as a reference, so Rust coders have to be able to spot instances where match_default_bindings is active anyway! We always have to keep a type checker running in our head while reading/writing Rust code anyway.)
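
For reference, a small sketch of the existing &(mut x) and &mut (mut x) syntax mentioned above, as it works today when the reference is part of the matched type. The function names are illustrative only. The parentheses are needed because `&mut x` would parse as a mutable-reference pattern.

```rust
// `&(mut x)` matches a shared reference, copies the `i32` out, and makes
// the local copy mutable. The original value is untouched.
fn bump_copy(r: &i32) -> i32 {
    let &(mut x) = r;
    x += 1;
    x
}

// `&mut (mut x)` does the same through a mutable reference: `x` is a
// mutable local copy, so incrementing it does not write through `r`.
fn bump_copy_from_mut(r: &mut i32) -> i32 {
    let &mut (mut x) = r;
    x += 1;
    x
}
```

The suggestion in this comment is that the same two spellings should also work when the reference was introduced implicitly by match_default_bindings.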

rpjohnst Jul 27, 2018

Contributor

Having thought through this some more, this is one place in the language where I can easily see supporting both & and move, like I mentioned above. We already have two "modes" and two prominent corresponding mental models: old-style patterns with "&T is an address," and ergonomics-style patterns with "&T is a less-permissive T."

So people (and, more importantly IMO, particular pieces of programs given their context) can use the one that expresses their intent more clearly. When & (and later, Box/Rc/etc) matter as part of the structure of your type, use &; when they don't and you just want to talk about permissions, use ref/move.

Does this make sense to anyone else? Or would people rather stick to just one solution? Or do people like the "coerce &T to T where T: Copy" idea better? In the last case, would such a coercion be more than something like autoderef-on-operators (e.g. in argument position), and does that really seem like something we could get consensus on (I would be really surprised)?
