Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ambiguous map patterns a dynamic error? #2657

Closed
eernstg opened this issue Nov 24, 2022 · 14 comments · Fixed by #2675
Closed

Ambiguous map patterns a dynamic error? #2657

eernstg opened this issue Nov 24, 2022 · 14 comments · Fixed by #2675
Labels
patterns Issues related to pattern matching. question Further information is requested

Comments

@eernstg
Copy link
Member

eernstg commented Nov 24, 2022

Thanks to @sgrekhov for bringing up this topic. Consider the following program:

void main() {
  switch ({1.5: true, 2.5: false}) {
    case {1.5: var x, 1.5: var y}: break;
  }
}

The class double does not have primitive equality. This means that it is not a compile-time error to have two keys in a map pattern that are equal (basically, we say that this can't be known at compile-time, and that is indeed true in the general case). The run-time behavior is to check that the given scrutinee is a map with suitable type arguments (there are a few possible choices, but they don't matter here), and the length of the map matches the number of subpatterns, and that each subpattern matches.

So the match succeeds (and binds x and y to true).

However, the 'success' is misleading: The length requirement certainly doesn't establish any useful guarantees about the scrutinee, and it is confusing (and non-obvious in the general case) that we're binding both x and y to the value of the same key.

A similar scenario can be created for any key type whose equality isn't primitive. In particular, we could use any key type which is a class C where Object.== has been overridden by C or any of its superclasses. For example:

class C {
  final int i;
  final String comment;
  const C(this.i, [this.comment = '']);
  bool operator ==(Object other) {
    print('    (invoked ==)');
    if (other.runtimeType != C) return false;
    return (other as C).i == i;
  }
  int get hashCode => classHash ^ i.hashCode;
  static const classHash = 0x123456789;
}

void main() {
  const c1 = C(1, 'Pattern key 1');
  const c2 = C(1, 'Pattern key 2');
  var cs1 = C(1, 'In scrutinee, matched twice');
  var cs2 = C(2, 'In scrutinee, ignored');
  var map = <C, int>{cs1: 42, cs2: 0};

  switch (map) {
    case {c1: var a, c2: var b}: print('$a, $b'); // '42, 42'.
  }
}

This matching behavior seems confusing and bug-prone, so we might want to report a problem at some point. However, it is an undecidable problem in general whether or not there is a match, because it runs a developer-written function body.

Would it be better to raise a dynamic error in the case where multiple keys in a map pattern are identical or equal? (We can't make it an error to get the same value twice, because there is nothing wrong with {1: true, 2: true}).

The 'correct' solution seems to be that we detect the situation where the k key constant expressions of a map pattern do not look up every distinct key of the scrutinee (even though the length equality test seems to imply that we're getting all of them). However, there's no easy and performant implementation of "looked up every distinct key of a map". Alternatively, we could check that the matching process obtains every member of the values.toSet() of the given scrutinee (in the example above, the pattern matching process never retrieves the value 0), but that's of course an imprecise check because accidental key duplication could have caused us to omit some values that are duplicates (again {C(1): true, C(2): true} could match C(1) twice and never match C(2), and still get all values).

@munificent, @stereotype441, @kallentu, @natebosch, @leafpetersen, @jakemac53, @lrhn, WDYT? Should we report a dynamic error because the "precise length required, but multiple matches of one key and some other keys ignored" situation is a likely bug? Alternatively, should we just let it pass (and perhaps lint against using map keys with non-primitive equality)?

@eernstg eernstg added question Further information is requested patterns Issues related to pattern matching. labels Nov 24, 2022
@lrhn
Copy link
Member

lrhn commented Nov 24, 2022

The general problem cannot be solved at compile-time.
It might also be unsolvable at run-time.

Would it be better to raise a dynamic error in the case where multiple keys in a map pattern are identical or equal

Maybe for identical. We should assume that == is always reflexive (blatantly ignoring NaN).

For equality, we probably shouldn't.

To dynamically check whether any two of N keys are equal, we need to do N*(N-1)/2 checks (and that's assuming symmetry).

We cannot tell which equality a map uses for lookup, so comparing the lookup keys might be irrelevant.

  • The map might use the == of the map keys, which can do anything.
  • Or it might use a custom equals function passed to the map constructor. The common case is an identity map.

And we cannot tell from the outside which one it is.

What we know, for absolutely sure, is that the == of the map pattern/lookup keys is the least relevant equality to consider. It's unlikely to be called at all.

The one hypothetical reason to look at the map pattern key's equalities anyway would be that we assume (without evidence) that the actual map keys and the map pattern keys will have the same type, and that the map uses the == of the keys.
The former assumption is probably somewhat reasonable, the latter is more questionable.
Warning about equal values with primitive equality is already a stretch, because it does this. It's just that common maps and common keys will fit into that pattern, and primitive equality is very close to identity, except for a few select classes that we control, and where we don't think you should care about identity.

In this particular case, using doubles, it looks like we should be able to do something.
Should we just say that double constants have primitive equality, except for NaN values?

(We'll have to compare integers and doubles in constant primitive equality comparisons then. It's probably not a problem. I hope it won't make anything behave differently between the web and native, not any more than they already do.)

All in all, I don't want to complicate map patterns more than necessary, because the general issue is unsolvable, and the common case works well.

I'd be fine with warning about identical constant keys in a map pattern, even without primitive equality. Anything more than that will not carry its own weight, the potential benefit will be too small.

If you think of {k1: p1, ..., kn: pn} as equivalent to Map(length: == n, [k1]: p1, ..., [kn] = pn), where [kn] is a selector we should allow in any object pattern, then map patterns are just shorthands for object patterns.
I can explain that, without getting into details about how map values can differ.

We can also warn if you have the same property name twice in Foo(foo: p1, foo: p2) pattern. Warnings about identical key constants is the same thing if you look at the shorthand.

@sgrekhov
Copy link
Contributor

const m = {3.14: "pi"}; 
// Error. The type of a key in a constant map can't override the '==' operator, but the class 'double' does. 
// Try using a different value for the key, or removing the keyword 'const' from the map. 
// • const_map_key_expression_type_implements_equals

Could it be an option to produce the same error in case of map pattern?

@eernstg
Copy link
Member Author

eernstg commented Nov 25, 2022

@lrhn wrote:

The general problem cannot be solved at compile-time.
It might also be unsolvable at run-time.

Exactly. I think this illustrates the point that it is misguided to pretend that we provide a guarantee about matching every key in a map pattern with no rest pattern, in the case where some key expressions have non-primitive equality. One possible approach could be to say that it is a compile-time error, something like:

A compile-time error occurs if a map pattern with no rest pattern has one or more key expressions whose value does not have a primitive operator ==.

This means that we will actually match every key in a map when the pattern seems to promise that we do just that (that is, all keys have primitive equality), and in the (rare?) cases where no such guarantee exists, the pattern must explicitly opt out of the length equality check by having ... at the end.

It is still possible to have a confusing pattern match failure at run time because the map might have length 1, but the map pattern could have 2 mapPatternEntrys with non-primitive equality that are equal to the same key. So the match fails, even though we could just have allowed it to succeed, just because we insist that there must be two different keys, and then we still only use one of them.

@eernstg
Copy link
Member Author

eernstg commented Nov 25, 2022

@sgrekhov wrote:

produce the same error in case of map pattern

That would certainly be a safe bet, but I suppose there's significant support behind the idea that the language should be more permissive than today's switches.

On the other hand, if we restrict map patterns like that now, we could generalize them later in any way we want.

@lrhn
Copy link
Member

lrhn commented Nov 25, 2022

If we "desugar" a map pattern like {k1: p1, k2: p2, ...} to Map(length: >= 2, [k1]: p1, [k2]: p2), we can get into the situation where k1 and k2 both match the single key of {someKey: someValue}, but the pattern won't match because of the >= 2.

I think that's OK. If we teach people to read map patterns as shorthands for checking length and values, then they'll understand the cases where that fails.
If we try to pretend that map patterns are clever, people will get confused when they aren't.

Hopefully most users will never see the edge cases, because they just use normal Dart ==-based maps.

A compile-time error occurs if a map pattern with no rest pattern has one or more key expressions whose value
does not have a primitive operator ==.

I think that's going too far. There are lots of potentially useful classes with non-primitive == which you'd want to match against a map key.
If the keys are weird, we can't guarantee that the matching won't be weird, but it'll be easy to explain why weirdness propagates.
Most keys aren't weird, and it will just work precisely as you expected.

(Who am I kidding? Map patterns will almost exclusively be used for JSON, which has string keys with primitive ==. It won't matter what we do apart from that.)

I think it's fine to tell users when they do something that looks like it's a mistake:

  • The same (identical) key twice.
  • Equal keys with primitive == (which is identical-adjacent).

If you use keys without primitive equality, we can't tell you at compile time whether you are maybe repeating yourself.
We shouldn't waste time at runtime doing it.

That seems like a good-enough compromise for me.

@munificent
Copy link
Member

munificent commented Nov 28, 2022

I'm with Lasse. I agree that it's possible to write map patterns that look and act funny. But you can write all sorts of Dart code that looks and acts funny. If you do that, you get what you wrote. In particular:

void main() {
  switch ({1.5: true, 2.5: false}) {
    case {1.5: var x, 1.5: var y}: break;
  }
}

Even a cursory inspection should reveal that this pattern is not what we'd call good code. So, yes, if you write bad code, you might get behavior you didn't expect.

I think the best way to address that is to make the behavior simple and easy to predict instead of trying to figure out ways to prohibit writing bad code. Right now, the semantics for map patterns are pretty simple and work well:

  1. If the map pattern doesn't have ..., then fail if the length isn't equal to the number of subpatterns.
  2. Look up each key in the subpatterns and test the resulting values against the corresponding subpattern.

Once a user knows those two points, they can reason correctly about a map pattern behaves even when it is strangely formed or uses unusual key types or a strange implementation of Map. It may not do what they want, but at least they can understand what it does do so that they can then change their code to do what they want.

I do think it might be useful to have a lint that fires on map keys that "seem" to be equal but where we can't tell for certain because they don't use primitive equality. I'm not sure how feasible that is, but we could probably get it mostly working for doubles.

@eernstg
Copy link
Member Author

eernstg commented Nov 29, 2022

it might be useful to have a lint that fires on map keys that "seem" to be equal but where
we can't tell for certain because they don't use primitive equality
... [which could be] mostly working for doubles.

Agreed—reporting duplicates for keys of type double would probably be helpful, and failing to report them would most likely be considered to be a bug. It could be a hint or an error because (presumably) there are no false positives.

The broader issue about keys with a non-primitive operator == is more tricky, but I still think it could be helpful to flag the length equality check:

A compile-time error occurs if a map pattern with no rest pattern has one or more key expressions whose value does not have a primitive operator == and whose type is not double.

The point is that the length equality test has no other motivation than ascertaining that we are matching against every single key/value pair in this map, but that assumption is unsound in the case where == is non-primitive. So we require the ... pattern at the end, which means that the code isn't implying anything that isn't true. (We wouldn't require this for keys of type double, because they are handled separately. There would be similar exceptions for record types containing double fields, etc.)

It is possible that the non-primitive operator == will be a rare exception. However, that would just make it even more important to report code which is considered misleading (bugs in rare corners of the language are less likely to be known & spotted by a reviewer). This treatment is similar to the treatment of a?.b() when a is non-nullable: we do this not because a?.b() is meaningless or will fail, we do it because a.b() conveys a simpler and correct conclusion about the properties of the subexpressions.

@stereotype441
Copy link
Member

CC @pq since we're talking about possible lints

@munificent
Copy link
Member

The point is that the length equality test has no other motivation than ascertaining that we are matching against every single key/value pair in this map,

Agreed. (And the reason we make map patterns default to checking the length unless you add ... is to be consistent with list patterns.)

but that assumption is unsound in the case where == is non-primitive.

It's not provably true when you have non-primitive ==. But for all reasonable user-defined classes that implement == and are used as map keys and in map patterns, it will almost certainly do what the user intends. The language just can't prove that it's so.

I think that's OK because the compiler isn't relying on any provable property of this for soundness. (For example, exhaustiveness checking doesn't care about map length.)

So we require the ... pattern at the end, which means that the code isn't implying anything that isn't true. (We wouldn't require this for keys of type double, because they are handled separately. There would be similar exceptions for record types containing double fields, etc.)

It is possible that the non-primitive operator == will be a rare exception.

I don't think non-primitive == will be rare. (And if we get data classes or value types, it will probably become more common.) But I think non-primitive == methods that are used in map patterns and act weird in ways that interfere with user intuition will be rare.

@lrhn
Copy link
Member

lrhn commented Nov 29, 2022

The point is that the length equality test has no other motivation than ascertaining that we are matching against every single key/value pair in this map, but that assumption is unsound in the case where == is non-primitive.

It's unsound even if the == is primitive, because you don't know what equality the map actually uses. If it's a map which equates two different values with primitive equality (say a class with no custom equality, but some non-unique ID field, and a map which equates values with the same ID), then distinct objects with primitive equality can still look up the same map entry.

The only thing that is "guaranteed" to work is identity (assuming that map lookup is at least consistent over time).

We don't promise that switch (map) { case {k1: var v1, k2: var v2} : ... will definitely look up two different entries for any two keys and any map.
That requires the users to cooperate, to provide a map and keys that are compatible and well-behaved. Only the user can know that, the language cannot know how an object typed as Map really behaves, it's just an interface.

If everybody behaves well, then {k1: var v1, k2: var v2} will match a map with two entries, where both k1 and k2 are keys, and where their values will be bound to v1 and v2. And if the user isn't making a big mistake for no good reason, those will be different keys with different values. We can't prove that in all cases, but we don't need to.

In practice, I think the restriction that map pattern keys must be constant values will make almost every concern moot.
The keys will be enum values or strings, or the odd token constant with no override of ==.
What I expect to see is:

switch (json) {
  case {"kind": "foo", "foo": var f}: ...
  case {"kind": "foo", "foo": var f, "optionalFoo": var fo}: ...
  case {"kind": "bar", "bar1": var b1, "bar2": var b2}: ..
}

For those, the map length check is essential.

I don't want to explain why some keys behave differently, and require a ... even if you know that the map has precisely two entries, and why you should use keys with primitive equality. It's simply not worth anyone's effort to make a distinction.

We can still give warnings if you do have identical keys, or equal keys with primitive equality. But you can ignore those if you know what you are doing. (Which is why they should be analyzer warnings, so you can ignore them.)

@munificent
Copy link
Member

It's unsound even if the == is primitive, because you don't know what equality the map actually uses.

And, for what it's worth, a weird implementation of List could make a list pattern behave equally strangely.

@eernstg
Copy link
Member Author

eernstg commented Nov 30, 2022

OK, so nobody else cares about misleading map key counts.

But duplicate keys of type double, are we going to let those go undetected, too?

@stereotype441
Copy link
Member

stereotype441 commented Nov 30, 2022

OK, so nobody else cares about misleading map key counts.

But duplicate keys of type double, are we going to let those go undetected, too?

I would be in favor of an analyzer hint that flags duplicate map pattern keys. I've filed a feature request: dart-lang/sdk#50588.

@lrhn
Copy link
Member

lrhn commented Nov 30, 2022

I'd still recommend a warning (from the analyzer) if two map pattern keys in the same map pattern

  • are identical (they're constant, so that's detectable at compile-time),
  • Or both have primitive == and are equal (primitive == is computable at compile-time).

Since double constants with the same value are identical, they'll be caught.

I'd be willing to make it a language error if two keys are identical, because I really cannot see any use for that.

I won't agree to making it an error to be equal using primitive ==, because there can exist a map which distinguishes them anyway, maybe just an identity map. Probably won't happen in practice, but it's not our job to police that. You can ignore an analyzer warning, but not a misguided language error.

(Heaven forbid that someone makes an identity map mapping different NaN representations to something. At least I don't think we can create other constant NaN values than double.nan at compile time.)

munificent added a commit that referenced this issue Dec 1, 2022
- Specify exhaustiveness checking of switch elements.
- Resolve ambiguity with `=>` in switch expression guards.
- Compile error if map pattern has identical keys.

Fix #2672.
Fix #2657.
munificent added a commit that referenced this issue Dec 5, 2022
* [patterns] More small fixes.

- Specify exhaustiveness checking of switch elements.
- Resolve ambiguity with `=>` in switch expression guards.
- Compile error if map pattern has identical keys.

Fix #2672.
Fix #2657.

* Clarify when restriction on "=>" in guards comes into play.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
patterns Issues related to pattern matching. question Further information is requested
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants