'Mapping' destructuring #10

gvanrossum · 2020-04-12T06:02:15Z

Just like case [a, b] is a Sequence destructuring, we could define case {key: value, ...} as a Mapping destructuring. Semantics:

The most useful would probably be to ignore extra keys, so e.g. {"a": 1, "b": 2} is a match for the pattern {"a": a}.
It would match anything that subclasses collections.abc.Mapping.
For simple cases we might also write case dict(key1=a, key2=b), (implemented by dict.__match__()) but this would probably only match actual dict instances (or dict subclass instances).

Questions:

Do the keys have to be constants? Else it would be a bit tricky to decide what case {k: v} would match.
Could the keys at least be alternatives, e.g. case {"a"|"A": a}?
Is there a need for extracting the remaining keys, e.g. case {"a": a, **rest}?

The text was updated successfully, but these errors were encountered:

viridia · 2020-05-01T03:37:44Z

Also, you could have:

case {"a": 1, "b": 2}: # match ignoring extra values
case Exact({"a": 1, "b": 2}): # don't allow extra values

gvanrossum · 2020-05-01T03:47:11Z

Oh, neat! I guess the implementation could be

class Exact:
    @staticmethod
    def __match__(target):
        if isinstance(target, Mapping):
            return ExactProxy(target)

class ExactProxy:
    __match_args__ = ["target"]
    def __init__(self, target):
        self.target = target

This sure beats the work of implementing support for

case {"a": 1, "b": 2, **rest} if not rest: ...

gvanrossum · 2020-05-01T03:48:28Z

(...and explaining why if you leave out **rest it ignores extra keys, whereas for lists, if you leave out *rest it insists there's nothing extra.)

ilevkivskyi · 2020-05-15T08:32:04Z

In my notes I require **rest for a match to succeed, but I agree that actually mappings are different from sequences: they have "structural subtyping" behavior, i.e. passing a dictionary with extra keys somewhere will likely just work, while passing a tuple with extra items somewhere will likely just crash.

I also proposed to allow name patterns in key positions, but this is mostly to allow fixed length match. But this can easily be checked with a guard, so I also withdraw this idea.

gvanrossum · 2020-05-20T04:16:30Z

I think there's enough agreement here to mark it as 'accepted'.

(The minor issue of whether to support dict(key1=a, key2=b) is separate. Let's postpone that.)

gvanrossum · 2020-05-20T04:31:14Z

Whoa, I take it back. Ivan's PEP currently doesn't support **rest, and it allows | in keys.

gvanrossum · 2020-05-22T22:10:57Z

Okay, in the end we opted for | in keys and for **rest. So marking as "accepted" again.

brandtbucher · 2020-05-28T17:16:32Z

I've begun implementing this, and have come across a couple of questions that the PEP does not explicitly address. The answers seem clear to me, but they should probably be included in the spec:

Is using **x multiple times in the same pattern illegal? I think yes.
Is using **x anywhere other than the end of the pattern illegal? I think yes.
Does order matter? I think no.

Finally, it's not clear to me how | in key patterns is supposed to work. Does it stop after a successful key match, or does it keep trying keys until it gets a value match as well?

match {"x": False, "y": True}:
    case {"x" | "y": True}:  # Does this match?
        ...

gvanrossum · 2020-05-28T17:23:48Z

Maybe skip the OR in keys for now?

brandtbucher · 2020-05-28T17:28:28Z

Maybe skip the OR in keys for now?

Yeah, I think that's a good idea. It's also worth considering omitting them from the proposal, since it seems they have few real-world use cases and could easily be added later.

gvanrossum · 2020-05-28T19:35:42Z

If we were to do it, some kind of distributive law should apply, like {X|Y} should be like {X}|{Y}. But fine to move to Rejected/Postponed for now.

brandtbucher · 2020-06-02T06:16:12Z

Flagging for revision since we're dropping | for keys.

brandtbucher · 2020-06-02T19:54:10Z

I've come across a couple of questions that I'd like input on. Consider the following example:

from collections import defaultdict

match defaultdict(int):
    case {"XXX": _}:
        ...

Does this case match? The current implementation copies the target into a dict and pops keys from that. As a result, the above case doesn't match.

But I'm starting to wonder if it should. It would be pretty straightforward to turn the key pops into regular lookups on the target itself. There would just be more bookkeeping involved.

Which leads me to my second question. How should we handle duplicate keys?

xxx = "XXX"
match d:
    case {"XXX": _, "XXX": _}:
        pass
    case {"XXX": _, .xxx: _}:
        pass

Currently, neither of these cases match. That seems fine to me, but perhaps it makes more sense to raise... I'm not sure. Again, we would just require more bookkeeping to catch these.

Thoughts?

gvanrossum · 2020-06-02T22:28:10Z

I think the first example should only match if there's an existing key in the target, as found by target.__contains__(key).
I think duplicate keys should raise. (Do we allow {.key: pat}? Then {.key1: pat1, .key2: pat2} should also raise if key1 and key2 happen to be equal.)

brandtbucher · 2020-06-02T22:33:14Z

I think I agree.

We do currently allow .key, so we will need to keep track of previously-seen keys at runtime.

gvanrossum · 2020-06-10T19:01:05Z

In particular the PEP still shows {"route"|"Route": route} as valid syntax. We decided to defer that.

brandtbucher · 2020-06-11T05:04:06Z

In my code review you pointed out that some cases don't catch duplicate keys. It turns out that we do in fact check for duplicate keys, but many cases short-circuit and fail before we can get that far. Some cases are when:

The target isn't a mapping:

>>> match None:
...     case {"x": _, "x": _}:
...         ...
...
>>>

The target is too small (so it couldn't possibly match):

>>> match {}:
...     case {"x": _, "x": _}:
...         ...
...
>>>

An earlier key lookup fails:

>>> match {"y": None, "z": None}:
...     case {"x": _, "x": _}:
...         ...
...
>>>

It's only when all of these steps succeed that we actually get to the duplicate and raise:

>>> match {"x": None, "y": None}:
...     case {"x": _, "x": _}:
...         ...
... 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: mapping pattern checks duplicate key ('x')
>>>

I think that the first three cases are best caught by linters, since they have no effect on the match at runtime. If a particular match gets far enough that we actually hit the second lookup (like the last example), then sure, we should absolutely raise. But a lot of the design choices in the implementation have favored doing as little extra work as possible when failing a match, so it seems wasteful to try to check the keys of a pattern that isn't going to match anyways. This could come with a significant runtime penalty for many heavily nested cases... even the ones with no duplicates!

Further, checking this in the compiler feels wrong. It would only work when literals (not name lookups) conflict. We also don't do this anywhere else: the statement x = {[]} will obviously fail at runtime, but that's outside the scope of a compile error.

So I vote we leave the current behavior as-is. It's performant, and still catches the cases that matter.

gvanrossum · 2020-06-11T18:31:48Z

Hm, I still want to push back a little here.

Outside patterns, duplicate dict keys are not errors -- we use "last one wins".

In patterns, we reject duplicate keys when we get to matching. So I think we have decided that mappin patterns with duplicate keys are statically invalid, and for cases where we can detect this just by parsing the thing (i.e. when both keys are literals) making this a compilation error seems right. (Just like we do for duplicate keyword args in calls. Which match also should reject statically.)

Obviously we'd still need the runtime check to catch cases like

foo = 'foo'
match x:
    case {'foo': 1, .foo: 2}:
        print("Quantum mechanics is real!")

brandtbucher · 2020-06-11T18:40:53Z

Hm. Would the compiler raise a SyntaxError? Because if so, this would have different exception types when caught at compile-time vs. run-time.

I'm not aware of any cases where the compiler raises anything besides SyntaxError and SystemError (except one pathological check for name mangling overflows). And I don't think SyntaxError ever occurs at runtime.

gvanrossum · 2020-06-11T18:54:45Z

If the compiler detects it, it should raise SyntaxError. When it's detected at runtime it should raise some other exception of your choice.

But frankly I could also live with only the runtime detection -- it's unlikely that people write the same literal key twice, and it will still be caught when actually matching. (That's why I wrote "push back a little.")

* #52 - Allow binary +/- for complex numbers? * #58 - Allow missing __match_args__ to accept a single positional arg * #46 - Allow '**_' and '_ :='? * #10 - 'Mapping' destructuring Co-authored-by: Guido van Rossum <gvanrossum@gmail.com>

natelust · 2020-06-22T20:57:27Z

I see in the pep that only constant value patterns or literals are allowed as keys, but I don't see a justification their or here on why (I can make several guesses, and it seems reasonable). Is there any reason that tuples of these are also disallowed? They should be unchangeable, unnamed, and, at least in principal, possible to check for uniqueness.

As an additional feedback on the wording in the pep, as noted above, it states that constant value patterns or literals can be used. This would send the reader to the section on Constant value patterns. That section mentions that this is used for comparing actual constants and Enum values. It may not be obvious to readers that it is possible to check for the existence of a key using this pattern that is not a constant or an Enum. For instance:

class Foo:
    '''Fixed hash otherwise non constant'''
    def __hash__(self):
        return 12

f = Foo()
tup_key = (1,2)
mapping = {f: 'hello', tup_key: "world"}

match mapping:
    case {.f: first, .tup_key: second}:
        print(first, second)
    case _:
        print('default')

gvanrossum · 2020-06-22T21:03:24Z

Yeah, this went through a few revisions and the text is not entirely consistent between sections. We hadn't thought of using (constant) tuples for keys, maybe we could add that in the future (or if there's a groundswell of demand during the review phase :-).

PS. In order to make the example valid you'd have to add an __eq__ to class Foo that returns true for any two Foo instances.

natelust · 2020-06-22T21:08:51Z

Oh good point on equals, I was playing too fast when typing this up. It happened to work in my interpreter, but I guess that's just because I didn't hit any collisions in the hash table.

I personally am a +1 on constant tuples, as I use them all the time. However, I am also a +1 in that it probably is not a reason to delay, as there is a way to get the behavior with no changes.

brandtbucher · 2020-06-22T21:15:40Z

I guess that's just because I didn't hit any collisions in the hash table.

It's because the dict lookup will try an identity check first, and the same object f is used both places here. It wouldn't work between different Foo instances:

>>> class Foo:
...     '''Fixed hash otherwise non constant'''
...     def __hash__(self):
...         return 12
... 
>>> f1 = Foo()
>>> f2 = Foo()
>>> tup_key = (1,2)
>>> mapping = {f1: 'hello', tup_key: "world"}
>>> 
>>> match mapping:
...     case {.f2: first, .tup_key: second}:
...         print("f2")
...     case {.f1: first, .tup_key: second}:
...         print("f1")
... 
f1

gvanrossum · 2020-06-22T22:37:13Z

Yeah, but if you only use one object you don't need to define __hash__ to make the example work either. Anyway, the intention here is to show that "constant pattern" is a misnomer, and I must agree. Maybe "value pattern"?

brandtbucher · 2020-06-22T22:40:28Z

Yeah, I must admit “constant” never really sat right with me. “Value” captures the meaning well.

natelust · 2020-06-22T23:09:30Z

Yeah the hash was not really necessary to convey the idea of "constness", I'm sorry for the extra, I was playing in interactive mode and just copied what I had. (it was not even working how I was thinking anyway as you two were kind to point out, it was falling back to id) I will try to be clearer and slow down a bit in any future posts. "value pattern" does seem an easier term to teach. +1

gvanrossum · 2020-06-22T23:26:16Z

See #40 (comment) about the terminology.

gvanrossum added accepted Discussion leading to a final decision to include in the PEP and removed accepted Discussion leading to a final decision to include in the PEP labels May 20, 2020

gvanrossum mentioned this issue May 20, 2020

Update descriptions of sequence and mapping patterns #33

Merged

gvanrossum added the accepted Discussion leading to a final decision to include in the PEP label May 22, 2020

viridia added the fully pepped Issues that have been fully documented in the PEP label Jun 1, 2020

brandtbucher added needs more pep An issue which needs to be documented in the PEP and removed fully pepped Issues that have been fully documented in the PEP labels Jun 2, 2020

brandtbucher self-assigned this Jun 3, 2020

brandtbucher added fully pepped Issues that have been fully documented in the PEP and removed needs more pep An issue which needs to be documented in the PEP labels Jun 11, 2020

brandtbucher added a commit that referenced this issue Jun 11, 2020

#10 - 'Mapping' destructuring

ee763d6

brandtbucher mentioned this issue Nov 6, 2020

Thomas Wouters' objections #165

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

'Mapping' destructuring #10

'Mapping' destructuring #10

gvanrossum commented Apr 12, 2020 •

edited

Loading

viridia commented May 1, 2020

gvanrossum commented May 1, 2020

gvanrossum commented May 1, 2020

ilevkivskyi commented May 15, 2020 •

edited by gvanrossum

Loading

gvanrossum commented May 20, 2020

gvanrossum commented May 20, 2020

gvanrossum commented May 22, 2020

brandtbucher commented May 28, 2020

gvanrossum commented May 28, 2020 via email •

edited

Loading

brandtbucher commented May 28, 2020

gvanrossum commented May 28, 2020

brandtbucher commented Jun 2, 2020

brandtbucher commented Jun 2, 2020 •

edited

Loading

gvanrossum commented Jun 2, 2020 •

edited

Loading

brandtbucher commented Jun 2, 2020 •

edited

Loading

gvanrossum commented Jun 10, 2020

brandtbucher commented Jun 11, 2020 •

edited

Loading

gvanrossum commented Jun 11, 2020

brandtbucher commented Jun 11, 2020

gvanrossum commented Jun 11, 2020

natelust commented Jun 22, 2020 •

edited

Loading

gvanrossum commented Jun 22, 2020

natelust commented Jun 22, 2020

brandtbucher commented Jun 22, 2020 •

edited

Loading

gvanrossum commented Jun 22, 2020

brandtbucher commented Jun 22, 2020

natelust commented Jun 22, 2020

gvanrossum commented Jun 22, 2020

'Mapping' destructuring #10

'Mapping' destructuring #10

Comments

gvanrossum commented Apr 12, 2020 • edited Loading

viridia commented May 1, 2020

gvanrossum commented May 1, 2020

gvanrossum commented May 1, 2020

ilevkivskyi commented May 15, 2020 • edited by gvanrossum Loading

gvanrossum commented May 20, 2020

gvanrossum commented May 20, 2020

gvanrossum commented May 22, 2020

brandtbucher commented May 28, 2020

gvanrossum commented May 28, 2020 via email • edited Loading

brandtbucher commented May 28, 2020

gvanrossum commented May 28, 2020

brandtbucher commented Jun 2, 2020

brandtbucher commented Jun 2, 2020 • edited Loading

gvanrossum commented Jun 2, 2020 • edited Loading

brandtbucher commented Jun 2, 2020 • edited Loading

gvanrossum commented Jun 10, 2020

brandtbucher commented Jun 11, 2020 • edited Loading

gvanrossum commented Jun 11, 2020

brandtbucher commented Jun 11, 2020

gvanrossum commented Jun 11, 2020

natelust commented Jun 22, 2020 • edited Loading

gvanrossum commented Jun 22, 2020

natelust commented Jun 22, 2020

brandtbucher commented Jun 22, 2020 • edited Loading

gvanrossum commented Jun 22, 2020

brandtbucher commented Jun 22, 2020

natelust commented Jun 22, 2020

gvanrossum commented Jun 22, 2020

gvanrossum commented Apr 12, 2020 •

edited

Loading

ilevkivskyi commented May 15, 2020 •

edited by gvanrossum

Loading

gvanrossum commented May 28, 2020 via email •

edited

Loading

brandtbucher commented Jun 2, 2020 •

edited

Loading

gvanrossum commented Jun 2, 2020 •

edited

Loading

brandtbucher commented Jun 2, 2020 •

edited

Loading

brandtbucher commented Jun 11, 2020 •

edited

Loading

natelust commented Jun 22, 2020 •

edited

Loading

brandtbucher commented Jun 22, 2020 •

edited

Loading