Pattern Matching Clause Order and Forwarding #387
Consider the following Silver grammar:
The result of
We first match
In a way, it seems that we should take the first match rule in the original match above. While the first child of
So it seems we want to determine when two patterns match "equally well" and take the first one. However, this gets really complicated with more things being matched. Suppose we are matching
The current method of left precedence (whatever matches best furthest left) lets us determine the best match without needing to consider the complexities of matching "equally well", but it can give us unintuitive results. I'm not sure if this should be considered a bug or just a strange bit of Silver matching behavior that should be documented.
That's where I was going to end this. However, I tried a related match and found something that is at least a bit related which I'm more certain should be considered a bug. Perhaps this should go in its own issue, but I'll put it here to start.
Suppose we change our
What should be our result for
Why does this fail? Let's look at how it compiles:
Once we have matched the
The cause of this unintuitive behavior is that we implemented pattern matching based on Phil Wadler's chapter in The Implementation of Functional Programming Languages, which assumes that an input matching one constructor cannot match a different constructor. Since we match through forwarding, this assumption does not hold in our case.
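To make the failure mode concrete, here is a minimal Python model (not Silver, and not the actual compiler output). The `Tree`, `PVar`, and `PProd` types, the `committed_match` function, and the productions `wrap`, `other`, `leafA`, and `leafB` are all hypothetical names invented for illustration. The sketch commits to the first production it recognizes, as a constructor-exclusive compilation would, and fails on a tree whose forward would have matched a later clause.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class Tree:
    prod: str                          # production name (hypothetical)
    children: Tuple["Tree", ...] = ()
    forward: Optional["Tree"] = None   # tree this one forwards to, if any


@dataclass(frozen=True)
class PVar:
    """Variable or wildcard pattern: matches any tree."""
    pass


@dataclass(frozen=True)
class PProd:
    """Production pattern with sub-patterns for each child."""
    prod: str
    args: Tuple[Union["PVar", "PProd"], ...] = ()


def committed_match(clauses, tree):
    """Return the result of the chosen clause, or None if matching fails.

    Simplifying assumption: variable clauses, if any, come last.
    """
    t = tree
    while t is not None:
        # Clauses whose outermost production is the one we are looking at.
        same = [(p, r) for p, r in clauses
                if isinstance(p, PProd) and p.prod == t.prod]
        if same:
            for p, r in same:
                if len(p.args) == len(t.children) and all(
                        committed_match([(a, True)], c) is not None
                        for a, c in zip(p.args, t.children)):
                    return r
            # Committed to this production: the forward is never tried.
            break
        t = t.forward  # no clause mentions this production, so forward
    for p, r in clauses:
        if isinstance(p, PVar):
            return r
    return None


leaf_a = Tree("leafA")
tree = Tree("wrap", (leaf_a,), forward=Tree("other", (leaf_a,)))
clauses = [
    (PProd("wrap", (PProd("leafB"),)), "first"),
    (PProd("other", (PVar(),)), "second"),
]
result = committed_match(clauses, tree)
print(result)
```

In this model the match fails on `tree` even though the tree it forwards to would match the second clause, because the dispatch never revisits the decision to treat the input as `wrap`.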
So @tedinski and @ericvanwyk, have I found at least one bug? Is this expected behavior and I just need to read the functional-programming part of Ted's thesis to understand what is going on here? Even if there are no bugs here, we need to work on the documentation so it describes these cases.
It's been too long, and I've forgotten a fair bit. I seem to recall that we had clear and desirable semantics for matching against a single value, and then I was just like "and then we can apply the standard compilation method for matching against multiple values" and figured that whatever semantics that resulted in was fine.
failing... I think this is actually a choice in semantics, not necessarily a bug. We could argue that if
IIRC, I thought about worrying about this, but basically came back after I had an idea of what non-interference was all about, and that was basically "hey, pattern matching on forwarding productions is probably interfering anyway, so maybe it just doesn't matter that much." I don't know if that thought appeared in my thesis anywhere.
My suggestion: the most important thing here is whether we have a consistent notion of completeness with our semantics. (I... did we ever implement a pattern matching completeness check? I can't recall!) Given such a thing, I think it's less that we're surprised into finding this pattern failing to match (it would report incomplete), and more that programmers might be surprised when it insists they need to handle another case in their
I think being surprised they need to add some additional matching cases is much less of a concern compared to surprising runtime behavior. So maybe this surprising semantics isn't such a problem.
Does that help, or have I missed something?
I think it helps answer your earlier question? We probably don't want to have to deal with a notion of "more specific match" beyond our current "constructor/variable" distinction. And if we don't "backtrack" then we don't have to, I think? Hopefully?
It seems to me that we should be more precise in what "matches better" means. But I think there will be cases in which two patterns match "equally well" - so how do we break such a tie? It is hard, in my mind, to argue against keeping the standard notion of "take the first pattern that matches."
Compiling by successively dropping each constructor would result in a combinatorial explosion of matching levels. I'm not sure this is the wrong thing to do, though, since it would make it more obvious to the user how their code should work. There are two things that might be considered problems with this combinatorial explosion:
We don't have a problem with this if we have a single match on a single level of constructors, which is probably the most common use case. The other large use is probably matching on two or more things at once (e.g. matching on
Some of the explosion could be reduced by identifying non-forwarding productions and not putting backtracking cases for other productions when matching them, since matching a non-forwarding production would mean there was no forwarding production pattern which the tree could match (e.g. if
We have an issue for checking incomplete matching (#317), but it hasn't been done yet. The obvious way to implement it would catch this case. However, I'm not sure a completeness check is going to fix the problem that pattern matching works in non-obvious ways. For example, if my second matching example has a default case
I agree that we probably don't want to deal with a notion of better/worse/equal matches. I don't think there is a better way to do this than left precedence, as the current compilation does, even if that is slightly unintuitive, since I don't think there is an obvious way to compare matches in the general case. This behavior should be thoroughly documented.
I think comparing matches is separate from backtracking, though. The idea with backtracking would be to find a match in the non-variable patterns, rather than to find a better match.
Something we could do, which I don't think we want to do but would fit Eric's comment to "take the first pattern that matches", would be to check each pattern in order to determine if it is possible for it to match, taking the first one that is possible to match. Under this, the expression
would give the result
My two cents on this - until now I thought the behavior was to just take the first pattern that could match, without any notion of a more specific match. So I generally agree with @ericvanwyk here - I don't find that behavior to be at all unintuitive.
By analogy with attributes, one might argue that the case of two patterns that match at different levels of forwarding is like having an equation both on a forwarding production and on what it forwards to - in this case we would get the value of the equation on the forwarding production. But the fact that we care about the order of rules at all sort of invalidates this line of thinking - for example
Instead I think the appropriate analogy is that each pattern is like a separate attribute of type
In fact, is there a good reason not to just implement pattern matching by translation into attributes in this way? This would let us ditch the whole "primitive match" extension. I suspect the performance wouldn't be too bad; however, good error messages would be tricky and the MWDA may struggle to cope with such a large number of generated attributes.
In our meeting today we decided that the correct behavior of Silver's pattern matching is to take the first pattern which can match, whether directly or by forwarding somewhere in the structure. This makes the match
be equivalent to
This brings the semantics more in line with what we see in functional languages like OCaml. Each pattern is compiled as if it is its own match under the current system.
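The agreed rule can be modeled in a few lines of Python. This is only a sketch of the semantics, not of how Silver compiles matches; the `Tree`, `PVar`, and `PProd` types, the functions `can_match` and `first_match`, and the productions `wrap`, `other`, `leafA`, and `leafB` are hypothetical names made up for the example. Each clause is tested on its own, following forwards wherever the production or a child fails to match, and the first clause that can match wins.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class Tree:
    prod: str                          # production name (hypothetical)
    children: Tuple["Tree", ...] = ()
    forward: Optional["Tree"] = None   # tree this one forwards to, if any


@dataclass(frozen=True)
class PVar:
    """Variable or wildcard pattern: matches any tree."""
    pass


@dataclass(frozen=True)
class PProd:
    """Production pattern with sub-patterns for each child."""
    prod: str
    args: Tuple[Union["PVar", "PProd"], ...] = ()


def can_match(pat, tree):
    """True if pat matches tree directly or anywhere along its forwards."""
    if isinstance(pat, PVar):
        return True
    t = tree
    while t is not None:
        if (t.prod == pat.prod
                and len(t.children) == len(pat.args)
                and all(can_match(p, c)
                        for p, c in zip(pat.args, t.children))):
            return True
        t = t.forward  # this level did not match, so try the forward
    return False


def first_match(clauses, tree):
    """Return the result of the first clause that can match, else None."""
    for pat, result in clauses:
        if can_match(pat, tree):
            return result
    return None


clauses = [
    (PProd("wrap", (PProd("leafB"),)), "first"),
    (PProd("other", (PVar(),)), "second"),
]
leaf_a, leaf_b = Tree("leafA"), Tree("leafB")
tree_a = Tree("wrap", (leaf_a,), forward=Tree("other", (leaf_a,)))
tree_b = Tree("wrap", (leaf_b,), forward=Tree("other", (leaf_b,)))

print(first_match(clauses, tree_a))                  # second
print(first_match(clauses, tree_b))                  # first
print(first_match(list(reversed(clauses)), tree_b))  # second
```

The last call shows why clause order matters under this rule: with the `other` clause listed first, it matches `tree_b` through forwarding, so the `wrap` clause is never reached.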
This requires changing Silver's compilation of
We might also consider whether we can somehow warn about having forwarding patterns occur after non-forwarding patterns, since these would be useless. Some difficulty might arise with nested patterns.
Completeness checking is fine as it is, since it only checks for non-forwarding productions and ignores forwarding productions.
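As a rough illustration of why forwarding productions can be ignored, here is a single-level Python sketch of such a check. It is a simplified model, not Silver's implementation: the function `is_complete`, the production names, and the representation of patterns as plain strings are assumptions for this example, and it only handles top-level patterns whose sub-patterns are all variables. Every tree eventually forwards to a tree built by a non-forwarding production, so covering those productions is enough.

```python
def is_complete(patterns, productions):
    """Check a single-level match for completeness.

    patterns:    top-level patterns, each "_" or a production name
    productions: maps each production name of the nonterminal to
                 True if it forwards and False otherwise
    """
    if "_" in patterns:
        return True  # a wildcard clause covers everything
    non_forwarding = {name for name, forwards in productions.items()
                      if not forwards}
    # Forwarding productions are ignored: only the others must be covered.
    return non_forwarding <= set(patterns)


# Hypothetical nonterminal with one forwarding production.
productions = {"wrap": True, "other": False, "nil": False}

print(is_complete(["other", "nil"], productions))   # True
print(is_complete(["wrap", "other"], productions))  # False
print(is_complete(["wrap", "_"], productions))      # True
```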