Join GitHub today
GitHub is home to over 36 million developers working together to host and review code, manage projects, and build software together.Sign up
Warn on ambiguous variables used in a when clause #7031
Original bug ID: 7031
Martin Clochard, through the intermediary of Arthur Charguéraud,
( (Var x, t) | (t, Var x) ) when x = my_var -> ...
The problem is that the code author (and many readers of this code,
Not matching is the expected semantics of "when" guards: they are
This is however confusing, so we would like to warn in this situation
| C[p | q] when test -> foo
is not equivalent to the distributive semantics, that is
| C[p] when test -> foo
(I'm using a pattern-context C[..] to account for the general case
After discussing it with Luc and Damien, I propose the following
[overlap]: there is a value that can match both sides of the
[ambiguous variables]: the variables of (p | q) used by the guard
| (Var x, t) | (t, Var x) -> subst x t prog
which is perfectly fine if an overlapping value (Var x1, Var x2) is
Similarly, warning in all cases where or-pattern and when-guards are
((Var x, p) | (Var x, q)) when test x
and we noted that it would be fine to warn in this case, as this would
(Var x, (p | q)) when test x
Unfortunately, this transformation cannot be applied in general, see
((p1, Var x, p2) | (q1, Var x, q2)) when test x
which cannot be factorized in a semantics-preserving way.
Comment author: @gasche
I was just thinking of this bug yesterday, and had a small idea for a neat implementation of the [ambiguous variable] criterion -- because it is actually not trivial to decide it.
We are trying to know whether the shared variables of (p | q) appear in the same place in both patterns. The "natural" idea is to recurse on p and q, collect the path from the root along the way, and return a map from variables to a list of reaching paths. But that seems a bit costly, and defining the type of paths will be a pain.
The idea is as follows: implement a function (intersection : pattern -> pattern -> pattern), such as (intersection p q) is the most precise pattern matching all values of both p and q. This is rather easy to define:
intersection x x = x
Then it suffices to check that all variables of (p | q) appear in (intersection p q). Those that do not appear there are ambiguous in the sense of [ambiguous variables].
Comment author: @alainfrisch
I guess you meant something else here.
I'm not sure to understand the definition. Cannot we define a function that satisfy this property as:
intersection p q = p | q
Comment author: @gasche
Indeed, I meant a disjunction-free pattern. It returns the shared constructor tree prefix. For our original example,
intersection (Var x, t) (t, Var x) = (_, _)
(note that if we knew that Var is the only constructor, then an equally valid result would be (Var _, Var _), but I don't think that affects the ambiguity criterion derived from computing the intersection.)