Conversation

jvoigtlaender
Contributor

The current compiler, when given this program:

type A = B | C | D | E

f x y =
    case (x, y) of
        (B, _) -> 0
        (C, False) -> 0
        (D, False) -> 0

complains:

You need to account for the following values:

    (Main.C, True)
    (Main.C, True)
    (Main.D, True)
    (Main.D, True)
    ...

Add branches to cover each of these patterns!

It is odd that each of the patterns (Main.C, True) and (Main.D, True) is listed twice. Worse, because of that, the also-missing pattern (Main.E, _) is hidden from the user behind the cut-off "...".

Far better would be to clean up the list of unhandled patterns first, eliminating duplicates, so that there is room to show the genuinely missing patterns.

That's what this pull request does, with a two-line change.
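A minimal sketch of the idea (the function and names here are illustrative, not the actual compiler code): deduplicate the unhandled patterns before truncating them for display, so the "..." only appears when there really are more than four distinct missing patterns.

```haskell
import Data.List (nub)

-- Illustrative sketch, not the real compiler code: deduplicate the
-- missing-pattern descriptions, then show at most four, appending
-- "..." only if distinct patterns remain beyond those four.
formatMissing :: [String] -> [String]
formatMissing missing =
  let (shown, rest) = splitAt 4 (nub missing)
  in shown ++ (if null rest then [] else ["..."])
```

With the example from above, `formatMissing` would list (Main.C, True), (Main.D, True), and (Main.E, _) once each, with no "..." needed.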

@evancz?

@evancz
Member

evancz commented Feb 13, 2016

  1. Why are there duplicates in the first place? Does this point to an inefficiency in the underlying implementation?
  2. nub uses Eq instead of Ord so it is O(n^2). I'd prefer to use something that was O(n log n).

@jvoigtlaender
Contributor Author

jvoigtlaender commented Feb 13, 2016

Ad 1.: I don't know, but that could be the case. I have no idea about the implementation of the pattern-match compiler. But thinning out the duplicates as proposed here benefits the user no matter what the underlying efficiency issue in the pattern-match compiler turns out to be.

Ad 2.: Yes, if you add a deriving Ord to the Pattern type, instead of nub this could be used: https://hackage.haskell.org/package/extra-1.4.3/docs/Data-List-Extra.html#v:nubOrd
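For reference, an Ord-based deduplication along the lines of Data.List.Extra.nubOrd can be sketched with Data.Set from the containers package (this is a standalone illustration, not the extra package's actual implementation):

```haskell
import qualified Data.Set as Set

-- O(n log n) order-preserving deduplication: keep the first
-- occurrence of each element, tracking seen elements in a Set.
nubOrd :: Ord a => [a] -> [a]
nubOrd = go Set.empty
  where
    go _ [] = []
    go seen (x:xs)
      | x `Set.member` seen = go seen xs
      | otherwise           = x : go (Set.insert x seen) xs
```

Each element costs a Set lookup and insert, O(log n) apiece, instead of nub's O(n) scan over the already-seen list.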

However, this concern about the complexity of nub is actually unfounded here. The result of that nub call is never fully computed; only a take 4 of it is (actually a splitAt 4, but the second list is only checked for being empty or not, which forces the same degree of evaluation as take 4 or take 5). And this is Haskell code, so it is evaluated lazily. So what we have here is not O(n^2), not even O(n), but O(the number of elements scanned until four or five distinct ones have been seen). By switching from nub to something else, relying on Ord etc., you won't be able to do asymptotically better than that.
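The laziness argument above can be demonstrated directly: demanding only a short prefix of nub's result terminates even on an infinite input list, because nub emits each new distinct element as soon as it is found.

```haskell
import Data.List (nub)

-- Laziness demo: cycle [1..5] is infinite, but nub produces its
-- distinct elements incrementally, so take 4 forces only a finite
-- prefix of the input and the expression terminates.
firstFourDistinct :: [Int]
firstFourDistinct = take 4 (nub (cycle [1, 2, 3, 4, 5]))
```

Here `firstFourDistinct` evaluates to [1,2,3,4]; nub stops scanning as soon as the fourth distinct element has been demanded.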

@jvoigtlaender
Contributor Author

This is still happening in 0.17. If merged, this pull request would have an immediate user-facing benefit (particularly for learners, who are prone to making this kind of forgotten-case error). As discussed above, the use of Data.List.nub does not come with any asymptotic complexity disadvantage here.

@jvoigtlaender
Contributor Author

Just rebased against current master.

@jvoigtlaender
Contributor Author

I'm closing this in favor of #1470.

@jvoigtlaender jvoigtlaender deleted the patch-1 branch August 24, 2016 07:03