New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for rows in instance head under fundeps #2451
Add support for rows in instance head under fundeps #2451
Conversation
return d | ||
env <- getEnv | ||
case M.lookup className (typeClasses env) of | ||
Nothing -> internalError "typeCheckAll: Encountered unknown type class in instance declaration" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Am I correct in the assumption that by this point the referenced type class should already have been added to the environment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I believe so.
@@ -724,7 +724,7 @@ prettyPrintSingleError (PPEOptions codeColor full level showWiki) e = flip evalS | |||
renderSimpleErrorMessage (InvalidInstanceHead ty) = | |||
paras [ line "Type class instance head is invalid due to use of type" | |||
, markCodeBox $ indent $ typeAsBox ty | |||
, line "All types appearing in instance declarations must be of the form T a_1 .. a_n, where each type a_i is of the same form." | |||
, line "All types appearing in instance declarations must be of the form T a_1 .. a_n, where each type a_i is of the same form. Unless the type is fully determined by other type class arguments via functional dependencies." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wording nitpick: I'd use a comma or parens here rather than a period. Although I wonder if we should just link to the wiki for further details here.
isDeterminedInGroup fd = not (i `elem` fdDeterminers fd) && (i `elem` fdDetermined fd) | ||
|
||
-- is this argument fully determined via fundeps | ||
isFunDepDetermined = Just (All True) == foldMap (Just . All . isDeterminedInGroup) (typeClassDependencies cls) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this needs to be determined by all fundeps, just by some type variable which is not itself determined. This means that variable must be solved by the solver, which means that we know the variable we care about will be determined.
Eventually, we might want to allow the variable to be determined transitively (you have an example where this would be useful), but we don't need to deal with that yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I messed this up. Hmm, shouldn't it be: if the type is detemined in at least one fundep and never a determiner? For example in class C a b c | a -> b, b -> c
we should allow rows in c
right? However this sounds like it goes against what you said. Or is that the transitive case you're talking about?
return () | ||
TypeApp t1 t2 -> check t1 >> check t2 | ||
REmpty | isFunDepDetermined -> return () | ||
RCons _ hd tl | isFunDepDetermined -> check hd >> check tl |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You'll keep performing the check here for each RCons
. Maybe pull this out into a checkRow
function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, it will check if True
or if False
each time, but only actually compute it for the first time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh right, good point.
-- is this argument fully determined via fundeps | ||
isFunDepDetermined = (Any False, Any True) == foldMap determining (typeClassDependencies cls) | ||
where determining fd = (Any (i `elem` fdDeterminers fd), | ||
Any (i `elem` fdDetermined fd)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if this is exactly what we want, but it makes more sense than what I had before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay I think I understand what you were saying before, I was originally reading it differently.
Here are a few examples cases though and what I think the outcome is:
fundeps | fully determined |
---|---|
a -> b |
b |
a -> b, b -> a |
|
a -> b, b -> a, c -> a |
a, b |
a -> b, b -> c |
b, c |
a -> b, b -> a, a -> c |
c |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well let's write out explicitly what we want.
We want to prevent the user from casing on types in rows (i.e. labels in rows shouldn't determine behavior).
So we can't have two different instances with different rows in some type argument and every other type argument the same.
One way to achieve that with a single functional dependency (e.g. MonadState
, where m -> s
) is to say that we have a functional relationship on type arguments (m
determines s
), so changing the result of the function forces the argument to change (if s
differs, m
must differ too). If the input types are the same but the outputs differ, that will lead to overlap.
In general, there must be some type argument which is forced to differ between the two instances. We can find the set of such type arguments by following functional dependencies backwards, but I think to do this properly we need a full (static) overlap check, which is why I suggest limiting things to a single functional dependency for now.
checkTypeClassInstance cls i = check where | ||
-- is this argument fully determined via fundeps | ||
isFunDepDetermined = (Any False, Any True) == foldMap determining (typeClassDependencies cls) | ||
where determining fd = (Any (i `elem` fdDeterminers fd), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is i `elem` fdDeterminers fd
enough if fdDeterminers
has more than one thing in it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I'm not sure that i `elem` fdDeterminers fd
is relevant at all, is it?
I would say this: i
is determined by some fundep, but none of the determiners for that fundep can themselves be determined by some other fundep (so the solver is forced to case on at least one determiner).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure that makes sense 😕, wouldn't that mean x -> y, y -> z
and x -> y z
are different? Using that algorithm:
Walkthrough of checking z
in x -> y z
determined by: x ("determiners for that fundep")
x isn't determined by anything
therefore, c is determined
Walkthrough of checking z
in x -> y, y -> z
:
determined by: y ("determiners for that fundep", from: y -> z)
y is determined by: x ("determined by some other fundep")
therefore, z is not determined ???
I have come up with what I think is a correct general solution: I'll push a new commit in a minute for you to look at. It uses a graph to compute contributing dependencies between variables. We will want to move computing the determined args out of the instance check, but it's there for now whilst we discuss the algorithm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, with the approximation I wrote down, they wouldn't behave the same. In terms of a graph, I would say the correct general version is:
There is some node (let's call it an initial node, for want of a better name, but it'll be the one which the solver cases on to make its decision) which is not determined by any fundep, and a path from that node to the one we're interested in.
In the case of fundeps with multiple inputs, it's slightly trickier:
Working backwards from the target, every path ends in an initial node.
Either way, this means that every pair of instances with different types in the target node will be forced to differ in some initial node, or else will be overlapping.
Okay I've moved where we compute the determined arguments and stored the result in |
@LiamGoodacre Looks great, but I'd like to go over the |
I tried to write up how I'm thinking about the algorithm in a more descriptive way. Hope this helps get my idea across :) Cycles in a fundep constitute arguments of the same determinacy. That is, say From the point of view of calculating determinacy, we can consider cycles of arguments together. So consider each cyclic group as a node in a DAG. Where an edge exists between nodes A and B if an argument in A determines and argument in B. Implicitly each argument depends on itself. Therefore each argument must be in exactly one group. Then the collection of initial nodes of each disconnected path is the set of determiners of everything else. Everything else being determined. Relating to the implementation: for an argument X, if there exists a node (in a cyclic group) that determines X, which X does not determine (isn't in the same cyclic group, or implied by the cyclic group), then X is determined. If there doesn't exist such an argument, then the cyclic group in which X resides is at the root of a path - as such it is a determiner. |
It sounds like we might be able to phrase that in terms of the strongly-connected components? But again, things are possibly complicated by edges like Actually, now that I think about it in terms of SCC, I think your algorithm must be right after all. Could you just please add a comment explaining what we discussed? |
Thanks very much! |
For #2376