Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Wildcard binders in type declarations (amend 425) #641
Wildcard binders in type declarations (amend 425) #641
Changes from all commits
bf5cda4
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
This proposal was discussed at this pull request and amended by #641.
Invisible binders in type declarations
Contents
class
andinstance
StandaloneKindSignatures
We propose to allow invisible type variable binders (i.e.
@k
) in type declarations. Here are a few examples:We then propose to use these binders to simplify scoping rules and arity inference.
Relation to #155
This proposal is essentially the type-level counterpart to #155 "Binding type variables in lambda-expressions". While we do not have lambdas at the type level, we do have definition left-hand sides. Compare:
The subject of both proposals is the
@c
to the left of=
. The difference is that #155 is about functions, while this proposal is abouttype
,data
,newtype
,class
,type family
, anddata family
declarations.#155 has been accepted, but it is not implemented at the time of writing. Nevertheless, for the purposes of presentation, the text of this proposal assumes that term-level ``@t``-binders are already a part of the language.
Motivation
The goal of this proposal is to tidy up the language and to simplify certain aspects of it related to name resolution, implicit quantification, and arity inference.
We shall consider a number of various ways to arrive at the idea of
@k
-binders. While each individual argument may seem weak, together they form a compelling reason for the addition.Argument 1: Symmetry between Terms and Types
Let us consider a kind-polymorphic class
C
, such as the following:In this declaration,
a
is a binding site for a type variable, whereask
is a usage site. You can easily verify this claim by trying to duplicate the binder:Where is
k
bound, then? That is where implicit quantification comes into play. Compare with the following term-level definition:Here,
x
andy
are variable binders, whilea
is a usage of an implicitly quantified type variable. However, in terms there is a way to binda
explicitly:This proposal introduces the type-level equivalent of that feature:
As a consequence, the new syntax makes the language more uniform and consistent.
Argument 2: Symmetry between
class
andinstance
Let us once again consider the kind-polymorphic class introduced earlier:
The kind of
C
isforall k. k -> Constraint
. With this definition, all of the following instances are permitted:Note how
Int
,Maybe
, andFalse
are all of different kinds. That is possible because we can instantiatek
differently in each instance. If we also enableTypeApplications
, we can make this clear:This choice of syntax makes it apparent that
C
is in fact a multi-parameter type class of kindforall k. k -> Constraint
. The first parameter ofC
isk
, the second parameter isa
, and both can be instantiated.One might expect that if it is possible to instantiate
k
by writingC @Type
,C @(Type -> Type)
, orC @Bool
, then the syntax to abstract overk
would beC @k
. Unfortunately, that is not the case:This proposal lifts this restriction.
Argument 3: Deduplication with
StandaloneKindSignatures
Now let us consider a slightly more complicated example:
The point of interest here is the
i
parameter ofC
. You will notice two things about it:i
is used as part of a lengthy, syntactically large type (i -> i -> i
)i
is also used in the body of the class declaration (in the type of its method)Now, assume we want for one reason or another to add a standalone kind signature to
C
:This works, but now the
i -> i -> i
part is repeated twice, resulting in undesirable code duplication. One might attempt to deduplicate by removing the inline kind annotation ona
:Alas, now GHC will interpret this code very differently! The
i
in the standalone kind signature forC
and thei
in the class methodp
are no longer in any way related, and the type ofp
changes as follows:This wouldn't happen if
i
from the standalone kind signature scoped over the class body, as one might expect withScopedTypeVariables
enabled. However, this is simply not the case:StandaloneKindSignatures
andScopedTypeVariables
do not interact. One might argue that this is a reason to make them interact in the expected way, but this proposal offers no judgement in this regard. Instead, it offers an alternative that sidesteps the issue entirely:By writing
@i
, we bind the type variable, making it scope over the class body; at the same time, we avoid repeatingi -> i -> i
.Argument 4: Quantification in Type Synonyms
Consider:
T1
is currently legal, yieldingT :: forall a. Maybe a
. The general rule is that the free variables of a top-level kind annotation on the RHS are brought into scope implicitly, and will be quantified in the final kind of the type constructor.In constrast,
T2
is currently illegal, because the kind annotation is not at the top level.We propose to drop this exotic form of implicit quantification from the language. Both
T1
andT2
would become illegal, but with@k
-binders the programmer can rewrite them as follows:This way all the variables occurring on the RHS are bound on the LHS. We exploit the new syntax to allow a nice, simple, uniform scoping rule. To cite the User's Guide, "The reason for this exception [the strange, ad-hoc rule about top-level kind annotations] is that there may be no other way to bind k".
Argument 5: Arity Inference
Arity is a property of type synonyms and type families that determines how many arguments are required at use sites (partial application is not allowed). The notion of arity is described in more detail in section 6.4.9.2.1. "Type family declarations" of the User's Guide.
Importantly, arity cannot be determined by looking at the kind of a type constructor. Consider
F
declared as follows:The compiler can either assign it the arity of 1 or 2, and this choice will determine whether the equations of the type family can pattern match on
k
. This will also determine whether a higher-kinded usage ofF
is possible.Thus, both arities can make sense depending on intended usage. Currently, GHC expects the programmer to employ an unsightly technique to specify the arity. By default, arity inference tries to include as many forall-bound variables as possible, to maximize the expressivity at definition site (at the cost of higher-kinded usage). However, the user may opt out by duplicating the return kind of the type family in its header:
With
@
-binders we can do the opposite. We propose that by default, arity inference would include as few forall-bound variables as possible, to allow higher-kinded usage. However, it shall also include all@
-bound variables:This would simultaneously reduce code duplication and simplify the rules for arity inference.
Argument 6: Explicit Binding Sites
One might expect that for any implicitly quantified (type) variable, it would be possible to bind it explicitly. For example, in ordinary type signatures we can use
ExplicitForAll
to do it:That is currently not the case in type declarations. Let us once again consider a kind-polymorphic type class:
How would one bind
k
? This proposal provides an answer:This also increases expressivity in the presence of ambiguous variables:
In practice, it is easy to work around this problem by using
forall a ->
instead offorall a.
, and yet the need for the workaround reveals a rough edge in the design of the language.Motivation Summary
Support for
@k
-binders in type declarations will have the following positive effects:class
andinstance
declarations can be more symmetricNow let us propose two additional changes that are only tangentially related to
@k
-binders, but follow the spirit of "Argument 4" about using the LHS exclusively to determine scope.Addendum: Quantification in Type Family Instances
Consider:
The definitions and instances of
F1
, andF2
, andF3
are equivalent, and all of them are already allowed today.Notice that in
F2
, the@(j -> j)
is not a binding site forj
. If we want to bindj
explicitly in an instance (as opposed to a declaration), we use an explicitforall j.
, as inF3
. That is why it is possible today and does not require the@k
-binders introduced in this proposal.In
F1
, thej
is only mentioned on the right-hand side, and yet is implicitly quantified. This implicit quantification behavior is sometimes counterintuitive, so we propose that all type variables must be bound on the LHS. That is,F1
would become illegal, whileF2
andF3
would remain accepted.This is quite similar to the implicit quantification rules for type synonyms that we presented in "Argument 4".
Addendum: Instantiation of Type Family Instances
Consider:
From the family declaration we see that
F :: forall k. Type -> k
. The twotype instance
declarations appear to have an identical head, but by looking at the RHS we can infer that the invisible kind argument ofF
isType
in the first instance, andType -> Type
in the second. It would be much clearer to write:and indeed this is already legal.
We propose to require that the type instance be fully determined by the LHS, so that the programmer sees two visibly distinct instance heads. For the purpose of determining the LHS, the RHS is ignored. So under this proposal the instance:
would mean:
(where the LHS instantation is at an unconstrained kind
k
). Now the kind of the RHS if fixed to bek
, and the kind ofChar
does not match that, so the declaration is rejected.The principle is that it should be possible to see what instance the programmer intended by looking only at the instance head (the LHS). This property already holds for data family instances. Suppose
D
is a data family of kind:Now consider
So what kind do
p
andq
have? No clues from the header, but from the data constructor we can clearly see thatr :: Type->Type
. Does that mean that the the entire data instance is instantiated atType
like this:Or does it mean that the GADT data constructor specialises that kind argument, thus:
(It might be specialised differently in some other data constructor
MkD2
). GHC avoids this question by determining the instance header solely from the header. This proposal simply extends the same principle to type family instances.Proposed Change Specification: Declarations
Syntax
RelaxGeneralize thesyntactic checkgrammar of type variable binders indata
,newtype
,type
,class
,type family
, anddata family
declarations to allow@k
-bindersin their headers, wildcards, and combinations thereof:The occurrences of
@
must be prefix, as defined by #229.GuardedNew forms of binders are guarded behind a new flag,-XTypeAbstractions
.Scope
In type synonym declarations, require that every variable mentioned on the RHS must be bound on the LHS. For three releases before this change takes place, include a new warning
-Wimplicit-rhs-quantification
in-Wcompat
, to inform users of affected definitions.Kind Inference
When a declaration has no standalone kind signature, a
@k
-binder gives rise to aforall k.
quantifier in the inferred kind signature. The inferredforall k.
does not float to the left; the order of quantifiers continues to match the order of binders in the header.Kind Checking
To kind-check a declaration that has a standalone kind signature (SAKS), we must associate the quantifiers of the kind signature with the binders of the type declaration. We call this SAKS zipping. For example, consider the following declaration:
Here we produce the following pairs:
Notice that each quantifier is associated with either one binder or none.
This association plays two roles:
z
, which is paired with the sixth quantifierforall d ->
, so the arity is6
(see also "Arity Inference" below).y
is associated with the quantifier(b,c) ->
, soy
must have kind(b,c)
. Similarly the binder@t
is associated with the quantifierforall b.
, sot
is simply a name forb
.SAKS zipping works over two lists: quantifiers (from the signature) and binders (from the declaration). Let us define it in pseudo-code:
Where the predicates are defined as follows (with
⟦ ... ⟧
denoting AST quotation):Arity Inference
When SAKS zipping is done, some quantifiers may remain. Consider:
The produced pairs are:
Zipping stops when binders are exhausted, so the
forall b.
does not yield a pair. Instead, it becomes a part of the return type. We call the remaining quantifiers trailing.In today's GHC, there is an additional step called arity inference to decide which of the trailing quantifiers to include in the arity in addition to the zipped ones.
We propose to remove this step entirely, so that the arity is fully determined by SAKS zipping, as
@k
-binders provide the same control over arity but in a more principled way.Inferred Variables
In addition to visible (
forall x ->
) and invisible (forall x.
) quantification, GHC features inferred quantificationforall {x}.
.We leave it out of scope of this proposal and intentionally do not introduce
@{k}
-binders. See "Alternatives" for reasoning.Proposed Change Specification: Instances
The changes to instances are not directly related to the main body of the proposal, but they are close to it in spirit, so we include them here.
Scope
In type family and data family instances, require that every variable mentioned on the RHS must also occur on the LHS.
Instantiation
In type family and data family instances, the instantiation is fully determined by the left hand side, without looking at the right hand side.
Examples
Here's an example from the wild (thanks Jakob Bruenker):
In current GHC this typechecks, but the type family is not total. Why? Because the fully-explicit version is:
Notice the repeated
b
on the LHS. The author was entirely unaware that the resulting type family was partial, because the equation he wrote looked total. With the proposed change to instantiation, the original program:would be rejected. Why? Because the LHS imposes no kind constraints, so we get:
so the RHS must have kind
Relation a c
. But it doesn't;rel
has kindRelation a b
. So the declaration is rejected, which would have saved Jakob some time.Effect and Interactions
The proposed changes provide the programmer with a more principled way of brining type variables into scope in certain corner cases, simplify arity inference and scoping rules.
Costs and Drawbacks
The proposed changes break existing code in a few ways.
Breakage 1: Scoping in Type Synonyms
The first source of breakage is a change in the scoping of type synonyms. Consider:
This is no longer accepted, as
a
is not bound on the LHS. Instead, the user must write:See "Argument 4: Quantification in Type Synonyms" for the motivation.
There's no backwards-compatible way to rewrite this example, so we introduce the
-Wimplicit-rhs-quantification
warning, wait for three releases (in accordance with 3-Release-Policy), and only then make the change.Breakage 2: Arity Inference
Background. Every type synonym and type family has an arity, which specifies the number of arguments that must be supplied at every usage of the type synonym or family. Note that arity is distinct from a kind. For example:
Even though
F1
andF2
have the same kind (Type -> Type -> Type
), they have a different arity. Thus, writing e.g.StateT Int (F1 b) a
is allowed, whileStateT Int (F2 b) a
would not be: the latter does not fully applyF2
to all of its arguments, and GHC does not (yet!) support unsaturated type families (or synonyms).Importantly, arity applies both to visible arguments, like the ones above, and invisible arguments, such as appear in e.g.
type F3 :: forall k. k -> k
.Breakage. The second source of breakage is the change to arity inference. Consider:
This definition is currently assigned the arity of 2, but with the proposed changes will be assigned arity of 1. The arity of 2 is important, because the equations match on the kind variable. (The second equation does this, too, by choosing
k = Type -> Type -> Type
, according to the RHS. This confusing situation is also barred by this proposal.) To keep the arity of 2, the user must rewrite it thus:See "Argument 5: Arity Inference" for the motivation.
We assume that this is an obscure situation and the change will go unnoticed by most users, because in order to encounter it, one must use
StandaloneKindSignatures
– a relatively recent addition in itself.No migration strategy is provided: GHC must assign some arity to definition with a trailing invisible variable, and having this behavior user-configurable seems undesirable.
Breakage 3: Scoping in Instances
The third source of breakage is the new requirement that variables mentioned on the RHS must also occur on the LHS. Consider:
This is no longer accepted, as
j
is not mentioned on the LHS. The user must rewrite it as follows:See "Addendum: Quantification in Type Family Instances" for the motivation.
The
@
-binders in type instances are already legal, so there's no need for a migration strategy: the fix is backwards-compatible.Breakage 4: Instantiation
The fourth source of breakage is that instantiation of type/data family instances is fully determined by the left hand side, without looking at the right hand side. Consider:
This is no longer accepted, as the LHS
F Int
does not fully determine if the instance matches, as evidenced by the presence of two instances with identical LHSs.The user must rewrite this code as follows:
See "Addendum: Instantiation of Type Family Instances" for the motivation.
The
@
-binders in type instances are already legal, so there's no need for a migration strategy: the fix is backwards-compatible.Alternatives
We could also introduce
@{k}
-binders forforall {x}.
-quantifiers, but that is not actually symmetric with use sites, wheref @{x}
is not possible. In fact, it would be against the spirit offorall {x}.
, as the reason inferred variables are not subject to type applications is that we don't want their order to matter.Some thoughts on the topic can be found in GitHub comment 326/634791269.
The grammar of binders could be further generalized to allow nested parentheses and nested kind annotations:
The committee decided against this option to avoid the introduction of another recursive data type to the AST.
Unresolved Questions
None at the moment.