`typebind` and `autobind` modifiers for operator fixity #3113
Comments
You might want to draw similarities with, and learn lessons from, the OCaml extension for user-defined binding operations. |
Anything in particular you think should be referenced? |
If so, why bother with Apart from that, I don't understand much need of What I mean is the following. Consider, I declared your original
    autobind infixr 7 **
    record (**) (x : Type) (y : x -> Type) where
      constructor MkDPair
      fst : x
      snd : y fst
As far as I understand, expression So, why not to merge UPD: Or, even, desugar |
Thank you for your feedback.
The answer is in the text in two places:
There is no user-facing benefit in providing two syntaxes for the same thing. From a teaching and learning perspective, this is even a detriment. Why explain to a student that they can write I invite you to look at the error messages the distinction provides: they showcase the ability to identify syntax misuse and provide helpful diagnostics to reach a resolution, all while teaching the user the concepts needed to customise the behavior of the operator:
What's more this proposal leaves enough room to write |
I think the distinction between
The distinction is only about the allowed token
The distinction between |
Thank you for your comment, there are a couple of things that I would like to address:
About overloading syntax
If there is no difference between those two statements, why even have two statements in the first place? I appreciate that you aim to simplify things, but in that effort, you seem to have lost the original motivation for this feature. To reiterate the design goal mentioned above:
Introducing two overlapping syntaxes does not make things easier to learn, or to use. Nor does it address the hard cases where elaboration is insufficient to infer a type.
About what users should type
You say
But I do not see any motivation for this argument. Why is the distinction suboptimal and the overload in syntax optimal? What is the problem this is attempting to solve? Without precisely setting the scope of your idea, it is impossible to assess its validity. A very big question your suggestion raises is "Why are we breaking the cultural norm that things on the right side of a
To conclude
I am not sure I see the benefit in the suggestion you put forward. It also strikes me that it does not take into account the above suggestion to combine the two keywords to enable the overloaded syntax. Nor does it provide compelling use-cases for existing or future code. Additionally, either of those points would probably be part of a separate orthogonal proposal, rather than this one, since it seems to tackle a very different design space. I'm happy to revisit this once more details are presented. |
I think embedding a type theory in Idris is a valid use case which requires
The two can still be combined, but my point was the embedded type theory use case I had in mind, so I did not mention that. For example here is a strawman proposal for combining the two modes:
This also keeps open the possibility of supporting other binder modes such as |
Hi there, It would be nice if your answer was a little bit more comprehensive. For scoping reasons, I will ignore the alternative proposal and invite you to write your own via the issue tracker. With that said, you write
I think so too! This is why it is also part of the motivating examples of the proposal. The code samples using However, your answer fails to address any of the weaknesses presented above: Why is it worse to write You've also not said anything about duplicated syntax, ease of use, or error messages. Is it because you do not care about those things? You did not have time to write them out? You agree with me that they should be part of the design process? Again, an explicit and comprehensive answer would be great. Text communication is hard enough without having to read between the lines, let's make it as easy as possible for everyone. |
Not really, rather I did not discuss these because these concerns are (mostly) orthogonal to my main concern. My main concern is explicitly:
My side concern is:
|
I agree with this. If the declaration-site distinction between the two modes were to be dropped the same operator I totally support the declaration-site distinction between |
If your suggestion does not take into account those aspects, I do not think it can be taken seriously. We're building a language to help people write better software, and making the syntax clear, consistent, easy to use, and able to provide great feedback is an integral part of the design process. If you think it isn't, then you are not in the right place.
About your main concern
I think you've said it yourself
edit: I just saw your second reply.
I am very thankful for this statement, it makes crystal clear what you are looking for.
About your side concern
This is a single desugaring step. There is no interaction with the elaborator, or the typechecker.
The error messages designed are displayed above, maybe you could point toward improvements to make. They are also a large part of the diff of the PR, more than the feature itself. Unless you come up with better error messages, saying "keep the errors less confusing" does not in fact keep the errors less confusing. I will ignore the comment about type universes because this thread is not about type universes. |
There is no difference other than the token used. We can spend time bikeshedding if the declaration-site distinction should be spelled with the exact keywords |
Current status
clean up the parser for `->` as typebind: probably not worth the effort
Summary
This proposal adds two new modifiers for fixity declarations that change the behavior of operators, allowing them to bind their left argument to a name that can then be used on their right-hand side. This feature mimics the syntax of `->` but makes it user-definable for any operator: `typebind infixr 0 =@`.
A prototype of this feature is available here: https://github.com/andrevidela/idris2/tree/autobind
Motivation
One of the lost features between idris1 and idris2 is arbitrary rebindable syntax blocks. While powerful, they were slow, a source of errors, and increased the complexity of the language, for little benefit in real-world programs. In simple terms: they did not pull their own weight.
The motivation for arbitrary rebindable syntax hasn't gone away: the ability to write Domain Specific Languages (DSL) that rely on existing notation conventions to reduce the cognitive load of learning a language within a language.
The `typebind` and `autobind` modifiers aim to address this original motivation, without reintroducing the problems that led to arbitrary syntax blocks being abandoned.
Proposal
Augment operator fixity/precedence declarations with a modifier `typebind`/`autobind` telling the parser to desugar this operator as a binding operator.
The `typebind` keyword
A type-binding operator allows a binary function with the type `(x : Type) -> (x -> Type) -> …` to be written like a binder. Examples of such types are sigma types `DPair : (x : Type) -> (x -> Type) -> Type`, pi-types `Pi : (x : Type) -> (x -> Type) -> Type`, linear dependent arrows `LPi : (x : Type) -> (x -> Type) -> Type`, and containers `MkContainer : (shape : Type) -> (positions : shape -> Type) -> Container`.
Here is a definition for Sigma that uses `(**)` as an infix operator for its type constructor.
and can be used the same way we expect:
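As a rough illustration of the shape involved, here is a minimal sketch in plain Idris 2 that does not rely on the proposed modifier. The names `Sigma`, `MkSigma`, `witness`, `payload`, `SizedStrings` and `twoStrings` are assumptions made for this example (chosen to avoid clashing with the builtin `DPair`); the binder-style spelling appears only in a comment, since it needs the prototype to parse.

```idris
import Data.Vect

-- Stand-in for the sigma type. All names here are illustrative and were
-- chosen to avoid clashing with the builtin `DPair`.
record Sigma (x : Type) (y : x -> Type) where
  constructor MkSigma
  witness : x
  payload : y witness

-- Under the proposal, with a typebind operator standing for this type,
-- one would write something like
--   (n : Nat) ** Vect n String
-- which, per the desugaring rule, becomes the plain application below.
SizedStrings : Type
SizedStrings = Sigma Nat (\n => Vect n String)

-- A value of that type: a length together with a vector of that length.
twoStrings : SizedStrings
twoStrings = MkSigma 2 ["hello", "world"]
```

The second argument of `Sigma` is a function from the first, which is exactly the `(x : Type) -> (x -> Type) -> …` shape that a type-binding operator is meant to present as a binder.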
Because of their nature, binding operators can only be `infixr` and must be used in conjunction with a function that binds a name in its second argument.
Those restrictions make the following definition erroneous:
Unfortunately, because operator fixity definitions and function definitions are separate, we cannot detect this misuse at the time of declaration. Only when the function is used by the parser do we notice that the function does not bind any names in its second argument. See Error messages.
Because `typebind` operators cannot be confused with regular operators, they require the use of parentheses to be distinguished:
Because of the binding structure, typebind operators cannot be used with sections.
The `autobind` keyword
`typebind` works for representing operators that have a type dependency between their left and right argument. But the pattern `x -> (a -> b) -> c` has many more instances where there is no direct type dependency, yet we would like to provide the ability to make the operator look like a binder. The section Technical implementation explains the difference in terms of implementation.
A practical example that is not covered by `typebind` is a constructor `VPi : Value -> (Value -> Value) -> Value` for a language where the first argument is the type of the pi-type represented by the constructor and the second argument is the right-hand side of the pi-type, which depends on the type given as the first argument. Without any binding operator, nesting `VPi` looks like this:
We can achieve this with a more general binding operator which we write `(x := e) =>> fn x`. The syntactic difference with `typebind` is that we use `:=` instead of `:`; the technical difference is that the type of the binder does not have to match the type of the lambda. With this we can rewrite our example in a much more readable style:
Which is much more readable as it closely resembles the type it represents. What is more, by combining multiple binding operators we can clearly see important differences that are difficult to read without operators. For example, if our language has implicit arrows `=?>` and explicit arrows `=>>` we can write:
From which we can more easily see which arguments are implicit compared to:
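For reference, the operator-free nesting of `VPi` can be sketched in plain Idris 2 as follows. Only `VPi` and its type come from the text; the `VStar` constructor and the `polyId` term are assumptions added so that the example is self-contained, and the binder-style spelling is shown only in a comment.

```idris
-- Hypothetical value domain for an embedded language. Only `VPi` and its
-- type come from the text; `VStar` is an assumption added for illustration.
data Value : Type where
  VStar : Value
  VPi   : Value -> (Value -> Value) -> Value

-- The embedded type (a : *) -> (x : a) -> a, written with nested lambdas.
-- With an autobind operator `=>>` standing for `VPi`, the proposal would
-- let this be written as
--   (a := VStar) =>> (x := a) =>> a
polyId : Value
polyId = VPi VStar (\a => VPi a (\x => a))
```

Each extra binder adds another level of lambda and parentheses in the plain form, which is the readability cost the `:=` syntax is meant to remove.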
Error messages
One of the benefits of this approach is to provide helpful error messages when the syntax is slightly incorrect or when the expected modules are not imported.
Writing `MkSing : (x : ty) =@ Sing x` without importing `Data.Linear.Notation` will result in an error
Similarly, using a non-typebind operator in typebind style like `(n : True) && False` will result in an error:
Using any combination of `autobind` instead of `typebind` or regular operator is similarly detected and reported. For example writing `(n : Nat := Nat) ** Vect n String`, while equivalent to `(n : Nat) ** Vect n String` by our desugaring rules, will result in an error:
Example use-cases
Make `-@` dependent
Currently, we have a linear arrow for linear types, but it is not dependent. This is because it is simply defined as a non-dependent operator.
Defining a dependent operator would not fix the need for a custom linear dependent arrow since it would look like this:
However, using `typebind` our function `replicate` can look like this:
Use different operators for quantitative pairs
By "quantitative pairs" I mean pair types in which the left and right arguments have different quantities. For example, assuming we use the following convention:
`!` means "unrestricted"
`#` means "linear"
`-` means "erased"
Then we can build a collection of product types with different quantity semantics:
But more interestingly, instead of using them like this:
`Nat -# \x => Vect x String`
We can now write
`(x : Nat) -# Vect x String`.
Complex DSLs
One of the barriers to implementing OpenGames in Idris is the lack of customizable binding syntax. The Haskell version makes heavy use of template-Haskell and quasi-quotes to achieve this goal, but this implementation comes with a number of caveats that we won't have to deal with were it to be implemented in Idris with autobind.
Technical implementation
`typebind` is implemented as a desugaring step: `(n : a) ** b n` gets rewritten as `(**) a (\n : a => b n)`
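To see the right-hand side of that rewrite on a concrete constructor, here is a small sketch using the container signature quoted earlier. The operator `|>` mentioned in the comment and the `NatIndexed` example are hypothetical; the code itself is the already-desugared form and does not depend on the prototype.

```idris
import Data.Fin

-- Container as described by the `MkContainer` signature in the proposal.
record Container where
  constructor MkContainer
  shape : Type
  positions : shape -> Type

-- With a hypothetical typebind operator `|>` standing for `MkContainer`,
-- the binder form
--   (n : Nat) |> Fin n
-- would be rewritten to the application below: the left argument is passed
-- as-is, and the right-hand side becomes a lambda binding `n`.
NatIndexed : Container
NatIndexed = MkContainer Nat (\n => Fin n)
```

Because the rewrite only produces an ordinary application with a lambda, nothing beyond parsing is involved at this stage.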
This does not capture the entirety of the design space around binders. When writing a DSL, it is often the case that we want to bind something in a way that looks like a let-binding, which isn't captured by the desugaring above since the right side of the colon is used as the type of the lambda. For this we introduce additional syntax which is reminiscent of the current `let` binder.
Whenever the type of the binder does not match the type of the lambda we can write `(n := a) >>= op n` which is desugared as `(>>=) a (\n : ? => op n)`
The type of the argument of the lambda being left as an inferred type allows the bound name to be of a different type.
Whenever the type is different, but we want to explicitly give it, we can write: `(x : ty := val) *-* fn x` which is desugared as `(*-*) val (\x : ty => fn x)`
This syntactic pattern matches the current intuition around let-binders, which also use `binder : type := expr`.
This can be used as a DSL for loops:
(x : Nat := [1,2,3]) `for` show x
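The loop example can be sketched in its desugared form with plain Idris 2. `forEach` is a hypothetical stand-in for the `for` operator in the text, renamed here to avoid clashing with the Prelude's `for`, and `shown` is an illustrative name; the binder-style spelling appears only in a comment.

```idris
-- Hypothetical stand-in for the `for` operator of the loop DSL.
forEach : List a -> (a -> b) -> List b
forEach xs f = map f xs

-- Proposed surface syntax (needs the prototype):
--   (x : Nat := [1,2,3]) `forEach` show x
-- Desugared form: the value after `:=` is the first argument, and the
-- right-hand side becomes a lambda binding `x`. Note that `x` is a `Nat`
-- while the bound value is a `List Nat`, which is why `:` alone would not do.
shown : List String
shown = forEach (the (List Nat) [1, 2, 3]) (\x => show x)
```

The element type is pinned with `the` here only so that the sketch does not depend on lambda type annotations; in the proposal, the annotation on the binder plays that role.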
Neither of those two new syntax changes is breaking, and they do not require additional elaboration steps as they are part of the desugaring process.
Limitations around quantities
We could support quantities in binders by desugaring `(1 n : a) =@ b n` as `a =@ (\1 n => b n)`
However, this syntax is not supported in this proposal because it would cause confusion between the syntax of `->` and binding operators.
Take the linear function `(1 n : a) -> Sing n` which constructs a singleton from a linear value. Writing it with the new linear arrow operator looks like this: `(n : a) =@ Sing n` which is the expected syntax.
However, writing `(1 n : a) =@ Sing n` does not have an obvious meaning:
Additionally, the linearity binding is already given by the typing context: a function `fn : (x : Type) -> (x -@ Type) -> Type` already declares the binding structure to be linear. The ability to give a different quantity at the call site does not provide a relevant additional level of expressivity and is purely a source of errors.
Because of this, binders for operators are not allowed to define any quantity.
Alternative Considered
Attach `typebind`/`autobind` to functions rather than fixities
This should be technically possible but would incur a level of integration with the typechecker that I foresee as difficult and slow. This would make any overloading of normal/autobind operators rely on the typechecker, which means the file cannot be entirely parsed until we are done typechecking and vice-versa. However one could imagine an `%autobind` directive attached to a function definition such that, if the operator is used in a traditional infix way, but the function call is resolved to an `%autobind`-annotated one, we could emit an error saying "the function is meant to be used in `autobind` position but the operator is not declared as such, please add `autobind` to the operator's fixity". It is not clear to me that this is bringing anything to the table other than nicer error messages. Not that error messages can't get better, but the design space is already quite delicate to work around, and an `%autobind` directive at the function-definition site could be used for something slightly more useful, like allowing overloading of normal and autobind operators. Something we briefly explore in Potential extension.
Syntax blocks
This feature overlaps with previously defined syntax blocks, but those were buggy, hard to read, slow, and difficult to maintain. This feature aims to side-step those issues by relying on an existing mechanism of the language, and making it available to the user, rather than built-in and reserved. I also posit that the main uses for syntax blocks can be re-created with autobind operators, albeit not exactly in the same way, in a way that is clean enough to implement real-world domain-specific languages.
Do-nothing/do-notation
Do-notation already provides customisable binder syntax and the DSL writers have been surviving just fine. Could it be that this feature is unnecessary?
I believe that the examples provide the necessary use cases for such a feature that cannot be achieved today with do-notation.
Another benefit is that, while do-notation allows interleaving different (>>=) operations, using autobind operators would allow you to visually distinguish between different binding structures, rather than rely on Idris's overloading mechanism.
Allow right-to-left autobind
Technically speaking, it would be possible to allow `autobind` to bind arguments from right-to-left with `infixl` operators. This could be added at a later date without breakage. It is not part of this proposal because the use case is not widely documented in existing code at the time of writing. Whenever the case for it appears, we could revisit this decision and add right-to-left binding, covering even more of the feature space provided by syntax blocks. There are some interesting considerations to take into account, for example, facilitating the elaboration of terms where the dependency propagates right-to-left like for equational reasoning in Agda.
Q&A
Why the name?
`typebind` refers to the ability to bind a type to a name, `autobind` refers to the ability to automatically infer the type of the binder. Coming up with names is difficult, especially keywords that eat into the namespace of possible identifiers. I think both strike a nice balance between a descriptive name and one that won't intrude into existing, and future, code bases. `bind` and `binder` are common names that are likely to introduce breakage in existing libraries, while `typebind` and `autobind` aren't existing English words or constructs in the language.
What's up with `**`, `DPair` and `MkDPair`?
`**` could be defined as
Unfortunately, this will not work since `(x : a) ** f x` is incompatible with `(x ** y)`. We need special handling of the overloading of `(**)` as both a typebind and a regular operator. I believe this can be achieved in the same way that `-` is both a prefix and an infix operator. In the prototype, I did not touch any of the special code for `(**)`
Why not rewrite `->` as a binding operator?
`->` behaves quite differently from other binding operators, in particular when it comes to quantities: it is not the case that `(1 x : a) -> b x` desugars to `a -> (\1 x => b x)`, rather it indicates that the inhabitant of that type uses its argument linearly, which is not what binding operators allow.
This is nice but it's not going far enough
It would be great if we could completely remove the special cases for `**` and unify `-@` and `=@`. Technically speaking, this could completely supplant do-notation and let-syntax. However, this proposal errs on the side of caution by making conservative choices with regard to overloading and syntax disambiguation. Eager syntax changes have led to difficult-to-solve inconsistencies which are still at play today (see issues about namespaces and dot-projections). Despite those restrictions, I think this proposal still provides a substantial improvement in expressive power, while avoiding pitfalls previously encountered, namely slow disambiguation processes, ambiguous syntax, and confusing hard-to-read code. If those restrictions prove superfluous, lifting them will be a backward-compatible change and can be done without fear.
Why do we need this? Nobody's going to use it?
This feature will see the most use as users employ functions such as `(**)` and `=@`, as well as new operators for quantitative pairs. I expect the use of this feature in fixity declarations to be niche for most users, since we do not expect many users to write DSLs. But for language designers and library writers, the ability to customise binders is extremely useful, and I expect to see this feature used by this corner of the community. Additionally, one could be tempted to think that adding this feature will add complexity and cruft to the existing parser, but this feature already exists in the form of `(**)`. What this proposal does is render this implementation more principled and customisable, avoiding the pitfall of adding complex special-case code that was introduced in idris1 with syntax-blocks.
Why can't we overload
`-@` for both dependent and non-dependent arrow?
We could but the error message would get worse. The current prototype implementation detects when a binding operator is used in non-binding position and reports an error explaining how to fix the program. Allowing the overload would mean that we desugar `Nat -@ Nat` as `(-@) Nat (\_ : Nat => Nat)` in addition to the previously described desugaring. However, this is problematic for the following reasons:
If we write `x -@ z` and it does not typecheck, we do not know if it's because the user meant `(x : _) -@ z` or `(_ : x) -@ z` and so we cannot report the correct fix to the user. In the current version, we tell the user to write `(_ : x) -@ z`.
In `let x = Nat in x -@ y x`, we don't know if we mean to shadow `x` or to reference `x`. In the current implementation, `-@` would have to be regular and `x` is a reference. If `-@` was typebind, we report the error appropriately.
With `(**)` we would be tempted to write `x ** Fin x` but with the above rule, this would be desugared to `(**) x (\_ : x => Fin x)`, but what the user meant is `(**) ? (\x => Fin x)`.
All of those are solvable problems, but I do not wish to attempt to solve them in this proposal.
Why do we need so much special syntax for binders?
Technical details should answer this from a technical perspective; from a more philosophical one, the goal of this feature is to rely on existing user intuition about binders and types to provide more expressive power, without creating confusion or unexpected behavior. In essence, the goal is to make easy things easy, and hard things possible. Most users and library designers will get along just fine with `typebind`, but for more sophisticated uses `autobind` allows greater expressivity, while keeping a familiar look and feel. `autobind` itself has two modes, `(x := e) ** v` and `(x : ty := e) ** v`, to properly encapsulate everything a user might want to do with a binder. Because autobind behaves exactly like a let-binder, I have repurposed the syntax to piggyback on existing knowledge from users.
Funny things you probably shouldn't do
Custom `let` bindings
Forego all do-notation
Potential extension
In Alternative considered we mention the use of `%typebind` and `%autobind` directives for function definitions. While this does not seem to fit in the design space of operators, it can be used to declare function definitions as binding operators, allowing it to cover even more use-cases of syntax blocks.
Indeed one could write the constructor for sigma types as
Or clean up the syntax for the for-loop DSL:
We could also completely replace let-syntax:
Because this syntax conflicts with function application and requires more elaborate changes to the parser, it is not part of the proposal.
Allow dependent-pair-like syntax
A previous version of this proposal allowed for `(x : a ** b)` for a typebind operator. However due to the syntactic inconsistency that would bring when paired with other binding operators, it has been demoted to a future extension.
The problem can be seen with this expression
`(x : a ** b) =@ f x`
Should this be parsed as
`((x : a) ** b) =@ c`, valid if `**` is typebind and `=@` regular
`(x : (a ** b)) =@ c`, valid if `**` is regular and `=@` is typebind?
Both are valid given two different binding structures. I think the feature still has value in this form, but we should be able to detect and return errors when ambiguous syntax is used and ask the user to disambiguate. In that case that would mean writing either `((x : a) ** b) =@ c` or `(x : (a ** b)) =@ c`