Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

typebind and autobind modifiers for operator fixity #3113

Open
1 of 3 tasks
andrevidela opened this issue Oct 21, 2023 · 12 comments
Open
1 of 3 tasks

typebind and autobind modifiers for operator fixity #3113

andrevidela opened this issue Oct 21, 2023 · 12 comments

Comments

@andrevidela
Copy link
Collaborator

andrevidela commented Oct 21, 2023

Current status

Summary

This proposal is for adding two new modifiers for fixity declarations that transform the behavior of operators to allow them to bind their left argument to a name such that it can be used on its right-hand-side. This feature mimics the syntax of -> but make it user-definable for any operator: typebind infixr 0 =@.

A prototype of this feature is available here: https://github.com/andrevidela/idris2/tree/autobind

Motivation

One of the lost features between idris1 and idris2 is arbitrary rebindable syntax blocks. While powerful, they were slow, a source of errors, and increase the complexity of the language, for little benefit in real-world programs. In simple terms: they did not pull their own weight.

The motivation for arbitrary rebindable syntax hasn't gone away: the ability to write Domain Specific Languages (DSL) that rely on existing notation conventions to reduce the cognitive load of learning a language within a language. typebind and autobind modifiers aim to address this original motivation, without introducing the reasons why arbitrary syntax blocks were abandoned.

Proposal

Augment operator fixity/precedence declaration with a modifier typebind/autobind telling the parser to desugar this operator as a binding operator.

The typebind keyword

A type-binding operator allows binary function with the type (x : Type) -> (x -> Type) -> … to be written like a binder. Examples of such types are sigma types: DPair : (x : Type) -> (x -> Type) -> Type, pi-types Pi : (x : Type) -> (x -> Type) -> Type, linear dependent arrow LPi : (x : Type) -> (x -> Type) -> Type, Container MkContainer : (shape : Type) -> (positions : shape -> Type) -> Container

Here is a definition for Sigma that uses (**) as an infix operator for its type constructor.

typebind infixr 0 **

record (**) (x : Type) (y : x -> Type) where
  constructor MkDPair
  fst : x
  snd : y fst

and can be used the same way we expect:

pair : (n : Nat) ** Fin n
pair = MkDPair 3 FZ

Because of their nature, binding operators can only be infixr and must be used in conjunction with a function that binds a name in its second argument.
Those restrictions make the following definition erroneous:

typebind infixr 7 /\

record /\ (a, b : Type) where
  constructor And
  fst : a
  snd : b

pair : (x : Int) /\ String
pair = 3 `And` "four"

Unfortunately, because operator fixity definitions and function definitions are separate, we cannot detect this misuse at the time of declaration. Only when the function is used by the parser do we notice that the function does not bind any names in its second argument. See Error messages.

Because typebind operators cannot be confused with regular operators, they require the use of parenthesis to be distinguished:

(x : a) ** b -- ok, new syntax, only works if compatible with surrounding operators, existing operator rules apply
(a ** b)     -- not ok, (**) is typebind
x : a ** b   -- not ok, this will be parsed as (x : (a ** b)) 
(x : a) ** (y : b) ** c -- ok, parsed as (x : a) ** ((y : b) ** c)

Because of the binding structure, typebind operators cannot be used with sections.

(a **) -- not ok, (**) is binding
(x : a **) -- not ok
((x : a) **) -- not ok
(** b) -- still not ok

The autobind keyword

typebind works for representing operators that have a type dependency between their left and right argument. But the pattern x -> (a -> b) -> c has many more instances where there is no direct type dependency, yet we would like to provide the ability to make the operator look like a binder. The section Technical Implementation explains the difference in terms of implementation.

A practical example that is not covered by typebind is a constructor VPi : Value -> (Value -> Value) -> Value for a language where the first argument is the type of the pi-type represented by the constructor and the second argument is the right-hand-side of the pi-type which depends on the type given as the first argument. Without any binding operator, nesting VPI looks like this:

VPi VStar                 (\fstTy -> VPi
(VPi fstTy (const VStar)) (\sndTy -> VPi
fstTy                     (\val1 -> VPi
(sndTy `vapp` val1)       (\val2 ->
VSigma fstTy sndTy)))))

We can achieve this with a more general binding operator which we write (x := e) =>> fn x. The syntactic difference with typebind is that we use := instead of :, the technical difference is that the type of the binder does not have to match the type of the lamda. With this we can rewrite our example in a much more readable style:

autobind infixr 0 =>>
(=>>) : Value -> (Value -> Value) -> Value
(=>>) = VPi

mkSigma : Value
mkSigma = 
    (fstTy := VStar) =>>
    (sndTy := (_ := fstTy) =>> VStar) =>>
    (val1 := fstTy) =>>
    (val2 := sndTy `vapp` val1) =>>
    VSgima fstTy sndTy

Which is much more readable as it closely resembles the type it represents. What is more, by combining multiple binding operators we can clearly see important difference, that are difficult to read without operators. For example, if our language has implicit arrows =?> and explicit arrows =>> we can write :

-- explicit arrow
autobind infixr 0 =>>
(=>>) : Value -> (Value -> Value) -> Value
(=>>) = VPi

-- implicit arrow
autobind infixr 0 =?>
(=?>) : Value -> (Value -> Value) -> Value
(=?>) = ImpPi

mkSigma : Value
mkSigma = 
    (fstTy := VStar) =?>
    (sndTy := (_ := fstTy) =>> VStar) =?>
    (val1 := fstTy) =>>
    (val2 := sndTy `vapp` val1) =>>
    VSgima fstTy sndTy

From which we can more easily see which arguments are implicit compared to:

ImpPi VStar               (\fstTy -> ImpPI
(VPi fstTy (const VStar)) (\sndTy -> VPi
fstTy                     (\val1 -> VPi
(sndTy `vapp` val1)       (\val2 ->
VSigma fstTy sndTy)))))

Error messages

One of the benefits of this approach is to provide helpful error messages when the syntax is slightly incorrect or when the expected modules are not imported.

Writing MkSing : (x : ty) =@ Sing x without importing Data.Linear.Notation will result in an error

(=@) is not typebind. Ensure you have imported the proper modules to use this operator, or define (=@) 
as a typebind operator.

Similarly, using a non-typebind operator in typebind style like (n : True) && False will result in an error:

(&&) is not a type-binding operator, but a regular operator, please write (True && False) or import the 
proper module that defined (&&) as type-binding

Using any combination of autobind instead of typebind or regular operator is similarly detected and reported. For example writing (n : Nat := Nat) ** Vect n String, while equivalent to (n : Nat) ** Vect n String by our desugaring rules, will result in an error:

(**) is not an automatically-binding operator, bur a type-binding operator, please write 
(n : Nat ** Vect n String), or import a module that defines (**) as automatically-binding

Example use-cases

Make -@ dependent

Currently, we have a linear arrow for linear type, but it is not dependent. This is because it is simply defined as a non-dependent operator.

Defining a dependent operator would not fix the need for a custom linear dependent arrow since it would look like this:

-- =@ for dependent arrow -@ for non-dependent
0 (=@) : (x : Type) -> (x -> Type) -> Type
(=@) a b = (1 v : a) -> b v

replicate : {0 a : Type} -> {0 v : a} ->
    Nat =@ (\n => n `Copies` v -@ LVect n x)

However, using typebind our function replicate can look like this:

replicate : {0 a : Type} -> {0 v : a} ->
    (n : Nat) =@ n `Copies` v -@ LVect n x

Use different operators for quantitative pairs

By "quantitative pairs" I mean pair-types in which left and right arguments have different quantities. For example. Assuming we use the following convention:

  • ! means "unrestricted"
  • # means "linear"
  • - means "erased"
    Then we can build a collection of product types with different quantity semantics:
a !! b -- unrestricted on the left, unrestricted on the right, isomorphic to Pair
a !# b -- unrestricted on the left, linear on the right
a !- b -- unrestricted on the left, erased on the right
a #! b -- linear on the left, unrestricted on the right
a ## b -- linear on the left, linear on the right, isomorphic to LPair
a #- b -- linear on the left, erased on the right
a -! b -- erased on the left, unrestricted on the right
a -# b -- erased on the left, linear on the right
There is no (--) since it would be parsed as a comment and it's already expressible with (0 _ : Pair a b)

But more interestingly, instead of using them like this: Nat -# \x -> Vect x String
We can now write (x : Nat) -# Vect x String.

Complex DSLs

One of the barriers to implementing OpenGames in Idris is the lack of customizable binding syntax. The Haskell version makes heavy use of template-Haskell and quasi-quotes to achieve this goal, but this implementation comes with a number of caveats that we won't have to deal with were it to be implemented in Idris with autobind.

Technical implementation

typebind is implemented as a desugaring step:

(n : a) ** b n gets rewritten as (**) a (\n : a => b n)

This does not capture the entirety of the design space around binders. When writing a DSL, it is often the case that we want to bind something in a way that looks like a let-binding, which isn't captured by the desugaring above since the right-side of the colon is used as the type of the lambda. For this we introduce additional syntax which is reminiscent of the current let
binder.

Whenever the type of the binder does not match the type of the lambda we can write

(n := a) >>= op n which is desugared as (>>=) a (\n : ? => b n)
The type of the argument of the lambda being left as an inferred type allows the bound name to be of a different type.

Whenever the type is different, but we want to explicitly give it, we can write:

(x : ty := val) *-* fn x which is desugared as (*-*) val (\x : ty => fn x)

This syntactic pattern matches the current intuition around let-binder which also use binder : type := expr.
This can be used as a DSL for loops: (x : Nat := [1,2,3]) `for` show n

Neither of those two new syntax changes are breaking and they do not require additional elaboration steps as they are part of th desugaring process.

Limitations around quantities

We could support quantities in binders by desugaring (1 n : a) =@ b n as a =@ (\1 n => b n)

However, this syntax is not supported in this proposal because it would cause confusion between the syntax of -> and binding operators.

Take the linear function (1 n : a) -> Sing n which constructs a singleton from a linear value. Writing it with the new linear arrow operator looks like this: (n : a) =@ Sing n which is the expected syntax.
However, writing (1 n : a) =@ Sing n does not have an obvious meaning:

  • Is it the same as without the quantity and behaves like a linear arrow?
  • Does it mean the type constructor takes a linear argument like the desugaring suggests?

Additionally, the linearity binding is already given by the typing context: a function fn : (x : Type) -> (x -@ Type) -> Type already declared the binding structure to be linear. The ability to give a different quantity at call-site does not provide a relevant additional level of expressivity and is purely a source of errors.

Because of this, binders for operators are not allowed to define any quantity.

Alternative Considered

Attach typebind/autobind to functions rather than fixities

This should be technically possible but would incur a level of integration with the typechecker that I foresee as difficult and slow. This would make any overloading of normal/autobind operator rely on the typechecker, which means the file cannot be entirely parsed until we are done typechecking and vice-versa. However one could imagine an %autobind directive attached to a function definition such that, if the operator is used in a traditional infix way, but the function call is resolved to an %autobind-annotated one, we could emit an error saying "the function is meant to be used in autobind position but the operator is not declared as such, please add autobind to the operator's fixity". It is not clear to me that this is bringing anything to the table other than nicer error messages. Not that error messages can't get better, but the design-space is already quite delicate to work around and %autobind-directive at function-definition side could be used for something slightly more useful, like allowing overloading of normal and autobind operators. Something we briefly explore in Potential extension.

Syntax blocks

This feature overlaps with previously defined syntax blocs, but those were buggy, hard to read, slow, and difficult to maintain. This feature aims to side-step those issues by relying on an existing mechanism of the language, and making it available to the user, rather than built-in and reserved. I also posit that the main uses for syntax-block can be re-created with autobind operators, albeit not exactly in the same way, in a way that is clean enough to implement real-world domain-specific languages.

Do-nothing/do-notation

Do-notation already provides customisable binder syntax and the DSL writers have been surviving just fine. Could it be that this feature is unnecessary?
I believe that the examples provide the necessary use cases for such a feature that cannot be achieved today with do-notation.
Another benefit is that, while do-notation allows interleaving different (>>=) operations. Using autobind operators would allow you to visually distinguish between different binding structures, rather than rely on Idris's overloading mechanism.

Allow right-to-left autobind

Technically speaking, it would be possible to allow autobind to bind arguments from right-to-left with infixl operators. This could be added at a later date without breakage. It is not part of this proposal because the use case is not widely documented in existing code at the time of writing. Whenever the case for it appears, we could re-visit this decision and add right-to-left binding, covering even more of the feature space provided by syntax blocks. There are some interesting considerations to take into account, for example, facilitating the elaboration of terms where the dependency propagates right-to-left like for equational reasoning in Agda.

Q&A

Why the name?

typebind refers to the ability to bind a type to a name,autobind refers to the ability to automatically infer the type of the binder. Coming up with names is difficult, especially keywords that eat into the namespace of possible identifiers. I think both display a nice balance between a descriptive name, and one that won't intrude into existing, and future, code-bases. bind and binder are common names that are likely to introduce breaking in existing libraries, while typebind and autobind aren't existing english words or constructs in the language.

What's up with **, DPair and MkDPair?

** could be defined as

typebind infixr 0 **
namespace TypeConstructor
  (**) : (x : Type) -> (x -> Type) -> Type
  (**) = DPair

namespace ValueConstructor
  (**) : {0 a : Type} -> {0 b : a -> Type} -> (x : a) -> b x -> DPair a b
  (**) = MkDPair

Unfortunately, this will not work since (x : a) ** f x is incompatible with (x ** y). We need special handling of the overloading of (**) as both a typebind and a regular operator. I believe this can be achieved in the same way that - is both a prefix and an infix operator. In the prototype, I did not touch any of the special code for (**)

Why not rewrite -> as a binding operator?

-> behaves quite differently from other binding operators, in particular when it comes to quantities: it is not the case that
(1 x : a) -> b x desugars to a -> (\1 x => b x), rather it indicates that the inhabitant of that type uses its argument linearly, which is not what binding operators allow.

This is nice but it's not going far enough

It would be great if we could completely remove the special cases for ** and unify -@ and =@. Technically speaking, this could completely supplant do-notation and let-syntax. However, this proposal errs on the side of caution by making conservative choices with regard to overloading and syntax disambiguation. Eager syntax changes have led to difficult-to-solve inconsistencies which are still at play today (see issues about namespaces and dot-projections). Despite those restrictions, I think this proposal still provides a substantial improvement in expressive power, while avoiding pitfalls previously encountered. Namely, slow disambiguation processes, ambiguous syntax, and confusing hard-to-read code. If those restrictions prove superfluous, lifting them will be a backward-compatible change and can be done without fear.

Why do we need this? Nobody's going to use it?

This feature will see the most use as users employ functions such as (**) and =@, as well as new operators for quantitative pairs. I expect the use of this feature in fixity-delcarations to be, niche for most users, since we do not expect many users to write DSLs. But for language designers and library writers, the ability to customise binders is extremely useful, and I expect to see this feature used by this corner of the community. Additionally, one could be tempted to think that adding this feature will add complexity and cruft to the existing parser, but this feature already exists in the form of (**). What this proposal does is render this implementation more principled, and customisable. Avoiding the pitfall of adding complex special-case code that was introduced in idris1 with syntax-blocks.

Why can't we overload -@ for both dependent and non-dependent arrow?

We could but the error message would get worse. The current prototype implementation detects when a binding operator is used in non-binding position and report an error explaining how to fix the program. Allowing the overload would mean that we desugar Nat -@ Nat as (-@) Nat (\_ : Nat => Nat) in addition to the previously described desugaring. However, this is problematic for the following reasons:

  • When using a typebind operator as regular x -@ z and it does not typecheck, we do not know if it's because the user meant (x : _) -@ z or (_ : x) -@ z and so we cannot report the correct fix to the user. In the current version, we tell the user to write (_ : x) -@ z.
  • If we write let x = Nat in x -@ y x, we don't know if we mean to shadow x or to reference x. In the current implementation, -@ would have to be regular and x is a reference. If -@ was typebind, we report the error apropriately.
  • When using the type constructor for sigma (**) we woudl be tempted to write x ** Fin x but with the above rule, this would be desugared to (**) x (\_ : x => Fin x), but what the user meant is (**) ? (\x => FIn x).

All of those are solvable problems, but I do not wish to attempt to solve them in this proposal.

Why do we need so much special syntax for binders?

Technical details should answer this from a technical perspective, from a more philosophical one, the goal of this feature is to rely on existing user-intuition about binders and types to provide more expressive power, without creating confusion or unexpected behavior. In essence, the goal is to make easy things easy, and hard things possible. Most users and library designers will get along just fine with typebind but for more sophisticated uses autobind allows greater expressivity, while keeping a familiar look and feel. autobind itself has two modes (x := e) ** v and (x : ty := e) ** v to properly encapsulate everything a user might want to do with a binder. Because autobind behave exactly like a let-binder, I have repurposed the syntax to piggyback on existing knowledge from users.

Funny things you probably shouldn't do

Custom let bindings

autobind infixr 7 `MyLet`

MyLet : (val) -> (val -> rest) -> rest
MyLet arg fn = fn arg

program : Nat
program = (n := 3) `MyLet` 2 + n

Forego all do-notation

autobind infixr 2 >>=
(>>=) : m a -> (a -> m b) -> m b

program : Maybe (String, String)
program = 
  (p1 := parseName) >>=
  (_ := skipSpace) >>=
  (p2 := parseName) >>=
  Just (p1, p2)

Potential extension

In Alternative considered we mention the use of an %typebind and %autobind directive for functions definition. While this does not seem to fit in the design space of operators, it can be used to declare function definitions as binding operators, allowing it to cover even more use-cases of syntax blocks.

Indeed one could write the constructor for sigma types as

%typebind
Σ : (x : Type) -> (x -> Type) -> Type
Σ = DPair

number : Σ (n : Nat) Fin n
number = MkDPair 3 FZ

Or the clean up the syntax for the for-loop DSL:

%autobind
for : Traversable t => Applicative f => t a -> (a -> f b) -> f (t b)
for x f = traverse f x

-- assuming we have `Applicative (\x => x)``
doubleAll : List Nat -> List Nat
doubleAll xs = for (x : Nat := xs)
    x + 2

We could also completely replace let-syntax:

%autobind
let : (arg : ty) -> (ty -> result) -> result
let x f = f x

double : Nat -> Nat
double v = let (x : Nat := v) x * 2

Because this syntax conflicts with function application and requires more elaborate changes to the parser, it is not part of the proposal.

Allow dependent-pair-like syntax

A previous version of this proposal allowed for (x : a ** b) for a typebind operator. However due to the syntactic inconsistency that would bring when paired with other binding operators, it has been demoted to a future extension.

The problem can be seen with this expression (x : a ** b) =@ f x
Should this be parsed as

  • ((x : a) ** b) =@ c valid if ** is typebind and =@ regular
  • or as (x : (a ** b)) =@ c valid if ** is regular and =@ is typebind?

Both are valid given two different binding structures. I think the feature still has value in this form, but we should be able to detect and return errors when ambiguous syntax is used and ask the user to disambiguate. In that case that would mean write either ((x : a) ** b) =@ c or (x : (a ** b)) =@ c

@ohad
Copy link
Collaborator

ohad commented Oct 23, 2023

You might want to draw similarities with, and learn lessons from, the OCaml extension for user-defined binding operations.

@andrevidela
Copy link
Collaborator Author

Anything in particular you think should be referenced?

@buzden
Copy link
Contributor

buzden commented Oct 30, 2023

Because of their nature, binding operators can only be infixr and must be used in conjunction with a function that binds a name in its second argument.

If so, why bother with typebind infixr 0 ** and not to use typebind 0 ** or typebindr 0 ** instead?


Apart from that, I don't understand much need of typebind variant. Doesn't autobind cover all that actually needed (surely, once the syntax of typebind is supported for autobind)?

What I mean is the following. Consider, I declared your original (**) example as autobind:

autobind infixr 7 **

record (**) (x : Type) (y : x -> Type) where
  constructor MkDPair
  fst : x
  snd : y fst

As far as I understand, expression (n := Nat) ** Vect n a would be correctly desuraged to DPair Nat (\n => Vect n a). Compiler's stages after the desuraging will ensure that the RHS is actually of type x -> Type having that LHS is x. So, it seems that we do not need special case for that as the desugaring stage.

So, why not to merge typebind and autobind to just bindr type of operators and to support both (n : Nat) ** ... and (n := Nat) ** ... syntax for them, as they desugar practically the same?

UPD: Or, even, desugar (n : expr) ** ... as (**) expr $ \n : expr => ... and desugar (n := expr) ** ... as (**) expr $ \n : ? => ... as you currently do. I.e. to not to distinguish between those desuragings as the operator declaration cite, only as the use cite.

@andrevidela
Copy link
Collaborator Author

andrevidela commented Oct 30, 2023

Thank you for your feedback.

why bother with typebind infixr 0 ** and not to use typebind 0 ** or typebindr 0 ** instead?

  • Users already know how to declare fixities and precedence, typebind infixr 0 ** communicates that ** is an infixr operator with precedence 0 even if they don't know what typebind means.
  • typebind infixr reads as a composable grammar, where typebind is an adjective for infixr. Composability is a desirable feature since it allows you to write multiple statements with the same building blocks. rather than come up with a bespoke keyword for every single composition of keywords
  • This is a purposely conservative proposal for a new syntactic feature, the goal is to be as undisruptive as possible. In its current state, if you wanted the minimum possible definition, you could just write typebinder ** and that would be enough for the compiler to know what to do. But as a syntactic feature, we should be mindful in the ways it might or might not work. In particular, it might be plausible to write public export autobind typebind infixl 7 << in the future in a way that makes sense, but there is no point in rushing out this feature if we don't know how people are going to use it and how we can mitigate its misuse.

why not to merge typebind and autobind to just bindr type of operators and to support both (n : Nat) ** ... and (n := Nat) ** ... syntax for them, as they desugar practically the same?

The answer is in the text in two places:

Technical details should answer this from a technical perspective, from a more philosophical one, the goal of this feature is to rely on existing user-intuition about binders and types to provide more expressive power, without creating confusion or unexpected behavior. In essence, the goal is to make easy things easy, and hard things possible. Most users and library designers will get along just fine with typebind but for more sophisticated uses autobind allows greater expressivity, while keeping a familiar look and feel.

One of the benefits of this approach is to provide helpful error messages when the syntax is slightly incorrect or when the expected modules are not imported.

There is no user-facing benefit in providing two syntaxes for the same thing. From a teaching and learning perspective, this is even a detriment. Why explain a student that they can write (n := Nat) ** Vect n String when they've already taught (n : Nat) ** Vect n String? Why would you willingly remove the syntactic rule that anything after a : is a type? Those kind of conventions are of great importance for people learning and porting knowledge from other languages. Because of this, I strongly support the distinction between type-binding and automatically-binding operators. They should not be used interchangably because they do different things, achieve different goals and look different.

I invite you to look at the error message the distinctions provide, they showcase the ability to identify syntax misuse, and provide helpful diagnostic to obtain a resolution. All while teaching the user the necessary concepts to customise the behavior of the operator :

Error: Operator =@ is a type-binding (typebind) operator, but is used as an automatically-binding (autobind) operator.

Errors:8:19--8:21
 4 | 0 (=@) : (a : Type) -> (a -> Type) -> Type
 5 | (=@) a f = (1 x : a) -> f x
 6 | 
 7 | data S : {ty : Type} -> (x : ty) -> Type where
 8 |   MkS : (x := ty) =@ S x
                       ^^
Explanation: regular, typebind and autobind operators all use a slightly different syntax, typebind looks like this: '(name : type) =@ expr', autobind looks like this: '(name := expr) =@ expr'.

Possible solutions:
 - Write the expression using typebind syntax: '(x : ty) =@ S x'.
 - Change the fixity defined at Errors:2:10--2:21 to 'export autobind infixr 0 =@'.
 - Hide or remove the fixity at Errors:2:10--2:21 and import a module that exports a compatible fixity.

(source: https://github.com/andrevidela/Idris2/blob/autobind/tests/idris2/operators/operators002/expected#L2-L16)

What's more this proposal leaves enough room to write autobind typebind infixr 1 ** (a perfect example of composable grammar) in the future which would support overloading the syntax if the need would ever arise.

@alahijani
Copy link

alahijani commented Jun 12, 2024

I think the distinction between typebind and autobind makes sense at a purely syntactic level without involving type checking.

  • when And is a typebind operator, (x : a) `And` p x should desugar to And a (\x : ? => p x)
  • when Let is a autobind operator, (x := a) `Let` p x should desugar to Let a (\x : ? => p x)

The distinction is only about the allowed token : vs. :=. Neither kind should check the type of the bound variable.
For a typebind operator binding a variable that's not a Type consider this example,

data Formula : Type
Proof : Formula -> Type

data Formula : Type where
  True : Formula
  And : (a : Formula) -> (Proof a -> Formula) -> Formula

The distinction between typebind and autobind can be made use-site, but that's suboptimal because you really don't want your users to type (x := a) `And` p x when they should be typing (x : a) `And` p x.

@andrevidela
Copy link
Collaborator Author

Thank you for your comment, there is a couple things that I would like to address:

About overloading syntax

  • when And is a typebind operator, (x : a) And p x should desugar to And a (\x : ? => p x)
  • when Let is a autobind operator, (x := a) Let p x should desugar to Let a (\x : ? => p x)

If there is no difference between those two statements, why even have two statements in the first place? I appreciate that you aim to simplify things, but in that effort, you seem to have lost the original motivation for this feature. To reiterate the design goal mentionned above:

In essence, the goal is to make easy things easy, and hard things possible

Introducing two overlapping syntaxes does not make things easier to learn, or to use. Nor does it address the hard cases where elaboration is insufficient to infer a type.

About what users should type

You say

The distinction between typebind and autobind can be made use-site, but that's suboptimal because you really don't want your users to type (x := a) And p x when they should be typing (x : a) And p x.

But I do not see any motivation for this argument. Why is the distinction suboptimal and the overload in syntax optimal? What is the problem this is attempting to solve? Without precisely setting the scope of your idea, it is imposible to make any assess its validity. A very big question your suggestion rises is "Why are we breaking the cultural norm that things on the right side of a : is a Type? To what end?". Unless you address this, I do not see how this is even an argument.

To conclude

I am not sure I see the benefit in the suggestion you put forward. It also strikes me that it does not take into account the above suggestion to combine the two keywords to enable the overloaded syntax. Nor does it provide compelling use-cases for existing or future code. Additionally, either of those points woul probably be part of a separate orthogonal proposal, rather than this one, since it seems to tackle a very different design space. I'm happy to revisit this once more details are presented.

@alahijani
Copy link

Nor does it provide compelling use-cases for existing or future code.

I think embedding a type theory in Idris is a valid use case which requires (x : a) to not validate a as a Type. I mentioned this as an example above, where a has type Formula and x has type Proof a.

It also strikes me that it does not take into account the above suggestion to combine the two keywords to enable the overloaded syntax.

The two can still be combined, but my point was the embedded type theory use case I had in mind, so I did not mention that. For example here is a strawman proposal for combining the two modes:

bindr 0 (:) `And`
bindr 0 (:=) `Let`

This also keeps open the possibility of supporting other binder modes such as (auto :), (0 :) or even {:} in the future.

@andrevidela
Copy link
Collaborator Author

Hi there,

It would be nice if your answer was a little be more comprehensive. For scoping reasons, I will ignore the alternative proposal and invite you to write you own via the issue tracker.

With that said, you write

I think embedding a type theory in Idris is a valid use case which requires (x : a) to not validate a as a Type. I mentioned this as an example above, where a has type Formula and x has type Proof a.

I think so too! This is why it is also part of the motivating examples of the proposal. The code samples using VPi are taken directly from one of my implementations where I had a bug because the existing binding via lambdas was too confusing. Writing systems with binders is one of the biggest advantages of dependently-typed programming, and it makes sense that we provide affordances toward developing this kind of software.

However, your answer fails to address any of the weaknesses presented above: Why is it worse to write := and why is it better to write :? And more fundamentally, why is it better to break the convention that : is followed by a Type rather than preserve this knowledge? I understand where you are coming from, but you need to state your assumptions clearly and plainly for everyone to read. Some people here are not native english speakers, some people are not familiar with binders or writing EDSLs, I would really appreciate if you spent the time to explain what it is that you see in the ability to write (x : a) `And` p x. Or explain why you think it is not a big deal that we drop the convention that : means Type. I think the proposal above states clearly that writing embedded DSL was part of the use-cases to address, and so I would like to see why you seem to think that the proposed solution fails to address the use-case.

You've also not said anything about duplicated syntax, ease of use, or error messages. Is it because you do not care about those things? You did not have time to write them out? You agree with me that they should be part of the design process? Again, an explicit and comprehentive answer would be great. Text communication is hard enough without having to read in between the lines, let's make it as easy as possible for everyone.

@alahijani
Copy link

alahijani commented Jun 13, 2024

You've also not said anything about duplicated syntax, ease of use, or error messages. Is it because you do not care about those things? You did not have time to write them out? You agree with me that they should be part of the design process?

Not really, rather I did not discuss these because these concerns are (mostly) orthogonal to my main concern.

My main concern is explicitly:

  • Being able to bind (x : a) for a type-like object a that is not a Type. Here (x := a) would not work for And because it suggests value binding to a reader and not a type annotation. Being suggestive is the whole point of using operators.

My side concern is:

  • Keeping typing matters (and typing related error messages) separate from parsing. This is to keep the compiler easier to maintain, but also to keep the errors less confusing. Consider adding universe level checking for example. Which level of Type should a be checked against? It would be weird to have to worry about universe polymorphism in the middle of the parser. Or to have a universe level inconsistency disguise itself in a parsing error message.

@alahijani
Copy link

There is no user-facing benefit in providing two syntaxes for the same thing. From a teaching and learning perspective, this is even a detriment. Why explain a student that they can write (n := Nat) ** Vect n String when they've already taught (n : Nat) ** Vect n String?

I agree with this. If the declaration-site distinction between the two modes were to be dropped the same operator ** would need to support both tokens : and :=. I said I think that is suboptimal, for the reasons you just mentioned. So I think we are in agreement on this point.

I totally support the declaration-site distinction between typebind and autobind. I also agree that the term after the : should be morally type-like. I just don't think that being "morally type-like" should be conflated with being an Idris Type.

@andrevidela
Copy link
Collaborator Author

Not really, rather I did not discuss these because these concerns are (mostly) orthogonal to my main concern.

If your suggestion does not take into account those aspects, I do not think it can be taken seriously. We're building a language to help people write better software, and making the syntax clear, consistent, easy to use, and provide great feedback, is an integral part of the design process. If you think it isn't, then you are not in the right place.

About your main concern

I think you've said it yourself

Here (x := a) would not work for And because it suggests value binding to a reader and not a type annotation

a is not a Type, even though it represents a "type", therefore the suggestion that it is a value instead is grounded in reality. I appreciate the effort you've put into this statement and I am happy we reached this level of understanding.

edit: I just saw your second reply.

I also agree that the term after the : should be morally type-like. I just don't think that being "morally type-like" should be conflated with being an Idris Type.

I am very thankful for this statement, it makes crystal clear what you are looking for.

About your side concern

This is to keep the compiler easier to maintain

This is a single desugaring step. There is no interaction with the elaborator, or the typechecker.

to keep the errors less confusing

The error messages designed are displayed above, maybe you could point toward improvements to make. They are also a large part of the diff of the PR, more than the feature itself. Unless you come up with better error messages, saying "keep the errors less confusing" does not in fact keep the errors less confusing.

I will ignore the coment about type universes because this thread is not about type universes.

@alahijani
Copy link

alahijani commented Jun 13, 2024

  • when And is a typebind operator, (x : a) And p x should desugar to And a (\x : ? => p x)
  • when Let is a autobind operator, (x := a) Let p x should desugar to Let a (\x : ? => p x)

If there is no difference between those two statements, why even have two statements in the first place?

There is no difference other than the token used, : vs :=. This doesn't mean we only need one and not the other. To someone reading the code using these operators, : and := mean very different things. That's enough to want both.

We can spend time bikeshedding if the declaration-site distinction should be spelled with the exact keywords typebind vs. autobind, but that's just the shallow syntax.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants