
Function Syntax #6

Open
keean opened this issue Sep 20, 2016 · 108 comments

@keean
Owner

keean commented Sep 20, 2016

One of the first things we need to do is decide on a function syntax, for application and definition.


keean commented Sep 20, 2016

I don't think we want something too weird and new, so we should follow the conventions of existing language families. Do we want to follow the functional style which makes variable and function assignment similar (and maybe support partial application):

x = 1
f x = x
g x y = (x, y)

or do we want something from the 'C' family:

int x = 1
int f(int x) = x
(A, B) f<A,B>(A x, B y) = (x, y)

or something like Rust (without the types for inference):

let x = 1
fn f(x) { x }
fn g(x, y) { (x, y) }

Or perhaps python like:

x = 1
def f(x):
    ...
def g(x, y):
    ...


shelby3 commented Sep 20, 2016

@keean wrote:

Do we want to follow the functional style which makes variable and function assignment similar (and maybe support partial application):

x = 1
f x = x
g x y = (x, y)

I have argued here, here, here, and here to not to adopt that syntax.

Also that syntax is more amenable to global type inference, because otherwise it requires a duplicate line to declare the type of the function.

An insurmountable negative factor is that the vast majority of the world's programmers will be unfamiliar with it (unnatural, not second nature, fighting their ingrained habits), because it is too different from the most popular languages: JavaScript, Java, C, and C++.

I argued it raises the cognitive load too much both at declaration and at the call site, because it requires computing too many factors (inference, precedence, declaration site types and arity).

From the author of the acclaimed Learn You a Haskell:

I failed to learn Haskell approximately 2 times before finally grasping it because it all just seemed too weird to me and I didn't get it.

@keean wrote:

However I do think you have a point about people not used to the style.

You seem to have agreed?

Btw, partial application can be done with the other syntax styles. I will explain later.
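As a sketch of how that can work (my own TypeScript illustration, not the proposed language), explicit currying provides partial application even with C-family call syntax:

```typescript
// Explicit currying under C-family call syntax (illustrative sketch):
// g(1) partially applies g, returning a function awaiting the second argument.
const g = (x: number) => (y: number): [number, number] => [x, y];

const g1 = g(1);     // partially applied
const pair = g1(2);  // [1, 2]
```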


keean commented Sep 20, 2016

@shelby3 I agree I just want to capture everything relevant here.

I also remember we wanted something like Python-style indenting rather than curly brackets.

Do we want to have a keyword like 'def' or 'fn' for function definitions?


keean commented Sep 20, 2016

I actually quite like Rust's syntax for definitions, rather than Python or TypeScript. One problem is with combining TypeScript style signatures with Python style indenting:

inc(x : Int) : Int :
    ...

The problem here is we might want to leave out types:

dec(x) :
   ...

It makes it impossible to tell whether the return type is missing. Whereas with Rust style:

inc(x : Int) -> Int :
    ...


shelby3 commented Sep 20, 2016

@keean wrote:

The problem here is we might want to leave out types:

I was just about to post a similar point. We can't use the C/Java style where the type precedes the instance name:

int x = 1
int f(int x) = x

Also I wouldn't want some verbose, gaudy type as the prefix anyway.

I wrote that 2 days ago:

If non-exported function signatures can be inferred, then a syntax with optional type annotations would seem better.

Agreed. Thus we need the suffixed : type and not the C/Java prefixed type declaration.

Haskell instead puts the optional types on a separate line, which I think is unfamiliar, consumes extra lines, and violates DRY, none of which I like (and afaik all the types then have to be provided, with no option of annotating just some of them).


shelby3 commented Sep 20, 2016

@keean wrote:

Do we want to have a keyword like def or fn for function definitions?

Since yesterday, I've been trying to think of a way to avoid prefixing the function declaration (aka definition) with fn or def (because I hate noise, unless it aids readability). I was going to work on the LL(k) parser grammar so we can prove it is a context-free grammar (CFG), which is very important. I also want consistency with declaring methods in the body of typeclass definitions, so if we need to prefix local functions with fn or def, then we should do the same for methods, but I hate that boilerplate.

I've been thinking about all these issues too. 😉

Inside the body of typeclass declarations we certainly don't need the prefix on methods. But for functions declared inside the body of another function, we would have a grammar conflict with function calls. One way to get around that is to require a space between the function name and ( for declarations (e.g. f (x...)) and to disallow that whitespace for function calls/application (e.g. f(x...)). I may have other ideas. Let me think about it for a while.

The grammar is really a holistic design and we can't design it piecemeal and be sure to not have CFG conflicts.

Feedback?


shelby3 commented Sep 21, 2016

@keean wrote:

I also remember we wanted something like Python style indenting rather than curley brackets.


inc(x : Int) : Int :
   ...

The problem here is we might want to leave out types:

dec(x) :
  ...

It makes it impossible to tell whether the return type is missing

I don't like the alternative syntax with an arrow inc(x : Int) -> Int :. It is inconsistent! Consistency is very important. And that trailing : is very confusing no matter what we do.

In the LL(k) grammar I was working on in May, I devised a solution which eliminates the need for the trailing : to indicate the start of an indented code block.

I presumed we would enforce a standard indent, as agreed by @SimonMeskens.

The rule is that if the following line is indented by the standard amount, then it is a new code block. If it is indented more than the standard amount, then it is the continuation of the prior line.
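A minimal sketch of that rule (my own illustration; the standard width of 2 spaces is an assumption for the example):

```typescript
// Classify a line relative to the previous line's indent, per the rule:
// exactly one standard indent deeper starts a new block; deeper still is
// a continuation of the prior line.
function classifyIndent(prev: number, cur: number, standard = 2): string {
  if (cur === prev + standard) return "new-block";
  if (cur > prev + standard) return "continuation";
  return "same-or-dedent";
}
```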

I realize that enforcing indent width is something programmers will disagree on, because everyone has a preferred amount (and some want tabs while others want spaces), but this is open source, and consistency (for readability) is more important than the freedom to make code unreadable for others (such as when the tabs don't line up!).

Btw, I am hoping we decide on a 2-space indent to conserve horizontal space for deeply indented blocks? I notice from your examples that you like more spaces when you indent (4 or 5?), but the really skilled programmers I've seen use 2 spaces, because 2 is enough and it also helps align with the tail of the if on the line above:

if x
  something
else
  something else

Three spaces uses 50% more horizontal space, and it aligns with the x, which is confusing or consistent depending on the way one thinks about it:

if x
   something
else
   something else

Any block indenting greater than 3 spaces is going to piss off some programmers. Any programmers who think they need 4 spaces for block indenting are amateurish. Our eyes can't easily detect it at 3 spaces, and I find 2 is enough (and I am blind in one eye and blurred in the other eye).

What is your opinion? I could tolerate 3 but I don't think it is ideal. But 4 makes me not happy:

if x
    something
else
    something else

And 5 is ridiculous:

if x
     something
else
     something else

Most of my coding life I used 3 spaces, but when I saw that expert coders were using 2 spaces, I realized I didn't need 3.


SimonMeskens commented Sep 21, 2016

What about the wildly popular arrow syntax? It's familiar, due to being included in JavaScript, TypeScript, Java and C#, and it's quite concise (unlike using a keyword like def or fn). I'm using indents, but this would work with curlies or anything else too (I personally prefer required indentation).

In CoffeeScript, return is optional, but I really don't like that. What I do like is optional return for one-liners. I kinda like the conciseness of not having to use brackets, but I agree that it makes the language rather dense to read and hard to get into.

I agree with the earlier sentiment that semicolons are evil. You pay a penalty for all your code for that one time where you need them, usually to make your code less readable.

I picked a thin arrow (->) instead of a fat arrow (=>), which is used more often, because I think the fat arrow is visually quite busy.

Block lamda (daft sample to pad it out a bit):

inc(x: Int): Int ->
  let y = x + 1
  return y

Notice that absence of type on y means the compiler implicitly assigns type, not that y is untyped.

Now, the problem arises for me when you want to type curried functions or just simple lambdas. What do we use to indicate that a type goes from one to the other? TypeScript, for example, uses the fat arrow for both; this is syntactically very confusing. I think for this reason we could also replace the arrow with a double colon. I personally think the double colon is the nicest separator, but it's slightly less popular.

I'll introduce something else I like too: using := for assignment, like in Pascal. The reason being that this gels nicely with :: (both are a type of assignment; one assigns code, one assigns a value) and because it allows you to use = for comparison, as == for comparisons is visually busy and comparisons already tend to be the busiest syntax. I might even go as far as to suggest and and or for the logical operators to further fix that.

filter(list: List of Item, pred: Item -> bool): List of Item ::
  let out := List(Item)
  for(item of list)
    if(pred(item))
      out.add(item)
  return out

filter(myList, x => x = 5)

Notice that we might as well just drop the ::, as with forced indent, it's clear that the block indicates a function body.

inc(x: Int): Int
  return x + 1

I kept the code quite imperative for now, to not assume too much yet. If we do get rid of brackets, I might prefer something like this I think? Still keeping brackets for calls maybe? Starting to get alien. Still less confusing than Haskell though.

filter
  list: List of Item
  pred: Item -> bool
  return List of Item
=>
  let out := List(Item)
  for item of list
    if pred(item) and not item.empty
      out.add(item)
  return out

filter(myList, x => x = 5)
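For comparison, the filter pseudocode above maps fairly directly onto present-day TypeScript (a sketch for comparison only, not the proposed syntax):

```typescript
// The filter example rendered in present-day TypeScript.
function filter<Item>(list: Item[], pred: (item: Item) => boolean): Item[] {
  const out: Item[] = [];
  for (const item of list) {
    if (pred(item)) {
      out.push(item);
    }
  }
  return out;
}

const myList = [1, 5, 3, 5];
const fives = filter(myList, (x) => x === 5); // [5, 5]
```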


shelby3 commented Sep 21, 2016

@SimonMeskens wrote:

What about the wildly popular arrow syntax? It's familiar, due to being included in JavaScript, TypeScript, Java and C# and it's quite concise (unlike using a keyword like def or fn).

inc(x: Int): Int ->
  let y = x + 1
  return y

That solves the : at end to start a new block, but I offered also an enforced indenting style as a solution instead, which removes that noisy arrow.

Your suggestion won't work for eliminating the def or fn, as I explained where I instead offered inc ( with a space between inc and (. We need efficient LL(k) left-to-right parsing with a small constant maximum number k of lookahead tokens, but yours requires an unknown, unbounded number of lookahead tokens for the parser to differentiate a definition from a function call, since the parser has to reach a : to distinguish them, and : is optional on each argument for non-top-level functions.
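To make the lookahead point concrete, here is a tiny sketch (my own illustration, with made-up token kinds) of how the whitespace rule lets the parser decide after a fixed k = 2 tokens:

```typescript
// With the proposed whitespace rule, the parser decides "definition vs
// call" from the first two tokens, instead of scanning ahead for an
// optional `:`. Token kinds here are illustrative only.
type Kind = "ident" | "lparen" | "ws-lparen" | "other";

function classify(first: Kind, second: Kind): "call" | "definition" | "other" {
  if (first !== "ident") return "other";
  if (second === "lparen") return "call";          // f(x...)
  if (second === "ws-lparen") return "definition"; // f (x...)
  return "other";
}
```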

I potentially like your idea of consistency between the syntax of anonymous lambda functions and regular function implementations (though maybe they shouldn't be equated, since one is unnamed and not a singleton while the other is a named singleton?), but you've rendered the new-block syntax inconsistent with if as follows (and I'm not sure which consistency should be our priority):

if x
  something

Also this would be inconsistent with -> to start new block:

if x:
  something

@SimonMeskens

That seems fair enough, was just offering some ideas. Personally, I really like homoiconic languages, but I'm not sure how feasible that is for this one. Here's a good article on it:

https://blogs.oracle.com/blue/entry/homoiconic_languages


shelby3 commented Sep 21, 2016

I am very sleepy so I hope the following is coherent.

On further thought, I must overrule my prior suggestion for an enforced number of spaces for indenting. As much as I would like to have this consistency, I don't think it will be popular because any choice will offend most people who don't like that choice. Number of spaces for indenting is a personal preference.

I looked at the grammar I wrote in May and now I remember how I solved this problem and maximized consistency.

First we must consider that these (and if-else) will also optionally be used without a code block, and that if-else will replace the ternary operator, both to simplify the language (more consistency) and because if x: or if (x) at the start has a lesser chance of precedence errors than "string" + x ?. Also, I force a code block for nested if (as shown below).

So I suggest that if (x) is superior to if x:, because when a new code block doesn't follow the : (i.e. what follows is on the same line), the : could be confused with a type annotation (or even a cast, if we ever used that : Type format for casts). Also the : is more invisible when crowded.

if (x)
  true
else
  false

if x:
  true
else
  false

if (x) true else false
if x: true else false

if (x)
  if (y)       // may not put this at end of prior line, because it is a nested `if`
    true
  else
    false
else
  if (y)       // may not put this at end of prior line, because it is a nested `if`
    false
  else
    true

So if we choose if (x), then consistency with function definitions is lost in terms of the character that separates off the code block (or code on the same line). Thus I am comfortable with @SimonMeskens' idea, except I favor the fat arrow (=>) because JavaScript and Scala use it, and it looks more like the assignment symbol (=), so we can consistently use the same symbol for a code block, one-line function code (see below), and lambda (aka arrow) functions. Also, afaik -> has typically been used for type signatures and not for anonymous (unnamed) lambda functions, although Java 8 uses ->. The following definition now looks like assigning a lambda function to the function name, which is actually how JavaScript does it for { inc: (x) => x + 1 }.

inc = (x: Int): Int =>
  let y = x + 1
  return y
inc = (x: Int): Int => return x + 1

Notice that above, instead of def, fn, or my hokey space idea, I used an = to differentiate from a function call inc(x...), needing only k = 2 tokens of lookahead (note that inc, no matter how many characters, is one token returned by the lexer to the parser). It is not shorter than def or fn, but it is more consistent with the usage of lambda functions, because we assign the unnamed lambda function to the function name.
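This mirrors how TypeScript already binds a lambda to a name with ordinary assignment (a sketch in present-day TypeScript for comparison, not the proposed syntax):

```typescript
// A named function written as ordinary assignment of an arrow (lambda),
// much like the unified syntax proposed above.
const inc = (x: number): number => {
  const y = x + 1;
  return y;
};
const inc2 = (x: number): number => x + 1; // one-line form
```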

Some people suggest reversing the direction of the arrow?

@SimonMeskens

That seems pretty clear to me 👍


keean commented Sep 21, 2016

Some interesting ideas, I quite like the uniform assignment syntax. Haskell and other functional languages define functions as:

x = 1
f x = 1

It just means you need an LR parser. So I don't see a problem with:

x = 1
f(x) = x

I would suggest using '=' consistently. However I also like the anonymous function syntax and using normal assignment:

f = (x) => 
    x

That looks odd for something like the identity function.

I would suggest the block begins if there is nothing on the same line after the '=>' .

I think block indent depth should be set by the indent of the first line. Line continuations would be any kind indented more than this.

However there would be problems where there is a block and a line continuation like:

set_callback((x) => 
        print(x)
        x + 1
    ) // Must be at least as indented as the original, but less than the block indent.

What about optionally naming the anonymous function to help debugging? JS allows this:

set_callback(move_right(x) => x + 1)

I think these forms will require an LR parser, but I am okay with that.


keean commented Sep 21, 2016

Should we allow no brackets for a single argument function?

f = x => x + 1
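For reference, JavaScript/TypeScript already permit this for a single parameter (a sketch of current behavior, not a recommendation either way):

```typescript
// Parentheses are optional for a single parameter in JS/TS, provided no
// inline type annotation is needed on that parameter.
const f = (x: number): number => x + 1;       // with parentheses
const g: (x: number) => number = x => x + 1;  // parentheses dropped
```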


shelby3 commented Sep 21, 2016

@shelby3 wrote:

On further thought, I must overrule my prior suggestion for an enforced number of spaces for indenting. As much as I would like to have this consistency, I don't think it will be popular because any choice will offend most people who don't like that choice. Number of spaces for indenting is a personal preference.

But could we at least agree to disallow tabs for indentation (i.e. only spaces allowed)? My attitude is that if it pisses off a few people, then I am unsympathetic to their preference, because if I as the reader don't know what tab-width settings the programmer was using, then the blocks of code don't align on columns when I view it. This is an era of open source, and code must be transferable without ambiguity.

Most programming text editors these days have an option to automatically convert all [Tab] keypresses to spaces.

Could I get a vote on the above suggestion? (the thumbs or thumbs down on this comment)


shelby3 commented Sep 21, 2016

Another requested vote is should we disallow lines which contain only spaces? This would remind the programmer (and his IDE) to delete such lines from the code. Such lines make for clutter in version control differencing when they are later removed (cleaned up). Not putting them there in the first place would be better.


shelby3 commented Sep 21, 2016

@keean wrote:

Should we allow no brackets for a single argument function?

f = x => x + 1

No. I am going to argue now, at this early stage, that we need some guiding principles. And I think one of those principles should be not to unnecessarily create multiple ways to do the same thing when we can easily avoid it (which is one of the complaints against Scala). Especially not creating another syntax variant just to save two characters. This is inconsistency for the reader. And remember, typically there can be 10 to 100 times more readers of a source code base than there were coders of it. We want to minimize the number of variant cases the new programmer has to learn.


keean commented Sep 21, 2016

Okay, that makes sense to me, so it has to be:

f = (x) => x + 1

Presumably we are happy with all sorts of variations on type information like:

f = (x : Int) => x + 1
f = (x) : Int => x + 1
f = (x : Int) : Int => x + 1


shelby3 commented Sep 21, 2016

I am not yet 100% firm on my prior suggestion to unify function definition around the =. I have another comment coming about that, which also responds to your prior comment where you noticed some of the same issues I did.

@keean wrote:

Presumably we are happy with all sorts of variations on type information like:

The multiple ways to optionally annotate type declarations on arguments and result value, seems acceptable to me, because it is a consistent rule that also will apply to reference declarations as well.


shelby3 commented Sep 21, 2016

On this tangent of variant syntax to define an anonymous (unnamed) function, I have liked a shorthand from Scala which enabled very concise expressions:

@keean wrote:

set_callback((x) => x + 1)

set_callback(_ + 1)

That is saving us much more than two characters and it can really remove a lot of verbosity if we further enable it to handle more than one argument.

I disliked that Scala used multiple _ to represent multiple arguments (or did Scala only allow one input argument for this shorthand? I forget), as these are undifferentiated:

set_callback2((x, y) => x + y)

set_callback2(_ + _)

I suggest instead with a limit of 3 arguments:

set_callback(_ + 1)
set_callback2(_ + __)

Note these should not cross function call lexical parenthetical scopes, i.e. the following are not equivalent:

setCallback((x) => f(() => x))
setCallback(f(_))

Normally I would argue against creating a new variant to do the same thing. But the above is so geek-cool that I think it would be a beloved 😍 feature. My marketing senses/intuition prioritize love/passion ❤️ over rigid design guidelines. Also I think I can make the case that the significant increase in brevity makes it more than just another way to do the same thing. We can't get the same brevity with the other syntax. Thus I argue it is necessary.
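A sketch of the intended desugaring (set_callback and set_callback2 are hypothetical stand-ins of my own; the slot syntax itself isn't expressible in TypeScript):

```typescript
// Hypothetical stand-ins to illustrate the desugaring:
//   set_callback(_ + 1)    would mean  set_callback((a) => a + 1)
//   set_callback2(_ + __)  would mean  set_callback2((a, b) => a + b)
function set_callback(cb: (x: number) => number): number {
  return cb(1); // invoke with a sample argument
}
function set_callback2(cb: (x: number, y: number) => number): number {
  return cb(1, 2); // invoke with sample arguments
}

const r1 = set_callback((a) => a + 1);     // 2
const r2 = set_callback2((a, b) => a + b); // 3
```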

Am I mistaken on the love for this alternative syntax?

P.S. afair, Scala even allows the following, which I argue against, as it violates the guideline for not unnecessarily creating variant syntax:

set_callback(_:Int + 1)
set_callback2(_:Int + __)


keean commented Sep 21, 2016

I quite like the _ syntax, but there are probably quite a lot of other uses for it too. If we use it, I would suggest we use it as a De Bruijn index, so that we have:

set_callback(_1 + _2)


shelby3 commented Sep 21, 2016

@keean wrote:

set_callback(_1 + _2)

I had that idea also, but it makes it more verbose for the single argument case. If we prefer that, I'd perhaps suggest instead:

set_callback(_ + _2)

But I didn't suggest that because it is inconsistent, violating one of the other guidelines I have for not proliferating special case rules in order to keep the cognitive load for the new programmer reasonable.

Also in Scala that conflicts with the tuple member names. We might want to not reserve numbered names when they can be useful in general for names.

Also there is another reason I favored this:

set_callback(_ + __)

And that is because it visually literally looks like a slot (i.e. "fill in the blank"); whereas, _2 doesn't.

I do understand that _ and __ are somewhat similar in appearance and maybe not readily differentiated by eye, but on balance I prefer "fill in the blanks" to _2, which doesn't connote anything visually. And beyond 3 arguments there is diminishing benefit; we really need named arguments then, because it gets too confusing to track the correspondence (I guarantee bugs 😏) even for ____ or _4 (which are verbose compared to single-letter argument names).

Another possible design choice, is only support it for one argument.

Anyone else want to share an opinion?

P.S. I know the choice is imperfect (I admit this _ + __ looks unbalanced and a bit weird) and I also consider not having the feature because it is not perfect, applying a guideline of not adding marginal syntax that can be done another way which isn't egregious. But I do love the brevity as being visually beautiful/elegant 💎 , which I think is another reason I prefer the connotation visually to slots. What do others think?


keean commented Sep 21, 2016

I find it hard to distinguish _ and __, and it gets even harder with ___ and ____. I think I prefer the Scala way where each slot is simply _ and you have to provide an argument for every slot. I think that is more readable and consistent. We can allow a 'let expression' if you need a slot more than once:

set_callback(let x = _ in x + x)

Although that makes me think: how do we differentiate between a single-line if and a multi-line one? In Python, a function definition uses : at the end of the line, so it would make sense for if. I think blocks (other than functions) might be a different topic, but maybe not. It depends on whether we want consistency between functions and other statement blocks.


shelby3 commented Sep 21, 2016

In the interest of brain dump disclosure...

I wish we could use Unicode symbols, then we could use (although it is a slippery slope towards "symbol soup" dingbats source code):

setCallback(⎵¹ + ⎵²)

Or:

setCallback(⑴ + ⑵)

Unfortunately Unicode doesn't define the character set for numbers above horizontal brackets, so the above are the two best available choices.

I realize we can't use Unicode for something a programmer needs to be able to type quickly, but the world is changing: there are now configurable customized hardware keyboards, and otherwise it is possible on Android, Windows, and Linux to software-remap the keys on any hardware keyboard. A programmer can remember to type the [Ctrl]+[1] keys simultaneously to generate a remapped ⎵¹ or ⑴. Since this is an optional shorthand feature that programmers don't have to use (and they can always resort to copy+paste if they haven't remapped their keyboard), I am not entirely joking. 😆

But I loathe special requirements on system configuration and/or specialized hardware. So this would need to be super compelling and it isn't.

It would also for the time being be a discrimination against the developing world, where they may not have access nor funds (or be forced to use a shared computer in a netcafe) to purchase the custom hardware keyboard.


Edit: Dec. 3, 2017 it’s possible to render with HTML+CSS the Unicode “bottom brackets” and “open box” with the superscripted number shifted in the desired position:

[image: numbered_slots]

So I’m thinking the programmer could type _1 and IDEs could optionally render it as above. And __1 for the arguments of inline functions within a function call that is nested within an inline function already employing the _1 shorthand.

Special typing assistance isn't required, i.e. the text can contain either _1 or ⎵¹. The editor (IDE) could optionally display them as the Unicode rendering above to make it more concise/pretty. That does cause an issue with inconsistent columnar alignment across different renderings, but I think that is an acceptable tradeoff. There's no Unicode character which draws the superscripted number over the bottom brackets.

Ditto for example the Unicode arrow I will propose to allow instead of ->; in this case the IDE could also optionally render -> as that arrow glyph, but many users might not choose to do so (as again it changes the displayed columnar alignment). I wish I could find a Unicode right arrow which is two characters wide.


shelby3 commented Sep 21, 2016

@keean wrote:

I find it hard to distinguish _ and __

I don't (and I'm blind in one eye), but __ and ___ is borderline for me, especially if I am rushed or my second nature is on auto-pilot.

and it gets even harder with ___ and ____

Agreed. That is why I wrote we couldn't go beyond 3 arguments, maybe not even more than 2.

I think I prefer the Scala way where each slot is simply _ and you have to provide an argument for every slot.

But that isn't orthogonal to ordering, which sometimes can't be re-ordered to fit. And doesn't allow using the slot more than once (which you acknowledged).

And I find that confusing semantically (it is so inconsistent with the way variable names work). It adds cognitive load. I remember that being a source of confusion for me when I was learning Scala and still might be a gotcha when I am not paying careful attention.

We can allow a 'let expression' if you need a slot more than once:

set_callback(let x = _ in x + x)

That violates the generalized guiding principle that I thought we agreed on. It adds specialized syntax to fix the shorthand when the programmer can otherwise revert back to the normal syntax for unnamed function. And afaics that proposal gains nothing significantly justifiable in brevity:

set_callback((x,y) = y in x + x)


keean commented Sep 21, 2016

@shelby3 wrote:

That violates the generalized guiding principle that I thought we agreed on. You are proposing to add specialized syntax to fix the short-hand when the programmer can otherwise revert back to the normal syntax for unnamed function. And afaics you are proposing to gain nothing in brevity:

I thought having an assignment that you can use in an expression would be in line with the 'everything is an expression' principle. It would be usable in any expression, not just in the special short function syntax.

But that isn't orthogonal to ordering, which sometimes can't be re-ordered to fit. And doesn't allow using the slot more than once (which you acknowledged).

If you want to use each variable more than once or in a different order, use the original longer syntax :-)

I am not a huge fan of the _ syntax, but I can live with it; however, the further we go down that route, the less I like it. I am tending to think we should stick to 'only one way to do things' and say you just do (x) => x + x; it's not really much longer, and I think readability beats conciseness in this case.


shelby3 commented Sep 21, 2016

@shelby3 wrote:

And beyond 3 arguments is a diminishing benefit and we really need named arguments, because it gets too confusing to track the correspondence (I guarantee bugs 😏) even for ____ or _4 (and verbose compared to single letter argument names).


and it gets even harder with ___ and ____

Agreed. That is why I wrote we couldn't go beyond 3 arguments, maybe not even more than 2.

The proposed shorthand isn't suitable for more than 2 or 3 arguments, regardless of syntax we choose. It isn't justifiable to have _4, and the programmer should revert to the normal unified syntax for unnamed function definition in that case of so many arguments, because the brevity will be roughly the same due to the fact that single letters a, b, c, d are 4 characters shorter than _1, _2, _3, _4 not counting using a slot more than once.

Thus I logically conclude (including the logic in my prior comments) that the shorthand should only support at most 2 or 3 arguments and the syntax should be _ + __.

Otherwise choose not to have a shorthand syntax.

@shelby3 wrote:

P.S. I know the choice is imperfect (I admit this _ + __ looks unbalanced and a bit weird) and I also consider not having the feature because it is not perfect, applying a guideline of not adding marginal syntax that can be done another way which isn't egregious. But I do love the brevity ...


shelby3 commented Sep 21, 2016

@keean wrote:

I thought having an assignment that you can use in an expression would be in line with the 'everything is an expression principal'. It would be usable in any expression, not just in the special short function syntax.

I failed to communicate my point. I am not making a decision here about whether an "assignment-as-expression" is desirable or not (since afaics that is mostly an orthogonal design issue, and since this wasn't a compelling use-case for it). Rather I am stating it doesn't make the use-case of the shorthand any less verbose.

But that isn't orthogonal to ordering, which sometimes can't be re-ordered to fit. And doesn't allow using the slot more than once (which you acknowledged).

If you want to use each variable more than once or in a different order, use the original longer syntax :-)

That logic is incongruent with my other point:

And I find that confusing semantically (it is so inconsistent with the way variable names work). It adds cognitive load. I remember that being a source of confusion for me when I was learning Scala and still might be a gotcha when I am not paying careful attention.

Point is that if we add the shorthand, then it should respect normal semantics of variable names. This is one of those insights into language design that Scala's designers apparently didn't do well in some (many or most?) cases.

I am not a huge fan of the _ syntax, but I can live with it, but I am tending to think we should stick to 'only one way to do things...

Didn't you write otherwise upthread? Or perhaps you meant all design uses for _ such as partial application?

I quite like the _ syntax, but there is probably quite a lot of other uses for it too.

Anyway, holistic design requires a lot of thought, so it is easy to end up flip-flopping.

I generally agree, of course, with the stated principle that "readability beats conciseness".

...and say you just do (x) => x + x its not really much longer and I think readability beats conciseness in this case.

It isn't just the issue of being concise in string length, but also the crowdedness (symbol soupy-ness) of the juxtaposed double (( followed in close proximity by ) =>:

setCallback((x) => x + x)
setCallback(_ + _)
setCallback((x,y) => x + y)
setCallback(_ + __)

I have one more alternative syntax idea for the shorthand, assuming that we will set a rule that all variable names are lowercase (or at least that all variable and function names begin with a lowercase letter or underscore):

setCallback((x) => x + x)
setCallback(A + A)
setCallback((x,y) => x + y)
setCallback(A + B)

That would conflict with the convention of uppercase letters for type parameters (such as inside of typeclass method bodies), but we could require type parameters to be a minimum of two characters (or just disallow A, B, C, ... as many arguments we want for type parameters).

But I still think I prefer the underscore as it looks like a slot. Otherwise drop the shorthand proposal.
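As a sketch of what the proposed slot shorthand would mean, here is a hypothetical desugaring into TypeScript (the `setCallback`/`setCb1` helpers are illustrative, not part of any proposal): each distinct slot (`_`, `__`) becomes a positional parameter, and a repeated slot reuses the same parameter, which keeps the normal semantics of variable names.

```typescript
// Illustrative higher-order functions that take a callback.
const setCallback = (cb: (x: number, y: number) => number): number => cb(2, 3);
const setCb1 = (cb: (x: number) => number): number => cb(4);

// setCallback(_ + __) would desugar to a two-parameter lambda:
const twoArgs = setCallback((x, y) => x + y);
// setCb1(_ + _) would desugar to a one-parameter lambda with the slot reused:
const oneArg = setCb1((x) => x + x);
```

Under this reading the shorthand is purely syntactic sugar, so it desugars before type checking and inference sees only ordinary lambdas.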

@skaller

skaller commented Oct 4, 2016

Wrong terminology @keean. A function of arity 3 has type

A -> B -> C -> D

where D is not a function type.

@keean

keean commented Oct 4, 2016

but by forcing every function to only map from tuple, you can make the syntax look familiar and still have the language be mathematically sound.

This is my current view as well. It has to be a literal tuple (syntactic restriction), but the type given to the function can be correct.

I represent recursive types internally as a graph structure, and convert to mu notation when printing, which is normally written like this: a -> a as a

@keean

keean commented Oct 4, 2016

Wrong terminology @keean. A function of arity 3 has type

A -> B -> C -> D

where D is not a function type

I disagree; arity is the number of arguments a function takes. Here an arity-one function returns an arity-one function, which returns an arity-one function that returns a value. Side effects could happen in any of those arrows. To ensure an arity-three function where side effects only happen after all three arguments are passed requires A * B * C -> D
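The timing difference can be sketched in TypeScript (my own toy example, not the proposed language): in a curried encoding an effect can fire between argument applications, whereas in a tupled (A * B -> C) encoding effects can only happen once all arguments are present.

```typescript
const log: string[] = [];

// Curried encoding: the effect runs as soon as the first argument arrives.
const curried = (a: number) => {
  log.push("effect after first arg");   // fires before b is supplied
  return (b: number) => a + b;
};
const step = curried(1);                // effect has already happened here

// Tupled encoding: the effect can only run once both arguments are present.
const tupled = (a: number, b: number) => {
  log.push("effect after both args");
  return a + b;
};
```

This is why forcing side effects after the final arrow requires the tupled type, as the comment above argues.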

@shelby3

shelby3 commented Oct 4, 2016

@SimonMeskens wrote:

I agree that a function with multiple arguments gets you into a mess

Could someone please clearly explain what that 'mess' is? I don't see it. As you pointed out, a non-curried function that inputs 3 arguments is equivalent to a curried function that only accepts one argument which is a triplet tuple.

@skaller wrote:

A curried function of arity 3 has type

A -> B -> C => D

ftfy.

A non-curried function of arity 3 has a type:

A * B * C => D

@keean wrote:

It has to be a literal tuple (syntactic restriction)

Right, it is a syntactic restriction, but we could support destructuring an instance x of a tuple into an equivalent-arity function application f(x); however it would be confusing and would remove two-factor checks on programmer error. Instead we could choose to allow the shorthand f((a, b, c) = x) (which is already available in the grammar I wrote) or f[x] (which is also already available in the grammar I wrote). As for f((1, 2, 3)), which @skaller doesn't like, we could allow f[1, 2, 3] instead.

OTOH, @skaller's preference has "absurd" flaws too, e.g. if I want to distribute a triplet tuple to an arity-3 curried function, then I must write some verbose tuple destructuring and application such as let (a, b, c) = x; f a b c.

So it doesn't matter which preference we choose, afaics they are both "absurd" according to @skaller's logic. If someone can explain some fundamental reason only one of the preferences is less messy, I would appreciate it.

Rather I prefer the local reasoning and mainstream familiarity of explicitness of arity with forced grouping and I have explained above we can eliminate common cases of extra parens.

@skaller math doesn't come in only one conceptual abstraction. There is more than one way to skin a cat. Type theory is not the same as category theory. We create type systems and syntactic conventions that optimize the programmer experience, not the arbitrary chosen mathematical abstraction. That is not to claim that there isn't more degrees-of-freedom in a category theory model with only functions as nodes on a graph and types as graph edges. @keean's typing model is types as nodes on the graph and edges between them for inference.

Edit: I would like to make the analogy to building houses with grains of sand or with windows, bricks, doors, etc.. The most elemental model @skaller prefers is also very amorphous.

@SimonMeskens

The big idea is that if every function is defined as taking one value and returning one value, there are a number of operations you can do with functions, without having to know arity. Every function in the language is exactly the same, so you can write in a point-free style easier, composition becomes easier and partial application becomes trivial.

A language which has higher arity functions or one that forces a single tuple as input looks the same to the end user, but the latter one just has the advantages I listed. In practice, this is exactly the same, it's just a way of reasoning about it. It's the same argument as @keean saying he wants to treat every function as an IO Monad. To the end user, this is invisible, but in the context of purity and assumptions you can make, it's a huge difference.

@shelby3

shelby3 commented Oct 4, 2016

@SimonMeskens wrote:

The big idea is that if every function is defined as taking one value and returning one value, there are a number of operations you can do with functions, without having to know arity. Every function in the language is exactly the same, so you can write in a point-free style easier, composition becomes easier and partial application becomes trivial.

But if we can curry functions (on demand), then we can have that too. I don't see why we need to commit to curried by default?

Edit: I am not implying you are proposing that. Just trying to understand if @skaller has any point other than just a preference of priorities of the default.

@SimonMeskens

I never said that we should curry by default? That would make the language harder to use for a lot of people. I'm saying we should have functions with multiple arguments, but treat those like they are a function from tuple to value, instead of higher arity. For the user, it looks exactly the same.

@shelby3

shelby3 commented Oct 5, 2016

@SimonMeskens I agree and just wondering if there is anything fundamentally broken in making that choice.

Note the syntax I am proposing for currying a 3 arity non-curried function is:

(x) => (y) => f(x, y, _)

Which is equivalent to:

(x) => (y) => (z) => f(x, y, z)

For partial currying (i.e. partial application of non-curried):

(x) => f(x, _, _)
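These shorthands can be sketched as a desugaring into plain TypeScript lambdas (`f` here is an illustrative arity-3 function, not from the thread):

```typescript
// An illustrative non-curried arity-3 function.
const f = (x: number, y: number, z: number): number => x + 10 * y + 100 * z;

// (x) => f(x, _, _) would desugar to a partial application:
const partial = (x: number) => (y: number, z: number) => f(x, y, z);

// (x) => (y) => (z) => f(x, y, z) is the fully curried form:
const curry3 = (x: number) => (y: number) => (z: number) => f(x, y, z);
```

Both forms compute the same result; the shorthand only controls how many arguments remain grouped in each application step.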

@shelby3

shelby3 commented Oct 5, 2016

@keean the question is what is the difference in type between:

(x) => (y) => (z) => f(x, y, z)

And:

(x) => (y) =>
   let y' = y + external_state
   (z) => f(x, y', z)

I was thinking => would signify possible side-effects, but I realize the purity annotation (which I currently have as * in the grammar and in front of the parens for the function definition) would be more accurate, but its syntax needs to change as follows.

So the type of the former is A -> B -> C -> D and that of the latter should be A -> B => C -> D, assuming the type of f(x, y, z) is pure A * B * C -> D; otherwise the latter should be A -> B => C => D.

So that would settle the preference over whether to use -> or =>, because -> is clearly more accurate for a pure function.

You had disagreed with me about certain side-effects being inapplicable, but I (and @skaller) argued I think convincingly that the only side-effects that matter are those that our type system guarantees.

If we pass around the IO monad as external state, then every print(x) will be an "impure" function w.r.t. every function that also accesses the IO monad, but if a new copy of the IO monad is returned instead of being mutated, then those functions are still pure functions, because they don't modify external state nor depend on any state not provided by their inputs. This is why Haskell is still pure even though it has side-effects on the IO monad: the IO monad state is passed along the function hierarchy, so the ordering is explicit. The effects are semantically imperative (meaning that changing something in one function could affect expected semantics in some function far away, i.e. the desirable locality of semantics is forsaken), but the functions are operationally (operational semantics) pure and can be optimized as pure.
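The state-threading idea can be sketched in TypeScript (a toy encoding of my own, not Haskell's actual IO): print returns a new "world" value instead of mutating one, so evaluation order is made explicit by the data flow and the functions stay operationally pure.

```typescript
// A toy immutable "world" carrying the output produced so far.
type IO = { readonly out: readonly string[] };

// print does not mutate: it returns a new world with the line appended.
const print = (io: IO, s: string): IO => ({ out: [...io.out, s] });

// Ordering is explicit: io1 must exist before the second print can run.
const program = (io0: IO): IO => {
  const io1 = print(io0, "hello");
  return print(io1, "world");
};
```

Nothing outside `program` changes when it runs; the caller decides what to do with the final world value.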

@keean

keean commented Oct 5, 2016

@shelby3 I went down the whole different arrows thing in the past and I am not convinced it will keep things DRY.

Personally I think functions should be functions, and we should infer monadic effects (so you don't have to put different arrows). You then use constraints at certain points, so for example if you want a particular callback to have no side-effects then you can give it the appropriate constraint. Now it would be a compiler error to try and pass a side-effectful function as the argument for that callback.

For optimisation purposes purity inference is better than annotation because the compiler will always get it right, and you won't forget to change the arrows if you edit a function which would lead to the compiler making false assumptions.

This also links to the idea of zero-arity functions behaving as values, which I think is a mistake we should not make. The reason is it will confuse imperative programmers. If something is a 'const' expression we can evaluate it at compile time if performance is a concern. Otherwise, if the value depends on runtime values, it needs to be executed at runtime, so treating it as a value is wrong anyway.
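The distinction can be sketched in TypeScript (illustrative names): an impure zero-arity function is not a value, because each call can yield a different result, whereas binding its result once produces a fixed value.

```typescript
// An impure thunk: re-evaluated at every call, so it is not uni-valued.
let clock = 0;
const now = () => ++clock;

const snapshot = now();  // a value: fixed at evaluation time (clock = 1)
const a = now();         // re-runs the thunk
const b = now();         // re-runs it again
```

Only a pure zero-arity function would always return the same result and could safely be treated as a constant value.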

@shelby3

shelby3 commented Oct 5, 2016

@keean what you are really saying is you refuse to have pure annotations and it has nothing to do with anything you've written prior about arrow notation, since this is the first time we've discussed using two arrows to signify pure annotation. What you wrote in the past is you didn't prefer different syntax for inline and non-inline function definitions. And I concurred.

As for DNRY (aka DRY), it is a fact of programming that we provide different implementations that accomplish similar goals, e.g. iterators versus enumerables and mutables versus immutables, because they provide different characteristics. If you don't like that, then perhaps programming is not the right field for you. Perhaps you'd also prefer we not offer monadic effects composition, since it will also have the same trait: if you don't want to use that form, then we have to duplicate all of it in the alternative form that others may prefer to use in their coding.

The purpose of purity annotation is to signify the programmer's intent so the compiler can check his intent. The purpose of purity in types, is again so intent is expressed.

You are trying so hard to find any excuse possible to avoid taking seriously the need for pure and impure functions. I don't see how you are arguing logically.

Only pure zero argument functions are uni-valued (one constant value). Non-pure zero argument functions do not behave as uni-values.

@keean

keean commented Oct 5, 2016

Writing your intent for purity seems the wrong idea. You do not want to say when something is pure; you want to say when you need something to be pure (especially in an imperative language where everything is impure by default, unlike Haskell).

Consider:

f(x, y) = 3 + 5

What do I gain from marking this pure?

@shelby3

shelby3 commented Oct 5, 2016

@keean wrote:

You do not want to say when something is pure, you want to say when you need something to be pure

Why didn't you just come right out and say that what you want is for functions to be able to behave as both pure and impure simultaneously via inference, because you think this will help DNRY? Can't you be more explicit, so that readers don't have to piece together your indirect implied meaning.

Because you think that if a function is not marked pure, then it won't require all its invariants to be pure when the caller of said function does not require purity.

I see at least four problems with your idea:

  • It can't work across module boundaries, or whatever we decide is the boundary of inference.
  • It doesn't check to ensure that a function can be used as pure when purity is needed. Whereas, if the programmer had been notified, he might have been able to restructure it to be pure and impure.
  • It doesn't enable a programmer to make any purity assumptions in his algorithms, which might alter the choice of algorithm and code structure, because the programmer can't dictate the expectation.
  • The programmer must always assume () function application generates side-effects.

I argue it is better to be explicit and provide a pure and impure version of a function, such as if the function inputs any callbacks or mutates any data structures in place. Otherwise there is no need for two variants of the function.

Afaics, you are putting too many unnecessary expectations on inference and this is going to make inference very complex and brittle especially when we start adding higher-order features for solving the Expression Problem.

@keean

keean commented Oct 5, 2016

I don't think we need purity anyway, just side-effect control.

What do we gain by marking a function pure?

@shelby3

shelby3 commented Oct 5, 2016

@keean answered here.

@shelby3

shelby3 commented Jun 1, 2017

@shelby3 wrote:

@keean wrote:

@skaller wrote:

Wrong terminology @keean. A function of arity 3 has type

A -> B -> C -> D

where D is not a function type

I disagree; arity is the number of arguments a function takes. Here an arity-one function returns an arity-one function, which returns an arity-one function that returns a value. Side effects could happen in any of those arrows. To ensure an arity-three function where side effects only happen after all three arguments are passed requires A * B * C -> D

@keean the question is what is the difference in type between:

(x) => (y) => (z) => f(x, y, z)

And:

(x) => (y) =>
   let y' = y + external_state
   (z) => f(x, y', z)

I was thinking => would signify possible side-effects, but I realize the purity annotation … would be more accurate, but its syntax needs to change as follows.

So the type of the former is A -> B -> C -> D and that of the latter should be A -> B => C -> D, assuming the type of f(x, y, z) is pure A * B * C -> D; otherwise the latter should be A -> B => C => D.

So that would settle the preference over whether to use -> or =>, because -> is clearly more accurate for a pure function.

I now prefer the finer-grained invariants of read-only annotations on arguments over annotations on whether functions are pure. The only non-local side-effects (within our static type system!) that can occur within a function, other than by mutating arguments, are mutations of external state accessed not via arguments.

Am I correct that the only valid (in correct design) use case for accessing state external to a function, which is not a function argument/parameter, is lexical closures that facilitate modular programming?

@shelby3 wrote:

Edit#2: I think perhaps closures play some role in that they encourage the programmer to inline a function to capture the external environment without needing to pass it as an argument down the call stack.

Should we somehow restrict these such as only allowing lexical closures within the same file in order to approximate the local reasoning that pure functions and referential transparency aim to accomplish, i.e. no closures on imported scopes (aka global variables, I guess those are not lexical scopes correct)? So then there are no impure functions (even in partial application) which are not lexically local and visible.

(Pure functions in theory also enable some optimizations and compiler reasoning, yet sufficiently local, contained lexical closures may also)

You had disagreed with me about certain side-effects being inapplicable, but I (and @skaller) argued I think convincingly that the only side-effects that matter are those that our type system guarantees.

If we pass around the IO monad as external state, then every print(x) will be an "impure" function w.r.t. every function that also accesses the IO monad, but if a new copy of the IO monad is returned instead of being mutated, then those functions are still pure functions, because they don't modify external state nor depend on any state not provided by their inputs. This is why Haskell is still pure even though it has side-effects on the IO monad: the IO monad state is passed along the function hierarchy, so the ordering is explicit. The effects are semantically imperative (meaning that changing something in one function could affect expected semantics in some function far away, i.e. the desirable locality of semantics is forsaken), but the functions are operationally (operational semantics) pure and can be optimized as pure.

The point is that pure functions and Monads are not a panacea. Thus afaics, the goal is localizing invariants and reasoning where feasible, but this is never an absolute.

@shelby3

shelby3 commented Jun 4, 2017

Added some analysis of default function argument parameters. Note scroll to the top of the linked Wikipedia page for the difference of my revision to the page.

What is interesting is the two general ways to model default arguments:

  1. We can annotate the types in the function signatures as taking argument parameters which can be optional (e.g. undefined) without specifying their default values in the function's type signature; thus each function which matches that signature can independently implement its own default values at call time, i.e. not resolved statically, which thus incurs runtime overhead. This is what TypeScript does by default.

  2. We can type the default values in the function signatures so the call sites can statically pass the default values, but this bloats the generated code and it means the default values for callbacks are controlled centrally by the signature of the function which inputs the callback (which might be desired in some cases). Unless we want to make the grammar more complex, default values in type signatures would realistically need to be limited to literals and named argument parameters (call site expressions do not make much sense anyway unless they refer to global variables). A function of the #1 variant can be passed as a callback to a function signature of either variant #1 or #2, but not vice versa; which seems to be a dominance for #1 variants (analogous to What Color is Your Function?), so the compiler would need to automatically make a #1 variant of every #2 function definition (and vice versa for maximum runtime performance, so as not to pass a #1 variant to a function with a #2 signature). Also #2 variants can work in the nested callbacks case (which is a separate issue from HRT), although perhaps unwieldy.

Note if we supported both variants then the TypeScript code generated by our transpiler would output variant #​2 function signatures as not accepting default arguments, because every call site in the generated code would explicitly supply the default argument values (thus the aforementioned code bloat).
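The two variants can be sketched in TypeScript (the `greet1`/`greet2` names are illustrative): variant #1 resolves the default inside the callee at call time; variant #2 keeps the default in the signature, so a compiler could inline it at every call site and the generated call passes the value explicitly.

```typescript
// Variant #1: callee resolves the default at runtime (TypeScript's model).
function greet1(name?: string): string {
  return `hi ${name ?? "world"}`;   // the default lives in the body
}

// Variant #2: the default would be part of the signature; a compiler
// could then statically supply it at the call site.
function greet2(name: string): string {
  return `hi ${name}`;
}
const inlined = greet2("world");    // compiler-inserted default value
```

Note the variant-#2 call site carries the default value itself, which is exactly the code bloat the comment above describes.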


Named function argument parameters are a separate issue, except that named arguments are only useful if they follow a default argument parameter, unless one argues they are also useful as a form of type-checked comment for descriptive local reasoning and/or so call sites do not need to be refactored when a default argument parameter is inserted at the function definition site.

I prefer the C# syntax at the call site of name: value instead of name = value, because then there is no chance of shadowing an assignment in an argument at the call site (tangentially, some argue that assignment in arguments is bad because argument evaluation order is undefined in some languages, but the order is specified left-to-right in JavaScript).

This has another advantage that names specified in type signatures look very familiar to other languages, e.g. (arg1: Type1, arg2: Type2) → ResultType.

It makes no sense to specify names on the types of callbacks nested within callbacks, since the only purpose of names on types of callbacks is so the body of function that inputs the callback can employ names at call sites.

Generated TypeScript code can simulate named parameters with destructuring parameters— which apparently is efficiently handled by engines as of ES6. But there are some pitfalls. However, even though it may be efficiently handled, since we may want to allow mixing named and unnamed arguments in the same call site, our transpiler-generated TypeScript code should explicitly pass all arguments (including undefined (or the non-redefinable void 0) for default arguments not provided), except optionally not for trailing default arguments which are not aforementioned #2 variants.
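A minimal sketch of this simulation (the `resize` function and its parameter names are illustrative): a destructured options parameter gives named, order-independent, and defaultable arguments in TypeScript.

```typescript
// Named arguments simulated via a destructured options object;
// height has a default that applies when the caller omits it.
function resize({ width, height = 10 }: { width: number; height?: number }): number {
  return width * height;
}

const byDefault = resize({ width: 3 });             // height defaults to 10
const reordered = resize({ height: 2, width: 3 });  // order-independent
```

The pitfall alluded to above is that every call must construct an object, which is why the transpiler might prefer to pass positional arguments explicitly in the generated code.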

@shelby3

shelby3 commented Jun 4, 2017

Making anonymous (aka lambda) function definitions as concise as possible (see also) and removing fugly nested parentheses, without proliferating limited special cases of abbreviated lambdas and single-argument functions, has been facilitated by the proposal to push the optional type annotation off to the RHS, because parentheses are no longer needed when there is a result type annotation.

Thus parentheses are not necessary to delimit the tuple of function arguments if the argument parameters are space-delimited instead of comma-delimited, which avoids the grammatical ambiguity conflict with comma-delimited lists (i.e. within tuples) in general. Space-delimited function arguments are not ambiguous, because juxtaposed expressions can otherwise never occur, given that line breaks to the block-indent column position are enforced between expressions (instead of1 JavaScript’s optional semicolon ;).

Although I would like to have only one way to define functions, there are conflicting priorities: for functions defined outside any other function body (i.e. at the top-level block at column position 0 and within typeclass implementations) there is no grammatical ambiguity between function definitions and calls, because function calls are not allowed outside of a function body; thus these can be most concisely written in the syntax of a function call: f(x, y, z) → …. Within function bodies, employing a prefix such as Rust’s fn to distinguish function definitions from calls afaics has no advantage over having only one way to declare an anonymous function and optionally assigning it to a (const) name: f := x y z → …, because we must have anonymous functions anyway (naming all functions is neither concise nor necessary).

The names and variant #2 defaults for arguments in a function definition with the type annotation off on the RHS do not need to be duplicated in said type annotation. But for function signature definitions in typeclasses, and for callbacks (aka HOFs) in function signatures where the body of the function is not specified, the names and said defaults can optionally (or even only partially) be merged into the type signature (instead of separated off to the RHS): (x: Type1, y: Type2, z: Type3 = 99) → ResultType; and where the name of the function needs to be provided in typeclasses and optionally in function signatures: f(x: Type1, y: Type2, z: Type3 = 99) → ResultType. The motivation for retaining the option for (even only some portion of) the type annotations off on the RHS is separated reasoning, and also abbreviating longish types which are repeated.

1 Offering optional semicolons in order to place more than one expression on the same line (as in Python), so that for example inline (non-block-indented) anonymous functions can have more than one expression, would maybe be desirable if it is feasible for our grammar not to require that they be enclosed in matching curly braces.


@shelby3 wrote:

Note I understand that you are thinking of the entire set of function arguments as a tuple (per that discussion we had with @skaller last year about why we use parentheses for function calls), and more recently you had argued against overloading the parentheses to include using them for tuples (the reason you had square brackets in your syntax example), but I think the programmer can understand that a function which takes multiple arguments can also be called with a tuple as one argument if the type of the function’s first argument (and any default arguments) doesn't make it ambiguous. I think keeping the parentheses consistent between code and type signatures for function declarations is more consistent for the eye. And it is consistent: the 0-tuple is unit, the 1-tuple is grouping, and the n-tuple is an anonymous product type.

@keean

keean commented Jun 4, 2017

I think keeping the parentheses consistent between code and type signatures for function declarations is more consistent for the eye. And it is consistent: the 0-tuple is unit, the 1-tuple is grouping, and the n-tuple is an anonymous product type.

I was using square brackets for type-parameters, not function arguments. As in

f[R](r : R, f) requires Range[R] =>
   ...

Could be written with a separate type signature:

let f : (R, F) requires Range[R], UnaryFunction[F] = ...

Note: If the type system has universal quantification we don't need to declare R and F as type parameters, as they are implicitly universally quantified.

@shelby3

shelby3 commented Jun 4, 2017

Note: If the type system has universal quantification

#11 (comment)

@keean

keean commented Jun 4, 2017

There are still type variables with implicit universal quantification, but you don't need to introduce them with type parameters. Here are two equivalent function types, first the parametric type:

f[A, B](x : A, y : B) -> ()

Now the type with implicit universal quantification

f(x : A, y : B) -> ()

A key difference is that the first example, with explicit type parameters, is not actually a type; it is a type constructor, and you must provide both type parameters for it to be an actual type (unless you have HKT), and it is monomorphic. The second, with universal quantification, is a type, and it is polymorphic. This shows up in Rust, which has parametric types: you cannot pass a 'generic' function with unspecified type parameters as an argument to a function. In Haskell, with universal quantification, you can.

@shelby3

shelby3 commented Jun 5, 2017

@shelby3 wrote:

@keean wrote:

@shelby3 wrote:

@keean wrote:

A function has a type... it can only have one type, that is fundamental, otherwise you cannot know the type of the function :-)

Interject that an intersection of functions is a type.

But that reminds me of our point that intersections and typeclasses seem incongruent.

So typeclasses for overloaded functions might be problematic. I do not have knowledge or experience as to why we would ever use typeclasses on function instances.

You are I guess only thinking of the way it is in Haskell (and Rust), but there is another way which I've seen in other languages such as method overloading in Java. Each duplicate function name is a different function.

Some people claim that form of overloading interacts very badly with inference. And other corner cases.

It certainly messes with type inference if you do this, and breaks local reasoning, because new versions of a function may change program behaviour, especially combined with generics.

I agree, it is a can-of-worms. I hadn't thought it through entirely, especially on the point of changing program behavior far from the declaration site. Yikes! 😨 TypeScript allows this apparently.

Hmmm. But how does it differ from a function which inputs a union of traits, e.g. Write | Show? Isn't adding another function overload equivalent to adding another item to the union for an argument of a single function? Are you claiming we can't write a function that inputs a union of trait interfaces? It seems incorrect to force the programmer to choose another boilerplate name (e.g. outputWrite and outputShow) for a function that is doing the same operation, just to differentiate the implementation of that function for two different input types. However, I agree that when multiple arguments per function are interacting in this matrix then it gets quite complex to determine which function will be chosen by in(ter)ference.

So I lean towards disallowing Java-style function overloading, but allowing unions and intersections for argument types.

Multiple arity can usually be handled with default arguments. And out-of-order defaults can be handled with named arguments on application, e.g. apply(arg6 = value).

The above is what N4JS has chosen (except I don't know if they implemented named arguments).

The main complaints against function name overloading I see are:

  1. Makes reasoning about type inference more complex.
  2. Need typecasts to select an overloaded function when assigning to an inferred type.
  3. Complicates the specification (see also), e.g. on issues of ambiguity and interaction with other features such as default and named arguments.
  4. Complicates naming in the transpiled target.
  5. Refactoring dominoes cascade.
  6. Name resolution and type analysis unification (in the compiler) are not orthogonal.

My motivation for function name overloading is that replacing succinct typing of a common semantic with a Cartesian-product explosion of names seems regressive given the point of typing. And union (disjunction) input argument parameters are not the same thing as intersections of functions, and are less efficient. Note TypeScript offers overloads as an intersection of functions, but unlike Java, the selection of the function is not statically determined; instead it is made by runtime case-type logic.
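This TypeScript overload style can be sketched as follows (the `output` function is illustrative): the overload list forms a compile-time intersection of signatures, but there is a single runtime body that dispatches with runtime type tests.

```typescript
// Two compile-time overload signatures…
function output(x: string): string;
function output(x: number): string;
// …backed by one runtime implementation that selects behaviour
// with a runtime case-type test, not static dispatch.
function output(x: string | number): string {
  return typeof x === "string" ? `str:${x}` : `num:${x}`;
}
```

Contrast Java, where each overload is a distinct function selected statically by the compiler at the call site.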

There appear to be some mitigations for the aforementioned complaints:

  1. A language which only employs bottom-up inferencing, i.e. no principal typings inference, should have fewer issues.
  2. Specification could dictate that first overload is selected otherwise.
  3. We could mimic TypeScript’s specification rule which is that the first overload choice (from top to bottom in the source code) that matches is chosen. The specification makes no attempt to determine if any of the other combinations would be otherwise ambiguous without that rule.
  4. We could just number the overloads, i.e. append a suffix number to the function name. K.I.S.S.. In terms of interaction with code that is not automatically recompiled to dependencies (yet that should not happen in correctly designed ecosystems), this presents a recompilation dependency (or refactoring for interoption with non-statically type checked languages such as JavaScript) issue if a function overload is deleted or inserted in the list of overloads. Appending to the end of the list does not have this issue. This design choice can be considered to be a Trojan horse that could force use of a statically compiled language.
  5. Nothing we can do about this nor IMO should we. Refactoring stable APIs is costly, even without function overloading. Afaics, generic typing in general will cause such amplification of refactoring cascade, i.e. it isn’t a problem specific to function overloading. All the function overloads should have a consistent semantic to the extent that types change and not number of non-default arguments. Otherwise the programmer is misusing the function overload feature. Thus cascading type changes should not cause any more chaos conceptually than they would for generic typing in general.

EDIT: default and named parameters fulfill some of the use cases where function overloading would be needed. Typeclasses fulfill other cases where only the data types change because typeclasses provide data type independence. Since function overloading seems to be incompatible with typeclasses (i.e. our aim is to employ typeclasses instead of concrete data types on function arguments, except perhaps for low-level highly optimized code) and sound type inference, we should probably not provide that feature.

@shelby3

shelby3 commented Jun 6, 2017

@keean wrote:

This shows up in Rust, which has parametric types: you cannot pass a 'generic' function with unspecified type parameters as an argument to a function. In Haskell, with universal quantification, you can.

Apparently Rust cannot even pass a function with specified type parameters into a function, because Rust does not offer HRT, which must be simulated with what is analogous to FCP (i.e. passing a named data type which wraps a type-parametrised function and then unwrapping it to call it generically).

My proposal for inference of HRT looks syntactically like universal quantification but it is just inferring the rank of the HRT.

But to transpile this HRT to TypeScript (which I presumed does not support HRT either, although this comment claims otherwise) I suppose will require employing FCP and/or eliding/erasing the typing if TypeScript does not support HRT.
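For what it's worth, TypeScript does express at least a rank-2 form: an argument can be given a generic function type that stays polymorphic inside the callee's body. A minimal sketch (names illustrative):

```typescript
// f has a generic function type, so it remains polymorphic inside
// applyBoth and can be used at two different types — a rank-2 type.
function applyBoth(f: <A>(x: A) => A): [number, string] {
  return [f(1), f("a")];
}

function identity<A>(x: A): A {
  return x;
}

const pair = applyBoth(identity);
```

This is unlike Rust, where the equivalent would need the FCP-style wrapping described above.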
