Syntax for inline functions and objects #54

Closed
gavinking opened this Issue · 91 comments
@gavinking
Owner

Something we need to decide as a community, and relatively soon, since this is one of the first things we need to implement after delivering M1, is the syntax for inline functions and, optionally, inline objects.

We already have this feature for named argument lists

Now, note that we already have one very nice way to do inline functions and objects:

value adults = people.filter {
    function by(Person p) {
        return p.age>18;
    }
};

And:

text.addListener {
    object listener satisfies TextListener {
        shared actual onEdit(TextChange change) { ... }
    }
};

And for many usecases this syntax is really great. The question is if we need something for use with positional argument invocations.

Obvious extension to positional argument lists

Now, the most obviously natural thing would be to simply allow an ordinary function or object declaration, but without the name, for example:

value adults = people.filter(function (Person p) {
        return p.age>18;
    });

text.addListener(object satisfies TextListener {
        shared actual onEdit(TextChange change) { ... }
    });

This is the option strongly favored by Stef, and I'm also very sympathetic to it because it's simple, straightforward, super-easy to parse and very regular with the syntax of the rest of the language. The verbosity can be addressed by supporting an abbreviation in the case of a function which simply returns an expression. Either:

value adults = people.filter(function (Person p) p.age>18);

might work, or at worst:

value adults = people.filter(function (Person p) (p.age>18));

It would also be possible to eliminate the function keyword:

value adults = people.filter((Person p) (p.age>18));

But now it's getting less regular, less explicit, and somewhat more difficult to parse.

The real doubt I have about this solution is that it is not a very good, natural, or easy-to-format syntax for extending the language with new control structures, which is kinda the whole practical point of inline functions.

Smalltalk-style arguments

A different option we've discussed is to let you put inline functions outside the parens of a positional argument invocation, resulting in an invocation protocol that looks a lot like Smalltalk's.

value adults = people.filter() 
    by (Person p) {
        return p.age>18;
    };

text.addListener() 
    listener satisfies TextListener {
        shared actual onEdit(TextChange change) { ... }
    };

Again, we can provide an abbreviation for inline functions that just return an expression:

value adults = people.filter() by (Person p) (p.age>18);

I personally find that this results in something that is highly readable once you get used to it. But I can understand if many people find it a lot less familiar/natural.

So this issue is a request for general community feedback.

(This issue was previously discussed here.)

@ikasiuk
Collaborator

When reading the text, with every example the syntax becomes harder to understand intuitively. I would say that at the point where you eliminate the "function" keyword, it goes from "ok, this is pretty obvious" to "wait, give me a second", especially when I imagine seeing it for the first time, without all the explanation you are giving here.

You are right of course, the Smalltalk style is clearly better in terms of language extensibility. It is not even very complicated as such, once you have understood it. The real problem is that it looks so different from the rest of the language, which makes it really hard to comprehend when you first see it. I do understand it now, but it still feels a bit out of place - kind of alien.

So the question is: what is more important - simplicity and understandability, or extensibility? I tend to prefer the former, but it is not an easy decision.

@FroMage
Owner

I'm for

people.filter(function (Person p) {p.age>18});

as an alternative to

people.filter{function by(Person p) {p.age>18}};

I don't like parentheses for deferred expressions as that's too akin to regular (evaluated now) expressions, while curly braces (well, except for named invocation) always denote a special control flow (if, for, while, method, class).

The function keyword makes this more regular and I don't buy that it's too verbose for the same reason that if is not too verbose and should not be replaced with (b ? t : e).

Making closures implicitly return the last evaluated expression seems very reasonable though, not sure why I feel that way.

@gavinking
Owner

I would say that at the point where you eliminate the "function" keyword, it goes from
"ok, this is pretty obvious" to "wait, give me a second", especially when I imagine seeing
it for the first time, without all the explanation you are giving here.

Agreed. But do you think that it would be acceptable for Ceylon to have a syntax for inline
functions that is more verbose than Java 8, C#, Fantom, Gosu, Kotlin, Scala, ML, Haskell,
Smalltalk, Ruby, Clojure...? Indeed, the only languages I know of which are similarly
verbose here are Python and JavaScript, and I believe that there is a proposal to introduce
an alternative, less verbose, syntax in the next release of JavaScript!

And even if it is desirable, which I'm totally prepared to take seriously, isn't it possible that
this would be a turnoff that would affect adoption of the language? (I'm playing devil's
advocate.)

So the question is: what is more important - simplicity and understandability, or extensibility?

Well, at some level the whole idea behind Ceylon is to get extensibility without sacrificing
understandability. But the two goals are often in tension, and they seem to be here.

@FroMage
Owner

I don't care about more verbose for something we think is saner. I don't like that in Scala you can't see if an expression is evaluated now or later without knowing the type of the variable where it ends up.

For example, given:

foo(bar++);

I don't know if bar++ is evaluated now or later unless I know if the type of the foo() parameter is Integer or Closure.

I am a bit of an extremist on this topic, I admit, but I strongly believe that the vast majority of people actually like to see if something is evaluated now or later. Call me an old-fashioned Scheme/JavaScript guy, but I'm convinced that this sort of visibility is what the vast majority of developers want, and not some terrible looking #()(2) or ()(2) like in some of those languages you cite.
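
To make the now-versus-later distinction concrete, here is a minimal sketch in Python rather than Ceylon. The names `next_bar`, `foo_eager` and `foo_lazy` are hypothetical, invented only for illustration; `next_bar()` stands in for the side-effecting `bar++` above.

```python
# Hypothetical illustration (Python, not Ceylon): whether the argument is
# evaluated now or later depends on what the callee receives.

calls = []  # records every evaluation of the side-effecting expression


def next_bar():
    """Side-effecting expression standing in for bar++."""
    calls.append("evaluated")
    return len(calls)


def foo_eager(value):
    # Receives an already-computed value: the side effect happened at the call site.
    return value


def foo_lazy(thunk):
    # Receives a function: nothing is evaluated until the callee invokes it.
    evaluations_before = len(calls)
    result = thunk()
    return evaluations_before, result


eager_result = foo_eager(next_bar())                      # evaluated now
lazy_before, lazy_result = foo_lazy(lambda: next_bar())   # evaluated later, inside foo_lazy
```

In the second call the explicit `lambda:` plays the role that the `function` keyword or the parameter list plays in the proposals here: it is the visible marker that evaluation is deferred.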

@ikasiuk
Collaborator

Yes, brevity seems to be very important for some people, and Ceylon has to face the comparison with other languages.
Hm, Stéphane has a point with the curly braces making it look clearer. I have the same intuition there: curly braces => control flow. What about

people.filter((Person p) {p.age>18});

or

people.filter(by(Person p) {p.age>18});

Somehow looks much better to me than with parentheses.

@FroMage
Owner

If brevity is important to people, then let's keywordise fun and require it on closures. Or λ (as an optional alternative) to be really geeky. Personally I don't buy that this is the sort of brevity that matters, not when we think that "value" is better than "var".

@gavinking
Owner

Stef, I just verified that either of the following options can be parsed:

people.filter(function (Person p) p.age>18);

Or:

people.filter((Person p) p.age>18);

So if we limit inline functions to only occur within positional argument lists, we actually would not need any delimiters at all for the body in the "simply returns an expression" case - which is, after all, the most common case.

But before we went down that path, I would want to know how important we think the following are:

  • inline object declarations, and
  • extending the languages with new control structures.

Now, I suppose there's no reason why we could not later (post-Ceylon 1.0) add support for Smalltalk-style invocations if we were to only support this simple case in M2. But I still want us to think it through.

@ikasiuk
Collaborator

If brevity is important to people, then let's keywordise fun and require it on closures. Or λ (as an optional alternative) to be really geeky. Personally I don't buy that this is the sort of brevity that matters, not when we think that "value" is better than "var".

I completely agree (also with the Scala example you give above). As Gavin says: "playing the devil's advocate"...
We should not sacrifice clarity for brevity. But I still think that when using curly braces the "function" keyword is not necessary to understand the code intuitively.

@gavinking
Owner

I don't care about more verbose for something we think is saner.
I don't like that in Scala you can't see if an expression is evaluated
now or later without knowing the type of the variable where it ends
up.

I'm not proposing that. In people.filter((Person p) p.age>18),
it's very clear that things are evaluated lazily. That's the effect of the
parameter list.

@gavinking
Owner

I have the same intuition there: curly braces => control flow.

The problem with this is that curly braces are already doing a
lot of work in Ceylon. And I've worked very hard to make sure
that the various interpretations they have:

  • class, method, attribute bodies,
  • named argument lists, and
  • sequence instantiations.

Are all nice and mutually regular. i.e. that every construct that
can appear in any of those contexts has the same interpretation
if it appears in any other one of the three.

But now you guys are suggesting a fourth interpretation of curly
braces where an expression that appears inside the curlies has
a different interpretation. (At least, I have not yet managed to
rationalize that the interpretation is consistent.)

Unless I can see that in

(Person p) {p.age>18}

the expression p.age>18 has an interpretation that is somehow
regular with its interpretation in:

print {"adult: ", p.age>18}

Then I guess I'm against it...

@FroMage
Owner

Do you find foo(() 2) more or less readable than foo(function(){ 2 })?

@FroMage
Owner

If closure bodies and method bodies don't register as similar to you for justifying curly braces then I wonder why not.

@FroMage
Owner
if we limit inline functions to only occur within positional argument lists

Mmm, then we can't do something like Callable<...> f = function(){2}; ?

@gavinking
Owner

Do you find foo(() 2) more or less readable than foo(function(){ 2 })?

Good question. :-)

If closure bodies and method bodies don't register as similar to you for
justifying curly braces then I wonder why not.

In a method body that returns the value of an expression, you need to write
return p.age>18;, not p.age>18.

@FroMage
Owner
In a method body that returns the value of an expression, you need to write `return p.age>18;`, not `p.age>18`.

Right, as I said, I don't know why I have nothing against return statements being optional in closures. Can't explain it but it feels right somehow.

@gavinking
Owner

Mmm, then we can't do something like Callable<...> f = function(){2}; ?

Well, that's along the lines I'm thinking right now. Notice that this is not a very useful thing to be able to write, since:

value f = function () 2;

is semantically identical to:

function f() { return 2; }

However, we could loosen the restriction to allow you to write value f = function () 2;. All we would need to say is that this syntax can appear after a = as well as in a positional argument list. That's still easy to parse.

What I don't want to say is that function () 2 is an atomic expression. That would *definitely* introduce the requirement for delimiters around the body. As well as being quite useless, as far as I can tell.

@ikasiuk
Collaborator

I'll refrain from opening an issue "Gavin King doesn't let me play Skyrim"...

But seriously: if the curly braces are a problem, that syntax would be acceptable I guess:

people.filter((Person p) p.age>18);
// or
people.filter(by(Person p) p.age>18);
//  or even
people.filter(function(Person p) p.age>18);

@quintesse
Collaborator

For me one of the things that hurts my eye is the double brackets that appear everywhere except for some of the single expression examples. I know some of you say that it is the most common occurrence but I just don't know if I agree.
Still, I'm all for allowing the function and object versions Gavin showed in the first message. Enabling that will make it possible to start using that kind of code (and make Stef happy), and in the meantime we can think about other possible syntaxes or leave it for later.

@gavinking
Owner

So here's something that ties in with this. Currently, I've said that a getter-style named argument is simply a convenient way to calculate an expression inline in a UI. But that doesn't really make it that useful. An alternative interpretation, one that I've toyed with quite a bit, is to say that this results in a Gettable (or Settable) instead of a T, i.e. that it is a way to do pass-by-reference/lazy evaluation.

TextInput {
    value model {
        return person.name;
    }
    assign model {
        person.name:=model;
    }
}

OK, that works, but it's super-verbose. But what if we supported the syntax value person.name as an abbreviation, by analogy with function () person.name? Then we can pass person.name by reference like so:

TextInput(value person.name)

That's pretty damn nice. And it's all very semantically consistent. Because that would be, at some vague conceptual level, just a kind of abbreviation of:

TextInput(value { return person.name; } assign { person.name:=model; } )

In the same way that

people.filter(function (Person p) p.age>18)

is like an abbreviation of

people.filter(function (Person p) { return p.age>18; })

and even

Button(void (ClickEvent event) print(event))

would be an abbreviation for:

Button(void (ClickEvent event) { print(event); })

So we would wind up introducing the following syntax into positional argument lists:

  1. function (X x, Y y) expr
  2. void (X x, Y y) expr
  3. value expr

And the value p { ... } assign p { ... } syntax into named argument lists.

We would thereby kill inline functions and pass-by-reference at once without futzing up our syntax.

@gavinking
Owner

OTOH, we also need to consider the relationship between lazy evaluation/pass-by-reference and metamodel references. I've been working on the assumption that for a toplevel or local declaration, the syntax for a metamodel reference is the same as for pass-by-reference. That is, choosing a totally hypothetical syntax for the sake of argument, if @null is a metamodel reference to the toplevel value named null, then you would pass null by reference to method() using method(@null).

@ikasiuk
Collaborator
@FroMage
Owner

Is this for attribute/method references or for any sort of lazy expression?

I am a bit confused by the "value" keyword for making lazy references.

@RossTate
Collaborator

I don't like dropping the function keyword. I'm not opposed to shortening it to fun. In fact, I sometimes like it when keywords don't match up with natural words so that I can use the natural words for my variable names.

Something I would like to aim for: forwards-compatibility with (empty) tuples and with juxtaposition as an operation. Seems to cause problems for both of these (the usual notation for the empty tuple is ()), and the Smalltalk thing looks problematic for juxtaposition.

I'm also up for using -> or =>. I know that's inconsistent with your syntax for entries, but I honestly don't care for that syntax. As I see it, once you add tuples (or at least pairs), you'll want to throw it away and replace entries with pairs.

I like the idea of a shorthand for the case when functions just return an expression. The one that looks nicest to me is wrapping the expression in parentheses. I imagine this is also easier for parsing, and it avoids ambiguity issues.

I do not like the idea of having different syntax for anonymous functions when used as positional arguments versus as lone expressions.

@guyv

I really should get into Ceylon some more before commenting on issues like these, but in the filter example: are we really obliged to repeat the type (Person p)? Can't that be derived in most cases?

@RossTate
Collaborator

If the parameter to people.filter is a subtype of Callable<Top,People>, then yes. However, if people.filter takes an Object or if it is polymorphic/generic, then no. For example, I could do the following:

shared <T> Callable<T,T> repeated(int n, Callable<T,T> func) {...}

Then for repeated(5, function (param) { return param.next; }), you can't infer the type of param. In fact, there could be many valid types for param, so guessing the type of param is dangerous since it could affect the semantics of the program.

You could say "guess it when you can guess it", but this makes your language fickle. In particular, whenever you update the language and/or its inference algorithms, you have to make sure that anything you could infer before can still be inferred. Your inference algorithm becomes a part of your language specification and type system whether or not you intended so.

So, I'm trying to avoid such issues by having all type checking and type inference be decidable (and trying to be forwards-compatible with possible future features).
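
As a rough illustration of the "many valid types for param" point, here is a minimal sketch in Python rather than Ceylon. The Python `repeated` below is only a hypothetical analogue of the Ceylon signature above, and the lambda body `param + param` is an assumption chosen for illustration (it is not the `param.next` from the example).

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def repeated(n: int, func: Callable[[T], T]) -> Callable[[T], T]:
    """Return a function that applies func n times (hypothetical analogue)."""
    def apply(x: T) -> T:
        for _ in range(n):
            x = func(x)
        return x
    return apply


# The lambda's parameter carries no type, and nothing in the call to
# repeated() pins T down either.
doubler = repeated(3, lambda param: param + param)

as_int = doubler(1)      # param treated as an integer: addition
as_str = doubler("ab")   # param treated as a string: concatenation
```

The same untyped function body is valid for more than one parameter type and means something different for each, which is why guessing a single type for it could change what the program does.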

@gavinking
Owner

So "value expr" basically means that you are passing expr by reference, not by value?!

Haha, yeah, that's not intuitive ;-)

@gavinking
Owner

Is this for attribute/method references or for any sort of lazy expression?

The proposal was that it could be any expression, for example:

value person.firstName + " " + person.lastName

@gavinking
Owner

I really should get into Ceylon some more before commenting on issues
like these, but in the filter example: are we really obliged to repeat the
type (Person p)? Can't that be derived in most cases?

Echoing Ross: no, not in general. Yeah, there are some cases where you intuitively can, but it's hard to write down exactly which cases they are. We're trying to stick very, very rigidly to this idea of always using principal types. The principal type of function (Person p) p.age>18 is well-defined: it is Callable<Boolean,Person>. The principal type of function (p) p.age>18 is not well-defined. We need to look at the surrounding invocation to determine what type it has. Indeed, we can't even analyze the type of p.age without looking at the surrounding invocation.

My experience is that whenever languages steer away from the use of principal types - and there are lots of cases like this where you think you might want to - they always wind up introducing pathological and/or unintuitive corner cases.

@gavinking
Owner

OTOH, we also need to consider the relationship between lazy evaluation/pass-by-reference and metamodel references.

So I'm hung up on this one. For an attribute of a class, I want a metamodel reference to look like this:

Person.name

which has type Method<Person,String>. The problem is when you get to toplevel attributes. Unfortunately,

null

is supposed to evaluate the toplevel attribute null, and therefore has type Nothing. Currently, the spec jumps through some hoops to say that it really has type Value<Nothing> which extends Gettable<Nothing> and is therefore implicitly converted to Nothing where needed. This also solved the problem of pass-by-reference. But I think it's clear that implicit conversions are now totally out of the question for this language.

So we need some other syntax for a metamodel reference to a toplevel attribute. An old version of the spec said that metamodel references look like Person#name, #null, #print, Equality#equals, #Range, etc. But that's really noisy and unnecessary for methods, interfaces, classes, and member attributes. We just have this one annoying case where we need a special syntax. I think we need to resolve this problem before we can get comfortable with the idea of value person.name.

@gavinking
Owner

So the wider question is: what is the syntax for referring to an attribute without evaluating it? I'm just not sold that value null is at all intuitive. OTOH, nor am I absolutely in love with something like null@ or @null.

@gavinking
Owner

Well, arguing against myself as usual, if we had a syntax like @null for referring to toplevel attributes without evaluating them (returning you a metamodel reference), that doesn't actually make the idea of value expr obsolete, since the form value expr is accepting a whole expression and wrapping it up in a new value declaration. So perhaps these two issues can be unlinked after all.

But I still want to know what is our work-in-progress solution for metamodel references to toplevel attributes before we make a final decision on this issue.

@FroMage
Owner

My problem with value person.firstName + " " + person.lastName is that of scope again: does value (forgetting that the keyword is counterintuitive) apply to the first or all arguments?

I'm not even sure we need anything lazy once we have references and closures. Why do we need it again?

As for references, well, you're not going to like it but in C you'd write &(obj.attr) to get a reference. I can totally live with obj#attr and I think this is what Java 8 is proposing.

@gavinking
Owner

My problem with value person.firstName + " " + person.lastName is that of scope again: does value (forgetting that the keyword is counterintuitive) apply to the first or all arguments?

Well, that's the thing, it applies to everything that follows, up to the , or ). It's part of the syntax of the positional arg list, so we don't run into the kind of issues we would face if it was an atom of the expression syntax.

The same comment applies to function:

people.filter(function (Person p) p.age>18)

The scope of function extends to the , or ).
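
The scoping rule can be sketched roughly as "split the argument list at top-level commas, and the keyword owns everything up to the next split". The following is a toy illustration in Python, not the Ceylon parser; `split_positional_args` is a hypothetical helper, and it deliberately ignores commas inside string literals or type argument lists such as Callable<T,T>.

```python
def split_positional_args(text):
    """Split the inside of a positional argument list at top-level commas.

    Toy sketch: only (), [] and {} nesting is tracked.
    """
    args = []
    current = []
    depth = 0
    for ch in text:
        if ch in "([{":
            depth += 1
        elif ch in ")]}":
            depth -= 1
        if ch == "," and depth == 0:
            # A top-level comma ends the current argument.
            args.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    args.append("".join(current).strip())
    return args


parts = split_positional_args(
    'function (Person p, Integer n) p.age>n, value person.firstName + " " + person.lastName'
)
```

Here the comma inside the parameter list does not end the first argument, so `function` covers the whole of `(Person p, Integer n) p.age>n`, and `value` covers the whole concatenation that follows.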

@gavinking
Owner

As for references, well, you're not going to like it but in C you'd write &(obj.attr) to get a reference.

Well, I certainly don't love it. But see #57. The combination of:

  • toplevel attributes, with
  • metamodel references.

Kinda forces us into needing something a bit cryptic like this. I don't see a way out of it.

I can totally live with obj#attr and I think this is what Java 8 is proposing.

I had no idea Java8 was proposing this. Link?

@FroMage
Owner

I am allowed to write a closure with statements though right?

people.filter(function(Person p) {
    if (p.age < 18) {
        throw;
    } else {
        return p.age < 21;
    }
});

@ikasiuk
Collaborator

I'm not even sure we need anything lazy once we have references and closures. Why do we need it again?

Indeed, a small example or two could help understanding what this discussion is actually about.

@FroMage
Owner

http://cr.openjdk.java.net/~briangoetz/lambda/lambda-state-3.html see 8. Method references.

Though I don't buy their prefix notation #person.name precisely because of:

We will likely restrict the forms that the receiver can take, rather than allowing 
arbitrary object-valued expressions like "#foo(bar).moo", when capturing instance
method references.

@gavinking
Owner

I am allowed to write a closure with statements though right?

I mean, I would be inclined to initially leave that out. You can do it with a named argument invocation. I personally find stacking {} inside () to be super-ugly. We can always add it later...

@FroMage
Owner

Well, ugly perhaps but it's not intuitive to me that closures are only for expressions :( I think people would not like that.

@gavinking
Owner

I'm not even sure we need anything lazy once we have references and closures. Why do we need it again?

In my mind "lazy" has pretty much always been the same feature as "references".

But if you want to define inline and pass-by-reference at the same time, then that's what value expr would let you do, under this proposal.

TextInput(value items[i]?.product.code)

or

TextInput {
    value model {
        return items[i]?.product.code;
    }
    assign model {
        items[i]?.product.code := model;
    }
}

I think once you start getting into higher-order use of this stuff you could get some crazy-cool stuff like:

Input(stringConverter(value some.number))

Where stringConverter() accepts a Settable<T> and returns Settable<String>.
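
For readers who want to see the shape of this idea outside Ceylon, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: `Ref` is a hypothetical stand-in for a Settable, `string_converter` guesses at what stringConverter() would do, and `Some` is a dummy holder for the attribute.

```python
class Ref:
    """Hypothetical stand-in for a Settable<T>: a getter/setter pair."""

    def __init__(self, getter, setter):
        self._get = getter
        self._set = setter

    def get(self):
        return self._get()

    def set(self, value):
        self._set(value)


def string_converter(ref):
    """Wrap an integer Ref as a string Ref (assumed behaviour, not a real API)."""
    return Ref(lambda: str(ref.get()), lambda text: ref.set(int(text)))


class Some:
    def __init__(self):
        self.number = 42


some = Some()

# Roughly what `value some.number` would have to capture: how to read and
# how to write the attribute, not its current value.
number_ref = Ref(lambda: some.number, lambda v: setattr(some, "number", v))
text_ref = string_converter(number_ref)

shown = text_ref.get()   # reads through both layers
text_ref.set("7")        # writes back through both layers
```

The point of the sketch is only that the reference is composable: a widget holding `text_ref` can read and update `some.number` without knowing about the conversion in between.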

@gavinking
Owner

see 8. Method references.

OK, but we don't actually have any kind of problem with method references. It's attributes that are the problem, specifically toplevel attributes.

@gavinking
Owner

Well, ugly perhaps but it's not intuitive to me that closures are only for expressions :( I think people would not like that.

Perhaps. Probably. I just think that a named argument list is much more elegant for this usecase. Anyway, I said "initially". There's no problem supporting the full form with a block.

@k-abram

For the following:


text.addListener()
    listener satisfies TextListener {
        shared actual onEdit(TextChange change) { ... }
    };

what does "listener satisfies TextListener" buy us? Isn't it mandatory that whatever is sent in is a TextListener? If so:


text.addListener( {
    shared actual onEdit(TextChange change) { ... }
});

@gavinking
Owner

@kpolo you also need to take into account what can actually be parsed using a context-free grammar. Your suggestion can't be. And even if you do have something that can theoretically be parsed with a bunch of lookahead, it still might be an absolute nightmare in the IDE, where you have to tolerate non-wellformed syntax and do something meaningful with it.

So even though I, as a human reader, can tell what you mean by the above (by reading all the way to the end of the code), an LL grammar would have a lot of problems with it.

@ycros

I like the value proposition, and I also think restricting it to expressions makes sense - trying to cram statements into an argument list IS ugly.

It just needs a new name. How about lazy, since everyone seems to be using it already. (Other ideas: expr, ref, fn, lambda)

@set321go

I personally would prefer the curly braces, but I don't think there should be a need to name functions or for the return statement. I think if you have the braces you can get rid of the keywords, as the braces still denote a block clearly.

people.filter((Person p) { p.age>18; })

If you get rid of the braces then I think you need to put something back in:
people.filter(function (Person p) p.age>18; )

@gavinking
Owner

Again, it is to me a very important issue of regularity that { statement; } and { expression } always mean the same thing no matter where they appear in code. A syntax like:

people.filter((Person p) { p.age>18; })

or

people.filter((Person p) { p.age>18 })

is completely irregular compared to the rest of the language. The first is a syntax error, because p.age>18 is not a valid expression statement, and the second is a vararg containing a single value of type Boolean. The language is extremely consistent about this stuff.

@danberindei

You could say "guess it when you can guess it", but this makes your language fickle. In particular, whenever you update the language and/or its inference algorithms, you have to make sure that anything you could infer before can still be inferred. Your inference algorithm becomes a part of your language specification and type system whether or not you intended so.

You can solve this by enabling type inference only in the cases where you know for certain that your type inference algorithm is decidable. The rules for enabling/disabling type inference should be much easier to understand than the inference algorithm, so there would be no problem adding them to the language spec. Since they are applied before the inference algorithm, you can still update the algorithm without worrying about arcane corner cases. And it would still remove the type name from 99% of the code.

The thing I hate most about Java's inner classes are all those ()s inside of {}s inside of ()s. I try to make my code as "flat" as possible, and I would rather write an extra keyword than an extra pair of parentheses. So I would prefer something like this:

value adults = people.filter(function p -> p.age>18);

or

value adults = people.filter{ by Person p -> p.age>18 };

I'm pretty sure you can parse function a, b -> a + b as function( a, b )-> a + b, but that would make -> have different priorities depending on context so it would arguably be best to require parentheses for multiple arguments.

When the function doesn't fit in one line I would prefer to use the Smalltalk-like variant. In that case having the argument type at the top of the block could actually improve readability, so I'm quite happy with your proposal.

I don't see myself using inline object parameters very often, but since they are never going to be one-liners I would always prefer the Smalltalk-like variant.

@RossTate
Collaborator

You can solve this by enabling type inference only in the cases where you know for certain that your type inference algorithm is decidable. The rules for enabling/disabling type inference should be much easier to understand than the inference algorithm, so there would be no problem adding them to the language spec. Since they are applied before the inference algorithm, you can still update the algorithm without worrying about arcane corner cases. And it would still remove the type name from 99% of the code.

So, just to repeat in my own words, you're suggesting having rules identifying when there's enough context information to guarantee inference, and then only do inference when such a context is present. The problem with this, at least in the general case, is that whether there's enough context to do inference depends on what is being inferred. Because of this, for every context you give me, I can give you an inference problem which is still not solvable with that context but would be solvable with a more detailed context (admittedly, I haven't proved this, but I'm conjecturing here based on my experience with the problem).

Now, for the specific problem of inferring the parameter type for an anonymous method, we can give such a rule. It would be something like "The parameter's type does not need to be declared if the anonymous method is a direct argument to a non-generic method whose corresponding parameter type is some instantiation of Callable." Although this is possible, my personal preference is just to be uniform rather than have such special casing. Nonetheless, it is an option.

@danberindei

Although this is possible, my personal preference is just to be uniform rather than have such special casing. Nonetheless, it is an option.

Note that I'm not suggesting creating a special rule for "parameter to an anonymous method", I'm considering parameter declaration to be the context and the rule to be:

"The parameter's type does not need to be declared if [it is a parameter of an anonymous method and] the anonymous method is a direct argument to a non-generic method whose corresponding parameter type is some instantiation of Callable."

How many different contexts are there where type inference is possible? A local, an attribute, a method parameter, a method return value, a type parameter, and that's about it.

These contexts already exist, and I believe type inference should be available in all of them (limited as it may be by context-specific rules).

@gavinking
Owner

A local, an attribute, a method parameter, a method return value, a type parameter, and that's about it.

But that's the point. In an object-oriented language, type inference for a method parameter is not something that is possible. Sure, you might be able to hack it in for some special cases, but they're always going to be special cases.

@danberindei

But that's the point. In an object-oriented language, type inference for a method parameter is not something that is possible. Sure, you might be able to hack it in for some special cases, but they're always going to be special cases.

I meant these are the contexts where you have to specify a type and in some cases the compiler may infer it.

I get it, this is not Haskell, so types are not going to be inferred everywhere. But attribute types can only be inferred if they are initialized and if they are not shared. Does that make type inference for attributes a hack?

@gavinking
Owner

I get it, this is not Haskell, so types are not going to be inferred everywhere. But attribute types can only be inferred if they are initialized and if they are not shared. Does that make type inference for attributes a hack?

I think there's a big difference between saying:

  1. whether you can infer the type of an attribute depends on where the things that refer to that attribute declaration occur, and
  2. whether you can infer the type of a parameter depends, somewhat self-referentially, on what type would be inferred.

The second thing is way more open to corner cases.

In Ceylon we've adopted the principle that the type of an expression should always be completely determined by looking at the expression itself. We don't "cheat" by looking at where it occurs. This is good because it means that undetermined types don't "leak" to the LHS.

For example, using my proposed syntax, there's nothing on earth wrong with:

value round = (Float x) x.natural;

Whereas if you wrote:

value round = x -> x.natural;

not only can we not infer the type of x, we can't even infer the type of round. You would have to write:

function round(Float y) = x -> x.natural;

and then infer the type of x from the type of y, which is the opposite direction from that in which type inference most naturally works.

Now, sure, this principle means that there are going to be some cases where in Ceylon I will have to specify a type annotation where I won't have to in Java 7, for example. But to my way of thinking that's perfectly OK. A type annotation on a method parameter generally makes code significantly easier to understand. And very crystal clear language semantics is always a plus. (And anyway, Ceylon will still require far fewer type annotations than Java 7!)

@RossTate
Collaborator

Now, sure, this principle means that there are going to be some cases where in Ceylon I will have to specify a type annotation where I won't have to in Java 7, for example.

On a related note, there are even cases in Java (and I'm guessing Scala as well, although I haven't verified it) where type inference can actually affect the semantics of the program. That is, the program will do very different things depending on what types are inferred.

I hope Gavin answered your question, but if not I can give you some examples when I get back from my trip.
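
One way this can happen in Java, offered only as an illustrative sketch and not necessarily the case RossTate had in mind: when overloads are involved, the type argument chosen for a generic call decides which overload runs. All names below (`InferenceSemantics`, `pick`, `describe`) are hypothetical.

```java
// Illustration only: every name here is made up for this sketch.
class InferenceSemantics {
    // T is normally inferred from the argument types.
    static <T> T pick(T a, T b) { return a; }

    // Two overloads; the static type of the argument selects one of them.
    static String describe(Object o)  { return "Object"; }
    static String describe(Integer i) { return "Integer"; }

    // T is inferred as Integer, so the Integer overload is chosen.
    static String inferred() {
        return describe(pick(1, 2));
    }

    // Same values, but T is spelled out as Object, so the Object overload is chosen.
    static String explicit() {
        return describe(InferenceSemantics.<Object>pick(1, 2));
    }

    public static void main(String[] args) {
        System.out.println(inferred()); // prints "Integer"
        System.out.println(explicit()); // prints "Object"
    }
}
```

The two calls pass the same values and differ only in whether the type argument is written down, yet they run different code.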

@kelvio

Hello, I have been following the project for a few days and I am very interested in it.
I have a suggestion for the var-arg syntax (I know this issue is not specifically about that, please forgive me). Why can only the last argument be var-arg? I have always dreamed of something like this in Java:

void someMethod (Float ... a, Float ... b, Float ... c) {
//For now, we ignore the array limit.
Float value = a[1];
Float value2 = b[2];
Float value3 = c[0];
}

//...

someMethod ({1.0f, 3f}, {4.5f, 2f, 42f}, {10f, 30f, 20f});

Would it be possible to have something like this in Ceylon?

@danberindei

@gavinking

In Ceylon we've adopted the principle that the type of an expression should always be completely determined by looking at the expression itself. We don't "cheat" by looking at where it occurs. This is good because it means that undetermined types don't "leak" to the LHS.

I could make the same argument about the inference of type parameters for constructors. I haven't found the spec, but I saw this example in one of your comments to Issue #61:

Pair<Link<L>,Link<R>> pair = Pair(left,right)

Here the compiler does infer the type arguments for the Pair constructor, and if you write it like this:

value pair = Pair(left, right)

then it can't infer the type of pair or the type arguments for Pair.

You would have to write:

function round(Float y) = x -> x.natural;

With my proposal you could also write

value round = Float x -> x.natural;

instead, and then inference would work the "normal" way, just like the inference of type arguments on constructors.

@RossTate

On a related note, there are even cases in Java (and I'm guessing Scala as well, although I haven't verified it) where type inference can actually affect the semantics of the program. That is, the program will do very different things depending on what types are inferred.

I'm sure there are cases like that in Ceylon, too. The following code should print 0:

value round = function (Float x) { return x.integer; }
Float i = 1.;
print(round(i) / 2);

If I change x.integer to x.wholePart, it will print 0.5 instead, because the return value of round is now Float.

@gavinking
Owner

Pair<Link<L>,Link<R>> pair = Pair(left,right)

Here the compiler does infer the type arguments for the Pair constructor

Right, but it infers them from the type of left and right, not from the type of pair.

and if you write it like this:

value pair = Pair(left, right)

then it can't infer the type of pair or the type arguments for Pair.

Not true. It infers the type Pair<TypeOfLeft,TypeOfRight>.

Hell, even this is legal in Ceylon:

value set = HashSet();

Here, set has inferred type HashSet<Bottom>. (Which is correct.)

And here:

value things = ListImpl("hello", 1.0, 5);

you get the inferred type ListImpl<String|Float|Integer>, which is also correct, assuming ListImpl is covariant in its type parameter.

Type argument inference in Ceylon is cool as hell ;-)
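
For readers more used to Java, here is a rough analogue of inferring type arguments from the arguments rather than from the left-hand side. It is only a sketch: `PairInference`, `Pair` and `pair` are hypothetical names, and Java has no union types, so it cannot reproduce the `ListImpl<String|Float|Integer>` case.

```java
// Hypothetical helper types, written only to illustrate the direction of inference.
class PairInference {
    static final class Pair<A, B> {
        final A first;
        final B second;
        Pair(A first, B second) { this.first = first; this.second = second; }
    }

    // A and B are inferred from the argument types at each call site.
    static <A, B> Pair<A, B> pair(A a, B b) { return new Pair<>(a, b); }

    static int demo() {
        // No declared type is available to help here: A = String and B = Integer
        // come from the arguments alone, which is why .length() and the
        // integer addition type-check.
        return pair("hello", 5).first.length() + pair("hello", 5).second;
    }
}
```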

If I change x.integer to x.wholePart, it will print 0.5 instead, because the return value of round is now Float.

Right, but you made an actual semantic change to the code. You didn't just add/remove a type annotation.

@gavinking
Owner

someMethod ({1.0f, 3f}, {4.5f, 2f, 42f}, {10f, 30f, 20f});

Would it be possible to have something like this in Ceylon?

Well that is possible, but that's not a varargs invocation, you're just calling the function with three sequences. You could declare the function like this:

void someMethod (Float[] a, Float[] b, Float[] c) { ... }

@danberindei

Right, but it infers them from the type of left and right, not from the type of pair.

My bad, I was thinking of Java's "diamond operator".

Type argument inference in Ceylon is cool as hell ;-)

That's exactly why I think it's a pity it doesn't apply to anonymous function parameters :-)

If Ceylon eventually acquires something like the concepts proposed for C++0x or Go's interfaces, then the compiler could infer a concept for the anonymous function parameters based solely on the function's body. But I didn't see anything like that on the roadmap.

In the end, you'll have to implement this type inference in the IDE anyway so you can suggest an appropriate type for code completion. So why not do it in the compiler as well?

Right, but you made an actual semantic change to the code. You didn't just add/remove a type annotation.

True, I meant to write Float (Float x) { return x.integer; } but then I looked up the spec and I saw that Ceylon doesn't have automatic widening for numeric types. This principal types thing might have some benefits after all :)

@RossTate
Collaborator

If Ceylon eventually acquires something like the concepts proposed for C++0x or Go's interfaces, then the compiler could infer a concept for the anonymous function parameters based solely on the function's body. But I didn't see anything like that on the roadmap.

Could you describe this a bit or provide a link to a concise description? I am intrigued.

In the end, you'll have to implement this type inference in the IDE anyway so you can suggest an appropriate type for code completion. So why not do it in the compiler as well?

I'm not sure when the IDE would be code-completing a type, but regardless: The IDE interacts with the user. Thus, the IDE can make a guess, and the user can correct that guess. This process is also done only as code is written, not as it is read. The compiler, on the other hand, is (ideally) not interactive. Thus, the compiler should be able to say "the code is bad" or compile the code. When the compiler says "the code is bad (because I can't guess the missing information here so please be more explicit)", then it's being an interactive process. Furthermore, the same code is compiled over and over again, so you don't want to have to explain yourself each time you compile. Also, the same code can be compiled by different compilers (or upgrades of the same compiler); thus if guessing is part of the game then code that one compiler guesses for correctly may not be guessable by another compiler and so now code is not portable. This is not a problem for IDEs, since the guess is done only once (upon writing the code), so you don't have to worry about cross-IDE compatibility. Thus, compilers and IDEs have very different requirements permitting them access to very different tools.

@danberindei

There's a short Wikipedia article on C++0x concepts here: http://en.wikipedia.org/wiki/Concepts_%28C%2B%2B%29
The nice thing about concepts is that you can declare an "auto concept" that defines a set of operations and all the types that define those operations will automatically satisfy the concept.

It should be possible to create a concept like this based on all the operations that the programmer used in the body of the anonymous function (they actually used an expression-based concept definition in one of the early concept papers: http://www2.research.att.com/~bs/popl06.pdf). However, I suspect it would be quite expensive and the IDE would have to combine it with some inference to support code completion in the function body. So I don't really think it's an option for Ceylon.

Go's interfaces (http://www.airs.com/blog/archives/277) are also satisfied by any type implementing their methods, without having to declare the relationship in advance. However, since Go doesn't have generic types, its compiler couldn't infer an appropriate interface for an argument based only on the body of the function.

I'm not sure when the IDE would be code-completing a type [...]

Everywhere a type can appear, I presume that the IDE will show a list of possible types on Ctrl+Space. And sooner or later it'll start pruning impossible suggestions from code completion, leaving in most cases only one option.

Furthermore, the same code is compiled over and over again, so you don't want to have to explain yourself each time you compile. Also, the same code can be compiled by different compilers (or upgrades of the same compiler); thus if guessing is part of the game then code that one compiler guesses for correctly may not be guessable by another compiler and so now code is not portable.

Still, I only have to add the type once, regardless of how many times I compile the code. To me it doesn't seem in any way different from the compiler telling me that I wrote a wrong type for the anonymous function argument and that I have to correct it.

And if you make the rule about when to infer the type part of the language, then code would be portable between compiler versions (maybe not between language versions).

@RossTate
Collaborator

If Ceylon eventually acquires something like the concepts proposed for C++0x or Go's interfaces, then the compiler could infer a concept for the anonymous function parameters based solely on the function's body. But I didn't see anything like that on the roadmap.

Okay, now I understand what you're suggesting. Unfortunately this is most likely undecidable as it's closely related to a number of other undecidable problems, specifically subtyping in System F-sub.

Consider the following:

Object foo = fun x => x + x;

What's the type of foo? (This matters cuz we can still cast foo to a Callable.) It only has a principal type if we allow <X> Callable<X,X> where X satisfies Summable<X> to be a type (rather than a type scheme), and we allow that to be a subtype of say <X> Callable<X,X> where X satisfies Number<X> and Callable<Integer,Integer>, at which point we get close to System F-sub, where just subtyping is undecidable. Matters only get worse when we start incorporating higher-order functions:

Object bar = fun x f => x + f(x);

Since now we have to let <X> Callable<X,X,Callable<X,X>> where X satisfies Summable<X> to be a subtype of, say, <X> Callable<X,X,<Y> Callable<Y,Y> where Y satisfies Summable<Y>> where X satisfies Summable<X>.

There's a short Wikipedia article on C++0x concepts here: http://en.wikipedia.org/wiki/Concepts_(C%2B%2B)
The nice thing about concepts is that you can declare an "auto concept" that defines a set of operations and all the types that define those operations will automatically satisfy the concept.

Thanks for the link. I'd forgotten about the "concept" concept, heheh.

Auto concepts are implementable in C++ because C++ templates are all instantiated at compile time (not the same thing as reified). Remember that the template system for C++ is itself Turing complete (from what I understand) so just the instantiation process may not terminate. We don't want this for Ceylon (at least, I'm assuming this is the case, I haven't actually asked anyone about this), and to avoid this requires rather awkward restrictions. For example, we wouldn't be able to allow the following code:

interface Set<X> {
    Set<Set<X>> groupBy(Equivalence<X> grouper);
}

With this, to instantiate Set<Integer> at compile time we would have to also instantiate Set<Set<Integer>> which would then need Set<Set<Set<Integer>>> and so on so that instantiation never terminates.

It should be possible to create a concept like this based on all the operations that the programmer used in the body of the anonymous function (they actually used an expression-based concept definition in one of the early concept papers: http://www2.research.att.com/~bs/popl06.pdf). However, I suspect it would be quite expensive and the IDE would have to combine it with some inference to support code completion in the function body. So I don't really think it's an option for Ceylon.

I couldn't figure out what expression-based concept definition you're intending. Admittedly, I haven't been able to read the whole paper, just skim through it.

Go's interfaces (http://www.airs.com/blog/archives/277) are also satisfied by any type implementing their methods, without having to declare the relationship in advance. However, since Go doesn't have generic types, its compiler couldn't infer an appropriate interface for an argument based only on the body of the function.

Go's interfaces take advantage of the fact that Go uses duck typing (as explained by the article). This has significant implications for the implementation. In particular, to invoke a method in Go, one has to look up the method-name (or usually some hash thereof) in a hashtable at runtime to get the function pointer. In Ceylon on the other hand, methods are looked up at a predetermined location in either the virtual method table or a dynamically found interface method table, which tends to be more efficient (especially when in the vtable). There is a related discussion on record types though: #120.

Furthermore, the same code is compiled over and over again, so you don't want to have to explain yourself each time you compile. Also, the same code can be compiled by different compilers (or upgrades of the same compiler); thus if guessing is part of the game then code that one compiler guesses for correctly may not be guessable by another compiler and so now code is not portable.

Still, I only have to add the type once, regardless of how many times I compile the code. To me it doesn't seem in any way different from the compiler telling me that I wrote a wrong type for the anonymous function argument and that I have to correct it.

I'm confused, "I only have to add the type once" sounds like you're making the type explicit in the code, i.e. you are not using inference in the compiler.

And if you make the rule about when to infer the type part of the language, then code would be portable between compiler versions (maybe not between language versions).

In which case the compiler isn't guessing; the compiler has an algorithm it can use that always works. If it doesn't always work, then the compiler is guessing, and you still have the same problem.

So, we're not saying we'll never do inference. We're saying we'll do inference when it is decidable (and we expect it will be forwards compatible with potential future features of the language). Unfortunately, the constraints for decidable parameter-type inference for anonymous functions are rather complex and inconsistent with our principle of principal types that we're using to keep all our features fitting together cleanly (and decidably).

@danberindei

Unfortunately this is most likely undecidable as it's closely related to a number of other undecidable problems, specifically subtyping in System F-sub.

I was trying to argue that it wouldn't be practical to infer the argument types of an anonymous function based only on its body. I guess it's not even possible in the general case.

Remember that the template system for C++ is itself Turing complete (from what I understand) so just the instantiation process may not terminate.

It does terminate in your example: in C++, if nobody calls Set<Integer>.groupBy then Set<Set<Integer>> is never instantiated.

I couldn't figure out what expression-based concept definition you're intending.

In the paper there was an example of a concept declared by using variables of the concept type in regular C++ expressions and the compiler was supposed to infer the concept's operations based on those expressions. They dropped that approach at some point, so they may have hit issues even with a limited subset of expressions.

In particular, to invoke a method in Go, one has to look up the method-name (or usually some hash thereof) in a hashtable at runtime to get the function pointer.

Actually no, the lookup is performed only once for every type -> interface pair. Very similar to C++, every interface implemented by a type (every base class in C++) has its own vtable - the difference is that the vtable is not constructed on startup, instead it's built when the type is first cast to that interface. Again, just like in C++, but unlike Java, you can't use a pointer to an object as a pointer to an interface (base class).

I'm confused, "I only have to add the type once" sounds like you're making the type explicit in the code, i.e. you are not using inference in the compiler.

Assuming I didn't know the language rules about type inference and I omitted the type, the compiler wouldn't infer it and would give me an error instead. I'd have to write the correct type, but then I wouldn't have to do anything on a second compilation.

Once I read the rule about type inference (perhaps in the compilation error message) and I understand it, I won't get the error again. Of course, this depends on the type inference rule being simple enough.

Unfortunately, the constraints for decidable parameter-type inference for anonymous functions are rather complex and inconsistent with our principle of principal types that we're using to keep all our features fitting together cleanly (and decidably).

I believe inferring argument types for a Callable instantiation (with all the type parameters bound) is always decidable - even trivial, so this would be a simple enough rule to decide when type inference should be enabled.

That leaves only the inconsistency with Ceylon's design principle of always using principal types. You could work around that by not making function x -> x + 1 or function(x) x + 1 a valid expression, and making it only a part of the method invocation syntax.

@RossTate
Collaborator

So it sounds like we've gone full circle and wound up with this rule again:

"The parameter's type does not need to be declared if it is a parameter of an anonymous method and the anonymous method is a direct argument to a non-generic method whose corresponding parameter type is some instantiation of Callable."

I still think I'd just prefer consistency over this special casing. For example, this rule means that making a method generic can break existing code, which I dislike since you'd expect this to make the method more useable, not less. This is a personal preference though.
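
For comparison only, the positive case of this rule looks roughly like what Java 8's target typing does when the receiving method is not generic. This is a sketch under assumptions: `TargetTypingSketch`, `keepInts` and `adults` are hypothetical names, and Java's actual inference rules are broader than the rule quoted above (they also cover many generic methods), so this does not show the breakage described here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical example; not code from Ceylon or from this thread.
class TargetTypingSketch {
    // Non-generic higher-order method: its parameter is a fully bound
    // instantiation of a function type, so the lambda's parameter type is fixed.
    static List<Integer> keepInts(List<Integer> xs, Predicate<Integer> test) {
        List<Integer> out = new ArrayList<>();
        for (Integer x : xs) {
            if (test.test(x)) {
                out.add(x);
            }
        }
        return out;
    }

    static List<Integer> adults(List<Integer> ages) {
        // No type is declared for `age`; it is read off Predicate<Integer>.
        return keepInts(ages, age -> age > 18);
    }
}
```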

P.S. Thanks for the implementation info on Go. That's pretty cool! Unfortunately I don't think we can take advantage of it because we're sitting on top of the JVM so we can't dynamically insert new interfaces into a class's interface table =^( At least, not that I know of...

@danberindei

So it sounds like we've gone full circle and wound up with this rule again:

I always thought that this is the only real option to support type inference for anonymous method parameters at all. Anything that makes the user stop and think if the type can be inferred or not is too complex.

For example, this rule means that making a method generic can break existing code, which I dislike since you'd expect this to make the method more useable, not less.

I agree this would be a downside. But at least the rule is quite simple, so it's not that easy to overlook it and make a method generic by mistake.

@tombentley tombentley referenced this issue in ceylon/ceylon-compiler
Closed

Anonymous functions #473

@FroMage
Owner

In light of the fact that, aside from old languages like JavaScript and Scheme, every new language seems to have dropped the C-style delimited declaration for lambdas, I'm ready to accept that, no matter what I think, the world wants lambdas that look like:

(Integer x) -> x + 1

Another way to look at it, is to see what the Java 8 syntax for lambdas is going to be (http://cr.openjdk.java.net/~briangoetz/lambda/lambda-state-4.html), and decide to go with that syntax on account of the fact that users will be used to it already, in much the same way we're making it easy for Java users to switch to Ceylon.

Apparently their syntax is:

(Integer x) -> expr

or

(Integer x) -> { stmt; } or (Integer x) -> { stmt; return expr; }

We should think about this, and also about the Java 8 syntax for method references, and interface methods.

@gavinking
Owner

@FroMage I think that's a horrible syntax, much worse than what we have right now.

-1

@FroMage
Owner

Which one do we have right now btw? Is it still what this issue description says? Or did we diverge during the discussion?

@gavinking
Owner

Note that the parser + typechecker currently accept either of the following two options:

(Integer x) x+1
(String s) print(s)

Or:

function (Integer x) x+1
void (String s) print(s)

I don't understand the need for the arrow, at least not if you're surrounding the parameter list in parens.

@gavinking
Owner

Note that this:

Integer x -> x+1

is not less verbose than

(Integer x) x+1

and it is less regular.

@quintesse
Collaborator

Can the type be left out like they do in this (supposedly) Java 8 example?

blocks.forEach(b -> { b.setColor(RED); });

@FroMage
Owner

I guess the arrow replaces the function keyword. I don't find () x particularly easy on the eyes…

@FroMage
Owner

Another thing to consider from Java 8 is the fact that they support lambdas as instances of single-method interfaces, which may or may not be something we'd like.

@gavinking
Owner

Can the type be left out like they do in this (supposedly) Java 8 example?

No. See the long discussion above. Our type inference scheme is very consistent and disciplined and doesn't allow inference of parameter types.

@quintesse
Collaborator

@gavinking Less regular, true. The thing I maybe liked about the "arrow" is that if I look at code using it in Java 8 examples (e.g. http://cr.openjdk.java.net/~briangoetz/lambda/collections-overview.html) the lambdas "jump out" more. It's more difficult to mistake it visually for anything other than a lambda.

@quintesse
Collaborator

@gavinking Is that what is referred to in the lambda document (point #4) as "target typing" (http://cr.openjdk.java.net/~briangoetz/lambda/lambda-state-4.html)?

@gavinking
Owner

I don't find () x particularly easy on the eyes…

There's an argument for having some kind of special syntax for "laziness", letting you write, for example:

function x
void print("hello")

(i.e. leaving off the parameter list when there are no parameters.)

But please note that functions with no parameters ("laziness") are definitely not the common use case for higher order function support. So I would prefer to not pollute the discussion too much with this special case.

@gavinking
Owner

@quintesse Scanning briefly, it looks like "target typing" is the ability to infer a single-method-interface type as the type of the whole lambda. That's not relevant to us since our anonymous functions aren't instances of single-method-interfaces.

@gavinking
Owner

@quintesse

Java:

blocks.filter(b -> b.getColor() == BLUE)
        .forEach(b -> { b.setColor(RED); });

Ceylon:

blocks.filter((Block b) b.getColor() == BLUE)
        .forEach((Block b) b.setColor(RED));

To me these are pretty much equally readable. And the Ceylon version doesn't require fancy type inference algorithms.

I would say the visual difference between them is that the Java syntax emphasizes that you have an anon function, the Ceylon syntax emphasizes that you're introducing a new variable b.

@quintesse
Collaborator

@gavinking It's not so much about readability, the amount of tokens is more or less the same, it's the fact that the "arrow" jumps out more giving a bit of a visual indication that something special is going on here, that it's not a direct expression but a lambda. Now the effect might be minimal for some, but for me it's there. NB: I'm not suggesting to do it the same as in Java, in fact my proposal would only add more tokens compared to either version:

blocks.filter((Block b) -> b.getColor() == BLUE)
    .forEach((Block b) -> b.setColor(RED));
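
For reference, the shape above (explicit parameter type plus arrow) is one of the forms Java 8 accepts alongside the untyped `b -> ...` form. A minimal runnable sketch follows, assuming a hypothetical `Block` class and `Color` enum. Note that in Java 8 as released the pipeline goes through `stream()` rather than calling `filter` directly on the collection, as the draft examples quoted in this thread do.

```java
import java.util.List;

// Hypothetical Block/Color types, defined only so the sketch is self-contained.
class BlockDemo {
    enum Color { RED, BLUE }

    static class Block {
        private Color color;
        Block(Color color) { this.color = color; }
        Color getColor() { return color; }
        void setColor(Color color) { this.color = color; }
    }

    // Repaints every blue block red, using explicitly typed lambda parameters.
    static List<Block> repaint(List<Block> blocks) {
        blocks.stream()
              .filter((Block b) -> b.getColor() == Color.BLUE)
              .forEach((Block b) -> b.setColor(Color.RED));
        return blocks;
    }
}
```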
@quintesse
Collaborator

Anyway, I'm not going to insist on using an arrow or anything, just explaining my way of thinking.
So what's the syntax for multi-statement lambdas?

@gavinking
Owner

@quintesse

it's the fact that the "arrow" jumps out more giving a bit of a visual indication that something special is going on here, that it's not a direct expression but a lambda.

But that's not what's interesting. Well, perhaps it's interesting if you're a newbie and uncomfortable with the whole notion of higher-order functions. But it's definitely not interesting to someone who uses higher-order functions every day. What's much more interesting is that you are introducing a new thing called b, and wtf is this thing and what is its scope. That gets lost in the Java syntax.

@gavinking
Owner

So what's the syntax for multi-statement lambdas?

None: they're ugly as sin, unreadable, and we don't need 'em. We already have a very elegant way to write multi-statement inline functions in a named argument list. We don't need them in positional argument lists.

@gavinking
Owner

Another argument against inferring parameter types is that many higher order functions are generic, and, even if it's possible to design some sophisticated algorithms to reliably infer the type arguments in some limited set of cases, given a complete code snippet, it's likely to be much harder to do this inference given a bit of incomplete code the user is typing in the IDE. Which means you will often be typing the body of the anon function without the help of autocompletion. We'll see what the JDT guys come up with for Java, perhaps they can make it work often enough that users won't mind too much, but I have doubts that you can ever get this truly robust.

@gavinking
Owner

@4959037 Yeah, higher-order anonymous functions. I'm serious about that ;-)

@gavinking
Owner

I'm going to close this issue because we have an acceptable, already-implemented solution. That doesn't mean I'm trying to shut down this interesting discussion: you guys can still comment on the closed issue.

@gavinking gavinking closed this
@FroMage
Owner

you guys can still comment on the closed issue.

…while you're done listening to us? ;)

@FroMage
Owner

In all my years of writing functional code in Scheme and JavaScript I've never once written simplistic lambda expressions like those used in the Java 8 paper, but always statements. I'd like to hear a more compelling argument than "ugly as sin, unreadable, and we don't need 'em" ;)

@gavinking
Owner

I'd like to hear a more compelling argument than "ugly as sin, unreadable, and we don't need 'em"

I don't understand. How much more compelling an argument do you need than "we already have an alternative much, much better syntax for this case"?

@RossTate
Collaborator

Another thing to consider from Java 8 is the fact that they support lambdas as instances of single-method interfaces, which may or may not be something we'd like.

I've thought about this before. Suppose we had some annotation, say delegate to steal from C#, that could be applied to interfaces and requires the interface to have exactly one formal (not default) non-generic method (though the interface could still be generic). If so annotated, we provide a constructor for the interface that just takes a Callable of the appropriate type.

For example:

shared delegate interface Equivalence<in T> {
    formal Boolean equivalent(T left, T right);
    default Integer hash(T t) { return 0; }
    Boolean distinct(T left, T right) { return !equivalent(left, right); }
}

Equivalence<Integer> modular(Integer divisor) {
    return Equivalence((Integer left, Integer right) left % divisor == right % divisor);
}

Does that seem reasonable?
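
For comparison, Java 8's functional interfaces cover much the same ground: an interface with exactly one abstract method, plus any number of default methods, can be implemented by a lambda. Below is a rough Java rendering of the example above, as a sketch only. The wrapper class `DelegateSketch` is a hypothetical name, and Java's `@FunctionalInterface` annotation is only loosely analogous to the proposed `delegate` annotation.

```java
// Rough Java counterpart of the Ceylon sketch; names mirror it where possible.
class DelegateSketch {
    @FunctionalInterface
    interface Equivalence<T> {
        // The single abstract method a lambda will implement.
        boolean equivalent(T left, T right);

        // Default methods do not count against the single-method requirement.
        default int hash(T t) { return 0; }

        default boolean distinct(T left, T right) { return !equivalent(left, right); }
    }

    // Two integers are equivalent when they leave the same remainder.
    static Equivalence<Integer> modular(int divisor) {
        return (Integer left, Integer right) -> left % divisor == right % divisor;
    }
}
```

With this, `DelegateSketch.modular(3).equivalent(4, 7)` is true, since both leave remainder 1, and `distinct(4, 5)` is true as well.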

@FroMage
Owner

we already have an alternative much, much better syntax for this case

I suppose that is a question of style, and not of regularity. I still didn't hear an argument why function () { stmt; } would be bad, unless I missed it.
