New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compound extension #176

Open
wants to merge 3 commits into
base: master
from

Conversation

Projects
None yet
7 participants
@chuckjaz
Copy link

chuckjaz commented Jan 4, 2019

A compound extension proposal as discussed here: https://discuss.kotlinlang.org/t/compound-extension/10722

@TWiStErRob

This comment has been minimized.

Copy link

TWiStErRob commented on proposals/compound-extensions.md in 0a9c368 Dec 18, 2018

Have you considered a soft keyword? compound fun (A, B).some( would be unambiguous from the beginning.

This comment has been minimized.

Copy link
Owner Author

chuckjaz replied Dec 18, 2018

I had not considered it, no. I will add that as an alternative.

Added a more detailed description of how to resolve an extension
from disucssions in https://discuss.kotlinlang.org/t/compound-extension/10722

Added another syntax variation using pseudo-keywords discussed
here: chuckjaz@0a9c368
A compound extension is declared using dotted list of types. For example,

```kotlin
operator fun Body.String.unaryPlus(s: String) = escape(s)

This comment has been minimized.

@JakeWharton

JakeWharton Jan 5, 2019

This can't work. What does Map.Entry.unaryPlus apply to? Map and Entry or Map.Entry?

This comment has been minimized.

@Wasabi375

Wasabi375 Jan 5, 2019

Contributor

I think this is discussed later on in the syntax part of the proposal. As far as I can tell, this example uses the current syntax as an introduction to the topic. This syntax clearly does not work, neither with namespaces (as pointed out below) not nested classes.
The question is which of the other possible syntax variants should be used primarily. I personally prefer parenthesis.

This comment has been minimized.

@altavir

altavir Jan 5, 2019

It seems like square brackets give the best readability and parseability. We should remember that we need the syntax for function types. [].()-> reads better than ().()->. I think that alternative syntax solutions should be added to proposal.

This comment has been minimized.

@chuckjaz

chuckjaz Jan 7, 2019

Author

It works in either case as addressed below.

In the where there is an existing meaning then such as Map.Entry it refers to Map.Entry. If Map doesn't have a nested type named Entry it refers to the Entry in scope. If you have an Entry in scope you wish to extend, it collides with Entry but a alias can be introduced to allow you to disambiguate.

@altavir

This comment has been minimized.

Copy link

altavir commented Jan 5, 2019

I do not really like the title. Event compound receiver would be be better. But commonly used name for this feature is multiple receiver extension.

@altavir

This comment has been minimized.

Copy link

altavir commented Jan 5, 2019

As a motivation for this proposal (not exactly current version but the one we got after discussion), I wrote an article about context-oriented approach in kotlin program design.

Additional motivation could be found in kmath documentation where I actually use this approach.

Currently I kind of really need it for performance optimization. I have contexts which define operations on some objects (for example nd-structures). Now I want to some functions like plus to be defined in the context:

class NDField<T>{
  ...
  fun NDStruncture<T>.plus(arg: T): NDStructure<T>
}

This operation could benefit a lot from being moved to extension. It allows not to bloat the NDField class and avoid complicated type constructs in inheritance. What is more important it allows to create an optimized implementations for specific types without complicated inheritance structures of field itself:

  fun <T> [NDField<T>, NDStruncture<T>].plus(arg: T): NDStructure<T>
  fun [NDField<Double>, NDStruncture<Double>].plus(arg: Double): NDStructure<Double>

Being dispatched statically, those functions could use performance optimization like inlining. Context-oriented paradigm could work solely on such extensions since the method resolution is done based not on runtime type (runtime type could be used, but it is could be avoided), but on scope type. In fact, one could use empty marker scopes and dispatch all functions statically which should also benefit the performance.

@chuckjaz

This comment has been minimized.

Copy link
Author

chuckjaz commented Jan 7, 2019

This seems similar in character to the escape html DSL example or am I missing something?

@chuckjaz

This comment has been minimized.

Copy link
Author

chuckjaz commented Jan 7, 2019

As for the name, I am open to other names but "compound receiver" doesn't seem right to me as the receiver is not compound. "multiple receiver extension" is better as it is more directly what is happening.

@erokhins

This comment has been minimized.

Copy link
Collaborator

erokhins commented Feb 4, 2019

@chuckjaz Thank you for proposal!

I want to highlight some not covered/problematic places.

Syntax

As @JakeWharton mentioned before, syntax fun A.B.foo() can has 2 different meaning: function with receiver A.B (where B is nested class of A) or function with two receivers: A and B.
Ambiguity can be resolved as you proposed (nested class wins), but user cannot force compiler use another meaning: two receivers.

Another problem here is that sometimes you want to use fully-qualified name for type i.e. fun a.b.Foo.x.y.Bar.bas() and in such cases resolution rules will be even worse (especially in package a we have class b and so on).

Resolution rules

Consider the following example:

class A {
    fun B.C.foo() {}
}

class B
class C

fun A.B.foo() {}

fun test() {
    with(A()) {
        with(B()) {
            with(C()) {
                foo()
            }
        }
    }
}

This example illustrates 2 different problems:

  1. what we should do if our candidates have different amoung of receivers? (We can consider that member extension function foo in class A have 3 receivers: A, B, C)
  2. should be members wins over extensions?

Receivers handling

Disambiguation via this@A where A is type name introduce entirely new meaning for labels -- label by type name. I would try to avoid that. Instead of that maybe we can mark receiver explicitly if necessary.

Related not covered theme -- possibility to explicitly pass all implicit receivers on call site:

class A
class B

fun A.B.foo() {}

fun test(a: A, b: B) {
    (a, b).foo() // one possible synthaxs for that
}

@erokhins erokhins self-assigned this Feb 4, 2019

@altavir

This comment has been minimized.

Copy link

altavir commented Feb 4, 2019

The alternative syntax in discussion (round or square brackets) covers all syntactic problems. @chuckjaz could you move it from discussion to the proposal as alternative? I really do not like this dot solution.

As for member vs extension problem. As @orangy mentioned, we should try to keep the same behavior as was before for member extensions. I do not think that multiple receivers would change that. The member always win rule is still in place.

@erokhins

This comment has been minimized.

Copy link
Collaborator

erokhins commented Feb 4, 2019

Sometimes extensions win over members:

class A {
    fun foo() {}
}
class B

fun B.foo() {}

fun test() {
    with(A()) {
        with(B()) {
            foo() // resolved to B.foo
        }
    }
}

So particular rules should be clarified.

@altavir

This comment has been minimized.

Copy link

altavir commented Feb 4, 2019

Are those rules explicitly written somewhere? It would be easier just to look at them and see what should be changed.

@erokhins

This comment has been minimized.

Copy link
Collaborator

erokhins commented Feb 4, 2019

@altavir

This comment has been minimized.

Copy link

altavir commented Feb 4, 2019

I am moving two my latest posts form discussion here with little remarks

Next iteration of resolution proposal

Here I will try to combine best parts of all current proposal and account for backward compatibility. The resolution could be done in following steps:

  1. Create a list of actual contexts G, A, B, C, ... where G is a global context. Top level non-extension functions have only G as context. Class members have that class as context. Extension function have their receivers added to context where they are defined. Running a function with receiver adds that receiver to the list of contexts where this function is defined.Types in the list could be duplicating. We will call those actual types Cy where y is the index
  2. Checking the definition of the function. Function receiver list is written in form of [R1, R2, R3]. Duplicate types or ambiguities throw compile error. Types could have parameters defined outside type list like fun <T> [T, Operation<T>].doSomething(). We will call receiver types Rx where x is the index.
  3. Binding of extension function. Each of types R in receiver set is matched against the elements of context list from right to left, binding it to first match and thus creating a map Rx to Cy. If there is a bound pair for each of Rx then function is considered resolved and bound to context. Multiple R could be bound to the same context C, so it is possible to have just one context for multiple receiver types if it matches them both.

The normalization step is probably could be avoided in this scheme. The results of this procedure are the decision about binding and binding map Rx to Cy.

This resolution is done via declared receiver types Rx which are then substituted by actual runtime objects representing Cy. Since the binding is done to the closest receiver matching the type, this will also represent the closest receiver of matching type.

Compatibility check

  • [A].f === A.f.Extension function with single argument should work exactly like existing extension function. It seems like it does. It is resolved and bound to the closest context matching its type.

  • Match current member receivers strategy. Seems to be working the same way. It resolves to the closest context even if this context implements both receivers.

@altavir

This comment has been minimized.

Copy link

altavir commented Feb 4, 2019

Some additional thoughts

I was thinking a lot about this context-based resolution idea and I think that it is possible to go a little further (not immediately, mind you, it is the idea for the future). I am leaving this global context G everywhere in my schemes and it does not play any important role because it usually is empty. But in fact it could provide a lot of opportunities.

File level context

G is basically resolved to file which does not have any type of its own. But if there was a way to bind a context or a set of contexts to the file itself (not proposing any syntax solution, but it should be quite easy). Then everywhere in my schemes above we will just replace G with G, F1, F2. Meaning that all classes and functions in this file will have additional implicit contexts, just like type-classes. Of-course, it means that Fs could only be singletons in this case.

This mechanism is in fact currently used in KEEP-75 for script implicit receivers. But of course, for code, it should be explicit and only singleton objects will do.

Extension classes and interfaces

We consider a class or interface to be top-level non-empty context. It seems to be not so hard to add a set of external contexts to the class context. It could look like class [T1,T2].MyClass{}. In this case the instance of this class could be created only in context bound to both T1 and T2 and all members of this class will have additional implicit contexts, meaning member of MyClass will be able to call members of T1 without additional declarations. From the implementation point of view, the instances of T1 and T2 could be class constructor implicit parameters (or a single map parameter, which is probably better), it should not violate anything, even Java compatibility (you will have to pass instances of contexts to create class instance). The interfaces could work the same way, just provide synthetic (invisible) property which holds the Map<KClass, Any> of actual receiver.

Note: There will be some problem with class type parameter syntax here since unlike functions, type parameters in classes are declared after class name, but probably it could be solved.

This second proposal is in fact much more flexible than first one since we can use non-singleton contexts and define them explicitly on class creation. Also, both ideas probably could cover most of type-classes use-cases.

@chuckjaz

This comment has been minimized.

Copy link
Author

chuckjaz commented Feb 4, 2019

I want to highlight some not covered/problematic places.

Syntax

As @JakeWharton mentioned before, syntax fun A.B.foo() can has 2 different meaning: function with receiver A.B (where B is nested class of A) or function with two receivers: A and B.

Another problem here is that sometimes you want to use fully-qualified name for type i.e. fun a.b.Foo.x.y.Bar.bas() and in such cases resolution rules will be even worse (especially in package a we have class b and so on).

My intent here is not to change the rules of resolution but to add a meaning to a case that is currently an error. The proposal only kicks in, so to speak, when Kotlin would report an error.

Ambiguity can be resolved as you proposed (nested class wins), but user cannot force compiler use another meaning: two receivers.

They can by using a type alias or rename on import. This is the same case you run into if you are trying to use the type Bar from x.y in a package that defines its own Bar. One must be renamed or otherwise aliased.

Resolution rules

Consider the following example:

class A {
    fun B.C.foo() {}
}

class B
class C

fun A.B.foo() {}

fun test() {
    with(A()) {
        with(B()) {
            with(C()) {
                foo()
            }
        }
    }
}

This example illustrates 2 different problems:

  1. what we should do if our candidates have different amoung of receivers? (We can consider that member extension function foo in class A have 3 receivers: A, B, C)

The first found is scope wins. In this case since the fun B.C.foo() in class A wins. The number of receivers is defined by the function being called. In this case, it requires 3, A (as this) B and C as receivers. Since fun A.B.foo() is not found the fact it takes 2 receivers is not relevant.

In other words, the lookup should behave consistently with removing the compound receiver and being left only the simple receiver. For example, because

class A {
    fun C.foo() {}
}
 
class B
class C
 
fun B.foo() {}
 
fun test() {
    with(A()) {
        with(B()) {
            with(C()) {
                foo()
            }
        }
    }
}

will result in A's fun C.foo() being called as it is in scope prior to fun B.foo(). The previous example should result in fun B.C.foo() from class A, and, identically, because fun B.foo() would receive two receivers today (A as the this parameter and B as the receiver parameter) and the presence of foo B.foo() {} doesn't affect this, the function fun B.C.foo() would receiver three and is unaffected by foo A.B.foo() in scope.

  1. should be members wins over extensions?

The current lookup rules for extensions are unmodified. A compound receiver function foo A.B.foo() should be thought of as extending the type A with a member fun B.foo(), which extends the type B with fun foo(). A direct member of B has precedence over all extensions.

Receivers handling

Disambiguation via this@A where A is type name introduce entirely new meaning for labels -- label by type name. I would try to avoid that. Instead of that maybe we can mark receiver explicitly if necessary.

I will add that as alternative. My intent was not to add a new type label but to have the type name imply a label.

Related not covered theme -- possibility to explicitly pass all implicit receivers on call site:

class A
class B

fun A.B.foo() {}

fun test(a: A, b: B) {
    (a, b).foo() // one possible synthaxs for that
}

This seems to be a related but separate proposal as it would apply to extensions in generate, not just compound extensions.

@chuckjaz

This comment has been minimized.

Copy link
Author

chuckjaz commented Feb 4, 2019

Sometimes extensions win over members:

class A {
    fun foo() {}
}
class B

fun B.foo() {}

fun test() {
    with(A()) {
        with(B()) {
            foo() // resolved to B.foo
        }
    }
}

So particular rules should be clarified.

The proposal defines, given a scope, such as the scope for the body of the with (B()), how does one determine if foo() matches an extension function in scope. It does not affect how scopes are composed. The example above illustrates what happens when a member of B obscures a member of A. This proposal will have an effect on this case but only as far as this proposal affects how members are found in the scope of B.

@chuckjaz

This comment has been minimized.

Copy link
Author

chuckjaz commented Feb 4, 2019

@altavir Thanks for providing a central place to archive discussion about this proposal.

I addressed the comments attached here https://discuss.kotlinlang.org/t/compound-extension/10722/31 and modified the proposal to cover the issues surfaced in the discussion.

@fvasco

This comment has been minimized.

Copy link

fvasco commented Feb 4, 2019

In the section "Matching rules": "Report ambiguous calls If multiple valid candidates are possible the call is ambiguous and an error is reported", how to disambiguate and so fix the error?

I wish to use a compound extension to add two integers, should I write:

fun Int.Int.sum() = this@Int + this@Int

The section "Disambiguation of this" does not help for my task.

Using the above function, what is x value of the code:

val x =
 with(2) {
  with(3) {
   with(5) {
    sum()
   }
  }
 }

Is it an ambiguous call?

@chuckjaz

This comment has been minimized.

Copy link
Author

chuckjaz commented Feb 5, 2019

@fvasco This is not ambiguous, it is just obscure.

You can use a type alias to introduce a name to refer to the first receiver.

typealias MyInt = Int
fun MyInt.Int.sum() = this@MyInt + this@Int

The value of x above is a 8 as you would expect. It is not ambiguous as one and only one function would match fun MyInt.Int.sum(). The receivers are calculated as nearest where MyInt binds to the Int scope introduced by with(3) and Int binds to the receiver introduced by with(5).

@altavir

This comment has been minimized.

Copy link

altavir commented Feb 5, 2019

@fvasco I think this one should be prohibited. I mean compile time error. In my variant the function will not pass the checking stage.
@chuckjaz I am not sure that type alias is a good idea. I do not know how typealiases are resolved by compiler, but using them for this resolution could bring additional level of complexity. What will happen if ones switches order of with in example?

@fvasco

This comment has been minimized.

Copy link

fvasco commented Feb 5, 2019

@altavir I agree with you, this is a missing point in the documentation.

If we consider the this scopes as a list then no permutation should be allowed (because the position is relevant, see "Matching rules"), otherwise if we consider the this scopes as a set then no duplication should be allowed (because multiple permutation matches).

Moreover in my opinion type alias is a bad solution, Kotlin already supports label so I consider a better approach:

fun A@Int.B@Int.sum() =this@A + this@B
@altavir

This comment has been minimized.

Copy link

altavir commented Feb 5, 2019

I also would like to remind everyone, that we want the feature to be gradually integrated in the language. That means restrict first, permit later approach. We need to restrict anything that could pose a problem in first implementation and think about relaxing the restriction.

And again, I do not like the dot syntax. It is confusing. It supposes that we have a fixed order, but we agreed that order of receivers should not matter. Also it will be really difficult to work with function types (you should also remember that types could have parameter).

By the way, if we are talking about parameters, we need to understand if two types with the same base and different parameter are allowed. I think that it should be restricted as well meaning that [Box<A>, Box<B>] (or Box<A>.Box<B> in dot notation) should be prohibited as well.

@chuckjaz

This comment has been minimized.

Copy link
Author

chuckjaz commented Feb 5, 2019

@chuckjaz I am not sure that type alias is a good idea. I do not know how typealiases are resolved by compiler, but using them for this resolution could bring additional level of complexity.

The this scopes would exist, using a label just disambiguates them allowing you to access a possibly obscured member or the instance directly as in the example.

Currently the simple name of the type is allowable as a label such as in the case of referring to the outer class in a nested class.

class A {
  var foo = 1
  inner class B {
    var foo = 2
    fun addFoos() = this@A.foo + foo
  }
}

My proposal uses something similar to disambiguate the receivers just as the above.

What will happen if ones switches order of with in example?

The order of the receiver parameters change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment