Skip to content
This repository has been archived by the owner on Apr 13, 2023. It is now read-only.

higher order generics #3860

Open
CeylonMigrationBot opened this issue Aug 16, 2013 · 97 comments
Open

higher order generics #3860

CeylonMigrationBot opened this issue Aug 16, 2013 · 97 comments

Comments

@CeylonMigrationBot
Copy link

[@gavinking] We've often discussed this feature, type constructor parameterization, or kinds, or whatever you want to call it, which boils down to, by analogy with first-class/higher order functions at the value level, adding:

  • first-class type constructors, and
  • higher-order generic declarations.

That is, I can define generic types and functions which are parameterized not only by types, but also by type constructors. And I can pass a reference to a type constructor as an argument to such a generic type or function.

I now have an implementation of higher order generics in the typeconstructors branch of the typechecker project. An outline of my approach:

  • Type constructors are represented as types within the type system, just like functions are represented as values. This is much less significant than it sounds, since type constructors have no meaningful subtype relationships with other types, but it does let us avoid introducing a whole new level of the type system. In particular, it means that a type parameter that accepts type constructors is still just a type parameter, not some new kind of thing.

  • The syntax @List produces a reference to a type constructor for a generic type declaration. In principle, we could live without the @, but that would make it awfully difficult to distinguish type constructors for ordinary unparameterized types, so I think the @ is a good thing. Reiterating what I said above, @List is considered a type. It's not, however, meaningful to ask what its values are. Nor does it have any interesting subtype relationships with other types. (Indeed, @List might not even be a subtype of Anything.)

  • A generic declaration may have a parameter that accepts type constructors. The syntax is:

     interface Functor<@Fun> given Fun<Element>
    

    The @ is a visual indication that the parameter expects a type constructor, and the given clause defines the signature required. We could simplify the syntax to interface Functor<Fun<Element>>, but I find this a lot less self-explaining.

  • Within the body of such a parameterized declaration, the type parameter is a parameterized type like any other parameterized type. I can write, for example:

    interface Functor<@Fun> given Fun<Element> {
        shared formal Fun<Out> map<In,Out>(Out apply(In a))
                                          (Fun<In> inFun);
    }
    
  • Finally, I can instantiate the generic declaration by giving it a conforming type constructor:

    object listFunctor satisfies Functor<@List> { ... }
    
  • Now, with just this much machinery I can write code that abstracts over functors, for example:

    Fun<String> toString<@Fun>(Functor<@Fun> functor)
                              (Fun<Object> inFun) 
                    given Fun<Element> 
            => functor.map(Object.string)(inFun);
    

There's one big limitation remaining in the current implementation: I don't have type constructor arg inference, so I have to write shit like this:

List<String>(List<Object>) listToString = toString<@List>(listFunctor);

Instead of just:

List<String>(List<Object>) listToString = toString(listFunctor);

I assume that this problem is tractable.

Otherwise, the implementation was surprisingly easy and straightforward though surely there are still a couple of holes.

Now, the only reason I've even bothered playing around with this shit at this stage is that we have an actual use for it in the metamodel API. We would like to be able to have typesafe model objects for generic declarations. That is, it would be nice to be able to write:

GenericClass<@List> listGenClass = `@List`;
Class<List<String>> stringListClass = listGenClass(`String`);

And have that be all totally typesafe.

UPDATE: Note, we no longer propose to use this stuff for the metamodel!

What I'm not proposing at all is to go and make Iterable and its subtypes into Functors. It would be possible to do this someday, if we decide it's truly useful, but I definitely don't want to do it in Ceylon 1.x.

So, we need to decide if we're going to make this a part of the language. There are strong arguments on both sides, which I'll let others advocate, because I simply can't seem to make up my mind.

[Migrated from ceylon/ceylon-spec#754]

@CeylonMigrationBot
Copy link
Author

[@gavinking]

Now, the only reason I've even bothered playing around with this shit at this stage is that we have an actual use for it in the metamodel API. We would like to be able to have typesafe model objects for generic declarations.

Now for the bad news: unfortunately, with the machinery that is there now, I can't yet write a totally generic GenericClass interface. I would need Scala-style GenericClass1, GenericClass2, GenericClass3 up to the blessed GenericClass22, beyond which point are monsters. I think you all know how I feel about that approach ;)

So we would need a way to abstract over type constructors with diff-length type parameter lists. (Analogously to how Callable uses a tuple type to abstract over diff-length value parameter lists.) I don't know how to do this yet.

@CeylonMigrationBot
Copy link
Author

[@gavinking]

I don't know how to do this yet.

So I think there's two related solutions that might work here. The first would be to introduce a type-level spread operator * that spreads a tuple type over the type parameters of a type constructor, just like how our existing spread operator spreads a tuple over value parameters:

shared interface GenericClass<@TheClass, Bounds> 
        given Bounds satisfies Anything[] 
        given TheClass<*Bounds> {
    shared formal Class<TheClass<*Args>> apply<Args>() 
            given Args satisfies Bounds;
}

An alternative approach would be to define that @Map is a type constructor in "cartesian product" form, i.e. a function that accepts a tuple type like [String,Object] and produces the corresponding instantiation, so @Map<[String,Object]> is the type Map<String,Object>. That is, we suck the definition of spreading inside the definition of this type constructor:

shared interface GenericClass<@TheClass, Bounds> 
        given Bounds satisfies Anything[] 
        given @TheClass<Bounds> {
    shared formal Class<@TheClass<Args>> apply<Args>() 
            given Args satisfies Bounds;
}

If we went down that path, it might be interesting to define the same thing at the value level, i.e. that @plus<Float> is a function of type Float([Float,Float]). The @ syntax would become some kind of dual of the spread operator *.

Of the two approaches the second is probably slightly simpler to define/implement, but I suppose it's also a whole new syntactic thing that we don't currently have at the value level. So I guess the type-level spread operator is the smallest change we could make to the language that would let us solve the problem of defining GenericClass.

P.S. Note that there is a new kind of type constraint in the above code examples:

given TheClass<*Bounds>

or:

given @TheClass<Bounds>

says that the types in the tuple type Bounds must be assignable to the type parameters of TheClass. So

GenericClass<@Map,[String,Object]>>

is a well-formed type expression, but these type expressions are not:

GenericClass<@Map,[Object]>>            //error: wrong number of type args to Map
GenericClass<@Map,[String,Anything]>>   //error: type arg Anything not assignable to upper bound on Map.Item

@CeylonMigrationBot
Copy link
Author

[@FroMage] Syntactically and intuitively I'd say I prefer to be able to spread type arguments over the second alternative, but I fear the consequences that could have…

@CeylonMigrationBot
Copy link
Author

[@gavinking] Note: the current implementation of higher-order generics has the following "hole":

T<String,Anything> bad<@T>() given T<X,Y> => nothing;

value val = bad<@Map>(); //val has illegal type Map<String,Anything>

However, in light of #3641, I guess this is not actually a real problem, or at least it won't be if/when we address #3641.

@CeylonMigrationBot
Copy link
Author

[@simonthum] At the possible expense of looking stupid: What does this whole thing buy you over a ConstructingCallable< ...> extends Callable<...> which every type class implements and maps to its initalizer?

@CeylonMigrationBot
Copy link
Author

[@gavinking] @simonthum without higher-order generics, it's impossible to define a GenericClass type that acts as the metamodel object for parameterized classes, with an apply() operation that accepts a list of type arguments and produces the Class metamodel object for the instantiation of the generic type.

@CeylonMigrationBot
Copy link
Author

[@gavinking] Hrm, with the machinery developed so far, I guess it's still impossible to represent generic functions at the meta level. To solve this problem we would also need something like "anonymous" type constructors.

For example, the metatype of the function plus is:

GenericFunction<<T>=>T,<T>=>[T,T],[Summable<Anything>]>

Where GenericFunction is defined like this:

shared interface GenericFunction<@Type,@Arguments,Bounds> 
        given Bounds satisfies Anything[]
        given Type<*Bounds>
        given Arguments<*Bounds> {
    shared Function<Type<*Args>,Arguments<*Args>> apply<Args>() 
            given Args satisfies Bounds;
}

Interestingly, if we solved this problem, we might also be able to assign a denoteable type to generic function references. That is, a function reference like sum, which is currently illegal (we force you to write sum<Float>) could have the type GenericCallable<<T>=>T,<T>=>[T,T]>. I'm not sure if that quite qualifies as higher rank polymorphism, but it's definitely close, if I've got my definitions straight.

All this is technically very tractable, but I'm increasingly inclined to think it simply doesn't belong in this language.

@CeylonMigrationBot
Copy link
Author

[@gavinking] OTOH, since arbitrary "anonymous" type constructors are going to let you write down undecidable things really easily, the other alternative would be to have the type constructors be derived from the signature of the method via a special syntax, for example:

GenericFunction<@plus,plus(@),[Summable<Anything>]>

Where @plus is the return type constructor, i.e. <Value>=>Value here in this case, and plus(@) is the argument types constructor, i.e. <Value>=>[Value,Value] in this case.

Or whatever. It's very difficult to imagine a syntax for this which would be anything other than totally baffling to most programmers.

@CeylonMigrationBot
Copy link
Author

[@gavinking] After extensive thinking on this topic, I think I've decided that there us no real value to the typesafe apply() method. If you have a GenericClass<@T> and want to apply it to the types X and Y, you can just as easily write:

 `T<X,Y>`

To get the Class<T<X,Y>> you need. The same is true for generic functions. So I think the whole metamodel rationale for introducing higher order generics is a distraction, and we won't need all that crazy stuff I've been speculating about above.

Of course, the question remains: do we need "basic" type constructor polymorphism, as already implemented today, for some other reason?

@CeylonMigrationBot
Copy link
Author

[@gavinking] So if we ever do decide to add this stuff, we will have to decide what to do with type aliases. Given:

alias Labeled<Value> => String->Value;
alias Numbered<Value> => Integer->Value;
alias E<Key,Value> => Key->Value;

We would need to either:

  1. prohibit type constructor references for aliases like @Labeled, @Numbered, and @E, or
  2. be able to reason about their equivalence, and determine that @E==@Entry, but @Labeled!=@Numbered and @Labeled!=@Entry.

Surely the easy approach 1 would be reasonable, but it would rule out some genuinely useful things.

But what's interesting about approach 2 is that if we went down that path, it's simply not a big stretch at all to "anonymous" type constructors like this:

<Value> => String->Value

This is, quite clearly, the same type constructor as @Labeled above.

Now this is clearly verging on rank-2 polymorphism! You could write:

shared interface GenericCallable2<@Type, @Args> 
        given Type<X,Y> given Args<X,Y> {
    shared Callable<Type<X,Y>,Args<X,Y>> apply<X,Y>();
}

to represent functions with two unbounded type parameters. And then the type of a reference to this function:

List<X|Y> fun<X,Y>(List<X> xs, List<Y> ys) 

would be:

GenericCallable2<<X,Y>=>List<X|Y>,<X,Y>=>[List<X>,List<Y>]>

Of course there's still a couple of things missing here:

  1. the ability to abstract over the arity of the type parameter list, and
  2. the ability to capture bounds.

So I'm not actually seriously proposing to try and tackle rank-2 polymorphism, just trying to get some insight by showing how it fits in.

@CeylonMigrationBot
Copy link
Author

[@RossTate] An update: with shapes, we figured out we can do subtyping and joins even with higher-order generics with type lambdas. The question becomes, then, do you want higher-order type arguments to be inferred? I don't know how to solve that problem, unfortunately.

@CeylonMigrationBot
Copy link
Author

[@gavinking]

higher-order generics with type lambdas

A "type lambda" is my "anonymous type constructor", right?

do you want higher-order type arguments to be inferred?

I'm inclined to think that higher-order generics is not very useful without some sort of type constructor inference. I don't believe that people want to write code with lots of type constructor arguments in it. Hell, one of the major things that our type system tries to move away from is the need to write type arguments all over the place...

@CeylonMigrationBot
Copy link
Author

[@gavinking] P.S. Ross please note that while I do think that it's very interesting to speculate about stuff like:

 GenericCallable2<<X,Y>=>List<X|Y>,<X,Y>=>[List<X>,List<Y>]>

I believe that there is almost nobody out there who actually wants to see something like this in real code. If I were to go about designing a language with the deliberate goal of scaring off newbies, I would make damn sure to include the above in it ;-)

@CeylonMigrationBot
Copy link
Author

[@RossTate] Heh, I ain't disagreeing. Just updating you with the facts (and yes, anonymous type constructors = type lambdas). As for inference, the only way I can see that working is if we disallow type lambdas and somehow declare what generic classes/interfaces can be used as higher-order generics and with respect to which parameters.

@CeylonMigrationBot
Copy link
Author

[@gavinking] @RossTate Hrm, yeah, 5 mins thinking about the problem makes me realize that, since the union/intersection tricks aren't applicable here, you can pretty easily get ambiguities. I guess this just doesn't arise with ML-like languages because they don't have subtyping. How the hell does Scala handle this problem?

@CeylonMigrationBot
Copy link
Author

[@RossTate] Admittedly, I'm not sure how Scala handles the problem cuz it doesn't seem to be well spec'd. I have seen examples with type lambdas (though sometimes indirectly encoded since they seem to be a feature in progress), but I haven't seen anything with higher-kinded-type-argument inference. Regardless, Scala has explicitly stated that it doesn't worry about decidability, so I doubt that their approach to this would really fit with your design philosophy.

@CeylonMigrationBot
Copy link
Author

[@gavinking] Is it covered by the Scala spec?

@CeylonMigrationBot
Copy link
Author

[@RossTate] Not from what I could tell.

@CeylonMigrationBot
Copy link
Author

[@gavinking] Hrmph. Now that's definitely not a good sign.

@CeylonMigrationBot
Copy link
Author

[@pthariensflame] FWIW, there is a reason even full GHC-extended Haskell doesn't have (unrestricted) type lambdas; they guarantee a Turing-complete type system, even just on their own, unless they are somehow kind-checked in a more sophisticated than usual way (see, e.g., Agda for an example of this latter type of technique).

@CeylonMigrationBot
Copy link
Author

[@gavinking] Hrm so does that mean parameterization by type alias constructors would also tip you over the edge into turing-completeness? It would, right?

@CeylonMigrationBot
Copy link
Author

[@pthariensflame] Yup. Type aliases as fully-fledged type constructors (as opposed to something like GHC's LiberalTypeSynonyms system) are exactly equivalent to (in fact, in essence, they are) unrestricted type lambdas.

@CeylonMigrationBot
Copy link
Author

[@gavinking]
@pthariensflame So I had understood @RossTate's slightly cryptic comment above:

An update: with shapes, we figured out we can do subtyping and joins even with higher-order generics with type lambdas.

To be suggesting that with his shapes vs types system, type lambdas turn out to be decidable.

Ross, could you clarify?

@CeylonMigrationBot
Copy link
Author

[@pthariensflame] I can't find anything on these "shapes" by searching (probably because it's such a generic name :) ). I'll wait for @RossTate to clarify, as well. In the meantime, it's worth noting that there is a very simple restriction on type lambdas that allows decidability (I referred to it earlier as "GHC's LiberalTypeSynonyms system"): A type alias may only be used without parameters if it is being passed to another type alias. The formal way to state this restriction is that an injective higher-order type constructor may only take other injective type constructors as its higher-order parameters (in Ceylon, this is especially convenient to interpret if you mentally replace "injective" with "reified"; they're not really the same underlying concept at all, but, within Ceylon specifically, all reified type constructors are guaranteed, by definition, to be injective, and the only potentially injective non-reified type constructors in Ceylon are, I think, undecidably so).

@CeylonMigrationBot
Copy link
Author

[@RossTate] The shapes concept (currently in submission but not yet published, though informal discussion can be found in #3922) significantly restricts recursion in inheritance. This effectively restricts the computational power of type lambdas enough to ensure that they always terminate.

@CeylonMigrationBot
Copy link
Author

[@pthariensflame] @RossTate So it seems like a "shape" is just the OO-ish incarnation of a Constraint (á la modern GHC-extended Haskell, or my own (WIP) language Spellcode). Is that an accurate assessment?

@CeylonMigrationBot
Copy link
Author

[@RossTate] Technical differences aside, the OO-ish incarnation of a Haskell's constraints is bounded polymorphism (which is actually the much older concept). They both restrict a type parameter and with that restriction provide additional abilities to that type parameter. What shapes are is the recognition that some classes/interfaces are only used for constraints, and furthermore essentially all uses of recursive inheritance are only for the sake of constraints.

@CeylonMigrationBot
Copy link
Author

[@pthariensflame] Ok, I think I get it, then. In that case, why implement it as complex restriction at all? Why not have a separate declaration type with an appropriate initial keyword (shape or bound or something) that would handle the whole pattern for you automatically, and then just lock all the other declaration types away from any type-level recursion at all?

@CeylonMigrationBot
Copy link
Author

[@RossTate] That can be done as well. It's just easier to retrofit a restriction onto existing code than impose a change in syntax.

@CeylonMigrationBot
Copy link
Author

[@gavinking] I recently updated this work to current master, it is available in the branch typeconstructors2. To recap the limitations of the current implementation:

  1. No type constructor inference (we have basically no clue how that should work).
  2. No upper bounds on type constructors.
  3. No upper bounds on type constructor type parameters.

For 2/3, what I should be able to do is write stuff like:

void fun<@X>(X<String> strings) 
        given X<T> satisfies Iterable<T> 
        given T satisfies Object {
    for (s in strings) {}
}

And those type constraints should be checked at the call side.

@CeylonMigrationBot
Copy link
Author

[@gavinking]

It would now be a matter of totally trivial syntax sugar to support "type lambdas"

We now have them.

But, even better, over the last few days I repurposed the ability to reason about type constructors to also solve the problem of arbitrary-rank polymorphism. Now, that may not be decidable—Waiting for @RossTate to chime in—so I might need to put in restrictions to limit it to rank-2 polymorphism. But in principle the typechecker now seems comfortable reasoning about arbitrary references to generic functions.

This new facility provides:

  • anonymous type constructors, as above,
  • a type system for references to generic functions, where the type of a generic function ref is a type constructor for a function type,
  • the ability to apply a generic function reference by supplying type arguments, or invoke it be providing value arguments (and inferring the type args!), and
  • anonymous generic functions.

For example, we can write:

<E> => Singleton<E>(E) makeSingleton = Singleton;
Singleton<String>(String) makeStringleton = makeSingleton<String>;
Singleton<String> stringleton = makeSingleton("hello");

Since the type of a generic function can involve bounds on the type parameters of the function, anonymous type constructors can include bounds in their types, for example:

<V> given V satisfies Summable<V> => V(V,V) add = plus;

Note that any code which uses these "anonymous" type constructors or anonymous generic functions can be refactored to use type aliases and ordinary functions.

alias BinaryOp<V> given V satisfies Summable<V> => V(V,V);
BinaryOp add = plus;

@CeylonMigrationBot
Copy link
Author

[@RossTate] @gavinking, can you clarify what change it is you want me to chime in on?

@CeylonMigrationBot
Copy link
Author

[@gavinking] @RossTate Whether we need to prevent rank > 2, because of undecidability.

@CeylonMigrationBot
Copy link
Author

[@RossTate] Oh, we don't.

Though all this is conditioned on not considering <V> => V(V,V) to be a subtype of Float(Float,Float).

@CeylonMigrationBot
Copy link
Author

[@gavinking] @RossTate type constructor types are never subtypes of ordinary types, indeed, at least right now, <V> => V(V,V) is not even an Anything. (That's probably wrong and I should probably make it a subtype of Object. I just forgot to do that.)

@CeylonMigrationBot
Copy link
Author

[@gavinking]

Oh, we don't.

This is actually what I figured. So we support full rank-N polymorphism, even though we don't know of any use for rank>2 ;-)

@CeylonMigrationBot
Copy link
Author

[@RossTate] Sorry, I got mixed up by the notation. Is <V> => V(V,V) a type lambda or a universally quantified type? They're different things and you seem to be using it as both.

@CeylonMigrationBot
Copy link
Author

[@gavinking] It is both.

@CeylonMigrationBot
Copy link
Author

[@gavinking] It is an anonymous type constructor, and it is also the type of a reference to a generic function.

@CeylonMigrationBot
Copy link
Author

[@RossTate] Note to self:

What Gavin's doing is making κ -> * a subkind of *, using ∀ as the conversion. The reason this is probably unambiguous is that in this system all proofs of subkinding are unique, indicating that all ways to convert a type from one kind to another produce the same result.

@CeylonMigrationBot
Copy link
Author

[@luolong] Note to self: that soundbite from @RossTate sounds like a true gobbledygook to me ;)

@gavinking
Copy link
Contributor

Note that something that came up in #7303 is that we don't allow type argument lists in indirect invocations, meaning that stuff like this doesn't compile:

T foobar<T>(T t) => t;
print((foobar)<String>("hello")); //note the parens

One has to write:

T foobar<T>(T t) => t;
print(foobar<String>("hello")); //no parens

I don't see this as a real limitation of the language, and I suspect that it might be rather difficult, if not impossible, to solve the grammar-related problems with unambiguously parsing (foobar)<String>.

So I don't see it as any sort of priority to make this work. AFAICT it's always possible to call a generic function reference directly.

@RossTate
Copy link

Oh this is where I’d seen this idea before! Excellent timing. Apparently Julia does something similar but opposite: it implicitly converts type-lambdas (I.e. higher kinded types) into normal types by using existential quantification, as opposed to using universal quantification as in what @gavinking was describing.

@gavinking
Copy link
Contributor

gavinking commented Jan 21, 2018

@RossTate Right, so I guess the intuition here is that:

  • the set of type functions of form <T> => Whatever<T> is isomorphic to
  • the set of universally quantified types of form forall <T> => Whatever<T>, and also to
  • the set of existentially quantified types of form exists <T> => Whatever<T>.

And that implies, in particular, that forall and exists are well-defined type functions! I had not realized this until now. Not only that, they're bijections, it seems to me, so the type functions exists ◦ forall-1 and forall ◦ exists-1 are well-defined bijections on the domain of types, allowing one to get an existentially-quantified type given a universally-quantified tyoe and vice-versa. (Not sure if that's a useful thing, but it's interesting to me.)

However, in Ceylon, we didn't make use of the existential side of this isomorphism (so far) and, recognizing that type functions don't have any "instances", we felt comfortable identifying the type function with its universal quantification as the "same" type. However, I feel like there's some additional thing which makes this a reasonable thing to do, but I can't quite put my finger on what it is. And I'm not sure that this additional thing is true for existential quantification.

@gavinking
Copy link
Contributor

recognizing that type functions don't have any "instances"

Or, more strictly, recognizing that they are kinds. And then identifying that kind with the associated universally quantified type.

For some reason I have an intuition that there's some asymmetry here and it's not as correct (or as useful) to identify the kind with it's corresponding existentially qualified type. But perhaps that's rubbish and I'm just not used to the idea.

WDYT, @RossTate?

@gavinking
Copy link
Contributor

For some reason I have an intuition that there's some asymmetry here

Ah, right, the asymmetry arises when one considers higher rank generics, and that the type function Map (defined by Map<Key,Item>) and the function reference Map (to Map<Key,Item>(Key key, Item item) could have the same type <K,I> => Map<K,I>. I don't think there's any equivalent useful thing that pops up when dealing with existential types.

@RossTate
Copy link

The "some additional thing" is that, for every thing that you could do with both a higher-kinded type and a simple-kinded type, doing that thing to a higher-kinded type and then implicitly converting it to a simple-kinded type always results in the same thing as first implictly converting that higher-kinded type to a simple-kinded type and then doing that thing to the simple-kinded type.

All that is fancy for "it doesn't matter where you insert necessary casts because you'll always arrive at the same result".

So what works and doesn't work depends on all the constructs in the language. For Ceylon I'd say universal quantification makes more sense, but for Julia apparently existential quantification works out well.

Unrelated: the function reference Map has type Forall K, I. Map<K,I>(K,I), not just Forall K, I. Map<K,I>.

@gavinking
Copy link
Contributor

Unrelated: the function reference Map has type Forall K, I. Map<K,I>(K,I), not just Forall K, I. Map<K,I>.

Yes, sorry, of course.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants