Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: Go 2: blank types as an alternative to generics #21132

Closed
metakeule opened this issue Jul 23, 2017 · 58 comments
Closed

proposal: Go 2: blank types as an alternative to generics #21132

metakeule opened this issue Jul 23, 2017 · 58 comments
Labels
FrozenDueToAge generics Issue is related to generics LanguageChange Suggested changes to the Go language Proposal v2 An incompatible library change
Milestone

Comments

@metakeule
Copy link

metakeule commented Jul 23, 2017

Reasoning

It has been said before, that adding generics would make Go (both the language and the Compiler) more complex and slower compiling.

The compilation of a function that deals with a generic type would need to be redone for each concrete type the function is used with (including non builtin types) to allow proper allocation etc. That would slow down the compilation of a large code base with many reuses of generic functions a lot. A main selling point for Go is the fast compilation speed which is partly due to incremental compilation. In the case of generics, incremental compilation is not possible, since the concrete type is only known where the generic function is used, not where it is defined.

This proposal goes another way to achieve the same that generics provide: Type safety for functions that deal with types that are unknown in the definition of the function.

But it approaches it a different way: Instead of adding new kinds of "real" types or "template" types, the here called "blank types" are just hints to the compiler to allow it to match them by comparison of their names (within defined scopes) at compile time.

At runtime however they are simply passed as interface{} values wrapping the real values since the type soundness already has been proven. In order to achieve that, some constraints have to be made, so that blank types don't "leak" into the (real) type system.

Blank types would help for some of the use cases that generics are used for, without adding the above mentioned complexity and overhead of generics.

With the given restrictions only the cases are allowed where the blank type does not
affect the generated code and thus eliminate the need to generate multiple versions of a function.

(With the experimental additions 1 and 2 feature parity with generics could be reached.)

Non goals

It has been expected that this proposal would allow to create functions that deal with some subsets of types, like e.g. Add(a,b, Number) Number. However the author thinks that this requirements is better served by union types (in combination with type switches) which also would make a nice addition to Go2, but which is an orthogonal feature that would complement this proposal in a nice way.

Definition

Lacking a better name, we define a blank type as follows:

  • it is a placeholder for a type, represented by a dollar sign followed by a single uppercase letter (e.g. $A, $B etc) (if a better representation could be found, this could change)

  • the name of a blank type is its upcase letter after the "$"

  • it is just checked at compilation time. At runtime its value is passed around like empty interface{} wrapped values since the compiler did already prove type soundness.

  • it may not be used inside a type definition (e.g. type GenericSlice []$A is not allowed, also type $C string is not allowed).

  • A blank type and a "variable" of a blank type, may not be used outside a function.

  • A function using blank types is called a "blank function".

  • The blank type itself may only appear inside the function signature of the blank function. It is not possible to declare a new variable based on a blank type (or a new composed blank type) inside the function body. However "variables" of a blank type may be used as parameters and return values of builtin generic functions (see section "Compilation").

  • interface types might not be passed as arguments ending up as blank type parameters.

Example

func Append(a $A, x []$A) ([]$A) {
        return append(x,a)
}

func main() {
  Append(4, []int{}) // allowed
  Append(4, []interface{}{}) // forbidden
  Append(interface{}{4}, []interface{}{}) // forbidden
}
  • The blank type itself has a lexical scope that is starting with the blank function receiving parameters, continues through the function body and ends with function returning parameters. Example
    func Add1(key string, val $A, m map[string]$A) {
        m[key] = val
        return
    }
    
    func Add2(a $A, x []$A) ([]$A) {
        return append(x,a)
    }

here the blank type $A of Add1 and Add2 are two different blank types, each only valid within the corresponding function.

  • A blank type is a pure lexical feature, i.e. the compiler proves that it is lexically sound. There is no runtime check nor any need to keep track of a blank type within the AST (where there are the same concrete types as before).

  • A blank function might return a normal type or a blank type or any combination of them.

  • A returned blank type might be used as an input to another blank function, but only if the corresponding argument of the receiving function is also a blank type.

  • At runtime, a returned type will be unboxed (from interface{}) to its concrete type if the receiving variable is no blank type (the soundness had already been proven by the compiler).

  • A blank type that is going to be returned is only allowed, if it came to the function as an input argument , example:

func Something(key $A) ($B)

this blank function signature is illegal, since $B does not appear within the receiving arguments.

func Get(i int, []$A) ($A)

this blank function signature is legal

Composition

Blank types have restrictions on composition.

The following composed blank types are possbile arguments to a blank function:

chan $A                   channel of a blank type
[]$A                           slice of a blank type
[3]$A                          array of a blank type
map[T]$A                  map of a concrete type to a blank type (T is any concrete type that is allowed as a key in a map)
                                 map keys of a blank type are not allowed since not every type 
                                 in Go maybe a key to a map.
*$A                            a pointer to a blank type
func([]$A,*$B,string) map[string]$A   a callback function that is a blank function

These are not valid / not allowed:

struct{ Something $A }  a struct containing a blank type
struct{ $A }            a struct embedding any type
[][]$A                  nesting composed blank types inside slices
[3][4]$A                nesting composed blank types inside arrays
map[string][]$A         nesting composed blank types inside maps
map[$A]string                  map of a blank type to a concrete type 

So there is no recursion inside the definition of a blank type and a composed blank type, with the exception of blank functions that may be used as callbacks to blank functions.

Compilation

The lexer/compiler checks at three points:

  1. When a blank function is compiled.
  2. When a blank function is called.
  3. When a blank function returns.

For 2. and 3. the compiler just has to check, if the concrete types matches.
Example:

func Get(i int, a []$A) $A {
    return a[i]
}

func main() {
    var x = []string{"Hello", "World"}
    var y string = Get(2, x)
}

The compiler finds that $A must be a string (since main passes a []string to the []$A argument of Get). Then the returning type of Get is string and so must be the type of the received argument y inside main.

It has to be discussed, if shortcut assignment for the returned arguments should be allowed, since it puts some work on the compiler, so if this would be valid as well:

func main() {
    var x = []string{"Hello", "World"}
    y := Get(2, x) // y would be a string now
}

For 1. the compiler has to check the soundness of the blank function.
A blank function is sound if:

  1. the only kind of arguments it returns and receives are either concrete types, blank types of valid composed blank types,
  2. the name of any blank type that is part of returned arguments also is part of at least a received argument
  3. the body of the blank function is sound

The body of a blank function is sound if the function compiles when the only functions, blank types are passed to are either builtin generic functions (definition see below) or other blank types. Also only blank types may be used inside the function body that are defined in the signature of the received arguments.

The following pseudo code contains all valid expressions where a "variable of a blank type" (that is only true for the parser) may appear. All other expression may not "use a blank type". (see the compiler walkthrough to get more insight).

func (sth interface{}, a []$A, b [3]$B, c map[int]$A, ch chan $A) ($A, interface{}) {
   ch <- a[0] // generic channel send
   a[1] = <- ch  // generic channel receive
    _ := a[0]    // generic slice getter
    a[0] = a[1]  // generic slice setter
    _ := b[1]    // generic array getter
    b[0] = b[1]  // generic array setter
    a = append(a, a[1]) // append
    copy(a,a)           // copy
    len(a)              // length of slice
    cap(a)              // capacity of slice
    a[0] = c[0]     // generic map getter
    c[0] = a[0]    // generic map setter
    len(c)           // length of map
    _a := &a[2]      // generic reference
    *_a = a[3]       // generic dereference 
   y := a[0].(interface{})  // "cast" blank type to an interface. at runtime nothing happens because the concrete value of a is wrapped inside the empty interface{}. but it tells the compiler that the programmer really wants to pass the runtime value and not the blank type.
    return a[2], y      // generic return (y is just an example to show that the values of a blank type still can be passed around at runtime
}

When invoking the generic builtin functions, the blank type is treated by the compiler as if it was a actual type for the scope of the blank function. That means that only an "instance" of a blank type may be passed to a generic setter to set a composed blank type (or to append). Since there no other way to create an instance of a blank type, the only way to get them is from the received arguments, be it directly or through the use of a builtin generic getter.

Composed blank types may never be passed to generic setters.

The other way, blank types and composed blank types may be used, is to be passed to other blank functions. In this case, a blank type must correspond to a blank type in the receiving function and a composed blank type must correspond to a composed blank type of the same kind in the receiving function. Example:

func Append1(a $A, as []$A) ([]$A) {
  return append(as,a)
}

// the name of the blank type does not matter
func Append2(b1 $B, b2 $B, bs []$B) ([]$B) {
  // this is valid
  return Append1(b2, Append1(b1, bs))

  // this is not valid
  return Append1(Append1(b1, bs), b2)
}

Since there could be chains of blank functions it might be helpful for the compiler to generate a helper function for each blank function that returns the resulting concrete types from a concrete input arguments ("type calculation"). That would also help Experimental Addition1 and 2.

The other implications for the compiler, if the experimental additions were considered (which would result in full feature parity with generics) are pointed out in the sections Experimental Addition1 and 2.

Experimental Addition1

The following might be a way to expand. It is not sure if it will work out
(would need further investigation).

Interfaces may contain blank function methods or mixed them with concrete methods. Example

type Finder interface {
    Find(key string, target *$A) (found bool)
}

Also types may be defined on blank functions:

type finder func (key string, target *$A) (found bool)

Then could probably modify the definition of a blank function in the case of an anonymous function where the scope of the name of the blank types in the signature of the anonymous function will be the same as the outer scope, in which the anoymous function is defined, e.g.

func Length(a []$A) func(a []$A) int {
  return func (a []$A) int {
    len(a)
  }
}

func main() {
    var s = []string{"A", "B"}
    lenFn := Length([]string{})
    // s must be []string
    lenFn(s)
}

Here the $A in the signature of Length and in the signature of the anonymous function that is returned by Length is the same, i.e. must be of the same type. However it might be hard for the compiler to prove.

If we'd put these extensions together, we could probably build wrappers around composed blank types like this:

// implements Finder
func (f finder) Find(key string, target *$A) {
    return f(key,target)
}

// New returns a new Finder for the blank type $A which is implemented via the returned anonymous function
func NewFinder(m map[string]$A) Finder {
  return finder( func (key string, target *$A) bool {
    thing, found := m[key]
    if found {
        *target = thing
    }
    return found   
  })
} 

func main() {
    f := NewFinder(map[string]int{"one": 1})
    var val int
    // if val isn't an int we would get an compiler error here
    found := f.Find("one", &val)
}

Experimental Addition2

Again a thought experiment (like experimental addition1).

Lets take the changes of experimental Addition1 and take them a step further:

  • Allow for anonymous blank functions to be fields of structs.
  • Allow to cast a $A to an empty interface (interface{}), which must always succeed

Then we could wrap "generic" structs that deal with interface{} in a type safe way.

Example

package main

import (
 stdlist "container/list"
)

type List interface {
	PushFront($A)
        Front(*$A)
  ....
}

type list struct {
 stdlist *stdlist.List
 pushFront func ($A) // blank callback as field of struct
 front func (*$A)	
}

// now here is where the meat is
// the anonymous callback functions pushFront, front etc
// are defined in a way that makes sure type $A stays the same as when NewList was called 
func NewList(a $A) List {
	l := &list{stdlist: stdlist.New()}
	l.pushFront = func (a $A) {
             l.stdlist.PushFront(a.(interface{}))  // instance of $A casted to empty interface. 
                                                                // at runtime a is just a value wrapped inside 
                                                                // an empty interface and a.(interface{}) does 
                                                                // nothing. it is just there to make the 
                                                                // compiler happy proving type safety
	}
	l.front = func (a *$A) {
             el := l.stdlist.Front()
             *a = el.Value        // the fact that the underlying type of el.Value is  $A 
                                       // is guaranteed by the consistency of the blank callbacks 
                                      // usage of $A and by not exporting the underlying structure.
	}
}

func (l *list) Back(a *$A) {
	l.back(a)
}

func (l *list) Front(a *$A) {
	l.front(a)
}

For this to work without the compiler having to generate code for each type a blank function is used with, we need two things:

  1. Compiler must generate helper code that infers the concrete types of the callback functions from the outer function if this blank type is reused. In the example that would mean for the compiler to do:

    • If a anonymous callback is defined, check if there is a blank type in the signature matching the name of a blank type from the outer scope (the outer function, at the top level there is none) and
    • for each match create a small function that maps the blank type argument from the outer scope to the type of the callback.
      This small code which is independent from the concrete type must be exported (invisible just for the compiler), when compiling a library so that incremental compilation still works and when compiling a dependent package, this code will be used to figure out the concrete type for an instance of a blank callback function.
  2. Then we need the second thing: Have a new property for callback functions that are blank functions: The concrete types that have been set via the code that derived it from the outer scope. So while compiling, this code must be run and attached to the struct field that is the callback function so that the compiler can prove type matching where this callback functions are invoked.

That would probably be the hardest part of the implementation but it would allow full feature parity with generics without code generation for each applied type.

At runtime however all values that pretend to be blank types are de facto like empty interface{} wrapped values (since the compiler did already prove the type safety).

Walk through the compilation steps

The following is by no means the optimal/optimized way to do it, but it provides deeper insight how this proposal would affect compilation.

After lexing and identifying blank functions, the parser checks the scoping rules for all blank functions and calls a special (pre-)compiler that creates a helper function for each blank function. Let's call this helper functions 'type calculators'.

The type calculators just calculates a concrete type of a blank function from input types.

Example (the examples have no practical sense, they are just short to illustrate without distraction)

// this would be the blank function
func Get(key string, m map[string]$A) ($A) {
	return m[key]
}

// this could be the helper function
// again just for illustration, when invoked, non blank types as arguments would be
// ignored and the first blank type would be the first parameter to this function
// b1 would be reflect.TypeOf(map[string]interface{}) when _Get_A is called
func _Get_A(b1 reflect.Type) (t reflect.Type) {
	return b1.Elem()
}

Now let's define "parsedBlankFunc" as an intermediate type for the parser to help type checking when proceeding the invocation of a blank func. Such a parsedBlankFunc would consist of a regular func that is the same code as the blank function but with any blank types replaced by interface{}. Before this replacement takes place, the soundness of the body of the blank function is checked. This has to be done expression by expression while keeping track of the variable that are supposed to have a blank type. There are restriction in which expressions such variables may appear.
(see above). Only the following expressions are allowed:

   a := <- ch      //  generic channel receive; ch must be chan $A and a will be $A
   ch <- a         //  generic channel send; ch must be chan $A and a must be $A
    x := a[0]    // generic slice getter; a must be []$A, x will be $A
    a[0] = b    // generic slice setter; a must be []$A, b must be $A
    ... same for arrays
    a = append(a, b) // append; a must be []$A, b must be $A
    copy(a,b)           // copy; a and b must be []$A
    len(a)              // length of slice; a must be []$A
    cap(a)              // capacity of slice; a must be []$A
    a[x] = y     // generic map setter; a must be map[T]$A, where x is of type T and any of the allowed map keys and y must be $A
    c = a[x]    // generic map setter; a must be map[T]$A, where x is of type T and any of the allowed map keys and c will be $A
    len(c)           // length of map, c must be map[T]$A (as above)
    x := &a      // generic reference; a must be $A, x will be &$A
    *x = a       // generic dereference; x must be &$A and a must be $A 
   y := a.(interface{})  // "cast" blank type to an interface. a must be $A, y will be interface{}
    return a, y      //  return argument types must match the type signature (blank types must have the same "name" ($A for example).

Any other expression where a "blank variable" appears is invalid and leads to a parser error. If the body is valid, all appearances of blank types within the blank function (in the signature and the body) will be replaced with "interface{}" and the resulting function will be reparsed and this reparsed version will be attached to the parsedBlankFunc. (this reparsing step could take place at a later time, e.g. when the rest of the code is parsed).

In addition parsedBlankFunc has a map of type calculators for its blank types out of the given parameters and obviously it has a structure telling it where the blank types have been inside the type signature of the original blank func.

For the scoped blank callback we define a "parsedBlankCallback" as consisting of a parsedBlankFunc and a reference to all outer scoped parsedBlankFuncs and parsedBlankCallback that have blank types of the same name.

Now, after all type calculators, parsedBlankFunc and parsedBlankCallback have been created, the parser now parses the rest of the code tree.

While doing so, whenever a blank func would have been called, the resulting types are calculated via the type calculators for that parsedBlankFunc. When ever a blank callback would have been created, the corresponding parsedBlankCallback is provided with the arguments of each needed surrounding scoped parsedBlankFunc (or parsedBlankCallback) and asked to calculate the types based on the surrounding parameters and based on its own parameter and validate them against each other where the names are matching. Also the returned types are checked against the receiving variable type and code is injected to do the correct unboxing (the types have been validated).

Now the type checks are done, the AST has been build, and the compiler can proceed without having to know about blank types and blank functions at all.

To enable incremental compilation, the parsedBlankFunc and parsedBlankCallback and their type calculators would need to be exported to the build of a non-main package, so that the parser would not have to recreate them.

Compatibility with Go1

Currently dollar signs are not allowed as identifiers.
The change would only allow the new syntax, the old syntax would still work.

Use cases...

Map(), Filter(), Reduce() etc.

With Experimental Addition1 and Experimental Addition2 feature parity with generics could be reached by implementing generic untyped code based on interface{} and then wrap it as shown in Experimental Addition2. This however would make the compilation more complex.

Open questions

It is unclear if the builtin "generic" functions like append, copy etc. could be make to work with blank types, i.e. an append([]$A, $A) must behave like an append([]interface{}, interface{}) although the underlying slice would be []int, []string, whatever at runtime. The same goes for maps. Pretty much every builtin generic function will get some sort of (probably annotated) interface{} value at runtime, while having to deal with the underlying concrete type.

A solution could be to create a generic runtime internal unbox function that unboxes the passed value (coming in as an interface{}) to it's underlying type. This function could then be injected to the code when the parser prepares the parsedBlankFuncs.

Benefits

  • faster compilation times as generics, since incremental compilation is still possible
  • no code generation for each concrete type using a generic
  • no new kinds of types
  • fully backwards compatible

Disadvantage

  • more rules in the spec
  • need to figure out helpful compiler error messages
  • unusual way to write "generic" code

UPDATE: added generic type assertion as a way to deal with blank types and section about further exploration.

UPDATE2: added reasoning section and pointers to use cases

UPDATE3: change section "Further exploration" to "Experimental Addition1"

UPDATE4: add section "Experimental Addition2" and section "Benefits" and more examples

UPDATE5: disallow blank types as keys in a map

UPDATE6: explore more of the reasoning

UPDATE7: be more clear about return arguments of a blank function and keys in maps

UPDATE8: be more clear about unboxing and type calculations

UPDATE9: add open questions section

UPDATE10: removed generic type assertion

UPDATE11: fixed typos and disallowed interface types to be passed as blank types.

UPDATE12: add "Walk through the compilation steps" section

UPDATE13: be more clear about allowed and not allowed expressions containing blank types.

UPDATE14: tell where unboxing happens in compiler walkthrough

UPDATE15: add "non goals" section and idea of internal unbox function to "open questions"

UPDATE16: be explicit about blank types only appearing within function signatures and add chan $A as an allowed composed blank type.

@gopherbot gopherbot added this to the Proposal milestone Jul 23, 2017
@metakeule metakeule changed the title Proposal: Blank types instead of Generics for Go 2 Proposal: Blank types instead of generics for Go 2 Jul 23, 2017
@metakeule metakeule changed the title Proposal: Blank types instead of generics for Go 2 Proposal: Blank types as an alternative to generics for Go 2 Jul 23, 2017
@chmike
Copy link

chmike commented Jul 23, 2017

This presentation is missing to describe WHY this would be an interesting addition to Go. What class of problems can be solve with it ? And what class of problems can't be solved with it.

@Azareal
Copy link

Azareal commented Jul 23, 2017

I don't think it is obvious. I know what problems other generics proposals would solve, the first thing which comes to mind are data-structures. There aren't really any use cases which come to mind for this one.

@ghost
Copy link

ghost commented Jul 23, 2017

In returning arguments only blank types may appear that are also part of the receiving arguments

Is this legal?

func Length(x []$A) int { ... }

or, the more generally seen:

func Foldl(init int, func(ctr int, item $A) int, items []$A) int { ... }

IME, the two common use cases for generics are data structures like trees, as you point out, and the common array/list manipulation functions like foldl() and contains(). What's uncomfortable about this constraint is that some of these functions would be possible (filter(), and some cases of map()) while others are not (fold(), contains(), other cases of map()).

What implementation detail leads you to include this constraint?

@jacek99
Copy link

jacek99 commented Jul 23, 2017

I think this use case

it may not be used inside a type definition (e.g. type GenericMap map[$A]$B is not allowed, also type $C string is not allowed).

is EXACTLY what most people use generics for in other languages. So if it does not support it, I personally thing it's a no-go.

@itsmontoya
Copy link

With this proposal, I still wouldn't be able to avoid codegen for libraries like linked list

@ebergstedt
Copy link

ebergstedt commented Jul 23, 2017

it may not be used inside a type definition (e.g. type GenericMap map[$A]$B is not allowed, also type $C string is not allowed).

It's as if OP wants generics but he doesn't want generics.

Some major cognitive dissonance here. Either do it properly or not at all.

@bradfitz bradfitz added v2 An incompatible library change LanguageChange Suggested changes to the Go language labels Jul 23, 2017
@jacek99
Copy link

jacek99 commented Jul 23, 2017

Exactly. Worst case as an alternative consider reified generics only (like Java), but this seems like a proposal that really does not address the most common use case for generics.

@z505
Copy link

z505 commented Jul 23, 2017

Is one of the use cases of generics to have the exact same functionality as golang's "interfaces" but do the cpu cycles at compile time, instead of run time checks, for type checks and code that changes based on the type being passed in?

I have a hunch some people are skeptical of go's interfaces because they are a run time check when go is a static compile time language. But, you don't want a template/generic system to end up like a C++ mess.

One issue, is anything that uses <> or $ looks like c++ or perl, respectively: but that's not a technical issue, it's an aesthetic one.

For an interesting compile time, non object oriented generic system using a compiler preprocessor, I wrote this little riddle many years ago:
http://wiki.c2.com/?IncludeFileParametricPolymorphism

The same could be done with plain C since it has an include file system. The idea, was that it was a prototype, not meant for production, but to spread ideas on how to implement a non object oriented procedural/functional strongly typed generic/parametric polymorphism system that does not determine types at run time, but compile time.

It might be important to distinguish:

  • what exactly is the difference between generics and parametric polymorphism?
  • are they two of the same thing, but generics often adds functionality primarily to OOP (object orientation)
  • are they similar, but not quite exactly the same, and some ideas collide

I'd prefer a generic system that doesn't require one use an object or struct - i.e. some languages tie the generic system into classes or objects, which, IMO is absurd and ridiculous, since non OOP data passed into regular procedures should also be generic... not just classes/objects. But, maybe that's what parametric polymorphism is. This should be clarified, IMO, in computing science (maybe even by myself, in an official article if I am somehow able to find the answer).

@z505
Copy link

z505 commented Jul 23, 2017

And another use case, when looking for solutions generics solve, might be:

  • abuse of go's empty interface, and solutions to this abuse..

Just some things I have come across anyway. Is it a warning sign when you find code using an empty interface or lots of interfaces.. is a question one might ask

@metakeule
Copy link
Author

metakeule commented Jul 23, 2017

@serussell

Is this legal?

func Length(x []$A) int { ... }

yes this is legal

or, the more generally seen:

func Foldl(init int, func(ctr int, item $A) int, items []$A) int { ... }

yes this is legal

IME, the two common use cases for generics are data structures like trees, as you
point out, and the common array/list manipulation functions like foldl() and contains().
What's uncomfortable about this constraint is that some of these functions would be
possible (filter(), and some cases of map()) while others are not (fold(), contains(),
other cases of map()).

why do you think that map() fold() and contains() is not possible? Can you elaborate?

What implementation detail leads you to include this constraint?

Which constraint exactly?

@turnage
Copy link

turnage commented Jul 24, 2017

I feel that this half measure has comparable costs to generics without the benefits. A good generics implementation can make code smaller (e.g. generic data structures), simpler (e.g. result types, map/filter/reduce), clearer (e.g. interface/trait bounds on types), and safer (static guarantees on the types in all these contexts).

This tackles map and filter (and maybe reduce?) but since it's only lexical it can't provide generic data structures and can't restrict the types which fill in the blank (which invites some surprising error messages, unclear code, or code that compiles and seems correct but isn't). The presence of the feature would also make a real generics implementation more difficult.

@metakeule
Copy link
Author

metakeule commented Jul 24, 2017

@turnage

Finally someone who read the proposal carefully. Bravo!

I feel that this half measure has comparable costs to generics without the benefits.

I appreciate your honesty to express that this is your feeling and not a proven point.
The most other commentators don't have that level of self reflection.
So your feeling is fine. I have the feeling that allows enough so that generics could be avoided without adding much complexity to the compiler. However I would agree, the usage of this feature would be more complex than the usage of generics would.

That is a trade off I made based on the following ideas:

  • The compiler is by nature a complex beast that has to be optimized in different ways. Also in the case of Go, it should be kept as fast as possible. Generics would generate a lot of more work for the compiler (new types, generated code, complex optimizations) and make it really complex. A very important part that contributes to the speed of the Go compiler is that it compiles incrementally. That means, if package A depends on package B and B has been compiled already, no function inside B must be compiled
    again when compiling A. In the case of generics, package B can't know all needed concrete types for its generic functions in advance, so when A uses a generic function with a concrete type, the compiler has to compile the function in question for each concrete type that uses it. So if your code base uses a generic Map() function in 30 packages with 20 different types that same code would be compiled at least 20 times (more probably 30 times because it might be hard to keep track of already compile variants across packages).

The second point why I think this trade off is acceptable is my feeling, that having type independent functions (generics or blank functions) makes only sense if we have n uses of m type independent functions, where n is much larger than m. In other words: much code reuse. And with much code reuse the investment into the code that gets a lot of reuse will pay off even if it is more complex than the usual code.

A good generics implementation can make code smaller (e.g. generic data structures), > simpler (e.g. result types, map/filter/reduce), clearer (e.g. interface/trait bounds on
types), and safer (static guarantees on the types in all these contexts).

You've pointed out the main use case of this proposal (and from there we have all the examples), that are map/filter/reduce kind of functions.
This proposal would allow for a smaller code/simpler base in this cases and clearer code when it comes to the usage of a blank function. It also make this kind of code type safe (because it guarantees type identity, in contrast to interface{}).

This tackles map and filter (and maybe reduce?) but since it's only lexical it can't
provide generic data structures and can't restrict the types which fill in the blank
(which invites some surprising error messages, unclear code, or code that compiles
and seems correct but isn't).

The lack of restrictions is a valid point. However that could be provided by union types which is a feature orthogonal to blank types.

Can you give me an example, in which case you would expect "surprising error messages, unclear code or code that compiles and seems correct but isn't"?

Thanks for taking the time, trying to understand the proposal.

@metakeule
Copy link
Author

@ all

Updated the proposal to add ideas of how a full feature parity with generics could be reached.

@rogpeppe
Copy link
Contributor

What would the following program (legal according to your specification AFAICS) print?

package main
import "fmt"

func Identity(a $A) $A {
	return a
}

func main() {
	ident := Identity
	a := ident(56)
	b := ident(nil)
	fmt.Printf("%T %T %T\n", ident, a, b)
}

@metakeule
Copy link
Author

metakeule commented Jul 24, 2017

@rogpeppe

It would print

func(interface {}) interface {} int <nil>

Since at runtime a blank type is just an empty interface. (And the returning values are unboxed).

@metakeule
Copy link
Author

metakeule commented Jul 24, 2017

@rogpeppe

If we would forbid the assignment shortcut in Go2 (":=") which would be a different proposal, then it would probably be easier for the compiler to check the type soundness and infer the resulting type in your example.

@rogpeppe
Copy link
Contributor

rogpeppe commented Jul 24, 2017

OK, that's something your proposal doesn't specify (it would need to).

Next: what would this print?

func Set(a []$A, i int, x $A) {
	a[i] = x
}

func main() {
	a := make([]int, 3)
	Set(a, 1, 99)
	fmt.Printf("%T %v", Set, a)
}

@nv-vn
Copy link

nv-vn commented Jul 24, 2017

It has been said before, that adding generics would make Go (both the language and the Compiler) more complex and slower compiling.

Checking that "blank types" work in context is equivalent to typechecking generics. While having to generate and then link more functions would occur if you used template-style generics, this is effectively the same idea as just making generics equivalent to Object in Java.

This proposal goes another way to achieve the same that generics provide: Type safety for functions that deal with types that are unknown in the definition of the function.

Generics provide this, in addition to possible performance gains. Why throw those performance gains out the window if you're doing all the work to get there anyways?

At runtime however they are simply passed as interface{} values wrapping the real values since the type soundness already has been proven. In order to achieve that, some constraints have to be made, so that blank types don't "leak" into the (real) type system.

How is there a real and fake type system at play here? There is only one type system, and that's what's exposed to the user. If you write func f(a $A), you're doing typechecking for the type $A. The runtime type is separate from the type system -- obviously your runtime has no real notion of how a generic type works, because that information is irrelevant at runtime. However, the two simplest choices you have are either to compile to interface {} or to the type $A is being used as. One provides a runtime performance gain, while the other provides a slightly shorter compilation time. In terms of how the type system works, these two are identical.

A blank type is a pure lexical feature, i.e. the compiler proves that it is lexically sound. There is no runtime check nor any need to keep track of a blank type within the AST (where there are the same concrete types as before).

I think this should be "a pure semantic feature," since the compiler is proving that the semantics make sense -- lexical soundness would just mean that the syntax can be parsed pretty much. If it were purely lexical, it would effectively be a comment placed on an interface {} type with some syntactic sugar. Either way, types in general are a purely semantic feature. Runtime types are an "exception" to this rule if you must call it that, but Go as a language is statically typed making this irrelevant. As an example of something similar, Haskell typechecks polymorphic code statically, but most implementations will leave in some information to help it resolve ad-hoc polymorphism at runtime (i.e. typeclasses).

Blank types would help for some of the use cases that generics are used for, without adding the above mentioned complexity and overhead of generics.

Unless you don't consider typechecking to be part of the overhead of generics, this is 100% adding that overhead into Go.

To me it seems that the only difference between what you're doing and classic generics is that you're artificially limiting the feature to functions.

@chmike
Copy link

chmike commented Jul 25, 2017

Checking that a blank function is sound is not so simple. According to the operations performed on the blank typed values in the function, the allowed types would be restricted. For instance, calling a method or applying an arithmetic, logical or binary operation on it. It would not be a simple lexical sugar.

Another problem with this proposal is that the user would have to look at the function implementation to see what type can be used with the blank function. This makes developpment more painfull. There is much more to keep in (brain) memory to be able to use them. That is not good because it means more stress on the brain.

I'm still looking forward to see a real use case where the absence of generics in Go is a real problem.

@metakeule
Copy link
Author

@nv-vn

Thanks for your comment. It helped improving the proposal which resulted in the last updates.

Please read the new section "Walk through the compilation steps" that provides a deeper insight. I have made some small changes (UPDATE10 and 11) which were errors I discovered when trying to answer your questions.

The BNF would have rules for a blank type and a blank function where a blank function is any function that has a blank type in its type signature. Blank types only appear in type signatures (see UPDATE10) and carry no specific semantic meaning (no value) other than their name (is pratically like nil: always its own identity).

That's why a called it a lexical feature. Feel free to suggest a better wording.

However when it comes to the scoping rules, there is only one rule to check:
A blank type in the return parameter must come from input parameter of either the same function or some function surrounding it. This is something that can't be checked by the lexer and obviously has some semantic and context.

But I think it will become more clear, if you read the walkthrough.

Regarding runtime optimization / performance it should perform the same way as now if you are using the empty interface{} where you want the generalization (see container/list example). So compared to Go1 it would add type safety.

@metakeule
Copy link
Author

@chmike

I've added a compilation walkthrough to help understanding where there complexity for the compiler lies. Please check. (Also made some updates on the way, 10 and 11).

Concerning what you said about type restrictions, I think you misread it somehow:
It is practically the other way round: The parser checks the body of the blank function erroring out if a non legal application on a blank type is "performed" (since the blank type is just a placeholder there is nothing really performed on it, it is just an expression of an intent). The allowed expressions are

_ := a[0]    // generic slice getter
    a[0] = a[1]  // generic slice setter
    _ := b[1]    // generic array getter
    b[0] = b[1]  // generic array setter
    a = append(a, a[1]) // append
    copy(a,a)           // copy
    len(a)              // length of slice
    cap(a)              // capacity of slice
    a[0] = c[0]     // generic map getter
    c[0] = a[0]    // generic map setter
    len(c)           // length of map
    _a := &a[2]      // generic reference
    *_a = a[3]       // generic dereference 
   y := a[0].(interface{})  // "cast" blank type to an interface. at runtime nothing happens because the concrete value of a is wrapped inside the empty interface{}. but it tells the compiler that the programmer really wants to pass the runtime value and not the blank type.
    return a[2], y      // generic return (y is just an example to show that the values of a blank type still can be passed around at runtime

I probably failed to make clear that any other expression containing a "variable" of the "blank type" (which it is only while parsing, not while compiling) is not allowed and will result in a compilation error. I will update the proposal soon to make it more clear.

So arithmetic, logical or binary operations are not allowed on a parameter passed in as a blank type. As simple as that. I think you are heading to common functions for numbers of any kind. To enable that, I would suggest to add union types to Go2 (which is orthogonal to this proposal) and then use the restrictions of the union types in combination with a type switch to get that behavior.

I think now it should have become clear that the use has not to keep anything in his head to use the blank functions other than the signature and that there is no restriction what type can be used with it. It is just a restriction which expressions may appear inside the body of a blank function.

@metakeule
Copy link
Author

metakeule commented Jul 25, 2017

@rogpeppe

it would output the same as

package main

import (
	"fmt"
)


func Set(a interface{}, i int, x interface{}) {
	a.([]int)[i] = x.(int)
}

func main() {
	a := make([]int, 3)
	Set(a, 1, 99)
	fmt.Printf("%T %v", Set, a)
}

see the "Open questions" section which addresses similar aspects.
My answers are based on the runtime behavior. However it could make sense to keep track of the original type signature of a blank function for convenience and to print it out with fmt. That would be more of a cosmetic decision and does not affect the rest of the proposal.

@metakeule
Copy link
Author

@rogpeppe

Maybe a generic runtime internal "unbox" function could help with something like a.([]int) and x.(int) in the example above.

This unbox function would then be injected by the parser when preparing the parsedBlankFuncs.

@metakeule
Copy link
Author

metakeule commented Jul 25, 2017

@nv-vn

The real type system is the one that can be inspected at runtime. There is no blank type at runtime; it is just interface{}.

@go-li
Copy link

go-li commented Jul 26, 2017

* is not a kleene star

it is the same thing like *$A in your proposal. a generic pointer to arbitrary type

no i just have an old document from time when it was just an idea:
https://docs.google.com/document/d/1sOjHJY1uAN2plaxEgHUNeT7C1xDsfLmxCVcWpmOG2CQ

@metakeule
Copy link
Author

@go-li

If it is a generic pointer doesn't that mean that it is always on the heap?
Would lead to bad performance, if I understand correctly.

@rogpeppe
Copy link
Contributor

rogpeppe commented Jul 26, 2017

You can't declare temporary variables based on blank types, because any "instance" of a blank type must be passed to the function as an argument.

Where's that specified?

You should be able to send to a channel based on a blank type. Thanks for pointing me there - I missed it. Will add chan $A as a composed blank type. (done)

Are you saying that:

 func Foo(a $a) $a {
      var b $a
      b = a
      return b
  }

is not allowed but:

 func Foo(a $a) $a {
      c := make(chan $a, 1)
      c <- a
      return <-c
  }

is?

@metakeule
Copy link
Author

metakeule commented Jul 27, 2017

@rogpeppe

Where's that specified?

I implied it at some point, without being explicit (thanks for the hint). Did add the following while answering your question:

"The blank type itself may only appear inside the function signature of the blank function."

func Foo(a $a) $a {
      c := make(chan $a, 1)
      c <- a
      return <-c
  }

is not allowed, but the following is (principle is: "bring in your types"):

func Foo(a $A, c chan $A) $A {
      c <- a
      return <-c
  }

@rogpeppe
Copy link
Contributor

@metakeule
OK, so it's not possible to do anything with the parameters other than pass them to something else.
I can't make slices of them; I can't start a goroutine and pass the values into it; I can't put a bunch of them into a map. This really makes this proposal almost completely useless IMHO.

For example, your proposal wouldn't even allow the writing of a replacement for the append language built-in - something that's only built in because it's not possible to write a generic function to do it.

I think you should take this one back to the drawing board.

@metakeule
Copy link
Author

@rogpeppe

You can build type safe wrappers around generic types and functions dealing with empty interfaces. This way you can implement anything, although not at the same speed as the native type. I don't think this is pointless.

So an append build this way would be based on []interface and wrapped and would probably not be as fast as the builtin append.

But this proposal does not aim to replace existing builtin generic functions but to complement them.

@rogpeppe
Copy link
Contributor

rogpeppe commented Aug 7, 2017

@metakeule
"You can build type safe wrappers around generic types and functions dealing with empty interfaces."

We can do that currently. How does this proposal help?

@target-san
Copy link

Type erasure in fact? Java programmers already have trouble with it.

@metakeule
Copy link
Author

@rogpeppe

"You can build type safe wrappers around generic types and functions dealing with
empty interfaces."

We can do that currently. How does this proposal help?

It helps in that you don't need to build a type safe wrapper for each type, but just once for all types.

@rogpeppe
Copy link
Contributor

It helps in that you don't need to build a type safe wrapper for each type, but just once for all types.

It doesn't look like that from your comments above. For anything that involves actually naming the "blank" type, you'd need to build a type safe wrapper for that specific type.

@metakeule
Copy link
Author

@rogpeppe

Can you give an example? I am not sure, if I understand you properly...

@rogpeppe
Copy link
Contributor

@metakeule As a very simple example, try to write an implementation of "append" that works for all slice element types and doesn't use the language built-in append function.

@metakeule
Copy link
Author

@rogpeppe

here we go

package main

import (
	"fmt"
)

type Slice interface {
	Append($A)
}

type slice struct {
	data []interface{}
	appendFn func ($A)
}

// now here is where the meat is
// the anonymous callback function appendFn
// is defined in a way that makes sure type $A stays the same as when NewSlice was called 
func NewSlice(a $A) Slice {
	s := &slice{
		data: []interface{},
	}

  // not that it would make sense to append this way, just for demonstration
	s.appendFn = func (a $A) {
		_old := s.data
		_new := make([]interface{}, len(_old)+1)
		for i, o := range _old {
			_new[i] = o
		}
		_new[len(_old)] = a.(interface{})
		s.data = _new
	}
}

func (s *slice) Append(a $A) {
	s.appendFn(a)
}


func main() {
	s := NewSlice("") // slice of strings
	// s.appendFN is now func (string) for (pre)compilation time, 
        // if not set (nil) -> compilation error
        
        s.Append("hello world") // should compile
        s.Append(42) // should not compile
}

@ianlancetaylor
Copy link
Member

@metakeule Thanks for the example, although I don't fully understand it. In particular I don't understand how you can use make([]interface, l) to build a []string. The types interface{} and string are not the same size. I don't see how this can work.

Or, if the underlying data type is always []interface{}, then this doesn't seem satisfactory. It lets you build a []interface{} that only ever stores string values, and it enforces that restriction at compile time. But that is not really good enough, because storing string values in a []interface{} will cause a lot of extra memory allocation.

Maybe that is all this proposal (which I don't fully understand) is intended to provide, but I think a good generics implementation needs more than that.

@rogpeppe
Copy link
Contributor

rogpeppe commented Aug 15, 2017

+1 to @ianlancetaylor's remarks. But, additionally, the example seems to directly contravene one of the rules stated in the proposal:

it may not be used inside a type definition

There are at least two type definitions in the example that contain a blank type.

@metakeule
Copy link
Author

metakeule commented Aug 18, 2017

@ianlancetaylor @rogpeppe

We are almost there ;-)

First of all: This is an "alternative" to generics, not an "implementation" of generics.
It is there to solve aspects of what people expect from generics, namely the ability to create generical functions and to have a type safe way to make generical structures (trees etc) (that is my simplified summary of it). The later is provided via a solution that creates type safe wrappers over types that use interface{} of some sort to hold the data.

To value the proposal, please compare it to the current state. We now have several places where interface{} is used within the standard library and where performance apparently is no issue, but type safety is. The example provides a solution to the type safety issue.

For performance critical application you can always use code generation like before.

That being said, in the given example, there is room for an optimized solution with []string as well. That would be the in the generical function department of the proposal:

func Append(a $A, x []$A) ([]$A) {
        return append(x,a)
}

But @rogpeppe asked for a solution without using the built-in append and so I provided one (that is suboptimal, but an illustration to get a deeper understanding of the proposal).

Hope now it is clearer: 2 ways of generic features:

  1. optimizable type safe wrapper functions that make use of the already built in generic functions.
  2. based on this: type safe structures that wrap around types having data in interface{} format.

@rogpeppe

There are at least two type definitions in the example that contain a blank type.

The exception to the rule is: they are allowed inside function signatures. Please suggest a better wording. What I meant to say is:

type Slice interface {
	Append($A)
}

An interface is seen as a collection of function signatures.

type slice struct {
	data []interface{}
	appendFn func ($A)
}

A callback function that is a property of a struct would count as a function signature.

In both cases the function signature is just informal (represented by some []byte in the exported types) and will only be used in the precompilation step when the blank types are checked (it may also be shown be some reflection helpers, that is to be decided). From the runtime perspective

type Slice interface {
	Append($A)
}

would behave as

type Slice interface {
	Append(interface{})
}

and

type slice struct {
	data []interface{}
	appendFn func ($A)
}

as

type slice struct {
	data []interface{}
	appendFn func (interface{})
}

@dc0d
Copy link

dc0d commented Dec 17, 2017

Context

I've developed a tool for writing generic/reusable code. In short it allows a code be used as a template, for a specialized/concrete type-safe code. It preserves the specialized parts, the modifications made to the origin. For example it is possible to redefine type T = interface{} as type T = int and even after running go generate the specializations would be kept as they are.

After sometime using it, from the observations, I came to this conclusion that something that is not parameterized, can not be generalized. It may sound obvious, but rephrasing it this way, brings interesting facts into light.

Sample Case, Generic Sortable Slice

The code that is used as the generic/template code is:

package sortable

type Sortable []interface{}

func (s Sortable) Len() int           { return len(s) }
func (s Sortable) Less(i, j int) bool { panic("N/A") }
func (s Sortable) Swap(i, j int)      { s[i], s[j] = s[j], s[i] }

The Sortable is a slice of empty interface. And it implements the sort.Interface interface. As it can be seen, it is not possible to provide a generic definition for Less(...) method. Because it is not known how to compare two different items of the slice.

So it is needed to override the Less(...) method, in a specialized/concrete implementation. For example, this a specialized version; a sortable slice of string:

// Code generated by github.com/dc0d/goreuse. DO NOT EDIT.
//go:generate goreuse -o strslice.go -prefix str -rn strSlice+=Sortable -rn Less+=Less sample/pkg/sortable

package main

import "strings"

type strSlice []string

func (s strSlice) Len() int { return len(s) }

func (s strSlice) Less(i, j int) bool { return strings.Compare(s[i], s[j]) == -1 }

func (s strSlice) Swap(i, j int) { s[i], s[j] = s[j], s[i] }

In this code the parameter type of slice is changed from interface{} to string and the Less(...) method is overwritten.

Conclusion

If Sortable was a generic type like Sortable<T> (which in turn, implies that the built-in slice has to be generic), how it would be possible to override Less(...) method? Embedding Sortable (or Sortable<T>) is not desirable, since it breaks the built-in syntax and functions for working with slices (such as a[1:3] or the append(...) function). Maybe it can be solved only by providing a default implementation and inherit from that implementation (inheritance? virtual and abstract members?).

Back to something that is not parameterized, can not be generalized; as it can be seen, adding generic type parameter at type level, breaks Go at many different points. Maybe the thing that needs to be parameterized is not the type. We can have parameterized blocks or parameterized packages (or parameterized header files maybe), and then it is possible to generalized those parts.

@gladkikhartem
Copy link

gladkikhartem commented Dec 20, 2017

This is really embarrassing... generics should make things obvious and clear, rather than complex.
I mean it's Go philosophy, that is the reason we love that language.
I would prefer not to have generics in Go at all, rather than having ugly solutions with community abusing them.

Proper code generation could replace most of the issues described and provide much more powerful tool than generics themselves without any huge drawbacks.
The only reason people do not use it is because it's integration with go tools and IDE's suck. I wish we could generate code inline in future...

I'm currently using this when I need generics.
It's stupid, It's brutefoce, but it's clear and it works.

// for  <T>,<V>  =>  int,int64 | int, int | float, int | float, float

func Sum<T>With<V> (t <T>, v <V>) <T> {
  return t + v
}
// end

and you can choose whatever syntax you love

// for  $T, $V  =>  int,int64 | int, int | float, int | float, float

func Sum$TWith$V (t $T, v $V) $T {
  return t + v
}
// end

And if you have errors like assigning int = nil - go build will show you that during compile time, not at runtime.

@metakeule
Copy link
Author

@dc0d

Two remarks:

  1. I think, general implementations that need specialized method overrides (like "Sortable") are not a valid use case for generics. Instead, embed the generic parts and use a properly defined interface. Yes there are situations where you wish you had inheritance, but that is another discussion.

  2. This is not about an implementation of generics, but about an alternative that offers some of the benefits of generics.

@metakeule
Copy link
Author

@gladkikhartem

There is no sane way to maintain generated code and have it shared across projects IMHO.

@gladkikhartem
Copy link

gladkikhartem commented Jan 2, 2018

@metakeule
You don't need to maintain it if you generate it during build process.
So that generated code does not appear anywhere other than in debugger and executable.

@metakeule
Copy link
Author

metakeule commented Jan 2, 2018

@gladkikhartem

You mean, you never have a bug to fix inside the code template? Good luck in finding all the places where you used it. With a package dependency, you either can rename a package and do a build or do a tentative update.

@gladkikhartem
Copy link

gladkikhartem commented Jan 2, 2018

@metakeule
When you have bug in a package - generic or not - you have to rebuild all projects importing that package anyway.
Almost every compiled language implements generics by code generation internally and lives happily.

What I'm proposing is to make this code generation explicit and integrated into build process to allow more than just generics.

If you're talking about code snippet I've shown above in original message - it's true that it sucks, because that's something I'm currently using to overcome generics problem - not something I'd like to see in Go.

What I'd like to see in Go is

package mysort

func Sort<T>By<V> ([]<T>) {
     // use T and V as plain code text that will be replaced during build
     // for example <T>.<V> or "<T><V>" or <T>[<V>] 
}

where compiler determines types used based on function name template

import "mysort"

type People struct {
    Name string
    Age int
}

mysort.SortPeopleByAge(peopleArray)

Such approach allows you to write powerful, still explicit generics.

@metakeule
Copy link
Author

@gladkikhartem that reminds me of a preprocessor/makro system and AFAIK the Go developers already did decide against this to keep the usage of the compiler simple and the speed of the compiler fast and predictable (can of worms).

@ianlancetaylor ianlancetaylor changed the title Proposal: Blank types as an alternative to generics for Go 2 proposal: Go 2: blank types as an alternative to generics Feb 27, 2018
@ianlancetaylor
Copy link
Member

Whether this is an implementation of generics or an alternative to generics, this is clearly closed related to the general topic of generics, which is discussed over in #15292. Closing this issue in favor of that one.

@golang golang locked and limited conversation to collaborators Feb 27, 2019
@bradfitz bradfitz added the generics Issue is related to generics label Oct 29, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge generics Issue is related to generics LanguageChange Suggested changes to the Go language Proposal v2 An incompatible library change
Projects
None yet
Development

No branches or pull requests