proposal: reflect: allow creation of recursive types #39528

TheCount · 2020-06-11T12:52:39Z

Introduction

In #20013, @ben-clayton asked for runtime creation of recursive structs. The analogous issue arises with several other constructions which are legal at compile time, e. g.:

type A [42]*A
type C chan C
type F func(F) F
type M map[string]M
type P *P
type S []S

While these specific definitions are not as useful at the top-level as a struct, combinations can be, e. g.

type L struct {
  Value interface{}
  Next *L
  Snitch chan L
}

or

type T []struct{ Edge interface{}; Out T }

The issue is related to #16522, where @crawshaw proposes an API to create a named type with a method set. In that issue, @cosmos72 observed that the proposed API would not allow for the creation of recursive types, and also referenced https://golang.org/pkg/go/types/#Named, where recursive named types can be represented in a two-step process. In #16522, it was also observed that the implementation of a runtime method set is met with certain challenges, so this proposal will concentrate on recursive types while leaving the door open for the core concern of #16522.

Proposal

This issue proposes an API for a two-step process of runtime recursive type creation similar to, but not quite analogous to https://golang.org/pkg/go/types/#Named. The first step is to create an incomplete named type which can only be used in certain restricted contexts until it is completed in the second step by binding the name to an underlying type specification, in analogy to a compile-time type declaration.

// Named creates a new incomplete type with the specified valid name.
// An invalid name causes Named to panic.
//
// The resulting type can be used as an argument to ArrayOf, ChanOf, FuncOf, MapOf,
// PtrTo, SliceOf, and StructOf whenever the analogous compile-time type specification
// in a type declaration would result in a valid recursive type. Invalid combinations
// cause Named to panic. E. g., passing an incomplete type to ArrayOf is invalid,
// but passing an incomplete type to PtrTo and then that type to ArrayOf is valid.
//
// Any such type, including the incomplete named type itself, may not be used in any
// other context until it has been completed with Bind. A future version may relax
// this requirement (e. g., allow use in MakeFunc before calling Bind).
func Named(name string) Type

// Bind binds the specified incomplete named type to underlying akin to a valid type
// declaration and returns a completed version of named. The underlying type may
// indirectly reference the named type, but no
// other incomplete type created with Named. A Bind which represents an illegal
// type declaration panics.
//
// Currently, adding a method set to named is not supported and methods must be nil.
// This limitation may be lifted in a future version.
func Bind(named Type, underlying Type, methods []Method) Type

(edit: changed return value of Bind)

Here, the methods would only be added at the Bind stage (in a future version), so that they can refer to named.

Open questions

What is a valid name for Named? The spec suggests letter { letter | unicode_digit }, but that may be expensive to check. Also, should the name be the name of an exported type?
What should the PkgPath() method of the created type return? Probably an empty string, but could also be a pseudo package name where all the types created at runtime end up in.
What should the Kind() method of the named type return before Bind? Of the existing kinds, only Invalid makes sense. Could introduce a new kind, or use the tflags to mark the type incomplete.
(edit: added) goroutine safety: probably the new type won't be intrinsically goroutine-safe until Bind has been called on it.

The text was updated successfully, but these errors were encountered:

cosmos72 · 2020-06-11T14:31:27Z

Good to hear that I am not the only one interested in adding this feature! :)

Some days ago I had almost exactly the same idea, with some minor differences:

// create a new, incomplete, named Type with Kind = Invalid.
// the parameter 'pkgpath string' is stored in the type, but does not grant any special permission:
// for example, creating a type in package "runtime" does not give it access to the package's
// unexported symbols.
// TBD: decide: accept or reject pkgpath+name equal to a pre-existing type?
func NewNamed(pkgpath string, name string) Type 

// must be called exactly once before Complete()
func SetUnderlying(named Type, underlying Type)

// cannot be implemented yet because of API/ABI limitations... I would keep this for later
// func AddMethod(named Type, method Method)

// must be called exactly once to "freeze" the type and make it immutable,
// internally it will mark the type as complete and make it fully usable as any other reflect.Type
func Complete(named Type)

I think it's better than having a single "do-everything" function

func Bind(named Type, underlying Type, methods []Method)

for three reasons:

each function above has a single purpose
it's more similar to go/types API
it allows to postpone the API to add methods until there is a technical solution to actually implement them. I am not fond of adding an API which cannot currently be implemented (the methods []Method part)

If this proposal (or the above one, or something similar enough) is accepted, I can help implementing it

TheCount · 2020-06-12T10:58:26Z

@cosmos72 The creation process is already quite involved, what with the invening calls to the PtrTo and SomethingOf functions, so I don't see a problem splitting my Bind up into the functions in your proposal.

The important thing, I think, is that there is a point when the new type is final (my Bind or your Complete); I've added an open question regarding goroutine-safety to my original proposal.

Maybe the next logical step is to see whether our proposals can be implemented without too much hassle.

TheCount · 2020-06-12T21:04:47Z

I had a cursory glance at reflect/type.go and learned that while the dynamic type of a valid Type is always *rtype, the object behind the pointer can be different in size for different types, so I think from an implementation standpoint, it's easier if Bind returns a completely new type instead of trying to alter the named argument. I've edited the proposal accordingly.

cosmos72 · 2020-06-12T21:44:19Z

Yes, reflect.Type internals are quite complicated because they are heavily optimized to reduce memory usage and the number of distinct allocations.

There is also a property of reflect.Type that would cause a lot of problems with your last suggestion - namely, that Bind() should return a completely new type instead of trying to alter the named argument:

https://golang.org/pkg/reflect/#Type states:
"Type values are comparable, such as with the == operator, so they can be used as map keys. Two Type values are equal if they represent identical types."

I hope I'm wrong, but trying to imagine how Named() and Bind() interact with such guarantee, I think you will see that having around two objects representing the same type:

one returned by Named() and possibly used to construct other types, for example with PtrTo() or one of the *Of() functions
the other returned by Bind()

causes many conflicts with such guarantee

TheCount · 2020-06-13T07:11:22Z

True. One possible solution is that this guarantee simply does not hold until Bind returns. So the Type returned by Named would be just a placeholder, to be used in the *Of functions only, not to be used as map key or in any other context. After Bind, the placeholder cannot be used any more, the new returned Type has to be used instead.

This also means that Bind has to walk recursively through the underlying type and replace all references to the placeholder with the real thing, possibly updating the type cache on the way.

I guess a similar situation arises with goroutine-safety. Probably the involved types do not become goroutine-safe until Bind returns.

cosmos72 · 2020-06-13T10:10:24Z

What about Go 1.x compatibility promise ?
I am quite sure it also covers the statement cited above:

https://golang.org/pkg/reflect/#Type
"Type values are comparable, such as with the == operator, so they can be used as map keys. Two Type values are equal if they represent identical types."

which you suggest to break.

That's why I preferred your original proposal:
it required internal changes to reflect.Type to allow creating a *reflect.rtype without knowing in advance whether it will describe a struct, map, func or one of the other possible kinds - and instead deciding it later, with Bind or SetUnderlying.
As a consequence, the current technique allocating in a single step reflect.rtype plus its auxiliary (kind-dependent) information could probably not be reused in this case, and an additional layout would probably be needed - a kind of "rtypeUnion" with the ability to contain all the information needed by the different kinds.
But whatever the required changes are - and they could be complicated - they would be internal to the package reflect, and not visible from outside.

I am worried that your idea "Bind has to walk recursively through the underlying type and replace all references to the placeholder with the real thing, possibly updating the type cache on the way"

is quite expensive CPU-wise
falls short on preserving Go compatibility promise, because the placeholder type could be already in the hands of user code, thus breaking the guarantee that Type is comparable with ==

TheCount · 2020-06-13T14:41:37Z

What about Go 1.x compatibility promise ?

Is there any way to gain access to a reflect.Type returned by Named which is outside the control of the caller of Named? If not, I'm not worried about compatibility issues. Named and Bind are a new API with a special, restrictive contract: users of this API have to keep the reflect.Type returned by Named and all derived types close, but once Bind has returned, the returned type as well as the derived types enjoy all the usual guarantees and can be given to other user code. The named argument must never get into the hands of user code and should be thrown away and forgotten after Bind, though.

This way, older source will continue to compile and run as laid out in the compatibility guidelines.

I am worried that your idea "Bind has to walk recursively through the underlying type and replace all references to the placeholder with the real thing, possibly updating the type cache on the way"

is quite expensive CPU-wise

I've had another look at the code and apparently these caches are for the benefit of the *Of functions only. So I don't think the cache has to be touched. That should make the process no more CPU intensive than the work a compiler has to do when it parses a recursive type definition.

If my assumption above about being able to keep the return value of Named close is not correct, then things become difficult. I don't think a "rtypeUnion"-approach would work well (at least not without revamping the internal type system) as some types may require a lot of space (for example, the tail of a function type is an array of *rtype pointers representing the all argument and return types) and you would have to allocate space for the largest possible type. In this case, it would be better if there was some way of asking the gc to explicitly resize a specific block and update all pointers to it, or may do allocate a lot of space and shrink later, without updating any pointers. This is just guesswork, though. I have no clue how the internal go memory system works.

ianlancetaylor · 2020-06-13T20:33:21Z

I'm not sure what you mean by keeping a value close. Certainly once a program has a reflect.Type created by reflect.Named it can call reflect.Zero(t).Interface() to get a value of type interface{}, and the program can do anything with the value, including passing it some other package. That other package can then call reflect.TypeOf on the value to retrieve the reflect.Type created by reflect.Name. For example, this would likely happen if the value were passed to encoding/json.

TheCount · 2020-06-14T08:01:38Z

I'm not sure what you mean by keeping a value close. Certainly once a program has a reflect.Type created by reflect.Named it can call reflect.Zero(t).Interface() to get a value of type interface{}, and the program can do anything with the value, including passing it some other package. That other package can then call reflect.TypeOf on the value to retrieve the reflect.Type created by reflect.Name. For example, this would likely happen if the value were passed to encoding/json.

The idea is that there is a special contract for callers of reflect.Named. Callers promise they won't, directly or indirectly, pass the returned reflect.Type, or any type derived from that reflect.Type via reflect.PtrTo or one of the reflect.KindOf functions, to any functions or methods defined in the reflect package, except as prescribed in the description of reflect.Named. Only the return value of reflect.Bind is no longer bound by this special contract and may be passed around freely.

Now, this contract is not enforceable at the language level. If someone wants to shoot themselves in the foot, they certainly can pass a reflect.Type returned by reflect.Named to reflect.Zero — and suffer the consequences (for a deterministic outcome, the implementation could raise a panic in such a case). It's more of a C-style contract, "if you don't do as prescribed here, the behaviour is undefined".

Is such a contract acceptable for the Go community? From your and @cosmos72's comments I get the notion the answer is "no". Is that correct? If so, we need an API which avoids these "special" reflect.Type instances altogether. I'll write another proposal later.

cosmos72 · 2020-06-14T12:48:22Z

The idea is that there is a special contract for callers of reflect.Named. Callers promise they won't, directly or indirectly, pass the returned reflect.Type, or any type derived from that reflect.Type via reflect.PtrTo or one of the reflect._Kind_Of functions, to any functions or methods defined in the reflect package, except as prescribed in the description of reflect.Named. Only the return value of reflect.Bind is no longer bound by this special contract and may be passed around freely.

Also, operator== would not have the usual meaning on the Type returned by Named: it's an "incomplete type" different from anything else, including from the completed version of itself.

Also, the Type returned by Named should not be passed to Go code that is not aware of such special contract - which means all existing Go code that (potentially) deals with reflect.Type

All in all, it seems an API easy to use incorrectly, and such misuses are difficult to detect.

That's why I prefer the initial proposal, where Bind or SetUnderlying simply modify the first argument - i.e. the Type returned by Named - instead of creating a new Type.

Until now, a valid identifier check was only required for isValidFieldName. For golang#39528, a valid identifier check will also be required for type names. Updates golang#39528.

Introduce two new functions, Named and Bind, to the reflect API to create possibly recursive, named types. Fixes golang#39528.

TheCount · 2020-06-14T14:23:20Z

Also, operator== would not have the usual meaning on the Type returned by Named: it's an "incomplete type" different from anything else, including from the completed version of itself.

Also, the Type returned by Named should not be passed to Go code that is not aware of such special contract - which means all existing Go code that (potentially) deals with reflect.Type

All in all, it seems an API easy to use incorrectly, and such misuses are difficult to detect.

That's why I prefer the initial proposal, where Bind or SetUnderlying simply modify the first argument - i.e. the Type returned by Named - instead of creating a new Type.

OK, but even with an in-place modification, the reflect.Type returned by reflect.Named would not be allowed to be passed to most functions before reflect.Bind has returned. That's why I wanted to write another proposal to avoid this problem altogether.

FWIW, I implemented the proposal in this issue here: https://github.com/TheCount/go/tree/issue-39528. If you think it's worthwhile pursuing, maybe you can try to change my implementation to an in-place modification (I don't know how to do that, I guess you'd have to somehow interact with the memory management system).

cosmos72 · 2020-06-15T17:23:36Z

OK, but even with an in-place modification, the reflect.Type returned by reflect.Named would not be allowed to be passed to most functions before reflect.Bind has returned.

True, an incomplete type can only be used in few cases. Essentially only to build other types - calling other function/methods with it will (and should) panic.

That's why I wanted to write another proposal to avoid this problem altogether.

I understand the reason, but it had the side effect of creating a reflect.Type which is and always remains incomplete - not just for a limited time.

FWIW, I implemented the proposal in this issue here: https://github.com/TheCount/go/tree/issue-39528. If you think it's worthwhile pursuing, maybe you can try to change my implementation to an in-place modification (I don't know how to do that, I guess you'd have to somehow interact with the memory management system).

I have a draft implementation using in-place modification (UPDATE: published at https://github.com/cosmos72/go/commit/b010cd5c67dac0856fea156d7e6d6c3b0e3f1d77)
The in-place modification itself is relatively simple... the hard part is tracking all types derived from the newly created one abd completing them as soon as possible (not implemented yet):
after

var t = reflect.Named("foo")
var arrayOfT = reflect.ArrayOf(42, t)
reflect.Bind(t, reflect.TypeOf(int(0)))

both t and arrayOfT must be updated by computing their size, GC information and mark them as complete.
And with mutually recursive types it gets more complicated.

I'd say it's time to wait for feedback from Go team on whether either proposal is acceptable.

TheCount · 2020-06-15T19:14:07Z

That's why I wanted to write another proposal to avoid this problem altogether.

I understand the reason, but it had the side effect of creating a reflect.Type which is and always remains incomplete - not just for a limited time.

Small misunderstanding here. I meant I wanted to write (yet) another proposal which avoids even the problem of the limited incompleteness between reflect.Named and reflect.Bind. Just haven't got around to doing that yet.

TheCount · 2020-06-15T19:34:00Z

I have a draft implementation using in-place modification (UPDATE: published at cosmos72@b010cd5)
The in-place modification itself is relatively simple... the hard part is tracking all types derived from the newly created one abd completing them as soon as possible (not implemented yet):
after
var t = reflect.Named("foo")
var arrayOfT = reflect.ArrayOf(42, t)
reflect.Bind(t, reflect.TypeOf(int(0)))
both t and arrayOfT must be updated by computing their size, GC information and mark them as complete.
And with mutually recursive types it gets more complicated.

In your draft, you define your wrapper type as

type wrapperType struct {
	rtype
	uncommon uncommonType
	under    *rtype
}

How do you go from this to the memory layout of the finished type as the runtime expects it? For example, a function type in memory would look like (pseudocode):

struct {
  rtype
  numIn, numOut uint16
  uncommonType
  [numIn+numOut]*rtype
}

Since you can't know numIn+numOut at the time you allocate the memory for wrapperType, it's difficult to allocate the correct amount for it. (Asking the caller of NewNamed for a hint and panicking in Complete if the hint was wrong might be a practical if slightly awkward solution.)

cosmos72 · 2020-06-15T20:48:27Z

In your draft, you define your wrapper type as
type wrapperType struct {
	rtype
	uncommon uncommonType
	under    *rtype
}
How do you go from this to the memory layout of the finished type as the runtime expects it? For example, a function type in memory would look like (pseudocode):
struct {
  rtype
  numIn, numOut uint16
  uncommonType
  [numIn+numOut]*rtype
}
Since you can't know numIn+numOut at the time you allocate the memory for wrapperType, it's difficult to allocate the correct amount for it. (Asking the caller of NewNamed for a hint and panicking in Complete if the hint was wrong might be a practical if slightly awkward solution.)

That's not necessary. A patched implementation of reflect.Type.Field would be:

func (t *rtype) Field(i int) StructField {
	if t.Kind() != Struct {
		panic("reflect: Field of non-struct type " + t.String())
	}
	if t.tflag&tflagWrapper != 0 {
		t = (*wrapperType)(unsafe.Pointer(t)).underlying("Field")
	}
	tt := (*structType)(unsafe.Pointer(t))
	return tt.Field(i)
}

The idea is to forward (almost) any call to the underlying type, which has the correct layout.

[UPDATE] slightly more complete implementation published at https://github.com/cosmos72/go/tree/issue-39528

TheCount · 2020-06-18T19:34:22Z

The idea is to forward (almost) any call to the underlying type, which has the correct layout.

OK, I think I understand now. You introduce a completely new "proxy" layout. That means of course checking and updating all parts of the Go source code where a pattern like

var something interface{}
…
explicit := *(**expectedLayout)(unsafe.Pointer(&something))

is used. Possibly also third-party tools. But nothing covered by the Go compatibility promise.

[edit: fixed pattern]

rsc · 2020-06-24T17:49:27Z

Partially constructed types are definitely a major disadvantage. Then everything that accepts a reflect.Type has to guard against them. Given that you've opened the other issue, let's close this one as a duplicate of #39717.

ianlancetaylor changed the title ~~reflect: allow creation of recursive types~~ proposal: reflect: allow creation of recursive types Jun 11, 2020

gopherbot added this to the Proposal milestone Jun 11, 2020

gopherbot added the Proposal label Jun 11, 2020

ianlancetaylor removed the Proposal label Jun 11, 2020

gopherbot added the Proposal label Jun 11, 2020

ianlancetaylor added this to Incoming in Proposals (old) Jun 11, 2020

This was referenced Jun 13, 2020

reflect: StructOf doesn't support recursive structs #20013

Open

proposal: reflect: add method Type.Underlying() #39574

Closed

TheCount added a commit to TheCount/go that referenced this issue Jun 14, 2020

reflect: allow creation of recursive types

5b6a938

Introduce two new functions, Named and Bind, to the reflect API to create possibly recursive, named types. Fixes golang#39528.

TheCount mentioned this issue Jun 19, 2020

proposal: reflect: allow creation of recursive types atomically #39717

Open

rsc closed this as completed Jun 24, 2020

rsc moved this from Incoming to Declined in Proposals (old) Jun 24, 2020

rsc mentioned this issue Jun 24, 2020

proposal: review meeting minutes #33502

Open

golang locked and limited conversation to collaborators Jun 24, 2021

gopherbot added the FrozenDueToAge label Jun 24, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

proposal: reflect: allow creation of recursive types #39528

proposal: reflect: allow creation of recursive types #39528

TheCount commented Jun 11, 2020 •

edited

Loading

cosmos72 commented Jun 11, 2020 •

edited

Loading

TheCount commented Jun 12, 2020

TheCount commented Jun 12, 2020 •

edited

Loading

cosmos72 commented Jun 12, 2020 •

edited

Loading

TheCount commented Jun 13, 2020

cosmos72 commented Jun 13, 2020 •

edited

Loading

TheCount commented Jun 13, 2020

ianlancetaylor commented Jun 13, 2020

TheCount commented Jun 14, 2020

cosmos72 commented Jun 14, 2020 •

edited

Loading

TheCount commented Jun 14, 2020

cosmos72 commented Jun 15, 2020 •

edited

Loading

TheCount commented Jun 15, 2020 •

edited

Loading

TheCount commented Jun 15, 2020 •

edited

Loading

cosmos72 commented Jun 15, 2020 •

edited

Loading

TheCount commented Jun 18, 2020 •

edited

Loading

rsc commented Jun 24, 2020

proposal: reflect: allow creation of recursive types #39528

proposal: reflect: allow creation of recursive types #39528

Comments

TheCount commented Jun 11, 2020 • edited Loading

Introduction

Proposal

Open questions

cosmos72 commented Jun 11, 2020 • edited Loading

TheCount commented Jun 12, 2020

TheCount commented Jun 12, 2020 • edited Loading

cosmos72 commented Jun 12, 2020 • edited Loading

TheCount commented Jun 13, 2020

cosmos72 commented Jun 13, 2020 • edited Loading

TheCount commented Jun 13, 2020

ianlancetaylor commented Jun 13, 2020

TheCount commented Jun 14, 2020

cosmos72 commented Jun 14, 2020 • edited Loading

TheCount commented Jun 14, 2020

cosmos72 commented Jun 15, 2020 • edited Loading

TheCount commented Jun 15, 2020 • edited Loading

TheCount commented Jun 15, 2020 • edited Loading

cosmos72 commented Jun 15, 2020 • edited Loading

TheCount commented Jun 18, 2020 • edited Loading

rsc commented Jun 24, 2020

TheCount commented Jun 11, 2020 •

edited

Loading

cosmos72 commented Jun 11, 2020 •

edited

Loading

TheCount commented Jun 12, 2020 •

edited

Loading

cosmos72 commented Jun 12, 2020 •

edited

Loading

cosmos72 commented Jun 13, 2020 •

edited

Loading

cosmos72 commented Jun 14, 2020 •

edited

Loading

cosmos72 commented Jun 15, 2020 •

edited

Loading

TheCount commented Jun 15, 2020 •

edited

Loading

TheCount commented Jun 15, 2020 •

edited

Loading

cosmos72 commented Jun 15, 2020 •

edited

Loading

TheCount commented Jun 18, 2020 •

edited

Loading