Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile, runtime: GOEXPERIMENT to add two non-pointer words to iface/eface #45494

Open
zephyrtronium opened this issue Apr 10, 2021 · 9 comments
Labels
help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@zephyrtronium
Copy link
Contributor

zephyrtronium commented Apr 10, 2021

Background

Due to GC shape requirements, storing scalar (non-pointer) values into any interface-typed variable forces the value to be stored indirectly, usually allocated on the heap. In some applications, this can lead to many unexpected allocations and extraordinary load on the allocator and garbage collector, causing significant performance degradation. In the worst case, this could be a DOS vector for services that are not extensively optimized, especially those using packages like image or encoding/json.

Interface values are concretely represented as two distinct types: runtime.iface for interfaces with non-empty method sets and runtime.eface for interface{}.

go/src/runtime/runtime2.go

Lines 202 to 210 in 1129a60

type iface struct {
tab *itab
data unsafe.Pointer
}
type eface struct {
_type *_type
data unsafe.Pointer
}

Proposal

In order to reduce allocations in programs using small types in interfaces, I propose adding a value of GOEXPERIMENT, such as GOEXPERIMENT=largeiface, to change runtime.iface and runtime.eface to the following:

type iface2 struct {
	tab  *itab
	data unsafe.Pointer
	sdat [2]uintptr // scalar data
}

type eface2 struct {
	_type *_type
	data  unsafe.Pointer
	sdat  [2]uintptr
}

Then, whenever a type contains no more than one pointer and two scalar words, in any order, among any number of fields plus padding, values of that type may be copied into an iface2 or eface2 value, without being allocated on the heap. If a type contains more than one pointer or more than two scalar words, then only pointers to values of that type are stored when assigned to interface-typed variables. This extends the current behavior, which is the same for zero rather than two scalar words.

Note that these types are named differently from the existing ones. The names iface and eface would not exist in the runtime while the GOEXPERIMENT is enabled, and iface2 and eface2 would not exist while it is disabled. This improves maintainability by ensuring the correct name for the experiment setting is always used.

Examples

With this proposal, the following types would become directly assignable to interface values on all supported targets:

int
int64
string
[]T // for any type T
struct {
	b [8]byte
	p *T
}

// color.(N)RGBA64
struct {
	R, G, B, A uint16
}

// assuming unsafe.Alignof(new(T)) == unsafe.Sizeof(uintptr(0))
// and unsafe.Sizeof(thistype{}) % unsafe.Alignof(new(T)) == 0
struct {
	a uint8
	p *T
	b uint8
}

The following types would remain assignable only indirectly to interface values:

interface{} // too many pointers
[2]*T // too many pointers

// reflect.SliceHeader; too many scalar words
struct {
	Data uintptr
	Len  int
	Cap  int
}

// too many scalar words with padding,
// assuming the compiler never reorders struct fields
struct {
	a uint8
	u uintptr
	b uint8
}

Assignment combinations

This section assumes that the compiler never reorders struct fields.

There are two possible approaches to implement transfers of fields between iface2 (eface2) values and dynamic values, in order to support fields in any order. The first is to add a new uint8 field to runtime._type with three two-bit fields describing whether each successive data field of the iface2 is transferred to the first, second, or third word of the dynamic value, or not transferred at all. This is very simple to implement for both convT2E/I and assertions.

The second approach is to enumerate the 9 unique assignment combinations and store the appropriate combination either in a new uint8 field or in unused bits of the tflag field. This either uses no additional storage or leaves room for additional data about the permutation, such as whether each scalar data field is a floating-point value for more efficient interaction with the new register ABI.

Note that the unique assignment combinations are, denoting a pointer as P and a scalar word as S: (no assignment, i.e. size zero type), P, S, PS, SP, SS, PSS, SPS, SSP. Adding consideration for floating-point values creates two alternatives for each S, increasing the number of combinations to 23. Alternatively, the no-assignment case could be transformed to the P case by storing a reference to runtime.zerobase, reducing the combinations to 8.

Impact

The choice is specifically two scalar words because this is sufficient to avoid allocations for nearly every non-composite type in Go. Booleans, all numeric types except complex128 on targets where uintptr is four bytes, strings, slices, maps, and channels (and function values?) can be stored this way. Of course, qualifying struct and array types are included. Many common implementations of various standard library interfaces, especially color.Color, also fit in this representation.

The fact that these types would no longer force allocations could lead to significant performance improvements within the standard library. There are several alternative implementations of the functionality of packages like fmt, log, and encoding/json which are specifically designed to avoid interfaces for the sake of throughput. This GOEXPERIMENT would likely shorten the gap between standard library and highly optimized APIs from a few orders of magnitude to a few percent for common uses with small types like int and string.

I propose adding this as a GOEXPERIMENT rather than an outright change for two reasons. First, doubling the size of every interface value may significantly penalize some Go programs, especially those running in environments like cloud functions where available memory may be very small. Second, there are several Go repositories (not an exhaustive search; notably, some repositories show up many times in vendor directories but not in this search) that depend on the current layout of interface values, using unsafe or assembly. GOEXPERIMENT provides mechanisms – build tags and assembly definitions – for such code to be updated to work with the experiment both enabled and disabled.

The cost is that, as a GOEXPERIMENT, this will require a significant amount of parallel maintenance. cmd/compile and other compilers that implement the experiment will need to generate different code for interface assignments and conversions, in addition to detecting eligible types and computing their assignment combinations. The runtime, reflect, and internal/reflectlite will need significant duplicated code paths to handle the different layouts. The implementation of atomic.Value in package sync/atomic will need to be duplicated, and sync.Pool will need some minor duplication. Some third-party packages will require duplicated code if they intend to support the experiment.

As an experiment, the goal should be to collect data about the space–performance tradeoff. We should find which programs benefit most from reduced allocations and garbage collection, and how much benefit they experience. Moreover, we should measure the change in memory usage across many programs and find which programs experience unacceptable increases. If the experiment reveals a result like "CPU usage decreases, and memory usage is within a few percent for almost all applications due to fewer spans devoted to small objects," then it could be promoted to the default case.

Related issues

Open issues that this would (entirely or mostly) resolve:

Some related closed issues:

@gopherbot gopherbot added this to the Proposal milestone Apr 10, 2021
@josharian
Copy link
Contributor

Is your intent that this experiment might live on forever, or that at some point we would decide which representation to use and then switch as necessary? I'm reluctant to maintain two representations forever—it ends up being a knob, and Go generally eschews knobs.

@josharian
Copy link
Contributor

Does this struct allocate, in the proposal?

struct {
  a [10]byte
  p *int
  b [2]byte
}

That is, are you proposing to re-pack structs, or only to re-order fields?

How about [2]int64 on a 64 bit system? That requires "splitting" a field into its components.

@zephyrtronium
Copy link
Contributor Author

zephyrtronium commented Apr 12, 2021

@josharian

Is your intent that this experiment might live on forever, or that at some point we would decide which representation to use and then switch as necessary? I'm reluctant to maintain two representations forever—it ends up being a knob, and Go generally eschews knobs.

To be honest, I expect this proposal to be declined primarily because of the amount of work it will be to implement and maintain. It would touch a lot of code. Even finding everything that needs to be updated could be difficult; that was an argument against adding a scalar word to iface/eface in #8405. On the other hand, "it might be hard" is a poor reason not to try, as they say.

The idea of this becoming a knob is a concern for me; it should be an experiment, not the new -O2. Deciding ahead of time that the experiment will exist for some fixed number of releases – perhaps long enough to include a developer survey – is a nice solution.

Does this struct allocate, in the proposal?

[omitted because I feel like I'm being way too long-winded in this thread]

That is, are you proposing to re-pack structs, or only to re-order fields?

This would allocate on all current targets, because the alignment requirement for *int adds padding between a and p and after b, such that there are too many scalar words. I am assuming that pointer types require alignment equal to their size, but I believe that is true everywhere.

The proposal is to reorder fields at run-time when transferring between concretely typed values and interfaces. This does not include packing. Another way to phrase the proposed behavior is to say that a value must be able to be reinterpreted (unsafely, ignoring pointers, but otherwise without loss of data) as [n]uintptr where n is 0, 1, 2, or 3; then the rules about one pointer and two scalar words apply.

How about [2]int64 on a 64 bit system? That requires "splitting" a field into its components.

This would not allocate. In general, I use the term "scalar word" rather than "scalar field," because the assignment combination applies regardless of how a type is divided into fields – it is at the memory level, not the type level. For example, this type would not allocate on 64-bit targets:

struct {
	a, b, c, d  byte
	e, f        int16
	p           *int
	g, h, i, j, k, l, m, n int8
}

It occurs to me now, though, that the alignment of a type might need to be at least that of uintptr for this to apply.

@ianlancetaylor
Copy link
Contributor

I don't see a reason for this to go through the proposal process. The only purpose of an experiment like this would be to see whether it should become the new default implementation. That is really a decision for the runtime and compiler team to make: whether this is worth trying, who will do the work, how to decide whether to adopt the new idea or not.

So I'm going to take this out of the proposal process. (I don't have an opinion as to whether this is a good idea or not.)

Thanks for raising the idea.

@ianlancetaylor ianlancetaylor changed the title proposal: cmd/compile, runtime: GOEXPERIMENT to add two non-pointer words to iface/eface cmd/compile, runtime: GOEXPERIMENT to add two non-pointer words to iface/eface Apr 12, 2021
@ianlancetaylor ianlancetaylor added NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. and removed Proposal labels Apr 12, 2021
@ianlancetaylor ianlancetaylor modified the milestones: Proposal, Unplanned Apr 12, 2021
@mdempsky
Copy link
Contributor

This is a pretty involved experiment, touching all of {cmd/compile, runtime, reflect}; but I think it should be accessible to someone sufficiently motivated to learning all of those details. I'm happy to provide advice to anyone who wants to work on it.

@zephyrtronium
Copy link
Contributor Author

I have a couple weeks available, so I'll start on this. If I'm able to get it working, I plan to submit for Go 1.19, since there are already significant changes to the compiler scheduled for 1.18.

@gopherbot
Copy link
Contributor

Change https://golang.org/cl/343070 mentions this issue: all: add GOEXPERIMENT=largeiface

@gopherbot
Copy link
Contributor

Change https://golang.org/cl/343071 mentions this issue: runtime: implement largeiface GOEXPERIMENT

@zephyrtronium
Copy link
Contributor Author

Although I haven't been able to test it yet, the runtime's portion of this experiment should be close to finished. Mostly while implementing equality and hashing algs, I established some slightly different rules than I originally proposed for when types can be stored directly in interfaces:

  • Instead of just using the size of a type, use its alignment (up to the pointer size). This way, we can reorder fields straightforwardly without needing to account for variables that can't be transferred as if they are uintptrs. However, this means fewer types can be stored directly; e.g., color.NRGBA64 has 2-byte alignment, so by this rule, it has four words.
  • Floating point values need special treatment for equality and hashing. Even if we were to split the possible combinations by target, there are too many combinations to store in the unused bits in _type.tflag, and the code for performing those algorithms would become monstrous. For now, my solution is to add a new tflag for "cannot be stored directly in an interface, even if it looks like it otherwise," which will apply to any type with floating point fields.

Also, I wasn't sure what to do about runtime-gdb.py. It checks whether a value is an interface by the C-style types of its fields, but I don't know what type it would expect to see for [2]uintptr, and I don't know how to find that out.

I expect the changes to the compiler to be similarly extensive as those to the runtime. Changes to reflect and internal/reflectlite should be simpler. Other areas that will need updates include cmd/link, sync/atomic, and x/tools/go/analysis/passes/asmdecl (to make go vet aware of the different layout of large interfaces). Packages related to cgo might also need to be changed, although I'm not sure to what extent.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

5 participants