
reflect: map iteration does unnecessary excessive allocation for non-pointer types #32424

Open
ugorji opened this issue Jun 4, 2019 · 7 comments

@ugorji
Contributor

commented Jun 4, 2019

Given a map[string]*T, a map[uint16]T, or many other map types, iterating via reflection is expensive, because for very many key/value types we allocate a new value on the heap for each entry.

Calls to reflect's MapKeys, MapIndex, MapIter.Key and MapIter.Value all go through the following code:

func copyVal(typ *rtype, fl flag, ptr unsafe.Pointer) Value {
	if ifaceIndir(typ) {
		// Copy result so future changes to the map won't change the underlying value.
		c := unsafe_New(typ)
		typedmemmove(typ, c, ptr)
		return Value{typ, c, fl | flagIndir}
	}
	return Value{typ, *(*unsafe.Pointer)(ptr), fl}
}

The ifaceIndir call returns true for very many types, including string, the sized int/uint types, bool, []uint8 ([]byte), etc.

This causes an allocation each time a map key or value is retrieved.

Iterating through a large map[int][]byte, for example, causes an allocation for every key and every value retrieved from the map.

Without reflection this doesn't happen, and it need not happen here either.
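
A minimal sketch of the contrast (the reflection loop is the one that hits copyVal for every entry):

package main

import "reflect"

func main() {
	m := map[int][]byte{1: {1}, 2: {2}}

	// Plain range: no per-entry allocation.
	for k, v := range m {
		_, _ = k, v
	}

	// Reflection: every Key()/Value() call goes through copyVal and
	// allocates, because ifaceIndir is true for both int and []byte.
	rv := reflect.ValueOf(m)
	it := rv.MapRange()
	for it.Next() {
		k, v := it.Key(), it.Value()
		_, _ = k, v
	}
}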

This causes an explosion in memory use in my benchmarks. Here, I have a map[string]*codec.stringUint64T with about 8192 entries, along with several other maps.

Sample benchmark result:

Benchmark__Json_______Encode-8   	     625	   7503851 ns/op	     432 B/op	       4 allocs/op
Benchmark__Json_______Encode-8   	     297	  15948731 ns/op	  528784 B/op	   32879 allocs/op

The first result uses a manual range loop; the second goes through reflection. The number of allocations jumps from 4 to 32879 per op.


To reproduce, I ran my benchmark, which includes a map[string]*codec.stringUint64T with about 8192 entries:

go test -run Nothing -tags "alltests codecgen" -bench "__Json_*En" -benchmem -benchtime 1x  | grep -v -E "^(goos:|goarch:|pkg:|PASS|ok)"

I then instrumented the copyVal function to confirm which types take the allocating path:


var copyValMap = make(map[*rtype]bool)
var copyValMapNoIface = make(map[*rtype]bool)

// copyVal returns a Value containing the map key or value at ptr,
// allocating a new variable as needed.
func copyVal(typ *rtype, fl flag, ptr unsafe.Pointer) Value {
	if ifaceIndir(typ) {
		if _, ok := copyValMap[typ]; !ok {
			println("copyVal: ifaceIndir: ", typ, typ.String())
			copyValMap[typ] = true
		}
		// Copy result so future changes to the map won't change the underlying value.
		c := unsafe_New(typ)
		typedmemmove(typ, c, ptr)
		return Value{typ, c, fl | flagIndir}
	}
	if _, ok := copyValMapNoIface[typ]; !ok {
		println("copyVal: NO   iface: ", typ, typ.String())
		copyValMapNoIface[typ] = true
	}
	return Value{typ, *(*unsafe.Pointer)(ptr), fl}
}

copyVal: ifaceIndir:  0x14d8880 string
copyVal: ifaceIndir:  0x14d7b00 int64
copyVal: ifaceIndir:  0x14d2380 []uint8
copyVal: NO   iface:  0x1521500 *codec.stringUint64T
copyVal: ifaceIndir:  0x14d8bc0 uint16
copyVal: NO   iface:  0x15213c0 *codec.TestStruc
copyVal: ifaceIndir:  0x1553820 codec.TestStruc

I then instrumented copyVal again to count how many times the allocation happened. It was as expected (>32000 times):

var copyValMap = make(map[*rtype]int)
var copyValMapNoIface = make(map[*rtype]int)

// copyVal returns a Value containing the map key or value at ptr,
// allocating a new variable as needed.
func copyVal(typ *rtype, fl flag, ptr unsafe.Pointer) Value {
	if ifaceIndir(typ) {
		copyValMap[typ]++
		if copyValMap[typ]%1000 == 0 {
			println("copyVal: ifaceIndir: ", typ, typ.String(), copyValMap[typ])
		}
		// Copy result so future changes to the map
		// won't change the underlying value.
		c := unsafe_New(typ)
		typedmemmove(typ, c, ptr)
		return Value{typ, c, fl | flagIndir}
	}
	copyValMapNoIface[typ]++
	if copyValMapNoIface[typ]%1000 == 0 {
		println("copyVal: NO   iface: ", typ, typ.String(), copyValMapNoIface[typ])
	}
	return Value{typ, *(*unsafe.Pointer)(ptr), fl}
}

copyVal: ifaceIndir:  0x14d8960 string 32000
copyVal: NO   iface:  0x15215e0 *codec.stringUint64T 32000

What version of Go are you using (go version)?

$ go version
go version devel +1531192272 Mon May 27 17:58:39 2019 +0000 darwin/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE="on"
GOARCH="amd64"
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/bt/526m392x16x8y149jwx28mkm0000gp/T/go-build760118559=/tmp/go-build -gno-record-gcc-switches -fno-common"


@ugorji

Contributor Author

commented Jun 4, 2019

See attached

github_golang_issue_32424_test.go.txt

Please rename the file to .go, and run as below:

mv github_golang_issue_32424_test.go.txt github_golang_issue_32424_test.go
go test -tags "32424" -run Nothing -bench 'Map(Range|ReflectIter|ReflectIndex)' -benchmem  -cpuprofile cpu.out -memprofile mem.out -memprofilerate 1

Got results:

BenchmarkMapRange-8          	   89374	     12812 ns/op	       0 B/op	       0 allocs/op
BenchmarkMapReflectIter-8    	     660	   1805001 ns/op	   16546 B/op	    1026 allocs/op
BenchmarkMapReflectIndex-8   	     576	   2146006 ns/op	   41088 B/op	    1026 allocs/op

For an operation that shouldn't have any allocations, we have a lot.

$ go tool pprof mem.out 
(pprof) top 2
Showing nodes accounting for 38.34MB, 93.02% of 41.21MB total
Dropped 163 nodes (cum <= 0.21MB)
Showing top 2 nodes out of 20
      flat  flat%   sum%        cum   cum%
   22.47MB 54.52% 54.52%    22.47MB 54.52%  reflect.copyVal
   15.87MB 38.50% 93.02%    26.51MB 64.32%  reflect.Value.MapKeys

The allocation comes mostly from copyVal, which I traced as being called each time we convert a map key or value into a reflect.Value; it calls unsafe_New because ifaceIndir(typ) returns true.

For reflection via MapIndex, the call to MapKeys allocates a slice with the same length as the map, which is one reason MapIter is better. But MapIter still allocates, because of the copyVal call.

As shown above, a normal range loop performs no allocation. The reflection allocations come entirely from copyVal doing a seemingly unnecessary allocation when composing a reflect.Value from each map key or value.

@ugorji

Contributor Author

commented Jun 4, 2019

From my understanding, the reason for this is that reflection works off interface{}, and so we MUST create a pointer for each non-pointer that is wrapped in an interface.

I am hoping there is a way to create an API that works around this. For example, we could pass a reflect.Value into each call to MapIter.Key() or Value(), and it would be Set to the key or value on each iteration?

@randall77

Contributor

commented Jun 4, 2019

From my understanding, the reason for this is that reflection works off interface{}, and so we MUST create a pointer for each non-pointer that is wrapped in an interface.

Correct.

I am hoping there is a way to create an API that works around this. For example, we could pass a reflect.Value into each call to MapIter.Key() or Value(), and it would be Set to the key or value on each iteration?

That's a possibility. That would be a pretty ugly API though.

In general there are lots of places (not just map iteration) where reflect is slow because we're putting non-pointers in interfaces. We've historically come down on the side of cleaner API even though there might be faster ways with an uglier API.

@josharian

Contributor

commented Jun 4, 2019

For single-byte things, we could use runtime.staticbytes. For non-large zero-valued things, we could use runtime.zeroval. I don't know how common those cases are.
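
For illustration, a standalone sketch of the single-byte idea (not the runtime's actual code):

// One entry per possible byte value; "boxing" any single-byte value
// can then point into this table instead of allocating.
var staticbytes = func() (b [256]byte) {
	for i := range b {
		b[i] = byte(i)
	}
	return
}()

// byteAddr returns a stable, shared address holding b.
func byteAddr(b byte) *byte { return &staticbytes[b] }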

@ugorji

Contributor Author

commented Jun 4, 2019

@randall77

I am hoping there is a way to create an API that works around this. For example, we could pass a reflect.Value into each call to MapIter.Key() or Value(), and it would be Set to the key or value on each iteration?

That's a possibility. That would be a pretty ugly API though.

In general there are lots of places (not just map iteration) where reflect is slow because we're putting non-pointers in interfaces. We've historically come down on the side of cleaner API even though there might be faster ways with an uglier API.

It's just that I can work around this for all other types, because I control their location in memory: for the sized int/uint types, float32/float64, bool, and string, I can put the value in a struct, slice, array, or standalone variable and take an addressable reference. For slices and arrays, all elements are already addressable (in general).
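
For example, a minimal sketch of that workaround for a scalar:

// Box the scalar once in an addressable variable and reuse the
// resulting Value; there is no per-use heap allocation afterwards.
var x uint64
rv := reflect.ValueOf(&x).Elem() // addressable and settable
rv.SetUint(42)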

However, for maps, because a map "slot" can be reused and reclaimed under the hood, the runtime does not expose the allocated space, and we MUST always re-allocate to expose an entry as an interface{}.

Map iteration is thus the only case with no workaround.

This is why passing in a reference is the only workaround specifically for maps.

I am currently looking to mimic such an interface, leveraging unsafe and culling from the reflect codebase, for the API below:

// mapIndex looks up key and sets val (expected to be settable).
// It returns false if there is no entry for that key.
mapIndex(mapp reflect.Value, key reflect.Value, val reflect.Value) bool

// mapIter creates an iterator, using key and val (both expected to be settable)
// to receive each entry on iteration.
// If key or val is invalid, treat it as if _ were passed to "for ... range".
//
// If Next() returns true, key and val have been set to the next key/value pair.
// If Next() returns false, there are no more entries.
mapIter(key, val reflect.Value) *mapIter
(*mapIter).Next()

IMO, it is not an inelegant API. Retrofitted to the current reflect API, it would look like:

// MapIndexUsing looks up key and sets its mapping into val, returning true if the mapping exists.
Value.MapIndexUsing(key, val reflect.Value) bool
// MapRangeUsing creates an iterator, using key and val for each key/value mapping.
// An invalid (zero) Value is treated as _, letting us simulate k, _ = range; _, v = range; and k, v = range.
Value.MapRangeUsing(key, val reflect.Value) *MapIter
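
Hypothetical usage, assuming a string-keyed map m held as a reflect.Value (MapIndexUsing is the proposed method, not in reflect today):

// Create the key and value holders once, then reuse them for every
// lookup; the two New calls are the only allocations.
k := reflect.New(m.Type().Key()).Elem()
v := reflect.New(m.Type().Elem()).Elem()
k.SetString("somekey")
if m.MapIndexUsing(k, v) { // proposed; copies the mapping into v
	// v now holds the mapping for "somekey"
}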

Again, I personally do not see this as an inelegant or unwieldy API.

My use case is the go-codec encoding library: https://github.com/ugorji/go. This is the one place where code generation is far and away more performant than the reflection-based path.

Others "encoders" in the space e.g. json-iterator went the way of completely re-writing the reflection package, which I think is extremely fraught with peril: see https://github.com/json-iterator/go https://github.com/modern-go/reflect2 . But they have gotten a lot of mileage doing this, gaining performance equivalent to static codebases.

@randall77

Contributor

commented Jun 4, 2019

We could add methods to *reflect.MapIter:

// SetKey does dst.Set(it.Key()).
func (it *MapIter) SetKey(dst Value) { ... }

(and the same for values)

I think that would provide an allocation-free path.
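
Usage would presumably look like this (SetValue being the assumed analogue for values; m is a map held as a reflect.Value):

// Create two addressable holders once and reuse them across the whole
// iteration; the two New calls are the only allocations.
k := reflect.New(m.Type().Key()).Elem()
v := reflect.New(m.Type().Elem()).Elem()
it := m.MapRange()
for it.Next() {
	it.SetKey(k)   // does k.Set(it.Key()) without boxing the key
	it.SetValue(v) // assumed analogue for the value
	// ... use k and v ...
}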

It seems strange to me that we'd provide this optimized path for map iteration, and not for other things that construct a Value. Just scanning value.go, you could make a case for *Recv, Select, Convert, MapIndex, Zero, and Append.

@ugorji

Contributor Author

commented Jun 4, 2019

Nice - the methods on *reflect.MapIter work nicely!

I would love to have the same for MapIndex also, as that is a common use case. That is the Value.MapIndexUsing(key, val reflect.Value) bool sketched above.

Specifically regarding Zero, I always assumed it would be allocation-free. I thought it would be sufficient to return a read-only, non-addressable, non-settable static value as the zero value. That way, every caller gets the same one, but they cannot do much with it except use it to set something else, or read it.
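
For reference, Zero's result today is already non-addressable and non-settable; typical uses look like:

z := reflect.Zero(reflect.TypeOf(int64(0)))
// z.CanAddr() == false && z.CanSet() == false, so a shared static
// zero value would be safe for these uses:
x := reflect.New(z.Type()).Elem()
x.Set(z) // zero out another value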

Regarding Append, there is a somewhat allocation-free path with AppendSlice (as you can pass a slice of non-pointers). You can also pass Values that are addressable elements of a slice. So there is not much of a gap here.
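
For example:

// Appending a whole slice at once avoids boxing each element
// individually; only the slice header is boxed.
dst := reflect.ValueOf([]uint64(nil))
dst = reflect.AppendSlice(dst, reflect.ValueOf([]uint64{1, 2, 3}))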

The remaining gaps would be the channel ones (Recv, Select) and Convert. Those seem less common in typical usage (in my limited experience; YMMV). I also do not have a common use for them, and have no good ideas for how to present them in the API.

@bradfitz bradfitz added the Performance label Jun 4, 2019

ugorji added a commit to ugorji/go that referenced this issue Jun 8, 2019

codec: add (unsafe) alloc-free support for map iteration 1.12+
We do this by introducing safe and unsafe 1.12+ variants for
map iteration and map indexing.

This is necessary due to golang/go#32424
- Map Iteration using reflection always causes allocation
  because every key and value might have to be converted to an
  interface{} first, meaning scalars may be allocated on the heap.
- Map Indexing also causes allocation

We work around both by passing a holder value into the indexing
or iterating operation and copying the entry into it, avoiding a
fresh allocation per entry.

Also, improve heuristics of decInferLen

Also, fix lint, staticcheck and other issues raised via static analysis.

Clean out xdebugf comments and remove much old commented code

Finally, make more changes to assist the compiler's bounds-check elimination.

@dmitshur dmitshur added this to the Go1.14 milestone Jun 10, 2019
