Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: large literal maps excessively contribute to binary size #71937

Open
tdewolff opened this issue Feb 24, 2025 · 8 comments
Open

cmd/compile: large literal maps excessively contribute to binary size #71937

tdewolff opened this issue Feb 24, 2025 · 8 comments
Assignees
Labels
BugReport Issues describing a possible bug in the Go implementation. compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@tdewolff
Copy link

Go version

go version go1.24.0 linux/amd64

Output of go env in your module/workspace:

AR='ar'
CC='gcc'
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_ENABLED='1'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
CXX='g++'
GCCGO='gccgo'
GO111MODULE=''
GOAMD64='v1'
GOARCH='amd64'
GOAUTH='netrc'
GOBIN=''
GOCACHE='/home/taco/.cache/go-build'
GOCACHEPROG=''
GODEBUG=''
GOENV='/home/taco/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFIPS140='off'
GOFLAGS=''
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build2465725287=/tmp/go-build -gno-record-gcc-switches'
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMOD='/home/taco/go/src/github.com/tdewolff/crm/go.mod'
GOMODCACHE='/home/taco/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/home/taco/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/usr/lib/go'
GOSUMDB='sum.golang.org'
GOTELEMETRY='local'
GOTELEMETRYDIR='/home/taco/.config/go/telemetry'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/usr/lib/go/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.24.0'
GOWORK=''
PKG_CONFIG='pkg-config'

What did you do?

Compile a program with dependencies with large literal maps (global or not)

What did you see happen?

A literal map with ~2000 entries inflated the final binary size with ~525kB

What did you expect to see?

Assuming that a gob encoded version of the map is an efficient representation of the map, I assume that the binary should inflate by approximately that much. The gob encoded version of that map was 52kB, or a 1/10th of what actually happens.

What is the reason a map cannot be stored literally in the final binary instead of initializing each value separately and causing a huge (implicit) init() function? I'm sure there's a good reason, but is there room for improvement? Besides, would inserting a large number of entries "at once" be algorithmically faster than inserting one by one (thinking about parallels with priority queues / heaps)?

See #20095 for a similar issue. Also related partly to #6853. May fix or improve #19751.

See yuin/goldmark#469 (comment) for a case study.

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Feb 24, 2025
@randall77
Copy link
Contributor

Maps have a hash seed that is chosen at startup dynamically (to avoid collision attacks). So the layout of a map changes from run to run; we cannot lay it out statically.

If the large literal map has all constant entries, it should be loaded from a []struct{key;value} array. That's about as compact as we could get (without a decompression step).
If the entries are not constant, then it is initialized with code (a bunch of m[k] = v assignments), which uses substantially more binary space.

Can you post your example (or a shrunken version, say ~100 representative entries)? It would be interesting to see what case you're hitting above.

@gabyhelp gabyhelp added the BugReport Issues describing a possible bug in the Go implementation. label Feb 24, 2025
@adonovan
Copy link
Member

adonovan commented Feb 24, 2025

Even large arrays can contribute to large binary sizes. For example, this 104 entry table in the edge package adds 120 bytes of ARM code per line. 5KB of source generates 13KB of code.

The total information content of each row is a triple (index, type argument, string). The indices are worth zero bits, because they are sequential; the type argument is a choice among 55, and the strings are a choice among 47 values, so a total of 12 bits. If it wasn't important for each line to have a unique breakpoint, the compiler could reduce the entire thing to a loop over a compact table of about 200 bytes with two side tables of 55 reflect.Types and 47 strings respectively, for a total of 1.8KB.

var fieldInfos = [...]fieldInfo{
	Invalid:               {},
	ArrayType_Elt:         info[*ast.ArrayType]("Elt"),
	ArrayType_Len:         info[*ast.ArrayType]("Len"),
	AssignStmt_Lhs:        info[*ast.AssignStmt]("Lhs"),
	AssignStmt_Rhs:        info[*ast.AssignStmt]("Rhs"),
	BinaryExpr_X:          info[*ast.BinaryExpr]("X"),
...
	0x0128 00296 	L0193	PCDATA	$0, $-1
	0x0128 00296 	L0193	MOVD	$golang.org/x/tools/internal/astutil/edge..dict.info[*go/ast.AssignStmt](SB), R0
	0x0130 00304 	L0193	MOVD	$go:string."Lhs"(SB), R1
	0x0138 00312 	L0193	MOVD	$3, R2
	0x013c 00316 	L0193	CALL	golang.org/x/tools/internal/astutil/edge.info[go.shape.*uint8](SB)
	0x0140 00320 	L0193	MOVD	R0, golang.org/x/tools/internal/astutil/edge..autotmp_3-56(SP)
	0x0144 00324 	L0193	STP	(R1, R2), golang.org/x/tools/internal/astutil/edge..autotmp_3-48(SP)
	0x0148 00328 	L0193	STP	(R3, R4), golang.org/x/tools/internal/astutil/edge..autotmp_3-32(SP)
	0x014c 00332 	L0193	STP	(R5, R6), golang.org/x/tools/internal/astutil/edge..autotmp_3-16(SP)
	0x0150 00336 	L0193	PCDATA	$0, $-3
	0x0150 00336 	L0193	MOVWU	runtime.writeBarrier(SB), R3
	0x0158 00344 	L0193	PCDATA	$0, $-1
	0x0158 00344 	L0193	PCDATA	$0, $-2
	0x0158 00344 	L0193	CBZW	R3, 372
	0x015c 00348 	L0193	MOVD	$type:golang.org/x/tools/internal/astutil/edge.fieldInfo(SB), R0
	0x0164 00356 	L0193	MOVD	$golang.org/x/tools/internal/astutil/edge.fieldInfos+168(SB), R1
	0x016c 00364 	L0193	MOVD	$golang.org/x/tools/internal/astutil/edge..autotmp_3-56(SP), R2
	0x0170 00368 	L0193	CALL	runtime.wbMove(SB)
	0x0174 00372 	L0193	LDP	golang.org/x/tools/internal/astutil/edge..autotmp_3-56(SP), (R3, R4)
	0x0178 00376 	L0193	LDP	golang.org/x/tools/internal/astutil/edge..autotmp_3-40(SP), (R5, R6)
	0x017c 00380 	L0193	LDP	golang.org/x/tools/internal/astutil/edge..autotmp_3-24(SP), (R7, R8)
	0x0180 00384 	L0193	STP	(R3, R4), golang.org/x/tools/internal/astutil/edge.fieldInfos+168(SB)
	0x018c 00396 	L0193	STP	(R5, R6), golang.org/x/tools/internal/astutil/edge.fieldInfos+184(SB)
	0x0198 00408 	L0193	STP	(R7, R8), golang.org/x/tools/internal/astutil/edge.fieldInfos+200(SB)
	0x01a4 00420 	L0193	MOVD	golang.org/x/tools/internal/astutil/edge..autotmp_3-8(SP), R3
	0x01a8 00424 	L0193	MOVD	R3, golang.org/x/tools/internal/astutil/edge.fieldInfos+216(SB)
...

@randall77
Copy link
Contributor

Yes, if each entry (in a map, or a slice) involves a function call to get the entry's contents, then it is going to be a lot of generated code.
(In Alan's example, it would be more compact to make a slice of (idx,func,string) triples (which are constants and the compiler can put it in a data section) and put a loop in an init that calls the function with the string argument and writes the result to the entry. I don't think the compiler can do that transformation automatically.)

@adonovan
Copy link
Member

I don't think the compiler can do that transformation automatically.

Yeah, I meant a fantasy compiler could implement a loop "un-unrolling" optimization when it sees a sequence of statements each of the form table[i] = f(k1, k2), and build the table of k1 and k2. Of course, you would lose the ability to step through the sequence, and the backtraces would suffer too.

@tdewolff
Copy link
Author

Thanks for the responses. You can find an example of the case study I did for the goldmark library for Markdown parsing and formatting. You can find here an example of a large map that is initialized when needed: https://github.com/yuin/goldmark/blob/master/util/html5entities.go Presumably, he does this to avoid an initial delay at program startup, but it does nothing to reduce the binary size.

If the large literal map has all constant entries, it should be loaded from a []struct{key;value} array. That's about as compact as we could get (without a decompression step). If the entries are not constant, then it is initialized with code (a bunch of m[k] = v assignments), which uses substantially more binary space.

This helps, but is only a reduction of about 60%, much like @adonovan mentions. The array still occupies about 200kB, which is surprisingly big. Perhaps something else is off here?

Maps have a hash seed that is chosen at startup dynamically (to avoid collision attacks). So the layout of a map changes from run to run; we cannot lay it out statically.

Alright, good to learn something new. So the map must be built anew at program start, but could the data be stored internally as a compact []struct{key, val}? And does the loop in the implicit init() that is generated to fill that map have to be unrolled? Could it generate a for loop over an array? Just mentioning ideas out of the top of my head here ;-)

@randall77
Copy link
Contributor

@tdewolff
Looks like even in the array case there are lots of allocations going on in your table. For each entry, we do something like:

p := new(HTML5Entity)
p.CodePoints = new([1]int)[:]
p.Characters = new([2]byte)[:]
m["str"] = p

(and some other initialization assignments.)

That's 3 allocations per entry. Those happen even if you make an array instead.
We might be able to do all those allocations at compile time (for either the map or array case). It might be complicated, but doable. I'm not entirely sure why we don't handle that already. I believe we do handle some cases where allocations are in literals.

The only thing I worry about is making #29068 worse.

@prattmic
Copy link
Member

cc @golang/compiler

@prattmic prattmic added this to the Backlog milestone Feb 25, 2025
@prattmic prattmic added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Feb 25, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
BugReport Issues describing a possible bug in the Go implementation. compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
Development

No branches or pull requests

6 participants