Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: long function names due to generic instantiation with large struct types #65030

Closed
thanm opened this issue Jan 9, 2024 · 3 comments
Assignees
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@thanm
Copy link
Contributor

thanm commented Jan 9, 2024

Go version

go version devel go1.22-b702e0438a Mon Jan 8 20:15:52 2024 +0000 darwin/arm64

Output of go env in your module/workspace:

GO111MODULE=''
GOARCH='arm64'
GOBIN=''
GOCACHE='/Users/thanm/Library/Caches/go-build'
GOENV='/Users/thanm/Library/Application Support/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='arm64'
GOHOSTOS='darwin'
GOINSECURE=''
GOMODCACHE='/Users/thanm/go1/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='darwin'
GOPATH='/Users/thanm/go1'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/Users/thanm/go'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/Users/thanm/go/pkg/tool/darwin_arm64'
GOVCS=''
GOVERSION='devel go1.22-b702e0438a Mon Jan 8 20:15:52 2024 +0000'
GCCGO='gccgo'
AR='ar'
CC='clang'
CXX='clang++'
CGO_ENABLED='1'
GOMOD='/Users/thanm/go/src/go.mod'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -ffile-prefix-map=/var/folders/tf/lh5xw2y1657fwx2q0vxrt_5m008r__/T/go-build1807409156=/tmp/go-build -gno-record-gcc-switches -fno-common'

What did you do?

Build this program:

https://go.dev/play/p/6Jsq0rUhQ9K

Source:

package main

import (
	"os/exec"
	"slices"
)

func main() {
	var s []exec.Cmd
	s = slices.Insert(s, 0, exec.Cmd{})
}

and examine the assembly (gcflags=-S).

What did you see happen?

At the call to slices.Insert in main function, here is an assembly fragment (from "go build -gcflags=-S -trimpath main.go"):

	0x0054 00084 (./main.go:10)	MOVD	$1, R6
	0x0058 00088 (./main.go:10)	MOVD	R6, R7
	0x005c 00092 (./main.go:10)	PCDATA	$1, $0
	0x005c 00092 (./main.go:10)	CALL	slices.Insert[go.shape.[]os/exec.Cmd,go.shape.struct { Path string; Args []string; Env []string; Dir string; Stdin io.Reader; Stdout io.Writer; Stderr io.Writer; ExtraFiles []*os.File; SysProcAttr *syscall.SysProcAttr; Process *os.Process; ProcessState *os.ProcessState; os/exec.ctx context.Context; Err error; Cancel func() error; WaitDelay time.Duration; os/exec.childIOFiles []io.Closer; os/exec.parentIOPipes []io.Closer; os/exec.goroutine []func() error; os/exec.goroutineErr <-chan error; os/exec.ctxResult <-chan os/exec.ctxResult; os/exec.createdByStack []uint8; os/exec.lookPathErr error }](SB)
	0x0060 00096 (./main.go:11)	LDP	-8(RSP), (R29, R30)
	0x0064 00100 (./main.go:11)	ADD	$432, RSP

The program works properly, but note the name for the instantiated version of slices.Insert, which is almost 600 characters long. Of particular concern is that although the generic function is instantiated with the type exec.Cmd, the function name looks like it refers to an anonymous struct type that has all the fields of cmd.Exec, but in which all the field names/types are spelled out (including package paths, etc).

In this example, the long function name is not that big a deal, but creates needless binary bloat, and for profilers (ex: pprof) and other tools that capture data files based on function names, the long names can result in much bigger data files.

More importantly however if you have a package with very long paths, and you import a large struct type from that package and instantiate a generic based on that, you can get absurdly long function names.

What did you expect to see?

Ideally it would be better to have generic functions names that are proportional in size to the length of their qualified type parameters (e.g. "os/exec.Cmd") as opposed to O(N * P) where N is the number of struct fields in the type param and P is the length of the import path of the type's package.

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jan 9, 2024
@thanm
Copy link
Contributor Author

thanm commented Jan 9, 2024

I think the root of the problem is that we're picking the underlying type of a given type T when forming the shape type for it, as in this code:

https://go.googlesource.com/go/+/b46e44a399045d0177dd063dc192168f0b5b3f55/src/cmd/compile/internal/noder/reader.go#874

This is eminently sensible (I might even go so far to say required) in a majority of the cases (since obviously we want to common up named types whose underlying types are the same), but in the case of huge structures with very long link strings it's a problem.

I will send a tentative CL shortly to switch to using the type hash instead.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/554975 mentions this issue: cmd/compile: use hashed symbol name for go.shape types if too long

@Jorropo
Copy link
Member

Jorropo commented Jan 9, 2024

👍 this issue affects pprof, I had a profile with a function name bigger than the screen at default zoom level once.

@dmitshur dmitshur added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jan 9, 2024
@dmitshur dmitshur added this to the Backlog milestone Jan 9, 2024
@thanm thanm self-assigned this Jan 9, 2024
@dmitshur dmitshur added NeedsFix The path to resolution is known, but the work has not been done. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Jan 10, 2024
@dmitshur dmitshur modified the milestones: Backlog, Go1.22 Jan 10, 2024
ezz-no pushed a commit to ezz-no/go-ezzno that referenced this issue Feb 18, 2024
Shape-based stenciling in the Go compiler's generic instantiation
phase looks up shape types using the underlying type of a given target
type. This has a beneficial effect in most cases (e.g. we can use the
same shape type for two different named types whose underlying type is
"int"), but causes some problems when the underlying type is a very
large structure. The link string for the underlying type of a large
imported struct can be extremely long, since the link string
essentially enumerates the full package path for every field type;
this can produce a "go.shape.struct { ... " symbol name that is
absurdly long.

This patch switches the compiler to use a hash of the underlying type
link string instead of the string itself, which should continue to
provide commoning but keep symbol name lengths reasonable for shape
types based on large imported structs.

Fixes golang#65030.

Change-Id: I87d602626c43172beb99c186b8ef72327b8227a2
Reviewed-on: https://go-review.googlesource.com/c/go/+/554975
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Than McIntosh <thanm@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests

4 participants