Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/link: Extremely slow start for buildmode=PIE linkmode=internal binaries (R_X86_64_64 relocations) #36028

Open
prattmic opened this issue Dec 6, 2019 · 6 comments
Labels
Milestone

Comments

@prattmic
Copy link
Contributor

@prattmic prattmic commented Dec 6, 2019

Currently, internally linked PIE binaries take an extremely long time to start.

Suppose an example Go http server. Large enough to produce an interesting binary, but still simple:

hello.go:

package main

import (
        "fmt"
        "html"
        "log"
        "net/http"
)

func main() {
        http.HandleFunc("/bar", func(w http.ResponseWriter, r *http.Request) {
                fmt.Fprintf(w, "Hello, %q", html.EscapeString(r.URL.Path))
        })

        go http.ListenAndServe(":8080", nil)

        // Exit immediately so we can easily measure startup.
        log.Printf("Exiting...")
}

Let's build it a few different ways:

$ go build -o ./http.static http.go
$ go build -buildmode=pie -o ./http.ext.pie http.go
$ CGO_ENABLED=0 go build -buildmode=pie -ldflags=-linkmode=internal -o ./http.int.pie http.go

(cgo is not supported for internally linked PIE)

Now let's run them:

$ time ./http.static
2019/12/06 17:16:58 Exiting...
./http.static  0.00s user 0.01s system 102% cpu 0.005 total

$ time ./http.ext.pie
2019/12/06 17:17:17 Exiting...
./http.ext.pie  0.01s user 0.01s system 103% cpu 0.015 total

$ time ./http.int.pie    
2019/12/06 17:17:30 Exiting...
./http.int.pie  1.03s user 0.00s system 99% cpu 1.032 total

The internally linked binary takes over 1s to start!

The problem is the relocations. Both the internally and externally binaries contain ~60k relocations (!). The externally linked version has R_X86_64_RELATIVE relocations, which are trivial for the dynamic linker to resolve. The internally linked version has R_X86_64_64 relocations, which require a symbol lookup to resolve.

Sure enough, profiling reveals that most of that 1s is going to symbol resolution in the dynamic linker. We should switch to more efficient relocations to keep dynamic linking times reasonable.

We already have a TODO to switch these relocations:

// TODO: We generate a R_X86_64_64 relocation for every R_ADDR, even
. It turns out making the switch is quite simple (unless I missed something), so I'll send a CL shortly.

@prattmic prattmic changed the title Generate R_X86_64_RELATIVE relocations instead of R_X86_64_64 for buildmode=PIE linkmode=internal Extremely slow start for buildmode=PIE linkmode=internal binaries (R_X86_64_64 relocations) Dec 6, 2019
@prattmic

This comment has been minimized.

Copy link
Contributor Author

@prattmic prattmic commented Dec 6, 2019

For context, we'd like to build gVisor's binaries as PIE (with internal linking, as we forbid cgo/libc). Our main binary ends up with 222k relocations and takes 12s to start (vs ~35ms normally linked).

This seems like a massive number of relocations (it seems like type are mostly for type information and constant strings? I'm not very familiar with the extra data Go generates). It would be great to reduce the number of relocations, but just switching to R_X86_64_RELATIVE will make things at least reasonable.

@gopherbot

This comment has been minimized.

Copy link

@gopherbot gopherbot commented Dec 6, 2019

Change https://golang.org/cl/210181 mentions this issue: cmd/link: generate R_X86_64_RELATIVE relocs for PIE

@toothrot toothrot changed the title Extremely slow start for buildmode=PIE linkmode=internal binaries (R_X86_64_64 relocations) cmd/link: Extremely slow start for buildmode=PIE linkmode=internal binaries (R_X86_64_64 relocations) Dec 9, 2019
@toothrot toothrot added this to the Backlog milestone Dec 9, 2019
@toothrot

This comment has been minimized.

Copy link
Contributor

@toothrot toothrot commented Dec 9, 2019

@mdempsky

This comment has been minimized.

Copy link
Member

@mdempsky mdempsky commented Dec 9, 2019

Seems like a clearly worthwhile improvement, and there's already a CL under review. It'll have to wait for 1.15 though (and the CL is tagged R=go1.15 accordingly), since it's not a regression.

@mdempsky mdempsky added NeedsFix and removed NeedsDecision labels Dec 9, 2019
@mdempsky mdempsky modified the milestones: Backlog, Go1.15 Dec 9, 2019
@odeke-em

This comment has been minimized.

Copy link
Member

@odeke-em odeke-em commented Dec 15, 2019

Hey @prattmic, thank you for the report and for the CL.

I examined this bug and your CL but noticed there isn't a test and thought that perhaps if need be we can create a regression test that'll check for an anomaly between those numbers and then report one. The caveat here is that my version of Go complains about your flags for the last build as per (also I don't have formal education in statistics :))

internal linking requested but external linking required: buildmode=pie

but perhaps this test can help after you tweak the slow buildmode arguments

package main

import (
	"io/ioutil"
	"math"
	"os"
	"os/exec"
	"path/filepath"
	"testing"
	"time"
)

const prog = `
package main

import (
        "fmt"
        "html"
        "log"
        "net/http"
)

func main() {
        http.HandleFunc("/bar", func(w http.ResponseWriter, r *http.Request) {
                fmt.Fprintf(w, "Hello, %q", html.EscapeString(r.URL.Path))
        })

        go http.ListenAndServe(":8080", nil)

        // Exit immediately so we can easily measure startup.
        log.Printf("Exiting...")
}`

func TestBuildModeSpeedComparisons(t *testing.T) {
	tmpDir, err := ioutil.TempDir("", "issue36028")
	if err != nil {
		t.Fatalf("Failed to create tmp directory: %v", err)
	}
	defer os.RemoveAll(tmpDir)

	mainGoFile := filepath.Join(tmpDir, "http.go")
	if err := ioutil.WriteFile(mainGoFile, []byte(prog), 0600); err != nil {
		t.Fatalf("Failed to write file: %v", err)
	}

	staticFile := filepath.Join(tmpDir, "http.static")
	pieFile := filepath.Join(tmpDir, "http.ext.pie")
	pieLinkFile := filepath.Join(tmpDir, "http.int.pie")
	var noEnv []string
	_, output, err := execIt(noEnv, "go", "build", "-o", staticFile, mainGoFile)
	if err != nil {
		t.Fatalf("Failed to build object file: %v\n%s", err, output)
	}
	_, output, err = execIt(noEnv, "go", "build", "-buildmode=pie", "-o", pieFile, mainGoFile)
	if err != nil {
		t.Fatalf("Failed to build PIE file: %v\n%s", err, output)
	}
	cgoEnabled0 := append(os.Environ(), "CGO_ENABLED=0")
	// TODO: Get @prattmic to properly add the slow build flags here.
	_, output, err = execIt(cgoEnabled0, "go", "build", "-ldflags=-linkmode=internal", "-o", pieLinkFile, mainGoFile)
	if err != nil {
		t.Fatalf("Failed to build PIE -linkmode file: %v\n%s", err, output)
	}

	durStatic, _, _ := execIt(noEnv, staticFile)
	durPIE, _, _ := execIt(noEnv, pieFile)
	durPIELink, _, _ := execIt(noEnv, pieLinkFile)

	values := []struct {
		name  string
		value time.Duration
	}{
		{"durStatic", durStatic},
		{"durPIE", durPIE},
		{"durPIELink", durPIELink},
	}
	sum := 0.0
	for _, value := range values {
		sum += float64(value.value)
	}
	mean := sum / float64(len(values))
	sumOfSquares := 0.0
	for _, value := range values {
		sumOfSquares += math.Pow(float64(value.value)-mean, 2.0)
	}
	variance := sumOfSquares / float64(len(values))
	σ := math.Pow(variance, 0.5)

        // Heuristic: outlier detection by 3 sigma-rule.
	t.Logf("σ stdDev: %s", time.Duration(σ))
	meanMinus2σ := time.Duration(mean - 2*σ)
	meanPlus2σ := time.Duration(mean + 2*σ)
	for _, value := range values {
		if value.value > meanPlus2σ {
			t.Errorf("Anomaly: %q: %s > (μ + 2σ = %s)", value.name, value.value, meanPlus2σ)
		} else if value.value  < meanMinus2σ {
			t.Errorf("Anomaly: %q: %s < (μ - 2σ = %s)", value.name, value.value, meanMinus2σ)
		}
	}
}

func execIt(env []string, prog string, args ...string) (dur time.Duration, output []byte, err error) {
	startTime := time.Now()
	defer func() {
		dur = time.Now().Sub(startTime)
	}()
	cmd := exec.Command(prog, args...)
	cmd.Env = env
	output, err = cmd.CombinedOutput()
	return
}
@prattmic

This comment has been minimized.

Copy link
Contributor Author

@prattmic prattmic commented Dec 17, 2019

I can certainly add a timing test like that if desired. Are such tests common/desired in Go? These kinds of tests have the potential to be flaky due to small timing differences.

I could also add a test looking for specific kinds of relocations in the output, but that is fragile w.r.t. future changes to the linker.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.