Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/link: stack check too deep when compiling deep call graph with -gcflags=-l #51814

Open
maruel opened this issue Mar 19, 2022 · 4 comments
Open
Labels
NeedsInvestigation WaitingForInfo
Milestone

Comments

@maruel
Copy link
Contributor

@maruel maruel commented Mar 19, 2022

What version of Go are you using (go version)?

New regression in go1.18, works on go1.7.

Culprit commit is c991278.

cc @mdempsky

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GOARCH="amd64"
GOOS="linux"
GOVERSION="go1.18"
GCCGO="gccgo"
GOAMD64="v1"
CGO_ENABLED="1"

What did you do?

Run go test on github.com/maruel/panicparse/v2/...

Here's a smaller repro case with main.go.txt attached file:

go build -gcflags -l main.go

What did you expect to see?

Works as before.

What did you see instead?

go build -gcflags -l main.go      
# command-line-arguments                                         
runtime.morestack: nosplit stack check too deep
runtime.morestack: nosplit stack overflow        
        792     guaranteed after split check in main.main<1>
        800     after main.main<1> uses -8       
        792     on entry to main.recurse2000<1>
        800     after main.recurse2000<1> uses -8
        792     on entry to main.recurse1999<1>
        800     after main.recurse1999<1> uses -8
        792     on entry to main.recurse1998<1>
(...)
        800     after main.recurse1503<1> uses -8
        792     on entry to main.recurse1502<1>
        800     after main.recurse1502<1> uses -8
        792     on entry to main.recurse1501<1>
        784     on entry to runtime.morestack<0> (nosplit)
        0       after runtime.morestack<0> (nosplit) uses 784

Workaround

Using -G=0 works most of the time but not always (?) -covermode=count is broken in panicparse/cmd/panic but I haven't successfully extracted a repro yet.

@ianlancetaylor ianlancetaylor changed the title compiler: regression in go1.18 for very deep static call graph when using -gcflags -l cmd/compile: nosplit stack overflow when compiling deep call graph with -gcflags=-l Mar 19, 2022
@ianlancetaylor ianlancetaylor added the NeedsInvestigation label Mar 19, 2022
@ianlancetaylor ianlancetaylor added this to the Backlog milestone Mar 19, 2022
@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Mar 19, 2022

@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Mar 19, 2022

The failure here is "nosplit stack check too deep" reported by the linker (https://go.googlesource.com/go/+/refs/heads/master/src/cmd/link/internal/ld/lib.go#2295). When not using -gcflags=-l these simple functions are inlined and the stack check is not too deep. With -gcflags=-l you run into an intentional limit in the linker.

We can increase the stack limit if there are real programs with a stack check that deep. But I'm not sure we should increase it for a test program. Any limit we set can be easily exceeded by any test program. Is it important for the test to have such a deep call graph? Does that correspond to real programs?

@ianlancetaylor ianlancetaylor added the WaitingForInfo label Mar 19, 2022
@ianlancetaylor ianlancetaylor changed the title cmd/compile: nosplit stack overflow when compiling deep call graph with -gcflags=-l cmd/link: stack check too deep when compiling deep call graph with -gcflags=-l Mar 19, 2022
@maruel
Copy link
Contributor Author

@maruel maruel commented Mar 19, 2022

It is a case of "It used to work it doesn't anymore"; that's was caught by a real project that is currently used, the real world value of this specific case is discutable.

I wrote panic in panicparse to test edge cases. The test case stack_cut_off_named is to ensure that stack elision works both when inlining occurs and doesn't. If I recall correctly, the reason it's so deep is that on the previous optimizer it used to optimize a depth of 10 functions at a time, leaving ~200 frames in the stack so I could test elision even when the optimizer is enabled.

My primary concern is that it only fails when running with coverage or when disabling the inliner to debug something. Otherwise it works just fine.

This can be surprising to developers when they disable optimizations to debug something more easily and the program stops compiling. I'd prefer if the behavior was consistent. I would suspect this would risk happening mostly in generated code as I wouldn't expect humans to naturally such deep recursive code.

As I noted in my report, it does fail when using coverage but I failed to extract a simplified repro case, you can reproduce locally with:

git clone https://github.com/maruel/panicparse
cd panicparse
go test -covermode=count ./cmd/panic

The fact that this occurs with go test -covermode=(atomic|count|set) is much more material to me than with -gcflags -l.

@maruel
Copy link
Contributor Author

@maruel maruel commented Mar 22, 2022

I figured out a reduced repro case for go test -covermode=set :

Observations:

  • go test and go test -race passes, but go test -covermode=set fails to link.
  • It doesn't repro if the code is all in main_test.go.
  • For the function pointer, using recurse2000 works, but func() { recurse2000() } fails.
  • It doesn't matter if the leaf function do nothing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation WaitingForInfo
Projects
None yet
Development

No branches or pull requests

2 participants