-
Notifications
You must be signed in to change notification settings - Fork 18.5k
Description
What version of Go are you using (go version)?
$ go version go version go1.21rc3 linux/amd64
Does this issue reproduce with the latest release?
Yes.
It also reproduces with 1.20, 1.19, and maybe older versions too.
What operating system and processor architecture are you using (go env)?
go env Output
$ go env GO111MODULE='' GOARCH='amd64' GOBIN='' GOCACHE='/home/andrei/.cache/go-build' GOENV='/home/andrei/.config/go/env' GOEXE='' GOEXPERIMENT='' GOFLAGS='' GOHOSTARCH='amd64' GOHOSTOS='linux' GOINSECURE='' GOMODCACHE='/home/andrei/go/pkg/mod' GONOPROXY='' GONOSUMDB='' GOOS='linux' GOPATH='/home/andrei/go' GOPRIVATE='' GOPROXY='https://proxy.golang.org,direct' GOROOT='/home/andrei/sdk/go1.21rc3' GOSUMDB='sum.golang.org' GOTMPDIR='' GOTOOLCHAIN='auto' GOTOOLDIR='/home/andrei/sdk/go1.21rc3/pkg/tool/linux_amd64' GOVCS='' GOVERSION='go1.21rc3' GCCGO='gccgo' GOAMD64='v1' AR='ar' CC='gcc' CXX='g++' CGO_ENABLED='1' GOMOD='/dev/null' GOWORK='' CGO_CFLAGS='-O2 -g' CGO_CPPFLAGS='' CGO_CXXFLAGS='-O2 -g' CGO_FFLAGS='-O2 -g' CGO_LDFLAGS='-O2 -g' PKG_CONFIG='pkg-config' GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build1118470848=/tmp/go-build -gno-record-gcc-switches'
What did you do?
Meta-comment: I understand that the production of DWARF location lists for optimized code is not perfect. I'd like to generally form an understanding about which problems are hard and which are tractable. I have found one case where I'm hoping the debug info can be improved, stemming from a real case where a variable that I've tried to read is unavailable where I think it should be available.
With the (optimized) CockroachDB binary, I'm frequently trying to read context.Context function arguments with a debugger, and more often than not they are not available. I have failed to create a small repro (my experimentation is also hindered by #61700), but the problems seem widespread in the CockroachDB binary. For this report, I have selected a Cockroach case that seems the simplest, out of multiple that I've investigated by hand.
In this example, what seems to happen is that a function argument that starts off in registers becomes unavailable once the registers are clobbered by a function call, even though the argument was spilled to the stack (and not just spilled anywhere, but spilled in its designated regabi spill slot, FWIW). In other cases, I see the compiler able to use the spill locations and switch the location lists from registers to the stack; I'd like to understand if there's a good reason why this doesn't happen in this example.
Consider the following method (Github link):
// Init is part of the colexecop.Operator interface.
func (bic *batchInfoCollector) Init(ctx context.Context) {
bic.ctx = ctx
bic.stopwatch.Start()
// Wrap the call to Init() with a panic catcher in order to get the correct
// execution time (e.g. in the statement bundle).
err := colexecerror.CatchVectorizedRuntimeError(bic.init)
bic.stopwatch.Stop()
if err != nil {
colexecerror.InternalError(err)
}
// Unset the context so that it's not used outside of the init() function.
bic.ctx = nil
bic.mu.Lock()
// If we got here, then Init above succeeded, so the wrapped operator has
// been properly initialized.
bic.mu.initialized = true
bic.mu.Unlock()
}The location lists for the ctx argument are quite incomplete:
0x02be341f: DW_TAG_formal_parameter
DW_AT_name ("ctx")
DW_AT_variable_parameter (0x00)
DW_AT_decl_line (98)
DW_AT_type (0x00000000000ac025 "context.Context")
DW_AT_location (0x03ea7218:
[0x0000000002fb8f80, 0x0000000002fb8fc9): DW_OP_reg3 RBX, DW_OP_piece 0x8, DW_OP_reg2 RCX, DW_OP_piece 0x8
[0x0000000002fb8fc9, 0x0000000002fb8fd1): DW_OP_reg3 RBX, DW_OP_piece 0x8, DW_OP_piece 0x8)
The 128-bit argument starts off in RBX, RCX. Then, the PC location 2fb8fd1 corresponds to the first function call -> bic.stopwatch.Start(). That's where the location lists stop, so, for the majority of the function's code, ctx is not available.
The 2nd location list (2fb8fc9 - 2fb8fd1) corresponds to only two instructions where RBX continues to be available after RCX was just clobbered; I believe this is not very interesting.
Now, I claim that the compiler should be able to give me a location list for all of this function, based on the fact that ctx is spilled to the stack in the beginning of the function. Again, in other random instances I've looked at, the compiler does sometimes seem to be able to make use of the stack spill.
Here is the function's disassembly with minor commentary: https://gist.github.com/andreimatei/fa1575dbfa8e04fb0271d02587a0bdf6
I'll paste the beginning here:
0000000002fb8f80 <github.com/cockroachdb/cockroach/pkg/sql/colflow.(*batchInfoCollector).Init>:
; github.com/cockroachdb/cockroach/pkg/sql/colflow.(*batchInfoCollector).Init():
; /home/andrei/src/github.com/cockroachdb/cockroach/pkg/sql/colflow/stats.go:98
2fb8f80: 49 3b 66 10 cmpq 0x10(%r14), %rsp
2fb8f84: 0f 86 38 01 00 00 jbe 0x2fb90c2 <github.com/cockroachdb/cockroach/pkg/sql/colflow.(*batchInfoCollector).Init+0x142>
2fb8f8a: 55 pushq %rbp
2fb8f8b: 48 89 e5 movq %rsp, %rbp
2fb8f8e: 48 83 ec 38 subq $0x38, %rsp
============== ctx is spilled to the stack over the next two instructions ==============
2fb8f92: 48 89 5c 24 50 movq %rbx, 0x50(%rsp)
2fb8f97: 48 89 4c 24 58 movq %rcx, 0x58(%rsp)
; /home/andrei/src/github.com/cockroachdb/cockroach/pkg/sql/colflow/stats.go:99
2fb8f9c: 48 89 58 60 movq %rbx, 0x60(%rax)
2fb8fa0: 83 3d 59 41 b8 08 00 cmpl $0x0, 0x8b84159(%rip) # 0xbb3d100 <runtime.writeBarrier>
2fb8fa7: 74 10 je 0x2fb8fb9 <github.com/cockroachdb/cockroach/pkg/sql/colflow.(*batchInfoCollector).Init+0x39>
2fb8fa9: e8 12 ee 4c fd callq 0x487dc0 <runtime.gcWriteBarrier2>
2fb8fae: 49 89 0b movq %rcx, (%r11)
2fb8fb1: 48 8b 50 68 movq 0x68(%rax), %rdx
2fb8fb5: 49 89 53 08 movq %rdx, 0x8(%r11)
; /home/andrei/src/github.com/cockroachdb/cockroach/pkg/sql/colflow/stats.go:98
2fb8fb9: 48 89 44 24 48 movq %rax, 0x48(%rsp)
; /home/andrei/src/github.com/cockroachdb/cockroach/pkg/sql/colflow/stats.go:99
2fb8fbe: 48 89 48 68 movq %rcx, 0x68(%rax)
; /home/andrei/src/github.com/cockroachdb/cockroach/pkg/sql/colflow/stats.go:100
2fb8fc2: 48 8b 88 80 00 00 00 movq 0x80(%rax), %rcx
======= beginning of the second location list; RCX became unavailable just above ==========================
2fb8fc9: 48 89 c8 movq %rcx, %rax
2fb8fcc: e8 8f 2d 93 fd callq 0x8ebd60 <github.com/cockroachdb/cockroach/pkg/util/timeutil.(*StopWatch).Start>
======= end of the 2nd location list; RBX also became unavailable here =====================
2fb8fd1: 44 0f 11 7c 24 28 movups %xmm15, 0x28(%rsp)
...
Notice that RBX and RCX are spilled in the beginning. And note how the call StopWatch.Start marks the end of the 2nd location list, and there's no further location list.
What did you expect to see?
I expected a location list pointing to the ctx value on the stack.
If this proves to be a tractable problem, myself or one of my colleagues might be interested in putting some effort in improving it.
Metadata
Metadata
Assignees
Labels
Type
Projects
Status