New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/link: Incorrect DWARF scope representation #12899

Open
derekparker opened this Issue Oct 10, 2015 · 21 comments

Comments

Projects
None yet
10 participants
@derekparker
Contributor

derekparker commented Oct 10, 2015

The DWARF information emitted does not correctly represent the scoping of variables within a procedure.

To illustrate, given the following Go source code:

package main                                                                                                         

import "fmt"

func main() {
        mystr := "foo"
        {
                mystr := "bar"
                fmt.Println(mystr)
        }
        fmt.Println(mystr)
}

The relevant information produced in .debug_info is as follows:

 <1><58>: Abbrev Number: 2 (DW_TAG_subprogram)
    <59>   DW_AT_name        : main.main
    <63>   DW_AT_low_pc      : 0x401000
    <6b>   DW_AT_high_pc     : 0x401240
    <73>   DW_AT_external    : 1   
 <2><74>: Abbrev Number: 4 (DW_TAG_variable)
    <75>   DW_AT_name        : mystr                                                                                 
    <7b>   DW_AT_location    : 5 byte block: 9c 11 80 7f 22 >   (DW_OP_call_frame_cfa; DW_OP_consts: -128; DW_OP_plus)
    <81>   DW_AT_type        : <0x34e72>
 <2><89>: Abbrev Number: 4 (DW_TAG_variable)
    <8a>   DW_AT_name        : mystr 
    <90>   DW_AT_location    : 5 byte block: 9c 11 90 7f 22 >   (DW_OP_call_frame_cfa; DW_OP_consts: -112; DW_OP_plus)
    <96>   DW_AT_type        : <0x34e72>

As you can see, both variables are represented at depth <2> in the tree, however, one should be at depth <3>, following a depth <2> entry of a DW_AT_lexical_block with low/high pc values.

The first solution is to produce a tree that properly reflects the lexical blocks in the source code. Secondly, it would be great for DW_TAG_variable, DW_TAG_formal_parameter and family to include DW_AT_decl_file and DW_AT_decl_line attributes within the entries.

@ianlancetaylor ianlancetaylor added this to the Unplanned milestone Oct 10, 2015

@derekparker

This comment has been minimized.

Contributor

derekparker commented Oct 20, 2015

cc: @ianlancetaylor @rsc

I would really like to work on this, however I know the code freeze is coming up. I feel like the addition of DW_AT_decl_* attributes should be relatively easy. The other issue of correctly representing scope seems more difficult, and I would like to discuss that as well.

I can work on these issues to try to get them in (at least DW_AT_decl_*) before the code freeze if at all possibly, especially if other folks familiar with this are busy. I've been familiarizing myself with more of the code that is generating the Dwarf information.

Getting the decl_* attributes emitted is trivial, however having the correct values for them seems less so. While building up the DIE at https://github.com/golang/go/blob/master/src/cmd/link/internal/ld/dwarf.go#L1709 it seems we do not have access to the file/line information that represents where the symbol was declared. Even from what I've dug into with a.Asym.Pcln.* the information is not available. So, where would be the most appropriate place to encode this information, or where is it currently accessible at?

Regarding the missing lexical scope information, since from what I know the Dwarf information is being generated based off the program table, is it possible with the current implementation to know about lexical scoping at the point the Dwarf information is being generated? It seems all the information Dwarf generation has access to is symbols (funcs, etc..), but has no way of knowing whether a symbol was defined within a particular lexical block (as demonstrated in the initial example in this issue report) and what the high/low PC values are for that block. So essentially my question is whether I'm incorrect about any of these assumptions, and if so how? The "how" will hopefully guide me to an implementation.

Again, I could be incorrect about my understanding / assumptions as I've only recently really dug into the Dwarf generation code in the linker. Mostly I would like to start a discussion about the above points and how to move forward with what little time is left before the code freeze.

@ianlancetaylor

This comment has been minimized.

Contributor

ianlancetaylor commented Oct 20, 2015

@derekparker These are good questions and I do not know the answers. I think you should ask on golang-dev, as there are more people there who understand the compiler and the object file format. However it may be that nobody really knows.

@derekparker

This comment has been minimized.

Contributor

derekparker commented Oct 20, 2015

@aclements

This comment has been minimized.

Member

aclements commented Oct 21, 2015

@derekparker, what's the advantage of having the DW_AT_decl_* attributes? What are they used for?

Having correct scope information would definitely be good. Unfortunately, from what I understand, you're correct that the scope information simply isn't available at the point where the DWARF is generated. However, I'll defer to anyone who knows this code better for a more definitive answer. If this is the case, I think you can work out scope information from the VARDEF/VARKILL pseudo-instructions in the compiler. Then you'll need to extend the compiler's Auto type (in internal/obj/link.go) to store scope information, the loop over s.Autom in writesym to write out scope information in the Go object format, readsym to read it back in, and the linker's Auto type (in link/internal/ld/link.go) to store it. Then it will be available when writing out the DWARF info.

(This makes we wonder why we generate DWARF in the linker instead of the more common approach of generating it in the compiler...)

@derekparker

This comment has been minimized.

Contributor

derekparker commented Oct 21, 2015

@aclements the advantage of the DW_AT_decl_* attrs is that the debugger can know where a specific variable was declared, most importantly in relation to where the debugger has stopped the program. It comes into play for commands such as locals which will print all local variables in scope. The debugger needs a way to omit uninitialized variables so it doesn't display garbage.

Thank you very much for the pointers on the scope issue, I'll dig into that some more!

@aclements

This comment has been minimized.

Member

aclements commented Oct 21, 2015

@aclements the advantage of the DW_AT_decl_* attrs is that the debugger can know where a specific variable was declared, most importantly in relation to where the debugger has stopped the program.

Given the complicated connection between PCs and line numbers in optimized code, I would think a debugger would have to depend on DW_AT_start_scope or DW_AT_location with a location list to detect uninitialized variables if a variable's lifetime was anything less than the lifetime in the DIE of its enclosing scope.

@ianlancetaylor

This comment has been minimized.

Contributor

ianlancetaylor commented Oct 21, 2015

@aclements We generate DWARF in the linker because when we added the DWARF support the linker was also responsible for code generation. At that time it would have been even more awkward to generate DWARF in the compiler. Clearly, though, in the future we should generate DWARF in the compiler, if for no other reason than that it would make builds slightly faster.

@aclements

This comment has been minimized.

Member

aclements commented Oct 21, 2015

@aclements We generate DWARF in the linker because when we added the DWARF support the linker was also responsible for code generation. At that time it would have been even more awkward to generate DWARF in the compiler. Clearly, though, in the future we should generate DWARF in the compiler, if for no other reason than that it would make builds slightly faster.

Interesting. That makes sense. I suppose there are still some advantages to doing this in the linker, like that we don't generate DWARF at all for unused symbols and that we don't have to worry about deduplicating the DWARF for equivalent types (this is less of an issue than in C, but would still come up for, e.g., *package.T).

@derekparker

This comment has been minimized.

Contributor

derekparker commented Oct 21, 2015

@aclements

Given the complicated connection between PCs and line numbers in optimized code, I would think a debugger would have to depend on DW_AT_start_scope or DW_AT_location with a location list to detect uninitialized variables if a variable's lifetime was anything less than the lifetime in the DIE of its enclosing scope.

Great point. Delve compiles programs with optimizations disabled for this reason, however when it is debugging a binary compiled by other means this still becomes an issue. I'm certainly in favor of preferring DW_AT_start_scope over the DW_AT_decl_* attributes, should be more correct generally and save some space as well.

Additionally @ianlancetaylor and @aclements if the decision is made to move DWARF generation into the compiler I'm more than happy to help work on that, as it's something that I have a direct interest in improving.

@gopherbot

This comment has been minimized.

gopherbot commented Feb 11, 2016

CL https://golang.org/cl/19452 mentions this issue.

@gopherbot

This comment has been minimized.

gopherbot commented Sep 22, 2016

CL https://golang.org/cl/29591 mentions this issue.

@mdempsky

This comment has been minimized.

Member

mdempsky commented Jan 12, 2017

One of the complexities at the moment is that we need to associate each instruction with a lexical scope in the original source. I believe this will become much easier in 1.9, once we've merged dev.inline and have fine-grained position information available.

During parsing we just need to record where each scope starts/ends (and maybe their nesting, though that's implicit). At DWARF generation time, we can use instruction position information (which is already available) to identify which lexical scope it's in. (Similar to using go/types.(*Scope).Innermest).

This should make DWARF lexical scope support much less intrusive than we were previously going with CL 29591.

@gopherbot

This comment has been minimized.

gopherbot commented Feb 13, 2017

CL https://golang.org/cl/36879 mentions this issue.

gopherbot pushed a commit that referenced this issue Apr 7, 2017

cmd/compile: output DWARF lexical blocks for local variables
Change compiler and linker to emit DWARF lexical blocks in debug_info.
Version of debug_info is updated from DWARF v.2 to DWARF v.3 since version 2
does not allow lexical blocks with discontinuous ranges.

Second attempt at https://go-review.googlesource.com/#/c/29591/

Remaining open problems:
- scope information is removed from inlined functions
- variables in debug_info do not have DW_AT_start_scope attributes so a
variable will shadow other variables with the same name as soon as its
containing scope begins, before its declaration.

Updates #12899, #6913

Change-Id: I0e260a45b564d14a87b88974eb16c5387cb410a5
Reviewed-on: https://go-review.googlesource.com/36879
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
@gopherbot

This comment has been minimized.

gopherbot commented Apr 10, 2017

CL https://golang.org/cl/40095 mentions this issue.

lparth added a commit to lparth/go that referenced this issue Apr 13, 2017

cmd/compile: output DWARF lexical blocks for local variables
Change compiler and linker to emit DWARF lexical blocks in debug_info.
Version of debug_info is updated from DWARF v.2 to DWARF v.3 since version 2
does not allow lexical blocks with discontinuous ranges.

Second attempt at https://go-review.googlesource.com/#/c/29591/

Remaining open problems:
- scope information is removed from inlined functions
- variables in debug_info do not have DW_AT_start_scope attributes so a
variable will shadow other variables with the same name as soon as its
containing scope begins, before its declaration.

Updates golang#12899, golang#6913

Change-Id: I0e260a45b564d14a87b88974eb16c5387cb410a5
Reviewed-on: https://go-review.googlesource.com/36879
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>

gopherbot pushed a commit that referenced this issue May 18, 2017

cmd/compile: output DWARF lexical blocks for local variables
Change compiler and linker to emit DWARF lexical blocks in .debug_info
section when compiling with -N -l.

Version of debug_info is updated from DWARF v2 to DWARF v3 since
version 2 does not allow lexical blocks with discontinuous PC ranges.

Remaining open problems:
- scope information is removed from inlined functions
- variables records do not have DW_AT_start_scope attributes so a
variable will shadow other variables with the same name as soon as its
containing scope begins, even before its declaration.

Updates #6913.
Updates #12899.

Change-Id: Idc6808788512ea20e7e45bcf782453acb416fb49
Reviewed-on: https://go-review.googlesource.com/40095
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
@heschik

This comment has been minimized.

Contributor

heschik commented Jul 17, 2017

I believe this is fixed. Closing.

@heschik heschik closed this Jul 17, 2017

@aarzilli

This comment has been minimized.

Contributor

aarzilli commented Jul 17, 2017

@heschik there's the fact that DW_AT_start_scope is not emitted and it's disabled when inlining or optimizations are enabled.

@heschik heschik reopened this Jul 17, 2017

@ignatov

This comment has been minimized.

ignatov commented Jul 17, 2017

@rsc Could you please add the debugger tag?

@bradfitz

This comment has been minimized.

Member

bradfitz commented Jul 17, 2017

Done.

@gopherbot

This comment has been minimized.

gopherbot commented Jul 25, 2017

CL https://golang.org/cl/51150 mentions this issue.

@rsc rsc changed the title from cmd/ld: Incorrect DWARF scope representation to cmd/link: Incorrect DWARF scope representation Nov 22, 2017

@rsc

This comment has been minimized.

Contributor

rsc commented Nov 22, 2017

Not sure what's left here (@heschik?) but whatever is will need to wait for Go 1.11 at this point.

@rsc rsc modified the milestones: Go1.10, Go1.11 Nov 22, 2017

@heschik

This comment has been minimized.

Contributor

heschik commented Nov 22, 2017

#12899 (comment) is still accurate. However, I'm not sure either is important enough that anyone would work on it for 1.11.

@bradfitz bradfitz modified the milestones: Go1.11, Go1.12 Jun 20, 2018

@ianlancetaylor ianlancetaylor modified the milestones: Go1.12, Unplanned Dec 7, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment