debuginfo: Generate debugging information for global variables. #9227

Closed
michaelwoerister opened this Issue Sep 16, 2013 · 26 comments

Projects

None yet

2 participants

@michaelwoerister
Contributor

There is no description of global variables yet. This will involve using llvm::DIBuilder::createGlobalVariable(). The implementation will be similar to those for other kinds of variables.

@michaelwoerister
Contributor

Here is a little guide on how this could be implemented. Maybe someone is interested in giving it a try.

  • Wrap llvm::DIBuilder::createGlobalVariable() from src/llvm/include/llvm/DIBuilder.h. That means
    • adding a C++ to C wrapper in src/rustllvm/RustWrapper.cpp (also adding the function name to src/rustllvm/rustllvm.def.in), and
    • adding a C to Rust wrapper in src/librustc/lib/llvm.rs. The respective files contain many examples of how other functions are wrapped.
  • Add a create_global_var_metadata() function to src/librustc/middle/trans/debuginfo.rs. This function should call the wrapped DIBuilder method with the appropriate arguments. See the debuginfo::file_metadata() function for a simple example of how the wrapped function can be called. The arguments needed by the DIBuilder should be made available from the create_global_var_metadata() arguments in some form or other.
  • Find the right places in the trans module to call the new debuginfo::create_global_var_metadata() function. trans::consts::trans_const() looks like a very good candidate. You can take a look at trans::base::trans_stmt() for an example of how the debuginfo module is used from the outside to create the debuginfo for local variables.
  • Write as many test cases as you can think of and add them to src/test/debug-info. The folder already contains many examples of how test cases are written. Some inspiration: globals should be visible from many different positions in the source code, so testing from multiple breakpoints in different scopes would be a good idea. Testing different data types also is a good idea. Testing immutable and mutable variables should be done too.
  • Make a pull request, lean back and enjoy the glory of having contributed an important piece of code to the coolest project around ;)

Bonus points for also handling static variables within a function.

@gentlefolk
Contributor

This sounds interesting, I'm going to start looking into it. Thanks for the guide here @michaelwoerister.

@michaelwoerister
Contributor

I've taken a look at this again some time ago and found that it would be better to use llvm::DIBuilder::createStaticVariable() instead of llvm::DIBuilder::createGlobalVariable() for this. createStaticVariable supports namespaces and Clang itself uses it for all C++ globals.createGlobalVariable is only used for something related to Objective-C interfaces, it seems.

If you have any questions or run into a dead-end just ask here and I'll be happy to help you out :)

@gentlefolk
Contributor

A little update since it's been a while - I have an implementation for this and have started writing tests. Questions and/or a pull request should follow shortly once I've worked through some test cases.

@michaelwoerister
Contributor

Great. If you do a pull request, be sure to cc me, so I see it in time.

@gentlefolk
Contributor

Will do.

I'm not handling mutable statics properly at the moment so I need to fix that, but immutable static globals are working with the exception of floating point types.

Floating point constants can be inspected with GDB and the whatis command reports the correct type, but the value is always reported as <optimized out>.

@michaelwoerister do you have any idea what would cause this in the underlying debug info?

@gentlefolk
Contributor

Alright, so looking at the DWARF information with objdump reveals that the float types aren't getting the DW_AT_const_value attribute that is generated for the other variables. The presence of this tag on mutable statics is also why they can't be debugged properly, since it indicates that the described variable doesn't exist in the address space of the program and can never change.

I'm still perplexed as to why float constants aren't being handled properly, but I have some ideas on how to fix the issue with mutable statics at least.

@michaelwoerister
Contributor

If gentlefolk@032e206 still represents the code you are testing, the problem might be that you pass v instead of g to debuginfo::create_global_var_metadata(). Apart from that your code looks great so far, very clean.
Keep up the good work!

@gentlefolk
Contributor

Right, taking a closer look at the code it makes sense why using v would cause this. However, when using g it seems gdb doesn't see any of the symbols although the generated DWARF info looks more sensible.

Using v generates the following for a pub static mut B: bool = false declaration in a file called basic-types-mut-globals.rs:

 <2><2b>: Abbrev Number: 3 (DW_TAG_variable)
    <2c>   DW_AT_name        : (indirect string, offset: 0x4d64): B 
    <30>   DW_AT_type        : <0x100>  
    <34>   DW_AT_decl_file   : 1    
    <35>   DW_AT_decl_line   : 91   
    <36>   DW_AT_const_value : 0    

While using g generates the following:

 <2><2b>: Abbrev Number: 3 (DW_TAG_variable)
    <2c>   DW_AT_name        : (indirect string, offset: 0x4f36): B 
    <30>   DW_AT_type        : <0x116>  
    <34>   DW_AT_decl_file   : 1    
    <35>   DW_AT_decl_line   : 91   
    <36>   DW_AT_declaration : 1    
    <36>   DW_AT_MIPS_linkage_name: (indirect string, offset: 0x9d): _ZN23basic-types-mut-globals1BE    

The referenced DW_AT_type describes a single byte encoded as a bool in both cases.

The difference in specifying the DW_AT_MIPS_linkage_name in the 2nd case is the obvious suspect, although I'm not sure what specifically the problem is. The DWARF specification also only seems to describe DW_AT_linkage_name and not DW_AT_MIPS_linkage_name.

@gentlefolk
Contributor

Dumping the symbol table I get the following, where _ZN1B20hfccf93e461ed376afaa4v0.0E looks to be the mangled name of B.

This suggests that the linkage name in the DWARF info is incorrect. Is debuginfo::NameSpaceTreeNode::mangled_name_of_contained_item() not handling this correctly? I don't have enough knowledge of Rust name mangling to answer this myself.

0000000000000000 l    df *ABS*  0000000000000000              basic-types-mut-globals.rs
000000000079a9f0 l     O .bss   0000000000000001              _ZN1B20hfccf93e461ed376afaa4v0.0E
@michaelwoerister
Contributor

The linkage name in DWARF does not actually correspond to the name of the symbol (which in Rust is not only a mangled name like in C++, but also contains some hash-value, see https://github.com/mozilla/rust/blob/master/src/librustc/back/link.rs#L454 for more info). The DWARF linkage name has just one purpose: to be printed nicely in GDB (which would not handle namespaces very well otherwise).

info variables won't list basic-types-mut-globals::B?

@gentlefolk
Contributor

info variables does indeed list basic-types-mut-globals::B but attempting to use the print or whatis commands gives me a No symbol "B" in current context. error.

@gentlefolk
Contributor

Comparing the DWARF information with that generated for a local variable indicates that the DW_AT_location attribute is missing, which would explain why GDB can't access it.

@michaelwoerister
Contributor

I often find it helpful in such a case to take a look at the DWARF generated by gcc and clang.

@gentlefolk
Contributor

Yeah, looking at the equivalent case with GCC confirms that the DW_AT_location attribute is missing. From my understanding this means that llvm doesn't have information to figure where the variable exists in the address space of the program when it generates the DWARF information.

My preliminary suspicion is that the scope metadata provided to llvm::DIBuilder::createStaticVariable may be incomplete in some way. I'll continue digging, but if you have any insight into how llvm::DIBuilder methods figure out where variables exist within a binary it'd be appreciated.

@gentlefolk
Contributor

Alright, so I can see that debuginfo::populate_scope_map() picks up the scope of everything within a function. This obviously would not include global variable declarations.

If that seems like a sensible reason for llvm not being able to figure out where the global variables are defined, then it looks like I would similarly need to walk the AST looking for decl's within a crate (i.e. outside of any functions).

If the above doesn't make sense, then at least it looks like updating debuginfo::populate_scope_map::walk_decl() to handle the DeclItem case for ItemStatic items would get us the ability to debug statics declared within a function.

@michaelwoerister
Contributor

For statics outside any function debuginfo::namespace_for_item() will create a correct (namespace/module) scope. For statics inside functions debuginfo::populate_scope_map() will have to be extend, as you say.

As for the missing DW_AT_location attribute, I'll need to look into that. The relevant LLVM code is in CompileUnit::createGlobalVariableDIE() in src/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp.

@gentlefolk
Contributor

It turns out things work just fine, but you have to access the variables using the filename as a namespace. So for example, static B: bool = true; in file some_bool.rs needs to be inspected as 'some_bool'::B in gdb.

The location in the DWARF information is present, but it is specified using the DW_AT_declaration and DW_AT_specification attributes instead of being associated with the reset of the variable's information (name, type, etc.).

@michaelwoerister
Contributor

If you use debuginfo::namespace_for_item() to find the correct namespace for the variable (as you should), then the variable will be in a namespace reflecting the module structure of the program, with the name of the crate prepended. As files often define new modules in Rust, module names and file names will often conincide.

In your example above, what happens if you have a file like the following:

// file name: some_bool.rs

static B: bool = true;

mod nested {
    static C: bool = false;
}

Is C accessible in GDB as some_bool::C or as some_bool::nested::C? Is it accessible at all? Do you need the single quotes around the namespace?

My guess would be that it's not the file name that one needs to prefix but the module path.

@gentlefolk
Contributor

The single quotes are only necessary to escape the "-" in file names at the GDB prompt. Looks like C is not accessible at all.

(gdb) p some_bool::B
$1 = true
(gdb) p some_bool::C
No symbol "C" in namespace "some_bool".
(gdb) p some_bool::nested::C
No type "nested" within class or namespace "some_bool".
@michaelwoerister
Contributor

This seems to be a known issue with GDB:
http://osdir.com/ml/gdb.bugs.discuss/2002-11/msg00063.html (that's right, 2002 😵)
If you write p 'some_bool::nested::C'it should work just fine.

Great work so far!

@gentlefolk
Contributor

Good find, p 'some_bool::nested::C' does indeed work. At this point I'd say crate level static variables are working as they should from gdb. I may write a few more tests, but I don't expect any issues.

I've now started looking into adding debug info for statics within functions, and am stuck trying track down the source of a segmentation fault that occurs after passing the scope metadata for a function level static into llvm::LLVMDiBuilderCreateStaticVariable().

If you could take a look at the code below and see if you can spot anything it'd be appreciated, as attempting to build rustc from this with debug info enabled seems to trigger the segfault as well.

gentlefolk@294d042#diff-97348589594fa57d1623a3da78753558R326

@michaelwoerister
Contributor

Sure, I'll take a look at it tomorrow.

@michaelwoerister
Contributor

The FunctionContext you pass to create_global_var_metadata() does not reliable reflect whether the given static variable is indeed declared within a function or not. For example, fcx is not propagated by the TransItemVisitor used for generic functions and there is a code path through consts::get_const_val() which also always passes None to consts::trans_const().

However, that isn't the cause of the segfault. It seems that LLVM just doesn't fully support debuginfo for function static variables yet. In theory you do the right thing, but in practice the scope of a static variable mustn't be a lexical scope DIE. If you always assign the function DIE as scope for the static, the crash goes away.

Of course we loose information this way. Clang does so too, at the moment:

#include<stdio.h>
int main() {
    static int X = 10;
    {
        static bool X = false;
        printf("...");
    }
    printf("...");
}

If you compile this with clang 3.4 and break at the printf() statements (gdb or lldb) then p X will always show X to be 10. The same code compiled with GCC shows X to be false in the first case and 10 in the second, as it should be.

So it seems we are out of luck with this, at the moment. My suggestion would be to file a bug report with LLVM and fall back to just always assigning function scope to statics for now.

@gentlefolk
Contributor

Thanks, good find. I'll take a look at the LLVM bugzilla and if this isn't in there I'll file a new issue.

Since getting function statics working looks like its going to take a fair chunk of work, I think I'd rather polish off what I have so far for crate level statics and capture what we've discovered for functions in another bug.

@michaelwoerister
Contributor

Sounds like a plan :)
The work you've done so far is already very valueable on its own. Function statics are much less common anyway.

@gentlefolk gentlefolk added a commit to gentlefolk/rust that referenced this issue Mar 25, 2014
@gentlefolk gentlefolk Initial support for emitting DWARF for static vars.
Only supports crate level statics. No debug info is generated for
function level statics. Closes #9227.
df7e14a
@gentlefolk gentlefolk added a commit to gentlefolk/rust that referenced this issue Mar 28, 2014
@gentlefolk gentlefolk Initial support for emitting DWARF for static vars.
Only supports crate level statics. No debug info is generated for
function level statics. Closes #9227.
190c240
@bors bors added a commit that referenced this issue Mar 29, 2014
@bors bors auto merge of #13143 : gentlefolk/rust/issue-9227, r=michaelwoerister
Only supports crate level statics. No debug info is generated for function level statics. Closes #9227.

As discussed at the end of the comments for #9227, I took an initial stab at adding support for function level statics and decided it would be enough work to warrant being split into a separate issue.

See #13144 for the new issue describing the need to add support for function level static variables.
4f86de5
@bors bors added a commit that referenced this issue Mar 29, 2014
@bors bors auto merge of #13143 : gentlefolk/rust/issue-9227, r=michaelwoerister
Only supports crate level statics. No debug info is generated for function level statics. Closes #9227.

As discussed at the end of the comments for #9227, I took an initial stab at adding support for function level statics and decided it would be enough work to warrant being split into a separate issue.

See #13144 for the new issue describing the need to add support for function level static variables.
2a9eddb
@bors bors added a commit that referenced this issue Mar 29, 2014
@bors bors auto merge of #13143 : gentlefolk/rust/issue-9227, r=michaelwoerister
Only supports crate level statics. No debug info is generated for function level statics. Closes #9227.

As discussed at the end of the comments for #9227, I took an initial stab at adding support for function level statics and decided it would be enough work to warrant being split into a separate issue.

See #13144 for the new issue describing the need to add support for function level static variables.
3eb3a02
@bors bors pushed a commit that closed this issue Mar 29, 2014
@gentlefolk gentlefolk Initial support for emitting DWARF for static vars.
Only supports crate level statics. No debug info is generated for
function level statics. Closes #9227.
f4518cd
@bors bors closed this in f4518cd Mar 29, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment