amdllpc does not preserve debug information #513

inequation · 2020-03-18T18:56:45Z

Hello there,

I'm not even sure if I'm posting this issue on the correct project in the suite, so apologies if that isn't the case.

I'm trying to compile a simple Vulkan fragment shader with amdllpc. According to the AMDGPU target user guide, the ELFs it produces can contain a .debug section with DWARF data. However, I can't seem to get one:

$ llvm-objdump --all-headers ellipse.elf

ellipse.elf:    file format ELF64-amdgpu

architecture: amdgcn
start address: 0x0000000000000000

Program Header:

Dynamic Section:
Sections:
Idx Name            Size     VMA              Type
  0                 00000000 0000000000000000
  1 .strtab         00000052 0000000000000000
  2 .text           00000250 0000000000000000 TEXT
  3 .note           000002a4 0000000000000000
  4 .AMDGPU.disasm  000017c8 0000000000000000
  5 .note.GNU-stack 00000000 0000000000000000
  6 .symtab         00000048 0000000000000000

SYMBOL TABLE:
0000000000000098         .text  00000000 BB0_1
0000000000000000 g     F .text  00000250 _amdgpu_ps_main

I've figured out that I need to set -trim-debug-info=false so that the SPIR-V debug info isn't stripped, and I had a look at the SPIR-V lowering code, which seems to preserve debug info. I can also see in the LLVM bitcode emitted to stdout that some symbols are there. How do I get the DWARF info out?

Here's the command line I'm using:

$ amdllpc -gfxip=9.0.6 ../ellipse.frag -trim-debug-info=false -enable-outs

And attached is the stdout output, which contains the GLSL source and all the intermediate stages: outs.log
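A quick way to confirm whether any DWARF sections made it into the ELF is to scan the section table for names starting with .debug. The following is a minimal Python sketch, not part of amdllpc or LLVM. It assumes the `llvm-objdump --all-headers` text layout shown above (Idx, Name, Size, VMA, Type), and the function names are made up for illustration.

```python
import re

# One row of the "Sections:" table: index, optional name,
# 8-hex-digit size, 16-hex-digit VMA (layout assumed from the listing above).
SECTION_ROW = re.compile(
    r"^\s*\d+\s+(\S*)\s+[0-9a-fA-F]{8}\s+[0-9a-fA-F]{16}\b"
)


def section_names(objdump_text):
    """Return the section names listed in llvm-objdump's section table."""
    names = []
    in_table = False
    for line in objdump_text.splitlines():
        if line.lstrip().startswith("Idx Name"):
            in_table = True  # rows start on the next line
            continue
        if in_table:
            match = SECTION_ROW.match(line)
            if not match:
                break  # first non-row line ends the table
            names.append(match.group(1))
    return names


def debug_sections(objdump_text):
    """Return only the sections whose name starts with '.debug'."""
    return [n for n in section_names(objdump_text) if n.startswith(".debug")]
```

For the listing above, `debug_sections` returns an empty list, which matches the observation that no DWARF data is present.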


kuhar · 2020-03-18T23:56:55Z

Hi @inequation,
From what I know, shader debugging is an area under active development across all fronts (dxc, SPIRV-Tools, llpc, driver), but is not there yet. @jaebaek is an expert on the DXC and SPIRV-Tools side of things.

What do you want to do with the debug info?

Although this is not directly related to your question, note that llvm-objdump doesn't currently produce a complete disassembly that would allow you to modify the dump and get a valid ELF after re-assembling with llvm-mc. IIRC the .note section is not being fully dumped.

inequation · 2020-03-19T07:52:54Z

Hi @kuhar,

I'm working on a shader programming educational tool. GCN/RDNA ISA can be intimidating initially for inexperienced programmers, especially long, unrolled loops, and I'm in need of a way to provide some context for the generated disassembly ("where does this massive blob of instructions come from?"). I would have linked a publication about the software I'm writing, but it's a team effort and it hasn't been published yet. :)

I'm sure you can see how having that debug info available would be useful in other contexts.

I'm fine with parsing the DWARF myself, I just don't know the LLVM/LLPC codebases nearly well enough to know where to start looking to ensure it is actually dumped.

nhaehnle · 2020-03-24T13:51:42Z

I agree with @kuhar: it's a known limitation because so far, nobody has done the work of just tracking the debug info through the compile pipeline to make sure it doesn't disappear. Work on this would of course be welcome, otherwise we'll surely get to it eventually, but with no date attached ;)

inequation · 2020-03-25T08:27:34Z

I'm fine with working on this myself and contributing the work here, but some guidance would be necessary. For instance, I wasn't able to find my way through all the abstraction layers of LLVM to where bitcode gets lowered to GCN ISA, or where the .debug ELF section gets populated. All I need is a shortcut to these places, and I can begin further investigation on my own. Can you guys help with that?

Flakebi · 2020-03-25T10:01:51Z

The part in LLPC which adds the necessary LLVM passes to convert IR to an ELF should be this line:

llpc/llpc/builder/llpcBuilderContext.cpp, line 264 in d58bd5f:

    if (GetTargetMachine()->addPassesToEmitFile(passMgr, outStream, nullptr, codegen::getFileType()))

The ELF writing code in LLVM that is specific to the AMDGPU backend should be in the llvm/lib/Target/AMDGPU/MCTargetDesc directory of LLVM: https://github.com/llvm/llvm-project/tree/master/llvm/lib/Target/AMDGPU/MCTargetDesc
I think AMDGPUTargetStreamer.cpp and AMDGPUAsmPrinter.cpp are the main files responsible for creating ELF files (please correct me if I’m mistaken).

Btw, if you start amdllpc with -debug you get loads of output of all the other stages that happen after IR (SelectionDAG and MachineIR). If symbols get lost somewhere this might be helpful.

nhaehnle · 2020-03-25T10:53:56Z

> Btw, if you start amdllpc with -debug you get loads of output of all the other stages that happen after IR (SelectionDAG and MachineIR). If symbols get lost somewhere this might be helpful.

-print-before-all / -print-after-all also applies to amdllpc and is extremely helpful for understanding the flow of compilation.
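Once a -print-after-all dump has been captured to a file, one rough way to see where debug locations vanish is to count debug references in each per-pass dump. The sketch below is only an illustration under stated assumptions: it assumes each dump is introduced by a banner containing `*** IR Dump After <name> ***`, that LLVM IR instructions carry `!dbg` attachments, and that Machine IR prints `debug-location`. The helper names are hypothetical, and because dumps can be per-function, the counts are only meaningful for a shader with a single function.

```python
import re

# Banner assumed to precede each dump produced by -print-after-all.
BANNER = re.compile(r"\*\*\* IR Dump After (.+?) \*\*\*")


def debug_refs_per_pass(dump_text):
    """Return [(pass_name, debug_reference_count), ...] in dump order."""
    results = []
    current = None
    count = 0
    for line in dump_text.splitlines():
        match = BANNER.search(line)
        if match:
            if current is not None:
                results.append((current, count))
            current, count = match.group(1), 0
        elif current is not None:
            # "!dbg" appears on LLVM IR instructions, "debug-location" in MIR.
            count += line.count("!dbg") + line.count("debug-location")
    if current is not None:
        results.append((current, count))
    return results


def first_pass_losing_debug(results):
    """Return the first pass whose dump has no debug references
    although the previous dump still had some, or None."""
    previous = None
    for name, count in results:
        if previous is not None and previous > 0 and count == 0:
            return name
        previous = count
    return None
```

The result is a hint about where to start reading the dump, not proof that the named pass is at fault.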


inequation · 2020-06-18T21:36:27Z

Hello there! Long time, no see.

I finally got to explore this a bit, and with two hacks inside the LLVM codebase, I was able to generate an ELF with the following debug line info:

inequation@Spearhead:/mnt/d/projects/GPUOpen-Drivers/vulkandriver/drivers/xgl/builds/Debug64$ readelf --debug-dump=decodedline ellipse.elf
readelf: Error: Missing knowledge of 32-bit reloc types used in DWARF sections of machine number 224
readelf: Warning: unable to apply unsupported reloc type 3 to section .debug_line
Decoded dump of debug contents of section .debug_line:

CU: <stdin>:
File name                            Line number    Starting address
<stdin>                                       20                0x18

<stdin>                                       25                0x30
<stdin>                                       20                0x38
<stdin>                                       25                0x4c
<stdin>                                       21                0x58
<stdin>                                       20                0x60
<stdin>                                        0                0x68
<stdin>                                       12                0x98
<stdin>                                       15                0x9c
<stdin>                                       13                0xa0
<stdin>                                       15                0xa4
<stdin>                                       13                0xa8
<stdin>                                       15                0xac
<stdin>                                       36                0xc0
<stdin>                                       13                0xc8
<stdin>                                       15                0xcc
<stdin>                                       36                0xd0
<stdin>                                       15                0xd8
<stdin>                                       39                0xe0
<stdin>                                       13                0xe4
<stdin>                                       15                0xe8
<stdin>                                       36                0xf4
<stdin>                                       13                0xfc
<stdin>                                       39               0x100
<stdin>                                       15               0x108
<stdin>                                       36               0x114
<stdin>                                       39               0x11c
<stdin>                                       40               0x120
<stdin>                                       15               0x128
<stdin>                                       36               0x130
<stdin>                                       39               0x138
<stdin>                                       42               0x13c
<stdin>                                       13               0x140
<stdin>                                       40               0x144
<stdin>                                       15               0x14c
<stdin>                                       36               0x150
<stdin>                                       39               0x158
<stdin>                                       42               0x15c
<stdin>                                       40               0x168
<stdin>                                       13               0x170
<stdin>                                       15               0x174
<stdin>                                       39               0x17c
<stdin>                                       42               0x180
<stdin>                                       40               0x18c
<stdin>                                       15               0x194
<stdin>                                       36               0x198
<stdin>                                       42               0x1a0
<stdin>                                       40               0x1ac
<stdin>                                       15               0x1b4
<stdin>                                       39               0x1bc
<stdin>                                       13               0x1c0
<stdin>                                       42               0x1c4
<stdin>                                       15               0x1d0
<stdin>                                       36               0x1d4
<stdin>                                       40               0x1dc
<stdin>                                       42               0x1e4
<stdin>                                       15               0x1e8
<stdin>                                       39               0x1f0
<stdin>                                       42               0x1f4
<stdin>                                       36               0x1f8
<stdin>                                       40               0x200
<stdin>                                       42               0x208
<stdin>                                       39               0x210
<stdin>                                       40               0x214
<stdin>                                       42               0x218
<stdin>                                       34               0x21c
<stdin>                                       42               0x220
<stdin>                                       53               0x224
<stdin>                                       42               0x228
<stdin>                                       53               0x22c
<stdin>                                       45               0x230

And the sequence of line numbers makes sense, comparing it to the source GLSL (available in the attachment to the OP), so this is progress!
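For the disassembly annotation use case, a table like the one above can be turned into an address-to-line lookup. This is a minimal Python sketch, assuming the three-column `readelf --debug-dump=decodedline` layout shown here (file name without spaces, line number, hexadecimal starting address); other readelf versions may print extra columns, and the function names are invented for illustration.

```python
import bisect
import re

# A data row: file name, decimal line number, hexadecimal starting address.
ROW = re.compile(r"^(\S+)\s+(\d+)\s+(0x[0-9a-fA-F]+)\b")


def parse_decodedline(text):
    """Return a list of (address, line) pairs sorted by address."""
    rows = []
    for raw in text.splitlines():
        match = ROW.match(raw.strip())
        if match:
            rows.append((int(match.group(3), 16), int(match.group(2))))
    rows.sort()
    return rows


def line_for_address(rows, address):
    """Return the source line of the last row starting at or before
    `address`, or None if the address precedes every row."""
    addresses = [a for a, _ in rows]
    index = bisect.bisect_right(addresses, address) - 1
    return rows[index][1] if index >= 0 else None
```

Note that in DWARF a line number of 0 means the instruction cannot be attributed to any source line, so a caller may want to treat a result of 0 like None.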

There are, however, clear problems (for starters, the empty source file name is replaced with <stdin>). They are explained to some extent by the hacks that were necessary to get this to work. I'm attaching them as a patch; in both cases, I'm disabling checks that would prevent emission of debug symbols. The problems, as far as my limited understanding of LLVM goes, amount to:

- machine instructions not belonging to a subprogram, which made DwarfDebug::beginInstruction() bail out before recording any source lines,
- compilation units containing none of the following: types, retained types, global variables, and macros, which prevented the registration of the DWARF compile unit and hit an assertion once the above hurdle was removed.

I'm quite sure these are just symptoms of problems that happen somewhere earlier up the pipeline. I'll keep digging, but I'd appreciate any guidance I can get.

inequation · 2020-06-18T23:13:07Z

I synced to latest, and of course the patch no longer applied cleanly. Here it is attached, updated to match latest GPUOpen-Drivers/llvm-project, along with the new, much shorter output:

inequation@Spearhead:/mnt/d/projects/GPUOpen-Drivers/vulkandriver/drivers/xgl/builds/Debug64$ readelf --debug-dump=decodedline ellipse.elf
readelf: Error: Missing knowledge of 32-bit reloc types used in DWARF sections of machine number 224
readelf: Warning: unable to apply unsupported reloc type 3 to section .debug_line
Decoded dump of debug contents of section .debug_line:

CU: <stdin>:
File name                            Line number    Starting address
<stdin>                                       20                0x20

<stdin>                                       25                0x3c
<stdin>                                       20                0x44
<stdin>                                       25                0x58
<stdin>                                       20                0x64
<stdin>                                       21                0x6c
<stdin>                                        0                0x74
<stdin>                                       12                0xa0
<stdin>                                       15                0xa4
<stdin>                                       13                0xa8
<stdin>                                       15                0xac
<stdin>                                       13                0xb0
<stdin>                                       15                0xb4
<stdin>                                       36                0xc8
<stdin>                                       15                0xd8
<stdin>                                       39                0xe0
<stdin>                                       36                0xe4
<stdin>                                       39                0xec
<stdin>                                       40                0xf4
<stdin>                                       39                0xfc
<stdin>                                       42               0x100
<stdin>                                       40               0x104
<stdin>                                       34               0x108
<stdin>                                       42               0x10c
<stdin>                                       53               0x118
<stdin>                                       42               0x11c
<stdin>                                       45               0x124

hacks.txt

inequation · 2020-06-19T17:48:09Z

I believe I have a proper fix for half of the issue. See the PR mentioned above (#772).

The other half is that the source file name gets lost somewhere along the way and becomes empty, which is later interpreted as <stdin>. #line directives are also lost, in terms of the source string number.

inequation · 2020-06-23T19:23:07Z

#756 has the potential to resolve all my problems, and now includes my changes from #772. Needs testing.

inequation · 2020-07-05T19:22:02Z

As of current head, this almost works as I need it to! Fixing source file name requires a fix within glslang, which I'll be trying to get in via KhronosGroup/glslang#2321.

jinjianrong assigned amdrexu Mar 26, 2020

amdrexu removed their assignment Jun 2, 2020

brianwatling self-assigned this Jun 16, 2020

inequation mentioned this issue Jun 19, 2020

Ensure the association of functions with DI subprograms #772

Closed

jinjianrong added the enhancement New feature or request label Jul 14, 2020

jinjianrong closed this as completed Mar 5, 2021

jinjianrong added fixed and removed enhancement New feature or request labels Mar 5, 2021
