Incorrect link of .debug_macro with mold 1.1.1 #438

Closed
tromey opened this issue Apr 17, 2022 · 8 comments

@tromey

tromey commented Apr 17, 2022

I noticed that gdb became very slow when I switched to linking with mold, so I spent a bit of time tracking it down. I found a link problem. The short form is that the resulting .debug_macro has imports like this:

prentzel. readelf --debug-dump=macro ./r | grep 'import.*0x0'
 DW_MACRO_import - offset : 0x0
 DW_MACRO_import - offset : 0x0
 DW_MACRO_import - offset : 0x0

I'm using

prentzel. mold --version
mold 1.1.1 (compatible with GNU ld)

on x86-64 Fedora 34.

To reproduce, make three files. First, r.h:

#define A 23
#define B 99

Then, r.cc:

#include "r.h"
extern int z();
int main () { return  z() - 122; }

Finally, z.cc:

#include "r.h"
int z()  { return A+B; }

Now:

prentzel. gcc -g3 -O0 -o r r.cc z.cc
prentzel. readelf --debug-dump=macro ./r | grep 'import.*0x0'
 DW_MACRO_import - offset : 0x0
 DW_MACRO_import - offset : 0x0
 DW_MACRO_import - offset : 0x0
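
For a larger binary, the grep above can be turned into a small script that counts the zero-offset imports. This is only a sketch: it assumes the `readelf --debug-dump=macro` line format shown above, and the names `import_offsets` and `zero_imports` are made up for illustration.

```python
import re

# Matches lines such as " DW_MACRO_import - offset : 0x0" from the dump above.
IMPORT_RE = re.compile(r"DW_MACRO_import\s*-\s*offset\s*:\s*(0x[0-9a-fA-F]+)")


def import_offsets(dump_text):
    """Return the target offsets of all DW_MACRO_import entries, in order."""
    return [int(m.group(1), 16) for m in IMPORT_RE.finditer(dump_text)]


def zero_imports(dump_text):
    """Count the imports whose target offset is exactly 0."""
    return sum(1 for offset in import_offsets(dump_text) if offset == 0)
```

To use it, capture the output of `readelf --debug-dump=macro ./r` (for example with `subprocess.run(..., capture_output=True, text=True)`) and pass the text to `zero_imports`.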
@tromey
Author

tromey commented Apr 17, 2022

Sorry, hit C-enter and that submitted this before I was ready. Anyway, if I switch to ld.gold, the result seems alright. That doesn't rule out some sort of assembler or compiler bug, but I suppose makes it less likely.

@rui314
Owner

rui314 commented Apr 18, 2022

Thank you for your report! What happens here is that

  1. The linker de-duplicates COMDAT groups to remove duplicate definitions, which in this case removes 3 out of 4 .debug_macro sections from z.o.
  2. The remaining .debug_macro section has references to the now-removed .debug_macro sections, so mold handles them as if they were at address zero. That's why DW_MACRO_import has offset 0.

Handling references to dead sections as zeros is common practice in debug sections, so I don't know if this is necessarily a bug. As far as I know, LLVM's lld linker also writes 0 for such references in .debug_macro sections, so can you try again with lld to see if the problem persists?
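
The two steps above can be sketched as a toy model. This is not mold's code: `Section`, `link`, `make_object`, the `wm4.N` group identifiers, and the section sizes are all hypothetical, and keeping the first copy of each group is an assumption made for the sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Section:
    label: str                        # hypothetical label, for readability only
    size: int
    group: Optional[str] = None       # COMDAT group identifier, None if ungrouped
    imports: List["Section"] = field(default_factory=list)  # referenced sections
    offset: Optional[int] = None      # output offset; None means discarded


def link(objects):
    """Toy link of the .debug_macro sections from several object files."""
    seen_groups = set()
    kept = []
    for obj in objects:
        for sec in obj:
            if sec.group is not None:
                if sec.group in seen_groups:
                    continue          # step 1: duplicate group, drop its section
                seen_groups.add(sec.group)
            kept.append(sec)
    # Lay the surviving sections out back to back.
    pos = 0
    for sec in kept:
        sec.offset = pos
        pos += sec.size
    # Step 2: a reference to a discarded section is written as 0.
    return {
        sec.label: [t.offset if t.offset is not None else 0 for t in sec.imports]
        for sec in kept
    }


def make_object(name):
    """One main macro section importing three grouped header sections."""
    groups = [Section(f"{name}:wm4.{i}", 0x20, group=f"wm4.{i}") for i in range(3)]
    return [Section(f"{name}:main", 0x10, imports=groups)] + groups


result = link([make_object("r.o"), make_object("z.o")])
```

In this model the section from z.o that survives ends up with three imports of offset 0, while the one from r.o keeps valid offsets. Offset 0 is also where the first kept section starts in the model, so the zeroed imports point at an unrelated unit rather than at nothing.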

@tromey
Author

tromey commented Apr 18, 2022

> Handling references to dead sections as zeros is common practice in debug sections, so I don't know if this is necessarily a bug.

It seems to me that if section A has a reference to section B (which is identical to C), then either removing B is incorrect (since it is referenced) or the references must be switched to refer to C (since they are identical).
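
To make the difference concrete, here is a toy comparison of the zero-fill behavior with the "switch the references to C" option. It is only an illustration: the `KEPT` table, its offsets, and `resolve_import` are hypothetical and do not describe how any particular linker is implemented.

```python
# Output offsets of the surviving copy of each COMDAT group (hypothetical values).
KEPT = {"wm4.0": 0x10, "wm4.1": 0x30, "wm4.2": 0x50}


def resolve_import(group_id, own_copy_kept, redirect_to_kept):
    """Return the offset written for one DW_MACRO_import reference."""
    if own_copy_kept:
        return KEPT[group_id]         # normal case: the target survived
    if redirect_to_kept:
        return KEPT[group_id]         # point at the identical surviving copy
    return 0                          # treat the dead target as address zero


# z.o's own copies of all three groups were discarded.
z_imports = ["wm4.0", "wm4.1", "wm4.2"]
as_zero = [resolve_import(g, False, False) for g in z_imports]
redirected = [resolve_import(g, False, True) for g in z_imports]
```

With redirection, z.o's imports resolve to the same offsets as r.o's; with zero-fill, all three collapse to 0.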

> As far as I know, LLVM's lld linker also writes 0 for such references in .debug_macro sections, so can you try again with lld to see if the problem persists?

LLVM does show the problem, but gold and the binutils ld do not.

@rui314
Owner

rui314 commented Apr 19, 2022

> It seems to me that if section A has a reference to section B (which is identical to C), then either removing B is incorrect (since it is referenced) or the references must be switched to refer to C (since they are identical).

This is controlled by the COMDAT group mechanism. According to the spec, if two COMDAT groups have the same identifier, we can choose one and remove the other. Whether their members are identical is not examined. So, it is valid to remove sections here. I believe GNU linkers also remove sections.
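
A minimal sketch of that selection rule, assuming the linker keeps the first group it sees for each identifier (the rule only says one is chosen). `select_comdat_groups`, the `wm4.N` identifiers, and the payload bytes are hypothetical.

```python
def select_comdat_groups(groups):
    """groups: (identifier, payload) pairs in link order.

    Keeps the first group seen for each identifier and drops the rest.
    """
    kept, dropped = {}, []
    for identifier, payload in groups:
        if identifier in kept:
            dropped.append((identifier, payload))  # payload is never compared
        else:
            kept[identifier] = payload
    return kept, dropped


kept, dropped = select_comdat_groups([
    ("wm4.0", b"#define A 23"),
    ("wm4.0", b"#define A 23"),       # identical duplicate
    ("wm4.1", b"#define B 99"),
    ("wm4.1", b"something else"),     # same identifier, different bytes: still dropped
])
```

Only the identifier decides which group survives, so a group with the same identifier but different contents is discarded just like an exact duplicate.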

> LLVM does show the problem, but gold and the binutils ld do not.

With lld, did you observe not only that some fields were set to zero but also the same slowness?

@tromey
Author

tromey commented Apr 19, 2022

> > LLVM does show the problem, but gold and the binutils ld do not.
>
> With lld, did you observe not only that some fields were set to zero but also the same slowness?

Yes, just now I re-did my experiment of running gdb on itself and setting a breakpoint. For a gdb linked with gold, the CU expansion took ~0.3 seconds; for a gdb linked with lld, it took ~2.7 seconds. The difference is caused by pathological .debug_macro data.

@tromey
Author

tromey commented Apr 19, 2022

> This is controlled by the COMDAT group mechanism. According to the spec, if two COMDAT groups have the same identifier, we can choose one and remove the other. Whether their members are identical is not examined. So, it is valid to remove sections here. I believe GNU linkers also remove sections.

This behavior still seems weird to me, but I did find that there's a GCC bug report about this being separately discovered, reported against lld, and being rejected there: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91239

I'm going to reply there with a link back to this bug.

@Romain-Geissler-1A
Contributor

Romain-Geissler-1A commented Apr 20, 2022

Hi,

The GCC folks have replied again that they think this should be supported by the linker. I don't have the technical knowledge to say whether it's a compiler issue or a linker issue ;) Somehow a discussion will need to be held between all of you to reach an agreement. CC-ing @MaskRay for the lld side.

Cheers,
Romain

@rui314
Owner

rui314 commented Apr 20, 2022

Thank you for letting me know. I left a comment at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91239
