GDB hardcodes startswith(producer, "clang ") to cope with LLVM debug line info in prologues. #41252

Open
eddyb opened this issue Apr 12, 2017 · 14 comments
Labels
A-debuginfo: Area: Debugging information in compiled programs (DWARF, PDB, etc.)
C-bug: Category: This is a bug.
T-compiler: Relevant to the compiler team, which will review and decide on the PR/issue.

@eddyb
Member

eddyb commented Apr 12, 2017

Only observed on x86_32, see #40367 (comment) and relevant code in GDB source.

LLVM appears to generate the .loc for the first LLVM IR instruction in a function just before the last instruction of the prologue, so GDB refuses to break on that location (by file and line number).

Not sure if this is solvable in GDB - perhaps by using the location information from the prologue after the prologue? Or maybe the intention is to actually never do that? It's really not clear from the GDB source.

Before #40367, we used to have a branch from the LLVM IR entry block to the MIR start block, which triggered the problem, but not in a user-visible way. Subsequent instructions were and still are unaffected.

As a workaround, I changed producer from rustc version _ to clang LLVM (rustc version _).
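A minimal sketch of why the renamed producer gets past the check. This is Rust, not GDB's actual C source: `trusts_line_table` is a hypothetical stand-in for the `startswith(producer, "clang ")` test named in the title, `workaround_producer` is an illustrative helper, and the version number is a placeholder.

```rust
/// Hypothetical stand-in for GDB's `startswith(producer, "clang ")` test.
fn trusts_line_table(producer: &str) -> bool {
    producer.starts_with("clang ")
}

/// Illustrative helper: wraps the usual rustc producer string in the
/// "clang LLVM (...)" form described in the workaround.
fn workaround_producer(rustc_producer: &str) -> String {
    format!("clang LLVM ({})", rustc_producer)
}

fn main() {
    // Placeholder version; the issue writes it as `rustc version _`.
    let original = "rustc version 1.18.0";
    let patched = workaround_producer(original);

    println!("{:?} passes the check: {}", original, trusts_line_table(original));
    println!("{:?} passes the check: {}", patched, trusts_line_table(&patched));
}
```

Only the prefix matters to such a check, which is why the real rustc version can stay inside the parentheses.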

cc @michaelwoerister @Manishearth

@eddyb changed the title on Apr 12, 2017
from: GDB hardcodes startswith(producer, "clang ") to handle LLVM debug line info.
to: GDB hardcodes startswith(producer, "clang ") to cope with LLVM debug line info in prologues.
@bstrie
Contributor

bstrie commented Apr 12, 2017

Why not future-proof this sucker and go with Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (LLVM, like Clang) Rust/1.18.0 Safari/537.36?

@bstrie added the A-debuginfo label (Area: Debugging information in compiled programs (DWARF, PDB, etc.)) on Apr 12, 2017
@tromey
Contributor

tromey commented May 7, 2017

The original gdb thread starts here: https://sourceware.org/ml/gdb-patches/2012-11/msg00492.html

It seems that the idea was to use producer sniffing to decide when the DWARF prologue information can be trusted; but it was done in a way that is specific to clang. LLVM doesn't seem to mention its own presence generically in the producer; that is, for clang I see just "clang version 3.8.1 (tags/RELEASE_381/final)".

I filed a gdb bug for this here: https://sourceware.org/bugzilla/show_bug.cgi?id=21470

@est31
Member

est31 commented May 7, 2017

IMO gdb should check for the presence of "clang" or for "LLVM". Then all languages like Rust that use llvm can put "LLVM" into their producer string when encountering the bug. We can then kindly ask clang to include "LLVM" into its producer string as well, and long term the "clang" check in gdb can be removed some day.
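The proposed check could look roughly like the sketch below. It is written in Rust purely for illustration; GDB is C, and `producer_is_llvm_based` is a hypothetical name. The string "rustc version 1.18.0 (LLVM)" is an invented example of a producer that opts in by mentioning "LLVM".

```rust
/// Sketch of the proposal: keep accepting the existing "clang " prefix,
/// and additionally accept any producer that mentions "LLVM".
fn producer_is_llvm_based(producer: &str) -> bool {
    producer.starts_with("clang ") || producer.contains("LLVM")
}

fn main() {
    let samples = [
        // Producer string quoted earlier in this thread.
        "clang version 3.8.1 (tags/RELEASE_381/final)",
        // Invented example of a front end opting in via "LLVM".
        "rustc version 1.18.0 (LLVM)",
        // A GCC-style producer, which should not match.
        "GNU C17 12.3.0",
    ];
    for producer in samples {
        println!("{:?} -> {}", producer, producer_is_llvm_based(producer));
    }
}
```

Under this scheme the "clang " prefix test becomes removable once clang itself includes "LLVM" in its producer string.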

@tromey
Contributor

tromey commented May 7, 2017

Yeah, that would be an improvement. I wonder, though, whether gdb can just blacklist producers instead. I plan to try that.

@est31
Member

est31 commented May 30, 2022

The check has now been updated to either look for clang or for flang at the start, so it still can't be changed back in rustc.

@Nilstrieb added the T-compiler label (Relevant to the compiler team, which will review and decide on the PR/issue.) on Apr 5, 2023
@dwblaikie

Does anyone happen to have a link to a bug or a summary/example of the issue, so that we might fix this on the LLVM side? Or discuss in more detail what the right behavior/solution is on all fronts?

@est31
Member

est31 commented Jan 11, 2024

Some producer strings used in the wild:

$ echo "int main(void) {return 0;}" > a.c ; clang -g a.c -o a.c.out ; dwarfdump ./a.c.out  | rg DW_AT_producer
                    DW_AT_producer              (indexed string: 0x00000000)clang version 16.0.6
$ echo "fn main(){}" > a.rs; rustc a.rs -o a.rs.out ; dwarfdump a.rs.out | rg DW_AT_producer
                    DW_AT_producer              clang LLVM (rustc version 1.76.0-beta.1 (0e09125c6 2023-12-21))
$ echo "int main(void) {return 0;}" > a.c ; gcc -g a.c -o a.c.out ; dwarfdump ./a.c.out  | rg DW_AT_producer
                    DW_AT_producer              GNU C17 12.3.0 -mtune=generic -march=x86-64 -g -O2 -fPIC -fstack-protector-strong -fno-strict-overflow -frandom-seed=jz8xqjsrfn --param=ssp-buffer-size=4
$ printf 'package main\n\nfunc main() {}' > a.go ; go build -o a.go.out a.go ; dwarfdump a.go.out | rg DW_AT_producer
                    DW_AT_producer              Go cmd/compile go1.21.4; regabi
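To see how a prefix test treats these strings, here is a small Rust sketch. `starts_with_clang` is a hypothetical stand-in for GDB's `startswith(producer, "clang ")`; it does not model the flang alternative mentioned earlier, and the GCC flag list is cut short for readability.

```rust
/// Hypothetical stand-in for GDB's prefix test; not GDB's real code.
fn starts_with_clang(producer: &str) -> bool {
    producer.starts_with("clang ")
}

/// Producer strings copied from the dwarfdump output above
/// (GCC flags truncated after -O2).
fn samples() -> Vec<(&'static str, &'static str)> {
    vec![
        ("clang", "clang version 16.0.6"),
        ("rustc", "clang LLVM (rustc version 1.76.0-beta.1 (0e09125c6 2023-12-21))"),
        ("gcc", "GNU C17 12.3.0 -mtune=generic -march=x86-64 -g -O2"),
        ("go", "Go cmd/compile go1.21.4; regabi"),
    ]
}

fn main() {
    for (compiler, producer) in samples() {
        println!("{:<6} matches prefix: {}", compiler, starts_with_clang(producer));
    }
}
```

Of the four, only the clang string and rustc's workaround string begin with "clang ".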

I don't think we should address this via improved producer string sniffing however, it's better to fix the debuginfo output.

@dwblaikie

oh, sorry, yes - that's what I meant. Does someone have an example of the problematic debuginfo output, so we could file a clang bug/look into it on that side?

@est31
Member

est31 commented Jan 14, 2024

@dwblaikie maybe this? #40367 (comment)

@dwblaikie

> @dwblaikie maybe this? #40367 (comment)

Any sense of what the input (LLVM IR, presumably) is to reproduce that output/behavior in LLVM?

& I'm a bit confused by https://sourceware.org/bugzilla/show_bug.cgi?id=21470 - which seems to suggest the Clang behavior is the correct one, as far as GDB's concerned - and it's basically "if clang, trust the line table, otherwise go and do some manual work to work around some bad producers (GCC)" & then they filed/got some GCC bug(s) fixed...

@tromey
Contributor

tromey commented Feb 1, 2024

> > @dwblaikie maybe this? #40367 (comment)
>
> & I'm a bit confused by https://sourceware.org/bugzilla/show_bug.cgi?id=21470 - which seems to suggest the Clang behavior is the correct one, as far as GDB's concerned

Yes, my understanding is that LLVM emits the end-of-prologue marker correctly, so GDB should respect it. However, GCC historically (don't know about today) did not, so has to be blacklisted. The issue for GDB, then, is detecting "LLVM" since there's no general indication in the producer string.

If that's incorrect, it would be good to have a note in the GDB bug for future reference.

I don't know if there's a separate rustc and/or LLVM problem here or not.

@dwblaikie

> > > @dwblaikie maybe this? #40367 (comment)
> >
> > & I'm a bit confused by https://sourceware.org/bugzilla/show_bug.cgi?id=21470 - which seems to suggest the Clang behavior is the correct one, as far as GDB's concerned
>
> Yes, my understanding is that LLVM emits the end-of-prologue marker correctly, so GDB should respect it. However, GCC historically (don't know about today) did not, so has to be blacklisted. The issue for GDB, then, is detecting "LLVM" since there's no general indication in the producer string.

Ah, OK - I just assumed the special case in GDB was to work around an LLVM bug, rather than to work around a GCC bug...

Huh, I wonder why rust or LLVM needed the fix - if GDB's fallback was to ignore the prologue marker and derive it itself - presumably that'd be no worse for LLVM than it is for GCC... still a bit confused. But at least seems like there's not some latent issue we should be fixing in LLVM - so I appreciate the discussion/clarifications.

@tromey
Contributor

tromey commented Feb 4, 2024

> Huh, I wonder why rust or LLVM needed the fix - if GDB's fallback was to ignore the prologue marker and derive it itself - presumably that'd be no worse for LLVM than it is for GCC... still a bit confused.

I think GCC follows the old convention of marking the prologue end with a double line note (two statements starting at the same spot), and GDB looks for this. Not sure if GDB falls back to prologue analyzer if this isn't found but I suspect if there's a line table it would not.

Anyway probably not an LLVM issue.
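The two ways of marking a prologue end discussed here can be contrasted with a toy model. Everything below is an illustrative assumption rather than GDB's or any compiler's real logic: `Row` is a heavily simplified line-table row, the addresses and line numbers are invented, and the "double line note" is read as the function's opening line appearing a second time at the first address after the prologue.

```rust
/// Simplified line-table row; real DWARF rows carry more state.
#[derive(Debug, Clone, Copy)]
struct Row {
    address: u64,
    line: u32,
    prologue_end: bool,
}

/// Marker-based lookup: address of the first row flagged `prologue_end`.
fn prologue_end_by_flag(rows: &[Row]) -> Option<u64> {
    rows.iter().find(|r| r.prologue_end).map(|r| r.address)
}

/// Convention-based lookup: address of the second row that repeats the
/// function's first line number (the assumed "double line note").
fn prologue_end_by_repeated_line(rows: &[Row]) -> Option<u64> {
    let first = rows.first()?;
    rows.iter()
        .skip(1)
        .find(|r| r.line == first.line && r.address > first.address)
        .map(|r| r.address)
}

/// Invented table in the marker style: an explicit flag, no repeated line.
fn sample_flagged() -> Vec<Row> {
    vec![
        Row { address: 0x0, line: 10, prologue_end: false },
        Row { address: 0x4, line: 11, prologue_end: true },
        Row { address: 0x9, line: 12, prologue_end: false },
    ]
}

/// Invented table in the repeated-line style: line 10 appears twice, no flag.
fn sample_repeated() -> Vec<Row> {
    vec![
        Row { address: 0x0, line: 10, prologue_end: false },
        Row { address: 0x4, line: 10, prologue_end: false },
        Row { address: 0x9, line: 11, prologue_end: false },
    ]
}

fn main() {
    for (name, rows) in [("flagged", sample_flagged()), ("repeated", sample_repeated())] {
        println!(
            "{}: by flag = {:?}, by repeated line = {:?}",
            name,
            prologue_end_by_flag(&rows),
            prologue_end_by_repeated_line(&rows)
        );
    }
}
```

In this model each lookup finds nothing in the other style's table, which is one way to picture why a debugger would want to know which convention the producer follows.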

@dwblaikie

> > Huh, I wonder why rust or LLVM needed the fix - if GDB's fallback was to ignore the prologue marker and derive it itself - presumably that'd be no worse for LLVM than it is for GCC... still a bit confused.
>
> I think GCC follows the old convention of marking the prologue end with a double line note (two statements starting at the same spot), and GDB looks for this. Not sure if GDB falls back to prologue analyzer if this isn't found but I suspect if there's a line table it would not.

Oh, I think I'm following. That then makes more sense why this should be a GCC opt-in "oh, it's GCC, I expect this particular representation (double line note) to tell me something" rather than an LLVM opt-out. Thanks for the context!

7 participants