coff-go32-exe: support variable length stub #1
…_coff_bfd_copy_private_bfd_data
…y created, and add 20 bytes for the coff header.
Most things seem to work now, e.g.
I think this is the problem:
…ILHDR structure, instead of the actual file header size from coff_backend_info.
Previous commit fixes the segfault.
…shorter than total header size.
d51ec80 fixes the clobbering issue, it was a use-after-free bug.
new attempt --> #2
Running anything with the fission.exp board fails since commit c0ab21c ("Replace init_cutu_and_read_dies with a class"). GDB crashes while reading the DWARF info. cu is NULL in read_signatured_type:

    Thread 1 "gdb" received signal SIGSEGV, Segmentation fault.
    0x000055555780663e in read_signatured_type (sig_type=0x6210000c3600) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:22782
    22782         gdb_assert (cu->die_hash == NULL);
    (top-gdb) bt
    #0  0x000055555780663e in read_signatured_type (sig_type=0x6210000c3600) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:22782
    #1  0x00005555578062dd in load_full_type_unit (per_cu=0x6210000c3600) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:22758
    #2  0x00005555577c5fb7 in queue_and_load_dwo_tu (slot=0x60600007fc00, info=0x6210000c34e0) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:12674
    #3  0x0000555559934232 in htab_traverse_noresize (htab=0x60b000063670, callback=0x5555577c5e61 <queue_and_load_dwo_tu(void**, void*)>, info=0x6210000c34e0) at /home/simark/src/binutils-gdb/libiberty/hashtab.c:775
    #4  0x00005555577c6252 in queue_and_load_all_dwo_tus (per_cu=0x6210000c34e0) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:12701
    #5  0x000055555777ebd8 in dw2_do_instantiate_symtab (per_cu=0x6210000c34e0, skip_partial=false) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:2371
    #6  0x000055555777eea2 in dw2_instantiate_symtab (per_cu=0x6210000c34e0, skip_partial=false) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:2395
    #7  0x0000555557786ab6 in dw2_lookup_symbol (objfile=0x614000007240, block_index=GLOBAL_BLOCK, name=0x602000025310 "main", domain=VAR_DOMAIN) at /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:3539

After creating the reader object, the reader.cu field should not be NULL.
By checking the commit previous to the faulty one mentioned above, I noticed that the cu field is normally set by init_cu_die_reader, called from read_cutu_die_from_dwo, itself called from cutu_reader::init_tu_and_read_dwo_dies, itself called from cutu_reader's constructor.

However, cutu_reader::init_tu_and_read_dwo_dies calls read_cutu_die_from_dwo, passing a pointer to a local `die_reader_specs` variable. So it's the `cu` field of that object that gets set. cutu_reader itself is a `die_reader_specs` (it inherits from it), and the intention was most likely to pass `this` to read_cutu_die_from_dwo. This way, the fields of the cutu_reader object, which read_signatured_type will use, are set.

With this, I am able to use:

    make check RUNTESTFLAGS='--target_board=fission'

and it looks much better. There are still some failures to be investigated, but that's the usual state of the testsuite.

gdb/ChangeLog:

    * dwarf2/read.c (cutu_reader::init_tu_and_read_dwo_dies): Remove reader variable, pass `this` to read_cutu_die_from_dwo.
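The mistake described above, filling in a temporary object instead of the reader itself, can be shown with a small Python analogy. This is only a sketch and not GDB code: the class and function names echo the ones in the commit message, but their bodies are hypothetical stand-ins that model nothing except the `cu` field.

```python
class DieReaderSpecs:
    """Stand-in for GDB's die_reader_specs; only the cu field is modelled."""

    def __init__(self):
        self.cu = None


def read_cutu_die_from_dwo(reader, cu):
    # Hypothetical stand-in: fills in reader.cu, as init_cu_die_reader would.
    reader.cu = cu


class BuggyCutuReader(DieReaderSpecs):
    def __init__(self, cu):
        super().__init__()
        local_reader = DieReaderSpecs()            # separate, temporary object
        read_cutu_die_from_dwo(local_reader, cu)   # sets local_reader.cu only


class FixedCutuReader(DieReaderSpecs):
    def __init__(self, cu):
        super().__init__()
        read_cutu_die_from_dwo(self, cu)           # analogous to passing `this`


buggy = BuggyCutuReader("some-cu")
fixed = FixedCutuReader("some-cu")
```

In the buggy variant the object that later code inspects still has an unset `cu`, which corresponds to the NULL dereference in the backtrace.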
I'm running into the following failure (and 17 more like it) in gdb.base/break-interp.exp:

    ...
    (gdb) bt^M
    #0  0x00007fde85a3b0c1 in __GI___nanosleep \
      (requested_time=requested_time@entry=0x7ffe5044ee70, \
      remaining=remaining@entry=0x7ffe5044ee70) at nanosleep.c:27^M
    #1  0x00007fde85a3affa in __sleep (seconds=0) at sleep.c:55^M
    #2  0x00007fde8606789c in libfunc (Reading in symbols for libc-start.c...^M
    action=0x7ffe5044fa12 "sleep") at gdb.base/break-interp-lib.c:41^M
    #3  0x0000000000400708 in main ()^M
    Reading in symbols for ../sysdeps/x86_64/start.S...^M
    (gdb) FAIL: gdb.base/break-interp.exp: LDprelinkNOdebugNO: \
      BINprelinkNOdebugNOpieNO: INNER: attach: attach main bt
    ...

The problem is that the test uses verbose mode to detect the "PIE (Position Independent Executable) displacement" messages, but the verbose mode also triggers "Reading in symbols for" messages, which may appear in the middle of a backtrace (or not, depending on whether debug info is available).

[ In fact, the messages appear in the middle of a backtrace line, which is PR25613. ]

Fix these FAILs by limiting the scope of verbose to the parts of the test that need it.

Tested on x86_64-linux.

gdb/testsuite/ChangeLog:

2020-03-11  Tom de Vries  <tdevries@suse.de>

    * gdb.base/break-interp.exp: Limit verbose scope.
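The idea of limiting a setting to the part of a test that needs it can be sketched in Python with a context manager. The actual test is written in Tcl/Expect. The `settings` dictionary, `run_command` helper and message text below are hypothetical and only model "verbose mode adds extra lines to the output".

```python
from contextlib import contextmanager

settings = {"verbose": False}   # hypothetical stand-in for GDB's verbose setting
log = []


@contextmanager
def verbose_scope():
    """Enable verbose mode for the duration of the block, then restore it."""
    old = settings["verbose"]
    settings["verbose"] = True
    try:
        yield
    finally:
        settings["verbose"] = old


def run_command(cmd):
    # Hypothetical command runner: verbose mode adds an extra message line.
    if settings["verbose"]:
        log.append(f"verbose message during {cmd}")
    log.append(f"output of {cmd}")


with verbose_scope():
    run_command("run")   # the displacement message is wanted here
run_command("bt")        # the backtrace stays free of verbose messages
```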
A patch somewhat like this patch has been in Fedora GDB for well over a decade. The Fedora patch was written by Jan Kratochvil. The Fedora version prints a warning and attempts to continue. This version will error out, fatally. An earlier version of this patch was more like the Fedora version than this one. Simon Marchi recommended use of an assertion to test for the infinite recursion; I decided to use an explicit test (with an "if" statement) along with a call to internal_error() if the condition is met. This way, I could include a plea to file a bug report.

It was motivated by a customer reported bug (back in 2006!) which showed infinite mutual recursion between find_pc_sect_line and find_pc_line. Here is a portion of the backtrace from the bug report:

    (gdb) bt
    #0  0x00000000004450a4 in lookup_minimal_symbol_by_pc_section (pc=251700325328, section=0x570f500) at gdb/minsyms.c:484
    #1  0x00000000004bbfb2 in find_pc_sect_line (pc=251700325328, section=0x570f500, notcurrent=0) at gdb/symtab.c:2057
    #2  0x00000000004bc480 in find_pc_line (pc=251700325328, notcurrent=0) at gdb/symtab.c:2232
    #3  0x00000000004bc1ff in find_pc_sect_line (pc=251700325328, section=0x570f500, notcurrent=0) at gdb/symtab.c:2081
    ...
    (lots and lots of the same two functions with the same parameters)
    #1070 0x00000000004bc480 in find_pc_line (pc=251700325328, notcurrent=0) at gdb/symtab.c:2232
    #1071 0x00000000004bc1ff in find_pc_sect_line (pc=251700325328, section=0x570f500, notcurrent=0) at gdb/symtab.c:2081
    #1072 0x00000000004bc480 in find_pc_line (pc=251700325328, notcurrent=0) at gdb/symtab.c:2232
    #1073 0x00000000004bc1ff in find_pc_sect_line (pc=251700325328, section=0x570f500, notcurrent=0) at gdb/symtab.c:2081
    #1074 0x00000000004bc480 in find_pc_line (pc=251700325328, notcurrent=0) at gdb/symtab.c:2232
    #1075 0x00000000004bc1ff in find_pc_sect_line (pc=251696794399, section=0x59b0df8, notcurrent=0) at gdb/symtab.c:2081
    #1076 0x00000000004bc480 in find_pc_line (pc=251696794399, notcurrent=0) at gdb/symtab.c:2232
    #1077 0x000000000055550e in find_frame_sal (frame=0xb3f3e0, sal=0x7fff1d1a8200) at gdb/frame.c:1392
    #1078 0x00000000004d86fd in set_current_sal_from_frame (frame=0x1648, center=1) at gdb/stack.c:379
    #1079 0x00000000004cf137 in normal_stop () at gdb/infrun.c:3147
    ...

The test case was a large application. Attempts were made to make a small(er) test case, but those attempts were not successful. Therefore, I cannot provide a new test for this patch.

That said, we ought to guard against recursively calling find_pc_sect_line (via find_pc_line) with the identical PC value that it had been called with. Should this happen, infinite recursion (as shown in the above backtrace) is the result. This patch prevents that from happening. If this should happen, there is a bug somewhere, perhaps in GDB, perhaps in some other part of the toolchain or a library. We error out fatally with a message briefly describing the condition along with a plea to file a bug report.

I spent some time looking at the surrounding code and commentary which handle the case of PC being in a stub/trampoline. It first appeared in the public GDB repository in April, 1999.
The ChangeLog entry for this commit is from 1998-12-31. The relevant portion is:

    (find_pc_sect_line): Return correct information if pc is in import or export stub (trampoline).

What's remarkable about the overall ChangeLog entry is that it's over 2500 lines long! I believe that this was part of the infamous "HP merge" (in which insufficient due diligence was given in accepting a large batch of changes from an outside source). In the years that followed, much of this code was either significantly revised or outright removed. For this particular case, I'm grateful that extensive comments were provided by "RT". (I haven't been able to figure out who RT is/was.)

I've decided against attempting to revise this stub/trampoline handling code any further than adding Jan's test which prevents an obvious case of infinite recursion.

I've tested on Fedora 31, x86-64. I see no regressions. I've also searched the logfile for the new message, but as expected, no message was found (which is good).

gdb/ChangeLog:

    * symtab.c (find_pc_sect_line): Add check which prevents infinite recursion.

Change-Id: I595470be6ab5f61ca7e4e9e70c61a252c0deaeaa
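The guard described in this commit message can be sketched in Python as follows. This is an illustration under stated assumptions, not the GDB implementation: `stub_target` is a hypothetical map from a stub PC to the PC it resolves to, `InternalError` stands in for GDB's internal_error(), and the error text is made up for the example.

```python
class InternalError(Exception):
    """Stand-in for a fatal GDB internal error."""


def find_pc_line(pc, stub_target):
    return find_pc_sect_line(pc, stub_target)


def find_pc_sect_line(pc, stub_target):
    # stub_target: hypothetical mapping from a stub/trampoline PC to the PC
    # of the function it leads to.
    if pc in stub_target:
        target = stub_target[pc]
        if target == pc:
            # Recursing with an identical PC would never terminate, so fail
            # loudly instead (hypothetical message text).
            raise InternalError(
                f"infinite recursion detected for pc {pc:#x}; "
                "please file a bug report")
        return find_pc_line(target, stub_target)
    return f"line info for {pc:#x}"
```

An explicit test, rather than an assertion, leaves room for a message that asks the user to report the problem, which is the design choice the commit message explains.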
The type struct compunit_symtab contains two fields (disregarding field next) that express relations with other compunit_symtabs: user and includes.

These fields are currently not printed with "maint info symtabs" and "maint print symbols".

Fix this such that for "maint info symtabs" we print:

    ...
       { ((struct compunit_symtab *) 0x23e8450)
         debugformat DWARF 2
         producer (null)
         dirname (null)
         blockvector ((struct blockvector *) 0x23e8590)
      +  user ((struct compunit_symtab *) 0x2336280)
      +  ( includes
      +    ((struct compunit_symtab *) 0x23e85e0)
      +    ((struct compunit_symtab *) 0x23e8960)
      +  )
             { symtab <unknown> ((struct symtab *) 0x23e85b0)
               fullname (null)
               linetable ((struct linetable *) 0x0)
             }
       }
    ...

And for "maint print symbols" we print:

    ...
    -Symtab for file <unknown>
    +Symtab for file <unknown> at 0x23e85b0
     Read from object file /data/gdb_versions/devel/a.out (0x233ccf0)
     Language: c

     Blockvector:

     block #000, object at 0x23e8530, 0 syms/buckets in 0x0..0x0
       block #1, object at 0x23e84d0 under 0x23e8530, 0 syms/buckets in 0x0..0x0

    +Compunit user: 0x2336300
    +Compunit include: 0x23e8900
    +Compunit include: 0x23dd970
    ...

Note: for user and includes we don't list the actual compunit_symtab address, but instead the corresponding symtab address, which allows us to find that symtab elsewhere in the output (given that we also now print the address of symtabs).

gdb/ChangeLog:

2020-03-25  Tom de Vries  <tdevries@suse.de>

    * symtab.h (is_main_symtab_of_compunit_symtab): New function.
    * symmisc.c (dump_symtab_1): Print user and includes fields.
    (maintenance_info_symtabs): Same.
When you have a Thumb only PLT then the address in the GOT for PLT0 needs to have the Thumb bit set since the instruction used in PLTn to get there is `ldr.w pc` which is an inter-working instruction:

the PLT sequence in question is

    00000120 <foo@plt>:
     120:   f240 0c98   movw    ip, #152    ; 0x98
     124:   f2c0 0c01   movt    ip, #1
     128:   44fc        add     ip, pc
     12a:   f8dc f000   ldr.w   pc, [ip]
     12e:   e7fc        b.n     12a <foo@plt+0xa>

    Disassembly of section .text:

    00000130 <bar>:
     130:   b580        push    {r7, lr}
     132:   af00        add     r7, sp, #0
     134:   f7ff fff4   bl      120 <foo@plt>

and previously the linker would generate

    Hex dump of section '.got':
     ...
     0x000101b8 40010100 00000000 00000000 10010000 @...............

Which would make it jump and transition out of thumb mode and crash since you only have thumb mode on such cores.

Now it correctly generates

    Hex dump of section '.got':
     ...
     0x000101b8 40010100 00000000 00000000 11010000 @...............

Thanks to Amol for testing patch and to rgujju for reporting it.

bfd/ChangeLog:

    PR ld/16017
    * elf32-arm.c (elf32_arm_populate_plt_entry): Set LSB of the PLT0 address in the GOT if in thumb only mode.

ld/ChangeLog:

    PR ld/16017
    * testsuite/ld-arm/arm-elf.exp (thumb-plt-got): New.
    * testsuite/ld-arm/thumb-plt-got.d: New test.
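The difference between the two hex dumps is a single bit, which a short Python sketch can reproduce. The address 0x110 is read off the last little-endian word of the dumps above. The helper name `got_entry_for_plt0` is hypothetical and is not the linker's code.

```python
import struct


def got_entry_for_plt0(plt0_address, thumb_only):
    """Return the 32-bit little-endian GOT word holding the PLT0 address."""
    if thumb_only:
        # Set the least significant bit so an interworking branch such as
        # `ldr.w pc` stays in Thumb state.
        plt0_address |= 1
    return struct.pack("<I", plt0_address)


before = got_entry_for_plt0(0x110, thumb_only=False).hex()
after = got_entry_for_plt0(0x110, thumb_only=True).hex()
```

Here `before` corresponds to the word `10010000` in the old dump and `after` to `11010000` in the new one.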
In PR28004 the following warning / Internal error is reported:

    ...
    $ gdb -q -batch \
        -iex "set sysroot $(pwd -P)/repro" \
        ./repro/gdb \
        ./repro/core \
        -ex bt
      ...
    Program terminated with signal SIGABRT, Aborted.
    #0  0x00007ff8fe8e5d22 in raise () from repro/usr/lib/libc.so.6
    [Current thread is 1 (LWP 1762498)]
    #1  0x00007ff8fe8cf862 in abort () from repro/usr/lib/libc.so.6
    warning: (Internal error: pc 0x7ff8feb2c21d in read in psymtab, \
      but not in symtab.)
    warning: (Internal error: pc 0x7ff8feb2c218 in read in psymtab, \
      but not in symtab.)
      ...
    #2  0x00007ff8feb2c21e in __gnu_debug::_Error_formatter::_M_error() const \
      [clone .cold] (warning: (Internal error: pc 0x7ff8feb2c21d in read in \
      psymtab, but not in symtab.)
    ) from repro/usr/lib/libstdc++.so.6
    ...

The warning is about the following:
- in find_pc_sect_compunit_symtab we try to find the address (0x7ff8feb2c218 / 0x7ff8feb2c21d) in the symtabs.
- that fails, so we try again in the partial symtabs.
- we find a matching partial symtab
- however, the partial symtab has a full symtab, so we should have found a matching symtab in the first step.

The addresses are:

    ...
    (gdb) info sym 0x7ff8feb2c218
    __gnu_debug::_Error_formatter::_M_error() const [clone .cold] in \
      section .text of repro/usr/lib/libstdc++.so.6
    (gdb) info sym 0x7ff8feb2c21d
    __gnu_debug::_Error_formatter::_M_error() const [clone .cold] + 5 in \
      section .text of repro/usr/lib/libstdc++.so.6
    ...

which correspond to unrelocated addresses 0x9c218 and 0x9c21d:

    ...
    $ nm -C repro/usr/lib/libstdc++.so.6.0.29 | grep 000000000009c218
    000000000009c218 t __gnu_debug::_Error_formatter::_M_error() const \
      [clone .cold]
    ...

which belong to function __gnu_debug::_Error_formatter::_M_error() in /build/gcc/src/gcc/libstdc++-v3/src/c++11/debug.cc.

The partial symtab that is found for the addresses is instead the one for /build/gcc/src/gcc/libstdc++-v3/src/c++98/bitmap_allocator.cc, which is incorrect.

This happens as follows.
The bitmap_allocator.cc CU has DW_AT_ranges at .debug_rnglist offset 0x4b50:

    ...
    00004b50 0000000000000000 0000000000000056
    00004b5a 00000000000a4790 00000000000a479c
    00004b64 00000000000a47a0 00000000000a47ac
    ...

When reading the first range 0x0..0x56, it doesn't trigger the "start address of zero" complaint here:

    ...
          /* A not-uncommon case of bad debug info.
             Don't pollute the addrmap with bad data.  */
          if (range_beginning + baseaddr == 0
              && !per_objfile->per_bfd->has_section_at_zero)
            {
              complaint (_(".debug_rnglists entry has start address of zero"
                           " [in module %s]"), objfile_name (objfile));
              continue;
            }
    ...

because baseaddr != 0, which seems incorrect given that when loading the shared library individually in gdb (and consequently baseaddr == 0), we do see the complaint.

Consequently, we run into this case in dwarf2_get_pc_bounds:

    ...
      if (low == 0 && !per_objfile->per_bfd->has_section_at_zero)
        return PC_BOUNDS_INVALID;
    ...

which then results in this code in process_psymtab_comp_unit_reader being called with cu_bounds_kind == PC_BOUNDS_INVALID, which sets the set_addrmap argument to 1:

    ...
          scan_partial_symbols (first_die, &lowpc, &highpc,
                                cu_bounds_kind <= PC_BOUNDS_INVALID, cu);
    ...

and consequently, the CU addrmap gets built using address info from the functions.

During that process, addrmap_set_empty is called with a range that includes 0x9c218 and 0x9c21d:

    ...
    (gdb) p /x start
    $7 = 0x9989c
    (gdb) p /x end_inclusive
    $8 = 0xb200d
    ...

but it's called for a function at DIE 0x54153 with DW_AT_ranges at 0x40ae:

    ...
    000040ae 00000000000b1ee0 00000000000b200e
    000040b9 000000000009989c 00000000000998c4
    000040c3 <End of list>
    ...

and neither range includes 0x9c218 and 0x9c21d.

This is caused by this code in partial_die_info::read:

    ...
                if (dwarf2_ranges_read (ranges_offset, &lowpc, &highpc, cu,
                                        nullptr, tag))
                  has_pc_info = 1;
    ...

which pretends that the function is located at addresses 0x9989c..0xb200d, which is indeed not the case.
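The effect of collapsing a discontiguous range list into one low/high pair can be checked with the numbers quoted above. The Python below is a simplified sketch, not the dwarf2_ranges_read implementation: the helper names are hypothetical and range ends are treated as exclusive.

```python
def bounding_range(ranges):
    """Collapse (start, end) pairs, end exclusive, into one low/high pair."""
    low = min(start for start, _ in ranges)
    high = max(end for _, end in ranges)
    return low, high


def contains(ranges, pc):
    """True if pc lies inside one of the individual ranges."""
    return any(start <= pc < end for start, end in ranges)


# The two ranges of the function at DIE 0x54153, from the dump at 0x40ae.
function_ranges = [(0xb1ee0, 0xb200e), (0x9989c, 0x998c4)]
low, high = bounding_range(function_ranges)
```

The bounding pair is 0x9989c..0xb200d (inclusive end), which covers 0x9c218 and 0x9c21d even though neither individual range does.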
This patch fixes the first problem encountered: fix the "start address of zero" complaint warning by removing the baseaddr part from the condition. Same for dwarf2_ranges_process.

The effect is that:
- the complaint is triggered, and
- the warning / Internal error is no longer triggered.

This does not fix the observed problem in partial_die_info::read, which is filed as PR28200.

Tested on x86_64-linux.

Co-Authored-By: Simon Marchi <simon.marchi@polymtl.ca>

gdb/ChangeLog:

2021-07-29  Simon Marchi  <simon.marchi@polymtl.ca>
            Tom de Vries  <tdevries@suse.de>

    PR symtab/28004
    * gdb/dwarf2/read.c (dwarf2_rnglists_process, dwarf2_ranges_process): Fix zero address complaint.
    * gdb/testsuite/gdb.dwarf2/dw2-zero-range-shlib.c: New test.
    * gdb/testsuite/gdb.dwarf2/dw2-zero-range.c: New test.
    * gdb/testsuite/gdb.dwarf2/dw2-zero-range.exp: New file.
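The change to the condition can be sketched in Python using the range list of the bitmap_allocator.cc CU. This is a simplified model, not the GDB code: `filter_ranges` and its `include_baseaddr` switch are hypothetical. The base address is the difference between the relocated and unrelocated addresses quoted in the commit message (0x7ff8feb2c218 - 0x9c218).

```python
def filter_ranges(ranges, baseaddr, has_section_at_zero, include_baseaddr):
    """Drop ranges that start at address zero; return (kept, complaints)."""
    kept, complaints = [], 0
    for begin, end in ranges:
        # Old condition: begin + baseaddr == 0.  New condition: begin == 0.
        start = begin + baseaddr if include_baseaddr else begin
        if start == 0 and not has_section_at_zero:
            complaints += 1      # corresponds to the "start address of zero" complaint
            continue
        kept.append((begin, end))
    return kept, complaints


cu_ranges = [(0x0, 0x56), (0xa4790, 0xa479c), (0xa47a0, 0xa47ac)]
BASEADDR = 0x7ff8feb2c218 - 0x9c218   # load address of the shared library

old_kept, old_complaints = filter_ranges(cu_ranges, BASEADDR, False, True)
new_kept, new_complaints = filter_ranges(cu_ranges, BASEADDR, False, False)
```

With the old condition the bogus 0x0..0x56 range survives whenever the library is loaded at a nonzero address. With the new condition it is rejected regardless of the load address.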
While working on the testsuite, I ended up noticing that GDB fails to produce a full backtrace from a thread waiting in pthread_join. When selecting the waiting thread and using the 'bt' command, the following result can be observed:

    (gdb) bt
    #0  0x0000003ff7fccd20 in __futex_abstimed_wait_common64 () from /lib/riscv64-linux-gnu/libpthread.so.0
    #1  0x0000003ff7fc43da in __pthread_clockjoin_ex () from /lib/riscv64-linux-gnu/libpthread.so.0
    Backtrace stopped: frame did not save the PC

On my platform, I do not have debug symbols for glibc, so I need to rely on prologue analysis in order to unwind the stack.

Here is what the function prologue looks like:

    (gdb) disassemble __pthread_clockjoin_ex
    Dump of assembler code for function __pthread_clockjoin_ex:
       0x0000003ff7fc42de <+0>:     addi    sp,sp,-144
       0x0000003ff7fc42e0 <+2>:     sd      s5,88(sp)
       0x0000003ff7fc42e2 <+4>:     auipc   s5,0xd
       0x0000003ff7fc42e6 <+8>:     ld      s5,-2(s5) # 0x3ff7fd12e0
       0x0000003ff7fc42ea <+12>:    ld      a5,0(s5)
       0x0000003ff7fc42ee <+16>:    sd      ra,136(sp)
       0x0000003ff7fc42f0 <+18>:    sd      s0,128(sp)
       0x0000003ff7fc42f2 <+20>:    sd      s1,120(sp)
       0x0000003ff7fc42f4 <+22>:    sd      s2,112(sp)
       0x0000003ff7fc42f6 <+24>:    sd      s3,104(sp)
       0x0000003ff7fc42f8 <+26>:    sd      s4,96(sp)
       0x0000003ff7fc42fa <+28>:    sd      s6,80(sp)
       0x0000003ff7fc42fc <+30>:    sd      s7,72(sp)
       0x0000003ff7fc42fe <+32>:    sd      s8,64(sp)
       0x0000003ff7fc4300 <+34>:    sd      s9,56(sp)
       0x0000003ff7fc4302 <+36>:    sd      a5,40(sp)

As far as prologue analysis is concerned, the most interesting part is done at address 0x0000003ff7fc42ee (<+16>): 'sd ra,136(sp)'. This stores the RA (return address) register on the stack, which is the information we are looking for in order to identify the caller.

In the current implementation of the prologue scanner, GDB stops when hitting 0x0000003ff7fc42e6 (<+8>) because it does not know what to do with the 'ld' instruction. GDB thinks it has reached the end of the prologue but has not yet reached the important part, which explains GDB's inability to unwind past this point.
The section of the prologue starting at <+4> until <+12> is used to load the stack canary[1], which will then be placed on the stack at <+36> at the end of the prologue.

In order to have the prologue properly handled, this commit proposes to add support for the ld instruction in the RISC-V prologue scanner. I guess riscv32 would use lw in such a situation so this patch also adds support for this instruction.

With this patch applied, gdb is now able to unwind past pthread_join:

    (gdb) bt
    #0  0x0000003ff7fccd20 in __futex_abstimed_wait_common64 () from /lib/riscv64-linux-gnu/libpthread.so.0
    #1  0x0000003ff7fc43da in __pthread_clockjoin_ex () from /lib/riscv64-linux-gnu/libpthread.so.0
    #2  0x0000002aaaaaa88e in bar() ()
    #3  0x0000002aaaaaa8c4 in foo() ()
    #4  0x0000002aaaaaa8da in main ()

I have had a look to see if I could reproduce this easily, but in my simple testcases using '-fstack-protector-all', the canary is loaded after the RA register is saved. I do not have a reliable way of generating a prologue similar to the problematic one so I forged one instead.

The testsuite has been run on riscv64 ubuntu 21.01 with no regression observed.

[1] https://en.wikipedia.org/wiki/Buffer_overflow_protection#Canaries
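Why an unrecognised instruction hides the saved return address can be modelled with a toy scanner in Python. This is a heavy simplification and not GDB's RISC-V prologue analyzer: it works on mnemonic strings instead of decoded machine code. The `scan_prologue` name, the `handle_loads` switch and the set of "known" instructions are assumptions made for the illustration.

```python
def scan_prologue(instructions, handle_loads):
    """Return the stack offset at which ra is saved, or None if not found.

    instructions: list of (mnemonic, operands) tuples.
    """
    known = {"addi", "sd", "sw", "auipc"}     # assumed baseline support
    if handle_loads:
        known |= {"ld", "lw"}                 # what the patch adds
    ra_offset = None
    for mnemonic, operands in instructions:
        if mnemonic not in known:
            break                             # treated as the end of the prologue
        if mnemonic in ("sd", "sw") and operands.startswith("ra,"):
            # "ra,136(sp)" -> 136
            ra_offset = int(operands.split(",")[1].split("(")[0])
    return ra_offset


# First instructions of __pthread_clockjoin_ex from the disassembly above.
prologue = [
    ("addi", "sp,sp,-144"),
    ("sd", "s5,88(sp)"),
    ("auipc", "s5,0xd"),
    ("ld", "s5,-2(s5)"),
    ("ld", "a5,0(s5)"),
    ("sd", "ra,136(sp)"),
    ("sd", "s0,128(sp)"),
]
```

Without load support the scan stops at the first `ld` and never sees `sd ra,136(sp)`. With it, the offset 136 is found.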
The original reproducer for PR28030 required use of a specific compiler version - gcc-c++-11.1.1-3.fc34 is mentioned in the PR, though it seems probable that other gcc versions might be able to reproduce the bug as well.

This commit introduces a test case which, using the DWARF assembler, provides a reproducer which is independent of the compiler version. (Well, it'll work with whatever compilers the DWARF assembler works with.)

To the best of my knowledge, it's also the first test case which uses the DWARF assembler to provide debug info for a shared object. That being the case, I provided more than the usual commentary which should allow this case to be used as a template when a combo shared library / DWARF assembler test case is required in the future.

I provide some details regarding the bug in a comment near the beginning of locexpr-dml.exp.

This problem was difficult to reproduce; I found myself constantly referring to the backtrace while trying to figure out what (else) I might be missing while trying to create a reproducer. Below is a partial backtrace which I include for posterity.
    #0  internal_error (file=0xc50110 "/ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/gdbtypes.c", line=5575, fmt=0xc520c0 "Unexpected type field location kind: %d")
        at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdbsupport/errors.cc:51
    #1  0x00000000006ef0c5 in copy_type_recursive (objfile=0x1635930, type=0x274c260, copied_types=0x30bb290)
        at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/gdbtypes.c:5575
    #2  0x00000000006ef382 in copy_type_recursive (objfile=0x1635930, type=0x274ca10, copied_types=0x30bb290)
        at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/gdbtypes.c:5602
    #3  0x0000000000a7409a in preserve_one_value (value=0x24269f0, objfile=0x1635930, copied_types=0x30bb290)
        at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/value.c:2529
    #4  0x000000000072012a in gdbscm_preserve_values (extlang=0xc55720 <extension_language_guile>, objfile=0x1635930, copied_types=0x30bb290)
        at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/guile/scm-value.c:94
    #5  0x00000000006a3f82 in preserve_ext_lang_values (objfile=0x1635930, copied_types=0x30bb290)
        at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/extension.c:568
    #6  0x0000000000a7428d in preserve_values (objfile=0x1635930)
        at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/value.c:2579
    #7  0x000000000082d514 in objfile::~objfile (this=0x1635930, __in_chrg=<optimized out>)
        at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/objfiles.c:549
    #8  0x0000000000831cc8 in std::_Sp_counted_ptr<objfile*, (__gnu_cxx::_Lock_policy)2>::_M_dispose (this=0x1654580)
        at /usr/include/c++/11/bits/shared_ptr_base.h:348
    #9  0x00000000004e6617 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x1654580)
        at /usr/include/c++/11/bits/shared_ptr_base.h:168
    #10 0x00000000004e1d2f in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x190bb88, __in_chrg=<optimized out>)
        at /usr/include/c++/11/bits/shared_ptr_base.h:705
    #11 0x000000000082feee in std::__shared_ptr<objfile, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x190bb80, __in_chrg=<optimized out>)
        at /usr/include/c++/11/bits/shared_ptr_base.h:1154
    #12 0x000000000082ff0a in std::shared_ptr<objfile>::~shared_ptr (this=0x190bb80, __in_chrg=<optimized out>)
        at /usr/include/c++/11/bits/shared_ptr.h:122
    #13 0x000000000085ed7e in __gnu_cxx::new_allocator<std::_List_node<std::shared_ptr<objfile> > >::destroy<std::shared_ptr<objfile> > (this=0x114bc00, __p=0x190bb80)
        at /usr/include/c++/11/ext/new_allocator.h:168
    #14 0x000000000085e88d in std::allocator_traits<std::allocator<std::_List_node<std::shared_ptr<objfile> > > >::destroy<std::shared_ptr<objfile> > (__a=..., __p=0x190bb80)
        at /usr/include/c++/11/bits/alloc_traits.h:531
    #15 0x000000000085e50c in std::__cxx11::list<std::shared_ptr<objfile>, std::allocator<std::shared_ptr<objfile> > >::_M_erase (this=0x114bc00, __position=std::shared_ptr<objfile> (expired, weak count 1) = {get() = 0x1635930})
        at /usr/include/c++/11/bits/stl_list.h:1925
    #16 0x000000000085df0e in std::__cxx11::list<std::shared_ptr<objfile>, std::allocator<std::shared_ptr<objfile> > >::erase (this=0x114bc00, __position=std::shared_ptr<objfile> (expired, weak count 1) = {get() = 0x1635930})
        at /usr/include/c++/11/bits/list.tcc:158
    #17 0x000000000085c748 in program_space::remove_objfile (this=0x114bbc0, objfile=0x1635930)
        at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/progspace.c:210
    #18 0x000000000082d3ae in objfile::unlink (this=0x1635930)
        at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/objfiles.c:487
    #19 0x000000000082e68c in objfile_purge_solibs ()
        at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/objfiles.c:875
    #20 0x000000000092dd37 in no_shared_libraries (ignored=0x0, from_tty=1)
        at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/solib.c:1236
    #21 0x00000000009a37fe in target_pre_inferior (from_tty=1)
        at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/target.c:2496
    #22 0x00000000007454d6 in run_command_1 (args=0x0, from_tty=1, run_how=RUN_NORMAL)
        at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/infcmd.c:437

I'll note a few points regarding this backtrace:

- Frame #1 is where the internal error occurs. It's caused by an unhandled case for FIELD_LOC_KIND_DWARF_BLOCK. The fix for this bug adds support for this case.

- Frame #22 - it's a partial backtrace - shows that GDB is attempting to (re)run the program. You can see the exact command sequence that was used for reproducing this problem in the PR (at https://sourceware.org/bugzilla/show_bug.cgi?id=28030), but in a nutshell, after starting the program and advancing to the appropriate source line, GDB was asked to step into libstdc++; a "finish" command was issued, returning a value. The fact that a value was returned is very important. GDB was then used to step back into libstdc++. A breakpoint was set on a source line in the library after which a "run" command was issued.

- Frame #19 shows a call to objfile_purge_solibs. It's aptly named.

- Frame #7 is a call to the destructor for one of the objfile solibs; it turned out to be the one for libstdc++.

- Frames #6 thru #3 show various value preservation frames. If you look at preserve_values() in gdb/value.c, the value history is preserved first, followed by internal variables, followed by values for the extension languages (python and guile).
…es.exp

When running test-case gdb.base/break-probes.exp on ubuntu 18.04.5, we have:

    ...
    (gdb) bt^M
    #0  0x00007ffff7dd6e12 in ?? () from /lib64/ld-linux-x86-64.so.2^M
    #1  0x00007ffff7dedf50 in ?? () from /lib64/ld-linux-x86-64.so.2^M
    #2  0x00007ffff7dd5128 in ?? () from /lib64/ld-linux-x86-64.so.2^M
    #3  0x00007ffff7dd4098 in ?? () from /lib64/ld-linux-x86-64.so.2^M
    #4  0x0000000000000001 in ?? ()^M
    #5  0x00007fffffffdaac in ?? ()^M
    #6  0x0000000000000000 in ?? ()^M
    (gdb) FAIL: gdb.base/break-probes.exp: ensure using probes
    ...

The test-case intends to emit an UNTESTED in this case, but fails to do so because it tries to do it in a regexp clause in a gdb_test_multiple, which doesn't trigger. Instead, a default clause triggers which produces the FAIL.

Also the use of UNTESTED is not appropriate, and we should use UNSUPPORTED instead.

Fix this by silencing the FAIL, and emitting an UNSUPPORTED after the gdb_test_multiple:

    ...
     if { ! $using_probes } {
    +    unsupported "probes not present on this system"
         return -1
     }
    ...

Tested on x86_64-linux.
When running test-case gdb.base/break-probes.exp on ubuntu 18.04.5, we have:

    ...
    (gdb) run^M
    Starting program: break-probes^M
    Stopped due to shared library event (no libraries added or removed)^M
    (gdb) bt^M
    #0  0x00007ffff7dd6e12 in ?? () from /lib64/ld-linux-x86-64.so.2^M
    #1  0x00007ffff7dedf50 in ?? () from /lib64/ld-linux-x86-64.so.2^M
    #2  0x00007ffff7dd5128 in ?? () from /lib64/ld-linux-x86-64.so.2^M
    #3  0x00007ffff7dd4098 in ?? () from /lib64/ld-linux-x86-64.so.2^M
    #4  0x0000000000000001 in ?? ()^M
    #5  0x00007fffffffdaac in ?? ()^M
    #6  0x0000000000000000 in ?? ()^M
    (gdb) UNSUPPORTED: gdb.base/break-probes.exp: probes not present on this system
    ...

Using the backtrace, the test-case tries to establish that we're stopped in dl_main, which is used as proof that we're using probes.

However, the backtrace only shows an address, because:
- the dynamic linker contains no minimal symbols and no debug info, and
- gdb is built without --with-separate-debug-dir so it can't find the corresponding .debug file, which does contain the minimal symbols and debug info.

Fix this by instead printing the pc and grepping for the value in the info probes output:

    ...
    (gdb) p /x $pc^M
    $1 = 0x7ffff7dd6e12^M
    (gdb) info probes^M
    Type Provider Name  Where  Object  ^M
      ...
    stap rtld  init_start  0x00007ffff7dd6e12  /lib64/ld-linux-x86-64.so.2  ^M
      ...
    (gdb)
    ...

Tested on x86_64-linux.
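The "print the pc and grep for it in the info probes output" check can be sketched in Python. The real test is written in Tcl/Expect. The `stopped_at_probe` helper is hypothetical, and the sample text is a trimmed copy of the output shown above.

```python
import re


def stopped_at_probe(pc, info_probes_output):
    """True if pc equals one of the hexadecimal addresses in the probe list."""
    addresses = {
        int(match, 16)
        for match in re.findall(r"0x[0-9a-fA-F]+", info_probes_output)
    }
    return pc in addresses


sample = (
    "Type Provider Name       Where              Object\n"
    "stap rtld     init_start 0x00007ffff7dd6e12 /lib64/ld-linux-x86-64.so.2\n"
)
```

Comparing numeric values rather than strings sidesteps the difference in zero padding between `p /x $pc` (0x7ffff7dd6e12) and the Where column (0x00007ffff7dd6e12).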
When running test-case gdb.base/break-interp.exp on ubuntu 18.04.5, we have:

    ...
    (gdb) bt^M
    #0  0x00007eff7ad5ae12 in ?? () from break-interp-LDprelinkNOdebugNO^M
    #1  0x00007eff7ad71f50 in ?? () from break-interp-LDprelinkNOdebugNO^M
    #2  0x00007eff7ad59128 in ?? () from break-interp-LDprelinkNOdebugNO^M
    #3  0x00007eff7ad58098 in ?? () from break-interp-LDprelinkNOdebugNO^M
    #4  0x0000000000000002 in ?? ()^M
    #5  0x00007fff505d7a32 in ?? ()^M
    #6  0x00007fff505d7a94 in ?? ()^M
    #7  0x0000000000000000 in ?? ()^M
    (gdb) FAIL: gdb.base/break-interp.exp: ldprelink=NO: ldsepdebug=NO: \
      first backtrace: dl bt
    ...

Using the backtrace, the test-case tries to establish that we're stopped in dl_main.

However, the backtrace only shows an address, because:
- the dynamic linker contains no minimal symbols and no debug info, and
- gdb is built without --with-separate-debug-dir so it can't find the corresponding .debug file, which does contain the minimal symbols and debug info.

As in "[gdb/testsuite] Improve probe detection in gdb.base/break-probes.exp", fix this by doing info probes and grepping for the address.

Tested on x86_64-linux.
I built gdb without xml support using --without-expat, and ran into:

    ...
    (gdb) target remote | vgdb --wait=2 --max-invoke-ms=2500 --pid=22032^M
    Remote debugging using | vgdb --wait=2 --max-invoke-ms=2500 --pid=22032^M
    relaying data between gdb and process 22032^M
    warning: Can not parse XML target description; XML support was disabled at \
      compile time^M
      ...
    (gdb) PASS: gdb.base/valgrind-infcall.exp: continue #1
    p gdb_test_infcall ()^M
    Remote 'g' packet reply is too long (expected 560 bytes, got 800 bytes): ...^M
    (gdb) FAIL: gdb.base/valgrind-infcall.exp: p gdb_test_infcall ()
    ...

After googling the error message with context valgrind gdbserver, I found indications that the Remote 'g' packet reply error is due to missing xml support.

And here ( https://www.valgrind.org/docs/manual/manual-core-adv.html ) I found:

    ...
    GDB version needed for ARM and PPC32/64.

    You must use a GDB version which is able to read XML target description sent by a gdbserver. This is the standard setup if GDB was configured and built with the "expat" library. If your GDB was not configured with XML support, it will report an error message when using the "target" command. Debugging will not work because GDB will then not be able to fetch the registers from the Valgrind gdbserver.
    ...

So I guess I'm running into the same problem for x86_64.

Fix this by skipping all gdb.base/valgrind-*.exp tests if xml support is not available. Although only the gdb.base/valgrind-infcall*.exp produce fails, the Remote 'g' packet reply error occurs in all tests, so it seems prudent to disable them all.

Tested on x86_64-linux.
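The skip decision can be illustrated with a short Python sketch. It is not how the GDB testsuite detects XML support: the helpers below are hypothetical and simply look for the warning text quoted above in captured GDB output.

```python
XML_DISABLED_WARNING = "XML support was disabled at compile time"


def xml_support_available(gdb_output):
    """Assume XML support is present unless GDB printed the warning."""
    return XML_DISABLED_WARNING not in gdb_output


def should_run(test_name, gdb_output):
    # Skip every valgrind test, not only the infcall ones, because the
    # 'g' packet error appears in all of them.
    if test_name.startswith("gdb.base/valgrind-"):
        return xml_support_available(gdb_output)
    return True


captured = ("warning: Can not parse XML target description; "
            "XML support was disabled at compile time")
```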
The gdb.multi/multi-term-settings.exp testcase sometimes fails like so: Running /home/pedro/gdb/mygit/src/gdb/testsuite/gdb.multi/multi-term-settings.exp ... FAIL: gdb.multi/multi-term-settings.exp: inf1_how=attach: inf2_how=attach: stop with control-c (SIGINT) It's easier to reproduce if you stress the machine at the same time, like e.g.: $ stress -c 24 Looking at gdb.log, we see: (gdb) attach 60422 Attaching to program: build/gdb/testsuite/outputs/gdb.multi/multi-term-settings/multi-term-settings, process 60422 [New Thread 60422.60422] Reading symbols from /lib/x86_64-linux-gnu/libc.so.6... Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/libc-2.31.so... Reading symbols from /lib64/ld-linux-x86-64.so.2... (No debugging symbols found in /lib64/ld-linux-x86-64.so.2) 0x00007f2fc2485334 in __GI___clock_nanosleep (clock_id=<optimized out>, clock_id@entry <mailto:clock_id@entry>=0, flags=flags@entry <mailto:flags@entry>=0, req=req@entry <mailto:req@entry>=0x7ffe23126940, rem=rem@entry <mailto:rem@entry>=0x0) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:78 78 ../sysdeps/unix/sysv/linux/clock_nanosleep.c: No such file or directory. 
(gdb) PASS: gdb.multi/multi-term-settings.exp: inf1_how=attach: inf2_how=attach: inf2: attach set schedule-multiple on (gdb) PASS: gdb.multi/multi-term-settings.exp: inf1_how=attach: inf2_how=attach: set schedule-multiple on info inferiors Num Description Connection Executable 1 process 60404 1 (extended-remote localhost:2349) build/gdb/testsuite/outputs/gdb.multi/multi-term-settings/multi-term-settings * 2 process 60422 1 (extended-remote localhost:2349) build/gdb/testsuite/outputs/gdb.multi/multi-term-settings/multi-term-settings (gdb) PASS: gdb.multi/multi-term-settings.exp: inf1_how=attach: inf2_how=attach: info inferiors pid=60422, count=46 pid=60422, count=47 pid=60422, count=48 pid=60422, count=49 pid=60422, count=50 pid=60422, count=51 pid=60422, count=52 pid=60422, count=53 pid=60422, count=54 pid=60422, count=55 pid=60422, count=56 pid=60422, count=57 pid=60422, count=58 pid=60422, count=59 pid=60422, count=60 pid=60422, count=61 pid=60422, count=62 pid=60422, count=63 pid=60422, count=64 pid=60422, count=65 pid=60422, count=66 pid=60422, count=67 pid=60422, count=68 pid=60422, count=69 pid=60404, count=54 pid=60404, count=55 pid=60404, count=56 pid=60404, count=57 pid=60404, count=58 PASS: gdb.multi/multi-term-settings.exp: inf1_how=attach: inf2_how=attach: continue Quit (gdb) FAIL: gdb.multi/multi-term-settings.exp: inf1_how=attach: inf2_how=attach: stop with control-c (SIGINT) If you look at the testcase's sources, you'll see that the intention is to resume the program with "continue", wait to see a few of those "pid=..., count=..." lines, and then interrupt the program with Ctrl-C. But somehow, that resulted in GDB printing "Quit", instead of the Ctrl-C stopping the program with SIGINT. Here's what is happening: #1 - those "pid=..., count=..." lines we see above weren't actually output by the inferior after it has been continued (see #1). Note that "inf1_how" and "inf2_how" are "attach". What happened is that those "pid=..., count=..." 
lines were output by the inferiors _before_ they were attached to. We see them at that point instead of earlier, because that's where the testcase reads from the inferiors' spawn_ids. #2 - The testcase mistakenly thinks those "pid=..., count=..." lines happened after the continue was processed by GDB, meaning it has waited enough, and so sends the Ctrl-C. GDB hasn't yet passed the terminal to the inferior, so the Ctrl-C results in that Quit. The fix here is twofold: #1 - flush inferior output right after attaching #2 - consume the "Continuing" printed by "continue", indicating the inferior has the terminal. This is the same as done throughout the testsuite to handle this exact problem of sending Ctrl-C too soon. gdb/testsuite/ChangeLog: yyyy-mm-dd Pedro Alves <pedro@palves.net> * gdb.multi/multi-term-settings.exp (create_inferior): Flush inferior output. (coretest): Use $gdb_test_name. After issuing "continue", wait for "Continuing". Change-Id: Iba7671dfe1eee6b98d29cfdb05a1b9aa2f9defb9
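The ordering problem described above can be modelled without GDB at all. The following is only a sketch in Python, not the testsuite's Tcl: the functions interrupt_position and outcome, and the sample STREAM, are hypothetical names invented for this illustration. The model treats the text the test reads as an ordered list of lines and asks after which line the Ctrl-C would be sent.

```python
# Lines in the order the test reads them: stale output printed before the
# attach, then GDB's acknowledgement of "continue", then fresh output.
STREAM = [
    "pid=60422, count=46",
    "pid=60422, count=47",
    "pid=60422, count=48",
    "Continuing.",
    "pid=60422, count=70",
    "pid=60422, count=71",
    "pid=60422, count=72",
]


def interrupt_position(stream, flushed_upto=0, require_continuing=False, needed=3):
    """Return the index of the line after which Ctrl-C would be sent, or None.

    flushed_upto models fix #1 (stale lines discarded right after attaching);
    require_continuing models fix #2 (count lines only after "Continuing.").
    """
    seen_continuing = not require_continuing
    count = 0
    for index in range(flushed_upto, len(stream)):
        line = stream[index]
        if line == "Continuing.":
            seen_continuing = True
        elif seen_continuing and line.startswith("pid="):
            count += 1
            if count == needed:
                return index
    return None


def outcome(stream, position):
    """'SIGINT' if the inferior already has the terminal at position, else 'Quit'."""
    return "SIGINT" if "Continuing." in stream[:position] else "Quit"


if __name__ == "__main__":
    old = interrupt_position(STREAM)
    new = interrupt_position(STREAM, flushed_upto=3, require_continuing=True)
    print("before the fix:", outcome(STREAM, old))
    print("after the fix: ", outcome(STREAM, new))
```

In this model the unfixed test counts the three stale lines and interrupts before "Continuing." has been seen, which corresponds to the "Quit" in the log; with both fixes applied the interrupt lands only after fresh output.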
On openSUSE Tumbleweed with glibc-debuginfo installed I get: ... (gdb) PASS: gdb.threads/linux-dp.exp: continue to breakpoint: thread 5's print where^M #0 print_philosopher (n=3, left=33 '!', right=33 '!') at linux-dp.c:105^M #1 0x0000000000401628 in philosopher (data=0x40537c) at linux-dp.c:148^M #2 0x00007ffff7d56b37 in start_thread (arg=<optimized out>) \ at pthread_create.c:435^M #3 0x00007ffff7ddb640 in clone3 () \ at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81^M (gdb) PASS: gdb.threads/linux-dp.exp: first thread-specific breakpoint hit ... while without debuginfo installed I get instead: ... (gdb) PASS: gdb.threads/linux-dp.exp: continue to breakpoint: thread 5's print where^M #0 print_philosopher (n=3, left=33 '!', right=33 '!') at linux-dp.c:105^M #1 0x0000000000401628 in philosopher (data=0x40537c) at linux-dp.c:148^M #2 0x00007ffff7d56b37 in start_thread () from /lib64/libc.so.6^M #3 0x00007ffff7ddb640 in clone3 () from /lib64/libc.so.6^M (gdb) FAIL: gdb.threads/linux-dp.exp: first thread-specific breakpoint hit ... The problem is that the regexp used: ... "\(from .*libpthread\|at pthread_create\|in pthread_create\)" ... expects the 'from' part to match libpthread, but in glibc 2.34 libpthread has been merged into libc. Fix this by updating the regexp. Tested on x86_64-linux.
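The mismatch is easy to reproduce outside the testsuite. The sketch below uses Python's re module rather than Tcl, and assumes the backslashes in the quoted pattern only protect the parentheses and bars inside the Tcl string. OLD is the alternation quoted above; NEW is a hypothetical widened pattern written for this illustration, since the commit message does not show the regexp that was actually committed.

```python
import re

# The alternation quoted in the commit message, as a Python regex.
OLD = re.compile(r"(from .*libpthread|at pthread_create|in pthread_create)")

# Hypothetical widened pattern (an assumption, not the committed fix):
# also accept a frame that resolves to libc.
NEW = re.compile(r"(from .*lib(pthread|c)\b|at pthread_create|in pthread_create)")

# Frame #2 from the two backtraces quoted above.
WITH_DEBUGINFO = (
    "#2 0x00007ffff7d56b37 in start_thread (arg=<optimized out>) "
    "at pthread_create.c:435"
)
WITHOUT_DEBUGINFO = "#2 0x00007ffff7d56b37 in start_thread () from /lib64/libc.so.6"


def matches(pattern, frame):
    """Return True if the compiled pattern matches anywhere in the frame line."""
    return pattern.search(frame) is not None


if __name__ == "__main__":
    for name, pattern in (("old", OLD), ("new", NEW)):
        print(name, matches(pattern, WITH_DEBUGINFO), matches(pattern, WITHOUT_DEBUGINFO))
```

With debuginfo installed the frame matches through the "at pthread_create" alternative, so the old pattern only fails when the frame is reported as coming from the shared library.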
Currently for a binary compiled normally (without -fsanitize=address) but with LD_PRELOAD of ASAN one gets: $ ASAN_OPTIONS=detect_leaks=0:alloc_dealloc_mismatch=1:abort_on_error=1:fast_unwind_on_malloc=0 LD_PRELOAD=/usr/lib64/libasan.so.6 gdb ================================================================= ==1909567==ERROR: AddressSanitizer: alloc-dealloc-mismatch (malloc vs operator delete []) on 0x602000001570 #0 0x7f1c98e5efa7 in operator delete[](void*) (/usr/lib64/libasan.so.6+0xb0fa7) ... 0x602000001570 is located 0 bytes inside of 2-byte region [0x602000001570,0x602000001572) allocated by thread T0 here: #0 0x7f1c98e5cd1f in __interceptor_malloc (/usr/lib64/libasan.so.6+0xaed1f) #1 0x557ee4a42e81 in operator new(unsigned long) (/usr/libexec/gdb+0x74ce81) SUMMARY: AddressSanitizer: alloc-dealloc-mismatch (/usr/lib64/libasan.so.6+0xb0fa7) in operator delete[](void*) ==1909567==HINT: if you don't care about these errors you may set ASAN_OPTIONS=alloc_dealloc_mismatch=0 ==1909567==ABORTING This happens even though the code correctly pairs operator new[] with operator delete[]. GDB's new-op.cc provides its own operator new[], which gets translated into malloc() (which gets recognized as operator new(size_t)), but because it does not also translate operator delete[], Address Sanitizer gets confused. The question is how many variants of the delete operator need to be provided. There could be 14 operators new but there are only 4, GDB uses 3 of them. There could be 16 operators delete but there are only 6, GDB uses 2 of them. It depends on libraries and compiler which of the operators will get used. Currently being used: U operator new[](unsigned long) U operator new(unsigned long) U operator new(unsigned long, std::nothrow_t const&) U operator delete[](void*) U operator delete(void*, unsigned long) Tested on x86_64-linux.
This commit fixes Bug 28308, titled "Strange interactions with dprintf and break/commands": Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28308 Since creating that bug report, I've found a somewhat simpler way of reproducing the problem. I've encapsulated it into the GDB test case which I've created along with this bug fix. The name of the new test is gdb.base/dprintf-execution-x-script.exp, I'll demonstrate the problem using this test case, though for brevity, I've placed all relevant files in the same directory and have renamed the files to all start with 'dp-bug' instead of 'dprintf-execution-x-script'. The script file, named dp-bug.gdb, consists of the following commands: dprintf increment, "dprintf in increment(), vi=%d\n", vi break inc_vi commands continue end run Note that the final command in this script is 'run'. When 'run' is instead issued interactively, the bug does not occur. So, let's look at the interactive case first in order to see the correct/expected output: $ gdb -q -x dp-bug.gdb dp-bug ... eliding buggy output which I'll discuss later ... (gdb) run Starting program: /mesquite2/sourceware-git/f34-master/bld/gdb/tmp/dp-bug vi=0 dprintf in increment(), vi=0 Breakpoint 2, inc_vi () at dprintf-execution-x-script.c:26 26 in dprintf-execution-x-script.c vi=1 dprintf in increment(), vi=1 Breakpoint 2, inc_vi () at dprintf-execution-x-script.c:26 26 in dprintf-execution-x-script.c vi=2 dprintf in increment(), vi=2 Breakpoint 2, inc_vi () at dprintf-execution-x-script.c:26 26 in dprintf-execution-x-script.c vi=3 [Inferior 1 (process 1539210) exited normally] In this run, in which 'run' was issued from the gdb prompt (instead of at the end of the script), there are three dprintf messages along with three 'Breakpoint 2' messages. This is the correct output. Now let's look at the output that I snipped above; this is the output when 'run' is issued from the script loaded via GDB's -x switch: $ gdb -q -x dp-bug.gdb dp-bug Reading symbols from dp-bug... 
Dprintf 1 at 0x40116e: file dprintf-execution-x-script.c, line 38. Breakpoint 2 at 0x40113a: file dprintf-execution-x-script.c, line 26. vi=0 dprintf in increment(), vi=0 Breakpoint 2, inc_vi () at dprintf-execution-x-script.c:26 26 dprintf-execution-x-script.c: No such file or directory. vi=1 Breakpoint 2, inc_vi () at dprintf-execution-x-script.c:26 26 in dprintf-execution-x-script.c vi=2 Breakpoint 2, inc_vi () at dprintf-execution-x-script.c:26 26 in dprintf-execution-x-script.c vi=3 [Inferior 1 (process 1539175) exited normally] In the output shown above, only the first dprintf message is printed. The 2nd and 3rd dprintf messages are missing! However, all three 'Breakpoint 2...' messages are still printed. Why does this happen? bpstat_do_actions_1() in gdb/breakpoint.c contains the following comment and code near the start of the function: /* Avoid endless recursion if a `source' command is contained in bs->commands. */ if (executing_breakpoint_commands) return 0; scoped_restore save_executing = make_scoped_restore (&executing_breakpoint_commands, 1); Also, as described by this comment prior to the 'async' field in 'struct ui' in top.h, the main UI starts off in sync mode when processing command line arguments: /* True if the UI is in async mode, false if in sync mode. If in sync mode, a synchronous execution command (e.g, "next") does not return until the command is finished. If in async mode, then running a synchronous command returns right after resuming the target. Waiting for the command's completion is later done on the top event loop. For the main UI, this starts out disabled, until all the explicit command line arguments (e.g., `gdb -ex "start" -ex "next"') are processed. */ This combination of things, the state of the static global 'executing_breakpoint_commands' plus the state of the async field in the main UI causes this behavior. This is a backtrace after hitting the dprintf breakpoint for the second time when doing 'run' from the script file, i.e. 
non-interactively: Thread 1 "gdb" hit Breakpoint 3, bpstat_do_actions_1 (bsp=0x7fffffffc2b8) at /ironwood1/sourceware-git/f34-master/bld/../../worktree-master/gdb/breakpoint.c:4431 4431 if (executing_breakpoint_commands) #0 bpstat_do_actions_1 (bsp=0x7fffffffc2b8) at gdb/breakpoint.c:4431 #1 0x00000000004d8bc6 in dprintf_after_condition_true (bs=0x1538090) at gdb/breakpoint.c:13048 #2 0x00000000004c5caa in bpstat_stop_status (aspace=0x116dbc0, bp_addr=0x40116e, thread=0x137f450, ws=0x7fffffffc718, stop_chain=0x1538090) at gdb/breakpoint.c:5498 #3 0x0000000000768d98 in handle_signal_stop (ecs=0x7fffffffc6f0) at gdb/infrun.c:6172 #4 0x00000000007678d3 in handle_inferior_event (ecs=0x7fffffffc6f0) at gdb/infrun.c:5662 #5 0x0000000000763cd5 in fetch_inferior_event () at gdb/infrun.c:4060 #6 0x0000000000746d7d in inferior_event_handler (event_type=INF_REG_EVENT) at gdb/inf-loop.c:41 #7 0x00000000007a702f in handle_target_event (error=0, client_data=0x0) at gdb/linux-nat.c:4207 #8 0x0000000000b8cd6e in gdb_wait_for_event (block=block@entry=0) at gdbsupport/event-loop.cc:701 #9 0x0000000000b8d032 in gdb_wait_for_event (block=0) at gdbsupport/event-loop.cc:597 #10 gdb_do_one_event () at gdbsupport/event-loop.cc:212 #11 0x00000000009d19b6 in wait_sync_command_done () at gdb/top.c:528 #12 0x00000000009d1a3f in maybe_wait_sync_command_done (was_sync=0) at gdb/top.c:545 #13 0x00000000009d2033 in execute_command (p=0x7fffffffcb18 "", from_tty=0) at gdb/top.c:676 #14 0x0000000000560d5b in execute_control_command_1 (cmd=0x13b9bb0, from_tty=0) at gdb/cli/cli-script.c:547 #15 0x000000000056134a in execute_control_command (cmd=0x13b9bb0, from_tty=0) at gdb/cli/cli-script.c:717 #16 0x00000000004c3bbe in bpstat_do_actions_1 (bsp=0x137f530) at gdb/breakpoint.c:4469 #17 0x00000000004c3d40 in bpstat_do_actions () at gdb/breakpoint.c:4533 #18 0x00000000006a473a in command_handler (command=0x1399ad0 "run") at gdb/event-top.c:624 #19 
0x00000000009d182e in read_command_file (stream=0x113e540) at gdb/top.c:443 #20 0x0000000000563697 in script_from_file (stream=0x113e540, file=0x13bb0b0 "dp-bug.gdb") at gdb/cli/cli-script.c:1642 #21 0x00000000006abd63 in source_gdb_script (extlang=0xc44e80 <extension_language_gdb>, stream=0x113e540, file=0x13bb0b0 "dp-bug.gdb") at gdb/extension.c:188 #22 0x0000000000544400 in source_script_from_stream (stream=0x113e540, file=0x7fffffffd91a "dp-bug.gdb", file_to_open=0x13bb0b0 "dp-bug.gdb") at gdb/cli/cli-cmds.c:692 #23 0x0000000000544557 in source_script_with_search (file=0x7fffffffd91a "dp-bug.gdb", from_tty=1, search_path=0) at gdb/cli/cli-cmds.c:750 #24 0x00000000005445cf in source_script (file=0x7fffffffd91a "dp-bug.gdb", from_tty=1) at gdb/cli/cli-cmds.c:759 #25 0x00000000007cf6d9 in catch_command_errors (command=0x5445aa <source_script(char const*, int)>, arg=0x7fffffffd91a "dp-bug.gdb", from_tty=1, do_bp_actions=false) at gdb/main.c:523 #26 0x00000000007cf85d in execute_cmdargs (cmdarg_vec=0x7fffffffd1b0, file_type=CMDARG_FILE, cmd_type=CMDARG_COMMAND, ret=0x7fffffffd18c) at gdb/main.c:615 #27 0x00000000007d0c8e in captured_main_1 (context=0x7fffffffd3f0) at gdb/main.c:1322 #28 0x00000000007d0eba in captured_main (data=0x7fffffffd3f0) at gdb/main.c:1343 #29 0x00000000007d0f25 in gdb_main (args=0x7fffffffd3f0) at gdb/main.c:1368 #30 0x00000000004186dd in main (argc=5, argv=0x7fffffffd508) at gdb/gdb.c:32 There are two frames for bpstat_do_actions_1(), one at frame #16 and the other at frame #0. The one at frame #16 is processing the actions for Breakpoint 2, which is a 'continue'. The one at frame #0 is attempting to process the dprintf breakpoint action. However, at this point, the value of 'executing_breakpoint_commands' is 1, forcing an early return, i.e. prior to executing the command(s) associated with the dprintf breakpoint. 
For the sake of comparison, this is what the stack looks like when hitting the dprintf breakpoint for the second time when issuing the 'run' command from the GDB prompt. Thread 1 "gdb" hit Breakpoint 3, bpstat_do_actions_1 (bsp=0x7fffffffccd8) at /ironwood1/sourceware-git/f34-master/bld/../../worktree-master/gdb/breakpoint.c:4431 4431 if (executing_breakpoint_commands) #0 bpstat_do_actions_1 (bsp=0x7fffffffccd8) at gdb/breakpoint.c:4431 #1 0x00000000004d8bc6 in dprintf_after_condition_true (bs=0x16b0290) at gdb/breakpoint.c:13048 #2 0x00000000004c5caa in bpstat_stop_status (aspace=0x116dbc0, bp_addr=0x40116e, thread=0x13f0e60, ws=0x7fffffffd138, stop_chain=0x16b0290) at gdb/breakpoint.c:5498 #3 0x0000000000768d98 in handle_signal_stop (ecs=0x7fffffffd110) at gdb/infrun.c:6172 #4 0x00000000007678d3 in handle_inferior_event (ecs=0x7fffffffd110) at gdb/infrun.c:5662 #5 0x0000000000763cd5 in fetch_inferior_event () at gdb/infrun.c:4060 #6 0x0000000000746d7d in inferior_event_handler (event_type=INF_REG_EVENT) at gdb/inf-loop.c:41 #7 0x00000000007a702f in handle_target_event (error=0, client_data=0x0) at gdb/linux-nat.c:4207 #8 0x0000000000b8cd6e in gdb_wait_for_event (block=block@entry=0) at gdbsupport/event-loop.cc:701 #9 0x0000000000b8d032 in gdb_wait_for_event (block=0) at gdbsupport/event-loop.cc:597 #10 gdb_do_one_event () at gdbsupport/event-loop.cc:212 #11 0x00000000007cf512 in start_event_loop () at gdb/main.c:421 #12 0x00000000007cf631 in captured_command_loop () at gdb/main.c:481 #13 0x00000000007d0ebf in captured_main (data=0x7fffffffd3f0) at gdb/main.c:1353 #14 0x00000000007d0f25 in gdb_main (args=0x7fffffffd3f0) at gdb/main.c:1368 #15 0x00000000004186dd in main (argc=5, argv=0x7fffffffd508) at gdb/gdb.c:32 This relatively short backtrace is due to the current UI's async field being set to 1. 
Yet another thing to be aware of regarding this problem is the difference in the way that commands associated with dprintf breakpoints versus regular breakpoints are handled. While they both use a command list associated with the breakpoint, regular breakpoints will place the commands to be run on the bpstat chain constructed in bpstat_stop_status(). These commands are run later on. For dprintf breakpoints, commands are run via the 'after_condition_true' function pointer directly from bpstat_stop_status(). (The 'commands' field in the bpstat is cleared in dprintf_after_condition_true(). This prevents the dprintf commands from being run again later on when other commands on the bpstat chain are processed.) Another thing that I noticed is that dprintf breakpoints are the only type of breakpoint which uses 'after_condition_true'. This suggests that one possible way of fixing this problem, that of making dprintf breakpoints work more like regular breakpoints, probably won't work. (I must admit, however, that my understanding of this code isn't complete enough to say why. I'll trust that whoever implemented it had a good reason for doing it this way.) The comment referenced earlier regarding 'executing_breakpoint_commands' states that the reason for checking this variable is to avoid potential endless recursion when a 'source' command appears in bs->commands. We know that a dprintf command is constrained to either 1) execution of a GDB printf command, 2) an inferior function call of a printf-like function, or 3) execution of an agent-printf command. Therefore, infinite recursion due to a 'source' command cannot happen when executing commands upon hitting a dprintf breakpoint. I chose to fix this problem by having dprintf_after_condition_true() directly call execute_control_commands(). This means that it no longer attempts to go through bpstat_do_actions_1(), thereby avoiding the infinite recursion check for potential 'source' commands on the command chain. 
I think it simplifies this code a little bit too, a definite bonus. Summary: * breakpoint.c (dprintf_after_condition_true): Don't call bpstat_do_actions_1(). Call execute_control_commands() instead.
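The interaction between the re-entrancy guard and the synchronous 'continue' can be modelled in a few lines. This is a Python sketch of the control flow only, not GDB code: do_actions stands in for bpstat_do_actions_1(), scoped_restore_flag for the scoped_restore in it, and dprintf_hit_direct for the fixed dprintf_after_condition_true(); all of these Python names are hypothetical and chosen for this illustration.

```python
from contextlib import contextmanager

executing_breakpoint_commands = False  # stands in for GDB's static global
log = []


@contextmanager
def scoped_restore_flag():
    """Set the flag for the duration of the block, then restore the old value."""
    global executing_breakpoint_commands
    saved = executing_breakpoint_commands
    executing_breakpoint_commands = True
    try:
        yield
    finally:
        executing_breakpoint_commands = saved


def do_actions(commands):
    """Guarded path: return early if breakpoint commands are already running."""
    if executing_breakpoint_commands:
        return False
    with scoped_restore_flag():
        for command in commands:
            command()
    return True


def dprintf_hit_guarded():
    """Old behaviour: the dprintf action goes through the guarded path."""
    do_actions([lambda: log.append("dprintf")])


def dprintf_hit_direct():
    """Fixed behaviour: the dprintf action runs directly, bypassing the guard."""
    log.append("dprintf")


def run_scenario(dprintf_hit):
    """Breakpoint 2's 'continue' runs synchronously, so the dprintf is hit
    while the outer do_actions call is still on the stack."""
    log.clear()
    do_actions([dprintf_hit])
    return list(log)


if __name__ == "__main__":
    print("guarded:", run_scenario(dprintf_hit_guarded))
    print("direct: ", run_scenario(dprintf_hit_direct))
```

In sync mode the nested call finds the flag already set and returns without printing, which mirrors the missing second and third dprintf messages; calling the action directly is unaffected by the flag.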
continuation of jwt27/djgpp-cvs#3