Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Debugger: Fix debugging of multi-threaded programs #1170

Merged
merged 1 commit into from
Mar 13, 2023

Conversation

nomadbyte
Copy link
Contributor

  • By the original design, the debugger allows thread context switching only
    interactively, while in the Call Stack pane (in Stopped mode). This is
    being handled directly through GtkTreeView::cursor-changed event.
  • However, this event was also getting triggered when clearing the latest
    thread's Call Stack after continuing the execution (Running mode).
  • This behavior does not seem to have being intended, and it lead to a
    number of issues: stepping hangs, corruption of GDB output processing,
    unnecessary attempts at opening of source files corresponding to thread's
    call-stack frames.
  • To avoid all of these issues, GtkTreeView::cursor-changed event is left
    unhandled while clearing the frames. Also the handler for thread-context
    switching is allowed processing only in debugger's Stopped mode to enforce
    the intended behavior.

Fixes #1069

Copy link

@avafinger avafinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Compiled and tested with Geany 1.37 (git >= 5cc69b3d)

@nomadbyte
Copy link
Contributor Author

@elextr Anyone to review this to get it merged? I don't think it's being maintained by anyone current, so I worded in much detail in the commit message to explain both the cause and the fix.

@elextr
Copy link
Member

elextr commented Apr 26, 2022

@nomadbyte yeah, its over 10 years since the maintainer listed contributed.

We don't usually summarily remove maintainers, but since there are no contact details in the file to check if they are still interested I think the plugin could be marked orphaned so someone else can take over, if nobody objects (and someone reminds me) in a couple of weeks I'll do that @frlan @eht16 @b4n

debugger/src/stree.c Outdated Show resolved Hide resolved
@b4n
Copy link
Member

b4n commented Jun 6, 2022

@nomadbyte yeah, its over 10 years since the maintainer listed contributed.

We don't usually summarily remove maintainers, but since there are no contact details in the file to check if they are still interested I think the plugin could be marked orphaned so someone else can take over, if nobody objects (and someone reminds me) in a couple of weeks I'll do that @frlan @eht16 @b4n

I'd look at who committed recently and ask whether they'd be willing to maintain it, but as far as removing maintainership information I really it's fine 10 years later :)

@avafinger
Copy link

Just for the record, a comment about the fix to @nomadbyte , i have been using the debugger with very good results with Geany 1.37 (git >= 5cc69b3d) . I don't know if anyone else tested this fix with newer Geany. I was more interested in debugging multi-threaded applications but recently i had to debug minicom to mimic some of its features in one local application here and depending on where i set the breakpoint (in minicom), the next [Step over] closes Geany. minicom is a terminal program generally used to communicate with some hardware via Serial/USB interface (most common use). If you are interested in knowing more about the issue, please let me know. Or if someone would like to find more, a tip to debug minicom is to comment the setjmp() code and then you are able to set breakpoints anywhere. Maybe the interrupts set by minicom are the main cause. BR.

@nomadbyte
Copy link
Contributor Author

... If you are interested in knowing more about the issue, please let me know.

@avafinger, you could try to see if the same steps work properly in command-line GDB without Geany.

If the issue is persistent in Geany, then you may describe it in a new issue against the Debugger plugin.

@nomadbyte nomadbyte force-pushed the debugger/1069-multithreaded branch from 292e4c7 to 0abd437 Compare June 6, 2022 19:04
@avafinger
Copy link

@nomadbyte
I will do as you suggested.
There are many signal handlers there that might trigger some weird thing.

By the way, your patch cannot be applied over the last one, i had to manually change the code. It works as expected. ;)

@avafinger
Copy link

avafinger commented Jun 6, 2022

@nomadbyte,
I think i found the reason. The problem is with [Step into] and not [Step over] as i mentioned.
Please, see if you can advise how to overcome the error.
In fact, the debugger closes the debugging section and not Geany.

Below is the error:

sudo gdb --args ./minicom -D /dev/ttyUSB1

GNU gdb (Ubuntu 8.1.1-0ubuntu1) 8.1.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./minicom...done.
(gdb) b 77
Breakpoint 1 at 0x5365: file minicom.c, line 77.
(gdb) s
The program is not being run.
(gdb) run
Starting program: /apps/minicom-2.7.1/debian/minicom/usr/bin/minicom -D /dev/ttyUSB1
Lockfile is stale. Overriding it..

Breakpoint 1, port_init () at minicom.c:77
77	  m_setparms(portfd, P_BAUDRATE, P_PARITY, P_BITS, P_STOPB,
(gdb) s
78	             P_HASRTS[0] == 'Y', P_HASXON[0] == 'Y');
(gdb) s
77	  m_setparms(portfd, P_BAUDRATE, P_PARITY, P_BITS, P_STOPB,
(gdb) s
m_setparms (fd=3, baudr=0x5555557866e0 <mpars+4032> "1500000", 
    par=0x555555786800 <mpars+4320> "N", bits=0x555555786770 <mpars+4176> "8", 
    stopb=0x555555786890 <mpars+4464> "1", hwf=1, swf=0) at sysdep1.c:396
396	{
(gdb) s
397	  int spd = -1;
(gdb) s
399	  int bit = bits[0];
(gdb) s
408	  if (portfd_is_socket)
(gdb) s
413	  tcgetattr(fd, &tty);
(gdb) s
__GI___tcgetattr (fd=3, termios_p=0x7fffffffde30)
    at ../sysdeps/unix/sysv/linux/tcgetattr.c:34
34	../sysdeps/unix/sysv/linux/tcgetattr.c: No such file or directory.
(gdb) s
38	in ../sysdeps/unix/sysv/linux/tcgetattr.c
(gdb) s
34	in ../sysdeps/unix/sysv/linux/tcgetattr.c
(gdb) s
38	in ../sysdeps/unix/sysv/linux/tcgetattr.c
(gdb) s
40	in ../sysdeps/unix/sysv/linux/tcgetattr.c
(gdb) s
46	in ../sysdeps/unix/sysv/linux/tcgetattr.c
(gdb) s
63	in ../sysdeps/unix/sysv/linux/tcgetattr.c
(gdb) s
42	in ../sysdeps/unix/sysv/linux/tcgetattr.c
(gdb) s
63	in ../sysdeps/unix/sysv/linux/tcgetattr.c
(gdb) s
42	in ../sysdeps/unix/sysv/linux/tcgetattr.c
(gdb) n
Warning:
Cannot insert breakpoint 0.
Cannot access memory at address 0x1ec4635941ad2a3e

__longjmp () at ../sysdeps/x86_64/__longjmp.S:45
45	../sysdeps/x86_64/__longjmp.S: No such file or directory.
(gdb) 

I know i have to install libc6 sym, but can't remember how...

@elextr
Copy link
Member

elextr commented Jun 7, 2022

@avafinger tcgetattr() is a linux library call so you can't set a breakpoint in it as its read only.

@avafinger
Copy link

Yes, but the breakpoint is not in tcgetattr(). The s (Step into) wants the source.
I would expect that gdb would just skip the sysdeps function if not found.
The GDB way is to tell gdb where the sysdeps source code is, but i think must be a better way.

(gdb) s
413	  tcgetattr(fd, &tty);
(gdb) s
__GI___tcgetattr (fd=3, termios_p=0x7fffffffde30)
    at ../sysdeps/unix/sysv/linux/tcgetattr.c:34
34	../sysdeps/unix/sysv/linux/tcgetattr.c: No such file or directory.
(gdb) info source
Current source file is ../sysdeps/unix/sysv/linux/tcgetattr.c
Compilation directory is /build/glibc-CVJwZb/glibc-2.27/termios
Source language is c.
Producer is GNU C11 7.5.0 -mtune=generic -march=x86-64 -g -O2 -O3 -std=gnu11 -fgnu89-inline -fmerge-all-constants -frounding-math -fstack-protector-strong -fPIC -ftls-model=initial-exec -fstack-protector-strong.
Compiled with DWARF 2 debugging format.
Does not include preprocessor macro info.
(gdb) frame 1
#1  0x000055555557556f in m_setparms (fd=3, 
    baudr=0x5555557866e0 <mpars+4032> "1500000", 
    par=0x555555786800 <mpars+4320> "N", bits=0x555555786770 <mpars+4176> "8", 
    stopb=0x555555786890 <mpars+4464> "1", hwf=1, swf=0) at sysdep1.c:413
413	  tcgetattr(fd, &tty);

@elextr
Copy link
Member

elextr commented Jun 7, 2022

The missing source is ok, in above you stepped into tcgetattr() stepped along a bit which was fine even if no source was shown, then did next not step. I would guess tcgetattr() does a trap to the kernal, and thats probably not tracable by GDB, which then gets very confused trying to do next. Debugging should work ok if you step over the call to tcgetattr() not into it.

@avafinger
Copy link

You are right, n can step over without throwing an error.
I need to find a way to install the sources in case i inadvertently hit [Step into]. :)

@elextr
Copy link
Member

elextr commented Jun 7, 2022

I doubt the source will make it work any better, thats only displayed for the human, GDB doesn't use it. :-)

@nomadbyte
Copy link
Contributor Author

nomadbyte commented Jun 7, 2022

@avafinger, If the minicom-bound issue is not directly related to this multi-threaded debugging issue, I'd rather see this topic discussed in a separate issue for clarity and completeness.

The current "patch" should be functionally equivalent to my original fix; this just addresses the comments raised by @b4n . Thus, any Debugger issues possibly exposed by debugging the minicom code should be summarized in a new issue.

@nomadbyte
Copy link
Contributor Author

Looking back, this may also fix the symptoms in #309 (Error 'Can't find a source file')

@avafinger
Copy link

avafinger commented Jun 7, 2022

Sorry, i didn't want to flood the thread.
The minicom issue has nothing to do with your patch and is related to gdb only. And reading gdb threads here and there the fix is telling gdb where libc6 source code is, or never, ever step in into libc6 sys function.
Update: you can close it. And i applied your patch over the already patched code and not from a clean code.

@avafinger
Copy link

avafinger commented Jun 7, 2022

@nomadbyte and others,

I opened a new issue to discuss the [Can't find a source file and no directory or file]. Please, make your considerations there.
#1179

@nomadbyte
Copy link
Contributor Author

nomadbyte commented Aug 25, 2022

@elextr I wonder if this PR is ready for the merge?

- By original design, the debugger allows thread context switching only
  interactively, while in the Call Stack pane (in Stopped mode). This is
  being handled directly through GtkTreeView::cursor-changed event.
- However, this event was also getting triggered when clearing the latest
  thread's Call Stack after continuing the execution (Running mode).
- This behavior does not seem to have being intended, and it lead to a
  number of issues: stepping hangs, corruption of GDB output processing,
  unnecessary attempts at opening of source files corresponding to thread's
  call-stack frames.
- To avoid all of these issues, GtkTreeView::cursor-changed event is left
  unhandled while clearing the frames. Also the handler for thread-context
  switching is allowed processing only in debugger's Stopped mode to enforce
  the intended behavior.
@elextr
Copy link
Member

elextr commented Mar 10, 2023

@nomadbyte I guess its waiting for "somebody" to review and test, IIRC @avafinger applied changes manually, it would be nice for someone to test the PR explicitly.

LGB(very cursory)I

@nomadbyte
Copy link
Contributor Author

... IIRC @avafinger applied changes manually, it would be nice for someone to test the PR explicitly.

Sure. However, as you probably could see, the changes are very much limited, so applying manually should be just as effective. Of course, I did the testing of the issue, as I described above.

It's been almost a year since this issue is in DONE state. Is there something that I could do to get this finally merged?

@b4n
Copy link
Member

b4n commented Mar 12, 2023

I just tried this using #1069 test case, and it's clearly better. The second example program (more reasonable) seems to work OK.

With the first example it gets confused and state desynchronizes (the plugin doesn't notice the program stopped), leading to the plugin giving weird-looking errors (though they aren't). I got fairly random results though, and believe once I had a hang, but it's a bit too random to tell. Anyway, I guess the issue here is that the threads are dying and thus just vanish, and something isn't coping well with that. Admittedly, an interactive GDB session is a bit confusing and random as well, but makes sense.

So there's something to improve here, but it does help quite a bit indeed 👍

@elextr elextr merged commit 2e1d4ff into geany:master Mar 13, 2023
@elextr
Copy link
Member

elextr commented Mar 13, 2023

@nomadbyte thanks (and thanks for good commit messages)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

How to debug a multithread application [ Debugger / gdb ]
4 participants