How to debug a multithread application [ Debugger / gdb ] #1069
Comments
This isn't anything to do with Geany itself; it is definitely to do with the GDB plugin, and the hang looks like a bug in either the plugin or GDB. Will move to geany-plugins. |
Additional info: GDB version(s):
The desired behavior: when the breakpoint is hit the first time (Thread 0), the debugger should ignore all hits from any other thread (in this sample, Thread 1) until the first thread has terminated (exited), and should not switch to any other thread. An interesting question: how do you debug Geany? |
Please disregard the hang problem: during debugging the main thread is likely to end before Thread 0 or Thread 1 does, which causes the issue. I debugged this on another PC with a newer kernel. Can you suggest how to get rid of this:
|
Geany is not a multi-threaded application. It's mostly GUI-activated code, and GTK is single threaded, so Geany is single threaded. There may be some threads used in the libraries Geany uses, but they are hidden there, and not having their source means it's unlikely that breakpoints will be set in those. |
Right, I fixed the errors by installing the glibc source code, but the debugger still wants to debug the thread_start in clone.S. I don't know how to disable this "feature". I am on Geany 1.38 built on 2021-02-08 (git code). Anyway, I came up with a better example and tested it with gdb to check whether it is a bug in gdb. It worked as expected. Trying to debug with Geany hangs; there seems to be a race condition, or it is waiting for some gdb info. Try to reproduce this inside Geany and it hangs:
Breakpoints at lines 26, 38, 62 and 65. Sample program:
PS: Geany will hang on line 62 after you click [Continue] |
I don't use or know anything about the debugger plugin, but I note in your example that there are thread creation messages for threads 0 and 1, yet thread 2 hits the breakpoint first. Perhaps that's just a human UI issue; you need to show the output from the GDB/MI interface that the debugger plugin uses. But if it is the same, I would expect that having a thread it doesn't know about hit the breakpoint may confuse the debugger plugin. Unfortunately the MAINTAINERS file does not have the debugger plugin maintainer's GitHub username, so I can't ping them. |
Thank you for your reply. I hope they're still around and can give some input here... I need to learn how to debug the Geany plugin and understand how it works. |
Not a solution, but you may try to use Scope Debugger -- that's another GDB plugin for Geany (to see the 'scope.html' simply click the plugin's Help in Plugin Manager). It also supports threads and seems to cope somewhat better with following the thread context. In your example, Scope will even show two execution-line pointers for each of the worker thread states (in All-stop GDB mode), as these share the same work-function. Minor Scope quirks:
BTW, in your example code, |
Also note that both plugins are just user interfaces for GDB which does the actual debugging, so things like "wants to debug |
Well, Scope does not seem to have this issue (i.e. trying to break in |
Quite possibly, IIRC debugger plugin is pretty old and predates current extensive thread usage in every application. Or as the OP observed, thread execution is non-deterministic, scope may just change the timing on your computer enough that whatever the problem is doesn't occur, ahh the joys of thread debugging 😈. |
Looking closer, it's clearly a problem with the way Debugger processes the thread's call-stack. Somehow it removes the current function frame from the stack on stepping ( |
Maybe it's something that changed with newer versions of GDB and debugger hasn't been updated; the last substantive change to debugger (excluding the GTK3 port and minor things) seems to have been in 2016 AFAICT. |
@avafinger: I revisited this issue and wanted to point out that the original expectation, that stepping from the breakpoint in the thread-2 context should not switch to the thread-3 context, is rather unwarranted in this case. In your example both threads share the same task function, so the breakpoint is also shared. It is hit by thread-2, but the step-over resumes execution of the whole process, so the same breakpoint is then swiftly hit by thread-3, as it should be. If you'd like to continue stepping in the thread-2 context, you need to explicitly switch to thread-2 in the Call Stack pane. Subsequent stepping should then preserve the selected thread context (unless another thread hits its own breakpoint someplace). This also aligns with how it's handled in an interactive GDB session. Anyway, I did some digging and I believe I've got the intended thread-switching/stepping behavior working (at least with your test code). I'll be pushing the changes soon. |
Hi @nomadbyte , thank you, and nice you are working on this.
In a real-world scenario, the same function is called by many threads, hundreds or thousands. Imagine a web server handling thousands of users connecting at the same time: it would be impossible to debug a single thread if the user had to switch the context back manually at every step. I am basing my assumptions on a comparison with what Visual Studio does: once the breakpoint is hit in a thread (the first time), all other hits from other threads on the same function are ignored or halted, and you can debug that specific thread (the first hit) step by step without worrying about the context switching to another thread. If I am not mistaken, Eclipse does the same, but I am not an Eclipse user, just mentioning it. Once you push your code I can test with a more complex sample. But I think the sample code should be as simple as possible. Thank you and cheers |
Here it is (#1170). Mind that this does not change the original design; it simply enforces the intended behavior when debugging multi-threaded programs. This aligns with GDB's All-Stop Mode (default) flow, as also shown in your interactive GDB session above. It should also take care of the "annoying" (I would guess, unintended) error dialogs popping up when trying to step from the thread's breakpoint, which complained about missing system sources for the pthread frames etc. |
I tested your fix with the sample thread2.c (attached). Here are my findings:
Threads are not predictable, but in the sample, Thread[0] will always finish first, because it is fired first, in theory. First run:
Second run:
So we can assume that. Maybe printf is not the best thing to use in the example. Now let's build and put breakpoints on lines 25 and 34. But it switches to the new context when the next thread hits the breakpoint (line 25), and if you keep pressing [Step Over] until the end of the thread, you can see that variable i does not change, yet we are now in the next thread's context. I think the next breakpoint hit on line 25 should be ignored and we should stay in the previous context until we exit from the thread. I will port this sample to Visual C and compare the results, then mark this Closed or add new comments. If anyone would like to comment on the assumption, please feel free to do so. Anyway, @nomadbyte, thank you for your work. |
@avafinger: Thanks for the quick turnaround with the testing. I'm glad that your results seem to show that the mentioned issues are gone. If I understand it correctly, your assumption about disregarding the breakpoints in peer threads is not consistent with the GDB All-Stop mode. Just to reiterate this, in GDB All-Stop mode:
Not sure how this works for your practical cases, but this GDB behavior does make sense, especially when threads do not share the task-function. When such threads hit breakpoints in their task-function code, it's reasonable -- and convenient too -- to expect thread context to switch to that breaking thread. As for how to achieve your desired thread context switching (or rather non-switching) behavior -- one simple and common way is by making the breakpoints conditional on the intended thread-id (or some surrogate). In your example, the watched variable i could be used in the shared breakpoint condition (e.g. i==0, for making the breakpoint effective only for thread 0). The condition can be added in the Breakpoints pane. P.S. pushed some updates (#1170) to prevent the "annoying popups" also on a premature Stop/End of the debugging run. |
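For reference, the conditional-breakpoint suggestion could be tried against code shaped like the following. This is only a minimal sketch, not the sample attached to this issue: `worker_arg`, `task` and `run_workers` are hypothetical names, and it assumes each thread is handed its index through the argument pointer so that a local `i` identifies the thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

#define MAX_WORKERS 8

/* Hypothetical per-thread argument: the worker's index and a progress counter. */
struct worker_arg {
    int index;
    int steps_done;
};

/* Task function shared by every worker thread. */
static void *task(void *p)
{
    struct worker_arg *arg = p;
    int i = arg->index;   /* local copy, usable in a breakpoint condition */

    for (int step = 0; step < 3; step++) {
        /* Breakpoint candidate: with the condition i == 0,
           only worker 0 would stop on this line. */
        arg->steps_done++;
    }
    printf("worker %d finished\n", i);
    return NULL;
}

/* Starts n workers on the shared task and waits for all of them.
   Returns 0 on success, -1 on bad input or a pthread_create failure. */
int run_workers(int n, struct worker_arg *args)
{
    pthread_t tid[MAX_WORKERS];

    assert(args != NULL);
    if (n < 1 || n > MAX_WORKERS)
        return -1;

    for (int k = 0; k < n; k++) {
        args[k].index = k;
        args[k].steps_done = 0;
        if (pthread_create(&tid[k], NULL, task, &args[k]) != 0) {
            /* Join the workers that did start before reporting failure. */
            for (int j = 0; j < k; j++)
                pthread_join(tid[j], NULL);
            return -1;
        }
    }
    /* Joining keeps the caller alive until every worker has exited. */
    for (int k = 0; k < n; k++)
        pthread_join(tid[k], NULL);
    return 0;
}
```

A `main` that just calls `run_workers` with a small array is enough to try this under the debugger (build with `-g -pthread`). In an interactive GDB session the equivalent of the Breakpoints-pane condition is `break <location> if i == 0`. Joining the workers before returning also avoids the situation mentioned earlier in this thread, where the main thread ends before the workers do.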
The example ( variable i ) was used just to show the context had changed when you clicked [Step over]. I agree with the "stepping/continuing from the Stopped state will resume all concurrent threads" and "if a breakpoint is hit by a concurrent thread, GDB will halt all running threads and switch to that thread's context", maybe the design should be changed to support my thinking (or the user). The user wants to debug thread-id 0x123456 and not 0x999999 or 0x777777 which share the same task function and same breakpoints. I haven't tested this on Visual C yet. |
@avafinger also just to comment on a couple of things you said above which may indicate a misunderstanding of threads.
No, theory says exactly the opposite: nothing guarantees that. A thread can be scheduled to run or not at the OS's convenience, which depends on what the many other threads in the desktop and other applications want to do at the same time; no current CPU has enough cores to run them all at once, even with hyper-threading. So whether a newly created thread gets to run right away depends entirely on scheduling.
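As a rough illustration of that point, here is a minimal sketch in C (again not the sample from this issue; `record_start` and `observe_start_order` are hypothetical names) that records the order in which threads actually begin running. The threads are created in id order, but nothing in the code forces them to run in that order.

```c
#include <pthread.h>
#include <stdio.h>

#define N_THREADS 4

static pthread_mutex_t order_lock = PTHREAD_MUTEX_INITIALIZER;
static int next_slot;               /* next free position in start_order */
static int start_order[N_THREADS];  /* start_order[k] = id of the k-th thread to run */

/* Each thread appends its own id to start_order when it gets scheduled. */
static void *record_start(void *p)
{
    int id = *(int *)p;

    pthread_mutex_lock(&order_lock);
    start_order[next_slot++] = id;
    pthread_mutex_unlock(&order_lock);
    return NULL;
}

/* Creates N_THREADS threads in id order, waits for them, and copies the
   observed run order into out. Returns 0 on success, -1 on failure. */
int observe_start_order(int out[N_THREADS])
{
    pthread_t tid[N_THREADS];
    int ids[N_THREADS];
    int created = 0;

    next_slot = 0;
    for (int k = 0; k < N_THREADS; k++) {
        ids[k] = k;
        if (pthread_create(&tid[k], NULL, record_start, &ids[k]) != 0)
            break;
        created++;
    }
    for (int k = 0; k < created; k++)
        pthread_join(tid[k], NULL);
    if (created != N_THREADS)
        return -1;

    for (int k = 0; k < N_THREADS; k++)
        out[k] = start_order[k];
    return 0;
}
```

The only property the sketch guarantees is that `out` holds each id exactly once. Whether it comes back as 0, 1, 2, 3 or in some other order is up to the scheduler and may differ between runs and machines.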
IIUC, in practice[1] printf is fine for threads: each printf will be atomic, so individual characters won't mix, and therefore the order of output shows the order of execution, which is useful for debugging. The output of both of your runs is therefore perfectly reasonable. Remember that main is also a thread and subject to scheduling. Looking at the first few lines, in run 1 main returned from pthread_create and ran its "successful" printf before thread[0] got to its first printf, but in the second run thread[0] ran two printfs before the main thread got to run its "successful" printf. See the comment above about scheduling. [Edit: further interpretation of execution order is left as an exercise for the reader 😄] Footnotes
|
Let's add fuel to the fire. :) https://it-qa.com/why-is-thread-behavior-considered-unpredictable/ PS: Just for reading... I agree with you all. :) |
Today I ported the code to Windows and tested it with Visual C++, and I found the same behavior as with the proposed fix. I also tested your latest update; so far so good. Thank you @nomadbyte and @elextr. If you have any suggestions on how to switch to the previous context within the Geany debugger, that would be nice. |
@avafinger: I assume, by that you mean switching to a previous frame's context, so that you could also inspect the frame's local vars in addition to pointing at the frame's entry source. In such case -- click the frame's arrow (in Call Stack pane), it will turn yellow instead of gray, and the Debugger will load the frame's local vars in the Autos pane. |
Good. When opening the Thread ID in the Call stack panel, the debugger switches to this context, which is exactly what I wanted. |
I am not sure where I should open this issue, here or in geany-plugins.
I need to debug a multithreaded app with Geany with debugger plugin (gdb) but Geany hangs inside the thread.
I have been using Geany for a long time but I can't find a way to debug a thread.
To reproduce the problem, build and debug the sample thread.c, setting breakpoints at lines 22 and 34, and then run.
Build
thread.c
Geany correctly stops at the first breakpoint, but if I click "Step over", Geany switches to the second thread, and clicking "Step over" again hangs Geany. Eventually a crash occurs if you try to close Geany.
The correct behavior should be to stay in the same thread and walk through the code while "Step over" is hit, line by line.