Fix GPU process to exit when native process exits #38

qxy11 · 2025-08-25T23:36:01Z

Summary

Right now when the native process exits, we get a lost connection for the GPU target:

(lldb) target select 0
Current targets:
* target #0: /home/qxy11/llvm/Debug/a.out ( arch=x86_64-unknown-linux-gnu, platform=host, pid=242142, state=stopped )
  target #1: <none> ( arch=x86_64-unknown-linux-gnu, platform=host, pid=1234, state=running )
(lldb) c
Process 3805000 resuming
Process 3805000 exited with status = 0 (0x00000000) 
Process 1234 exited with status = -1 (0xffffffff) lost connection
(lldb) q

The desired behavior is for the GPU connection to report an exit status when the native process exits by returning a $WXX packet. With this change, the native process notifies the GPU plugin when it is exiting so that the GPU process exits as well.

This is currently implemented only in the Mock GPU plugin, which sets the GPU process's exit status to the native process's exit status. We can extend this to AMD in a follow-up once this is approved.

Tests

We can follow up with unit tests once the basic unit tests from other PRs have landed.

Basic test, running until the native process reaches completion:

(lldb) c
Process 1234 resuming
(lldb) target select 0
Current targets:
* target #0: /home/qxy11/llvm/Debug/a.out ( arch=x86_64-unknown-linux-gnu, platform=host, pid=3805000, state=stopped )
  target #1: <none> ( arch=x86_64-unknown-linux-gnu, platform=host, pid=1234, state=running )
(lldb) c
Process 3805000 resuming
gpu_shlib_load
gpu_third_stop
gpu_shlib_load
gpu_kernel
Process 3805000 exited with status = 0 (0x00000000) 
Process 1234 exited with status = 0 (0x00000000) 
(lldb)

Check server logs:

1756162713.459808350 [3383979/3383979] gdb-server <  22> read packet: $vCont;c:p33a2ae.-1#9d
1756162713.459902287 [3383979/3383979] gdb-server <  61> send packet: $O6770755f73686c69625f6c6f61640d0a6770755f6b65726e656c0d0a#43
1756162713.460208416 [3383979/3383979] ProcessMockGPU::HandleNativeProcessExit() native process exited with status=(Exited with status 0)
1756162713.460271358 [3383979/3383979] mock-gpu.server <   7> send packet: $W00#b7
1756162713.460320950 [3383979/3383979] gdb-server <  22> send packet: $W00;process:33a2ae#ea
lldb-server exiting...

As expected, both processes now send back $W00 packets. The mock-gpu.server packet doesn't include the process ID since it doesn't have multi-process support enabled.
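For readers unfamiliar with the packet framing in the log above: the two hex digits after `#` are the gdb-remote checksum, the modulo-256 sum of the payload bytes between `$` and `#`. The following is a minimal sketch that reproduces both `W` packets from the log. The helper names `PacketChecksum`, `FramePacket`, and `MakeExitPacket` are hypothetical and are not LLDB APIs, and the sketch assumes payloads that need no escaping.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper (not an LLDB API): the gdb-remote checksum is the
// modulo-256 sum of the payload bytes between '$' and '#'.
inline uint8_t PacketChecksum(const std::string &payload) {
  unsigned sum = 0;
  for (unsigned char c : payload)
    sum += c;
  return static_cast<uint8_t>(sum & 0xff);
}

// Hypothetical helper: frame a payload as "$<payload>#<two hex digits>".
// Assumption: the payload contains no '$' or '#', so no escaping is needed.
inline std::string FramePacket(const std::string &payload) {
  assert(payload.find('$') == std::string::npos &&
         payload.find('#') == std::string::npos);
  char checksum[3];
  std::snprintf(checksum, sizeof(checksum), "%02x",
                static_cast<unsigned>(PacketChecksum(payload)));
  return "$" + payload + "#" + checksum;
}

// Hypothetical helper: build a "W" (process exited) stop reply for an
// 8-bit exit code, without the multi-process ";process:<pid>" suffix.
inline std::string MakeExitPacket(uint8_t exit_code) {
  char payload[4];
  std::snprintf(payload, sizeof(payload), "W%02x",
                static_cast<unsigned>(exit_code));
  return FramePacket(payload);
}
```

With these helpers, `MakeExitPacket(0)` yields the mock-gpu.server packet `$W00#b7`, and `FramePacket("W00;process:33a2ae")` yields the gdb-server packet `$W00;process:33a2ae#ea`, matching the log.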

Test killing the process:

(lldb) target select 0
Current targets:
* target #0: /home/qxy11/llvm/Debug/a.out ( arch=x86_64-unknown-linux-gnu, platform=host, pid=3879593, state=stopped )
  target #1: <none> ( arch=x86_64-unknown-linux-gnu, platform=host, pid=1234, state=running )
(lldb) process kill
Process 1234 exited with status = 9 (0x00000009) 
Process 3879593 exited with status = 9 (0x00000009) killed
(lldb)

Test native process segfaults and exits:

(lldb) intern-state     pid = 2581667, SyncState::SetStateStopped(stop_id=4) m_stop_id = 4, m_state = stopped 
intern-state     pid = 2581667, SyncState::DidResume() m_stop_id = 4, m_state = running
intern-state     pid = 2581667, SyncState::SetStateStopped(stop_id=5) m_stop_id = 5, m_state = stopped 
Process 2581667 stopped
* thread #1, name = 'a.out', stop reason = signal SIGSEGV: address not mapped to object (fault address=0x0)
    frame #0: 0x00005555555551e7 a.out`main(argc=1, argv=0x00007fffffffd6a8) at memory-space-main.c:24:6
   21     gpu_initialize();
   22     // CPU BREAKPOINT - BEFORE LAUNCH
   23     int *p = NULL;
-> 24     *p = 42;
   25     gpu_shlib_load();
   26     gpu_third_stop();
   27     gpu_shlib_load();
Likely cause: p accessed 0x0
(lldb) c
lldb             pid = 2581667, SyncState::DidResume() m_stop_id = 5, m_state = running
Process 2581667 resuming
Process 2581667 exited with status = 11 (0x0000000b) 
Process 1234 exited with status = 11 (0x0000000b) 
(lldb)

dmpots

LGTM

dmpots · 2025-08-26T18:38:17Z

lldb/tools/lldb-server/Plugins/MockGPU/ProcessMockGPU.h

  void SetLaunchInfo(ProcessLaunchInfo &launch_info);
+
+  /// Called when the native process exits to set the GPU process exit status
+  void HandleNativeProcessExit(const WaitStatus &exit_status);


We should mark this as override

this method is not an override. The only mandatory interface in this patch is at the plugin level.

walter-erquinigo

Just minimal changes left. Thank you for doing this :)

walter-erquinigo · 2025-08-27T13:49:15Z

lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunicationServerLLGS.cpp

+  auto exit_status = process->GetExitStatus();
+  if (exit_status.has_value()) {
+    for (auto &plugin_up : m_plugins) {
+      plugin_up->NativeProcessDidExit(*exit_status);
+    }
+  }


don't use auto unless the type is unreadable. In this case it should be very readable

also, the if/for statements are very simple, so you should remove braces. See https://llvm.org/docs/CodingStandards.html#don-t-use-braces-on-simple-single-statement-bodies-of-if-else-loop-statements

walter-erquinigo · 2025-08-27T13:50:10Z

lldb/source/Plugins/Process/gdb-remote/LLDBServerPlugin.h

+  /// GPU plugins to perform proper termination
+  ///
+  /// \param[in] exit_status The exit status of the native process.
+  virtual void NativeProcessDidExit(const WaitStatus &exit_status) {};


make this pure virtual. I think we shouldn't have a default implementation for this because termination must be handled properly by each plugin
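To illustrate the difference this suggestion makes, here is a minimal sketch using simplified stand-ins. `WaitStatus` here is a toy struct, and `ServerPluginSketch` and `MockPluginSketch` are hypothetical names, not the real LLDB classes. With a pure virtual declaration, a plugin that does not implement the hook cannot be instantiated, whereas an empty default body would let it silently ignore the exit.

```cpp
#include <cassert>
#include <optional>
#include <type_traits>

// Simplified stand-in for illustration; the real WaitStatus in LLDB
// carries more information than a single integer.
struct WaitStatus {
  int status = 0;
};

// Hypothetical plugin interface, loosely modeled on the hook in this PR.
class ServerPluginSketch {
public:
  virtual ~ServerPluginSketch() = default;
  // Pure virtual: every concrete plugin must decide how to handle
  // native process exit, or it will not compile.
  virtual void NativeProcessDidExit(const WaitStatus &exit_status) = 0;
};

// The interface itself cannot be instantiated.
static_assert(std::is_abstract_v<ServerPluginSketch>,
              "plugins must implement NativeProcessDidExit");

// Hypothetical mock plugin that mirrors the native exit status, which is
// the behavior the PR summary describes for the Mock GPU plugin.
class MockPluginSketch : public ServerPluginSketch {
public:
  void NativeProcessDidExit(const WaitStatus &exit_status) override {
    m_gpu_exit_status = exit_status;
  }

  const std::optional<WaitStatus> &GetGPUExitStatus() const {
    return m_gpu_exit_status;
  }

private:
  std::optional<WaitStatus> m_gpu_exit_status;
};
```

In this sketch, calling `NativeProcessDidExit(WaitStatus{11})` on a `MockPluginSketch` leaves `GetGPUExitStatus()` holding status 11, mirroring the segfault test in the PR description.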

walter-erquinigo · 2025-08-27T13:50:19Z

lldb/source/Plugins/Process/gdb-remote/LLDBServerPlugin.h

+  /// Get the GPU plug-in notified when the native process exits.
+  ///
+  /// This function will get called when the native process exits. This allows
+  /// GPU plugins to perform proper termination


missing period

walter-erquinigo · 2025-08-27T13:50:38Z

lldb/tools/lldb-server/Plugins/MockGPU/LLDBServerPluginMockGPU.cpp

+  if (auto *mock_gpu_process = static_cast<ProcessMockGPU *>(gpu_process)) {
+    mock_gpu_process->HandleNativeProcessExit(exit_status);
+  }


remove braces

walter-erquinigo · 2025-09-03T19:25:32Z

lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunicationServerLLGS.cpp

+  // Notify GPU plugins that the native process has exited
+  std::optional<WaitStatus> exit_status = process->GetExitStatus();
+  if (exit_status.has_value())
+    for (auto &plugin_up : m_plugins) {


don't use auto here

The code should look like this

// Notify server plugins that the native process has exited
std::optional<WaitStatus> exit_status = process->GetExitStatus();
if (exit_status.has_value())
  for (std::unique_ptr<lldb_server::LLDBServerPlugin> &plugin_up : m_plugins)
    plugin_up->NativeProcessDidExit(*exit_status);

walter-erquinigo · 2025-09-03T19:27:33Z

lldb/include/lldb/Host/common/NativeProcessProtocol.h

+  // Handle exiting the GPU process when a native process exits.
+  virtual void HandleNativeProcessExit(const WaitStatus &exit_status) {};


This function doesn't make sense for CPU processes.
Could you extend NativeProcessProtocol as a new class GPUProcessProtocol that has this additional method?
That would leave CPU process clean.
Then, make this function pure virtual

I've changed my mind. You don't need this function at all. It's enough to add NativeProcessDidExit to LLDBServerPlugin.h. Each plugin should decide how they want to manage this high level event.
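As a rough sketch of what "each plugin should decide" could look like, the example below keeps a single plugin-level hook and lets two plugins react differently. All names are hypothetical (`ServerPluginSketch`, `MirroringPlugin`, `TeardownOnlyPlugin`, `NotifyPlugins`), and `WaitStatus` is a toy stand-in rather than the real LLDB type.

```cpp
#include <cassert>
#include <memory>
#include <optional>
#include <vector>

// Simplified stand-in for illustration only.
struct WaitStatus {
  int status = 0;
};

// Hypothetical plugin interface with a single high-level exit event.
class ServerPluginSketch {
public:
  virtual ~ServerPluginSketch() = default;
  virtual void NativeProcessDidExit(const WaitStatus &exit_status) = 0;
};

// Policy 1 (hypothetical): report the same exit status as the CPU process.
class MirroringPlugin : public ServerPluginSketch {
public:
  void NativeProcessDidExit(const WaitStatus &exit_status) override {
    reported = exit_status.status;
  }
  std::optional<int> reported;
};

// Policy 2 (hypothetical): only release the plugin's own resources.
class TeardownOnlyPlugin : public ServerPluginSketch {
public:
  void NativeProcessDidExit(const WaitStatus &) override { torn_down = true; }
  bool torn_down = false;
};

// Stand-in for the server-side loop: notify every plugin, but only when
// the native process actually has an exit status.
inline void
NotifyPlugins(const std::optional<WaitStatus> &exit_status,
              const std::vector<std::unique_ptr<ServerPluginSketch>> &plugins) {
  if (!exit_status.has_value())
    return;
  for (const std::unique_ptr<ServerPluginSketch> &plugin_up : plugins)
    plugin_up->NativeProcessDidExit(*exit_status);
}
```

The server code never needs to know about a process-level method such as `HandleNativeProcessExit`. It only calls the plugin hook, and each plugin maps that event onto its own process however it sees fit.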

walter-erquinigo · 2025-09-15T21:33:46Z

@qxy11 , I've copied most of your changes from this PR onto my own nvidia branch and it works for me :)

Summary: Have each plugin process decide how they want to handle native process exit.

This applies the WIP PR #38 to our branch, which ensures that whenever the CPU exits, the GPU also reports its exit with the same exit code.

Fix GPU process to exit when native process exits

a784a68

dmpots requested review from clayborg and walter-erquinigo and removed request for clayborg August 26, 2025 18:30

dmpots approved these changes Aug 26, 2025

walter-erquinigo requested changes Aug 27, 2025

Address lints, make exit calls pure virtual, add bare bones AMD implementation

0ccdd79

qxy11 requested a review from walter-erquinigo August 28, 2025 17:06

walter-erquinigo requested changes Sep 3, 2025

qxy11 added 2 commits September 18, 2025 12:55

Remove HandleNativeProcessExit

c22a71d

Summary: Have each plugin process decide how they want to handle native process exit.

Remote auto and brackets

1dba57c

qxy11 requested a review from walter-erquinigo September 18, 2025 23:52

walter-erquinigo approved these changes Sep 19, 2025

walter-erquinigo added a commit that referenced this pull request Sep 30, 2025

[LLDB][NVDIA] Exit both the CPU and GPU with the same exit status

3f7853f

This applies the WIP PR #38 to our branch, which ensures that whenever the CPU exits, the GPU also reports its exit with the same exit code.

walter-erquinigo added a commit that referenced this pull request Oct 9, 2025

[LLDB][NVDIA] Exit both the CPU and GPU with the same exit status

45f25aa

This applies the WIP PR #38 to our branch, which ensures that whenever the CPU exits, the GPU also reports its exit with the same exit code.

clayborg approved these changes Oct 14, 2025

dmpots merged commit 08b5c9a into clayborg:llvm-server-plugins Oct 14, 2025
5 checks passed

walter-erquinigo added a commit that referenced this pull request Oct 24, 2025

[LLDB][NVDIA] Exit both the CPU and GPU with the same exit status

35fa345

This applies the WIP PR #38 to our branch, which ensures that whenever the CPU exits, the GPU also reports its exit with the same exit code.
