-
Notifications
You must be signed in to change notification settings - Fork 1
Fix GPU process to exit when native process exits #38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
dmpots
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
| void SetLaunchInfo(ProcessLaunchInfo &launch_info); | ||
|
|
||
| /// Called when the native process exits to set the GPU process exit status | ||
| void HandleNativeProcessExit(const WaitStatus &exit_status); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should mark this as override
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this method is not an override. The only mandatory interface in this patch is at the plugin level.
walter-erquinigo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just minimal changes left. Thank you for doing this :)
| auto exit_status = process->GetExitStatus(); | ||
| if (exit_status.has_value()) { | ||
| for (auto &plugin_up : m_plugins) { | ||
| plugin_up->NativeProcessDidExit(*exit_status); | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't use auto unless the type is unreadable. In this case it should be very readable
also, the if/for statements are very simple, so you should remove braces. See https://llvm.org/docs/CodingStandards.html#don-t-use-braces-on-simple-single-statement-bodies-of-if-else-loop-statements
| /// GPU plugins to perform proper termination | ||
| /// | ||
| /// \param[in] exit_status The exit status of the native process. | ||
| virtual void NativeProcessDidExit(const WaitStatus &exit_status) {}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make this pure virtual. I think we shuoldn't have a default implementation for this because termination must be handled by each plugin properly
| /// Get the GPU plug-in notified when the native process exits. | ||
| /// | ||
| /// This function will get called when the native process exits. This allows | ||
| /// GPU plugins to perform proper termination |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing period
| if (auto *mock_gpu_process = static_cast<ProcessMockGPU *>(gpu_process)) { | ||
| mock_gpu_process->HandleNativeProcessExit(exit_status); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove braces
| // Notify GPU plugins that the native process has exited | ||
| std::optional<WaitStatus> exit_status = process->GetExitStatus(); | ||
| if (exit_status.has_value()) | ||
| for (auto &plugin_up : m_plugins) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't use auto here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code should look like this
// Notify server plugins that the native process has exited
std::optional<WaitStatus> exit_status = process->GetExitStatus();
if (exit_status.has_value())
for (std::unique_ptr<lldb_server::LLDBServerPlugin> &plugin_up : m_plugins)
plugin_up->NativeProcessDidExit(*exit_status);
| // Handle exiting the GPU process when a native process exits. | ||
| virtual void HandleNativeProcessExit(const WaitStatus &exit_status) {}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function doesn't make sense for CPU processes.
Could you extend NativeProcessProtocol as a new class GPUProcessProtocol that has this additional method?
That would leave CPU process clean.
Then, make this function pure virtual
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've changed my mind. You don't need this function at all. It's enough to add NativeProcessDidExit to LLDBServerPlugin.h. Each plugin should decide how they want to manage this high level event.
|
@qxy11 , I've copied most of your changes from this PR onto my own nvidia branch and it works for me :) |
Summary: Have each plugin process decide how they want to handle native process exit.
This applies the WIP PR #38 to our branch, which ensures that whenever the CPU exits, the GPU also reports its exit with the same exit code.
This applies the WIP PR #38 to our branch, which ensures that whenever the CPU exits, the GPU also reports its exit with the same exit code.
This applies the WIP PR #38 to our branch, which ensures that whenever the CPU exits, the GPU also reports its exit with the same exit code.
Summary
Right now when the native process exists, we get a lost connection for the GPU target:
The desired behavior should be that the GPU connection returns an exit status when the native process exits, returning a
$WXXpacket. This change fixes this so that when the native process is exiting, it notifies the GPU plugin to exit as well.This currently is done in the Mock GPU plugin, and sets the exit status for the GPU process to the same one as the native process, but we can extend and follow up on AMD once this is approved.
Tests
We can follow up with unit tests once the basic unit tests are landed from other PRs.
Basic test running until native process reached completion:
Check server logs:
As expected, the both processes send back
$W00packets now. Themock-gpu.serverpacket doesn't include the process ID since it doesn't have multi-process support enabled.Test killing the process:
Test native process segfaults and exits: