-
Notifications
You must be signed in to change notification settings - Fork 155
Fix guest tracing deadlock when exception happens during tracing data serialization #1066
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
ludfjig
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Note that the linked issue will be closed when this merges which may or may not be your intention
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR addresses a deadlock issue in guest tracing that occurs when exceptions happen during tracing data serialization. The fix converts direct lock() calls to try_lock() throughout the guest tracing subsystem to detect re-entrant access patterns, and removes the #[instrument] macro from out32() to prevent span creation during exception handling.
Key changes:
- Replaced
lock()withtry_lock().expect()in subscriber methods and lib.rs functions to detect and panic on re-entrant access - Removed
#[instrument]fromout32()which can be called from exception contexts - Simplified string creation in visitor.rs (refactoring cleanup)
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| src/hyperlight_guest_tracing/src/subscriber.rs | Converted all Subscriber trait methods to use try_lock() with panic on failure to detect re-entrant access |
| src/hyperlight_guest_tracing/src/lib.rs | Updated set_start_tsc, end_trace, clean_trace_state, and guest_trace_info to use try_lock(); fixed "guset" typo |
| src/hyperlight_guest/src/exit.rs | Removed #[instrument] from out32() and added documentation explaining why tracing is avoided in exception contexts |
| src/hyperlight_guest_tracing/src/visitor.rs | Refactored string creation to use String::from() directly instead of creating empty String and pushing |
748dd61 to
8045903
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 4 out of 7 changed files in this pull request and generated 1 comment.
ludfjig
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
…t to report trace data When guest tracing is enabled and an exception occurs while the guest trace state lock is acquired (e.g. no more memory left to allocate a trace event on the heap), the guest tries to report to the host what the error was by calling the `out32` function. This function reports the guest trace data by serializing it on the spot, which needs to acquire the lock again, this leads to a deadlock. Signed-off-by: Doru Blânzeanu <dblnz@pm.me>
Signed-off-by: Doru Blânzeanu <dblnz@pm.me>
8045903 to
2c69974
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 4 out of 7 changed files in this pull request and generated 2 comments.
Partially addresses #1032.
This fixes an issue where the guest hangs because of a deadlock produced when an exception occurs in the tracing code and the
GuestStatelock is acquired. When handling the exception, the call tooutbfunction, tries to acquire the same lock another time, which leads to a deadlock.