-
Notifications
You must be signed in to change notification settings - Fork 101
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement native profiling #373
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this! The implementation looks good. It doesn't currently build on macOS because some functions and variables are hidden by the cfg
directives.
I'm also thinking about test coverage. Maybe we can do something similar to the existing tests that load a core dump and get a stack trace from it. If you have ideas, feel free to share and/or try them.
Next steps:
- Please rebase onto main (should be easy - minor fixes and dependency upgrades), and then I'll approve the build so that CI can tell us what's broken. Let me know if you'd like help with the macOS build
- I'll put this through its paces in my Linux VM and will think about testing strategies a bit more
@@ -106,15 +127,36 @@ impl RubySpy { | |||
} | |||
} | |||
|
|||
fn get_trace_from_current_thread(&self, lock_process: bool) -> Result<StackTrace> { | |||
fn get_trace_from_current_thread(&mut self, lock_process: bool) -> Result<StackTrace> { | |||
if self.native_profiling && cfg!(any(windows, target_os = "linux")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if self.native_profiling && cfg!(any(windows, target_os = "linux")) { | |
if self.native_profiling && cfg!(any(target_os = "windows", target_os = "linux")) { |
break; | ||
} | ||
|
||
// TODO: find a way to filter out internal ruby core functions, and make it configurable? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you give an example of these?
let mut frames = vec![]; | ||
for (i, ruby_frame) in ruby_frames.iter().enumerate() { | ||
let is_cfunc = ruby_frame.name.contains("[c function]"); | ||
let mut found_start_of_cfunc = false || (i == 0 && is_cfunc); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let mut found_start_of_cfunc = false || (i == 0 && is_cfunc); | |
let mut found_start_of_cfunc = i == 0 && is_cfunc; |
.function | ||
.as_ref() | ||
.map(|s| s as &str) | ||
.unwrap_or_else(|| "cfunc (unnamed)".into()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there's a chance that this will be printed, it should probably use the "[c function]" notation like the existing C func code does.
match self.symbolicator.symbolicate(ip, true, func) { | ||
Ok(_) => Ok(()), | ||
Err(e) => { | ||
// Do not reload more than once per IP which is not found |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious, is this a performance optimization? Or does it cause an error if we reload more than once per IP?
@@ -437,6 +446,8 @@ impl Args { | |||
}; | |||
|
|||
let no_drop_root = submatches.occurrences_of("no-drop-root") == 1; | |||
let native_profiling = submatches.occurrences_of("native-profiling") == 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be able to use is_present
here like we do for the boolean options below.
resolves #372
How does it work?
When the
native-profiling
flag is on and we are on linux or windows (remoteprocess only supports unwinding on these 2 platforms afaik) then, after getting the ruby trace, we:The start of a c function is:
rb_vm_exec
and the end of it is:vm_call_cfunc_with_frame
.By interweaving native and ruby frames this way, we get to profile both ruby and native code!
Limitations
It seems like this may not work with YJIT, as it can get rid of some
rb_vm_exec
calls and can optimisevm_call_cfunc_with_frame