Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement native profiling #373

Open
wants to merge 5 commits into
base: main
Choose a base branch
from
Open

Implement native profiling #373

wants to merge 5 commits into from

Conversation

Maaarcocr
Copy link

@Maaarcocr Maaarcocr commented Dec 15, 2022

resolves #372

How does it work?

When the native-profiling flag is on and we are on linux or windows (remoteprocess only supports unwinding on these 2 platforms afaik) then, after getting the ruby trace, we:

  • get native stack frames using remoteprocess
  • go through the ruby frames, when a c function is found
    • find the closest start of a c function invocation in the native frames
    • add all the frames from it
    • stop adding when we know we are going back to ruby

The start of a c function is: rb_vm_exec and the end of it is: vm_call_cfunc_with_frame.
By interweaving native and ruby frames this way, we get to profile both ruby and native code!

Limitations

It seems like this may not work with YJIT, as it can get rid of some rb_vm_exec calls and can optimise vm_call_cfunc_with_frame

Copy link
Member

@acj acj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this! The implementation looks good. It doesn't currently build on macOS because some functions and variables are hidden by the cfg directives.

I'm also thinking about test coverage. Maybe we can do something similar to the existing tests that load a core dump and get a stack trace from it. If you have ideas, feel free to share and/or try them.

Next steps:

  • Please rebase onto main (should be easy - minor fixes and dependency upgrades), and then I'll approve the build so that CI can tell us what's broken. Let me know if you'd like help with the macOS build
  • I'll put this through its paces in my Linux VM and will think about testing strategies a bit more

@@ -106,15 +127,36 @@ impl RubySpy {
}
}

fn get_trace_from_current_thread(&self, lock_process: bool) -> Result<StackTrace> {
fn get_trace_from_current_thread(&mut self, lock_process: bool) -> Result<StackTrace> {
if self.native_profiling && cfg!(any(windows, target_os = "linux")) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if self.native_profiling && cfg!(any(windows, target_os = "linux")) {
if self.native_profiling && cfg!(any(target_os = "windows", target_os = "linux")) {

break;
}

// TODO: find a way to filter out internal ruby core functions, and make it configurable?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you give an example of these?

let mut frames = vec![];
for (i, ruby_frame) in ruby_frames.iter().enumerate() {
let is_cfunc = ruby_frame.name.contains("[c function]");
let mut found_start_of_cfunc = false || (i == 0 && is_cfunc);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let mut found_start_of_cfunc = false || (i == 0 && is_cfunc);
let mut found_start_of_cfunc = i == 0 && is_cfunc;

.function
.as_ref()
.map(|s| s as &str)
.unwrap_or_else(|| "cfunc (unnamed)".into());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there's a chance that this will be printed, it should probably use the "[c function]" notation like the existing C func code does.

match self.symbolicator.symbolicate(ip, true, func) {
Ok(_) => Ok(()),
Err(e) => {
// Do not reload more than once per IP which is not found
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious, is this a performance optimization? Or does it cause an error if we reload more than once per IP?

@@ -437,6 +446,8 @@ impl Args {
};

let no_drop_root = submatches.occurrences_of("no-drop-root") == 1;
let native_profiling = submatches.occurrences_of("native-profiling") == 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be able to use is_present here like we do for the boolean options below.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add native traces to RubySpy::get_stack_trace
2 participants