Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Common crate for building profilers. #180

Closed
benfred opened this issue Sep 11, 2018 · 9 comments
Closed

Common crate for building profilers. #180

benfred opened this issue Sep 11, 2018 · 9 comments

Comments

@benfred
Copy link
Collaborator

benfred commented Sep 11, 2018

Thanks for writing that nice blog post about py-spy! I really appreciate all the support, it seems like interest in py-spy really took off after you first tweeted about it.

You mentioned on twitter maybe collaborating on a crate with some shared functionality between py-spy and rbspy. I think this could be a good win for both of these projects, plus also maybe helping out anyone else that wants to build other profilers.

I was thinking about creating a new crate and pulling some code out of py-spy/rbspy for this (maybe @ http://github.com/rbspy/process-spy ?).

This would be mostly visualization code to start, but I also think there could be some future opportunities for other common code. As an example, I’m using goblin for parsing elf/mach/pe files and I noticed that you have an issue open for changing from using elf to golbin. The code to do this is pretty trivial, but it can’t hurt to share it. Likewise, I’ve been thinking a bit about trying to profile native extensions to python- and any code I write for this might also be useful for rbspy.

For the visualization code, most of it will probably be pulled from rbspy since py-spy just has a subset of rbspy's functionality right now. I’m thinking of including the flamegraph/callgrind and speedscope code from rbspy.

One visualization from py-spy that you might be interested in though is the 'top' like viewer. Most of the differences from the one in rbspy are only minor cosmetic tweaks, but the version in py-spy has one big advantage in that it works much better on windows. Clearing the console with the ANSI codes that you are using in rbspy doesn’t seem to work on windows (at least cmd.exe, iirc I think it worked on powershell), and even when I finally managed to get the console to clear - it still looked terrible since it flickered very noticeably every time it updated (windows console refresh time is slow enough that you can see the screen clear and then redraw). Fixing this took more work than I thought - and I was going to ask you about sharing this code even before you suggested a common crate =).

I’m also working on some new visualization code that might be cool to get going in both rbspy/py-spy. I’m trying to get py-spy include an embedded webserver that serves up live interactive visualizations using d3.js - and I think it might be pretty neat when it’s done.

Anyways, if we define a common trait in this new crate for stack traces - and then change the visualizations slightly to accept this trait, I think it would be pretty easy to get this going. If this sounds good to you let me know and I’ll set up later this week!

@jvns
Copy link
Collaborator

jvns commented Sep 11, 2018

yes go ahead! Maybe a good name would be spytools? Up to you! I added you as a rbspy member so you can create new repositories.

some thoughts:

  • if we could come up with a shared notion of StackTrace between rbspy and py-spy, then the crate could include the Outputter trait and implement visualizations that way https://github.com/rbspy/rbspy/blob/master/src/ui/output.rs
  • replacing the ANSI codes in rbspy with a top-like viewer like what py-spy has sounds great to me. I really liked using py-spy's viewer.
  • the idea of an embedded webserver is cool. maybe it would be possible to somehow integrate speedscope there? https://github.com/jlfwong/speedscope. I don't see an obvious path to integrating it because it's a node app with dependencies but maybe there's a way

@benfred
Copy link
Collaborator Author

benfred commented Sep 11, 2018

Cool! I like the name spytools, I will get this going soon-ish (probably next week).

@njsmith
Copy link

njsmith commented Feb 22, 2019

Another candidate for including in a common crate would be tools to handle irregular sampling: benfred/py-spy#94

(It looks like that exact same bug applies to rbspy btw, not sure if I should copy-paste it here or something?)

@akhramov
Copy link
Contributor

akhramov commented Oct 6, 2019

@benfred
I think we can reuse remoteprocess code here, in rbspy.

I believe, strategically, both rbspy & py-spy would benefit from the common code base:

The question is: what do you think of extracting & publishing remoteprocess crate?
How realistic would it be to build both rbspy & py-spy atop the common remoteprocess crate?

@benfred
Copy link
Collaborator Author

benfred commented Oct 9, 2019

I'm all for getting rbspy to use the remoteprocess code!

I've been making some small steps towards the goal of sharing more code between rbspy and py-spy - and have recently relicensed py-spy to MIT explicitly to make sharing code with rbspy easier.

My goal with remoteprocess was to hold all OS specific functionality we need to get information about another process - so it has a bunch of cross platform code for getting the commandline of the process, figuring out if a thread is active or not, locking the process etc. I think this could be useful to rbspy in addition to unlocking some freebsd functionality.

remoteprocess is published on crates.io - and can be used right now even without extracting it to its own repo. All you should have to do is add a remoteprocess="0.2.0" line to Cargo.toml to pull it into rbspy. Also, using the ProcessMemory trait in remoteprocess will also work with FreeBSD - and should be a drop in replacement for the read-process-memory CopyAddress trait once you've initialized the process.

I think we should still move remoteprocess to it's own repo though, and also add a feature flag for the native unwinding code though. The native unwinding code adds some build dependencies on linux w/ libunwind that we won't want in rbspy in the short term.

I'm also still planning on sharing some profiler specific code in another package (spytools?). py-spy is using the speedscope and flamegraph code from rbspy, and I think we can also use some code from py-spy here: the console viewer in py-spy works better on window as an example. I haven't made any progress at all on this though =(

@liaden
Copy link
Collaborator

liaden commented Oct 9, 2019

I think one thing that would be able to expose py-spy and rbspy as crates/libs to allow for them to be used in other ways without needing the CLI bindings and generating a binary.

For example: At work, we have a small web application (using rocket) that allows us to trigger generating a flamegraph on a host whenever we get a performance alert. Currently, we just use std::process::Command to shell out and run rbspy, but including rbsy more directly as a dependency would be a lot better.

Another hypothetical case would be having a ruby gem that wraps rbspy and allows for a user to send a signal (or have a REST-ful endpoint) to cause ruby to profile itself.

@akhramov
Copy link
Contributor

My (promising) take on using remoteprocess for rbspy is here #238

@benfred
Copy link
Collaborator Author

benfred commented Nov 18, 2019

@liaden you can use py-spy as a library right now, and it should be pretty easy to get the same thing going for rbspy if you want to give it a go. See benfred/py-spy#110 for info about how this was added to py-spy, the only real change was to add a src/lib.rs file (the rest was just adding documentation).

I've spun out remoteprocess to it's own repo https://github.com/benfred/remoteprocess here, and made the unwind code optional - so rbspy won't need to worry about building against libunwind on linux. The latest changes to remoteprocess are published on crates.io @ v0.3.1

@acj
Copy link
Member

acj commented May 8, 2021

Closing since most of the ideas in this thread have been implemented in some form. I opened #320 to track support for using rbspy as a library, which I think we should do.

Thanks, all!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants