Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose running py-spy sampling in blocking mode #94

Open
AlexBoyd opened this issue Mar 21, 2023 · 5 comments
Open

Expose running py-spy sampling in blocking mode #94

AlexBoyd opened this issue Mar 21, 2023 · 5 comments

Comments

@AlexBoyd
Copy link

Currently when configuring the py-spy backend under the hood, we always initialize py-spy with locking mode non-blocking. https://github.com/pyroscope-io/pyroscope-rs/blob/main/pyroscope_ffi/python/lib/src/lib.rs#L77

Running spy-spy in non blocking mode has less overhead, but will frequently produce broken partial stack traces when taking a sample on larger applications with large stacks. Py-spy itself now runs with this by default to avoid the issue causes by incorrectly captured stack traces possible in non-blocking mode.

This option was previously toggle-able when using the pyroscope-cli connect options, but has never been exposed when using the native python support.

Suggestion would be to expose a new config option in configure https://github.com/pyroscope-io/pyroscope-rs/blob/main/pyroscope_ffi/python/pyroscope/__init__.py#L12, pass that down to initialize agent, https://github.com/pyroscope-io/pyroscope-rs/blob/main/pyroscope_ffi/python/pyroscope/__init__.py#L36 and then pass that into the construction of the py-spy backend config.

@benfred
Copy link

benfred commented Mar 21, 2023

I wrote a post talking about why py-spy has blocking as the default here https://www.benfrederickson.com/why-python-needs-paused-during-profiling/ - nonblocking is great for reducing perf impact, but can have very misleading results

@AlexBoyd
Copy link
Author

I've attempted to test this change locally, but I do seem to be hitting an error from the underlying py-spy sampling failing to suspend the process without any other error logs.

If I run with logging enabled I see this soon after start up: WARN py_spy::sampler > Failed to profile python from process 18: Failed to suspend process

Oddly running py-spy directly both in blocking or no blocking mode via the py-spy cli tooling works just fine. @benfred Do you know of any way I get can anything more out of the py-spy error messages, to get the reason was on the failed process.lock call when run this pyroscope support? I wonder if there's some issue where it's impossible for code running inside of the python process to suspend itself (even via the redirection out to rust).

@omarabid
Copy link
Contributor

@AlexBoyd I don't think locking is possible under FFI. You are basically asking a child process to lock a parent process. Not sure how the unlocking will happen later and this is not permitted by Linux anyway.

@AlexBoyd
Copy link
Author

Hmm any suggestions for the best way to work around this limitation in the case. For every meaningfully large python program I've tried to profile with pyroscope so far, about 20-30% of the total time ends up being attributed in these partial / broken stack traces.

It looks like the pyroscope standalone agent, also dropped support for py-spy over here: grafana/pyroscope#1608 Running that with connect would previously have been a workaround to get blocking samples working. Is there any remaining way to run a pyroscope agent using the py-spy backend?

@omarabid
Copy link
Contributor

@AlexBoyd You can try the CLI of pyroscope-rs, it's the equivalent agent in Rust (the other, I think, is a mix of Go/C). This will make tagging impossible, however.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants