
Output can be truncated #3

Closed
wiltzius opened this issue Feb 14, 2019 · 7 comments

Comments

@wiltzius
Contributor

The writer thread is terminated without checking whether it has finished processing the event queue.

I sent a PR that addresses this: #2
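For context, here is a minimal sketch of the fix being described (the class and method names are hypothetical; pytracing's actual writer differs): instead of terminating the writer thread outright, enqueue a sentinel and join the thread so every pending event is flushed before shutdown.

```python
import queue
import threading

class TraceWriter:
    """Hypothetical writer thread that drains an event queue to a sink."""

    def __init__(self, sink):
        self.sink = sink
        self.events = queue.Queue()
        self.thread = threading.Thread(target=self._run, daemon=True)
        self.thread.start()

    def _run(self):
        while True:
            event = self.events.get()
            if event is None:       # sentinel: queue fully drained, exit
                return
            self.sink.write(event)

    def stop(self):
        # Enqueue a sentinel and join. Because the queue is FIFO, the
        # sentinel is seen only after every previously queued event has
        # been written -- so no trace output is dropped at shutdown.
        self.events.put(None)
        self.thread.join()
```

The key point is that `stop()` blocks until the queue is empty rather than killing the thread while events may still be pending.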

@kwlzn
Owner

kwlzn commented Feb 14, 2019

thanks for filing! closed in #2 - I'll cut a new PyPI release to consume this when I have a chance.

@kwlzn kwlzn closed this as completed Feb 14, 2019
@kwlzn kwlzn mentioned this issue Feb 14, 2019
@wiltzius
Contributor Author

Thanks! And thanks for writing this script.

I worked on the Chrome team that built about:tracing a few years ago, but I work mostly in Python now -- I always wanted to have a go at writing a tracing profiler for Python but by the time I finally got around to it you already had! I appreciate it.

I see so many posts of people mucking with flamegraphs etc; this method is really high overhead but the results are much easier to read and use IMHO.

@kwlzn
Owner

kwlzn commented Feb 15, 2019

glad you're finding it useful.

> this method is really high overhead but the results are much easier to read and use IMHO.

totally - this started out largely as a toy for basic larger-grained perf analysis where we were previously using elapsed timing in logs. I'd always intended to circle back and port the trace function to e.g. rust for lower overhead.

@wiltzius
Contributor Author

wiltzius commented Feb 15, 2019 via email

@kwlzn
Owner

kwlzn commented Feb 16, 2019

> Is the best way to do that to implement the trace function as the extension, and then still use sys.setprofile to install it?

as far as I know, yes. I believe it should be as simple as passing in a cffi function handle to that API - which should massively improve the trace call overhead vs a python def'd function.

I think it'd be cool to leverage something like https://github.com/getsentry/milksnake to have an in-line rust binary build + CFFI + sys.setprofile. rust is great to marry with python because of no GC/minimal runtime overhead, speed, modern dev toolchain etc.
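For reference, a minimal pure-Python sketch of the `sys.setprofile` mechanics under discussion. The trace function here is Python-defined; the proposal above is to install a CFFI/extension-defined callable with the same `(frame, event, arg)` signature through exactly the same API, which avoids re-entering the interpreter on every event.

```python
import sys

calls = []

def trace(frame, event, arg):
    # sys.setprofile invokes this on every 'call', 'return', 'c_call',
    # 'c_return', and 'c_exception' event. A compiled callable installed
    # via the same API would cut the per-event overhead dramatically.
    if event == "call":
        calls.append(frame.f_code.co_name)

def work():
    return sum(range(10))

sys.setprofile(trace)   # install the profiler for this thread
work()
sys.setprofile(None)    # uninstall
```

Note that `sys.setprofile` only affects the calling thread; `threading.setprofile` is the separate hook for threads started afterward.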

@wiltzius
Contributor Author

I gave this a go over the weekend. I'm unfamiliar with Rust and couldn't figure out the story for Python types in Rust (there's an issue on the Milksnake repo asking about this without much of a pointer; PyO3 also seemed a little experimental), so I just used pybind11 since it seemed well supported and this is my first binary Python extension.

Here's my first very barebones take that does at least work:

https://github.com/wiltzius/pytracing-cpp

It still needs a lot (e.g. threaded output, integration with a context manager on the Python side, and it does a lot of unnecessary string allocation), but I'm curious what you think. If this is a direction you're interested in taking things I can work toward a branch of this repo with this alternative implementation; otherwise I can do this on a fork.
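The "context manager on the Python side" piece could look something like this (a hypothetical sketch; the name `profiler` and taking the extension's trace callable as an argument are assumptions, not pytracing-cpp's actual API). It installs the trace function for the duration of a `with` block and guarantees uninstallation even on exceptions.

```python
import sys
from contextlib import contextmanager

@contextmanager
def profiler(trace_func):
    """Install `trace_func` (e.g. the extension-defined callable) for the
    duration of the with-block, then uninstall it, even on error."""
    sys.setprofile(trace_func)
    try:
        yield
    finally:
        sys.setprofile(None)
```

Usage would then mirror pytracing's existing context-manager style: `with profiler(native_trace): ...`.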

@wiltzius
Contributor Author

(Advice on good patterns for writing C++ Python extensions is particularly welcome; again, I haven't built one before and there isn't much good information out there. In particular I'm unclear about the implications of forking a thread in this code, how to avoid the GIL if forking, and whether it's a good idea to e.g. join the writer thread in the main Profiler object's destructor.)
