spans inside frames inside spans #6
Comments
Yeah... Sadly frames are intended only to be used at the top level of performance collecting. What is the pattern that you are trying to look at in your code, and what do you want frames to help you do? |
My optimization algorithms are typically some initialization and then a loop. The body of the loop has several actions. My main goal is to understand the cost of the in-loop actions, but I want to know (not guess) that the initialization isn't too expensive either. I removed the outer span, so that I now have a next_frame at the beginning of each loop iteration and a bunch of span_of calls inside it. I find the resulting flame graphs surprising: the spans are sorted by name, but I expected the times for a particular named span to be summed over the frames; instead I have 10 copies (one per loop iteration) of each. Is this the expected behavior? |
On second thought, what I should probably do is give up on wrapping the whole loop, but instead wrap the initialization code by its own span. Ok, that's reasonable. |
The by-name sorting thing surprised me too. I've pushed Also, right now, none of the exporters support drawing frames. I'm writing my own viewer though, so when that's done, it'll support frames and viewing multi-threaded computation! |
It seems each user has different expectations about what frames mean... To me it was obvious that we sum over frames: since frames are likely to be similar, I want to average over them to measure the different spans in them more precisely. Looking at It is not even obvious what sort order is best: by size and by order of occurrence both make sense. |
How do you plan to treat spans over a sequence of frames? summing as I'd prefer, or you have something else in mind? |
For the flamegraph, I've opted to order by occurrence. Honestly, the only reason I included the concept of a frame was to make visualizing perf for a game possible from inside the game itself. This is why I haven't built frames support into the Frames will probably always be the outermost unit of measurement in FLAME and they'll happen on a per-thread basis. What would it mean to have nested frames? Or a frame that doesn't If you want to collapse multiple frames into each other, you can use |
Of course you will keep the design fit for your own purposes, but I will explain my point of view; maybe you'll find it useful. Frames are a special case of loop bodies. Loops are sometimes nested. What should a profile of a program with (maybe nested) loops look like? (The visualization-inside-the-game scenario is kind of orthogonal; I will get back to it.) If your loop happens some small number of times, and each time is completely different because of differing input etc., you really want to ignore the loop entirely and display the profiles of the loop bodies one after the other for comparison. But it is pretty common that loops have many iterations (so the display proposed above is impractical), that different iterations are pretty similar to one another, and that what we really care about is the relative costs of the different parts of the loop body. In this case, you want to treat passes through the loop body (I will call them frames for convenience now) as just samples from a process that you are trying to study. What can we do then? The most natural thing to do is average over them: construct a single profile that has the union of all spans that occurred over the different frames, and divide the total measured time for each by the number of frames. But you can show other statistics as well: show how many frames we are averaging over, show the +/- 25% percentiles in addition to the mean, pinpoint outliers etc. What about nested loops? Typically, you'll want to average over each of them separately. The inner loop will be its own part in the outer one. Of course, this applies to recurrences of different types, like iterators, not just loops. In fact, a very natural way to implement all of this is to always "summarize" the contents of any span like this. When its children occur only once, you don't see it. When a span has children repeating many times, the summary is valuable. 
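A minimal sketch of the averaging described above, written against plain data rather than FLAME itself. The types `SpanSample` and `SpanSummary` and the function `summarize_frames` are hypothetical names introduced for illustration; nothing here is part of the flame crate's API. It builds the union of span names over all frames and divides each total by the number of frames.

```rust
use std::collections::BTreeMap;

/// One measured span inside a frame (hypothetical type, not flame's own).
#[derive(Debug, Clone, PartialEq)]
pub struct SpanSample {
    pub name: String,
    pub nanos: u64,
}

/// Summary of one span name across all frames (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub struct SpanSummary {
    /// How many times a span with this name was recorded in total.
    pub occurrences: usize,
    /// Sum of the recorded durations.
    pub total_nanos: u64,
    /// Total duration divided by the number of frames.
    pub mean_nanos_per_frame: f64,
}

/// Builds a single profile containing the union of all span names seen in
/// any frame, keyed by name.
pub fn summarize_frames(frames: &[Vec<SpanSample>]) -> BTreeMap<String, SpanSummary> {
    // Accumulate (occurrences, total duration) per span name.
    let mut totals: BTreeMap<String, (usize, u64)> = BTreeMap::new();
    for frame in frames {
        for span in frame {
            let entry = totals.entry(span.name.clone()).or_insert((0, 0));
            entry.0 += 1;
            entry.1 += span.nanos;
        }
    }

    // With no frames there are no totals either; max(1) only avoids a 0 divisor.
    let frame_count = frames.len().max(1) as f64;
    totals
        .into_iter()
        .map(|(name, (occurrences, total_nanos))| {
            let summary = SpanSummary {
                occurrences,
                total_nanos,
                mean_nanos_per_frame: total_nanos as f64 / frame_count,
            };
            (name, summary)
        })
        .collect()
}
```

A span that appears in only some frames is still divided by the full frame count, which matches the "divide the total by the number of frames" reading. Percentiles or outlier detection would need the individual samples to be kept rather than only the running totals.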
To visualize inside the game: you can treat a particular span as the frame, and then for that span, either use only the last one as in hprof, or average over the last 30 for somewhat stabler display, or whatever is useful. But any in-frame loops, you still probably want the kind of summary I suggested above. Sorry for the wall of text. |
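For the in-game display, the "average over the last 30" idea could be sketched as a fixed-size rolling window. `RollingFrameAverage` is a hypothetical helper for illustration, not something flame or hprof provides, and the window size is whatever the caller picks.

```rust
use std::collections::VecDeque;

/// Keeps the durations of the most recent `window` frames and reports their
/// mean (hypothetical helper, not part of any profiling crate).
pub struct RollingFrameAverage {
    window: usize,
    samples: VecDeque<u64>,
    sum: u64,
}

impl RollingFrameAverage {
    /// Creates an empty average over the last `window` frames.
    pub fn new(window: usize) -> Self {
        assert!(window > 0, "window must hold at least one frame");
        RollingFrameAverage {
            window,
            samples: VecDeque::with_capacity(window),
            sum: 0,
        }
    }

    /// Records the duration of one finished frame, evicting the oldest
    /// sample once the window is full.
    pub fn push(&mut self, frame_nanos: u64) {
        if self.samples.len() == self.window {
            if let Some(oldest) = self.samples.pop_front() {
                self.sum -= oldest;
            }
        }
        self.samples.push_back(frame_nanos);
        self.sum += frame_nanos;
    }

    /// Mean duration over the frames currently in the window, or `None`
    /// if no frame has been recorded yet.
    pub fn mean(&self) -> Option<f64> {
        if self.samples.is_empty() {
            None
        } else {
            Some(self.sum as f64 / self.samples.len() as f64)
        }
    }
}
```

Using a window of 1 degenerates to "show only the last frame"; a window of 30 gives the stabler display mentioned above.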
Thanks for the wall of text; I understand your use case much more now. I think that there are two (equally important) parts to FLAME: the API, and the viewer. With the alpha release, I'm proud of how small the API is, but the visualizer was thrown together at the last moment. I think in code with high amounts of repetition, there is certainly a need for the viewer to be intelligent; able to detect repetition, and have options for collapsing and summarizing. Detection of the pattern produced by code like this

    for _ in 0..30 {
        ::flame::start("foo");
        do_something();
        ::flame::end("foo");
    }

would be trivial, and could be very user-friendly without any API changes to account for it. I'm going to start writing my own visualizer this weekend (the current one is 3rd party), and when I do, I'll keep loop-detection in mind. If the experiment works out well, I also might deprecate the "frame" API as it would be unnecessary. |
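One way a viewer might detect the repetition produced by such a loop is to collapse runs of consecutive sibling spans that share a name. The sketch below is only an assumption about how that could work, not the viewer's actual implementation; `RecordedSpan`, `CollapsedSpan`, and `collapse_repeats` are hypothetical names.

```rust
/// A finished span as a viewer might receive it (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub struct RecordedSpan {
    pub name: String,
    pub nanos: u64,
}

/// A run of consecutive spans with the same name, merged into one entry.
#[derive(Debug, Clone, PartialEq)]
pub struct CollapsedSpan {
    pub name: String,
    pub repetitions: usize,
    pub total_nanos: u64,
}

/// Merges each run of adjacent, identically named sibling spans into a
/// single entry, preserving order of occurrence.
pub fn collapse_repeats(spans: &[RecordedSpan]) -> Vec<CollapsedSpan> {
    let mut out: Vec<CollapsedSpan> = Vec::new();
    for span in spans {
        // Does this span continue the run that is currently last in `out`?
        let extends_run = out.last().map_or(false, |last| last.name == span.name);
        if extends_run {
            let last = out.last_mut().expect("non-empty: checked above");
            last.repetitions += 1;
            last.total_nanos += span.nanos;
        } else {
            out.push(CollapsedSpan {
                name: span.name.clone(),
                repetitions: 1,
                total_nanos: span.nanos,
            });
        }
    }
    out
}
```

This only merges adjacent repeats, so a loop body with several different spans per iteration (a, b, a, b, ...) would not collapse; detecting that would need matching repeated subsequences rather than single names.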
Cool, glad to clarify my use case, and glad to hear it is (IIUC) within On Thu, May 26, 2016 at 12:00 PM, Ty Overby notifications@github.com
|
I am not writing a game, but numerical code, such that algorithms have loops with some interesting parts inside the loop and some outside.
So I want a span before the loop starts, a span on the whole loop, inside the loop each iteration is a frame, and more spans inside the loop, partitioning each frame.
Currently this results in errors like
thread 'logisticregression3' panicked at 'flame::end("SublinearAveragingSolver") called without a currently running span!', /home/danielv/.cargo/registry/src/github.com-88ac128001ac3a9a/flame-0.1.5/src/lib.rs:257
(presumably because my next_frame inside the loop is hiding the loop-scope flame::start("Sublinear...")).
Does this make sense?