common: add "avglat" in perf result to calculate average latency. #12199
Signed-off-by: Pan Liu <pan.liu@istuary.com>
When tuning performance, we often need the perf dump command. For variables added by "add_time_avg", only "sum" and "avgcount" are dumped, so I have to compute the average latency by hand: sum/avgcount. For this kind of tuning, the average latency of each part matters even more when comparing two different perf results. In my modification, the unit of avglat is ms rather than seconds, so it is easier to read.
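For reference, the by-hand computation described above can be sketched as a small post-processing step over perf dump output. This is only an illustration, not the patch itself: the helper name `add_avglat_ms` and the sample section and counter names are hypothetical, and it assumes each time-average counter is dumped as an object with "sum" (in seconds) and "avgcount".

```python
import json


def add_avglat_ms(perf_dump):
    """Return a copy of a perf-dump-like dict with "avglat" (in ms) added
    to every counter that carries both "sum" and "avgcount".

    Assumption: "sum" is in seconds, so the average is scaled by 1000.
    """
    result = {}
    for section, counters in perf_dump.items():
        result[section] = {}
        for name, value in counters.items():
            is_time_avg = (
                isinstance(value, dict) and "sum" in value and "avgcount" in value
            )
            if is_time_avg:
                entry = dict(value)
                count = entry["avgcount"]
                # Guard against division by zero for counters never updated.
                entry["avglat"] = entry["sum"] / count * 1000.0 if count else 0.0
                result[section][name] = entry
            else:
                # Plain counters are passed through unchanged.
                result[section][name] = value
    return result


# Hypothetical sample shaped like perf dump output.
sample = {"example_section": {"example_lat": {"avgcount": 4, "sum": 0.5},
                              "example_ops": 12}}
annotated = add_avglat_ms(sample)
print(json.dumps(annotated, indent=2))
```

Note that this value is an average over the lifetime of the counter (or since the last reset), not over a recent interval.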
I'm not sure this helps you. The sum and avgcount are totals over the lifetime of the counter, so dividing them directly doesn't tell you much. You need to take the delta with the previous measurement (say, 1s ago), and then divide that to get a useful value. Either way, this seems like something that consuming tools (e.g., 'ceph daemonperf ...') should be doing...
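A minimal sketch of the delta-based calculation described above, as a consuming tool might do it. The function name `interval_avg_latency_ms` is hypothetical, and the sketch assumes two snapshots of the same counter, each with "sum" (in seconds) and "avgcount", taken some interval apart.

```python
def interval_avg_latency_ms(prev, curr):
    """Average latency in ms over the interval between two snapshots.

    Both arguments are dicts with "sum" (seconds, assumed) and "avgcount".
    Returns None when no new samples arrived, or when the counter went
    backwards (for example after a reset between the two snapshots).
    """
    delta_count = curr["avgcount"] - prev["avgcount"]
    delta_sum = curr["sum"] - prev["sum"]
    if delta_count <= 0:
        return None
    return delta_sum / delta_count * 1000.0


# Hypothetical snapshots taken one interval apart.
earlier = {"avgcount": 100, "sum": 2.0}
later = {"avgcount": 104, "sum": 2.5}
print(interval_avg_latency_ms(earlier, later))
```

Because only the difference between snapshots is used, this does not require resetting the counters first.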
@liewegas, there is also one very useful command: ceph daemon... perf reset all. I am working on tuning ceph on P3700 SSDs. Using perf reset all and perf dump really helps me a lot. :) With avglat added, it is much easier for me to compare the latency before and after perf reset all.
@liewegas @tchaikov, I tried ceph daemonperf: ceph daemonperf /var/run/ceph/ceph-osd.0.asok. I didn't find any perf information for a variable such as "l_bluestore_compress_lat", and I also found no option to output this kind of information. I agree this tool should support it, but it seems many enhancements would be needed first. Given that, I think my modification is practical and can help users do their analysis. What is your opinion? Thanks.
(cc @dmick) We'd love to see this command expanded so that it takes a list of metrics to watch. Perhaps with a magic argument to include (or not include) the default ones. Open to suggestions there!
@dmick, what is your opinion on this PR? I believe the average latency is useful for debugging.
I guess my opinion is that, as Sage said, the averages aren't very useful. I get that along with manually resetting the counters they can be made more useful, but that's a pretty brute-force solution, and affects more than just the counter you're interested in. That said, it does seem useful to be able to select a set of stats with arguments to ceph daemon, so that experiments like yours and others, while they may be less-generally useful, can still be accomplished without hacking the code. I'd vote for moving in that direction rather than adding more questionably-useful hardcoded stats.