Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

common, osd, tools: Add histograms to performance counters #12829

Merged
merged 1 commit into from Feb 9, 2017

Conversation

@byo
Copy link
Contributor

byo commented Jan 9, 2017

This change adds new performance counter type: histogram.

Currently it does measure latency + request size (in bytes) of an op similarly to
l_osd_op_(r|w|rw)_lat and l_osd_op_(r|w|rw)_(inb|outb). Histograms
are 2-dimensional and gather probes for both values at the same time. This allows
a more detailed analysis than what could be done with two separate 1-D histograms.
Still the memory footprint is negligible.

Since new data could break existing tools and the amount of data dumped is rather large,
two new admin socket commands are introduced: perf histogram schema and
perf histogram dump. Invocation of original command should remain compatible.
There's also a new simple tool in src/tools/histogram_dump.py which does a live dump
of one histogram in local daemon.

Current configuration of histograms is hard-coded and should cover all common cases.
If it turns out to be useful, in the future it could be loaded from the configuration.

Signed-off-by: Bartłomiej Święcki bartlomiej.swiecki@corp.ovh.com

case PerfHistogramCommon::SCALE_LOG2:
return int64_t(1) << (i - 1);
}
assert(false && "Invalid quantization type");

This comment has been minimized.

Copy link
@branch-predictor

branch-predictor Jan 9, 2017

Member

You could also try to output what quantization type was passed to ease debugging later.

This comment has been minimized.

Copy link
@byo

byo Jan 10, 2017

Author Contributor

Thanks for comment, since assert generates assertion message at compile time it would be hard to format anything here. alternative would be to add some logging but that's a bit to heavy machinery for this class in my opinion.

return ac.m_buckets - 1;

default:
assert(false && "Invalid scale type");

This comment has been minimized.

Copy link
@branch-predictor
@@ -394,7 +394,7 @@ void RocksDBStore::get_statistics(Formatter *f)
f->close_section();
}
f->open_object_section("rocksdbstore_perf_counters");
logger->dump_formatted(f,0);
logger->dump_formatted(f,0,false);

This comment has been minimized.

Copy link
@branch-predictor

branch-predictor Jan 9, 2017

Member

Can you add missing spaces between args?

This comment has been minimized.

Copy link
@byo

byo Jan 10, 2017

Author Contributor

Fixed

return get_raw_index_internal<0>(
[](int64_t bucket, const axis_config_d &ac) {
assert(bucket >= 0 && "Bucket index can not be negative");
assert(bucket < ac.m_buckets && "Bucket index to large");

This comment has been minimized.

Copy link
@branch-predictor

This comment has been minimized.

Copy link
@byo

byo Jan 10, 2017

Author Contributor

Fixed

[](int){}
);

for (size_t i=0; i<rawValues.size(); ++i) {

This comment has been minimized.

Copy link
@branch-predictor

branch-predictor Jan 9, 2017

Member

Missing spaces

This comment has been minimized.

Copy link
@byo

byo Jan 10, 2017

Author Contributor

Applied clang-format to the file. Now it's much better.

@byo byo force-pushed the ovh:wip-perf-counters-histogram branch 2 times, most recently from 26c8a82 to 24795af Jan 10, 2017

@byo

This comment has been minimized.

Copy link
Contributor Author

byo commented Jan 10, 2017

Updated PR, mostly minor changes

@branch-predictor

This comment has been minimized.

Copy link
Member

branch-predictor commented Jan 10, 2017

Jenkins, retest this please

@byo

This comment has been minimized.

Copy link
Contributor Author

byo commented Jan 18, 2017

@liewegas @tchaikov @dachary can you take a look at this?

@byo byo force-pushed the ovh:wip-perf-counters-histogram branch from 24795af to c86cc46 Jan 31, 2017

@Liuchang0812
Copy link
Contributor

Liuchang0812 left a comment

some trival comments



def shorten(val):
if type(val) is str:

This comment has been minimized.

Copy link
@Liuchang0812

Liuchang0812 Jan 31, 2017

Contributor

is instanceof better ?

This comment has been minimized.

Copy link
@byo

byo Feb 1, 2017

Author Contributor

True, converted to instanceof (although this is a bit tricky to get right and compatible with both python2 and python3 at the same time)

type=str,
default='/var/run/ceph/*.asok',
help='Path to asok file, can use wildcards')
parser.add_argument(

This comment has been minimized.

Copy link
@Liuchang0812

Liuchang0812 Jan 31, 2017

Contributor

seems that logger is not used now?

This comment has been minimized.

Copy link
@byo

byo Feb 1, 2017

Author Contributor

Used in line 100, could be counter-intuitive since it's not python logger but the name of performance counter logger inside ceph.

def shorten(val):
if type(val) is str:
return val
for u in ((3, ''), (6, 'k'), (9, 'M'), (12, 'G'), (15, 'T')):

This comment has been minimized.

Copy link
@Liuchang0812

Liuchang0812 Jan 31, 2017

Contributor

(6, 'k') or (6, 'K')?

This comment has been minimized.

Copy link
@byo

byo Feb 1, 2017

Author Contributor

Lower-case intentionally since it's 10^3, not 2^10 (as per https://en.wikipedia.org/wiki/K_(disambiguation)#Unit) - this tool doesn't really take into account axis units, always threats as raw integer values - seems like a good future improvement though.

@liewegas

This comment has been minimized.

Copy link
Member

liewegas commented Jan 31, 2017

retest this please

Bartłomiej Święcki
common, osd, tools: Add histograms to performance counters
This change adds new performance counter type: histogram.

Currently it does measure latency + request size (in bytes) of
an op similarly to l_osd_op_(r|w|rw)_lat and
l_osd_op_(r|w|rw)_(inb|outb). Histograms are 2-dimensional and
gather probes for both values at the same time. This allows
a more detailed analysis than what could be done with two
separate 1-D histograms. Still the memory footprint is negligible.

Since new data could break existing tools and the amount of data
dumped is rather large, two new admin socket commands are
introduced: perf histogram schema and perf histogram dump.
Invocation of original command should remain compatible.
There's also a new simple tool in src/tools/histogram_dump.py
which does a live dump of one histogram in local daemon.

Current configuration of histograms is hard-coded and should
cover all common cases. If it turns out to be useful, in the
future it could be loaded from the configuration.

Signed-off-by: Bartłomiej Święcki <bartlomiej.swiecki@corp.ovh.com>

@byo byo force-pushed the ovh:wip-perf-counters-histogram branch from c86cc46 to 5034024 Feb 1, 2017

"Latency (usec)",
PerfHistogramCommon::SCALE_LOG2, ///< Latency in logarithmic scale
0, ///< Start at 0
100000, ///< Quantization unit is 100usec

This comment has been minimized.

Copy link
@asheplyakov

asheplyakov Feb 3, 2017

I think a fixed quantization value is not suitable for latency histograms. For instance, one wants to distinguish between 0.1 msec and 1 msec, but measuring latencies ~ 1 sec with that precision hardly makes any sense. Perhaps we should stop re-inventing the wheel and re-use HdrHistogram

This comment has been minimized.

Copy link
@byo

byo Feb 3, 2017

Author Contributor

Although the quantization is fixed, the axis scale is logarithmic here so that there's much higher granularity for lower values and much lower for the big ones.

I took a brief look at HdrHistogram and it does something very similar, just configured in different way. I think it won't be a good fit here though since they don't support atomic/concurrent histograms. Maybe something other would be a better fit, do you have any experience in this area?

Also I'm pretty sure most of implementations would do histograms over one value, I try to gather it over two values simultaneously which may help much more fine-grained analysis.

I also remember Milosz Tanski was talking in ML about Count Min Sketch which we could look into.

This comment has been minimized.

Copy link
@asheplyakov

asheplyakov Feb 6, 2017

the axis scale is logarithmic here so that there's much higher granularity for lower values and much lower for the big ones.

But this prevents one from measuring maximal latency accurately enough. For instance, all values in [0.1 sec, 1 sec) end up in the same bin, it's impossible to distinguish between 0.11 sec and 0.99 sec, which kind of makes the data useless.

I think it won't be a good fit here though since they don't support atomic/concurrent histograms.

The Java version does support thread safe (concurrent) and atomic versions (nobody bothered to port those to C yet).

This comment has been minimized.

Copy link
@liewegas

liewegas Feb 6, 2017

Member

Looking at HdrHistogram it looks like the space requirements will be significantly higher if you want that level of precision, right?

I haven't looked at this is much detail, but I'm confused what the quantization value is used for... isn't the data effectively quantized by the bin size anyway, which is logarithmic?

If we were to swap in HdrHistogram (or similar) later, would/can the reporting interface remain the same?

This comment has been minimized.

Copy link
@byo

byo Feb 6, 2017

Author Contributor

@asheplyakov I think in both HdrHistogram and the implementation I did it's a matter of space usage, if you want better resolution you have to use more buckets. For probabilistic data structures (like Count Min Sketch mentioned before) this could also be adjusted by the amount of allowed noise.

This comment has been minimized.

Copy link
@liewegas

liewegas Feb 6, 2017

Member

Ok. I don't want great to be the enemy of good, so unless there are outstanding issues with the current dump interface, let's merge this!

This comment has been minimized.

Copy link
@byo

byo Feb 6, 2017

Author Contributor

@liewegas To clarify the quantization stuff: I'm finding bucket in two steps, first is to quantize and shift the input space to normalize it. In the next step the quantized value is converted to a final bucket number by moving to a correct scale - either leave it as linear or make logarithmic, this second transformation always operate on uniform input space then.

Regarding the output - for each bucket it also dumps the range of values it holds so I guess switching to something else would just require dumping it in compatible format. Dumping probabilistic data structure could be a bit trickier though as long as we don't want to output large amounts of data.

@liewegas liewegas added the needs-qa label Feb 6, 2017

@tchaikov tchaikov self-assigned this Feb 7, 2017

@yuriw yuriw merged commit 6ff5c5f into ceph:master Feb 9, 2017

3 checks passed

Signed-off-by all commits in this PR are signed
Details
Unmodifed Submodules submodules for project are unmodified
Details
default Build finished.
Details

@byo byo deleted the ovh:wip-perf-counters-histogram branch Feb 9, 2017

scale_type_d m_scale_type;
int64_t m_min;
int64_t m_quant_size;
int32_t m_buckets;

This comment has been minimized.

Copy link
@tchaikov

tchaikov Feb 17, 2017

Contributor

@byo where is m_buckets initialized?

This comment has been minimized.

Copy link
@byo

byo Feb 17, 2017

Author Contributor

Currently it's configured here and here

This comment has been minimized.

Copy link
@tchaikov

tchaikov Feb 17, 2017

Contributor

@byo okay, i was trying to fix the issues reported by coverity, which complains that the fields of axis_config_d are not initialized. i know this is a false alarm now, but shall we set some safe values in axis_config_d to silence this warning ? like

  struct axis_config_d {
    const char *m_name = nullptr;
    scale_type_d m_scale_type = 0;
    int64_t m_min = 0;
    int64_t m_quant_size = 0;
    int32_t m_buckets =0;
}

what do you think?

This comment has been minimized.

Copy link
@byo

byo Feb 17, 2017

Author Contributor

@tchaikov Seems like a great idea.
Those values would also create invalid configuration object and would assert. This would show up quickly when starting OSD.

This comment has been minimized.

Copy link
@tchaikov

tchaikov Feb 17, 2017

Contributor

@byo thanks! i just added a commit to #13473.

This comment has been minimized.

Copy link
@byo

byo Feb 17, 2017

Author Contributor

@tchaikov I added a comment in #13473 - a small fix in this patch is needed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.