New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FLINK-3951] Add histogram metric type #2112
Conversation
-1. I haven't looked very deeply at the actual code but let me share some thoughts: I was under the impression that the core of this issue was to implement at least 1 Histogram that does not rely on another library. It should contain at least one to guarantee that the compatibility layer works properly. The original PR for the metric system already contained wrapped DropWIzard Histograms, which were specifically taken out as they did not meet our performance needs. Why we now essentially revert this decision now 2 weeks later is beyond me. I also hope you did not spend too much time writing this as a lot of similar code already existed. As such I really don't see the point of the DropWizardWrapper. IMO the only thing we need is the HistogramWrapper class, which may require a more distinctive name btw. . We should not expose a DW -> Flink compatibility layer, next up people want to use DW Timers, Gauges etc. as well. This will simply create more work for us. The original PR also contained the flink-metric-reporters module as flink-metrics. Now we revert this again as well. Just...wth. |
We decided to add the histogram metric, because users requested this feature. The important thing is that the histograms are not used by the runtime per default, because they indeed don't come for free. But if the user decides to use such a metric, then he should be able to explicitly register a histogram. Additionally, we don't want to add a strict dependency on dropwizard from flink-core. Therefore, I introduced the One of these implementations is the Since the assumption is that The PR does not introduce a Dropwizard compatibility layer. Dropwizard is only one of many possible implementation for our metrics (counter, gauge and histogram so far). It is up to the community to decide which other metrics to add. I agree that the naming of I'm sorry if you feel angry because you already did some of this work which was later removed. But please keep in mind that not everyone was involved in the metrics PR. |
So we don't want to tie ourselves to DropWizard histograms, but the only way for users to use a histogram right now without implementing one themselves (something that is too much work for us apparently) is to rely on DropWizard histograms. This doesn't really fit together. Could we add other implementations? Sure! But I have a hard time believing that we'll ever get around to doing that. If it doesn't make sense now to implement one because the DW ones offer so much functionality, when would it? |
Yes the idea is introduce the histogram abstraction which allows us to use any histogram implementation. At the moment we don't have a need for an optimized implementation which might change in the future, though. Thus, for the time being I think it is a good compromise to offer the user the possibility to use Dropwizard histograms in his user code. With that in mind, I don't understand why things are not fitting together? What do you propose? Adding Dropwizard histograms to flink-core? |
* @param quantile Quantile to calculate the value for | ||
* @return Value for the given quantile | ||
*/ | ||
public abstract double getValue(double quantile); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we rename this to getQuantile?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think both names work. I would keep it like that, because it would be consistent with the Dropwizard naming.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should not reduce clarity for the sake of consistency with a 3rd party library that may simply not be able to rename it due to compatibility concerns.
As a bonus, getPercentile() would make the interface more consistent in itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For me getValue()
on a statistics object makes perfectly sense. But if you think that users will get confused by this name, then I'll change it. But then you have to decide to change into either getQuantile
or getPercentile
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all i see is 5 getXXPercentile() methods, and all they do is call getValue(). To me it just makes more sense that getValue() be called getPercentile().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
alright, getPercentile() (or getNthPercentile) doesn't really fit since the argument is a long between 0 and 1. so getQuantile() would be more appropriate i guess.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, then I'll rename the method to getQuantile
but I think it's a bit of a pettifoggery here what we're doing.
…linkHistogramWrapper; Make usage of codahale classes more explicit
Thanks for your review @zentol. I've addressed your comments modulo the naming question of |
…ntile methods from HistogramStatistics
I applied the requested changes. |
merging |
This closes apache#2112
This PR introduces an interface for Flink's histogram metric type
Histogram
and the corresponding statisticsHistogramStatistics
. TheHistogram
interface allows to update the histogram with new values , retrieve the number of collected values and to obtain aHistogramStatistics
object which represents the current statistics of the histogram. The values must be of type long.The
HistogramStatistics
allows to compute the mean, min, max and values for quantiles based on the element distribution calculated by the histogram.An implementation of the
Histogram
interface is contained in the flink-metrics-dropwizard module. The implementation wraps a dropwizard histogram via theDropwizardHistogramWrapper
class. This is currently the only implementation ofHistogram
. If needed, we can add a histogram implementation to flink-core later. But for the moment histograms are intended to be used as user metrics. Histograms add computational overhead which should not be imposed on the runtime per default.The histogram' statistics are reported by all currently supported
MetricReporters
. TheJMXReporter
introduces aJmxHistogramMBean
interface for this purpose. TheStatsDReporter
reports the individual values of theHistogramStatistics
object as gauge values. TheScheduledDropwizardReporter
wraps theHistogram
in aHistogramWrapper
and theHistogramStatistics
in aHistogramStatisticsWrapper
so that they can be given to Dropwizards' own reporters. As an optimization, theScheduledDropwizardReporter
detects aDropwizardHistogramWrapper
and unwraps the dropwizard histogram which is then used for reporting.