Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLINK-3951] Add histogram metric type #2112

Closed
wants to merge 10 commits into from

Conversation

tillrohrmann
Copy link
Contributor

This PR introduces an interface for Flink's histogram metric type Histogram and the corresponding statistics HistogramStatistics. The Histogram interface allows to update the histogram with new values , retrieve the number of collected values and to obtain a HistogramStatistics object which represents the current statistics of the histogram. The values must be of type long.

The HistogramStatistics allows to compute the mean, min, max and values for quantiles based on the element distribution calculated by the histogram.

An implementation of the Histogram interface is contained in the flink-metrics-dropwizard module. The implementation wraps a dropwizard histogram via the DropwizardHistogramWrapper class. This is currently the only implementation of Histogram. If needed, we can add a histogram implementation to flink-core later. But for the moment histograms are intended to be used as user metrics. Histograms add computational overhead which should not be imposed on the runtime per default.

The histogram' statistics are reported by all currently supported MetricReporters. The JMXReporter introduces a JmxHistogramMBean interface for this purpose. The StatsDReporter reports the individual values of the HistogramStatistics object as gauge values. The ScheduledDropwizardReporter wraps the Histogram in a HistogramWrapper and the HistogramStatistics in a HistogramStatisticsWrapper so that they can be given to Dropwizards' own reporters. As an optimization, the ScheduledDropwizardReporter detects a DropwizardHistogramWrapper and unwraps the dropwizard histogram which is then used for reporting.

@zentol
Copy link
Contributor

zentol commented Jun 16, 2016

-1. I haven't looked very deeply at the actual code but let me share some thoughts:

I was under the impression that the core of this issue was to implement at least 1 Histogram that does not rely on another library. It should contain at least one to guarantee that the compatibility layer works properly.

The original PR for the metric system already contained wrapped DropWIzard Histograms, which were specifically taken out as they did not meet our performance needs. Why we now essentially revert this decision now 2 weeks later is beyond me. I also hope you did not spend too much time writing this as a lot of similar code already existed.

As such I really don't see the point of the DropWizardWrapper. IMO the only thing we need is the HistogramWrapper class, which may require a more distinctive name btw. . We should not expose a DW -> Flink compatibility layer, next up people want to use DW Timers, Gauges etc. as well. This will simply create more work for us.

The original PR also contained the flink-metric-reporters module as flink-metrics. Now we revert this again as well. Just...wth.

@tillrohrmann
Copy link
Contributor Author

We decided to add the histogram metric, because users requested this feature. The important thing is that the histograms are not used by the runtime per default, because they indeed don't come for free. But if the user decides to use such a metric, then he should be able to explicitly register a histogram.

Additionally, we don't want to add a strict dependency on dropwizard from flink-core. Therefore, I introduced the Histogram interface which effectively hides the concrete implementation. So later we can easily swap the implementation out.

One of these implementations is the DropwizardHistogramWrapper which allows you to use Dropwizard histograms in Flink. The reason for this is that Dropwizard's Histogram has already a lot of functionality (different reservoirs for sampling streams, for example) and it would be a lot of duplicate work to reimplement it.

Since the assumption is that Histograms are a user metric, it seems to be ok for me to not ship an implementation of the interface with flink-core but to require the user to include an additional module such as flink-metrics-dropwizard. This module would add the dropwizard dependency anyway. If necessary, then we can also add our own histogram implementation later.

The PR does not introduce a Dropwizard compatibility layer. Dropwizard is only one of many possible implementation for our metrics (counter, gauge and histogram so far). It is up to the community to decide which other metrics to add.

I agree that the naming of HistogramWrapper is not so distinctive. I actually sticked to the naming scheme for Counter and Gauge which are called CounterWrapper and GaugeWrapper. This can be easily changed. Do you have a proposal?

I'm sorry if you feel angry because you already did some of this work which was later removed. But please keep in mind that not everyone was involved in the metrics PR.

@zentol
Copy link
Contributor

zentol commented Jun 17, 2016

So we don't want to tie ourselves to DropWizard histograms, but the only way for users to use a histogram right now without implementing one themselves (something that is too much work for us apparently) is to rely on DropWizard histograms. This doesn't really fit together.

Could we add other implementations? Sure! But I have a hard time believing that we'll ever get around to doing that. If it doesn't make sense now to implement one because the DW ones offer so much functionality, when would it?

@tillrohrmann
Copy link
Contributor Author

Yes the idea is introduce the histogram abstraction which allows us to use any histogram implementation. At the moment we don't have a need for an optimized implementation which might change in the future, though. Thus, for the time being I think it is a good compromise to offer the user the possibility to use Dropwizard histograms in his user code.

With that in mind, I don't understand why things are not fitting together? What do you propose? Adding Dropwizard histograms to flink-core?

* @param quantile Quantile to calculate the value for
* @return Value for the given quantile
*/
public abstract double getValue(double quantile);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we rename this to getQuantile?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think both names work. I would keep it like that, because it would be consistent with the Dropwizard naming.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not reduce clarity for the sake of consistency with a 3rd party library that may simply not be able to rename it due to compatibility concerns.

As a bonus, getPercentile() would make the interface more consistent in itself.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me getValue() on a statistics object makes perfectly sense. But if you think that users will get confused by this name, then I'll change it. But then you have to decide to change into either getQuantile or getPercentile.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all i see is 5 getXXPercentile() methods, and all they do is call getValue(). To me it just makes more sense that getValue() be called getPercentile().

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alright, getPercentile() (or getNthPercentile) doesn't really fit since the argument is a long between 0 and 1. so getQuantile() would be more appropriate i guess.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, then I'll rename the method to getQuantile but I think it's a bit of a pettifoggery here what we're doing.

…linkHistogramWrapper; Make usage of codahale classes more explicit
@tillrohrmann
Copy link
Contributor Author

Thanks for your review @zentol. I've addressed your comments modulo the naming question of getValue and the percentile convenience methods. Let's try to figure out what's the best solution for them.

@tillrohrmann
Copy link
Contributor Author

I applied the requested changes.

@zentol
Copy link
Contributor

zentol commented Jun 27, 2016

merging

zentol pushed a commit to zentol/flink that referenced this pull request Jun 27, 2016
@asfgit asfgit closed this in ee3c7a8 Jun 27, 2016
@tillrohrmann tillrohrmann deleted the addHistogram branch August 19, 2016 12:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants