Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Exporter/Prometheus: Upgrade version and fix buckets and labels. #492

Merged
merged 7 commits into from
Feb 12, 2019

Conversation

songy23
Copy link
Contributor

@songy23 songy23 commented Feb 6, 2019

  • Upgrade to latest Prometheus client.
  • Use UnknownMetricFamily instead of UntypedMetricFamily, as the latter is now deprecated.
  • Use ordered list for histogram buckets. Previously we use unordered map and that led to unpredictable behavior.
  • Label keys and values must match when uploading metrics.
  • Update unit tests to test against more precise results.

@@ -74,11 +74,11 @@ def test_prometheus_stats(self):
import urllib2
contents = urllib2.urlopen("http://localhost:9303/metrics").read()

self.assertIn(b'# TYPE opencensus_request_count_view counter',
self.assertIn(b'# TYPE opencensus_request_count_view_total counter',
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The latest Prometheus client automatically append "_total" suffix to count metrics.

Copy link
Member

@odeke-em odeke-em left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this PR @songy23! I've added some comments, PTAL.

:class:`~prometheus_client.core.GaugeMetricFamily`
:returns: A Prometheus metric object
"""
metric_name = desc['name']
metric_description = desc['documentation']
label_keys = desc['labels']

assert(len(tag_values) == len(label_keys))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unfortunately going to panic in user code, am not sure if that's something we want?
Sure it demos a bug in the library but redeploying versions of an entire stack will be painful.
Perhaps if we suspect such problems, we can pad or truncate tag values depending on label_keys
Perhaps let's make a TODO for an action item later on and think about it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unfortunately going to panic in user code, am not sure if that's something we want?

No this won't. to_metric() should only be called within our library and users are not expected to call it directly. If for some reason the label keys and values don't match, that indicates an internal error in our library.

In Java we have the exact same check: https://github.com/census-instrumentation/opencensus-java/blob/master/exporters/stats/prometheus/src/main/java/io/opencensus/exporter/stats/prometheus/PrometheusExportUtils.java#L143-L144.

@@ -183,26 +184,30 @@ def to_metric(self, desc, tag_values, agg_data):
aggregation_data_module.DistributionAggregationData):

assert(agg_data.bounds == sorted(agg_data.bounds))
points = {}
cum_count = 0
# buckets are a list of bucket. Each bucket is another list with
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/list of bucket./list of buckets./g

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SG, done.

for ii, bound in enumerate(agg_data.bounds):
cum_count += agg_data.counts_per_bucket[ii]
points[str(bound)] = cum_count
bucket = [str(bound), cum_count]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using lists is kind of overkill and potentially expensive.
Why not tuples that properly match the grouping we are using of pairs, or triples?

cur = (str(bound), cum_count,)
buckets = append(cur)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Used list because Prometheus client comments said explicitly they expected a list of lists as buckets (https://github.com/prometheus/client_python/blob/master/prometheus_client/metrics_core.py#L201):

buckets: A list of lists. Each inner list can be a pair of bucket name and value, or a triple of bucket name, value, and exemplar. The buckets must be sorted, and +Inf present.

Guess tuples will work in the same way. Updated.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@songy23 oh I see, I didn't know that, but perhaps please cite that in your commit. My apologies, please revert to the original that you had as I was assuming that you had just used list of list out of convenience.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries - changed it back to lists.

sum_value=agg_data.sum,)
return metric

elif isinstance(agg_data,
aggregation_data_module.SumAggregationDataFloat):
metric = UntypedMetricFamily(name=metric_name,
metric = UnknownMetricFamily(name=metric_name,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the SummaryMetricFamily then?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SummaryMetricFamily is for Summary metrics. Sum metrics should be converted to UnknownMetricFamily.

@songy23 songy23 force-pushed the prometheus-tests branch 4 times, most recently from de3947b to be8dbf3 Compare February 11, 2019 23:12
@c24t
Copy link
Member

c24t commented Feb 12, 2019

Use ordered list for histogram buckets. Previously we use unordered map and that led to unpredictable behavior.

Good catch! I'm surprised to see the bounds need to be sorted but it looks to be the case.

Changing the prometheus client version seems significant and user-facing enough to be worth adding a note to the changelog. What do you think?

@songy23
Copy link
Contributor Author

songy23 commented Feb 12, 2019

Changing the prometheus client version seems significant and user-facing enough to be worth adding a note to the changelog. What do you think?

Make sense to me, added to changelog.

@songy23 songy23 merged commit 20bd611 into census-instrumentation:master Feb 12, 2019
@songy23 songy23 deleted the prometheus-tests branch February 12, 2019 18:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants