data model proto #126

Merged: 6 commits merged into master on Nov 5, 2020

Conversation

@RichiH RichiH commented Jan 15, 2019

No description provided.

@brian-brazil brian-brazil left a comment

There's probably more, but here's what I spotted that needs to be made consistent with the text format:

// Required.
int64 seconds = 1; // Seconds since the epoch. Negative values are permitted

uint32 nanoseconds = 2;

What are the semantics of this when seconds is negative?

@robskillington robskillington Jan 19, 2019

nanoseconds_offset is perhaps a better concept? And if it's signed instead, perhaps that's more permissive and allows you to always just blindly add these two together:

unix_nanos = (timestamp.seconds * SECOND_NANOSECONDS) + timestamp.nanoseconds_offset

Thoughts?
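
For illustration (numbers are made up, not from the PR): with a signed offset, an instant 0.25s before the epoch could be encoded as seconds = 0, nanoseconds_offset = -250000000, and the formula above gives unix_nanos = -250000000. With an unsigned nanoseconds field the same instant has to be written as seconds = -1, nanoseconds = 750000000 under the convention that nanoseconds always counts forward, which is exactly the ambiguity raised above.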

Don't quite understand. We can add even when nanoseconds is unsigned

As discussed we'll just use the google protobuf timestamp type here.
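
A minimal sketch of what using the well-known type looks like (the field name and number here are illustrative, not taken from this PR):

import "google/protobuf/timestamp.proto";

message Point {
  // ... other fields ...
  google.protobuf.Timestamp timestamp = 4; // illustrative field number
}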

message Point {
  oneof value {
    double double_value = 1;
    int64 int_value = 2;

Hmm, did we agree uint64 here?

Would it be a little strange that double value can be negative but int value cannot be?

Any reason to restrict it to unsigned? The proto doc has int64

Any reason this cannot be negative for a gauge?

double sum = 1;

// Required.
int64 count = 2;

uint64?

done

message BucketCount {
  // Required.
  // Count is the number of values for a bucket of the histogram.
  int64 count = 1;

uint64?

done

repeated Quantile quantile = 3;
message Quantile {
  // Required.
  // Must be in the interval (0.0, 1.0].

I think this should be fully closed, why allow 1 but not 0?

How do you calculate the 0th percentile from a sample of values? I don't think subtracting a delta from the lowest value in the sample is mathematically sound.

Same question for 100th percentile. It's either both or neither.

// A bucket has a 0 lower bound and an inclusive upper bound for the
// values that are counted for that bucket. That is, buckets are
// cumulative. The upper bound of successive buckets must be increasing.
// There is an implicit overflow bucket that extends up to +infinity.

and must match the value of _count

these are just the bucket boundaries. I've added a comment later about the sum of bucket counts.
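
As a worked illustration of the cumulative semantics in the quoted comment (illustrative values, not from the proto): with upper bounds 1, 5 and 10 and observed values 0.5, 3, 7 and 20, the bucket counts would be 1 (<= 1), 2 (<= 5) and 3 (<= 10), and the implicit +infinity overflow bucket would hold 4, matching _count.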


// Value for CUMULATIVE_HISTOGRAM or GAUGE_HISTOGRAM point.
message HistogramValue {
  // Required.

Neither of these should be required

why not?

Some systems don't have them (MySQL)

double sum = 1;
int64 count = 2;

repeated Quantile quantile = 3;

It should be clear that this is optional

Are you saying the quantiles are optional?

Yes

InfoValue info_value = 5;
SummaryValue summary_value = 6;
}
// Required for COUNTER, SUMMARY or CUMULATIVE_HISTOGRAM type.

Optional

We had agreed this is required from a model perspective.
Are you saying optional just for backward compatibility with the old text format? The proto model does not deal with that compatibility. It defines the baseline OpenMetrics model, and a client that does not send start_timestamp is not OpenMetrics compliant.

No, we had not agreed this. It's optional, as many many systems out there do not support this.

// interval.
Timestamp start_timestamp = 7;

// If not specified, the timestamp will be decided by the backend.

Make clear this is optional, and discouraged

I think we need to resolve the pull/push debate before we say "discouraged". IMO the same data model suffices for push, in which case timestamp should be encouraged.
For now I am fine with saying "discouraged for pull and encouraged for push".

I wouldn't say encouraged for push; it depends on what's on the other side. The other side can always add its own.


// TODO: Format?
// STATE_SETs and INFO types must not have a unit.
string unit = 2;

Should this perhaps instead be an enum?

@brian-brazil brian-brazil Jan 19, 2019

There are too many potential values; it's a string to allow for that.

Do we want any restriction on this string, or just UTF-8?

It'd have to be the same limits as metric names.

message Linear {
  // Required.
  // Must be greater than 0.
  int32 num_explicit_buckets = 1;

uint32 perhaps since there's no such thing as negative buckets?

Done

message Exponential {
  // Required.
  // Must be greater than 0.
  int32 num_exponential_buckets = 1;

uint32 perhaps here too since there's no such thing as negative buckets?

Done

// The quantiles can be reset at arbitrary unknown times.

double sum = 1;
int64 count = 2;

uint64 since there can't be negative sampled values?

done

string value = 2;
}

message Timestamp {

Why not use the default timestamp from proto3? "google/protobuf/timestamp.proto"

I think it'd be better to keep it self-contained.

At least we can copy the definition from there and avoid any arguments about nanoseconds being int vs uint.

Do you know why it is int32 in google/protobuf/timestamp.proto even though it says "Non-negative fractions of a second at nanosecond resolution."?

Two reasons that I know of:

  1. This needs to be compatible with languages that do not have unsigned (like Java).
  2. It makes diff calculation easier (you do the diff initially then normalize the nanos).
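
A quick worked example of point 2 (illustrative numbers): diffing {seconds: 5, nanos: 200000000} minus {seconds: 3, nanos: 700000000} field by field gives {seconds: 2, nanos: -500000000}, which a signed nanos field can hold as an intermediate result before normalizing to {seconds: 1, nanos: 500000000}; with an unsigned field you have to borrow from seconds before you can even store it.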

I don't think either is a problem in practice; it's not going to have values where that'll be an issue.


// One or more timeseries for a single metric, where each timeseries has
// one or more points.
message MetricPoints {

Why not just Metric?

message Point {
  oneof value {
    double double_value = 1;
    int64 int_value = 2;

Any reason this cannot be negative for a gauge?

// (and indicated by comments).

// The top-level message sent on the wire.
message MetricSet {

Probably a better name is MetricList. Set semantics are a bit unclear here.

It's a set; you aren't allowed to have duplicates.

Then probably document that, to make it clear why it is a set and what the "hash" function is.


enum Type {
  UNKNOWN = 0; // double or int valued point
  GAUGE = 1; // double or int valued point

In order to avoid confusion and to let backends support both double/int for the same metric, what about making the type include the type of the valued point? Something like GAUGE_DOUBLE?

We discussed this at length; this is the way we're going.

You mean the current state or what I proposed?

}

// A single timeseries.
message Timeseries {

I don't see anything like MonitoredResource in Stackdriver or __meta__ labels (in Prometheus; hope I used the correct name). Is this intentional? How are these reported?

A couple of use cases where these are necessary and cannot be determined, even if this data model is focused on a pull mechanism that would in general allow the backend to annotate these meta labels:

  1. Task (metric producer) -> Proxy -> Metric Backend. In this case the meta labels should be included in the metric because the backend does not know about the metric producer and cannot associate the right meta labels.

This is what Info metrics are for.

Nice, but I have some problems understanding some of the things here; maybe it's worth some clarification:

  1. I don't understand how the correlation between Info metrics and the other metrics happens. Should we have one info metric per MetricSet? Can I have metrics from multiple producers in the same MetricSet (in the case of a proxy, if I monitor multiple tasks)?

I think we are allowing multiple info metrics, e.g. build_info. It is up to the receiver to decide what to do with them. But they will all apply to the whole MetricSet, so one shouldn't mix multiple producers in the same MetricSet.

Yes, there's no limit on how many info metrics you can have.
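
To make the correlation concrete (an illustrative example, not taken from the PR): a MetricSet for one producer might contain an INFO metric such as build_info whose value carries labels like version="1.2.3" and revision="abc123", alongside ordinary metrics such as an http_requests_total counter; because they share the same MetricSet, the receiver can associate the build_info labels with every other timeseries in the set.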

// timeseries, value of INFO metrics, and exemplars in Histograms.
message Label {
// Required.
string name = 1;

Do we want to support a label description? Human readable description of this metric?

We discussed this previously, and no.

roger that.

This comment should be marked as resolved.


// Additional information about the example value
// (e.g. trace id).
// The sum of lengths of all the strings in all the labels must not
// exceed 64 UTF-8 characters.

64 characters for all the labels here seems unreasonable. trace-id has 32 hex characters and span-id has 16 hex characters; these values alone represent almost the entire limit.

See https://github.com/w3c/trace-context

There is some concern about allowing this to be a dumping ground for stuff.
With UTF-8, the current limit lets someone use 64*4 = 256 bytes.
How about we state the limit as the sum of bytes needed for names and values, so that it doesn't penalize users who restrict themselves to ASCII?
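
For example (illustrative numbers): a trace_id label name (8 ASCII characters, 8 bytes) plus a 32-hex-digit value is 40 characters and 40 bytes, while 40 four-byte UTF-8 characters would also count as 40 characters but occupy 160 bytes.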

The spec in general is UTF-8; I'd rather not switch to bytes here.

It sounds to me like you've 12 characters left over for the label name, so you're inside the limit.

@sumeer I am not suggesting that we not have a limit, but the limit is too small.

@brian-brazil Correct, but if I use "trace-id" and "span-id" (32 + 8 + 16 + 7 = 63) as keys then I have one character left, so no other exemplar. Also we would like to have a trace-options which will indicate whether the trace is sampled or not (2 hex characters).

I know that the solution can be to modify the w3c standard (impossible) or to use "shorter" keys like "tid", "sid" and "to", but those are not that clear. I really think that the limit should be more realistic; we can go with 256 characters for this.


// A name-value pair. These are used in multiple places: identifying
// timeseries, value of INFO metrics, and exemplars in Histograms.
message Label {

Because a label is a key/value pair, you can represent Labels as a map<String,String>.

This comment should be ignored and marked as resolved.

bool enabled = 1;

// Required.
string name = 2;

Why is this not a regular label? The recommended key could be state_name, the label value would be the state, and then this becomes a single boolean.

Then it'd not be a first-class type.

I am fine with not having this as a first-class type. Probably supporting boolean on the gauge is enough.

// exponential sequence, or each bucket can be specified explicitly.
// `BucketOptions` does not include the number of values in each bucket.
//
// A bucket has a 0 lower bound and an inclusive upper bound for the

This should say "inclusive 0 lower bound" for clarity

Buckets for values <0 don't seem possible with this data model. Is that intended?

Per our discussions it is; no one could come up with a realistic use case.

My usual tech use-case is SpamAssassin scores.

However, OpenMetrics is supposed to be generally applicable. Observing negative values is a very normal thing for histograms in general. It is almost zero cost to support it. Kicking the support out would be a serious blow to the applicability of the format.

Current handling of negative bounds by Prometheus is defined here: https://prometheus.io/docs/prometheus/latest/querying/functions/#histogram_quantile

…ents and add explicit advice (#142)

* Use google.protobuf.Timestamp for timestamps and add clarity to certain comments
* Address that unknown, gauge and counter should all use double point values
* Add optional/required to fields missing it
* Mark summary quantiles as optional
* Make linear buckets able to use negative observations
* Call out that points are optional
* Remove mention of 0 lower bound to allow negative buckets
* Update protobuf types to match latest OpenMetrics RFC data model

// A name-value pair. These are used in multiple places: identifying
// timeseries, value of INFO metrics, and exemplars in Histograms.
message Label {

This comment should be ignored and marked as resolved.

Comment on lines 78 to 83
// Optional.
// If there is more than one point, all must have a timestamp field
// and they must be in increasing order of timestamp.
// If there is only one point and has no timestamp field, the receiver
// will assign a timestamp.
repeated Point points = 2;

The number of cases where you have multiple points vs one point for the same labels is extremely small. I would suggest optimizing for the case where only one point per label set is sent and removing repeated (this avoids an allocation or two, depending on the language), with the downside of repeating the labels for the case where multiple points are needed.

// timeseries, value of INFO metrics, and exemplars in Histograms.
message Label {
// Required.
string name = 1;

This comment should be marked as resolved.

@robskillington robskillington left a comment

LGTM; after merging the latest changes, now merging this to master.

@robskillington robskillington merged commit 48e3d1c into master Nov 5, 2020

RichiH commented Nov 5, 2020 via email
