Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MetricDescriptor.Type enum items do not match specification #78

Closed
tigrannajaryan opened this issue Nov 27, 2019 · 8 comments
Closed

MetricDescriptor.Type enum items do not match specification #78

tigrannajaryan opened this issue Nov 27, 2019 · 8 comments

Comments

@tigrannajaryan
Copy link
Member

tigrannajaryan commented Nov 27, 2019

MetricDescriptor.Type enum items currently specify GAUGE and COUNTER types, however their meaning does not match that of the specification.

The GAUGE enum value in this repository corresponds to specification's Non-monotonic Counters and Gauges.

The COUNTER enum value in this repository corresponds to specification's monotonic Counters and Gauges.

To avoid confusion I suggest renaming enum values to the following:

enum Type {
// Represents monotonic int64 Counter or Gauge
MONOTONIC_INT64 = 1;

// Represents monotonic double Counter or Gauge
MONOTONIC_DOUBLE = 2;

// Represents non-monotonic int64 Counter or Gauge
NONMONOTONIC_INT64 = 3;

// Represents non-monotonic double Counter or Gauge
NONMONOTONIC_DOUBLE = 4;
}

Alternatively we can have finer definition of metric type that includes 3 fields: data type, kind of the metric as defined in the spec (i.e. Counter, Gauge, Measure) and a flag indicating monotonicity of the metric:

enum Kind { Counter, Gauge, Measure };
enum DataType { Int64, Double, Histogram, Summary };
Kind kind;
DataType data_type;
bool monotonic;

Note that in this case certain combinations will be non-representable with current protobuf schema and we need to define precisely which combinations of enums and monitonicity are valid and correspond to DataPoint types for which we have defined protobuf messages.

@tigrannajaryan
Copy link
Member Author

@bogdandrutu @jmacd what do you think?

@bogdandrutu
Copy link
Member

@tigrannajaryan I like the MONOTONIC version (first proposal).

@jmacd
Copy link
Contributor

jmacd commented Nov 28, 2019

I would suggest the descriptor match the API more closely. There are 12 combinations of { Counter, Gauge, Measure }, { int64, float64 }, { monotonic/non-monotonic or absolute/non-absolute }.

As discussed in open-telemetry/opentelemetry-specification#364, there is a motion to avoid using a single boolean to reflect the bit for { monotonic/non-monotonic or absolute/non-absolute }. In the protocol it matters because we'd prefer to set the bit only when the non-default cause is chosen, but counters and gauges have different defaults. In this respect, maybe we're best off with 6 kinds:

{ monotonic_counter, nonmonotonic_counter, monotonic_gauge, nonmonotonic_gauge, absolute_measure, nonabsolute_measure }

I am not sure about "data type". I would prefer if "data type" referred to the original input values (i.e., { int64, float64 }), which may not be important to include in the export.

I am more familiar with using "value type" to refer to { Int64, Double, Histogram, Summary }, and I would not call "value type" information part of the descriptor itself. The same descriptor could generate any of these values, so I would let that be implied by the type of value that accompanies it in the Metric message, not part of the descriptor.

To summarize, I would go with a 6-value enum (monotonic_counter, nonmonotonic_counter, monotonic_gauge, nonmonotonic_gauge, absolute_measure, nonabsolute_measure) and I would leave out whether the inputs are integer or floating point.

@tigrannajaryan
Copy link
Member Author

Do we need to differentiate between absolute and non-absolute measures? I cannot think of anything that requires interpreting them differently and they are going to be represented as signed numbers in the proto anyway.

I am not sure about "data type". I would prefer if "data type" referred to the original input values (i.e., { int64, float64 }), which may not be important to include in the export.

The data type must be reflected in the metric descriptor otherwise there is no way to know which data field in Metric message to use (e.g. int64 or double?). So this either needs to be an explicit field or must be unambiguously inferrable from the Value Type (or Kind) enum if we use one enum only.

@jmacd
Copy link
Contributor

jmacd commented Dec 3, 2019

I believe we should. An example case that has come up is when using an absolute measure to convey individual measurements where the sum is relevant. If the measure is absolute, the sum is monotonic, which makes it easier to automate the rate-of-sum calculation. I see this as metadata that the transport and protocol itself probably doesn't care about, but might matter for presentation and downstream use.

@bogdandrutu
Copy link
Member

This was written before I read few times the conversation

I would suggest the descriptor match the API more closely. There are 12 combinations of { Counter, Gauge, Measure }, { int64, float64 }, { monotonic/non-monotonic or absolute/non-absolute }.

I am not sure what is the representation of the Measure in this case. We need to map the values to an aggregation that the SDK produces, otherwise you are proposing only raw values (no aggregation) which is sub-optimal and I think we should not do that (see for example OpenMetrics which does NOT send raw values for all the good reasons).

So that being said I think we should design the protocol around the aggregation we produce (similar with OpenMetrics) + time interval (cumulative value over time OR value set at a specific moment).

This was written after I read >3 times the conversation

If I understand correctly @jmacd proposes something like this:
Metric {
MetricDescriptor
repeated Aggregation{ AggregationType, repeated timeseries/points }
}

The current proposal does not try to preserve the initial api instrument description but rather unify the initial descriptor with the aggregation type and labels used, so then from an instrument we may produce multiple metrics (based on the aggregation defined), for example from an instrument called rpc_latency we may want to produce two histograms (different labels, different buckets) they will become two metrics. This way our protocol is compatible with OpenMetrics and much easier to transform to/from.

@jmacd
Copy link
Contributor

jmacd commented Dec 6, 2019

I feel there we should maintain the kind of the original instrument, but I agree with the comments above. Suppose I export a Measure instrument. If I know that it was absolute then I can automatically show you a plot of the sum as a rate. If the Measure is not absolute, I should not try to plot the sum as a rate. In both cases, the value being exported is a histogram or a summary.

@bogdandrutu I am suggesting that we keep the original instrument kind as well as be explicit about the aggregation (somehow). @tigrannajaryan I was imagining that there would only one value present, which is unambiguous in the sense that you have exactly one value, but doesn't actually state what the aggregation is.

@bogdandrutu If I read you correctly, you're agreeing with a structure like

Metric {
MetricDescriptor
repeated Aggregation{ AggregationType, repeated timeseries/points }
}

is that right?

@bogdandrutu
Copy link
Member

Closing this since we went with an approach to describe the aggregation of the data not the instrument that was initially used.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants