-
Notifications
You must be signed in to change notification settings - Fork 170
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
data model proto #126
data model proto #126
Changes from 1 commit
9a6db2a
a3e9081
e2305cc
a4157b2
4c77806
455f14e
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,270 @@ | ||
syntax = "proto3"; | ||
|
||
package openmetrics; | ||
|
||
// This is a description of the OpenMetrics data model in a proto | ||
// form. Additionally, it serves as the proto wire format. | ||
// | ||
// Interpret "required" as semantically required. Some fields are | ||
// semantically required depending on the values of the other fields | ||
// (and indicated by comments). | ||
|
||
// The top-level message sent on the wire. | ||
message MetricSet { | ||
// Each MetricPoints has one or more points from a single metric. | ||
repeated MetricPoints metric_points = 1; | ||
} | ||
|
||
// One or more timeseries for a single metric, where each timeseries has | ||
// one or more points. | ||
message MetricPoints { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why not just Metric? |
||
// Required. | ||
MetricDescriptor metric_descriptor = 1; | ||
|
||
repeated Timeseries timeseries = 2; | ||
} | ||
|
||
// A single timeseries. | ||
message Timeseries { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't see anywhere something like Couple of use-cases where these are necessary and cannot be determine even if this data model is focused on a pull mechanism which would allow the backend to annotate these meta labels in general:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is what Info metrics are for. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nice, but I have some problems understanding some of the things here, maybe worth some clarification:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think we are allowing multiple info metrics, e.g. build_info. It is up to the receiver to decide what to do with them. But they will all apply to the whole MetricSet, so one shouldn't mix multiple producers in the same MetricSet. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, there's no limit on how many info metrics you can have. |
||
// Labels applying to all points. | ||
repeated Label label = 1; | ||
|
||
// If there is more than one point, all must have a timestamp field | ||
// and they must be in increasing order of timestamp. | ||
// If there is only one point and has no timestamp field, the receiver | ||
// will assign a timestamp. | ||
repeated Point point = 2; | ||
} | ||
|
||
// The descriptor of a metric. | ||
message MetricDescriptor { | ||
// Required. | ||
// Names must match the regex [a-zA-Z_:][a-zA-Z0-9_:]* however colons are | ||
// reserved for monitoring system use, and names beginning with | ||
// underscores are reserved for monitoring-system internal use. | ||
string name = 1; | ||
|
||
// TODO: Format? | ||
// STATE_SETs and INFO types must not have a unit. | ||
string unit = 2; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should this perhaps instead be an enum? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There's too many potential values, it's a string to allow for that. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do we want any restriction on this string, or just UTF-8? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It'd have to be the same limits as metric names. |
||
|
||
// Required. | ||
Type type = 3; | ||
|
||
string description = 4; | ||
|
||
enum Type { | ||
UNKNOWN = 0; // double or int valued point | ||
GAUGE = 1; // double or int valued point | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. In order to avoid confusion and have backends to support both double/int for the same metric what about making the type include the type of the valued point. Something like GAUGE_DOUBLE? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We discussed this at length, this is the way we're going. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You mean the current state or what I proposed? |
||
COUNTER = 2; // double or int valued point. | ||
STATE_SET = 3; | ||
INFO = 4; // provides information about the entity being monitored | ||
CUMULATIVE_HISTOGRAM = 5; | ||
GAUGE_HISTOGRAM = 6; | ||
// Summary quantiles are not recommended, since they cannot be | ||
// aggregated. | ||
SUMMARY = 7; | ||
} | ||
// Note: No bucketing is defined for a HISTOGRAM in the descriptor. | ||
// It is an implementation detail of the backend whether it can handle | ||
// changes in bucketing for a metric. Same for SUMMARY (quantiles are | ||
// not defined here), and STATE_SET (the set of possible values is | ||
// defined in each point). | ||
} | ||
|
||
// A name-value pair. These are used in multiple places: identifying | ||
// timeseries, value of INFO metrics, and exemplars in Histograms. | ||
message Label { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Because label is key, value you can use Labels as map<String,String>. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This comment should be ignored and marked as resolved. |
||
// Required. | ||
string name = 1; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do we want to support a label description? Human readable description of this metric? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We discussed this previously, and no. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. roger that. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This comment should be marked as resolved. |
||
|
||
// Required. | ||
string value = 2; | ||
} | ||
|
||
message Timestamp { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why not using the default timestamp from proto3? "google/protobuf/timestamp.proto" There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think it'd be better to keep self-contained There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. At least we can copy the definition from there and avoid any arguments about nanoseconds being int vs uint. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. do you know why it is int32 in google/protobuf/timestamp.proto even though it says "Non-negative fractions of a second at nanosecond resolution."? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Two reasons that I know:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't think either is a problem in practice, it's not going to have values where that'll be an issue. |
||
// Required. | ||
int64 seconds = 1; // Seconds since the epoch. Negative values are permitted | ||
|
||
uint32 nanoseconds = 2; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What are the semantics of this when seconds is negative? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Thoughts? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Don't quite understand. We can add even when nanoseconds is unsigned There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As discussed we'll just use the google protobuf timestamp type here. |
||
} | ||
|
||
// A point in a timeseries. | ||
message Point { | ||
oneof value { | ||
double double_value = 1; | ||
int64 int_value = 2; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hmm, did we agree uint64 here? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Would it be a little strange that double value can be negative but int value cannot be? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Any reason to restrict it to unsigned? The proto doc has int64 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Any reason this cannot be negative for a gauge? |
||
HistogramValue histogram_value = 3; | ||
StateSetValue state_set_value = 4; | ||
InfoValue info_value = 5; | ||
SummaryValue summary_value = 6; | ||
} | ||
// Required for COUNTER, SUMMARY or CUMULATIVE_HISTOGRAM type. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Optional There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We had agreed this is required from a model perspective. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. No, we had not agreed this. It's optional, as many many systems out there do not support this. |
||
// | ||
// The cumulative value is over | ||
// the time interval (start_timestamp, timestamp]. | ||
// If start_timestamp is present and timestamp is absent, | ||
// the backend assigned timestamp may be used to construct the time | ||
// interval. | ||
Timestamp start_timestamp = 7; | ||
|
||
// If not specified, the timestamp will be decided by the backend. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Make clear this is optional, and discouraged There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think we need to resolve the pull/push debate before we say "discouraged". IMO the same data model suffices for push in which case timestamp should be encouraged. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I wouldn't say encouraged for push, it depends on what's on the other side. The other side can always add its own. |
||
Timestamp timestamp = 8; | ||
} | ||
|
||
// Value for CUMULATIVE_HISTOGRAM or GAUGE_HISTOGRAM point. | ||
message HistogramValue { | ||
// Required. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Neither of these should be required There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. why not? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Some systems don't have them (MySQL) |
||
double sum = 1; | ||
|
||
// Required. | ||
int64 count = 2; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. uint64? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. done |
||
|
||
// Required. | ||
BucketOptions bucket_options = 3; | ||
|
||
// `BucketOptions` describes the bucket boundaries used to create a | ||
// histogram. The buckets can be in a linear sequence, an | ||
// exponential sequence, or each bucket can be specified explicitly. | ||
// `BucketOptions` does not include the number of values in each bucket. | ||
// | ||
// A bucket has a 0 lower bound and an inclusive upper bound for the | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This should say "inclusive 0 lower bound" for clarity There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Buckets for values <0 don't seem possible with this data model. Is that intended? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Per our discussions it is, noone could come up with a realistic use case. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. My usual tech use-case is Spam Assassin scores. However, OpenMetrics is supposed to be generally applicable. Observing negative values is a very normal thing for histograms in general. It is almost zero cost to support it. Kicking the support out would be a serious blow to the applicability of the format. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Current handling of negative bounds by Prometheus is defined here: https://prometheus.io/docs/prometheus/latest/querying/functions/#histogram_quantile |
||
// values that are counted for that bucket. That is, buckets are | ||
// cumulative. The upper bound of successive buckets must be increasing. | ||
// There is an implicit overflow bucket that extends up to +infinity. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. and must match the value of _count There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. these are just the bucket boundaries. I've added a comment later about the sum of bucket counts. |
||
message BucketOptions { | ||
// Exactly one of these three fields must be set. | ||
oneof options { | ||
// The linear bucket. | ||
Linear linear_buckets = 1; | ||
|
||
// The exponential buckets. | ||
Exponential exponential_buckets = 2; | ||
|
||
// The explicit buckets. | ||
Explicit explicit_buckets = 3; | ||
} | ||
|
||
// Specifies a linear sequence of buckets that add the same constant width | ||
// to the previous bucket (except the overflow bucket). This implies | ||
// a constant absolute uncertainty on the specific value of a sample. | ||
// | ||
// There are `num_explicit_buckets` (N) buckets plus the overflow | ||
// bucket. Bucket `i` (0 <= i < N) has upper bound : offset + (width * i). | ||
// Note that when offset = 0, bucket 0 represents the interval [0, 0]. | ||
message Linear { | ||
// Required. | ||
// Must be greater than 0. | ||
int32 num_explicit_buckets = 1; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. uint32 perhaps since there's no such thing as negative buckets? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done |
||
|
||
// Required. | ||
// Must be greater than 0. | ||
double width = 2; | ||
|
||
// Required. | ||
// Must be >= 0. | ||
double offset = 3; | ||
} | ||
|
||
// Specifies an exponential sequence of buckets that have an upper bound | ||
// that is proportional to the value of the upper bound of the previous | ||
// bucket. This implies a constant relative uncertainty on the specific | ||
// value of a sample. | ||
// | ||
// There are `num_exponential_buckets` + 1 (N) buckets plus the overflow | ||
// bucket. | ||
// Bucket 0 represents the interval [0, 0]. | ||
// Bucket `i` (1 <= i < N) has upper bound : | ||
// scale * (growth_factor ^ (i - 1)). | ||
message Exponential { | ||
// Required. | ||
// Must be greater than 0. | ||
int32 num_exponential_buckets = 1; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. uint32 perhaps here too since there's no such thing as negative buckets? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done |
||
|
||
// Required. | ||
// Must be greater than 1. | ||
double growth_factor = 2; | ||
|
||
// Required. | ||
// Must be greater than 0. | ||
double scale = 3; | ||
} | ||
|
||
// Specifies a set of buckets with arbitrary upper-bounds | ||
// | ||
// There are `size(bounds)` (N) explicit buckets plus the overflow bucket. | ||
// Bucket `i` (0 <= i < N) has upper bound: bounds[i]. | ||
// The `bounds` field must contain at least one element. | ||
message Explicit { | ||
// The values must be monotonically increasing and >= 0. | ||
repeated double bounds = 1; | ||
} | ||
|
||
} | ||
|
||
// The number of values in each bucket of the histogram, as described in | ||
// `bucket_options`. | ||
// | ||
// `bucket_counts` will typically contain N + 1 values. If you supply fewer | ||
// values, the remaining values are assumed to be 0. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This doesn't make sense, as these are cumulative. I'd suggest requiring it to be explicit There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. done |
||
// | ||
// The order of the values in `bucket_counts` follows the bucket numbering | ||
// schemes described for the three bucket types. | ||
repeated BucketCount bucket_counts = 4; | ||
message BucketCount { | ||
// Required. | ||
// Count is the number of values for a bucket of the histogram. | ||
int64 count = 1; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. uint64? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. done |
||
|
||
// An example of a value that fell into this bucket. | ||
Exemplar exemplar = 2; | ||
message Exemplar { | ||
// Required. | ||
// The value of the example. | ||
double value = 1; | ||
|
||
// The timestamp of the example. Optional but SHOULD set it. | ||
Timestamp timestamp = 2; | ||
|
||
// Additional information about the example value // (e.g. trace id). | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Length limit should be mentioned There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Making it consistent with "The maximum number of characters in the labels, including the curly braces, is 64 UTF-8 characters." seems messy. I'd rather define a length limit on the strings in the Label and compute the resulting limit for the text format (accounting for the quirks in the text format in the text format specification) |
||
repeated Label label = 3; | ||
} | ||
} | ||
} | ||
|
||
// Value for STATE_SET point. | ||
message StateSetValue { | ||
// Each state must have a unique name. Any number of states may be enabled. | ||
repeated State states = 1; | ||
|
||
message State { | ||
// Required. | ||
bool enabled = 1; | ||
|
||
// Required. | ||
string name = 2; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why is this not a regular label? Recommended key can be There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Then it'd not be a first-class type. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I am fine not having this as a first class. Probably on the gauge supporting boolean is enough |
||
} | ||
} | ||
|
||
// Value for INFO point. | ||
message InfoValue { | ||
repeated Label info = 1; | ||
} | ||
|
||
// Value for SUMMARY point. | ||
message SummaryValue { | ||
// Sum and count are optional since some systems (e.g. DropWizard) do not | ||
// expose them. The start_timestamp only applies to the sum and count. | ||
// The quantiles can be reset at arbitrary unknown times. | ||
|
||
double sum = 1; | ||
int64 count = 2; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. uint64 since there can't be negative sampled values? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. done |
||
|
||
repeated Quantile quantile = 3; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It should be clear that this is optional There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. are you saying the quantile's are optional? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes |
||
message Quantile { | ||
// Required. | ||
// Must be in the interval (0.0, 1.0]. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think this should be fully closed, why allow 1 but not 0? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. how do you calculate the 0th percentile from a sample of values? i don't think subtracting delta from the lowest value in the sample is mathematically sound. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Same question for 100th percentile. It's either both or neither. |
||
double quantile = 1; | ||
double value = 2; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably a better name is MetricList. Set semantics are a bit unclear here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a set, you aren't allowed have duplicates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then probably document that, to make it clear why is a set and what is the "hash" function.