Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proto: Problem differentiating not-set value from default value #183

Open
mxinden opened this issue Dec 20, 2020 · 3 comments
Open

proto: Problem differentiating not-set value from default value #183

mxinden opened this issue Dec 20, 2020 · 3 comments

Comments

@mxinden
Copy link
Member

mxinden commented Dec 20, 2020

Problem

The OpenMetrics Protobuf specification is using the proto3 syntax.

With the proto3 syntax all fields are optional by default. This is not to be confused with null-able. While one can choose not to specify a field it is then represented by its default value. Someone receiving a proto3 encoded payload has no way to tell the difference between a field explicitly set to its default value or not set at all.

The most prominent case where this can be problematic is for boolean fields where both the default value and the not-set value is false. In the context of OpenMetrics, as far as I can tell, one problematic example would be the examplar field in the CounterValue message. A use can not tell the difference between a not-set examplar and an examplar set to 0.

https://github.com/OpenObservability/OpenMetrics/blob/1ff04f252cec929acd9a569fce87b8ec4f72b086/proto/openmetrics_data_model.proto#L109-L123

Solution

To solve the above I see 3 solutions:

  1. The optional keyword, which would allow differentiating between a default value and a not-set value, was initially removed with the proto3 syntax, but has been revived with v3.12.0. In case requiring protoc v3.12.0 is fine, one could just make use of the revived optional keyword.
  2. The optional keyword is just syntactic sugar for oneof. Thus, in case requiring protoc v3.12.0 is not an option, one can use oneof directly. See this stackoverflow answer.
  3. Use proto2 instead of proto3 (see original Prometheus proto definition).

Further reading


What do people think? Do you see alternative approaches?

Given that I don't have much expertise with Protobuf I might be missing something. If so, I would appreciate an answer pointing out my mistake.

@beorn7
Copy link
Member

beorn7 commented Dec 22, 2020

Yeah, that's definitely a problem in general. For the exemplar, only the rare case of an exemplar with a value of 0 and without any labels and timestamp (or a default timestamp, see below) would be indistinguishable from a non-existing exemplar, but it's still a corner case that might strike eventually.

IIUC, the google.protobuf.Timestamp type has a default value of 1970-01-01, midnight UTC, which means that all the optional timestamps are indistinguishable from that timestamp, which is a perfectly relevant timestamp for practical purposes. (The Go Time type, for example, has a zero value 0001-01-01, midnight UTC, which is way less likely to come up in practice and is therefore officially used to mark the "unset" state. With protobuf, we apparently cannot rely on a similar trick.)

Using the new optional keyword or oneof creates a larger message size (again IIUC), so there might be an argument to not use it on all semantically optional fields, but only on fields where it is important to distinguish between "unset" and "set to default value". (Perhaps that's the case for all semantically optional fields?)

I don't think going back to proto2 is an acceptable option at this time.

@robskillington
Copy link
Contributor

robskillington commented Jan 6, 2021

Hey thanks for opening this to discuss. This is actually not an issue for Timestamp or Exemplars since they are of message type, and it's very easy to detect if the field was set on the wire.

This is outlined in the documentation you linked for proto3 field presence: https://github.com/protocolbuffers/protobuf/blob/v3.12.0/docs/field_presence.md#presence-in-proto3-apis.

Here's an example of this, you can see very clearly if those fields are set or not:

package main

import (
	"fmt"
	"github.com/golang/protobuf/proto"
	"google.golang.org/protobuf/types/known/timestamppb"
)

func main() {
	data, err := proto.Marshal(&CounterValue{})
	if err != nil {
		panic(err)
	}

	var msg CounterValue
	if err := proto.Unmarshal(data, &msg); err != nil {
		panic(err)
	}

	fmt.Printf("msg: %v\n", msg.String())
	fmt.Printf("value.Created: %v\n", msg.Created)
	fmt.Printf("value.Exemplar: %v\n", msg.Exemplar)

	data, err = proto.Marshal(&CounterValue{
		Created: &timestamppb.Timestamp{Seconds: 123},
	})
	if err != nil {
		panic(err)
	}

	if err := proto.Unmarshal(data, &msg); err != nil {
		panic(err)
	}

	fmt.Printf("msg: %v\n", msg.String())
	fmt.Printf("value.Created: %v\n", msg.Created)
	fmt.Printf("value.Exemplar: %v\n", msg.Exemplar)

	data, err = proto.Marshal(&CounterValue{
		Created:  &timestamppb.Timestamp{Seconds: 123},
		Exemplar: &Exemplar{Value: 42},
	})
	if err != nil {
		panic(err)
	}

	if err := proto.Unmarshal(data, &msg); err != nil {
		panic(err)
	}

	fmt.Printf("msg: %v\n", msg.String())
	fmt.Printf("value.Created: %v\n", msg.Created)
	fmt.Printf("value.Exemplar: %v\n", msg.Exemplar)
}

Output:

$ go run bin/proto/*.go
msg: 
value.Created: <nil>
value.Exemplar: <nil>
msg: created:{seconds:123}
value.Created: seconds:123
value.Exemplar: <nil>
msg: created:{seconds:123} exemplar:{value:42}
value.Created: seconds:123
value.Exemplar: value:42

@beorn7
Copy link
Member

beorn7 commented Jan 6, 2021

Good to learn about the singular message case. That will indeed defuse many cases. I guess we still need to go through all singular numeric, enum, and string fields to see if there is any issue with not-set vs. default.

I assume that we never have the necessity to distinguish between a not-set repeated field and a repeated field with zero repetitions (but if anyone has contrary evidence, please speak up).

We don't have maps or bytes in the current proto file.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants