
reflect/protoreflect: why are all Descriptor interfaces marked with doNotImplement? #1194

Open · jhump opened this issue Sep 1, 2020 · 4 comments
@jhump (Contributor) commented Sep 1, 2020

I see only a few places that actually care about the concrete type that implements the interface, and those rightly use single-assignment type conversions (since, in those cases, the internal implementations should be the only possible ones).

But dynamicpb, for example, does not depend on a particular concrete implementation. So it seems useful to allow alternate implementations, for sophisticated use cases that want the same serialization/deserialization/validation that dynamic messages provide, even when a normal descriptor (or descriptor protos) is not yet available.

So is the main reason for this so that you can add new methods to the interface without breaking users? Or are there other unspecified invariants/assumptions in the internal implementations that would make alternate implementations unsafe or incorrect?

I imagine this is a strange request, so here is some context: I am trying to convert github.com/jhump/protoreflect over to the API v2. This enables much simpler interop, for usages of my library that also want to use the new API v2 protoreflect stuff. This is also part of my planned "transition path" to a v2 of my protoreflect, which can mostly drop the desc and dynamic packages (since those are now part of the API v2 runtime) and rewrite the other useful bits (such as desc/protoprint, desc/builder, and most of all desc/protoparse) directly in terms of the API v2 types.

The sticking point is in protoparse: I don't yet have fully constructed and resolvable descriptor protos, so I can't construct descriptors "the normal way". Yet I need descriptors for the last step of parsing/linking: interpreting options. Interpreting options is complicated enough... but doing so without the aid of a dynamic message is just silly. I've tried "the normal way", but since options are not yet interpreted, there are numerous classes of descriptor protos that fail this step (for example, enums that have aliases, or messages that use message-set wire format and have fields with tag numbers outside the normal range). Also, "the normal way" is kind of expensive to have to do twice (before and then again after interpreting options).

So my current approach is to provide my own lightweight implementation of those interfaces that just uses the raw parsed descriptor protos plus the linker's internal data structures. I could then use dynamicpb with those implementations to interpret options. Once options are interpreted, I could use "the normal way" to turn the resulting fully populated descriptor protos into rich, fully linked descriptors.

@dsnet (Member) commented Sep 1, 2020

> So is the main reason for this so that you can add new methods to the interface without breaking users?

Essentially this. The protobuf language gains new features over time (oneofs, maps, and proto3 optional, to name a few), and we want the freedom to adjust the descriptor API accordingly to reflect such functionality.

There was some discussion in the past about adding a "FooDescriptorNotImplemented" type for every Descriptor and requiring application code to embed it, or some other technique. When we released the module, we decided to punt on figuring out whether we wanted to allow custom descriptor implementations. Naturally, having an unexported method is the more conservative approach, and it can always be loosened in the future.

cc: @neild

@jhump (Contributor, Author) commented Sep 1, 2020

Okay. I have a branch where I am just embedding the protoreflect interface, so that my type makes the Go compiler happy.

I guess I'm wondering how likely I am to be broken by using this technique. I think the answer to that really depends on if (1) the new protobuf feature requires expansion of one or more descriptor interfaces, (2) the feature impacts the behavior of a dynamic message such that dynamicpb needs to invoke one of the new methods, and (3) my use of dynamic messages (text unmarshal, setting/validating field values) triggers such a code path. I suspect a change that has all three of these properties is rather unlikely.

@neild (Contributor) commented Sep 1, 2020

I'm probably missing something, but I don't follow this:

> I don't yet have fully constructed and resolvable descriptor protos, so I can't construct descriptors "the normal way".

Is this "don't yet" because you haven't implemented it yet, or because at this stage of parsing the descriptors haven't been created yet?

I'd expect protoparse to process the dependency tree of .proto files in depth-first order, so that the descriptors for any options have been created by the time the option needs to be parsed.

@dsnet changed the title from "APIv2: why are all Descriptor interfaces marked with doNotImplement?" to "reflect/protoreflect: why are all Descriptor interfaces marked with doNotImplement?" on Sep 1, 2020
@jhump (Contributor, Author) commented Sep 1, 2020

> Is this "don't yet" because you haven't implemented it yet, or because at this stage of parsing the descriptors haven't been created yet?

@neild, at this stage of parsing, the options have not been interpreted. So certain validation when constructing descriptors from protos fails, because it requires certain options to be set: for example, enums that have aliases, or messages that use message-set wire format and have fields with tags outside the normal range.

Also, going in dependency order for interpreting options does not necessarily help since a file can reference custom options that are defined in the file itself, not in an imported file. So having processed dependencies is not all that is needed to interpret options.
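As an illustration of that point (this .proto file is a hypothetical example, not taken from the issue), a file can define a custom option and then use it in the very same file, so having processed all imports is not enough to resolve the extension:

```proto
syntax = "proto2";

package example;

import "google/protobuf/descriptor.proto";

// The custom option is defined here...
extend google.protobuf.FileOptions {
  optional string my_file_opt = 50001;
}

// ...and used in the same file that defines it, so the extension
// cannot be resolved from the file's imports alone.
option (example.my_file_opt) = "hello";
```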

I considered a first pass that interprets only non-custom options. But that is complicated when the user provides a custom/override version of descriptor.proto, with possibly different non-custom options than what the runtime actually knows about. If any of those non-custom options are non-scalars (maps, messages, or groups), I'm back in the same situation: I really need a dynamic message (and thus a descriptor) to interpret them reasonably. This approach would also require a completely separate, descriptor-free code path for interpreting non-custom options, and it incurs the cost of double processing, since the full descriptors would be built both before and after interpreting custom options.
