Custom Options and Proto2 Extensions #674
Comments
@LucioFranco Following up: I have some time coming up. Would you like me to move ahead with some small PRs, or would you like to review the design first? I could also recreate the existing PR with the commits cut into logical slices, so it would be easier to follow the progression described here, and then we could work off that.
@nswarm sorry for the delay on this! I would say for now hold off on doing more work until I get some time to read through this and think about it. I know that's probably not the answer you want to hear, but my time is a bit limited atm. I appreciate you doing this work and I will try to get to this in the next month or so. Again, sorry about the delay.
No problem, I'm in no rush. Thanks for getting back to me. I'll let it simmer :)
Hi @nswarm @LucioFranco, I'm catching up on all the discussions about this feature, and I'm going to give my 2 cents: Unknown extensions and
Hello, I'm very interested in this feature and would like to help with anything: reviewing, discussion, implementing some of the features.
Hi, sorry this has taken me so long to finally review, but I started digesting it over the last few days. I don't feel like I am fully there yet on what I think the right idea is, but I'll add some of my current thoughts and notes to spark some discussion.
Anyways, here are some quick thoughts. I will continue thinking about this and am looking forward to seeing what y'all say. Again, thanks for being so patient <3
Perhaps a no-op ExtensionSet is nicer rather than the
Just checked the Java implementation. What they do is hardcode the protobuf descriptor bytes in the generated Java class, then initialize a global variable with the extension value in a static context. When the user accesses the extension, the value is cast using the generic type of the generated extension class. This is roughly similar to lazily parsing the extensions from
By this do you mean bounding I spiked this out quickly and I think that would work great, assuming we could call the merge/encode methods without generics (e.g. with
Yes. It would nuke a lot of oddities of my implementation like the Encode/Merge traits as well.
The way I had it is effectively the no-op version, where the top-level Option is handled by the
If the Option was around the
That said, since the
I don't have much opinion either way.
(Thinking out loud a bit here...)
WRT interface, you always need to supply byte layout info to the parser, whether it's up front or during data access. The tradeoff between an extension registry and lazily parsing extensions is when you supply that data. With an extension registry, you supply layout info up front when you decode. With lazily parsed extensions, you supply the layout info whenever you try to do an operation on the extension set (get/set/has). As it happens in my impl you already need to supply a static
There's also a performance tradeoff in when you parse the extension data. It's probably not significant enough to consider without a clear use case and measurements, though, as long as you're properly storing the result on first eval. It could even provide some small perf benefit when re-encoding, if you're OK with the higher memory footprint of hanging onto the encoded version while it hasn't changed. There's also a potential benefit that if you aren't accessing extensions often, you don't pay the cost.
tl;dr:
Agreed, I won't have time to take this on but if someone wants to start on this that would be good. I think I posted a proposal in there. The big thing here is running enough benches to be sure there is no perf impact.
I think it would be nice if we could do
Overview
This is a proposal to support extensions (proto2 only) and custom options (proto2 and proto3).
The general idea of Extensions is that users can extend third-party protobuf definitions with additional data that will not break the deserialization of anything that relies only on the base version of the protobuf. For example, your service might report health check information in some standard-compliant way, but also include Extension data that another service in your ecosystem (with your specific protobuf defs) could use as well.
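To illustrate why extension data does not break readers that only know the base definition, here is a minimal std-only sketch of the wire-level behaviour. The `HealthCheck` message, its field numbers, and the helper names are made up for illustration; only the key layout (`field_number << 3 | wire_type`) and the varint encoding come from the protobuf wire format.

```rust
/// Base view of a hypothetical `HealthCheck` message: it only knows field 1.
#[derive(Debug, Default, PartialEq)]
struct HealthCheck {
    status: u64, // field 1, varint
}

/// Reads one base-128 varint starting at `*pos`, advancing `*pos`.
fn read_varint(buf: &[u8], pos: &mut usize) -> Option<u64> {
    let mut value = 0u64;
    for shift in (0..64).step_by(7) {
        let byte = *buf.get(*pos)?;
        *pos += 1;
        value |= u64::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            return Some(value);
        }
    }
    None // longer than 10 bytes: malformed
}

/// Returns `pos + len` if that still lies inside a buffer of `end` bytes.
fn skip(pos: usize, len: usize, end: usize) -> Option<usize> {
    pos.checked_add(len).filter(|&p| p <= end)
}

/// Decodes the base message, skipping every field number it does not know.
fn decode_base(buf: &[u8]) -> Option<HealthCheck> {
    let mut msg = HealthCheck::default();
    let mut pos = 0;
    while pos < buf.len() {
        let key = read_varint(buf, &mut pos)?;
        let (field, wire_type) = (key >> 3, key & 0x7);
        match (field, wire_type) {
            (1, 0) => msg.status = read_varint(buf, &mut pos)?,
            // Unknown fields (extensions included) are skipped by wire type.
            (_, 0) => {
                read_varint(buf, &mut pos)?;
            }
            (_, 2) => {
                let len = read_varint(buf, &mut pos)? as usize;
                pos = skip(pos, len, buf.len())?;
            }
            (_, 1) => pos = skip(pos, 8, buf.len())?,
            (_, 5) => pos = skip(pos, 4, buf.len())?,
            _ => return None, // groups and invalid wire types are not handled here
        }
    }
    Some(msg)
}
```

A buffer that carries field 1 plus a length-delimited extension at field 100 decodes to the same `HealthCheck` as a buffer with field 1 alone; the extension bytes are simply passed over.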
Custom Options are Extensions on the core `*Options` types, like `MessageOptions` and `FileOptions`, that are available in protoc generators, including custom protoc plugins, or prost.
Protobuf Extensions
Given a protobuf message like this:
Users can write extensions like this:
Then they can use extension data like this (what it looks like in my PR #591):
Implementation
The original PR (#591) implements extensions similarly to how they are handled for other languages within the core protobuf project, which I'll describe here.
There are three major components:
1. In-Memory Extension Data
Given a protobuf like:
We generate code like:
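A minimal std-only sketch of the shape this describes (not prost's actual generated output). The names `ExtensionSet`, `EXTENDABLE_TYPE_ID`, and `Extendable` come from this proposal; the `HealthCheck` message, the string type ID, the method names, and the use of `Box<dyn Any>` for type erasure are assumptions made for illustration.

```rust
use std::any::Any;
use std::collections::BTreeMap;

/// Type-erased extension values keyed by protobuf field tag.
#[derive(Default)]
struct ExtensionSet {
    values: BTreeMap<u32, Box<dyn Any>>,
}

impl ExtensionSet {
    fn set<T: Any>(&mut self, tag: u32, value: T) {
        self.values.insert(tag, Box::new(value));
    }

    /// Returns the value only if the tag is present and actually holds a `T`.
    fn get<T: Any>(&self, tag: u32) -> Option<&T> {
        let boxed = self.values.get(&tag)?;
        boxed.downcast_ref::<T>()
    }
}

/// Accessors a derive could generate for every extendable message (assumed shape).
trait Extendable {
    fn extendable_type_id() -> &'static str;
    fn extension_set(&self) -> &ExtensionSet;
    fn extension_set_mut(&mut self) -> &mut ExtensionSet;
}

/// Hand-written stand-in for a generated message (hypothetical).
#[derive(Default)]
struct HealthCheck {
    status: i32,
    extension_set: ExtensionSet,
}

impl HealthCheck {
    /// Assumed to be a string here; the real ID type is not shown in this thread.
    const EXTENDABLE_TYPE_ID: &'static str = "example.HealthCheck";
}

impl Extendable for HealthCheck {
    fn extendable_type_id() -> &'static str {
        Self::EXTENDABLE_TYPE_ID
    }
    fn extension_set(&self) -> &ExtensionSet {
        &self.extension_set
    }
    fn extension_set_mut(&mut self) -> &mut ExtensionSet {
        &mut self.extension_set
    }
}
```

Reading with the wrong Rust type or an unset tag returns `None` rather than panicking, which is the property a type-erased container needs before typed accessors are layered on top.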
`::prost::ExtensionSet`: At runtime, holds type-erased extension values by protobuf tag.
`EXTENDABLE_TYPE_ID`: Used as a validator to ensure a particular Extension is for this protobuf type. (Will come back to this.)
`::prost::Extendable`: A derived trait that generates accessors for the above two.
The `ExtensionSet` is simply a mapping of protobuf tag to type-erased values, similar to this:
2. Encoding and Decoding Extension Data
Given a type with an ExtensionSet, we need to be able to encode/decode that data to the wire. Users write this information into the protobuf extension itself like this:
We generate code like:
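As a rough std-only illustration of what such an extension descriptor has to carry (again, not the real generated code): the field names `extendable_type_id`, `field_tag`, and `proto_int_type` are from this proposal, while the `ProtoIntType` enum, its variants, and the `id` and `encode_value` methods are hypothetical. The sketch shows why the Rust type alone is not enough: one `i32` value has three different payload encodings depending on the proto integer type.

```rust
use std::marker::PhantomData;

/// Hypothetical stand-in for `proto_int_type`: the wire encoding of an integer field.
#[derive(Clone, Copy, Debug, PartialEq)]
enum ProtoIntType {
    Default, // e.g. int32: plain varint, negatives sign-extended to 64 bits
    Sint,    // e.g. sint32: zigzag, then varint
    Fixed,   // e.g. sfixed32: 4 little-endian bytes
}

/// Static description of one extension field holding a `T`.
struct ExtensionImpl<T> {
    extendable_type_id: &'static str,
    field_tag: u32,
    proto_int_type: ProtoIntType,
    _value: PhantomData<T>,
}

impl<T> ExtensionImpl<T> {
    /// (message type, field tag) identifies one field of one message type.
    fn id(&self) -> (&'static str, u32) {
        (self.extendable_type_id, self.field_tag)
    }
}

fn encode_varint(mut value: u64, out: &mut Vec<u8>) {
    while value >= 0x80 {
        out.push((value as u8 & 0x7f) | 0x80);
        value >>= 7;
    }
    out.push(value as u8);
}

impl ExtensionImpl<i32> {
    /// Encodes only the value bytes (no field key) to show how the payload differs.
    fn encode_value(&self, value: i32, out: &mut Vec<u8>) {
        match self.proto_int_type {
            ProtoIntType::Default => encode_varint(value as i64 as u64, out),
            ProtoIntType::Sint => {
                let zigzag = ((value << 1) ^ (value >> 31)) as u32;
                encode_varint(u64::from(zigzag), out)
            }
            ProtoIntType::Fixed => out.extend_from_slice(&value.to_le_bytes()),
        }
    }
}
```

With these rules, `-1` takes ten bytes as a plain varint, one byte with zigzag, and four bytes as a fixed-width value, so a decoder given only `i32` could not know which layout to expect.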
`extendable_type_id` and `field_tag` compose the ID of a specific field of a specific type.
The generic type and `proto_int_type` give us all the info we need to encode/decode the data. `proto_int_type` supplies additional info because there's not a 1:1 mapping of Rust int types to proto int types.
Decoding
When decoding, fields in the Rust generated code map 1:1 with the protobuf. For extensions we don't have generated fields. Instead we have a single `ExtensionSet` field, which serves as a container for all unknown fields that we have an Extension registered for. This is why we need the `ExtensionRegistry`, which needs to be filled with those Extension types before parsing.
The `ExtensionRegistry` is simply a store of `ExtensionImpl`s. It is queried by message type and tag when parsing the unknown fields of a message; the generic type and `proto_int_type` can then be used to decode the field from the serialized data and store it in the `ExtensionSet` field on the object using the `merge` method.
Encoding
When encoding, the `ExtensionSet`'s `encode` is called, which in turn tells each of its extension values to encode itself.
`Merge` and `Encode` Traits
The various `merge` and `encode` methods already exist for generated prost types via the `::prost::Message` derive. I added the `Merge` and `Encode` traits, which are implemented by the same derive and call into those same methods. This way we can call these methods in a typeless context, such as when encoding/decoding in the generic types of `ExtensionImpl`.
3. Getting and Setting Extension Data
The `Extendable` trait provides a function to get the type's `ExtensionSet`. From there you can use the static `ExtensionImpl` type to tell the `ExtensionSet` which data you want to operate on.
I think it's plausible to break this down into PRs roughly along the lines above if that would make it easier to consume. The feature is kind of a monster, though, and not particularly useful without everything.
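To show how the three components depend on each other, here is a compact std-only model of the flow: register an extension, move matching unknown fields into the set while decoding, then read the value back through the static descriptor. It is a sketch under assumptions, not the API from #591: the `decode` function pointer stands in for the `Merge`/`Encode` machinery, unknown fields are passed in as already-split `(tag, payload)` pairs, and `REGION`, `decode_utf8`, and `merge_unknown_fields` are hypothetical names.

```rust
use std::any::Any;
use std::collections::HashMap;

/// Static handle naming one extension field and its Rust type.
struct ExtensionImpl<T> {
    extendable_type_id: &'static str,
    field_tag: u32,
    decode: fn(&[u8]) -> Option<T>, // assumption: stands in for the merge logic
}

type ErasedDecode = Box<dyn Fn(&[u8]) -> Option<Box<dyn Any>>>;

/// Decoders keyed by (message type id, field tag), filled before parsing.
#[derive(Default)]
struct ExtensionRegistry {
    decoders: HashMap<(&'static str, u32), ErasedDecode>,
}

impl ExtensionRegistry {
    fn register<T: Any>(&mut self, ext: &ExtensionImpl<T>) {
        let decode = ext.decode;
        // Erase `T` so extensions of different types can share one map.
        self.decoders.insert(
            (ext.extendable_type_id, ext.field_tag),
            Box::new(move |bytes: &[u8]| decode(bytes).map(|v| Box::new(v) as Box<dyn Any>)),
        );
    }
}

#[derive(Default)]
struct ExtensionSet {
    values: HashMap<u32, Box<dyn Any>>,
}

impl ExtensionSet {
    /// Typed read: the descriptor supplies both the tag and the Rust type.
    fn get<T: Any>(&self, ext: &ExtensionImpl<T>) -> Option<&T> {
        self.values.get(&ext.field_tag)?.downcast_ref::<T>()
    }
}

/// Stores every unknown field that has a registered extension for this message type.
fn merge_unknown_fields(
    type_id: &'static str,
    unknown: &[(u32, &[u8])],
    registry: &ExtensionRegistry,
    set: &mut ExtensionSet,
) {
    for &(tag, payload) in unknown {
        if let Some(decode) = registry.decoders.get(&(type_id, tag)) {
            if let Some(value) = decode(payload) {
                set.values.insert(tag, value);
            }
        }
    }
}

fn decode_utf8(bytes: &[u8]) -> Option<String> {
    String::from_utf8(bytes.to_vec()).ok()
}

/// Hypothetical extension: a `string region = 100` on `example.HealthCheck`.
static REGION: ExtensionImpl<String> = ExtensionImpl {
    extendable_type_id: "example.HealthCheck",
    field_tag: 100,
    decode: decode_utf8,
};
```

In this model, a field whose tag is not registered for the message type is left out of the set, and reading through `REGION` yields a `&String` without the caller naming the tag or the type by hand.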
Let me know if anything is unclear, it's fairly complicated and took some digging to truly understand what was going on. I'm also probably not explaining as well as I think I am. :)
@LucioFranco