Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Work In Progress: Separate binary with default code generation behaviour for normal proto files. #39

Closed
pattyshack opened this issue Jan 14, 2015 · 36 comments

Comments

@pattyshack
Copy link
Contributor

There are a bunch of dropbox internal customization which I'm interested in upstreaming. In particular:

  1. We've rename Size() to ProtoSize()
  2. We've enable marshaller, unmarshaller, and sizer by default

To avoid breaking the rest of the world, I was thinking of creating a struct which would holds the default values (and we'll only patch this struct within dropbox). something like

type DefaultParams struct {
   sizeMethodName string
   isMarshaller bool
   isUnmarshaller bool
   isSizer bool
}

the world would use:

params := DefaultParams{"Size", false, false, false}

while dropbox would override the defaults to:

params := DefaultParams{"ProtoSize", true, true, true}

Is this acceptable?

Also, on the experimental side, I want to backport plugins from https://github.com/dropbox/goprotoc to generate proper mutator APIs (we've got burn by goproto's lack of proper mutator api a few of time recently, especially around repeated field). Is this something you would be interested in?

Thanks,
Patrick

@pattyshack
Copy link
Contributor Author

(FYI @andrei-alpha)

@awalterschulze
Copy link
Member

Upstreaming yeah :)

I really like the idea of overriding the DefaultParams for gogoproto extensions, but I do not think sizeMethodName fits into this model though.
Overriding the defaults would allow quite a lot of people not to have to import gogoproto.proto and descriptor.proto and make their makefiles much simpler.
I wonder if there is a way of passing command line parameters from protoc into the gogoproto plugin, since then they don't even have to make their own main.go.

When you talk about a mutator API you are talking about setters that return a pointer to the struct they just changed, right?
"There are setters that can change the value of a field. For message fields they return a mutable pointer."
I guess this should be easy to add with an extension, or what did you have in mind?

@pattyshack
Copy link
Contributor Author

if we can remove io/ and the Sizer interface from proto/encode_gogo.go (looks like neither is used), then we should be able to parameterize the size method name (plugins and test are the only places that refers to Size()).

wrt to mutator: yes, basically the same as c++'s api where the mutator function is responsible for memory allocation

@awalterschulze
Copy link
Member

My concern is there is a TODO in goprotobuf code.
I don't know what else they are planning.

https://github.com/golang/protobuf/blob/master/proto/encode.go

// Size returns the encoded size of a protocol buffer.func Size(pb Message)
(n int) { // Can the object marshal itself? If so, Size is slow. // TODO:
add Size to Marshaler, or add a Sizer interface. if m, ok :=
pb.(Marshaler); ok { b, _ := m.Marshal() return len(b) }

On 14 January 2015 at 11:36, Patrick notifications@github.com wrote:

if we can remove io/ and the Sizer interface from proto/encode_gogo.go
(looks like neither is used), then we should be able to parameterize the
size method name (plugins and test are the only places that refers to
Size()).

wrt to mutator: yes, basically the same as c++'s api where the mutator
function is responsible for memory allocation


Reply to this email directly or view it on GitHub
#39 (comment).

@awalterschulze
Copy link
Member

I have no experience with the C++ api, maybe you could explain more about the memory allocation you have in mind?

@awalterschulze
Copy link
Member

io is a library that is being used, not within gogoprotobuf, but by projects using gogoprotobuf.

@awalterschulze
Copy link
Member

It seems like we can remove the Sizer interface in encode_gogo.go. I hope nobody is using it.

@awalterschulze
Copy link
Member

I don't know how useful renaming the size method would be to others.
Maybe there is some other feature that would be more generally useful, but I can't think of one.
For name conflicts I recommend using the customname extension.

@pattyshack
Copy link
Contributor Author

https://developers.google.com/protocol-buffers/docs/reference/cpp-generated

in summary

non-repeated scalar fields support: Set(value), Clear() and Get()

non-repeated msg fields support: Mutable(), Clear() and Get()

repeated scalar fields support: Add(value), Set(index, value), Clear(), Get(index) and Size()

repeated msg fields support: Add(), Mutable(index), Clear(), Get(index) and Size()

@awalterschulze
Copy link
Member

Ok so is that where the msg Size() method gets into conflict with the repeated fields Size() ?
How about renaming that to Len()?
Have you already implemented this in dropbox/goprotoc?
Maybe I should just have a look over there?

@pattyshack
Copy link
Contributor Author

oops, github markdown trim out parts of the method names. Let's try this again. Suppose your field name is Foo, then

non-repeated scalar fields support: SetFoo(value), ClearFoo() and GetFoo()

non-repeated msg fields support: MutableFoo(), ClearFoo() and GetFoo()

repeated scalar fields support: AddFoo(value), SetFoo(index, value), ClearFoo(), GetFoo(index) and FooSize()

repeated msg fields support: AddFoo(), MutableFoo(index), ClearFoo(), GetFoo(index) and FooSize()

Note that the generated methods names do not conflict with Size.

Size is a valid field name in standard (non-go) protobuf. We needed to rename Size to ProtoSize because we have numerous definitions of the form

message FileInfo {
  optional string name = 1;
  optional int64 size = 2;
  ...
}

that are already in used in our c++ / python code

@awalterschulze
Copy link
Member

Aha ok that is good news.

So what about changing the fieldname in go only, to avoid the name conflict?

message FileInfo {
  optional string name = 1;
  optional int64 size = 2 [gogoproto.customname="filesize"];
}

@awalterschulze
Copy link
Member

So do you already have a working mutator API in dropbox/goprotoc?

@pattyshack
Copy link
Contributor Author

we want to avoid importing the gogo's proto since it doesn't behavior well with our python toolchain.

Andrei wrote a working mutator API for dropbox/goprotoc. Unlike gogoproto / go proto, fields generated by goprotoc are private (so it's not backward compatible). The primary motivation for the change was to reduce programming errors such as setting nil in repeated message field array, accidentally sharing variable pointers, etc (my coworkers and I have made these mistakes repeatedly). The secondary motivation was to improve proto's performance; for example, we can cache the computed size during serialization, reduce gc by properly inlining optional fields (gogo sort of support this, but we actually care about empty fields), etc.

(Note: I haven't had the chance to migrate dropbox internal to goprotoc.)

1 similar comment
@pattyshack
Copy link
Contributor Author

we want to avoid importing the gogo's proto since it doesn't behavior well with our python toolchain.

Andrei wrote a working mutator API for dropbox/goprotoc. Unlike gogoproto / go proto, fields generated by goprotoc are private (so it's not backward compatible). The primary motivation for the change was to reduce programming errors such as setting nil in repeated message field array, accidentally sharing variable pointers, etc (my coworkers and I have made these mistakes repeatedly). The secondary motivation was to improve proto's performance; for example, we can cache the computed size during serialization, reduce gc by properly inlining optional fields (gogo sort of support this, but we actually care about empty fields), etc.

(Note: I haven't had the chance to migrate dropbox internal to goprotoc.)

@awalterschulze
Copy link
Member

So your python tool chain can't handle imports or why its it hard for the python tool chain?

proto3 might make it hard for you to care about empty fields.
As I understand it, documentation pending, nullable=false fields will become the norm, except for structs.

I think that adding the mutator API support might be possible.
gogoprotobuf would do this with an extension and then you could use the above proposed config to make it the default.
There will be a couple of hurdles though.
For example the populate would have to know if fields are private or not.
I would like to see how it works together with all plugins and which don't work with it.

As I understand it, documentation pending, proto3 also removes extensions, which really scares me, but they are replacing it with something different called Any.
Maybe this will make it possible to have parse-able proto files which allow you to have switch features on and off.
They also say they are going to support files in proto2 and proto3 syntax.
So I wonder if they are going to update their descriptor.proto to not use extensions.

@pattyshack
Copy link
Contributor Author

our python tool chain doesn't support extensions (we're using an in-house c++ binding that's ~6-30x faster than standard python proto; I haven't have the time to verify extension working correctly yet).

I haven't been following proto3's progress. Are you referring to flatbuffers? I know for a fact that google (ads) uses empty fields in millions of places. I doubt they can actually migrate away from empty fields / proto2 syntax.

@awalterschulze
Copy link
Member

Ok but you don't need to marshal or unmarshal the extensions in the
descriptor with your fast python tool chain?

No not flatbuffers.
Please read this.
https://github.com/google/protobuf/releases/tag/v3.0.0-alpha-1

On 21 January 2015 at 09:47, Patrick notifications@github.com wrote:

our python tool chain doesn't support extensions (we're using an in-house
c++ binding that's ~6-30x faster than standard python proto; I haven't have
the time to verify extension working correctly yet).

I haven't been following proto3's progress. Are you referring to
flatbuffers? I know for a fact that google (ads) uses empty fields in
millions of places. I doubt they can actually migrate away from empty
fields / proto2 syntax.


Reply to this email directly or view it on GitHub
#39 (comment).

@awalterschulze
Copy link
Member

This could also be interesting
golang/protobuf@5677a0e
The plugin list is probably an extra param to proto3.

@awalterschulze
Copy link
Member

Any thoughts on proto3?

@pattyshack
Copy link
Contributor Author

TBH, it's unlikely we (dropbox) will switch over to proto3 any time soon. The ROI is very low. I know there's a lot of resistance to proto3 within google for the same reason. That's why there's a backward compatibility mode.

@awalterschulze
Copy link
Member

That is very interesting to hear.

Ok so I made a similar suggestion in the proto3 issue.
But what about a binary that simply edits the descriptors for you, adding extensions and customnames.
Enabling all your defaults.
And then does the same code generation as usual.
Then the only feature that is not in gogoprotobuf then is the setter API?

@awalterschulze
Copy link
Member

But then you don't have import gogoproto or use any extensions.

@awalterschulze
Copy link
Member

Hello?

@pattyshack
Copy link
Contributor Author

Sorry, your email got lost in my inbox (I'm drowning in work emails).

For clarification wrt your binary comment, are you referring to 1) a secondary binary (on top of the standard gogo compiler) which mutates the compiled proto files and add the extensions and customnames, or 2) a parallel binary to the standard gogo compiler which ables all of the customizations?

using 1) places extra burden on our developers (a lot of them are new to protos), which increases odds of messing up the compilation. We're basically doing 2) in house right now, so I wouldn't object to it.

@awalterschulze
Copy link
Member

Cool no problem :)

Yes I was thinking a parallel binary lets say protoc-gen-gogostd
Then devs would run protoc --gogostd_out=. filename.proto
No importing of gogo.proto, since this is only necessary for protoc to compile the extensions.
Standard extensions (which should be designed, or maybe you already have?) will be added by protoc-gen-gogostd and it will call the functions to do the normal code generation.

So I am pretty sure that is the same as number 2.

@anton-povarov
Copy link
Contributor

Very interested in this change as well. Would love to make gogo protofiles look as similar as possible to "regular" ones.

Do i understand correctly, that a user would be able to build a second protobuf generator plugin binary, with options of their choice already enabled by default. And therefore clean protofiles up ?

@awalterschulze awalterschulze changed the title customizing gogo proto Feature: Separate binary with default code generation behaviour for normal proto files. May 22, 2015
@awalterschulze
Copy link
Member

There will be a library to make to make this easier and gogoprotobuf would also provide a binary using that library.

The library functions would just walk through the fileDescriptorSet and annotate the structures with the current user specified extensions.

For example there would be a function AllNonNullable (which would add nullable=false to all fields), AllTheSpeed (which would add marshaler_all, unmarshaler_all and sizer_all to all files), etc.

@awalterschulze
Copy link
Member

I have created a new issue pertaining to the Size fieldname issue
#56

@awalterschulze awalterschulze changed the title Feature: Separate binary with default code generation behaviour for normal proto files. Work In Progress: Separate binary with default code generation behaviour for normal proto files. Jun 2, 2015
@awalterschulze
Copy link
Member

I have started some work on the vanity branch, here is the commit
aaa40f1
If someone would like to check this out before I merge this back into the proto3 branch that would be greatly appreciated.

@awalterschulze
Copy link
Member

I would also like to know from @PatrickDropbox if this vanity branch and the Size fieldname bug would resolve this issue?

@pattyshack
Copy link
Contributor Author

yeah, probably.

@awalterschulze
Copy link
Member

Please check out the size fieldname fix on the sizeunderscore branch.

@awalterschulze
Copy link
Member

merged.
If you have more issues please open another.
Thank you for your input it has been really valuable.

@anton-povarov
Copy link
Contributor

Is there documentation how to use this new delicious stuff ?

@awalterschulze
Copy link
Member

The Readme om the proto3 branch
https://github.com/gogo/protobuf/tree/proto3
shoud cover it.
If you want to build your own protoc-gen-vanity, then just check one of the binaries' source code.
I made it very self describing.

For golang/protobuf/proto import simply go to gogo.github.io/doc and Ctrl+f gogoproto_import

I hope that answers your question.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants