-
Notifications
You must be signed in to change notification settings - Fork 1.4k
feat(bigquery/storage/managedwriter): improve protobuf support #4589
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
After some ongoing discussions with the Storage API team, this PR improves support for proto2/proto3 syntax in protocol buffer code. * Updates testdata so that we have a proto2 and proto3 form of our SimpleMessage data. * Add reference schemas to testdata. * Updates proto conversion code in the adapt package so it's creating proto2 messages by default. The code paths for doing proto3 conversions are present, but not exported yet as the storage API doesn't properly handle proto3 expectations for conversion. Namely, conversion doesn't properly account for default values and use of wrapper types. * Adds benchmarks for dynamic schema generation and static serialization, to aid in some internal discussions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Good to see that proto3 isn't all that much slower than proto2.
{Name: "four", Value: 1}, | ||
{Name: "five", Value: 2}, | ||
var testSimpleData = []*testdata.SimpleMessageProto2{ | ||
{Name: proto.String("one"), Value: proto.Int64(1)}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need these proto.String
and proto.Int64
wrapper types in proto2?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The structs take pointers to strings rather than strings directly, the proto.String, proto.Int64 etc funcs are just wrappers to take a value and return a pointer to the value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just one small suggestion.
After some ongoing discussions with the Storage API team, this PR
improves support for proto2/proto3 syntax in protocol buffer code.
Updates testdata so that we have a proto2 and proto3 form of
our SimpleMessage data.
Add reference schemas to testdata.
Updates proto conversion code in the adapt package so it's creating
proto2 messages by default, but supports generation of proto3 style
descriptors.
Adds benchmarks for dynamic schema generation and static
serialization, to aid in some internal discussions. The GithubArchive schema
has all fields fully nullable, so it should provide a good comparison when having
to represent data using well known wrapper types.
Towards: #4366
Benchmark results thus far: