
protoCodec: return early if proto.Marshaler #1689

Merged
merged 2 commits into from
Dec 1, 2017

Conversation

muirdm
Contributor

@muirdm muirdm commented Nov 27, 2017

If the object to marshal implements proto.Marshaler, delegate to that
immediately instead of pre-allocating a buffer. (*proto.Buffer).Marshal
has the same logic, so the []byte buffer we pre-allocate in codec.go
would never be used.

This is mainly for users of gogoproto. If you turn on the "marshaler"
and "sizer" gogoproto options, the generated Marshal() method already
pre-allocates a buffer of the appropriate size for marshaling the
entire object.
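The fast path described above can be sketched as follows. This is a minimal, self-contained illustration, not the actual grpc-go code: the `marshaler` interface here is a local stand-in mirroring `proto.Marshaler`, and `selfMsg` is a hypothetical type standing in for a gogoproto-generated message with the "marshaler" option enabled.

```go
package main

import "fmt"

// marshaler mirrors proto.Marshaler: types that can marshal themselves.
// (Local stand-in so this sketch runs without the proto package.)
type marshaler interface {
	Marshal() ([]byte, error)
}

// fastMarshal sketches the PR's change: if the value marshals itself
// (e.g. gogoproto with the "marshaler" and "sizer" options), delegate
// immediately instead of pre-allocating a buffer that would never be used.
func fastMarshal(v interface{}) ([]byte, error) {
	if m, ok := v.(marshaler); ok {
		// The generated Marshal already sizes its own output buffer.
		return m.Marshal()
	}
	// Fallback: the real codec would use a cached proto.Buffer here.
	return nil, fmt.Errorf("no self-marshaler; would fall back to proto.Buffer")
}

// selfMsg is a toy stand-in for a generated message with its own Marshal.
type selfMsg struct{ payload string }

func (s *selfMsg) Marshal() ([]byte, error) { return []byte(s.payload), nil }

func main() {
	b, err := fastMarshal(&selfMsg{payload: "hello"})
	fmt.Println(string(b), err) // the fast path is taken; no codec buffer is allocated
}
```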

@thelinuxfoundation

Thank you for your pull request. Before we can look at your contribution, we need to ensure all contributors are covered by a Contributor License Agreement.

After the following items are addressed, please respond with a new comment here, and the automated system will re-verify.

Regards,
The Linux Foundation CLA GitHub bot

@muirdm
Contributor Author

muirdm commented Nov 27, 2017

FYI still working on signing the license.

@muirdm
Contributor Author

muirdm commented Nov 28, 2017

I think I signed the license.

@muirdm
Contributor Author

muirdm commented Nov 28, 2017

I associated my GH account, so maybe now?

@dfawley
Member

dfawley commented Nov 30, 2017

PR #1478 was attempting to do the same thing, but also for proto.Unmarshaler. Could you add that, too, and run our benchmarks before/after the change to make sure this does not hurt performance for non-self-marshaling proto types?

If the object to unmarshal implements proto.Unmarshaler, delegate to
that immediately. This saves a bit of work preparing the cached
proto.Buffer object, which would not end up being used in the
proto.Unmarshaler case.

Note that I moved the obj.Reset() call above the delegation to
obj.Unmarshal(). This maintains the grpc behavior of
proto.Unmarshalers always being Reset() before being delegated to,
which is consistent with how proto.Unmarshal() behaves (proto.Buffer
does not call Reset() in Unmarshal).
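The reset-before-delegation point above can be sketched as follows. This is a self-contained illustration, not the actual codec: `unmarshaler` and `resetter` are local stand-ins mirroring `proto.Unmarshaler` and the `Reset()` method of `proto.Message`, and `appendMsg` is a hypothetical message whose generated-style `Unmarshal` does not reset itself, showing why skipping `Reset()` would leak stale state.

```go
package main

import "fmt"

// unmarshaler mirrors proto.Unmarshaler (local stand-in).
type unmarshaler interface {
	Unmarshal(data []byte) error
}

// resetter captures the Reset() method every proto.Message has.
type resetter interface{ Reset() }

// fastUnmarshal sketches the PR's Unmarshal path: Reset first (some
// generated Unmarshal methods, e.g. gogoproto's, do not reset the
// receiver), then delegate if the type self-unmarshals.
func fastUnmarshal(data []byte, v interface{}) error {
	if r, ok := v.(resetter); ok {
		r.Reset() // preserve gRPC's reset-before-unmarshal behavior
	}
	if u, ok := v.(unmarshaler); ok {
		return u.Unmarshal(data)
	}
	return fmt.Errorf("no self-unmarshaler; would fall back to proto.Buffer")
}

// appendMsg appends on Unmarshal, so a missing Reset would leak old state.
type appendMsg struct{ fields []string }

func (m *appendMsg) Reset() { m.fields = nil }

func (m *appendMsg) Unmarshal(data []byte) error {
	m.fields = append(m.fields, string(data))
	return nil
}

func main() {
	m := &appendMsg{fields: []string{"stale"}}
	fastUnmarshal([]byte("fresh"), m)
	fmt.Println(m.fields) // [fresh] — the stale field was reset away
}
```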
@muirdm
Contributor Author

muirdm commented Dec 1, 2017

Thank you for looking. I have pushed the Unmarshaler change as requested. Below is benchmark comparison output. I used these benchmark options:

go run benchmark/benchmain/main.go -benchtime=10s -workloads=all   -compression=on -maxConcurrentCalls=1 -trace=off   -reqSizeBytes=1,1048576 -respSizeBytes=1,1048576 -networkMode=Local   -cpuProfile=after_cpuProf -memProfile=after_memProf -memProfileRate=10000 -resultFile=after
Stream-traceMode_false-latency_0s-kbps_0-MTU_0-maxConcurrentCalls_1-reqSize_1048576B-respSize_1048576B-Compressor_true
     Title       Before        After Percentage
  Bytes/op     16794264     16750595    -0.26%
 Allocs/op          196          197     0.51%
50 latency  13.695719ms  12.325266ms   -10.01%
90 latency  14.947742ms  14.132171ms    -5.46%

Unary-traceMode_false-latency_0s-kbps_0-MTU_0-maxConcurrentCalls_1-reqSize_1B-respSize_1B-Compressor_true
     Title       Before        After Percentage
  Bytes/op        13135        13135     0.00%
 Allocs/op          217          217     0.00%
50 latency     153.65µs     143.05µs    -6.90%
90 latency     203.31µs    198.995µs    -2.12%

Stream-traceMode_false-latency_0s-kbps_0-MTU_0-maxConcurrentCalls_1-reqSize_1B-respSize_1B-Compressor_true
     Title       Before        After Percentage
  Bytes/op       542367       597505    10.17%
 Allocs/op          124          126     1.61%
50 latency    215.157µs    221.049µs     2.74%
90 latency    564.593µs    583.952µs     3.43%

Unary-traceMode_false-latency_0s-kbps_0-MTU_0-maxConcurrentCalls_1-reqSize_1B-respSize_1048576B-Compressor_true
     Title       Before        After Percentage
  Bytes/op     10550046     10549798    -0.00%
 Allocs/op          638          621    -2.66%
50 latency   2.719907ms   2.760703ms     1.50%
90 latency   3.162787ms   3.237892ms     2.37%

Stream-traceMode_false-latency_0s-kbps_0-MTU_0-maxConcurrentCalls_1-reqSize_1B-respSize_1048576B-Compressor_true
     Title       Before        After Percentage
  Bytes/op      9139821      9137864    -0.02%
 Allocs/op          179          179     0.00%
50 latency   6.342682ms   6.413199ms     1.11%
90 latency   6.821812ms   7.127117ms     4.48%

Unary-traceMode_false-latency_0s-kbps_0-MTU_0-maxConcurrentCalls_1-reqSize_1048576B-respSize_1B-Compressor_true
     Title       Before        After Percentage
  Bytes/op     10549525     10549711     0.00%
 Allocs/op          634          651     2.68%
50 latency   2.656818ms   2.709885ms     2.00%
90 latency    3.15594ms   3.262535ms     3.38%

Stream-traceMode_false-latency_0s-kbps_0-MTU_0-maxConcurrentCalls_1-reqSize_1048576B-respSize_1B-Compressor_true
     Title       Before        After Percentage
  Bytes/op      9162913      9175603     0.14%
 Allocs/op          180          180     0.00%
50 latency   6.409991ms   6.404367ms    -0.09%
90 latency   7.028107ms   7.018095ms    -0.14%

Unary-traceMode_false-latency_0s-kbps_0-MTU_0-maxConcurrentCalls_1-reqSize_1048576B-respSize_1048576B-Compressor_true
     Title       Before        After Percentage
  Bytes/op     21095620     21094787    -0.00%
 Allocs/op         1061         1020    -3.86%
50 latency   5.375525ms   5.260222ms    -2.14%
90 latency   6.223487ms   5.913482ms    -4.98%

Member

@dfawley dfawley left a comment


One minor tweak, otherwise LGTM, thanks.

@@ -79,10 +84,17 @@ func (p protoCodec) Marshal(v interface{}) ([]byte, error) {
}

func (p protoCodec) Unmarshal(data []byte, v interface{}) error {
protoMsg := v.(proto.Message)
protoMsg.Reset()
Member


Per proto.Unmarshaler: "The method should reset the receiver before decoding starts."

So this should not be necessary above the call to Unmarshal. Also, buffer.Unmarshal says "Unlike proto.Unmarshal, this does not reset pb before starting to unmarshal."

Contributor Author


I looked at some of our Unmarshalers generated via gogoproto, and they do not reset themselves in their Unmarshal. I think it would be dangerous not to reset here. See gogo/protobuf#334 and golang/protobuf#424.

Member


See ... golang/protobuf#424.

That's...amazing...

OK, this LGTM then. Thanks!

Member

@dfawley dfawley left a comment


Thanks for the contribution!

@dfawley dfawley merged commit cd563b8 into grpc:master Dec 1, 2017
@dfawley dfawley added the Type: Performance Performance improvements (CPU, network, memory, etc) label Dec 1, 2017
@muirdm muirdm deleted the avoid-extra-marshaler-allocation branch December 2, 2017 00:02
@dfawley dfawley added this to the 1.9 Release milestone Jan 2, 2018
@lock lock bot locked as resolved and limited conversation to collaborators Jan 18, 2019