Revert Series()/sync.Pool changes for now #4609

Merged · 2 commits · Aug 27, 2021

Conversation

GiedriusS
Member

Fixes #4595 by reverting these changes for now. grpc-go double-buffers messages, i.e. SendMsg() can return before the whole message is on the wire, hence it is impossible to share the buffers for now. The finalizer-based approach from http://www.golangdevops.com/2019/12/31/autopool/ has shown some promise, but with finalizers the CPU usage is even higher and the RAM usage is barely lower:

name                                                           old time/op    new time/op    delta
BucketSeries/1000000SeriesWith1Samples/1of1000000-16             95.0ms ± 8%    93.9ms ±15%     ~     (p=0.411 n=25+50)
BucketSeries/1000000SeriesWith1Samples/10of1000000-16            92.9ms ±11%    90.7ms ±13%     ~     (p=0.073 n=24+49)
BucketSeries/1000000SeriesWith1Samples/1000000of1000000-16        1.12s ± 2%     1.71s ± 6%  +52.97%  (p=0.000 n=23+48)
BucketSeries/100000SeriesWith100Samples/1of10000000-16           6.49ms ± 2%    6.16ms ± 6%   -5.01%  (p=0.000 n=21+50)
BucketSeries/100000SeriesWith100Samples/100of10000000-16         6.70ms ± 3%    6.01ms ± 1%  -10.27%  (p=0.000 n=24+45)
BucketSeries/100000SeriesWith100Samples/10000000of10000000-16     122ms ± 3%     175ms ± 8%  +43.59%  (p=0.000 n=25+40)

name                                                           old alloc/op   new alloc/op   delta
BucketSeries/1000000SeriesWith1Samples/1of1000000-16             62.0MB ± 0%    62.0MB ± 0%     ~     (p=0.093 n=21+50)
BucketSeries/1000000SeriesWith1Samples/10of1000000-16            62.1MB ± 0%    62.1MB ± 0%     ~     (p=0.368 n=25+50)
BucketSeries/1000000SeriesWith1Samples/1000000of1000000-16       1.37GB ± 0%    1.41GB ± 0%   +3.06%  (p=0.000 n=25+47)
BucketSeries/100000SeriesWith100Samples/1of10000000-16           4.85MB ± 0%    4.85MB ± 0%   -0.13%  (p=0.000 n=25+50)
BucketSeries/100000SeriesWith100Samples/100of10000000-16         4.86MB ± 0%    4.85MB ± 0%   -0.14%  (p=0.000 n=25+50)
BucketSeries/100000SeriesWith100Samples/10000000of10000000-16     158MB ± 2%     154MB ± 0%   -2.55%  (p=0.000 n=25+42)

name                                                           old allocs/op  new allocs/op  delta
BucketSeries/1000000SeriesWith1Samples/1of1000000-16              9.68k ± 0%     9.70k ± 1%   +0.14%  (p=0.007 n=24+50)
BucketSeries/1000000SeriesWith1Samples/10of1000000-16             9.79k ± 0%     9.82k ± 0%   +0.25%  (p=0.000 n=25+50)
BucketSeries/1000000SeriesWith1Samples/1000000of1000000-16        11.0M ± 0%     13.0M ± 0%  +18.11%  (p=0.000 n=25+49)
BucketSeries/100000SeriesWith100Samples/1of10000000-16            1.10k ± 0%     1.10k ± 0%     ~     (p=0.424 n=25+50)
BucketSeries/100000SeriesWith100Samples/100of10000000-16          1.14k ± 0%     1.14k ± 0%   +0.59%  (p=0.000 n=25+49)
BucketSeries/100000SeriesWith100Samples/10000000of10000000-16     1.10M ± 0%     1.30M ± 0%  +18.07%  (p=0.000 n=22+41)
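
For context, a minimal sketch of the finalizer-based pooling idea from the autopool post linked above (autoPool, newAutoPool and the buffer size are hypothetical names/values, not code from this PR):

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// autoPool sketches the finalizer-based idea: buffers are never returned
// explicitly; instead a finalizer puts each buffer back into the underlying
// sync.Pool once the GC proves nothing references it anymore. That avoids
// guessing when gRPC is done with the bytes, but the finalizer bookkeeping
// is what drives the extra CPU visible in the benchmarks above.
type autoPool struct {
	pool sync.Pool
}

func newAutoPool(bufSize int) *autoPool {
	p := &autoPool{}
	p.pool.New = func() interface{} {
		b := make([]byte, 0, bufSize)
		return &b
	}
	return p
}

// Get hands out a buffer and arranges for it to go back into the pool once
// it becomes unreachable.
func (p *autoPool) Get() *[]byte {
	b := p.pool.Get().(*[]byte)
	runtime.SetFinalizer(b, func(buf *[]byte) {
		*buf = (*buf)[:0]
		p.pool.Put(buf) // resurrect the buffer into the pool instead of freeing it
	})
	return b
}

func main() {
	p := newAutoPool(16 * 1024)
	buf := p.Get()
	*buf = append(*buf, "marshalled series response"...)
	fmt.Printf("%d bytes; the buffer returns to the pool only after a GC cycle\n", len(*buf))
}
```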

Relevant discussion: https://cloud-native.slack.com/archives/CL25937SP/p1629797444067700

What's interesting is that vtprotobuf's documentation says:

func (p *YourProto) MarshalToVT(data []byte) (int, error): this function can be used to marshal a message to an existing buffer. The buffer must be large enough to hold the marshalled message, otherwise this function will panic. It returns the number of bytes marshalled. This function is useful e.g. when using memory pooling to re-use serialization buffers.

But I'm not sure how it could be safe to do that when SendMsg() returns before putting the message on the wire: grpc/grpc-go#2159. So, another goroutine might write over a slice that is about to be sent :)
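
To make the race concrete, here is a minimal, self-contained sketch (all names hypothetical: message, sendMsg and the channel stand in for vtprotobuf-generated code and grpc-go's stream; this is not the reverted code):

```go
package main

import (
	"fmt"
	"sync"
)

// message stands in for a vtprotobuf-generated type; MarshalToVT mimics the
// documented signature (marshal into a caller-provided buffer, return the
// number of bytes written).
type message struct{ payload []byte }

func (m *message) MarshalToVT(data []byte) (int, error) {
	return copy(data, m.payload), nil
}

// sendMsg stands in for grpc.ServerStream.SendMsg with a raw-bytes codec:
// like the real SendMsg, it can return before the bytes hit the wire, because
// grpc-go queues the message and flushes it asynchronously.
func sendMsg(wire chan<- []byte, b []byte) error {
	wire <- b // queued, not yet written out
	return nil
}

var bufPool = sync.Pool{
	New: func() interface{} { b := make([]byte, 16*1024); return &b },
}

func send(wire chan<- []byte, m *message) error {
	buf := bufPool.Get().(*[]byte)
	n, err := m.MarshalToVT(*buf) // the buffer must already be large enough, per the docs quoted above
	if err != nil {
		return err
	}
	err = sendMsg(wire, (*buf)[:n])
	// UNSAFE: the message may still only be queued. Another goroutine can now
	// Get() this very buffer and overwrite bytes that were never flushed,
	// which is how the malformed Series() responses in #4595 show up.
	bufPool.Put(buf)
	return err
}

func main() {
	wire := make(chan []byte, 1)
	_ = send(wire, &message{payload: []byte("series data")})
	// By the time the "transport" reads the queue, the buffer is already
	// eligible for reuse.
	fmt.Printf("flushed %q, but its buffer is already back in the pool\n", <-wire)
}
```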

Thus, all of this needs quite a bit more research, so let's revert these changes for now. Still, all of this shows that there are huge performance gains to be made here.

This reverts commit 8b4c3c9.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
…hanos-io#4535)"

This reverts commit 7a8d189.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

@bwplotka (Member) left a comment

Yea I don't think this is an easy problem to solve, but we learned a lot from this so thank you!

I also wonder if the vitess lib actually has a solution to this. It's worth a try. I think resetting at the end of the gRPC call is not a bad solution - you at least pool between gRPC calls (: Let's try it in our free time. Let's make sure we have a clear GH issue too!

LGTM!
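
For illustration, a minimal sketch of the "reset at the end of the gRPC call" idea from the comment above (callBuffers is a hypothetical helper, not an existing API): buffers are pooled between calls, but within one call nothing goes back to the pool until the handler returns. Whether that is fully safe given grpc-go's asynchronous sending is part of the follow-up research mentioned above.

```go
package main

import "sync"

// callBuffers is a hypothetical helper: buffers are shared across gRPC calls
// through the pool, but within one call they are only released after the
// handler returns, never between individual SendMsg() invocations.
type callBuffers struct {
	pool *sync.Pool
	used []*[]byte
}

func (c *callBuffers) get() []byte {
	b := c.pool.Get().(*[]byte)
	c.used = append(c.used, b)
	return (*b)[:0]
}

// release is meant to be deferred at the top of the Series() handler so that
// every buffer handed out during the call goes back to the pool in one go.
func (c *callBuffers) release() {
	for _, b := range c.used {
		c.pool.Put(b)
	}
	c.used = nil
}

func main() {
	pool := &sync.Pool{New: func() interface{} { b := make([]byte, 0, 16*1024); return &b }}
	c := &callBuffers{pool: pool}
	defer c.release() // in a real handler this would be deferred inside Series()
	buf := c.get()
	_ = append(buf, "one marshalled response"...)
}
```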

Successfully merging this pull request may close these issues: Current main's Store Gateway gRPC Series returns malformed proto