Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

race when using codec with grpc #205

Open
chocnutfriday opened this issue Sep 30, 2016 · 33 comments
Open

race when using codec with grpc #205

chocnutfriday opened this issue Sep 30, 2016 · 33 comments

Comments

@chocnutfriday
Copy link

grpc expects marshalers to be thread-safe. This is not true in gogo protobuf currently. In most cases this is best, soI am not suggesting that the default behavior changes. However, it would be nice if an option, such as
"option (gogoproto.concurrent_marshaler_all) = true"
could be used to enforce thread-safety in marshalers.

I have created a small example that demonstrates the issue:
https://gist.github.com/chocnutfriday/a864630a5bcbafd5c15ae634ccc776d3
Run it with "go test -race"

@awalterschulze awalterschulze changed the title Data Race when using grpc race when using codec with grpc Sep 30, 2016
@awalterschulze
Copy link
Member

On 30 September 2016 at 12:38, chocnutfriday notifications@github.com
wrote:

grpc expects marshalers to be thread-safe. This is not true in gogo
protobuf currently. In most cases this is best, soI am not suggesting that
the default behavior changes. However, it would be nice if an option, such
as
"option (gogoproto.concurrent_marshaler_all) = true"
could be used to enforce thread-safety in marshalers.

What would this generated marshaler look like?

The generated Marshal method is already thread safe.
And the generated MarshalTo method is as thread safe as the byte buffer you
are passing in.

I have created a small example that demonstrates the issue:
https://gist.github.com/chocnutfriday/a864630a5bcbafd5c15ae634ccc776d3
Run it with "go test -race"

Also this test passes for me.
And also passes for me with the -race flag.


You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
#205, or mute the thread
https://github.com/notifications/unsubscribe-auth/ABvsLfQ5uURUxcKSlP_g031Keg71Co6iks5qvOaegaJpZM4KK9B6
.

@chocnutfriday
Copy link
Author

I'm not sure why the race isn't happening for you. Did you just run the generate script and go test? I just checked after "go get"ing the latest gogo/protobuf and grpcs.
Is there a way to specify what type of buffer to use for the marshalTo method? This is where the race happens. By default GRPC uses a byte slice as well as MarshalTo, which isn't safe for the server which handles concurrent requests. For performance reasons it would be nice if the marshaler did not make a new buffer each time, but used a buffer pool or something safe.

@chocnutfriday
Copy link
Author

Gist now updated with example output from the race detector as well as the generated pbs.

@awalterschulze
Copy link
Member

I think you need to send me the new gist link.
Also I don't see you using the codec that might also be a problem.

@chocnutfriday
Copy link
Author

My bad. I forgot to hit the update button or something yesterday. Maybe it will make more sense after you looked at the code that is generated as well as the output.

@awalterschulze
Copy link
Member

Ok its updated, but I still don't see where you are importing codec?

@chocnutfriday
Copy link
Author

Well, in the generated pb protobuf/proto is imported. Proto.MarshalTo is called.
I quote from the race detector output:
WARNING: DATA RACE
Write at 0x00c4202781ce by goroutine 80:
concurrentgrpc.(_Payload).MarshalTo()
gopath/concurrentgrpc/data.pb.go:192 +0x6d
github.com/gogo/protobuf/codec.(_codec).Marshal()
gopath/github.com/gogo/protobuf/codec/codec.go:80 +0x15a
google.golang.org/grpc.encode()
gopath/google.golang.org/grpc/rpc_util.go:264 +0x495
google.golang.org/grpc.(_Server).sendResponse()
gopath/google.golang.org/grpc/server.go:513 +0xce
google.golang.org/grpc.(_Server).processUnaryRPC()
gopath/google.golang.org/grpc/server.go:638 +0x11e4
google.golang.org/grpc.(_Server).handleStream()
gopath/google.golang.org/grpc/server.go:770 +0xf9c
google.golang.org/grpc.(_Server).serveStreams.func1.1()
gopath/google.golang.org/grpc/server.go:423 +0xb8

Previous write at 0x00c4202781ce by goroutine 78:
concurrentgrpc.(_Payload).MarshalTo()
gopath/concurrentgrpc/data.pb.go:192 +0x6d
github.com/gogo/protobuf/codec.(_codec).Marshal()
gopath/github.com/gogo/protobuf/codec/codec.go:80 +0x15a
google.golang.org/grpc.encode()
gopath/google.golang.org/grpc/rpc_util.go:264 +0x495
google.golang.org/grpc.(_Server).sendResponse()
gopath/google.golang.org/grpc/server.go:513 +0xce
google.golang.org/grpc.(_Server).processUnaryRPC()
gopath/google.golang.org/grpc/server.go:638 +0x11e4
google.golang.org/grpc.(_Server).handleStream()
gopath/google.golang.org/grpc/server.go:770 +0xf9c
google.golang.org/grpc.(_Server).serveStreams.func1.1()
gopath/google.golang.org/grpc/server.go:423 +0xb8

@awalterschulze
Copy link
Member

But can you explain to me how grpc/rpc_util is able to call gogo/protobuf/codec without you passing it to the grpc server?

@chocnutfriday
Copy link
Author

From the grpc source:
// protoCodec is a Codec implementation with protobuf. It is the default codec for gRPC.
type protoCodec struct{}

func (protoCodec) Marshal(v interface{}) ([]byte, error) {
return proto.Marshal(v.(proto.Message))
}

@awalterschulze
Copy link
Member

Ok and now. How does it get to the github.com/gogo/protobuf/codec ?

On 6 October 2016 at 13:54, chocnutfriday notifications@github.com wrote:

From the grpc source:
// protoCodec is a Codec implementation with protobuf. It is the default
codec for gRPC.
type protoCodec struct{}

func (protoCodec) Marshal(v interface{}) ([]byte, error) {
return proto.Marshal(v.(proto.Message))
}


You are receiving this because you commented.
Reply to this email directly, view it on GitHub
#205 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/ABvsLQS4WYc_MvU4LllZV5OPNg5Pw7aPks5qxOFngaJpZM4KK9B6
.

@chocnutfriday
Copy link
Author

gogo protobuf satisfies the grpc codec interface and it is used by default.
From the grpc docs:
type Codec interface {
// Marshal returns the wire format of v.
Marshal(v interface{}) ([]byte, error)
// Unmarshal parses the wire format into v.
Unmarshal(data []byte, v interface{}) error
// String returns the name of the Codec implementation. The returned
// string will be used as part of content type in transmission.
String() string
}
Supplying a custom codec to grpc that uses a buffer pool made the race go away.

@awalterschulze
Copy link
Member

How is it used by default? where?
It probably needs to be injected somewhere, no?

@chocnutfriday
Copy link
Author

Yes, it is injected somewhere. Magic of auto-generated code. It shows up on the race-detector in most of my grpc projects. The fact of the matter is that when something injects is into something that expects a thread-safe buffer it's not going to happen, causing a race.

@awalterschulze
Copy link
Member

generated code can be read. Code is not magic. Where is it begin injected?

@chocnutfriday
Copy link
Author

Yes, you are completely right. I couldn't find where the use of codec was generated, because it isn't generated. It is done in the grpc server itself. Which makes me wonder how the race isn't a problem for more people.

in
gopath/google.golang.org/grpc/server.go
we have:
"github.com/gogo/protobuf/codec"
being imported

at line 196:
if opts.codec == nil {
// Set the default codec.
opts.codec = codec.New(0)
}

Well I'm surprised.

@awalterschulze
Copy link
Member

Well done, see no magic :)
I "guess" "someone" must have edited your git subtree to only import github.com/gogo/protobuf and removed all imports of github.com/golang/protobuf. This was "probably" done because "you" didn't want to add the extra unneeded dependency and possibly also to avoid confusion about which protobuf library to use.
Ek sou vir jou vroeer hiervan vertel het as ek onthou het, jammer.

Now you will need to edit your gist to always use the github.com/gogo/protobuf/codec . Please see the grpc-go Server and Client Options. Their should be some option called WithCodec or something like that.

@chocnutfriday
Copy link
Author

The gist has been updated to use the protobuf codec as "custom codec".

@awalterschulze
Copy link
Member

Looks like it has only been updated on the server and not the client. But might be a good idea to see if that alone reproduces the problem. I'll have a look as soon as I can, but I am quite busy at the moment with moving. So expect some delays.

@chocnutfriday
Copy link
Author

Yes, the output from the race detector shows that the race happens for the server. I think it is because new requests can arrive while other requests are being processed. The client code in this case is quite thread-safe, so there's no need for a thread-safe codec.

Thank you.

@awalterschulze
Copy link
Member

Thats also good to know.
Thank you :)

@awalterschulze
Copy link
Member

Its really weird, but when I clone your gist I can really weird file names.

concurrentgrpc\data.pb.go
concurrentgrpc\data.proto
concurrentgrpc\exampleOutput
concurrentgrpc\genpb.sh

Maybe its easier if you remove the folder.

Also please change the proto_path to just assume that you are in ./src/somefolder and that gogoprotobuf and all deps are already cloned.

That way someone can just check it out and start fiddling.

@awalterschulze
Copy link
Member

hello?

@chocnutfriday
Copy link
Author

Sorry for not being active on this. I have stopped working on this issue for now. The default PB marshaler is good enough for its purpose and this is what GRPC uses by default. I could not come up with a scheme for a codec that is efficient, fast and thread-safe. Though the current proto/codec is not safe for concurrent use, it is efficient, fast and good for reuse. If you can think of a way to get similar performance in a thread-safe way let me know. I tried using sync.Pool, but gave up on that since I had to choose between doing a copy or risk a race when returning the buffer coupled with a defered insert into the pool.

@awalterschulze
Copy link
Member

All that work can be very useful in making an argument to grpc-go.
If you tidy up your gists and post the alternative gist that shows the slow down with sync.Pool that will be amazing.
Since the only way to make the codec thread safe and as fast as it is now, is to get buy in from grpc-go.
Are you maybe willing to do this?

@chocnutfriday
Copy link
Author

I don't mind doing this. I just want to get clarity about what a good solution would be.

Parallel encoding is currently done by sharing a single codec. My thoughts are that, for go, it would be better to have a pool of codec objects that never gets accessed concurrently. Given the cost of assuring synchronisation having a single shared codec isn't worth it currently. At least, this is true for go, but not for all languages. C++ has arena to get around the safety issue, for example. In other words, this default behavior of sharing a codec isn't always bad and the implications of changing this assumption extends beyond go.

So before I raise the issue let me know what you would suggest as I'm at a loss.
To post benchmarks for a codec that is not supported by them will probably be ignored. Asking for the behavior of GRPC to change is probably not going to be accepted due to other languages and there being a working codec all be it slower. So the only real alternative is to create a thread-safe codec that is faster or settle for using the default pb marshaler.

@awalterschulze
Copy link
Member

We are not challenging grpc just the grpc-go library.
So even though change is unlikely we should still at least try.
Since if they decide to have a pool of codec objects (and they support custom implementations of codecs, its there interface that exposes WithCodec) it will be the fastest.
Only if this comes down to a definite no can we give up on this solution.

I don't mind raising the issue with them, if you don't want to, but I need data and code to support this and as you know this takes time to make.

We need three things:

  1. One that exposes the race condition. This you have, but you need to tidy up. It needs to be as simple as checking it out in the right place and running some make option. With no absolute paths. It should also be easy to modify and rerun. Just like any normal github project, just smaller.

  2. Next we need an implementation that tries to use sync.Pool inside the codec, like you have said you have already tried.

  3. Finally we need benchmark numbers for these two implemenations.

Then in the issue (you or I) can link to the gist(s) that you made you simply argue the point that asks them to implement codec pooling.

@chocnutfriday
Copy link
Author

Ok, I'll work on this. I'll let you know when I've updated the gist.

@steeve
Copy link

steeve commented Jan 23, 2017

Jumping on the bandwagon here.

gRPC expect the Codec object to be long lived, as such, the current implementation at https://github.com/gogo/protobuf/blob/master/codec/codec.go is not safe since it keeps the resulting []byte as state.

As for using a pool, I did some benchmark using a sync.Pool of []byte and using MarshalTo.

Here are the results, first one is simple proto.Marshal, second is using a pool:

BenchmarkProtoBuffer-4    	10000000	       268 ns/op	      96 B/op	       4 allocs/op
BenchmarkProtoBufferPool-4   	 5000000	       275 ns/op	      64 B/op	       3 allocs/op

The performance advantage of the pool, in our case, is lost doing the type assertion and calling Size().

type marshalerTo interface {
	Size() int
	MarshalTo(data []byte) (n int, err error)
}

In the end, we are using this very simple codec:

type GogoCodec struct {
	grpc.Codec
}

func (g *GogoCodec) Marshal(v interface{}) ([]byte, error) {
	return proto.Marshal(v.(proto.Message))
}

func (g *GogoCodec) Unmarshal(data []byte, v interface{}) error {
	return proto.Unmarshal(data, v.(proto.Message))
}

func (g *GogoCodec) String() string {
	return "proto"
}

@awalterschulze
Copy link
Member

Thank you for the data. Can you also provide the protobuf you used, since it looks like it could be a small one?

@steeve
Copy link

steeve commented Jan 24, 2017

Sure, here goes (proto3), compiled with marshaler and sizer:

message Sample {
    string str1 = 1;
    string str2 = 2;
    string str3 = 3;
    repeated string str_array1 = 4;
    repeated string str_array2 = 5;
}

@steeve
Copy link

steeve commented Jan 24, 2017

All the fields were set, and the arrays to one item only

@awalterschulze
Copy link
Member

awalterschulze commented Jan 24, 2017 via email

@steeve
Copy link

steeve commented Jan 25, 2019

Going back on this, is there a reason for the gogo grpc codec to keep its bytes as state, thus suffering from a data race when concurrent calls to Marshal are done ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants