Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

grpcwebproxy: "panic: concurrent write to websocket connection" #713

Closed
mgkeeley opened this issue Jun 24, 2020 · 11 comments · Fixed by #815
Closed

grpcwebproxy: "panic: concurrent write to websocket connection" #713

mgkeeley opened this issue Jun 24, 2020 · 11 comments · Fixed by #815

Comments

@mgkeeley
Copy link
Contributor

I get the following error when running the grpxwebproxy service:

panic: concurrent write to websocket connection

goroutine 7407 [running]:
github.com/improbable-eng/grpc-web/vendor/github.com/gorilla/websocket.(*messageWriter).flushFrame(0xc00060beb0, 0xeec501, 0xeec5e0, 0x0, 0x0, 0x13da78305d1bc0, 0xc00060beb8)
	/Users/jonnyreeves/Projects/grpc-web/src/github.com/improbable-eng/grpc-web/vendor/github.com/gorilla/websocket/conn.go:591 +0x6f6
github.com/improbable-eng/grpc-web/vendor/github.com/gorilla/websocket.(*Conn).WriteMessage(0xc0004a8dc0, 0x9, 0xeec5e0, 0x0, 0x0, 0x0, 0x0)
	/Users/jonnyreeves/Projects/grpc-web/src/github.com/improbable-eng/grpc-web/vendor/github.com/gorilla/websocket/conn.go:752 +0x24c
github.com/improbable-eng/grpc-web/go/grpcweb.(*webSocketResponseWriter).ping(0xc000387920, 0xc0000cb800)
	/Users/jonnyreeves/Projects/grpc-web/src/github.com/improbable-eng/grpc-web/go/grpcweb/websocket_wrapper.go:60 +0xbd
created by github.com/improbable-eng/grpc-web/go/grpcweb.(*webSocketResponseWriter).enablePing
	/Users/jonnyreeves/Projects/grpc-web/src/github.com/improbable-eng/grpc-web/go/grpcweb/websocket_wrapper.go:46 +0xea

Versions of relevant software used
grpcwebproxy-v0.12.0-linux-x86_64
grpcwebproxy-v0.12.0-win64.exe

What happened
Very intermittently, the proxy process crashes with the message
panic: concurrent write to websocket connection

What you expected to happen
No crash :-)

How to reproduce it (as minimally and precisely as possible):
I can't reproduce this reliably, it is intermittent on our system.

Anything else we need to know
According to https://godoc.org/github.com/gorilla/websocket#hdr-Concurrency this is because there must only be one writer at a time.

My guess is that the ping happened at exactly the same time as some other message.

We use both linux and windows pre-built versions of the proxies, I saw this error on linux.

@mgkeeley
Copy link
Contributor Author

The command line is

grpcwebproxy-v0.12.0-linux-x86_64 --allow_all_origins --backend_addr=localhost:5004 --use_websockets --websocket_ping_interval=30s --backend_tls=false --server_http_debug_port=5007 --run_tls_server=false

@johanbrandhorst
Copy link
Contributor

Thanks for the error report, your assessment sounds about right. It would be cool to have a test that exercises the websocket writer concurrently to see if we could reproduce it. In any case, I don't know if any of the current maintainers have the bandwidth to pick this up, and the websocket transport is experimental. If you'd like to try to tackle this, I'd be happy to review a PR.

@mgkeeley
Copy link
Contributor Author

I'm not a go developer, so hopefully one of the maintainers can look at this. If I get some time this weekend I could probably create a test setup that triggers the issue.

@jsign
Copy link

jsign commented Sep 25, 2020

Looks like the spawned goroutine responsible for pings is doing unguarded WriteMessages compared to other exported APIs.

I'm not entirely sure if these exported APIs, such as Write, have their own serialization mechanism to avoid concurrency since I don't see a lock at webSocketResponseWriter scope. Depending on the case, the solution might be a bit different maybe?

@pcj
Copy link

pcj commented Dec 10, 2020

I'm seeing the same problem. I think writes to the websocket should be protected by a mutex lock. I can create a PR, but a proper test for this may be tricky.

2020/12/10 18:42:21 http: panic serving 10.8.0.7:60540: concurrent write to websocket connection

net/http.(*conn).serve.func1(0xc0001ae000)
	GOROOT/src/net/http/server.go:1772 +0x139
panic(0xec2ee0, 0x1291a80)
	GOROOT/src/runtime/panic.go:975 +0x3e3
github.com/gorilla/websocket.(*messageWriter).flushFrame(0xc0004535d8, 0xc000474301, 0xc000474367, 0x0, 0x0, 0x0, 0x1)
	external/com_github_gorilla_websocket/conn.go:597 +0x5e9
github.com/gorilla/websocket.(*Conn).WriteMessage(0xc000352160, 0x2, 0xc000474367, 0x5, 0x5, 0x0, 0x0)
	external/com_github_gorilla_websocket/conn.go:750 +0x238
github.com/improbable-eng/grpc-web/go/grpcweb.(*webSocketResponseWriter).writeHeaderFrame(0xc000e05830, 0xc0006da870)
	external/com_github_improbable_eng_grpc_web/go/grpcweb/websocket_wrapper.go:84 +0xc9
github.com/improbable-eng/grpc-web/go/grpcweb.(*webSocketResponseWriter).FlushTrailers(0xc000e05830)
	external/com_github_improbable_eng_grpc_web/go/grpcweb/websocket_wrapper.go:118 +0x39
github.com/improbable-eng/grpc-web/go/grpcweb.(*webSocketWrappedReader).Close(0xc000da6240, 0x1140a38, 0xc00124c780)
	external/com_github_improbable_eng_grpc_web/go/grpcweb/websocket_wrapper.go:134 +0x2f
google.golang.org/grpc/internal/transport.(*serverHandlerTransport).HandleStreams(0xc000cd2510, 0xc000e05d10, 0x1140848)
	external/org_golang_google_grpc/internal/transport/handler_server.go:392 +0x54f
google.golang.org/grpc.(*Server).serveStreams(0xc000a70600, 0x12c3620, 0xc000cd2510)
	external/org_golang_google_grpc/server.go:718 +0xdb
google.golang.org/grpc.(*Server).ServeHTTP(0xc000a70600, 0x12b6ce0, 0xc000e05830, 0xc0000a0b00)
	external/org_golang_google_grpc/server.go:770 +0x114
github.com/improbable-eng/grpc-web/go/grpcweb.(*WrappedGrpcServer).handleWebSocket(0xc0012ab860, 0xc000352160, 0xc0000a0600, 0xc000e05620)
	external/com_github_improbable_eng_grpc_web/go/grpcweb/wrapper.go:192 +0x58e
github.com/improbable-eng/grpc-web/go/grpcweb.(*WrappedGrpcServer).HandleGrpcWebsocketRequest(0xc0012ab860, 0x12ba020, 0xc0003fe380, 0xc0000a0600)
	external/com_github_improbable_eng_grpc_web/go/grpcweb/wrapper.go:149 +0x17f
github.com/improbable-eng/grpc-web/go/grpcweb.(*WrappedGrpcServer).ServeHTTP(0xc0012ab860, 0x12ba020, 0xc0003fe380, 0xc0000a0600)
	external/com_github_improbable_eng_grpc_web/go/grpcweb/wrapper.go:94 +0xfb

@johanbrandhorst
Copy link
Contributor

I wonder if we can avoid writing in two goroutines and have the spawned goroutine communicate back instead? Introducing a lock seems messy.

@pcj
Copy link

pcj commented Dec 10, 2020

I'm not familiar with the code other than looking at the spot where the write failed, so I guess a suggestion on a technical fix may be premature. I'll try to dig deeper but any suggestions are definitely welcome @johanbrandhorst!

@johanbrandhorst
Copy link
Contributor

Yeah, I don't have much context on this, unfortunately, it's been a while, but it would be great to understand exactly where these two writers exist, and if we can streamline writes into a single goroutine instead of introducing a lock.

@pcj
Copy link

pcj commented Dec 10, 2020

May be some prior art in https://github.com/sacOO7/gowebsocket. Edit: using locks so maybe not the best example for concurrency control with gorilla websocket.

https://github.com/gorilla/websocket/tree/master/examples/chat is probably a better example.

@mgkeeley
Copy link
Contributor Author

I had a look at fixing this with a mutex; go mutexes are very fast when there are no locks on the mutex (which should be almost all the time, as it's only the keepalive ping that happens concurrently with other writes on the socket).
But I had trouble modifying the test framework to test websockets, currently there are no tests that exercise this situation (using grpcwebproxy to forward requests to a grpc backend). I tried modifying the test framework but got a bit lost, as I'm not familiar with go or typescript!
And then, due to an Series of Unfortunate Events, I lost the changes I made to introduce the mutex. :-(

@pcj
Copy link

pcj commented Dec 11, 2020

I put together a draft PR that refactors the websocket handler using the overall design of the gorilla mux chat example (uses channels as @johanbrandhorst suggested). Still a WIP.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants