
Non blocking server impl and recommended usage of Server.directExecutor() #5185

Closed
marcoferrer opened this issue Dec 19, 2018 · 3 comments

@marcoferrer
Contributor

What version of gRPC are you using?

1.15.1

I've been building a non-blocking implementation for Kotlin coroutines on top of grpc-java, similar to the reactive-grpc project (marcoferrer/kroto-plus#16 & Kotlin/kotlinx.coroutines#360).

I'm having a hard time finding the preferred / best practices for configuring a non-blocking server.

When configuring the executor for the server, some sources point to using a directExecutor while others recommend using a ForkJoinPool. Based on related discussions, it seems the former has the possibility of being unsafe?
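For readers unfamiliar with the two options, here is a minimal sketch of the configurations being compared. `BenchmarkServiceImpl` is a hypothetical service implementation standing in for whatever service you register; the `ServerBuilder` calls themselves (`executor(...)`, `directExecutor()`) are real grpc-java API.

```java
import java.util.concurrent.ForkJoinPool;
import io.grpc.Server;
import io.grpc.ServerBuilder;

// Option 1: dedicated application executor. Handler code runs on pool
// threads, so it may block without stalling the transport.
Server pooled = ServerBuilder.forPort(8000)
        .executor(new ForkJoinPool(4))          // parallelism = 4, as benchmarked below
        .addService(new BenchmarkServiceImpl()) // hypothetical service impl
        .build();

// Option 2: run handlers inline on the transport (Netty) threads.
// Lowest overhead, but every handler MUST be provably non-blocking.
Server direct = ServerBuilder.forPort(8000)
        .directExecutor()
        .addService(new BenchmarkServiceImpl())
        .build();
```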

I ran a benchmark on a coroutine-based port of the existing benchmark service.

The results were in favor of using directExecutor over a ForkJoinPool, but if directExecutor is considered unsafe I'd rather live with decreased QPS. Another reason I'm looking for clarification is that I'd like to be able to outline the recommended configurations in my project's documentation.

QPS Benchmarks

Unary Test Args:

./qps_client --address=127.0.0.1:8000 --channels=10 --server_payload=1 --client_payload=1

Server Executor: ForkJoinPool(parallelism = 4)

Channels:                       10
Outstanding RPCs per Channel:   10
Server Payload Size:            1
Client Payload Size:            1
50%ile Latency (in micros):     459
90%ile Latency (in micros):     1151
95%ile Latency (in micros):     2191
99%ile Latency (in micros):     11775
99.9%ile Latency (in micros):   124415
Maximum Latency (in micros):    260095
QPS:                            87355

Server Executor: directExecutor()

Channels:                       10
Outstanding RPCs per Channel:   10
Server Payload Size:            1
Client Payload Size:            1
50%ile Latency (in micros):     431
90%ile Latency (in micros):     1003
95%ile Latency (in micros):     1975
99%ile Latency (in micros):     8383
99.9%ile Latency (in micros):   39679
Maximum Latency (in micros):    148479
QPS:                            122236

Client Executor: directExecutor() & Server Executor: directExecutor()

Channels:                       10
Outstanding RPCs per Channel:   10
Server Payload Size:            1
Client Payload Size:            1
50%ile Latency (in micros):     397
90%ile Latency (in micros):     855
95%ile Latency (in micros):     1215
99%ile Latency (in micros):     2767
99.9%ile Latency (in micros):   7679
Maximum Latency (in micros):    89087
QPS:                            178784

Streaming Test Args:

./qps_client --address=127.0.0.1:8000 --channels=10 --server_payload=1 --client_payload=1 --streaming_rpcs=true

Server Executor: ForkJoinPool(parallelism = 4)

Channels:                       10
Outstanding RPCs per Channel:   10
Server Payload Size:            1
Client Payload Size:            1
50%ile Latency (in micros):     415
90%ile Latency (in micros):     883
95%ile Latency (in micros):     1359
99%ile Latency (in micros):     3295
99.9%ile Latency (in micros):   10751
Maximum Latency (in micros):    245759
QPS:                            174428

Server Executor: directExecutor()

Channels:                       10
Outstanding RPCs per Channel:   10
Server Payload Size:            1
Client Payload Size:            1
50%ile Latency (in micros):     321
90%ile Latency (in micros):     559
95%ile Latency (in micros):     779
99%ile Latency (in micros):     1831
99.9%ile Latency (in micros):   5567
Maximum Latency (in micros):    119295
QPS:                            250821

Client Executor: directExecutor() & Server Executor: directExecutor()

Channels:                       10
Outstanding RPCs per Channel:   10
Server Payload Size:            1
Client Payload Size:            1
50%ile Latency (in micros):     249
90%ile Latency (in micros):     373
95%ile Latency (in micros):     481
99%ile Latency (in micros):     1023
99.9%ile Latency (in micros):   2655
Maximum Latency (in micros):    125439
QPS:                            348325
@carl-mastrangelo
Copy link
Contributor

If you are really sure there is no blocking in your app code, and you know your app handlers finish quickly, directExecutor is okay. Let me cast doubt on your certainty:

  • If you make any function call that could result in the invocation of LockSupport.park, parkNanos, or another park variant, your code is blocking
  • If your code has synchronized blocks on more than one lock, your code is blocking
  • If you make calls outside of your own library from within a synchronized block, your code is blocking (and Effective Java admonishes you!)
  • If you make any uninterruptible syscalls (such as reading from a file, say your jar file for instance), your code is blocking.
  • If you call Thread.sleep or Thread.join, or those are in the transitive closure of code your application calls, your code is blocking.
  • If you call Object.wait, your code is blocking
  • If you call any of the synchronization primitives in java.util.concurrent.locks, your code is blocking
  • If you use volatile on a processor without native atomic support, your code is blocking
  • And more.
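The starvation hazard behind these points can be reproduced with nothing but the JDK. This sketch (my own illustration, not from the thread) simulates the transport event loop as a single-thread executor: a "handler" running inline on that thread waits on work that can only run on the same thread, so the work starves and the wait times out.

```java
import java.util.concurrent.*;

public class DirectExecutorStall {
    public static void main(String[] args) throws Exception {
        // Simulate gRPC's transport thread: a single thread that runs
        // server callbacks inline, as Server.directExecutor() would.
        ExecutorService eventLoop = Executors.newSingleThreadExecutor();

        Future<Boolean> handler = eventLoop.submit(() -> {
            // The handler blocks waiting for a task that can only run on
            // the very thread it is occupying -- this is the starvation case.
            Future<?> inner = eventLoop.submit(() -> { /* never runs while we wait */ });
            try {
                inner.get(200, TimeUnit.MILLISECONDS);
                return true;               // would mean no starvation occurred
            } catch (TimeoutException e) {
                return false;              // starved: the loop was busy running us
            }
        });

        System.out.println("inner task completed: " + handler.get());
        eventLoop.shutdownNow();
    }
}
```

With a real timeout instead of the 200 ms bound, this is exactly the deadlock case: the handler would wait forever for a task the occupied thread can never run.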

The thing is, unless you can prove there is no blocking (typically because your application code is trivial), it probably has some blocking you aren't aware of, which can lead to starvation. Note that I am using the concurrency definition of blocking, rather than the colloquial meaning, which is less well defined.

This is why we say it's dangerous: unless you have prior knowledge, it isn't safe. In the best case, one thread just stalls the others from making progress, increasing latency. In the worst case, it deadlocks.

@marcoferrer
Contributor Author

marcoferrer commented Dec 20, 2018

To an even better point: even if I could prove my certainty, it doesn't mean that consumers of my library will be diligent enough to ensure they don't block as well.

My current implementation dispatches the actual service method invocation to a separate executor (Dispatchers.Default). But everything leading up to that point, including running the interceptors, would still occur on the directExecutor thread. And that presents another chance for a user to introduce blocking.
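The handoff pattern described above can be sketched in plain Java (the thread's actual implementation uses Kotlin's Dispatchers.Default; this is an analogous JDK-only illustration): the callback that would run inline on the transport thread immediately schedules the real work on a worker pool and returns.

```java
import java.util.concurrent.CompletableFuture;

public class OffloadSketch {
    public static void main(String[] args) {
        // Pretend this callback is invoked inline on the transport thread,
        // which is what Server.directExecutor() gives you.
        Runnable onMessage = () -> {
            System.out.println("callback on " + Thread.currentThread().getName());
            // Hand the real work to a worker pool so the transport thread
            // is freed right away.
            CompletableFuture.runAsync(() ->
                    System.out.println("heavy work runs elsewhere"))
                    .join(); // join() only so this demo waits for the output;
                             // a real handler would return without waiting
        };
        onMessage.run();
    }
}
```

Note this only protects the work you explicitly offload; as the comment above says, interceptors still execute before the handoff happens.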

Thanks @carl-mastrangelo for all your input!

@carl-mastrangelo
Contributor

SG @marcoferrer. One other note for future readers: class loading always involves locks, and has bitten me more than once. It's surreptitious because it's invisible in the source code, but all over the place in the JVM.
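The invisibility of class loading is easy to demonstrate (my own example, not from the thread): the first touch of a class triggers its static initialization, which the JVM performs under an internal initialization lock, with no hint of this at the call site.

```java
public class ClassLoadDemo {
    static class Lazy {
        // Runs on first use of Lazy, under the JVM's class-init lock.
        // Nothing at the use site suggests this work (or locking) happens.
        static { System.out.println("static init on " + Thread.currentThread().getName()); }
        static int value = 42;
    }

    public static void main(String[] args) {
        System.out.println("before first touch");
        System.out.println(Lazy.value); // first touch: init runs here, lazily
    }
}
```

On a direct executor, that init (which may also read class bytes from the jar) would run on the transport thread the first time any request path touches an as-yet-unloaded class.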

@lock lock bot locked as resolved and limited conversation to collaborators Mar 20, 2019