
Http2: Writes starving reads? #12320

Open
tommyulfsparre opened this issue Apr 20, 2022 · 4 comments

Comments

@tommyulfsparre

Expected behavior

When running the reproducer code, the client should complete the majority of RPCs without hitting the deadline (100 ms).

Actual behavior

With gRPC or ServiceTalk the reproducer fails most RPCs, while with OkHttp it does not. The commonality between gRPC and ServiceTalk is that both are built on Netty, although with different implementations.

This issue was initially raised in grpc/grpc-java#8912, which contains some more information. It was speculated that it might be due to writes starving reads; OkHttp uses separate dedicated threads for reading and writing, which would explain why it does not exhibit the same behavior here.
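
As a rough illustration of the writes-starving-reads hypothesis (this is not Netty or OkHttp code; the two Runnables are hypothetical stand-ins):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class StarvationSketch {
    public static void main(String[] args) {
        Runnable writeLargeRequestBody = () -> { /* hypothetical: flush many DATA frames */ };
        Runnable readResponseHeaders = () -> { /* hypothetical: process incoming HEADERS */ };

        // Single shared loop: the read task is queued behind the write task,
        // so the response may not be processed before a short (100 ms) deadline fires.
        ExecutorService sharedLoop = Executors.newSingleThreadExecutor();
        sharedLoop.submit(writeLargeRequestBody);
        sharedLoop.submit(readResponseHeaders);
        sharedLoop.shutdown();

        // Dedicated reader and writer threads (OkHttp-style): the read is not
        // queued behind the write, so it can complete independently.
        ExecutorService writer = Executors.newSingleThreadExecutor();
        ExecutorService reader = Executors.newSingleThreadExecutor();
        writer.submit(writeLargeRequestBody);
        reader.submit(readResponseHeaders);
        writer.shutdown();
        reader.shutdown();
    }
}
```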

Steps to reproduce

Run the reproducer with either the -Dcom.grpc.example.transport=netty or the -Dcom.grpc.example.transport=servicetalk system property set.

Minimal yet complete reproducer code (or URL to code)

https://github.com/tommyulfsparre/repro-netty-http2

Netty version

4.1.72.Final

JVM version (e.g. java -version)

Java 11

OS version (e.g. uname -a)


Does this sound plausible, and if so, is there anything that can be tweaked in Netty to alleviate this?

@normanmaurer
Member

Any idea, @ejona86?

@ejona86
Member

ejona86 commented Apr 28, 2022

I looked into it and confirmed the behavior. But I haven't determined why ioRatio isn't avoiding this problem.
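
For context, ioRatio controls what fraction of each NioEventLoop iteration is spent on I/O versus queued non-I/O tasks (the default is 50). A minimal sketch of raising it on the event loop group handed to the gRPC Netty transport is below; the target address is a placeholder, and per the comment above this knob does not appear to avoid the problem in the reproducer.

```java
import io.grpc.ManagedChannel;
import io.grpc.netty.NettyChannelBuilder;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.nio.NioSocketChannel;

public class IoRatioSketch {
    public static void main(String[] args) throws InterruptedException {
        // Spend ~80% of each event-loop iteration on I/O instead of the default 50%.
        NioEventLoopGroup workerGroup = new NioEventLoopGroup(1);
        workerGroup.setIoRatio(80);

        // Placeholder target; the reproducer's client setup will differ.
        ManagedChannel channel = NettyChannelBuilder.forAddress("localhost", 50051)
                .eventLoopGroup(workerGroup)
                .channelType(NioSocketChannel.class)
                .usePlaintext()
                .build();

        // ... run RPCs with a 100 ms deadline as in the reproducer ...

        channel.shutdown();
        workerGroup.shutdownGracefully().sync();
    }
}
```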

@franz1981
Contributor

@ejona86 the benchmark/test is strange (no warmup, just 1K input buffers) and I am not personally confident about how the protocol is supposed to work.
I remember that others, e.g. Jetty, have developed ad hoc scheduling of writes/reads to prevent HOL blocking, i.e. "Eat What You Kill", as named in a Jetty blog post (if that is the issue happening here; maybe not).
Can you confirm that, despite how this test is written, this is a real issue that can affect production-level usage?
@tommyulfsparre as well.
@vietj we are interested in this because of gRPC.

@tommyulfsparre
Author

Can you confirm that, despite how this test is written, this is a real issue that can affect production-level usage?

@franz1981 How would one know whether a timeout happens because of this (potential) issue or for any other reason that can lead to a timeout? Is there instrumentation available that one could hook into? Here is some additional context: grpc/grpc-java#8912 (comment).

The code merely tries to show that under certain fabricated conditions Netty seems to fare worse than another transport (OkHttp in this case). Whether this is something that actually happens in practice, or whether it is even a Netty issue, I don't know yet.
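
On the instrumentation question: one rough, hypothetical probe (not an existing Netty or gRPC hook) is to periodically schedule a no-op task on the client's event loop and measure how long it sits in the queue, as a proxy for the loop being tied up with writes:

```java
import io.netty.channel.EventLoop;
import io.netty.channel.nio.NioEventLoopGroup;

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class EventLoopLagProbe {
    public static void main(String[] args) throws InterruptedException {
        // In the reproducer this would be the group handed to the gRPC/ServiceTalk client.
        NioEventLoopGroup group = new NioEventLoopGroup(1);
        EventLoop loop = group.next();

        // Sample from a separate thread so the sampling itself is not delayed
        // by a busy event loop.
        ScheduledExecutorService sampler = Executors.newSingleThreadScheduledExecutor();
        sampler.scheduleAtFixedRate(() -> {
            long submittedNanos = System.nanoTime();
            loop.execute(() -> {
                long lagMicros = TimeUnit.NANOSECONDS.toMicros(System.nanoTime() - submittedNanos);
                if (lagMicros > 10_000) { // arbitrary threshold: 10 ms
                    System.out.println("event loop lag: " + lagMicros + " µs");
                }
            });
        }, 0, 100, TimeUnit.MILLISECONDS);

        Thread.sleep(1_000);
        sampler.shutdownNow();
        group.shutdownGracefully().sync();
    }
}
```

A sustained lag in the same order of magnitude as the RPC deadline would at least be consistent with the writes-starving-reads explanation, though it would not prove it.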
