BinderTransport flow control starves unlucky streams #8834

Closed
jdcormie opened this issue Jan 14, 2022 · 0 comments · Fixed by #8835
What version of gRPC-Java are you using?

head

What is your environment?

Android/Linux

Steps to reproduce the bug

For 60 seconds, run a gRPC client that maintains 10 concurrently active unary RPCs against a service that immediately responds with a 1 MB payload.
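For concreteness, a minimal sketch of this load pattern is below. The `EchoGrpc` stub, `echo` method, and `EchoRequest`/`EchoResponse` messages are hypothetical stand-ins for any generated service whose unary method returns a ~1 MB response, and the 2-second deadline is an arbitrary choice; only the concurrency pattern (10 calls kept in flight, each re-issued as soon as it completes) matters.

```java
import io.grpc.ManagedChannel;
import io.grpc.stub.StreamObserver;
import java.util.concurrent.TimeUnit;

final class StarvationRepro {
  private static final int CONCURRENCY = 10;
  private static final long DURATION_NANOS = TimeUnit.SECONDS.toNanos(60);

  static void run(ManagedChannel channel) {
    EchoGrpc.EchoStub stub = EchoGrpc.newStub(channel);  // hypothetical generated stub
    long stopAtNanos = System.nanoTime() + DURATION_NANOS;
    for (int i = 0; i < CONCURRENCY; i++) {
      issueCall(stub, stopAtNanos);
    }
  }

  private static void issueCall(EchoGrpc.EchoStub stub, long stopAtNanos) {
    if (System.nanoTime() >= stopAtNanos) {
      return;
    }
    stub.withDeadlineAfter(2, TimeUnit.SECONDS)  // arbitrary per-call deadline
        .echo(
            EchoRequest.getDefaultInstance(),  // hypothetical request message
            new StreamObserver<EchoResponse>() {
              @Override public void onNext(EchoResponse response) {}

              @Override public void onError(Throwable t) {
                // In the reported run, ~1% of calls end up here with
                // DEADLINE_EXCEEDED or CANCELLED.
                issueCall(stub, stopAtNanos);
              }

              @Override public void onCompleted() {
                // Immediately replace the finished call to keep 10 in flight.
                issueCall(stub, stopAtNanos);
              }
            });
  }
}
```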

What did you expect to see?

No errors and a normal distribution of response latencies.

What did you see instead?

- ~1% of requests fail with either DEADLINE_EXCEEDED or CANCELLED
- p99 latency of ~3 seconds, compared to a p50 of only 90 ms

I believe the long-tail latency and timeouts are caused by some call ids always hashing to the end of BinderTransport's ongoingCalls container. Every time space opens up in the flow-control window, call ids that appear early in the iteration order gobble it all up. By the time iteration reaches the end of ongoingCalls, flowController.isTransmitWindowFull() is returning true again and Outbound.send() just returns without making any progress.
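A simplified sketch of that starvation pattern is below. This is not the actual BinderTransport source: the FlowController and Outbound classes here are illustrative stand-ins for the real ones, and the window and chunk sizes are arbitrary. The point is only that iteration order over ongoingCalls is fixed by call-id hashing rather than by how long a call has waited, so calls that hash toward the end repeatedly find the window already exhausted.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class StarvationSketch {
  // Stand-in for the real flow controller; tracks the shared transmit window.
  static final class FlowController {
    private int transmitWindowBytes;

    boolean isTransmitWindowFull() { return transmitWindowBytes <= 0; }
    void consume(int bytes) { transmitWindowBytes -= bytes; }
    void replenish(int bytes) { transmitWindowBytes += bytes; }
  }

  // Stand-in for a per-call outbound state machine.
  static final class Outbound {
    private final FlowController flowController;
    int pendingBytes;

    Outbound(FlowController flowController, int pendingBytes) {
      this.flowController = flowController;
      this.pendingBytes = pendingBytes;
    }

    void send() {
      if (flowController.isTransmitWindowFull()) {
        return;  // No progress for this call; it must wait for the next wake-up.
      }
      int chunk = Math.min(pendingBytes, 16 * 1024);  // arbitrary chunk size
      flowController.consume(chunk);
      pendingBytes -= chunk;
    }
  }

  private final Map<Integer, Outbound> ongoingCalls = new ConcurrentHashMap<>();
  private final FlowController flowController = new FlowController();

  // Called whenever the peer acknowledges data and window space reopens.
  // Calls that hash early in ongoingCalls always get first claim on the
  // freshly opened window; calls that hash late find it full again by the
  // time they are reached, on every wake-up.
  void onTransmitWindowAvailable(int freedBytes) {
    flowController.replenish(freedBytes);
    for (Outbound call : ongoingCalls.values()) {
      call.send();
    }
  }
}
```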
