-
Notifications
You must be signed in to change notification settings - Fork 9.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dispatcher no longer has quadratic behaviour #4581
Dispatcher no longer has quadratic behaviour #4581
Conversation
Wow, that's both awful and unexpected. I assume the problem is that we're doing work while holding that lock that we shouldn't be. The dispatcher lock is intended to defend against memory accesses and should never be a contention point. |
May I propose an alternate fix? Two broad ideas: Take an optional host hint for the method and pass it when a call completes or is enqueued. If we have more than 1 general capacity then search for a call for that host specifically. Add an AtomicInteger to RealCall which represents the running calls for that host. Initialize it when the call is enqueued to the AtomicInteger of another call that shares the host, or 0 if there are none. |
@swankjesse OK I can take a stab at that AtomicInteger version - is the motivation there to reduce the Map allocation or minimise the new abstractions in one PR? It seems a bit sad to add mutability to the otherwise immutable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I like this much better.
// Mutate the AsyncCall so that it shares the AtomicInteger of an existing running call to | ||
// the same host. | ||
if (!call.get().forWebSocket) { | ||
for (AsyncCall existingCall : runningAsyncCalls) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably needs to loop over ready calls also!
Also, it seems that synchronous calls are not affected by the |
Synchronous calls don't participate. You got it. |
} | ||
promoteAndExecute(); | ||
} | ||
|
||
@Nullable private AsyncCall findExistingCallWithHost(String host) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
Nice improvement. Presumably your benchmark numbers are still a big improvement?! |
Yep, I think we should still see a speedup when lots of requests are queued/in-flight! |
…o okhttp3.Dispatcher (#940) ## Before this PR We recently experienced two P0 issues that seemed to be caused by locking inside the okhttp 3.12.0 dispatcher. I opened a PR to okhttp which should hopefully reduce the amount of work done inside `synchronized` blocks (square/okhttp#4581). One internal service forced okhttp to 3.11.0 as a temporary workaround. ## After this PR Hopefully that 3.11.0 downgrade can be reverted, and P0s won't re-occur! _Full changelog https://github.com/square/okhttp/blob/master/CHANGELOG.md_ Do not merge until: - [x] ~TLSv1 and TLSv1.1 are no longer enabled by default. - do we need to opt-in to `ConnectionSpec.COMPATIBLE_TLS`~ seems like we've always been overriding the connection spec to [only specify TLS 1.2](https://github.com/palantir/conjure-java-runtime/blob/9d88bfa3c3e1de74ea57f158734186628d1bb8c9/okhttp-clients/src/main/java/com/palantir/conjure/java/okhttp/OkHttpClients.java#L225).
We had a production incident earlier today in which outgoing requests were blocked, with jstacks suggesting a lock in the Dispatcher as the culprit (on 3.12.0). (cc @swankjesse who worked on this recently)
This PR eliminates some quadratic behaviour in the Dispatcher by keeping track of requests per host.
I doubt this PR will really solve the problem we encountered (especially as we only use
maxRequests = 256
andmaxRequestsPerHost = 64
), but it did seem a bit inefficient that thepromoteAndExecute
method iterated through the entirerunningAsyncCalls
for every item ofreadyAsyncCalls
. I thought I'd put this PR up anyway as you might be interested in a perf improvement. Happy to add more test coverage if this is a direction you'd be interested in.A very unscientific benchmark on my laptop showed speedups as maxRequests increased:
For reference, jstacks showed:
0x00007f28956a5000
lineokhttp3.Dispatcher.finished(Dispatcher.java:219)
0x00007f28956a5000
on lineokhttp3.Dispatcher.enqueue(Dispatcher.java:134)
runningCallsForHost
method: