Fix coordinator loadStatus performance #5632
This is a pretty neat failure mode!
And it sounds like "perform badly" is an understatement.
@@ -297,7 +297,9 @@ boolean hasLoadPending(final String dataSource)
       for (DruidServer druidServer : serverInventoryView.getInventory()) {
         final DruidDataSource loadedView = druidServer.getDataSource(dataSource.getName());
         if (loadedView != null) {
-          segments.removeAll(loadedView.getSegments());
+          for (DataSegment serverSegment : loadedView.getSegments()) {
This deserves a comment so someone doesn't "simplify" it back into the old code.
Good point, added a comment
LGTM
@jon-wei is there anywhere else using removeAll that looks suspicious? (IMO, given what we see here, anything where the argument is not a Set is suspicious.)
KafkaSupervisor calls removeAll in checkPendingCompletionTasks, where both collections are ArrayLists containing task groups.
PendingTaskBasedWorkerProvisioningStrategy and SimpleWorkerProvisioningStrategy have instances where Set.removeAll is called with a List containing worker IDs.
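For call sites like these, one possible way to make the cost independent of the argument type is sketched below. This is only an illustration, not code from the PR: the class `RemoveAllAudit` and both helper names are hypothetical, and the element types are plain strings standing in for worker IDs and task groups.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helpers, not part of Druid.
class RemoveAllAudit {
    /**
     * Removes each element of toRemove from target by iterating over toRemove.
     * Unlike Set.removeAll, this never calls toRemove.contains(), so a List
     * argument cannot trigger a linear scan per element.
     */
    static <T> boolean removeEach(Set<T> target, Collection<? extends T> toRemove) {
        boolean modified = false;
        for (T item : toRemove) {
            modified |= target.remove(item);
        }
        return modified;
    }

    /**
     * For List.removeAll(List): copy the argument into a HashSet first, so the
     * contains() call made for each list element is a hash lookup.
     */
    static <T> boolean removeAllFromList(List<T> target, Collection<? extends T> toRemove) {
        return target.removeAll(new HashSet<>(toRemove));
    }
}
```

Whether either change is worth making depends on how large the collections get at each call site; for a handful of worker IDs the existing calls are probably fine.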
* Optimize coordinator loadStatus
* Add comment
* Fix teamcity
* Checkstyle
* More checkstyle
* Checkstyle
getLoadStatus() in DruidCoordinator determines how many segments need to be loaded per datasource. It retrieves the list of all segments from metadata, then asks each server for its segment inventory and removes the server's segments from the set returned from metadata.

The current code can perform badly when a server has the entire set of segments for a datasource. AbstractSet.removeAll() chooses its strategy by comparing sizes: if the set is larger than the argument, it iterates over the argument and removes each element from the set; otherwise (the else block) it iterates over the set and calls contains() on the argument for each element.

In that situation the set is no larger than the argument, so the else block is used. This is a problem because loadedView.getSegments() is the values collection of a ConcurrentHashMap, and its contains() method does a full traversal of the map, which makes the removal quadratic in the number of segments.

A benchmark included in the PR shows this behavior.
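The effect can be reproduced outside Druid with a small sketch. This is not the benchmark from the PR: the class `RemoveAllDemo`, the `CountingCollection` wrapper, and the string segment IDs are all assumptions made for illustration. The wrapper counts how often contains() is invoked on a ConcurrentHashMap values view, comparing Set.removeAll with a per-element loop like the one in the diff.

```java
import java.util.AbstractCollection;
import java.util.Collection;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class, not part of Druid or the PR.
class RemoveAllDemo {
    /** Wraps a collection and counts how often contains() is invoked on it. */
    static final class CountingCollection<T> extends AbstractCollection<T> {
        private final Collection<T> delegate;
        int containsCalls = 0;

        CountingCollection(Collection<T> delegate) { this.delegate = delegate; }

        @Override public Iterator<T> iterator() { return delegate.iterator(); }
        @Override public int size() { return delegate.size(); }
        @Override public boolean contains(Object o) {
            containsCalls++;
            return delegate.contains(o); // for a ConcurrentHashMap values view this scans the map
        }
    }

    /** Stand-in for the set of segments returned from metadata. */
    static Set<String> newPending(int n) {
        Set<String> pending = new HashSet<>();
        for (int i = 0; i < n; i++) {
            pending.add("segment-" + i);
        }
        return pending;
    }

    /** Stand-in for a server that holds every segment of the datasource. */
    static CountingCollection<String> newServedView(int n) {
        Map<String, String> served = new ConcurrentHashMap<>();
        for (int i = 0; i < n; i++) {
            served.put("segment-" + i, "segment-" + i);
        }
        return new CountingCollection<>(served.values());
    }

    public static void main(String[] args) {
        int n = 1000;

        Set<String> viaRemoveAll = newPending(n);
        CountingCollection<String> view1 = newServedView(n);
        viaRemoveAll.removeAll(view1); // equal sizes: the contains()-based branch applies
        System.out.println("removeAll: contains() calls = " + view1.containsCalls);

        Set<String> viaLoop = newPending(n);
        CountingCollection<String> view2 = newServedView(n);
        for (String segment : view2) {
            viaLoop.remove(segment); // one hash lookup per served segment
        }
        System.out.println("loop:      contains() calls = " + view2.containsCalls);
    }
}
```

Both approaches leave the pending set empty; the difference is that the loop never asks the values view whether it contains an element, so its cost stays linear in the number of served segments.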