Fixing _rollup/data performance for a large number of indices #138305
Hi @masseyke, I've created a changelog YAML for you.
Pinging @elastic/es-storage-engine (Team:StorageEngine)
nielsbauman left a comment
Left one optional suggestion, other than that LGTM. Thanks Keith :)
static Map<String, RollableIndexCaps> getCapsByRollupIndex(Collection<String> resolvedIndexNames, Map<String, IndexMetadata> indices) {
    Map<String, List<RollupJobCaps>> allCaps = new TreeMap<>();

    indices.entrySet().stream().filter(entry -> resolvedIndexNames.contains(entry.getKey())).forEach(entry -> {
Do you think it's worth converting this to a regular for-each loop (instead of using the Streams API)?
It doesn't seem like it would matter much for a single pass through a collection, does it?
That is, this is only hot code because it's in an N^2 for loop; the method itself isn't called all that often.
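For reference, the two forms being discussed could look roughly like the sketch below. It is illustrative only: the class and method names are hypothetical, and plain String values stand in for IndexMetadata so the example compiles on its own. Both versions make a single pass over the map and produce the same result.

```java
import java.util.Collection;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical stand-in for the reviewed method; String values replace IndexMetadata.
class StreamVersusLoopSketch {

    // Stream form, mirroring the shape of the snippet under review.
    static Map<String, String> withStream(Collection<String> resolvedIndexNames, Map<String, String> indices) {
        Map<String, String> matched = new TreeMap<>();
        indices.entrySet().stream()
            .filter(entry -> resolvedIndexNames.contains(entry.getKey()))
            .forEach(entry -> matched.put(entry.getKey(), entry.getValue()));
        return matched;
    }

    // For-each form: the same single pass with the same filter condition.
    static Map<String, String> withLoop(Collection<String> resolvedIndexNames, Map<String, String> indices) {
        Map<String, String> matched = new TreeMap<>();
        for (Map.Entry<String, String> entry : indices.entrySet()) {
            if (resolvedIndexNames.contains(entry.getKey())) {
                matched.put(entry.getKey(), entry.getValue());
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        Map<String, String> indices = Map.of("a", "meta-a", "b", "meta-b", "c", "meta-c");
        Set<String> resolved = Set.of("a", "c");
        // Both forms select the same entries.
        System.out.println(withStream(resolved, indices).equals(withLoop(resolved, indices)));
    }
}
```

Either way, the cost of the pass is dominated by what resolvedIndexNames.contains costs per call, not by the choice between a stream and a loop.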

This avoids performing an N^2 loop through indices when we call GET /*/_rollup/data. This can save a few seconds per call for clusters with a very large number of indices.
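
The description does not spell out where the quadratic loop came from. One common way a filter like the one in this method becomes N^2 is calling contains on a List once per index, and the sketch below illustrates only that general pattern, as an assumption rather than a description of the actual change. The class name, method name, and index names are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of list-based vs. hash-based membership checks.
class MembershipCostSketch {

    // Counts how many names in allIndexNames also appear in resolvedIndexNames.
    // If resolvedIndexNames is a List, each contains() scans the list: O(N) per call, O(N^2) overall.
    // If it is a HashSet, each contains() is an expected O(1) lookup: O(N) overall.
    static int countMatches(Collection<String> resolvedIndexNames, Collection<String> allIndexNames) {
        int matches = 0;
        for (String name : allIndexNames) {
            if (resolvedIndexNames.contains(name)) {
                matches++;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            names.add("index-" + i);
        }
        Set<String> asSet = new HashSet<>(names);

        long t0 = System.nanoTime();
        int viaList = countMatches(names, names);
        long t1 = System.nanoTime();
        int viaSet = countMatches(asSet, names);
        long t2 = System.nanoTime();

        // Timings vary by machine; the match counts are identical.
        System.out.println("list: " + viaList + " matches in " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("set:  " + viaSet + " matches in " + (t2 - t1) / 1_000_000 + " ms");
    }
}
```

Accepting a Collection<String> rather than a List<String> lets the caller pass a set without changing the method body, which is why the sketch keeps that parameter type.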