New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Show stats of all index sets without pagination #4211

Merged
merged 4 commits into from Oct 10, 2017

Conversation

Projects
None yet
4 participants
@lennartkoopmann
Member

lennartkoopmann commented Oct 5, 2017

Add information and stats about all indices on the "Index Sets" page without accounting for pagination. The reason for this is, that finding out the total size of all indices can be really annoying if you have a lot of index sets. See also #4204.

This change was initiated by customer request #1503.

This introduces a new REST API endpoint at "/system/indices/index_sets/stats" to retrieve global, aggregated stats without pagination or other information about the index sets.

How Has This Been Tested?

Tested on a local setup and also a v2.3.1 with about 500GB of data in over 400 indices and 3 index sets. The performance impact to calculate the stats with an initial concern but my benchmarks showed that it only adds about 300ms to the Index Sets page load time, which is acceptable IMO. Adding a caching layer or getting creative with the already computed stats of the paginated index sets is probably not worth the overhead.

Screenshots:

screen shot 2017-10-05 at 2 54 38 pm

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
@kroepke

This comment has been minimized.

Member

kroepke commented Oct 6, 2017

I'm not really comfortable with this approach, because it causes one elasticsearch query for each configured index set (via the index wildcard).
It also does so sequentially, which means that if the user has so many index sets that pagination becomes a problem, the stats query will take significant time, too, for each page that's visited.

I would much rather get the indices stats for multiple index sets in parallel short of storing this information in the database for read-only indices and only asking for the latest deflector stats whenever necessary.

@dennisoelkers

LGTM in general, I added two inline comments which might improve the code a bit.

long documents = 0;
long size = 0;
for (IndexSetStats stats : indexSetRegistry.getAll().stream()
.collect(Collectors.toMap(indexSet -> indexSet.getConfig().id(), indexSetStatsCreator::getForIndexSet)).values()) {

This comment has been minimized.

@dennisoelkers

dennisoelkers Oct 6, 2017

Member

Can you please extract this (the map you are iterating over) into a local variable? This makes it way easier to understand what is going on and the type is explicitly visible.

This comment has been minimized.

@joschi

joschi Oct 6, 2017

Contributor

Also, why collect into a map if you're only interested in the values anyway?

This comment has been minimized.

@joschi

joschi Oct 6, 2017

Contributor

Additionally, this will run two remote queries for each index set for a piece of information which could be fetched with a single request for all index sets via the Cat Indices API.
(Basically what @kroepke said: #4211 (comment))

@@ -128,6 +128,26 @@ const IndexSetsStore = Reflux.createStore({
IndexSetsActions.setDefault.promise(promise);
},
stats() {
const url = URLUtils.qualifyUrl(ApiRoutes.IndexSetsApiController.stats().url);
const promise = fetch('GET', url);

This comment has been minimized.

@dennisoelkers

dennisoelkers Oct 6, 2017

Member

If this is a long-running operation (which it probably becomes when the number of indices is big enough), we should cache the promise and return it if it still running instead of creating a new one. The general logic for this would look like this:

if (this.promise) {
  return this.promise;
}
this.promise = fetch(..);
this.promise.then(...).finally(() => { this.promise = undefined; });

This pattern avoids building up too many requests in the browser.

@lennartkoopmann

This comment has been minimized.

Member

lennartkoopmann commented Oct 6, 2017

@kroepke I agree with your point about the parallelity. Let me take a second attempt at that part.

Thanks for the other comments @dennisoelkers! :) Will approach those, too.

@joschi joschi modified the milestones: 2.4.0, 3.0.0 Oct 10, 2017

lennartkoopmann and others added some commits Oct 5, 2017

Show stats of all index sets without pagination
Add information and stats about all indices on the "Index Sets" page without accounting for pagination. The reason for this is, that finding out the total size of all indices can be really annoying if you have a lot of index sets. #4204
Reduce number of ES requests for IndexSetsResource#globalStats()
Instead of polling index statistics for each index in an index set,
we now only use a single request to get the required information.

@joschi joschi force-pushed the global-index-stats branch from de90bca to 8a341f9 Oct 10, 2017

Issue has been addressed by making backend call faster

@joschi

joschi approved these changes Oct 10, 2017

Remove unused imports
[ci skip]

@joschi joschi merged commit 9f60751 into master Oct 10, 2017

3 checks passed

ci-web-linter Jenkins build graylog-pr-linter-check 1995 has succeeded
Details
graylog-project/pr Jenkins build graylog-project-pr-snapshot 585 has succeeded
Details
license/cla Contributor License Agreement is signed.
Details

@joschi joschi deleted the global-index-stats branch Oct 10, 2017

@wafflebot wafflebot bot removed the ready-for-review label Oct 10, 2017

joschi added a commit that referenced this pull request Oct 10, 2017

Show stats of all index sets without pagination (#4211)
Add information and stats about all indices on the "Index Sets" page without accounting for pagination. The reason for this is, that finding out the total size of all indices can be really annoying if you have a lot of index sets.

Closes #4204

(cherry picked from commit 9f60751)
@lennartkoopmann

This comment has been minimized.

Member

lennartkoopmann commented Oct 14, 2017

Thank you @joschi! Came here to fix it and was pleasantly surprised. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment