Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Monitoring] Use server side pagination for Logstash Pipelines page #46587

Conversation

chrisronline
Copy link
Contributor

@chrisronline chrisronline commented Sep 25, 2019

Relates to #37246
Relates to #36358

This PR introduces server-side pagination for the Logstash Pipelines listing page, for both the cluster and individual Logstash nodes.

This work should severely improve the performance of these pages in production, where users have 10+ pipelines. The biggest change to cause this is our query to collect pipeline metrics now includes a filter terms query that limits the search to only the pipeline ids visible on the page.

We achieve this by moving out sorting/pagination/filtering logic from Kibana client to Kibana server. The decision to handle this in Kibana server versus Elasticsearch is due to ease of use and full control over the process. In an ideal world, we can craft queries against Elasticsearch to handle sorting/pagination/filtering but there are limitations - specifically around using composite aggregations with a combination of nested fields and unnested fields (reverse nested aggregation does not work in composite aggregations which is requirement for this to work in the pipelines listing page).

Because of some of the limitations in Elasticsearch, it makes more sense to handle all of this in Kibana server, which is a scalable solution as the number of data points used will always be limited by what the UI allows (which is small).

The basic gist of how this works is:

  • Kibana client requests pipeline data for the first page, with a size of 10 (example data)
  • Kibana server makes a request to get all pipeline ids (this is lightweight as there are no date histograms - just a simple nested terms aggregation)
  • Kibana server performs filtering/sorting/pagination logic based on the request paylaod
  • Kibana server creates a list of pipelineIds that are shown on the page, and sends this information to the getPipelines function
  • Kibana server getPipelines function then adds a filter terms query to limit the data to only documents that contain pipeline ids from the pipelineIds list
  • Kibana server returns the pipeline data

To accommodate this change, there is some client side work needed - specifically, moving away from EuiInMemoryTable usage to EuiBasicTable so we can let the server handle the filtering/pagination/sorting logic instead of the client.

This is only the start of this work. If this proves to work out well, we will apply this principle to other listing pages.

Performance

I wanted to test the performance on a real-life use case, preferably one where the pipelines page takes awhile to load to right now.

I found an ESMS cluster that had 8 Logstash nodes with 5 pipelines and over 300m events processed.

Using this ESMS cluster, I configured two separate Kibana instances to point to it for monitoring data - one running master and one running this PR. I used curl (and the time_starttransfer data point) to query the /api/monitoring/v1/clusters/<cluster_uuid>/logstash/pipelines endpoint and time the response with a sample size of 50 requests. For this PR, I sent this for pagination: {index:0,size:10} and the time range is the same 1h period of time. One important thing to note is the search.max_buckets for this cluster is set to 50,000.

Response Time

Master: 4.3473
PR: 1.3494

This is an improvement of 69% in terms of response time.

Buckets created

This particular API call does two separate requests to ES, so I'll list each bucket size.

Master: 726 and 10386
PR: 363 and 2779

That's an improvement of 50% and 73% respectively in terms of buckets created.

Since our size:10 in the call against this PR and the total number of pipelines is only 5, I thought I'd be interesting to also see something like size:3, which is:
Response Time: 0.4
Buckets: 242 and 968

This is pretty significant and should help out many users loading and using these two pages.

Testing

  • Start logstash with multiple pipelines (as well as multiple logstash with multiple pipelines), ensuring to enable monitoring through internal collection or MB.
  • View the logstash pipeline listing pages and ensure the pagination/sorting/filtering works as expected.

Notes for reviewers

  • SSP stands for Server Side Pagination, happy to change to something better
  • I purposely left out the setup mode logic from EuiMonitoringSSPTable because neither table using it supports setup mode, and I wanted to avoid merge conflicts when [Monitoring] Metricbeat Migration Wizard (last step!!) #45799 is merged

Misc

@elasticmachine
Copy link
Contributor

Pinging @elastic/stack-monitoring

@ycombinator
Copy link
Contributor

@chrisronline Just a suggestion: you might want to try and quantify some of the performance gains, not so much as hard numbers but more to give a ballpark of the magnitude of improvement. I did something similar in #31293 (see "Gains, by the numbers" section).

@elasticmachine
Copy link
Contributor

💔 Build Failed

@igoristic
Copy link
Contributor

igoristic commented Sep 25, 2019

This will be an awesome improvement! Great job @chrisronline ! 💪

Couple of things:

  • I noticed since pagination is now required it will fail with a 400 for pipelines list in Management > Pipelines (under Logstash)

  • When using search term queryText on a page greater than 1 you won't see any results. I think the pagination should be based on the filtered results (and rest to the first page)

  • Maybe a desired behavior, but the listing page seems to preserve the page I'm on. For example, if I go to page 4 in the pipelines list, then go to Dev Tools, and then back to the pipelines listing page it'll land me on page 4.

@chrisronline
Copy link
Contributor Author

Quick note here. This PR doesn't do anything to help with the number of buckets created. This is because we hard code a high limit, but we should be able to change this to a dynamic number based on the pagination request data. Stay tuned...

@chrisronline
Copy link
Contributor Author

Quick note here. This PR doesn't do anything to help with the number of buckets created. This is because we hard code a high limit, but we should be able to change this to a dynamic number based on the pagination request data. Stay tuned...

This is now resolved via ce049c5 and performance metrics are available in the PR description

@elasticmachine
Copy link
Contributor

💔 Build Failed

@chrisronline
Copy link
Contributor Author

I noticed since pagination is now required it will fail with a 400 for pipelines list in Management > Pipelines (under Logstash)

Great catch. This is fixed -> 4497a2a

When using search term queryText on a page greater than 1 you won't see any results. I think the pagination should be based on the filtered results (and rest to the first page)

Nice find. Fixed -> 76e357c

Maybe a desired behavior, but the listing page seems to preserve the page I'm on. For example, if I go to page 4 in the pipelines list, then go to Dev Tools, and then back to the pipelines listing page it'll land me on page 4.

Yea, this is a legacy feature. I think it's a good thing overall, but it can be slightly jarring if the user does not understand it will happen. I think it's super helpful for times where users need to leave a listing page to help understand something and then come back to it and not have to re-paginate. I'm honestly open to changing this if we think it's not necessary anymore.

@elasticmachine
Copy link
Contributor

💚 Build Succeeded

Copy link
Contributor

@igoristic igoristic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some fine work good sir! 🏆

@igoristic
Copy link
Contributor

Yea, this is a legacy feature. I think it's a good thing overall, but it can be slightly jarring if the user does not understand it will happen. I think it's super helpful for times where users need to leave a listing page to help understand something and then come back to it and not have to re-paginate. I'm honestly open to changing this if we think it's not necessary anymore

@chrisronline Yeah, I'm okay with leaving it (since the behavior is indeed preexisting). My only concern is that it's not smart enough to know whether that page still exists or not, which can be cased by total number of pipelines shrinking, or going from cluster pipelines to node pipelines etc. But, I think this is separate issue

@ycombinator
Copy link
Contributor

I'm seeing a couple of issues:

  1. Sorting by the ID column no longer seems to be working as expected.

    id_sort_not_working

  2. Navigating to page 3, then sorting on a column puts the user back on page 1.

    reset_to_page_1

@chrisronline
Copy link
Contributor Author

Sorting by the ID column no longer seems to be working as expected.

Legit bug, nice find. Fixed: bc1f4d1

Navigating to page 3, then sorting on a column puts the user back on page 1.

This looks like an EUI thing -> https://github.com/elastic/eui/blob/master/src/components/basic_table/basic_table.js#L340

I'm following up with that team to find out more information

@chrisronline
Copy link
Contributor Author

@ycombinator

It's by design, and related to elastic/eui#467

Are you okay leaving it as-is as it matches how the rest of the tables work in Kibana?

@ycombinator
Copy link
Contributor

@chrisronline Merge conflicts.

@elasticmachine
Copy link
Contributor

💔 Build Failed

Copy link
Contributor

@ycombinator ycombinator left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Very nice, much overdue improvement!

@elasticmachine
Copy link
Contributor

💔 Build Failed

@elasticmachine
Copy link
Contributor

💚 Build Succeeded

@chrisronline chrisronline merged commit 96e40d6 into elastic:master Oct 2, 2019
@chrisronline chrisronline deleted the monitoring/logstash_pipelines_pagination branch October 2, 2019 19:09
chrisronline added a commit to chrisronline/kibana that referenced this pull request Oct 2, 2019
…lastic#46587)

* Basic version working for cluster pipelines

* More support

* Refactoring

* Fixes

* Fix sorting issues

* Reduce the number of buckets too

* Fix tests

* This is actually not helping - it seems that the filter in the query doesn't work as expected - maybe related to the fact that are using nested fields

* Add more data for metric.debug

* Support sorting on throughput and node count

* Fix broken test

* Use getMetrics and support with numOfBuckets parameter

* Fix test for realz

* Fix logstash management pages by introducing a new api to just retrieve ids

* We need this to go back to 1000 but it doesn't affect the number of created buckets

* Fix issue with pagination data when filtering

* Fix sorting by id not working

* Make this a little more sturdy
chrisronline added a commit that referenced this pull request Oct 2, 2019
…46587) (#47153)

* Basic version working for cluster pipelines

* More support

* Refactoring

* Fixes

* Fix sorting issues

* Reduce the number of buckets too

* Fix tests

* This is actually not helping - it seems that the filter in the query doesn't work as expected - maybe related to the fact that are using nested fields

* Add more data for metric.debug

* Support sorting on throughput and node count

* Fix broken test

* Use getMetrics and support with numOfBuckets parameter

* Fix test for realz

* Fix logstash management pages by introducing a new api to just retrieve ids

* We need this to go back to 1000 but it doesn't affect the number of created buckets

* Fix issue with pagination data when filtering

* Fix sorting by id not working

* Make this a little more sturdy
chrisronline added a commit that referenced this pull request Oct 2, 2019
…46587) (#47154)

* Basic version working for cluster pipelines

* More support

* Refactoring

* Fixes

* Fix sorting issues

* Reduce the number of buckets too

* Fix tests

* This is actually not helping - it seems that the filter in the query doesn't work as expected - maybe related to the fact that are using nested fields

* Add more data for metric.debug

* Support sorting on throughput and node count

* Fix broken test

* Use getMetrics and support with numOfBuckets parameter

* Fix test for realz

* Fix logstash management pages by introducing a new api to just retrieve ids

* We need this to go back to 1000 but it doesn't affect the number of created buckets

* Fix issue with pagination data when filtering

* Fix sorting by id not working

* Make this a little more sturdy
@chrisronline
Copy link
Contributor Author

Backport:

7.x: d5595eb
7.4: 2a07be0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants