[Monitoring] Fix a couple of issues with the cpu usage alert #80737

Merged
4 commits merged into elastic:master on Oct 26, 2020

chrisronline
Contributor

Resolves #80689
Resolves #80684

@ravikesarwani found a couple of issues with the CPU alert, and this PR aims to address both of them.

First, we discovered that the Node -> Advanced page wasn't properly filtering out alert instances for other nodes.
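
To illustrate the kind of filtering involved, here is a minimal sketch. The `AlertInstanceState` shape, the `nodeId` property, and the `filterAlertStatesForNode` helper are all hypothetical names chosen for illustration; they are not taken from the actual diff.

```typescript
// Hypothetical shape of one alert instance state; the real type may differ.
interface AlertInstanceState {
  nodeId: string;
  cpuUsage: number;
}

// Keep only the alert instances that belong to the node being viewed.
function filterAlertStatesForNode(
  states: AlertInstanceState[],
  nodeUuid: string
): AlertInstanceState[] {
  return states.filter((state) => state.nodeId === nodeUuid);
}

// Example: two nodes are firing, but the page for node "a" should only show its own.
const allStates: AlertInstanceState[] = [
  { nodeId: 'a', cpuUsage: 91 },
  { nodeId: 'b', cpuUsage: 88 },
];
const statesForNodeA = filterAlertStatesForNode(allStates, 'a');
```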

Second, the way we calculated container CPU utilization didn't match what we do for the CPU utilization chart, which is more accurate. A couple of the data points (the cgroup usage and periods data) are running counters and need to be treated as such, which means taking a derivative. A derivative requires a date_histogram, even though we have no need for multiple buckets; the buckets exist only so the derivative can be calculated. So we use a fixed_interval set to the same time range as the alert's duration parameter (which is also used in the range filter of the same query).
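
As a rough illustration of the approach described above (not the actual diff), the aggregation and the resulting percentage could look something like the sketch below. The field names, the aggregation names, and both helper functions are assumptions made for illustration, and the quota-based formula is likewise assumed rather than taken from this PR.

```typescript
// Sketch only: names below are illustrative assumptions, not the PR's code.

// Build a date_histogram whose fixed_interval equals the alert duration, so the
// derivative pipeline aggregation has buckets to diff the running counters across.
function buildCpuUsageAggs(duration: string) {
  return {
    histo: {
      date_histogram: { field: 'timestamp', fixed_interval: duration },
      aggs: {
        // Running counters: take the max per bucket, then the derivative between buckets.
        usage_max: { max: { field: 'node_stats.os.cgroup.cpuacct.usage_nanos' } },
        usage_deriv: { derivative: { buckets_path: 'usage_max', gap_policy: 'skip' } },
        periods_max: {
          max: { field: 'node_stats.os.cgroup.cpu.stat.number_of_elapsed_periods' },
        },
        periods_deriv: { derivative: { buckets_path: 'periods_max', gap_policy: 'skip' } },
        // The quota is a plain setting, not a counter, so no derivative is needed.
        quota_max: { max: { field: 'node_stats.os.cgroup.cpu.cfs_quota_micros' } },
      },
    },
  };
}

interface CgroupDeltas {
  usageDeltaNanos: number; // derivative of the usage counter
  periodsDelta: number; // derivative of the elapsed-periods counter
  quotaMicros: number; // CFS quota per period
}

// CPU used as a percentage of the CPU time the quota allowed over the elapsed periods.
function cgroupCpuPercent(deltas: CgroupDeltas): number | null {
  const { usageDeltaNanos, periodsDelta, quotaMicros } = deltas;
  if (periodsDelta <= 0 || quotaMicros <= 0) {
    return null; // no quota configured or no periods elapsed
  }
  const quotaNanos = quotaMicros * 1000; // microseconds to nanoseconds
  return (usageDeltaNanos / (quotaNanos * periodsDelta)) * 100;
}

// 0.5s of CPU used across 10 periods with a 100ms quota each.
const examplePercent = cgroupCpuPercent({
  usageDeltaNanos: 500_000_000,
  periodsDelta: 10,
  quotaMicros: 100_000,
});
const exampleAggs = buildCpuUsageAggs('5m');
```

Treating the usage and periods fields as counters matters because their raw values only ever grow; averaging them directly over the time range would not reflect utilization within that range.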

@elasticmachine
Contributor

Pinging @elastic/stack-monitoring (Team:Monitoring)

@chrisronline
Contributor Author

@elasticmachine merge upstream

} else {
cpuUsage = stat.cpuUsage;
stat.cpuUsage = stat.cpuUsage;
}
Contributor

I don't think you need this else clause here

Contributor Author

Updated

@kibanamachine
Contributor

💚 Build Succeeded

Metrics [docs]

async chunks size

id before after diff
monitoring 1.1MB 1.1MB +498.0B

page load bundle size

id before after diff
monitoring 247.7KB 248.3KB +605.0B

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

Contributor

@igoristic left a comment

Looks good 👍

@chrisronline merged commit d784840 into elastic:master on Oct 26, 2020
@chrisronline deleted the monitoring/cpu_usage_fixes branch on October 26, 2020 19:48
chrisronline added a commit to chrisronline/kibana that referenced this pull request Oct 26, 2020
…#80737)

* Fix a couple of issues with the cpu usage alert

* Fix tests

* PR feedback
# Conflicts:
#	x-pack/plugins/monitoring/public/components/elasticsearch/node/advanced.js
chrisronline added a commit that referenced this pull request Oct 26, 2020
…#81671)

* Fix a couple of issues with the cpu usage alert

* Fix tests

* PR feedback
# Conflicts:
#	x-pack/plugins/monitoring/public/components/elasticsearch/node/advanced.js
chrisronline added a commit that referenced this pull request Oct 26, 2020
…#81669)

* Fix a couple of issues with the cpu usage alert

* Fix tests

* PR feedback
# Conflicts:
#	x-pack/plugins/monitoring/public/components/elasticsearch/node/advanced.js
gmmorris added a commit to gmmorris/kibana that referenced this pull request Oct 27, 2020
* master: (37 commits)
  [ILM] Migrate Warm phase to Form Lib (elastic#81323)
  [Security Solutions][Detection Engine] Fixes critical bug with error reporting that was doing a throw (elastic#81549)
  [Detection Rules] Add 7.10 rules (elastic#81676)
  [kbn/optimizer] ignore missing metrics when updating limits with --focus (elastic#81696)
  [SECURITY SOLUTIONS] Bugs overview page + investigate eql in timeline (elastic#81550)
  [Maps] fix unable to edit cluster vector styles styled by count when switching to super fine grid resolution (elastic#81525)
  Fixed migration issue for case specific actions, by extending email action migrator checks (elastic#81673)
  [CI] Preparation for APM tracking on CI (elastic#80399)
  [Home] Fixes Kibana app description order on home page and updates Canvas copy (elastic#80057)
  Make sure `to` is 'now' and not the same as `from` (elastic#81524)
  Nitpicking the 8.0 Breaking Change issue template (elastic#81678)
  [SECURITY_SOLUTION] Fix text on onboarding screen (elastic#81672)
  [data.search] Skip async search tests in build candidates and production builds (elastic#81547)
  Fix previousStartedAt by not changing when execution fails (elastic#81388)
  [Monitoring] Fix a couple of issues with the cpu usage alert (elastic#80737)
  Telemetry collection xpack to ts project references (elastic#81269)
  Elasticsearch: don't use url authentication for new client (elastic#81564)
  [App Search] Credentials: implement working flyout form (elastic#81541)
  Properly encode links to edit user page (elastic#81562)
  [Alerting UI] Don't wait for health check before showing Create Alert flyout (elastic#80996)
  ...