Add normalized CPU values and number of cores #4553

Merged
merged 3 commits into from
Jun 26, 2017
112 changes: 105 additions & 7 deletions metricbeat/docs/fields.asciidoc
@@ -9870,7 +9870,7 @@ System status metrics, like CPU and memory usage, that are collected from the op
[float]
== core Fields

`system-core` contains local CPU core stats.
`system-core` contains CPU metrics for a single core of a multi-core system.



@@ -9889,7 +9889,7 @@ type: scaled_float

format: percent

The percentage of CPU time spent in user space. On multi-core systems, you can have percentages that are greater than 100%. For example, if 3 cores are at 60% use, then the `cpu.user_p` will be 180%.
The percentage of CPU time spent in user space.


[float]
@@ -10038,7 +10038,7 @@ The amount of CPU time spent in involuntary wait by the virtual CPU while the hy

type: long

The number of CPU cores. The CPU percentages can range from `[0, 100% * cores]`.
The number of CPU cores present on the host. The non-normalized percentages will have a maximum value of `100% * cores`. The normalized percentages already take this value into account and have a maximum value of 100%.
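
The relationship between the two percentage families described above can be sketched in Go. This is a minimal illustration, not Metricbeat's implementation: `normalizePct` is a hypothetical helper, and the only assumption taken from the field descriptions is that a normalized value equals the non-normalized value divided by the number of cores. Percentages are written as fractions, so 1.8 stands for 180%.

```go
package main

import (
	"errors"
	"fmt"
)

// normalizePct is a hypothetical helper (not part of Metricbeat). It converts
// a non-normalized CPU percentage, whose maximum is 1.0 * cores, into a
// normalized one whose maximum is 1.0.
func normalizePct(pct float64, cores int) (float64, error) {
	if cores <= 0 {
		return 0, errors.New("cores must be positive")
	}
	return pct / float64(cores), nil
}

func main() {
	// Three cores, each at 60% user time: the non-normalized total is 180%.
	norm, err := normalizePct(1.8, 3)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("normalized user pct: %.2f\n", norm)
}
```

With this convention, dashboards that compare hosts with different core counts can use the normalized fields directly, while the non-normalized fields still need `system.cpu.cores` to be interpreted.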


[float]
@@ -10121,6 +10121,86 @@ format: percent
The percentage of CPU time spent in involuntary wait by the virtual CPU while the hypervisor was servicing another processor. Available only on Unix.


[float]
=== system.cpu.user.norm.pct

type: scaled_float

format: percent

The percentage of CPU time spent in user space.


[float]
=== system.cpu.system.norm.pct

type: scaled_float

format: percent

The percentage of CPU time spent in kernel space.


[float]
=== system.cpu.nice.norm.pct

type: scaled_float

format: percent

The percentage of CPU time spent on low-priority processes.


[float]
=== system.cpu.idle.norm.pct

type: scaled_float

format: percent

The percentage of CPU time spent idle.


[float]
=== system.cpu.iowait.norm.pct

type: scaled_float

format: percent

The percentage of CPU time spent in wait (on disk).


[float]
=== system.cpu.irq.norm.pct

type: scaled_float

format: percent

The percentage of CPU time spent servicing and handling hardware interrupts.


[float]
=== system.cpu.softirq.norm.pct

type: scaled_float

format: percent

The percentage of CPU time spent servicing and handling software interrupts.


[float]
=== system.cpu.steal.norm.pct

type: scaled_float

format: percent

The percentage of CPU time spent in involuntary wait by the virtual CPU while the hypervisor was servicing another processor. Available only on Unix.


[float]
=== system.cpu.user.ticks

@@ -10419,7 +10499,7 @@ Total space (used plus free).
[float]
== load Fields

Load averages.
CPU load averages.



@@ -10452,23 +10532,31 @@ Load average for the last 15 minutes.

type: scaled_float

Load divided by the number of cores for the last minute.
Load for the last minute divided by the number of cores.


[float]
=== system.load.norm.5

type: scaled_float

Load divided by the number of cores for the last 5 minutes.
Load for the last 5 minutes divided by the number of cores.


[float]
=== system.load.norm.15

type: scaled_float

Load divided by the number of cores for the last 15 minutes.
Load for the last 15 minutes divided by the number of cores.


[float]
=== system.load.cores

type: long

The number of CPU cores present on the host.
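
The `system.load.norm.*` fields can be illustrated with a short Go sketch. The `loadAverages` type and its `normalize` method are hypothetical names chosen for this example; the only behavior assumed is the one stated in the descriptions, namely each load average divided by the number of cores.

```go
package main

import (
	"errors"
	"fmt"
)

// loadAverages is a hypothetical container for the 1, 5 and 15 minute
// load averages (it is not a Metricbeat type).
type loadAverages struct {
	One     float64
	Five    float64
	Fifteen float64
}

// normalize divides each load average by the number of cores.
func (l loadAverages) normalize(cores int) (loadAverages, error) {
	if cores <= 0 {
		return loadAverages{}, errors.New("cores must be positive")
	}
	n := float64(cores)
	return loadAverages{
		One:     l.One / n,
		Five:    l.Five / n,
		Fifteen: l.Fifteen / n,
	}, nil
}

func main() {
	load := loadAverages{One: 2.0, Five: 1.0, Fifteen: 0.5}
	norm, err := load.normalize(4)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("norm.1=%.3f norm.5=%.3f norm.15=%.3f\n",
		norm.One, norm.Five, norm.Fifteen)
}
```

A normalized load near 1.0 suggests that, on average, every core had one runnable task, which makes the value comparable across hosts of different sizes.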


[float]
@@ -10789,6 +10877,16 @@ format: percent
The percentage of CPU time spent by the process since the last update. Its value is similar to the %CPU value of the process displayed by the top command on Unix systems.


[float]
=== system.process.cpu.total.norm.pct

type: scaled_float

format: percent

The percentage of CPU time spent by the process since the last event. This value is normalized by the number of CPU cores and it ranges from 0 to 100%.


[float]
=== system.process.cpu.system

19 changes: 19 additions & 0 deletions metricbeat/mb/testing/modules.go
@@ -126,6 +126,25 @@ func NewEventsFetcher(t testing.TB, config interface{}) mb.EventsFetcher {
return fetcher
}

func NewReportingMetricSet(t testing.TB, config interface{}) mb.ReportingMetricSet {
	metricSet := newMetricSet(t, config)

	reportingMetricSet, ok := metricSet.(mb.ReportingMetricSet)
	if !ok {
		t.Fatal("MetricSet does not implement ReportingMetricSet")
	}

	return reportingMetricSet
}

// ReportingFetch runs the given reporting metricset and returns all of the
// events and errors that occur during that period.
func ReportingFetch(metricSet mb.ReportingMetricSet) ([]common.MapStr, []error) {
	r := &capturingReporter{}
	metricSet.Fetch(r)
	return r.events, r.errs
}
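
The capture pattern behind `ReportingFetch` can be shown with a self-contained sketch. Every type here (`reporter`, `reportingMetricSet`, `fakeMetricSet`) is a simplified stand-in invented for the example and does not reproduce the real `mb` interfaces; only the `capturingReporter` fields `events` and `errs` and the call to `Fetch` are taken from the diff above.

```go
package main

import (
	"errors"
	"fmt"
)

// reporter is a simplified, hypothetical stand-in for the reporter that a
// reporting metricset writes to.
type reporter interface {
	Event(event map[string]interface{})
	Error(err error)
}

// reportingMetricSet is a simplified, hypothetical stand-in for
// mb.ReportingMetricSet.
type reportingMetricSet interface {
	Fetch(r reporter)
}

// capturingReporter stores everything that is reported so that a test can
// inspect it afterwards.
type capturingReporter struct {
	events []map[string]interface{}
	errs   []error
}

func (r *capturingReporter) Event(event map[string]interface{}) {
	r.events = append(r.events, event)
}

func (r *capturingReporter) Error(err error) {
	r.errs = append(r.errs, err)
}

// fakeMetricSet reports one event and one error per fetch.
type fakeMetricSet struct{}

func (fakeMetricSet) Fetch(r reporter) {
	r.Event(map[string]interface{}{"id": 0})
	r.Error(errors.New("core 1 unavailable"))
}

// reportingFetch mirrors the shape of ReportingFetch: run one fetch and
// return the captured events and errors.
func reportingFetch(ms reportingMetricSet) ([]map[string]interface{}, []error) {
	r := &capturingReporter{}
	ms.Fetch(r)
	return r.events, r.errs
}

func main() {
	events, errs := reportingFetch(fakeMetricSet{})
	fmt.Printf("captured %d event(s) and %d error(s)\n", len(events), len(errs))
}
```

Collecting both slices in one call lets a test assert on errors and events separately instead of stopping at the first failure.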

// NewPushMetricSet instantiates a new PushMetricSet using the given
// configuration. The ModuleFactory and MetricSetFactory are obtained from the
// global Registry.
5 changes: 3 additions & 2 deletions metricbeat/metricbeat.full.yml
@@ -47,8 +47,9 @@ metricbeat.modules:
period: 10s
processes: ['.*']

# if true, exports the CPU usage in ticks, together with the percentage values
#cpu_ticks: false
# Configure the metric types that are included by these metricsets.
cpu.metrics: ["percentages"] # The other available options are normalized_percentages and ticks.
core.metrics: ["percentages"] # The other available option is ticks.

# These options allow you to filter out all processes that are not
# in the top N by CPU or memory, in order to reduce the number of documents created.
5 changes: 3 additions & 2 deletions metricbeat/module/system/_meta/config.full.yml
@@ -15,8 +15,9 @@
period: 10s
processes: ['.*']

# if true, exports the CPU usage in ticks, together with the percentage values
#cpu_ticks: false
# Configure the metric types that are included by these metricsets.
cpu.metrics: ["percentages"] # The other available options are normalized_percentages and ticks.
core.metrics: ["percentages"] # The other available option is ticks.

# These options allow you to filter out all processes that are not
# in the top N by CPU or memory, in order to reduce the number of documents created.
21 changes: 10 additions & 11 deletions metricbeat/module/system/core/_meta/data.json
@@ -11,14 +11,14 @@
},
"system": {
"core": {
"id": 0,
"id": 1,
"idle": {
"pct": 0.9063,
"ticks": 22204290
"pct": 0.98,
"ticks": 243110733
},
"iowait": {
"pct": 0,
"ticks": 79386
"ticks": 0
},
"irq": {
"pct": 0,
@@ -30,21 +30,20 @@
},
"softirq": {
"pct": 0,
"ticks": 7944
"ticks": 0
},
"steal": {
"pct": 0,
"ticks": 0
},
"system": {
"pct": 0.0208,
"ticks": 160489
"pct": 0,
"ticks": 840906
},
"user": {
"pct": 0.0729,
"ticks": 417331
"pct": 0.02,
"ticks": 1266791
}
}
},
"type": "metricsets"
}
}
7 changes: 3 additions & 4 deletions metricbeat/module/system/core/_meta/fields.yml
@@ -1,19 +1,19 @@
- name: core
type: group
description: >
`system-core` contains local CPU core stats.
`system-core` contains CPU metrics for a single core of a multi-core system.
fields:
- name: id
type: long
description: >
CPU Core number.

# Percentages
- name: user.pct
type: scaled_float
format: percent
description: >
The percentage of CPU time spent in user space. On multi-core systems, you can have percentages that are greater than 100%.
For example, if 3 cores are at 60% use, then the `cpu.user_p` will be 180%.
The percentage of CPU time spent in user space.

- name: user.ticks
type: long
@@ -100,4 +100,3 @@
The amount of CPU time spent in involuntary wait by the virtual CPU while the hypervisor
was servicing another processor.
Available only on Unix.

46 changes: 46 additions & 0 deletions metricbeat/module/system/core/config.go
@@ -0,0 +1,46 @@
package core

import (
	"strings"

	"github.com/elastic/beats/libbeat/logp"
	"github.com/pkg/errors"
)

// Core metric types.
const (
	percentages = "percentages"
	ticks       = "ticks"
)

// Config for the system core metricset.
type Config struct {
	Metrics  []string `config:"core.metrics"`
	CPUTicks *bool    `config:"cpu_ticks"` // Deprecated.
Member

Should we deprecate it in 5.6 and remove it here?

Contributor

@ruflin Any reason to deprecate it in 5.6 instead of 6.0.0?

Member

I don't like deprecated features in the code base. If we don't do it now, we have to wait until 7.0 to remove it. Seems like a good time to do it now. Any disadvantage to doing it now?

Member Author

Does it make sense to add deprecation warnings in 5.6 when there is no replacement for cpu_ticks in 5.6? I thought there should be some period of overlap where the two config options are available to aid in migration.

Member

You mean no replacement in 6.0?

Member Author

I meant 5.6. But possibly I misunderstood the original suggestion. You were suggesting adding the deprecation warnings in 5.6 and not having a migration path in 5.6 (e.g. we are telling the user not to use cpu_ticks, but not offering them an alternative in 5.6)?

}

// Validate validates the core config.
func (c Config) Validate() error {
	if c.CPUTicks != nil {
		logp.Deprecate("6.1", "cpu_ticks is deprecated. Add 'ticks' to the core.metrics list.")
	}

	if len(c.Metrics) == 0 {
		return errors.New("core.metrics cannot be empty")
	}

	for _, metric := range c.Metrics {
		switch strings.ToLower(metric) {
		case percentages, ticks:
		default:
			return errors.Errorf("invalid core.metrics value '%v' (valid "+
				"options are %v and %v)", metric, percentages, ticks)
		}
	}

	return nil
}

var defaultConfig = Config{
	Metrics: []string{percentages},
}
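
To see how the validation rules above behave on different inputs, here is a standalone sketch that can be run outside of Metricbeat. It is an approximation under stated assumptions: `validateMetrics` is a hypothetical free function that mirrors the loop in `Validate`, the deprecation warning is left out, and the standard library's `errors` and `fmt.Errorf` replace `github.com/pkg/errors`.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Core metric types, as in the diff above.
const (
	percentages = "percentages"
	ticks       = "ticks"
)

// validateMetrics is a hypothetical standalone version of the checks in
// Config.Validate: the list must be non-empty and each entry must match a
// known metric type, ignoring case.
func validateMetrics(metrics []string) error {
	if len(metrics) == 0 {
		return errors.New("core.metrics cannot be empty")
	}

	for _, metric := range metrics {
		switch strings.ToLower(metric) {
		case percentages, ticks:
		default:
			return fmt.Errorf("invalid core.metrics value '%v' (valid "+
				"options are %v and %v)", metric, percentages, ticks)
		}
	}

	return nil
}

func main() {
	inputs := [][]string{
		{"percentages"},
		{"Ticks", "percentages"},
		{},
		{"normalized_percentages"}, // Listed for cpu.metrics only, not core.metrics.
	}
	for _, in := range inputs {
		fmt.Printf("%v -> %v\n", in, validateMetrics(in))
	}
}
```

Note that the comparison is case-insensitive but the stored values are not lowercased, so any code that later reads `Metrics` would need to apply the same normalization.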