feat(metrics): add runtime metrics #1021

Merged
merged 42 commits into from
Jul 2, 2019
90de51f
feat(metrics): add runtime metrics
Apr 17, 2019
b8fa5f9
chore(metrics): js event loop delay fallback
May 1, 2019
18faab3
docs(metrics): improve nodejs.loop.delay description
May 7, 2019
8520b04
chore(metrics): improve metric names
May 8, 2019
0f8e0e2
refactor(metrics): pull out monitor-event-loop-delay polyfill
Qard May 11, 2019
98b381d
chore(metrics): improve naming and cpu usage calculation
Qard May 16, 2019
ffb2871
docs(metrics): improve event loop delay description
Qard Jun 3, 2019
f36f4ee
feat(metrics): replace secondary timers with triggered collect
Qard Jun 7, 2019
b414807
feat(metrics): process system and user percent
Qard Jun 8, 2019
271a0e6
refactor(metrics): improve collector registration process
Qard Jun 13, 2019
22cd9e0
Merge branch 'master' into runtime-metrics
watson Jun 17, 2019
986300a
fix: active_{handles|requests} => {handles|requests}.active
watson Jun 17, 2019
3e19b85
perf: optimize gathering of process cpu metrics
watson Jun 17, 2019
d0a6a86
test: include test/metrics folder in our tests
watson Jun 17, 2019
265b244
test: improve test output
watson Jun 17, 2019
d4e4768
fix: include process metrics on Linux
watson Jun 17, 2019
b12fda1
test: fix test output
watson Jun 17, 2019
1a045e6
test: fix system.memory.actual.free test on Linux
watson Jun 17, 2019
8a16e5c
perf: improve algorithm for reading /proc/meminfo
watson Jun 17, 2019
81554ad
fix(package): bump measured-reporting to ^1.49.0
watson Jun 18, 2019
270cb8e
docs: fix docs
watson Jun 18, 2019
28972c8
fix: ensure CPU metrics are read correctly on Linux
watson Jun 20, 2019
219cc27
Merge branch 'master' into runtime-metrics
watson Jun 20, 2019
6b30904
fix: comment
watson Jun 20, 2019
f9adf3a
test: fix metricset tags
watson Jun 20, 2019
22f28f3
fix(metrics): don't silently catch errors in metrics code
watson Jul 1, 2019
454f14f
perf(metrics): re-use regex
watson Jul 1, 2019
cee2ad2
refactor: don't pre-calc system.process.cpu.total.norm.pct
watson Jul 1, 2019
2bf0a1a
chore(metrics): use more modern JavaScript
watson Jul 1, 2019
3a805a2
perf(metrics): prefer buf.toString() over String(buf)
watson Jul 1, 2019
e132dbc
refactor(metrics): no need to keep rss in variable
watson Jul 1, 2019
ff4909a
fix(metrics): improve process line parsing
watson Jul 1, 2019
56f63b2
refactor(metrics): cpu line will always be first in /proc/self
watson Jul 1, 2019
0adba56
fix(metrics): include all /proc/self fields in total cpu calc
watson Jul 1, 2019
c71f481
chore(metrics): code cleanup
watson Jul 1, 2019
f39c424
fix(metrics): ensure CPU can't be larger than 100%
watson Jul 1, 2019
9c55712
perf(metrics): speed up parsing of /proc files
watson Jul 1, 2019
478ddfa
test(metrics): improve test output
watson Jul 1, 2019
a98a4fd
test(metrics): expect 2nd metrics run to spend more than 0% cpu
watson Jul 1, 2019
9029cf8
test(metrics): make tests less flaky
watson Jul 1, 2019
b060f3e
fix(metrics): ensure CPU can't be reported as NaN% on non-Linux
watson Jul 1, 2019
32eb0c0
Merge branch 'master' into runtime-metrics
watson Jul 1, 2019
68 changes: 68 additions & 0 deletions docs/metrics.asciidoc
@@ -60,3 +60,71 @@ This value is normalized by the number of CPU cores and it ranges from 0 to 100%

The Resident Set Size,
the amount of memory the process occupies in main memory (RAM).

[float]
[[metric-nodejs.handles.active]]
=== `nodejs.handles.active`

* *Type:* Long
* *Format:* Counter

The number of active libuv handles,
likely held open by currently running I/O operations.

[float]
[[metric-nodejs.requests.active]]
=== `nodejs.requests.active`

* *Type:* Long
* *Format:* Counter

The number of active libuv requests,
likely waiting for a response to an I/O operation.
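As a rough illustration of where such counts can come from, the sketch below reads them from the running process. It assumes the undocumented `process._getActiveHandles()` and `process._getActiveRequests()` internals are available; whether the agent uses these exact calls is not shown in this diff, and the helper name `activeCounts` is hypothetical.

```javascript
'use strict'

// Assumption: process._getActiveHandles() and process._getActiveRequests()
// are undocumented, underscore-prefixed Node.js internals. They may change
// between Node.js versions, so fall back to 0 when they are missing.
function activeCounts () {
  const handles = typeof process._getActiveHandles === 'function'
    ? process._getActiveHandles().length
    : 0
  const requests = typeof process._getActiveRequests === 'function'
    ? process._getActiveRequests().length
    : 0

  return {
    'nodejs.handles.active': handles,
    'nodejs.requests.active': requests
  }
}

console.log(activeCounts())
```

Both values are point-in-time readings, so they can rise and fall between two reports.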

[float]
[[metric-system.process.cpu.user.norm.pct]]
=== `system.process.cpu.user.norm.pct`

* *Type:* Float
* *Format:* Percent

The percentage of CPU time spent executing application code,
normalized by the number of CPU cores.

[float]
[[metric-system.process.cpu.system.norm.pct]]
=== `system.process.cpu.system.norm.pct`

* *Type:* Float
* *Format:* Percent

The percentage of CPU time spent executing kernel code as a result of application activity,
normalized by the number of CPU cores.
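A minimal sketch of how user and system CPU shares could be derived, assuming `process.cpuUsage()` (available from Node.js 6.1) as the data source rather than the `process-top` module used in this PR. The helper name `createCpuSampler` is hypothetical, and the values are fractions between 0 and 1.

```javascript
'use strict'

const os = require('os')

// Hypothetical helper: returns a function reporting the share of wall-clock
// time this process spent in user and system mode since the previous call,
// divided by the number of CPU cores.
function createCpuSampler () {
  const cores = os.cpus().length || 1
  let lastUsage = process.cpuUsage() // cumulative microseconds
  let lastTime = process.hrtime()

  return function sample () {
    const usage = process.cpuUsage()
    const [sec, nano] = process.hrtime(lastTime)
    const elapsedMicros = sec * 1e6 + nano / 1e3

    const user = (usage.user - lastUsage.user) / elapsedMicros / cores
    const system = (usage.system - lastUsage.system) / elapsedMicros / cores

    lastUsage = usage
    lastTime = process.hrtime()

    // `|| 0` turns NaN into 0 and Math.min caps the value at 100%.
    return {
      'system.process.cpu.user.norm.pct': Math.min(user || 0, 1),
      'system.process.cpu.system.norm.pct': Math.min(system || 0, 1)
    }
  }
}

const sample = createCpuSampler()
console.log(sample())
```

Dividing by the core count keeps a process that saturates a single core on a four-core machine at roughly 0.25 instead of 1.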

[float]
[[metric-nodejs.eventloop.delay.avg.ms]]
=== `nodejs.eventloop.delay.avg.ms`

* *Type:* Float
* *Format:* Milliseconds

The number of milliseconds of event loop delay.
Event loop delay is sampled every 10 milliseconds.
Delays shorter than 10ms may not be observed,
for example if a blocking operation starts and ends within the same sampling period.
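The 10 millisecond sampling described above can be sketched with `monitorEventLoopDelay` from `perf_hooks` (Node.js 11.10 and later). This is an illustration only: the helper name `createDelayMonitor` is hypothetical, and the commits in this PR indicate the agent also carries a fallback for runtimes without this API, which is not modelled here.

```javascript
'use strict'

const { monitorEventLoopDelay } = require('perf_hooks')

// Hypothetical helper wrapping the histogram. `resolution` is the sampling
// interval in milliseconds; 10 matches the description above.
function createDelayMonitor () {
  const histogram = monitorEventLoopDelay({ resolution: 10 })
  histogram.enable()

  return {
    // Mean recorded by the histogram since the last read, in milliseconds.
    // The histogram records nanoseconds, and the guard returns 0 when no
    // finite mean is available (for example before any sample exists).
    read () {
      const meanNs = histogram.mean
      histogram.reset()
      return Number.isFinite(meanNs) ? meanNs / 1e6 : 0
    },
    stop () {
      histogram.disable()
    }
  }
}

const monitor = createDelayMonitor()
console.log('nodejs.eventloop.delay.avg.ms', monitor.read())
monitor.stop()
```

Resetting the histogram on each read keeps every reported average scoped to one reporting interval.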

[float]
[[metric-nodejs.memory.heap.allocated.bytes]]
=== `nodejs.memory.heap.allocated.bytes`

* *Type:* Long
* *Format:* Bytes

The current allocated heap size in bytes.

[float]
[[metric-nodejs.memory.heap.used.bytes]]
=== `nodejs.memory.heap.used.bytes`

* *Type:* Long
* *Format:* Bytes

The currently used heap size in bytes.
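For orientation, both heap figures can be read from `process.memoryUsage()`. The sketch assumes that "allocated" corresponds to `heapTotal` and "used" to `heapUsed`; that mapping is an assumption, since the collecting code is not part of the diff shown here, and `heapMetrics` is a hypothetical name.

```javascript
'use strict'

// Assumption: heap.allocated maps to heapTotal and heap.used maps to
// heapUsed as reported by process.memoryUsage(). Both are in bytes.
function heapMetrics () {
  const { heapTotal, heapUsed } = process.memoryUsage()
  return {
    'nodejs.memory.heap.allocated.bytes': heapTotal,
    'nodejs.memory.heap.used.bytes': heapUsed
  }
}

console.log(heapMetrics())
```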
16 changes: 13 additions & 3 deletions lib/metrics/platforms/generic/index.js
@@ -4,6 +4,8 @@ const os = require('os')

 const semver = require('semver')

+const Stats = require('./stats')
+
 module.exports = function createSystemMetrics (registry) {
   // Base system metrics
   registry.getOrCreateGauge(
@@ -22,10 +24,18 @@ module.exports = function createSystemMetrics (registry) {
   // Process metrics
   // NOTE: Process CPU metrics are not supported on 6.0.x
   if (semver.satisfies(process.versions.node, '>=6.1')) {
-    registry.getOrCreateGauge(
+    const stats = new Stats()
+    registry.registerCollector(stats)
+
+    const metrics = [
       'system.process.cpu.total.norm.pct',
-      require('./process-cpu')
-    )
+      'system.process.cpu.system.norm.pct',
+      'system.process.cpu.user.norm.pct'
+    ]
+
+    for (let metric of metrics) {
+      registry.getOrCreateGauge(metric, () => stats.toJSON()[metric])
+    }
   }
   registry.getOrCreateGauge(
     'system.process.memory.rss.bytes',
7 changes: 6 additions & 1 deletion lib/metrics/platforms/generic/process-cpu.js
@@ -7,5 +7,10 @@ const processTop = require('./process-top')()
 const cpus = os.cpus()

 module.exports = function processCPUUsage () {
-  return processTop.cpu().percent / cpus.length
+  const cpu = processTop.cpu()
+  return {
+    total: cpu.percent / cpus.length,
+    user: (cpu.user / cpu.time) / cpus.length,
+    system: (cpu.system / cpu.time) / cpus.length
+  }
 }
28 changes: 28 additions & 0 deletions lib/metrics/platforms/generic/stats.js
@@ -0,0 +1,28 @@
+'use strict'
+
+const processCpu = require('./process-cpu')
+
+class Stats {
+  constructor () {
+    this.stats = {
+      'system.process.cpu.total.norm.pct': 0,
+      'system.process.cpu.system.norm.pct': 0,
+      'system.process.cpu.user.norm.pct': 0
+    }
+  }
+
+  toJSON () {
+    return this.stats
+  }
+
+  collect (cb) {
+    const cpu = processCpu()
+    this.stats['system.process.cpu.total.norm.pct'] = cpu.total
+    this.stats['system.process.cpu.system.norm.pct'] = cpu.system
+    this.stats['system.process.cpu.user.norm.pct'] = cpu.user
+
+    if (cb) process.nextTick(cb)
+  }
+}
+
+module.exports = Stats
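To show how a collector such as `Stats` might be driven, here is a minimal sketch of a registry that runs every registered collector before reading its gauges. `SketchRegistry`, its `report` method, and `CounterStats` are hypothetical stand-ins; the agent's real registry is not part of this diff, and only the `registerCollector` and `getOrCreateGauge` names are taken from it.

```javascript
'use strict'

// Hypothetical stand-in for the agent's metrics registry. Only the two
// method names used in the diff are modelled.
class SketchRegistry {
  constructor () {
    this.collectors = []
    this.gauges = new Map()
  }

  registerCollector (collector) {
    this.collectors.push(collector)
  }

  getOrCreateGauge (name, read) {
    if (!this.gauges.has(name)) this.gauges.set(name, read)
    return this.gauges.get(name)
  }

  // Run every collector, then read all gauges once they have finished.
  report (cb) {
    let pending = this.collectors.length
    const done = () => {
      const values = {}
      for (const [name, read] of this.gauges) values[name] = read()
      cb(values)
    }
    if (pending === 0) return process.nextTick(done)
    for (const collector of this.collectors) {
      collector.collect(() => {
        if (--pending === 0) done()
      })
    }
  }
}

// A collector with the same shape as Stats: collect(cb) refreshes a
// snapshot and toJSON() returns it.
class CounterStats {
  constructor () {
    this.stats = { 'example.ticks': 0 }
  }

  toJSON () {
    return this.stats
  }

  collect (cb) {
    this.stats['example.ticks']++
    if (cb) process.nextTick(cb)
  }
}

const registry = new SketchRegistry()
const stats = new CounterStats()
registry.registerCollector(stats)
for (const metric of Object.keys(stats.toJSON())) {
  registry.getOrCreateGauge(metric, () => stats.toJSON()[metric])
}
registry.report(values => console.log(values))
```

Because each gauge only reads from the snapshot, expensive sampling happens once per report in `collect` instead of once per gauge.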
2 changes: 1 addition & 1 deletion lib/metrics/platforms/generic/system-cpu.js
@@ -32,7 +32,7 @@ function cpuAverage () {
 function cpuPercent (last, next) {
   const idle = next.idle - last.idle
   const total = next.total - last.total
-  return 1 - idle / total
+  return 1 - idle / total || 0
 }

 let last = cpuAverage()
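The effect of the added `|| 0` can be checked in isolation. The function below repeats the arithmetic from the diff; the snapshot objects are made-up inputs for illustration.

```javascript
'use strict'

// Same arithmetic as cpuPercent in the diff. `||` binds more loosely than
// `-` and `/`, so the expression reads as (1 - idle / total) || 0.
function cpuPercent (last, next) {
  const idle = next.idle - last.idle
  const total = next.total - last.total
  return 1 - idle / total || 0
}

const snapshot = { idle: 500, total: 1000 }

// Two identical snapshots give 0 / 0, which is NaN, and NaN || 0 is 0.
console.log(cpuPercent(snapshot, snapshot)) // 0

// 50 of the 100 elapsed ticks were idle, so half of the time was busy.
console.log(cpuPercent({ idle: 500, total: 1000 }, { idle: 550, total: 1100 })) // 0.5
```

Without the guard, two reads taken before the CPU counters advance would report `NaN`.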
16 changes: 3 additions & 13 deletions lib/metrics/platforms/linux/index.js
@@ -2,22 +2,12 @@

 const Stats = require('./stats')

-module.exports = function createSystemMetrics (registry, reportingIntervalInSeconds) {
+module.exports = function createSystemMetrics (registry) {
   const stats = new Stats()
-  stats.start(reportingIntervalInSeconds)
-
-  const metrics = [
-    'system.cpu.total.norm.pct',
-    'system.memory.total',
-    'system.memory.actual.free',
-    'system.process.cpu.total.norm.pct',
-    'system.process.memory.size',
-    'system.process.memory.rss.bytes'
-  ]
+  registry.registerCollector(stats)

-  for (let metric of metrics) {
+  for (let metric of Object.keys(stats.toJSON())) {
     registry.getOrCreateGauge(metric, () => stats.toJSON()[metric])
   }
-
-  registry.collector = stats
 }