This repository has been archived by the owner on Jan 12, 2022. It is now read-only.

controller: Add metrics listener with initial event count metric as demo #1322

Merged
merged 3 commits into from
Aug 25, 2021

@nickbp nickbp commented Aug 25, 2021

Adds an endpoint for getting metrics from the controller. It defaults to port 8900 but can be configured or disabled via the optional `--metrics-port` arg. Resolves the sole `TODO METRICS` entry as an example to start with.
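
To illustrate the port handling described above, here is a minimal sketch of how an optional `--metrics-port` argument might be resolved. The helper name `resolveMetricsPort` is hypothetical, and treating `0` as "disabled" is an assumption: the PR description does not say how disabling is expressed.

```typescript
// Hypothetical helper, not the controller's actual code. The PR only states
// that the port defaults to 8900 and can be configured or disabled.
const DEFAULT_METRICS_PORT = 8900;

// Returns the port to listen on, or null when the metrics listener is disabled.
// Assumption: a value of 0 means "disabled".
function resolveMetricsPort(argv: string[]): number | null {
  let raw: string | undefined;
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i] ?? "";
    if (arg === "--metrics-port") {
      raw = argv[i + 1] ?? ""; // form: --metrics-port 9000
    } else if (arg.startsWith("--metrics-port=")) {
      raw = arg.slice("--metrics-port=".length); // form: --metrics-port=9000
    }
  }
  // Flag not given at all: fall back to the default port.
  if (raw === undefined) return DEFAULT_METRICS_PORT;

  const port = Number(raw);
  if (raw.trim() === "" || !Number.isInteger(port) || port < 0 || port > 65535) {
    throw new Error(`invalid --metrics-port value: "${raw}"`);
  }
  return port === 0 ? null : port;
}
```

Returning `null` for the disabled case keeps the caller's decision explicit: it can skip starting the listener entirely rather than binding to a sentinel port.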

Need to check if there's anything needed to hook up these metrics into the system tenant...

$ curl -v http://localhost:8900/metrics

# HELP process_cpu_user_seconds_total Total user CPU time spent in seconds.
# TYPE process_cpu_user_seconds_total counter
process_cpu_user_seconds_total 1.224548

# HELP process_cpu_system_seconds_total Total system CPU time spent in seconds.
# TYPE process_cpu_system_seconds_total counter
process_cpu_system_seconds_total 0.106098

# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 1.330646

# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1629868507

[...]

# HELP nodejs_eventloop_lag_seconds Lag of event loop in seconds.
# TYPE nodejs_eventloop_lag_seconds gauge
nodejs_eventloop_lag_seconds 0

# HELP nodejs_eventloop_lag_min_seconds The minimum recorded event loop delay.
# TYPE nodejs_eventloop_lag_min_seconds gauge
nodejs_eventloop_lag_min_seconds 0.003028992

# HELP nodejs_eventloop_lag_max_seconds The maximum recorded event loop delay.
# TYPE nodejs_eventloop_lag_max_seconds gauge
nodejs_eventloop_lag_max_seconds 0.088735743

[...]

# HELP informer_events Incremented when a new k8s/graphql event is received
# TYPE informer_events counter
informer_events{type="FETCH_K8S_NAMESPACES_SUCCESS"} 1
informer_events{type="FETCH_K8S_V1ALERTMANAGERS_SUCCESS"} 1
informer_events{type="FETCH_K8S_V1PODMONITORS_SUCCESS"} 1
informer_events{type="FETCH_K8S_V1PROMETHEUSRULES_SUCCESS"} 1
informer_events{type="FETCH_K8S_V1PROMETHEUSS_SUCCESS"} 1
[...]
informer_events{type="ON_ADDED_K8S_CONFIGMAPS_REQUEST"} 50
informer_events{type="FETCH_K8S_CRDS_SUCCESS"} 1
informer_events{type="ON_ADDED_K8S_CRDS"} 16
informer_events{type="ON_UPDATED_K8S_STATEFUL_SETS"} 6
informer_events{type="ON_UPDATED_K8S_NODES"} 2
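
For readers unfamiliar with the output above, the following is a dependency-free sketch of a counter with a single label, rendered in the Prometheus text exposition format. It is not the controller's implementation (which may well use a metrics library instead); the class `LabeledCounter` and the function `escapeLabel` are hypothetical names introduced for illustration.

```typescript
// Escape a label value as required by the Prometheus text format:
// backslash, double quote and line feed must be escaped.
function escapeLabel(value: string): string {
  return value.replace(/\\/g, "\\\\").replace(/"/g, '\\"').replace(/\n/g, "\\n");
}

// Hypothetical minimal counter with exactly one label dimension.
class LabeledCounter {
  private readonly name: string;
  private readonly help: string;
  private readonly labelName: string;
  // One running total per label value, kept in insertion order.
  private readonly counts = new Map<string, number>();

  constructor(name: string, help: string, labelName: string) {
    this.name = name;
    this.help = help;
    this.labelName = labelName;
  }

  // Increment the series for one label value, starting from 0 if unseen.
  inc(labelValue: string, by: number = 1): void {
    this.counts.set(labelValue, (this.counts.get(labelValue) ?? 0) + by);
  }

  // Render HELP/TYPE headers followed by one sample line per label value.
  render(): string {
    const lines: string[] = [
      `# HELP ${this.name} ${this.help}`,
      `# TYPE ${this.name} counter`,
    ];
    this.counts.forEach((value, labelValue) => {
      lines.push(`${this.name}{${this.labelName}="${escapeLabel(labelValue)}"} ${value}`);
    });
    return lines.join("\n") + "\n";
  }
}

// Usage: count events by type, as the informer_events metric does.
const informerEvents = new LabeledCounter(
  "informer_events",
  "Incremented when a new k8s/graphql event is received",
  "type"
);
informerEvents.inc("FETCH_K8S_NAMESPACES_SUCCESS");
informerEvents.inc("ON_UPDATED_K8S_NODES");
informerEvents.inc("ON_UPDATED_K8S_NODES");
const metricsBody: string = informerEvents.render();
```

Keying the counter by event type means a new event type shows up as a new series without any code change, which matches the shape of the `informer_events` lines above.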

Signed-off-by: Nick Parker <nick@opstrace.com>

Nick Parker added 3 commits August 25, 2021 17:17
Adds endpoint for getting metrics from the controller. Defaults to port 8900 but can be configured or disabled via optional `--metrics-port` arg. Resolves the sole `TODO METRICS` entry as an example to start with.

Signed-off-by: Nick Parker <nick@opstrace.com>
@@ -126,7 +139,7 @@ async function main() {

 if (require.main === module) {
   process.on("SIGINT", function () {
-    log.debug("Received SIGINT, exiting");
+    log.info("Received SIGINT, exiting");

👍

@jgehrcke

Nice! We may want to add an HTTP server terminator, though, to make sure that we shut down the Node.js HTTP server within predictable time boundaries.

See for example

const httpServerTerminator = createHttpTerminator({

@jgehrcke jgehrcke left a comment

🚀 let's get this in!

@jgehrcke jgehrcke merged commit f6989cd into main Aug 25, 2021
@jgehrcke jgehrcke deleted the nick/controller-metrics branch August 25, 2021 08:53