
[logging/metrics] Provide ability to log ops data from core metrics service #86162

Closed
lukeelmers opened this issue Dec 16, 2020 · 4 comments · Fixed by #88070

lukeelmers (Member) commented Dec 16, 2020

Related to #13241

In the legacy platform, we rely on the @hapi/good ops event for automatically logging ops metrics at a particular interval. The opsInterval is configurable and provided to the hapi/good configuration via the legacy logging server.
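
For context, here is a rough sketch of how an ops interval is wired up with @hapi/good so that it emits 'ops' events on a timer (illustrative only; the reporter pipeline and function name below are placeholders, not Kibana's actual legacy logging setup):

```ts
// Illustrative registration of @hapi/good with an ops interval; not Kibana code.
import { Server } from '@hapi/hapi';

export async function registerOpsReporting(server: Server, opsInterval: number) {
  await server.register({
    plugin: require('@hapi/good'),
    options: {
      ops: { interval: opsInterval }, // the configurable opsInterval, in milliseconds
      reporters: {
        opsReporter: [
          // keep only 'ops' events from the event stream
          { module: '@hapi/good-squeeze', name: 'Squeeze', args: [{ ops: '*' }] },
          'stdout', // write the selected events to stdout
        ],
      },
    },
  });
}
```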

As part of our effort to remove our dependency on @hapi/good, we should provide some equivalent functionality in the Kibana platform logger via the metrics service.

Currently, good includes a handful of metrics in the ops event payload**; however, the legacy logger only pulls the following items from it:

| Ops formatted log property | Location in @hapi/good ops payload | Location in metrics service |
| --- | --- | --- |
| type | `event` (`'ops'`) | |
| @timestamp | `timestamp` | |
| tags | - | |
| pid | `pid` | `process.pid` |
| memory | `proc.mem.heapUsed` | `process.memory.heap.used_in_bytes` |
| uptime | `proc.uptime` | `process.uptime_in_millis` |
| load | `os.load` | `os.load` |
| delay | `proc.delay` | `process.event_loop_delay` |

** More detail on the items good includes in the payload can be found in these old docs from even-better
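
For reference, the fields in the right-hand column of the table live on the OpsMetrics object produced by the metrics service; a trimmed sketch of just the relevant shape (the real interface contains more fields):

```ts
// Trimmed sketch of the parts of OpsMetrics referenced above; not the full interface.
interface OpsMetricsSubset {
  process: {
    pid: number;
    memory: { heap: { used_in_bytes: number } };
    uptime_in_millis: number;
    event_loop_delay: number;
  };
  os: {
    load: { '1m': number; '5m': number; '15m': number };
  };
}
```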

Tasks:

  • Update the core metrics service to log these metrics at the ops.interval from config
    • The metrics service should already be collecting all of these items in different locations, and is already using ops.interval to specify the collection interval (see the sketch after this list)
  • Integration tests
  • Documentation on where to find these logs moving forward
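
For orientation, a paraphrased sketch (not a verbatim excerpt of metrics_service.ts) of the existing collection loop, which already runs on the configured ops.interval; ops logging only needs to hook into the same refresh cycle:

```ts
// Paraphrased sketch of the wiring inside MetricsService.start(); names follow
// the existing service, but this is not an exact copy of the source.
const config = await this.coreContext.configService
  .atPath<OpsConfigType>('ops') // the same config path that holds ops.interval
  .pipe(first()) // `first` from 'rxjs/operators'
  .toPromise();

this.collectInterval = setInterval(() => {
  this.refreshMetrics(); // collect, format, log, and emit on each tick
}, config.interval.asMilliseconds());
```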

Open questions / things to investigate:

  • Which logging context to use?
    • 'metrics' probably makes the most sense unless there's a reason we want it to be separate from the logs related to the service itself (see the sketch after this list)
  • Is there anything we can remove/simplify in @kbn/legacy-logging that still preserves BWC?
    • e.g. Could we drop some of the special handling in the log formatting so that we reformat the existing data from the metrics service instead of what comes from the hapi/good ops events? This would be cleaner.
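
A minimal sketch (assumed wiring with approximate import paths; not necessarily the final implementation) of what using the 'metrics' context would look like from the metrics service:

```ts
// src/core/server/metrics/metrics_service.ts (sketch only)
import type { CoreContext } from '../core_context';
import type { Logger } from '../logging';

export class MetricsService {
  private readonly logger: Logger;

  constructor(private readonly coreContext: CoreContext) {
    // Ops log entries would then appear under the [metrics] logging context
    this.logger = coreContext.logger.get('metrics');
  }
}
```
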
@lukeelmers lukeelmers added Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc Feature:Logging labels Dec 16, 2020
elasticmachine (Contributor) commented:

Pinging @elastic/kibana-core (Team:Core)

TinaHeiligers (Contributor) commented:

@lukeelmers I've gone with intercepting refreshMetrics(), where we already collect the metrics at the configured ops.interval, and used an approach similar to the legacy logging implementation to extract the data we're interested in and format it in the same manner:

Logging metrics from within refreshMetrics():

```ts
// src/core/server/metrics/metrics_service.ts

  private async refreshMetrics() {
    this.logger.debug('Refreshing metrics');
    const metrics = await this.metricsCollector!.collect(); // capture the metrics
    const opsLogsMetrics = this.extractOpsLogsData(metrics); // extract and format the metrics for logging
    this.logger.info(opsLogsMetrics); // log the metrics out

    this.metricsCollector!.reset();
    this.metrics$.next(metrics);
  }
```
Extracting and formatting the metrics for logging:

```ts
// src/core/server/metrics/metrics_service.ts

  private extractOpsLogsData({ process, os }: Partial<OpsMetrics>): string {
    const memoryLogEntryInMB = numeral(process?.memory?.heap?.used_in_bytes ?? 0).format('0.0b');
    // ProcessMetricsCollector converts from seconds to milliseconds. Format here is HH:mm:ss for backward compatibility
    const uptimeLogEntry = numeral((process?.uptime_in_millis ?? 0) / 1000).format('00:00:00');
    const loadLogEntry = [...Object.values(os?.load ?? [])]
      .map((val: number) => {
        return numeral(val).format('0.00');
      })
      .join(' ');
    const delayLogEntry = numeral(process?.event_loop_delay ?? 0).format('0.000');

    return `memory: ${memoryLogEntryInMB} uptime: ${uptimeLogEntry} load: [${loadLogEntry}] delay: ${delayLogEntry}`;
  }
```
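
For readers unfamiliar with the numeral format tokens used above, a quick illustration (the input values and the import path are assumptions for illustration, not taken from a real run):

```ts
import numeral from '@elastic/numeral';

numeral(347_758_592).format('0.0b'); // bytes -> a human-readable size such as '331.7MB'
numeral(13.4).format('00:00:00');    // seconds -> an 'H:mm:ss' uptime such as '0:00:13'
numeral(2.8642).format('0.00');      // a two-decimal load average such as '2.86'
numeral(0.2879).format('0.000');     // a three-decimal event loop delay such as '0.288'
```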

Example of the output:

With logging appenders:

```yaml
# kibana.dev.yml
logging:
  events:
    log: ['warning', 'error', 'fatal', 'info', 'reporting']
    json: false
    verbose: true
    # added for logging work
  appenders:
    console:
      kind: console
      layout:
        kind: pattern
        highlight: true

  root:
    appenders: [default]
    level: warn

  loggers:
    - context: metrics
      appenders: [console]
      level: info
```

Outputs:

```
[2021-01-11T18:03:01.138Z][INFO ][http.server.Kibana] http server running at http://localhost:5601/chr
[2021-01-11T18:03:04.088Z][INFO ][metrics] memory: 331.7MB uptime: 0:00:13 load: [2.86 4.13 3.38] delay: 0.288
[2021-01-11T18:03:04.428Z][INFO ][metrics] memory: 246.9MB uptime: 0:00:18 load: [2.86 4.13 3.38] delay: 0.223
```

Without appenders, the logs are forwarded to the legacy logging system and we will get duplicate data:

Outputs:

```
server    ops   [10:57:45.324]  memory: 356.6MB uptime: 0:00:21 load: [8.12 5.40 3.30] delay: 3.034                 -> legacy logging
server    log   [10:57:45.802] [info][metrics] memory: 357.3MB uptime: 0:00:21 load: [8.12 5.40 3.30] delay: 0.155  -> NP logging forwarded to legacy logging
[2021-01-11T17:57:47.954Z][INFO ][metrics] memory: 121.1MB uptime: 0:00:29 load: [8.12 5.40 3.30] delay: 0.226      -> NP logging from metrics service
```

I'd like to make sure this is the correct approach and that there aren't any obvious issues with it before diving into integration tests (for which I'll need to call on you for some more guidance 😉).

pgayvallet (Contributor) commented:

> Without appenders, the logs are forwarded to the legacy logging system and we will get duplicate data:

That's the same issue @lukeelmers encountered with request/response logging in #13241. I guess the solution will be to log them under the debug level in core to avoid this duplication?

TinaHeiligers (Contributor) commented Jan 14, 2021

@pgayvallet In my draft for this work, I've gone with logging the data under the info level.
