Keep EOF at end of OpenMetrics output #7982

Merged
merged 1 commit into from Nov 10, 2023

tjquinno
Member

@tjquinno tjquinno commented Nov 9, 2023

Description

Resolves #7981

Now that Helidon metrics is based on Micrometer, we use Micrometer's Prometheus meter registry to format the output. That registry handles both the Prometheus exposition format (which does not include a trailing # EOF line) and OpenMetrics (which does).

That said, our formatter post-processes the registry's output before returning it, and that post-processing was incorrectly removing the # EOF line from the end of OpenMetrics output.

The changes here remove that trimming and also enhance unit tests of the formatter to:

  1. Select OpenMetrics as the output format, since that format (not the Prometheus exposition format) is the one that has the # EOF trailer.
  2. Make sure the formatted output ends with that trailer.
  3. Slightly revise the expected results. The Prometheus format allows (and the Micrometer Prometheus meter registry emits) a trailing , after tags; OpenMetrics output does not have that trailing , so the tests, which now specify OpenMetrics format, must no longer expect it.
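
As a rough illustration of the behavior the revised tests check, here is a minimal, self-contained sketch. It is not the Helidon formatter: the class name, the `postProcess` step (which only drops blank lines), and the sample metric lines are all hypothetical. The point is simply that whatever post-processing is applied must leave the `# EOF` trailer as the last line.

```java
import java.util.stream.Collectors;

public class OpenMetricsTrailerSketch {

    // Trailer line that ends OpenMetrics text output.
    public static final String EOF_TRAILER = "# EOF";

    /**
     * Hypothetical stand-in for the formatter's post-processing:
     * drops blank lines and keeps every other line, including "# EOF".
     */
    public static String postProcess(String registryOutput) {
        return registryOutput.lines()
                .filter(line -> !line.isBlank())
                .collect(Collectors.joining("\n", "", "\n"));
    }

    /** Returns true if the last non-blank line of the output is the "# EOF" trailer. */
    public static boolean endsWithEofTrailer(String output) {
        String trimmed = output.stripTrailing();
        int lastBreak = trimmed.lastIndexOf('\n');
        return trimmed.substring(lastBreak + 1).equals(EOF_TRAILER);
    }

    public static void main(String[] args) {
        // Made-up OpenMetrics-style sample; note there is no trailing comma after the tag.
        String sample = "# TYPE requests counter\n"
                + "requests_total{scope=\"application\"} 3.0\n"
                + "\n"
                + "# EOF\n";

        String formatted = postProcess(sample);
        System.out.println("ends with trailer: " + endsWithEofTrailer(formatted));
    }
}
```

A unit test along these lines would fail as soon as a post-processing step trimmed the trailer again.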

As a test specific to the use case in the issue, I ran the SE Quickstart app and started a Prometheus server locally using the prometheus.yml configuration file below. The Prometheus server now accepts and processes the response from the metrics endpoint. (Most of the config is boilerplate; the interesting part is the last job in the scrape_configs section.)

# my global config
global:
  scrape_interval:     5s # Set the scrape interval to every 5 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# Scrape configurations: one job for Prometheus itself
# and one for the Helidon app.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
    - targets: ['localhost:9090']

  - job_name: 'helidon'
    metrics_path: /observe/metrics
    static_configs:
    - targets: ['localhost:8080']
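
For a quick check without running a Prometheus server, the endpoint can also be queried directly. The sketch below uses the JDK's `java.net.http` client against the same host, port, and path as the `helidon` job above. It assumes the endpoint selects OpenMetrics output when the `Accept` header is `application/openmetrics-text`; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class MetricsEndpointCheck {

    // Assumption: this media type makes the endpoint return OpenMetrics output.
    public static final String OPENMETRICS_MEDIA_TYPE = "application/openmetrics-text";

    /** Builds a GET request that asks for OpenMetrics output. */
    public static HttpRequest buildRequest(URI endpoint) {
        return HttpRequest.newBuilder(endpoint)
                .header("Accept", OPENMETRICS_MEDIA_TYPE)
                .GET()
                .build();
    }

    /** Returns true if the body, ignoring trailing whitespace, ends with "# EOF". */
    public static boolean endsWithEof(String body) {
        return body.stripTrailing().endsWith("# EOF");
    }

    public static void main(String[] args) throws InterruptedException {
        // Host, port, and path match the 'helidon' job in the Prometheus config.
        URI endpoint = URI.create("http://localhost:8080/observe/metrics");
        try {
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(buildRequest(endpoint), HttpResponse.BodyHandlers.ofString());
            System.out.println("status=" + response.statusCode()
                    + ", ends with # EOF: " + endsWithEof(response.body()));
        } catch (IOException e) {
            // The app is probably not running; report and exit normally.
            System.out.println("Could not reach " + endpoint + ": " + e.getMessage());
        }
    }
}
```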

Documentation

No doc impact - this is a bug fix.

Signed-off-by: Tim Quinn <tim.quinn@oracle.com>
@tjquinno tjquinno self-assigned this Nov 9, 2023
@oracle-contributor-agreement oracle-contributor-agreement bot added the OCA Verified All contributors have signed the Oracle Contributor Agreement. label Nov 9, 2023
@tjquinno tjquinno merged commit 31afdb4 into helidon-io:main Nov 10, 2023
12 checks passed
@tjquinno tjquinno deleted the 4.x-prom-eof branch November 10, 2023 14:08
Labels
OCA Verified All contributors have signed the Oracle Contributor Agreement.
Development

Successfully merging this pull request may close these issues.

[Helidon 4 SE] Metrics format not compatible with Prometheus: data does not end with # EOF
2 participants