
Non negligible divergence between recorded metric and original expression #1832

Closed
noushi opened this Issue Jul 20, 2016 · 3 comments

noushi commented Jul 20, 2016

What did you do?
To speed up the console node list, I created a recording rule:

node:cpu_iowait:cur_avg = 100 * (avg without(cpu, mode) (irate(node_cpu{mode='iowait'}[5m])))

What did you expect to see?
I expected the recorded metric to be almost identical to the original expression.

What did you see instead? Under which circumstances?
Instead, graphing

abs(100 * (avg without(cpu, mode) (irate(node_cpu{mode='iowait'}[5m]))) - node:cpu_iowait:cur_avg)

shows a divergence of more than 20% on multiple occasions:
[Image: divergence between recorded and original cpu_iowait, 15m window]
[Image: divergence between recorded and original cpu_iowait, 6h window]

DISCLAIMER: The server showing the highest divergence is one of our busiest.

Environment

  • System information:
$ uname -srm
Linux 3.19.0-61-generic x86_64
  • Prometheus version:
    I'm using a Docker container.

From /status:

Version     0.20.0
Revision    aeab25c
  • Prometheus configuration file:
global:
  scrape_interval:     15s # By default, scrape targets every 15 seconds.

rule_files:
   - rules/node_speedup.rules

scrape_configs:

  - job_name: 'node'

    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s

    static_configs:
        <node_list>
        ...
  • Logs:

I didn't see any relevant entries.


brian-brazil commented Jul 20, 2016

Given that the default evaluation interval is 1m and you're scraping every 15s, this is an expected result when using irate. Try rate instead.
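
The suggestion above can be sketched as follows. irate computes its result from only the last two samples in the range, so a rule evaluated once per evaluation interval records a snapshot of a spiky signal, while the graphed expression is re-evaluated at every graph step and can pick different sample pairs (the node job in the posted configuration is scraped every 5s). rate averages over the whole 5m window, so the recorded and ad hoc values should stay much closer. This is only a sketch: the rule name node:cpu_iowait:rate5m is hypothetical, and the syntax assumes the same rule file format used in the original report.

```
# Hypothetical rule name; same expression as in the report, with irate swapped for rate.
node:cpu_iowait:rate5m = 100 * (avg without(cpu, mode) (rate(node_cpu{mode='iowait'}[5m])))
```

Graphing abs(100 * (avg without(cpu, mode) (rate(node_cpu{mode='iowait'}[5m]))) - node:cpu_iowait:rate5m) would then show whether the divergence shrinks. How often rules are evaluated is controlled by evaluation_interval in the global section of the configuration file.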


noushi commented Jul 25, 2016

Thank you Brian, I'll close this issue as I'm not working on it anymore.
