Range selector over 16s is broken #3939

Closed
free opened this Issue Mar 9, 2018 · 7 comments

Contributor

free commented Mar 9, 2018

Apologies for the weirdly specific and yet vague issue title. What I see is this: when a function that takes a range selector (e.g. rate()) is used in a query_range request with a 16s range over the start of a series with 1s resolution, it behaves as if it were fed the first 16 points in the series and then, as if those 16 points had suddenly disappeared, only the following 1 point, then 2 points, 3 points and so on. So the rate graph looks like this:

[Graph: rate foo 16s]

The 17s rate (like the rate for most other ranges, except 15s, 31s, 32s, 61s...) looks correct:

[Graph: rate foo 17s]

What did you do?

Did a /query_range?query=rate(gen_counter[16s])&step=1 over the start of a counter with 1s resolution, incrementing by 1 every second.

What did you expect to see?

A smooth increase from 0 to 1 over the first 16 seconds, followed by a flat rate of 1.

What did you see instead? Under which circumstances?

A smooth increase from 0 to 1, followed by a missing value, followed by a smooth increase from 0 to 1, followed by a flat rate of 1.

Environment

  • System information:

    Linux 4.13.0-36-generic x86_64

  • Prometheus version:

    prometheus, version 2.2.0 (branch: HEAD, revision: f63e7db)
    build user: root@52af9f66ce71
    build date: 20180308-16:40:42
    go version: go1.10

  • Prometheus configuration file:

    # Global configuration
    global:
      scrape_interval:     1s
      evaluation_interval: 1s

      external_labels:
          monitor: 'prometheus'

    rule_files:
      - 'rules/*.rules.yml'

  • rules/generator.rules.yml:

    groups:
    - name: generator.rules
      rules:

      - record: gen_helper
        expr: 0
        labels:
          job: gen_test

      - record: gen_counter
        expr: (gen_counter + 1) or gen_helper

To reproduce, dump the contents of the config directory in the attached config.tar.gz into a fresh Prometheus 2.2.0 installation and fire up Prometheus. Then go to http://localhost:9090/graph?g0.range_input=1m&g0.expr=rate(gen_counter%5B16s%5D)&g0.tab=0&g1.range_input=1h&g1.expr=gen_counter%5B100s%5D&g1.tab=1 and keep refreshing until the rate(gen_counter[16s]) graph covers the first half minute of the newly generated series. I suspect the problem is somewhere in promql/engine.go, but I couldn't figure out where exactly.

Member

brian-brazil commented Mar 9, 2018

That smells like an issue with BufferedSeriesIterator. Does the same happen with a 2s interval and a 32s range?

Contributor Author

free commented Mar 9, 2018

The same thing happens with a 2s resolution and a 31s or 32s range.

Member

brian-brazil commented Mar 9, 2018

16s is working for me, but I'm seeing the issue with 15s.

Member

brian-brazil commented Mar 9, 2018

Found the issue: it's in sampleRing.add. When the ring is doubled, the value of the local variable l is not increased, so the wraparound logic in the pruning uses the wrong limit.

Member

brian-brazil commented Mar 9, 2018

Thanks for the detailed report; #3942 should fix this.

Contributor Author

free commented Mar 9, 2018

Thank you for the quick fix.

brian-brazil added a commit that referenced this issue Mar 12, 2018

brian-brazil added a commit that referenced this issue Mar 14, 2018

sipian pushed a commit to sipian/prometheus that referenced this issue May 18, 2018

sipian pushed a commit to sipian/prometheus that referenced this issue May 18, 2018

lock bot commented Mar 22, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

lock bot locked and limited conversation to collaborators Mar 22, 2019
