LIMIT returns a non-deterministic series #3166

Closed
beckettsean opened this issue Jun 26, 2015 · 4 comments

@beckettsean
Contributor

I'm using Telegraf to write data. It writes a series for each CPU core, so I have 8 tag sets under the cpu_system measurement. Here's the most recent point from all 8 series:

> select value, cpu from cpu_system limit 1
name: cpu_system
tags: cpu=cpu0
time                value
----                -----
2015-06-26T19:21:54.661779022Z  61665.6171875


name: cpu_system
tags: cpu=cpu1
time                value
----                -----
2015-06-26T19:21:54.661779022Z  4253.4453125


name: cpu_system
tags: cpu=cpu2
time                value
----                -----
2015-06-26T19:21:54.661779022Z  34324.8046875


name: cpu_system
tags: cpu=cpu3
time                value
----                -----
2015-06-26T19:21:54.661779022Z  4455.140625


name: cpu_system
tags: cpu=cpu4
time                value
----                -----
2015-06-26T19:21:54.661779022Z  35394.4921875


name: cpu_system
tags: cpu=cpu5
time                value
----                -----
2015-06-26T19:21:54.661779022Z  4444.15625


name: cpu_system
tags: cpu=cpu6
time                value
----                -----
2015-06-26T19:21:54.661779022Z  34706.5703125


name: cpu_system
tags: cpu=cpu7
time                value
----                -----
2015-06-26T19:21:54.661779022Z  4447.0546875

If I then run the select statement without selecting the tag (cpu), I get the most recent point from a randomly selected series. Note that the timestamps are identical, the values all come from the set of 8 above, and they repeat randomly.

@pauldix and I agree this should be deterministic. According to him, LIMIT should pull from the series with the lowest index. Since I'm on a single-node system, the index should be consistent, yet the series chosen is not:

> select value from cpu_system limit 1
name: cpu_system
----------------
time                value
2015-06-26T19:21:54.661779022Z  61665.6171875

> select value from cpu_system limit 1
name: cpu_system
----------------
time                value
2015-06-26T19:21:54.661779022Z  34324.8046875

> select value from cpu_system limit 1
name: cpu_system
----------------
time                value
2015-06-26T19:21:54.661779022Z  34324.8046875

> select value from cpu_system limit 1
name: cpu_system
----------------
time                value
2015-06-26T19:21:54.661779022Z  35394.4921875

> select value from cpu_system limit 1
name: cpu_system
----------------
time                value
2015-06-26T19:21:54.661779022Z  4447.0546875

> select value from cpu_system limit 1
name: cpu_system
----------------
time                value
2015-06-26T19:21:54.661779022Z  34706.5703125

> select value from cpu_system limit 1
name: cpu_system
----------------
time                value
2015-06-26T19:21:54.661779022Z  61665.6171875

> select value from cpu_system limit 1
name: cpu_system
----------------
time                value
2015-06-26T19:21:54.661779022Z  4447.0546875

> select value from cpu_system limit 1
name: cpu_system
----------------
time                value
2015-06-26T19:21:54.661779022Z  61665.6171875

> select value from cpu_system limit 1
name: cpu_system
----------------
time                value
2015-06-26T19:21:54.661779022Z  4444.15625

> select value from cpu_system limit 1
name: cpu_system
----------------
time                value
2015-06-26T19:21:54.661779022Z  4253.4453125

> select value from cpu_system limit 1
name: cpu_system
----------------
time                value
2015-06-26T19:21:54.661779022Z  4447.0546875

> select value from cpu_system limit 1
name: cpu_system
----------------
time                value
2015-06-26T19:21:54.661779022Z  34706.5703125

> select value from cpu_system limit 1
name: cpu_system
----------------
time                value
2015-06-26T19:21:54.661779022Z  4447.0546875

> select value from cpu_system limit 1
name: cpu_system
----------------
time                value
2015-06-26T19:21:54.661779022Z  4444.15625

> select value from cpu_system limit 1
name: cpu_system
----------------
time                value
2015-06-26T19:21:54.661779022Z  4253.4453125
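
A quick way to confirm the behaviour is to run the same query repeatedly and tally the distinct results. The Python sketch below is illustrative only: `count_distinct_results` and `fake_query` are hypothetical names, and `fake_query` stands in for a real client call by returning one of the eight values above at random.

```python
import random
from collections import Counter

# The eight values returned by `select value, cpu from cpu_system limit 1` above.
SERIES_VALUES = {
    "cpu0": 61665.6171875,
    "cpu1": 4253.4453125,
    "cpu2": 34324.8046875,
    "cpu3": 4455.140625,
    "cpu4": 35394.4921875,
    "cpu5": 4444.15625,
    "cpu6": 34706.5703125,
    "cpu7": 4447.0546875,
}


def count_distinct_results(run_query, attempts):
    """Run the query `attempts` times and tally each distinct result."""
    return Counter(run_query() for _ in range(attempts))


def fake_query(rng):
    """Hypothetical stand-in for the database: a value from a random series."""
    return rng.choice(list(SERIES_VALUES.values()))


if __name__ == "__main__":
    rng = random.Random(0)
    tally = count_distinct_results(lambda: fake_query(rng), attempts=16)
    # A deterministic LIMIT would produce exactly one distinct result.
    print(f"{len(tally)} distinct results in 16 runs")
```

With a real client, `run_query` would issue `select value from cpu_system limit 1` and return the value column; more than one key in the tally means the query is not deterministic.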
beckettsean added this to the 0.9.3 milestone Jul 15, 2015
beckettsean modified the milestones: Next Point Release, 0.9.3 Aug 6, 2015
@jsternberg
Contributor

@beckettsean is it possible to check if this is still valid with the new query engine?

@beckettsean
Contributor Author

It is still not deterministic in 0.11.1:

> select * from cpu limit 1
name: cpu
---------
time                 cpu        host         usage_guest  usage_guest_nice  usage_idle         usage_iowait  usage_irq  usage_nice  usage_softirq        usage_steal  usage_system         usage_user
1456350580000000000  cpu1       sean-stable  0            0                 99.79939819458393  0             0          0           0                    0            0.10030090270812698  0.10030090270812253

> select * from cpu limit 1
name: cpu
---------
time                 cpu        host         usage_guest  usage_guest_nice  usage_idle         usage_iowait  usage_irq  usage_nice  usage_softirq        usage_steal  usage_system         usage_user
1456350580000000000  cpu0       sean-stable  0            0                 99.49899799599173  0             0          0           0                    0            0.30060120240480265  0.20040080160320178

@beckettsean
Contributor Author

An explicit ORDER BY does not introduce determinism either:

> select * from cpu order by time asc limit 1
name: cpu
---------
time                 cpu        host         usage_guest  usage_guest_nice  usage_idle         usage_iowait  usage_irq  usage_nice  usage_softirq        usage_steal  usage_system         usage_user
1456350580000000000  cpu0       sean-stable  0            0                 99.49899799599173  0             0          0           0                    0            0.30060120240480265  0.20040080160320178

> select * from cpu order by time asc limit 1
name: cpu
---------
time                 cpu        host         usage_guest  usage_guest_nice  usage_idle         usage_iowait  usage_irq  usage_nice  usage_softirq        usage_steal  usage_system         usage_user
1456350580000000000  cpu-total  sean-stable  0            0                 99.59919839679381  0             0          0           0.05010020040080163  0            0.2004008016032068   0.15030060120241065
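
Both rows above carry the same timestamp, so ordering by time alone cannot separate them. The Python sketch below illustrates the idea outside InfluxDB: a stable sort on time only keeps tied rows in whatever order they arrived, while adding the cpu tag as a tie-breaker gives the same first row every time. The row tuples and the `first_row`, `by_time`, and `by_time_then_cpu` helpers are assumptions made for illustration, not InfluxDB code.

```python
# Rows as (time, cpu, usage_idle), taken from the two query results above.
ROWS = [
    (1456350580000000000, "cpu0", 99.49899799599173),
    (1456350580000000000, "cpu-total", 99.59919839679381),
]


def by_time(row):
    """Sort key on time only: rows with equal timestamps stay in arrival order."""
    return row[0]


def by_time_then_cpu(row):
    """Sort key on (time, cpu): equal timestamps are broken by the cpu tag."""
    return (row[0], row[1])


def first_row(rows, key):
    """Return the first row after sorting with the given key (like LIMIT 1)."""
    return sorted(rows, key=key)[0]


arrival_a = ROWS
arrival_b = list(reversed(ROWS))

# Sorting on time only: the answer depends on arrival order.
print(first_row(arrival_a, by_time)[1], first_row(arrival_b, by_time)[1])
# Sorting on (time, cpu): the answer is the same for both arrival orders.
print(first_row(arrival_a, by_time_then_cpu)[1],
      first_row(arrival_b, by_time_then_cpu)[1])
```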

jsternberg self-assigned this Apr 7, 2016
@jsternberg
Contributor

Ok, I'll track this as something to potentially work on for 0.13.

jsternberg modified the milestones: 0.13.0, Future Point Release Apr 7, 2016
jsternberg added a commit that referenced this issue Apr 18, 2016
The series keys within a tag set were previously not sorted which would
cause the output to be non-deterministic. This sorts the output series
by their keys so it has a consistent output especially when using
limits.

Fixes #3166.
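
In rough terms, the change described in this commit message amounts to visiting the series in sorted key order before applying the limit. The Python sketch below is an illustration only, not the actual InfluxDB code (which is written in Go); `limit_points` and the series-key strings are hypothetical, and for simplicity it walks the series one after another instead of merging points by time.

```python
def limit_points(series, limit, sort_keys=True):
    """Return up to `limit` (series_key, point) pairs.

    `series` maps a series key to its list of (time, value) points.
    With sort_keys=True the series are visited in sorted key order, so the
    result does not depend on the order in which the mapping was built.
    """
    keys = sorted(series) if sort_keys else list(series)
    out = []
    for key in keys:
        for point in series[key]:
            if len(out) == limit:
                return out
            out.append((key, point))
    return out


T = "2015-06-26T19:21:54.661779022Z"
# Two mappings with the same content but a different insertion order.
built_one_way = {
    "cpu_system,cpu=cpu2": [(T, 34324.8046875)],
    "cpu_system,cpu=cpu0": [(T, 61665.6171875)],
}
built_another = dict(reversed(list(built_one_way.items())))

# Unsorted: the first point depends on how the mapping was built.
print(limit_points(built_one_way, 1, sort_keys=False))
print(limit_points(built_another, 1, sort_keys=False))
# Sorted: both mappings yield the cpu0 point.
print(limit_points(built_one_way, 1))
print(limit_points(built_another, 1))
```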
timhallinflux modified the milestones: 1.0.0, 0.13.0 Dec 20, 2016