System iterators returning data out of order #8695

Closed
e-dard opened this issue Aug 14, 2017 · 1 comment · Fixed by #8735

e-dard commented Aug 14, 2017

Currently, system iterators are not merged correctly across multiple shards. Specifically, they are merged in an arbitrary order.

While this issue appears to affect only the tsi1 index, the iterators are merged incorrectly in both the inmem and tsi1 indexes. However, since the inmem index is duplicated across shards, the result of merging system iterators is still correct (even though the logic to merge them is faulty!)

This caused me much frustration and amusement at the same time when debugging this issue.

Using the tsi1 index, here are some examples of incorrect behaviour (in each, the results come back unsorted):

SHOW SERIES
>
> create database db
> insert cpu,host=serverB value=1
> insert cpu,host=serverA value=1 1689894000000000000
>
>
> show series
key
---
cpu,host=serverB
cpu,host=serverA
>
>
> show shards
name: _internal
id database  retention_policy shard_group start_time           end_time             expiry_time          owners
-- --------  ---------------- ----------- ----------           --------             -----------          ------
1  _internal monitor          1           2017-08-11T00:00:00Z 2017-08-12T00:00:00Z 2017-08-19T00:00:00Z
10 _internal monitor          10          2017-08-14T00:00:00Z 2017-08-15T00:00:00Z 2017-08-22T00:00:00Z

SHOW TAG KEYS
> create database db
> insert cpu,region=west value=1
> insert cpu,host=serverA value=1 1689894000000000000
>
>
> show tag keys
name: cpu
tagKey
------
region
host
>
>
> show shards
name: _internal
id database  retention_policy shard_group start_time           end_time             expiry_time          owners
-- --------  ---------------- ----------- ----------           --------             -----------          ------
1  _internal monitor          1           2017-08-11T00:00:00Z 2017-08-12T00:00:00Z 2017-08-19T00:00:00Z
10 _internal monitor          10          2017-08-14T00:00:00Z 2017-08-15T00:00:00Z 2017-08-22T00:00:00Z

SHOW FIELD KEYS
> create database db
> insert cpu value=1
> insert cpu count=1 1689894000000000000
> show field keys
name: cpu
fieldKey fieldType
-------- ---------
value    float
count    float
>
>
> show shards
name: _internal
id database  retention_policy shard_group start_time           end_time             expiry_time          owners
-- --------  ---------------- ----------- ----------           --------             -----------          ------
1  _internal monitor          1           2017-08-11T00:00:00Z 2017-08-12T00:00:00Z 2017-08-19T00:00:00Z
10 _internal monitor          10          2017-08-14T00:00:00Z 2017-08-15T00:00:00Z 2017-08-22T00:00:00Z

SHOW TAG VALUES and SHOW MEASUREMENTS are unaffected because they use the index directly and are not converted into SELECT queries.

The issue boils down to the merging operations in influxql, which currently consider the following facets of a point when merging it with others: (1) its measurement name; (2) its tag keys/values; (3) its time.

Since regular points usually have all three of those, they're always sorted correctly across shards. The results of system iterators, however, do not have names, tags or time values, and so end up being merged arbitrarily between streams of iterators. Since the inmem index streams are always identical, though, the result is always correct.
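To make the failure mode concrete, here is a minimal Go sketch of the behaviour described above. It is not the influxql code: the `point` type, both comparison functions, and the map-based de-duplication in `merge` are assumptions made for illustration. It merges already-sorted per-shard streams using only (name, tags, time) as the sort key, then repeats the merge with a tie-break on the value, which is one possible remedy and not necessarily the fix that was merged.

```go
package main

import "fmt"

// point is a hypothetical, simplified stand-in for an iterator point.
// System iterator results leave name, tags and time at their zero values.
type point struct {
	name  string // measurement name
	tags  string // serialized tag keys/values
	time  int64  // timestamp
	value string // payload, e.g. a series key
}

// lessByKey orders points by (name, tags, time) only.
func lessByKey(a, b point) bool {
	if a.name != b.name {
		return a.name < b.name
	}
	if a.tags != b.tags {
		return a.tags < b.tags
	}
	return a.time < b.time
}

// lessByKeyThenValue breaks ties on the value (illustrative remedy only).
func lessByKeyThenValue(a, b point) bool {
	if lessByKey(a, b) {
		return true
	}
	if lessByKey(b, a) {
		return false
	}
	return a.value < b.value
}

// merge does a k-way merge of sorted streams and drops values already seen.
// The map-based de-duplication is an assumption of this sketch.
func merge(streams [][]point, less func(a, b point) bool) []string {
	pos := make([]int, len(streams))
	seen := make(map[string]bool)
	var out []string
	for {
		best := -1
		for i, s := range streams {
			if pos[i] == len(s) {
				continue // stream exhausted
			}
			if best == -1 || less(s[pos[i]], streams[best][pos[best]]) {
				best = i
			}
		}
		if best == -1 {
			return out
		}
		v := streams[best][pos[best]].value
		pos[best]++
		if !seen[v] {
			seen[v] = true
			out = append(out, v)
		}
	}
}

func main() {
	a := point{value: "cpu,host=serverA"}
	b := point{value: "cpu,host=serverB"}

	tsi1 := [][]point{{b}, {a}}        // each shard indexes only its own series
	inmem := [][]point{{a, b}, {a, b}} // index duplicated across shards

	fmt.Println(merge(tsi1, lessByKey))          // [cpu,host=serverB cpu,host=serverA]
	fmt.Println(merge(inmem, lessByKey))         // [cpu,host=serverA cpu,host=serverB]
	fmt.Println(merge(tsi1, lessByKeyThenValue)) // [cpu,host=serverA cpu,host=serverB]
}
```

With the key-only comparison every system point ties, so the merge simply drains the streams in the order they were supplied. Identical streams hide the problem because each one is already sorted and complete.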

e-dard added the area/influxql, backlog/storage and kind/bug labels Aug 14, 2017
e-dard added this to the 1.4.0 milestone Aug 14, 2017
e-dard self-assigned this Aug 14, 2017

e-dard commented Aug 14, 2017

I have a fix for this, but I won't be able to get the PR up until later in the week.
