Fix duplicate read when point is in WAL and disk #3482

Closed
wants to merge 1 commit into from


benbjohnson
Contributor

Overview

This commit fixes an issue where two points in a series with the same timestamp can be read twice if they are in the WAL and disk. The write to the WAL should overwrite the write to the disk and the client should only see the WAL version.

Fixes #3315.

/cc @ccutrer @pauldix @beckettsean

}
}
return c.readBuf()
}
}
Contributor Author

This is the whole change. It looks larger than it really is because the code is mostly expanded out for clarity. Previously we only checked whether we had a cursor item and either no cache or a buffer timestamp lower than the cache's. Now it also checks whether the timestamps are equal and, if so, drains the cache. Draining the cache removes any duplicates that could have been written to the WAL.
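The merge described above can be illustrated with a minimal sketch. This is not the code from the PR: `point`, `mergeCursor`, `next`, `readCache`, and `readAll` are hypothetical names, only `readBuf` echoes the diff, and the sketch assumes that both inputs are sorted by timestamp and that the most recently written WAL entry for a timestamp is the one to return.

```go
package main

// point is a hypothetical (timestamp, value) pair used only for illustration.
type point struct {
	ts    int64
	value float64
}

// mergeCursor is a hypothetical stand-in for the engine cursor.
// Both slices are assumed to be sorted by timestamp.
type mergeCursor struct {
	cache []point // unflushed WAL entries; a timestamp may repeat
	buf   []point // points already flushed to disk
}

// next returns the next point in timestamp order. When the WAL and disk
// hold the same timestamp, the disk point is skipped and the WAL wins.
func (c *mergeCursor) next() (point, bool) {
	switch {
	case len(c.cache) == 0 && len(c.buf) == 0:
		return point{}, false
	case len(c.cache) == 0:
		return c.readBuf(), true
	case len(c.buf) == 0 || c.cache[0].ts <= c.buf[0].ts:
		if len(c.buf) > 0 && c.buf[0].ts == c.cache[0].ts {
			c.buf = c.buf[1:] // discard the overwritten disk point
		}
		return c.readCache(), true
	default:
		return c.readBuf(), true
	}
}

// readCache drains every cache entry that shares the head timestamp and
// returns the last one (assumption: the last write wins).
func (c *mergeCursor) readCache() point {
	p := c.cache[0]
	c.cache = c.cache[1:]
	for len(c.cache) > 0 && c.cache[0].ts == p.ts {
		p = c.cache[0]
		c.cache = c.cache[1:]
	}
	return p
}

// readBuf pops the head of the on-disk points.
func (c *mergeCursor) readBuf() point {
	p := c.buf[0]
	c.buf = c.buf[1:]
	return p
}

// readAll collects every point the cursor yields.
func readAll(c *mergeCursor) []point {
	var out []point
	for p, ok := c.next(); ok; p, ok = c.next() {
		out = append(out, p)
	}
	return out
}
```

With a cache holding two writes at timestamp 1 and a disk point at the same timestamp, `readAll` yields a single point for timestamp 1 carrying the later WAL value, which is the behavior the fix is aiming for.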

The rest of the changes add tests for the v1 engine. All existing tests for the engine were left in the tsdb package for now.

Contributor Author

Sorry, I fixed another issue in 097b025 so that the cache is read instead of the buffer.


// Write point again.
if err := e.WritePoints([]tsdb.Point{
tsdb.NewPoint("cpu", tsdb.Tags{}, tsdb.Fields{"value": 100}, time.Unix(0, 1)),
Member

The second point should have a different value, which the test verification can then read to ensure that the later point is the one actually used when a duplicate is found.

Member

pauldix commented Jul 28, 2015

I just checked this out and tested it and still see the same behavior. Repro steps from the CLI:

create database foo;
use foo;
insert timetest2 value=27.2 1438039663000000000
# restart influxdb so that a WAL flush is triggered
insert timetest2 value=28.3 1438039663000000000
select * from timetest2
# points with duplicate timestamps


ccutrer commented Jul 30, 2015

I can also confirm that this doesn't fix it for me.

Development

Successfully merging this pull request may close these issues.

[0.9.1-2] sometimes values duplicated in query