Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ChunksIterator method to Series interface. #5882

Open
wants to merge 15 commits into
base: master
from

Conversation

@bwplotka
Copy link
Member

commented Aug 13, 2019

Continuation from: prometheus/tsdb#665

Changes:

  • Adjusts Iterator Interface of Series e.g Seek
  • Reuse single interface instead of 3 same ones spread across 3 packages.
  • Refactored tests
  • Added ChunkIterator to iterate over chunks if requested by the client.

Fixes: #5871

tomwilkie and others added 11 commits Jul 11, 2019
Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
… method.

Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
…o far.

TODO: tests & splitting into 2.

Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
TODO: Add tests.

Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
* Simplified; merged seek and non seek cases together. Added explicit min/max only for chunk series iterator, where it is relevant.
* Adjusted all seek implementation to match edge case requirement (double seek, failed seek + next).

Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
…329

Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
@bwplotka bwplotka force-pushed the add-chunks-to-queriers-iterator-tsdb branch from 18e0587 to f460245 Aug 13, 2019
bwplotka added 2 commits Aug 13, 2019
…o-queriers-iterator-tsdb

Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
@bwplotka bwplotka force-pushed the add-chunks-to-queriers-iterator-tsdb branch from f460245 to 1999ce8 Aug 13, 2019
@bwplotka bwplotka marked this pull request as ready for review Aug 13, 2019
// Err returns the current error.
Err() error
func (s *chunkSeries) ChunkIterator() ChunkIterator {
// TODO(bwplotka): This is wrong. We probably have to be strict in terms of min/maxt and tombstones.

This comment has been minimized.

Copy link
@bwplotka

bwplotka Aug 13, 2019

Author Member

TODO/TO decide:

When we want to apply tombstones in ChunkIterator? :/

TBH it might be the most efficient to pass intervals upstream e.g in streamed remote read protocol as well (???)

cc @brian-brazil

This comment has been minimized.

Copy link
@bwplotka

bwplotka Aug 13, 2019

Author Member

Two options:

A) Naive: Rewrite chunk on the fly applying mint,maxt and dranges. It is must have for Select interface which assumes things are scoped to given time range and tombstones are tolerated.
B) Ignore mint,maxt level. Maybe allow ChunkIterator to return deleted intervals as well. This probably means extending the remote read chunked protocol here for passing tombstones intervals as well -> let client side to apply intervals.. However, we leak "deleted" metrics. If they were deleted because of GDPR or something, this is kind of wrong (:

I think overall we need to do A here. Thoughts @brian-brazil @tomwilkie @codesome ?

This comment has been minimized.

Copy link
@brian-brazil

brian-brazil Aug 14, 2019

Member

Definitely A, for general data security we don't want to be returning deleted data. It'd also push that complexity onto all readers, and they'd probably forget it.

This comment has been minimized.

Copy link
@bwplotka

bwplotka Aug 15, 2019

Author Member

It'd also push that complexity onto all readers, and they'd probably forget it.

You would NOT put it to readers you mean?

Makes sense to me.. Let's do it then here.

bwplotka added 2 commits Aug 13, 2019
Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
lint.
Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
@bwplotka

This comment has been minimized.

Copy link
Member Author

commented Aug 13, 2019

Addressed all comments from the previous PR: prometheus/tsdb#665 PTAL @brian-brazil @krasi-georgiev

@@ -1617,6 +1617,8 @@ func newMemSeries(lset labels.Labels, id uint64, chunkRange int64) *memSeries {
ref: id,
chunkRange: chunkRange,
nextAt: math.MinInt64,
// This is to make sure iterator.At without advancing returns default values.

This comment has been minimized.

Copy link
@brian-brazil

brian-brazil Aug 14, 2019

Member

It's not valid to call At without advancing.

This comment has been minimized.

Copy link
@krasi-georgiev

krasi-georgiev Aug 15, 2019

Member

Maybe calling At before Next or Seek should cause iterator.Error
In memSafeIterator add a var initialized and when false memSafeIterator.At returns default values and causes memSafeIterator.Err.

Either way the logic should be in the iterator itself.

Copy link
Member

left a comment

so many iterators now, might run out of names soon 😄

@@ -1,5 +1,8 @@
## master / unreleased

- [CHANGE] `chunks.MergeOverlappingChunks` moved to `tsdb.MergeOverlappingChunks`
- [CHANGE] `Series` interface allows return chunk iterator that allows iterating over encoded chunks.

This comment has been minimized.

Copy link
@krasi-georgiev

krasi-georgiev Aug 15, 2019

Member

do you think it is worth it to still keep track of these?
Is it helpful to you or the cortex team?
cc @tomwilkie

I mean do we still need a CHANGELOG file at all? How would it be useful?

@@ -1175,6 +1358,19 @@ func (it *deletedIterator) At() (int64, float64) {
return it.it.At()
}

func (it *deletedIterator) Seek(t int64) bool {
if atT, _ := it.At(); t >= atT {
return false

This comment has been minimized.

Copy link
@krasi-georgiev

krasi-georgiev Aug 15, 2019

Member

Seek should return true when t==atT

If possible you can add the TestDeletedIterator to TestSeriesIterators and delete the TestDeletedIterator test or add a seek test to TestDeletedIterator

// ChunkIterator iterates over the chunk of a time series.
type ChunkIterator interface {
// Seek advances the iterator forward to the given timestamp.
// It advances to the chunk with min time at t or first chunk with min time after t.

This comment has been minimized.

Copy link
@krasi-georgiev

krasi-georgiev Aug 15, 2019

Member

Add a behavior description similar to the chunk.Iterator?
I mean what hapens in the different scenarios when requested time is the same or smaller than the current At.

var r []tsdbutil.Sample
if tc.seek != 0 {
testutil.Equals(t, tc.seekSuccess, it.Seek(tc.seek))
testutil.Equals(t, tc.seekSuccess, it.Seek(tc.seek)) // Next one should be noop.

This comment has been minimized.

Copy link
@krasi-georgiev

krasi-georgiev Aug 15, 2019

Member
Suggested change
testutil.Equals(t, tc.seekSuccess, it.Seek(tc.seek)) // Next one should be noop.
testutil.Equals(t, tc.seekSuccess, it.Seek(tc.seek)) // Next one should be noop.
testutil.Equals(t, false, it.Seek(tc.seek-1)) // Seek in the past should be noop and return false.
var r []chunks.Meta
if tc.seek != 0 {
testutil.Equals(t, tc.seekSuccess, it.Seek(tc.seek))
testutil.Equals(t, tc.seekSuccess, it.Seek(tc.seek)) // Next one should be noop.

This comment has been minimized.

Copy link
@krasi-georgiev

krasi-georgiev Aug 15, 2019

Member
Suggested change
testutil.Equals(t, tc.seekSuccess, it.Seek(tc.seek)) // Next one should be noop.
testutil.Equals(t, tc.seekSuccess, it.Seek(tc.seek)) // Next one should be noop.
testutil.Equals(t, false, it.Seek(tc.seek-1)) // Seek in the past should be noop and return false.
}
}

func (it *verticalMergeSeriesIterator) Seek(t int64) bool {
if it.initialized && it.curT >= t {
return true

This comment has been minimized.

Copy link
@krasi-georgiev

krasi-georgiev Aug 15, 2019

Member

hm this doesn't look right. This would return true when it.curT is 150 and the requested time is 100

I thought we want the following behaviour:
Iterator with minT:100, maxT:200, curT:150

Seek(150) - returns true and the Seek is noop, doesn't advance.
Seek(100) - returns false and is noop.
Seek(160) - returns true and advances the iterator to 160 or the next available timestamp.
Seek(210) - returns false and exhausts the iterator.
When it.Err is not nil return false in all cases.

Maybe putting an example like this in the iterator description would make things a bir more clear for what is expected in the different scenarios.

if t > it.maxt {
// Exhaust iterator.
it.i = len(it.chunks)

This comment has been minimized.

Copy link
@krasi-georgiev

krasi-georgiev Aug 15, 2019

Member

maybe it is just my style, but I would think it will be easier to follow with an explicit var it.exhausted = true. Can leave as is if you prefer.

return false
}
return true
return t <= it.maxt
}
if err := it.cur.Err(); err != nil {

This comment has been minimized.

Copy link
@krasi-georgiev

krasi-georgiev Aug 15, 2019

Member
Suggested change
if err := it.cur.Err(); err != nil {
if it.cur.Err() != nil {
@@ -940,7 +957,7 @@ func (it *chainedSeriesIterator) Next() bool {
if it.cur.Next() {
return true
}
if err := it.cur.Err(); err != nil {
if it.cur.Err() != nil {

This comment has been minimized.

Copy link
@krasi-georgiev

krasi-georgiev Aug 15, 2019

Member

I completely forgot we can do that 🥇 nice reminder, thanks!

@@ -1053,8 +1304,9 @@ func TestChunkSeriesIterator_DoubleSeek(t *testing.T) {
}

res := newChunkSeriesIterator(chkMetas, nil, 2, 8)
testutil.Assert(t, res.Seek(1) == true, "")
testutil.Assert(t, res.Seek(2) == true, "")
testutil.Assert(t, res.Seek(1), "")

This comment has been minimized.

Copy link
@krasi-georgiev

krasi-georgiev Aug 15, 2019

Member

don't think we even need this test anymore as we can cover it in TestSeriesIterators

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.