Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tsdb: SeriesIterator.Seek() is vaguely defined #5871

Closed
free opened this issue May 9, 2018 · 9 comments · May be fixed by prometheus-junkyard/tsdb#665
Closed

tsdb: SeriesIterator.Seek() is vaguely defined #5871

free opened this issue May 9, 2018 · 9 comments · May be fixed by prometheus-junkyard/tsdb#665

Comments

@free
Copy link
Contributor

free commented May 9, 2018

Context: I was looking into cleaning up the iterator code, as I'm trying to add a separate SeekBefore() method, that would return the value at time t as defined by Prometheus without always parsing and iterating over a whole chunk covering the previous 5 minutes (when the looked up value is toward the beginning of a chunk).

However, I ran into some issues regarding how chunked series and iterators should actually work. There are a couple of gray areas that are only partially covered by comments or test code, making it impossible to figure out what is the expected correct behavior:

  1. blockQuerier and the resulting chunkedSeries and chunkSeriesIterator all have mint and maxt fields, which (along with (*chunkSeriesIterator).Seek() implementation details) suggest that iterators should only return samples with timestamps mint <= t <= maxt. However, there are bugs (such as the one I'm attempting to fix with Seek() shouldn't return true when past maxt. Also add some tests to s… prometheus-junkyard/tsdb#327) and even confusing/confused unit tests -- which appear to create an iterator with mint = 5, maxt = 8 and expect (*chunkSeriesIterator).Seek() to return a value with timestamp 3.

    Aremint and maxt supposed to be "hard" limits on the respective iterators or only on which chunks are selected?

  2. I gather (from the implementation again) that SeriesIterator.Seek() is only supposed to seek forward (which doesn't work as expected, as it will also seek backward within the same chunk). Is this assumption correct?
  3. Is Seek() supposed to always advance the iterator or should a second Seek() call with the same timestamp as the immediately preceding one leave the iterator positioned where it was before?
  4. Is it actually a good idea for a Seek() with a timestamp t < mint to ever succeed? If so a poorly defined series e.g. [now - 10m, now] will "successfully" produce a value of the series at now - 1y equal to the first value after now - 10m.

I'm glad to do the work (including properly documenting the code) regardless of what the answers are, I'd just like to get some "authoritative" input. Thanks.

@free
Copy link
Contributor Author

free commented May 9, 2018

OK, I figured out the confusing unit test: the chunkSeriesIterator tests actually filter out values outside of the [mint, maxt] interval, which means that mint and maxt are hard limits (and that instances of Seek() or Next() returning values outside those boundaries are bugs).

But questions (2), (3) and (4) still stand. Thanks.

@gouthamve
Copy link
Member

Hi, thanks for taking a look! You've made some really valid points!

  1. Yes, mint and maxt are hard limits and if return values out of that range, it is a bug. I think we are lucky so far because prometheus also does some filtering, but we should definitely fix them in tsdb.

  2. Seek should only seek forward and we should not allow it to seek backwards in the same chunk.

  3. Seek need not forward the iterator. If there are two Seek(t0), then the second one is essentially a noop.

  4. Yes, I think we should. By definition, Seek(t0) will give you the first point equal to or after t0. For the case you highlighted, it might seem a little odd, but you are assuming the delta between the points to be small. If the points are indeed spaced out one every 5 years, for example, if I seek something one year behind, I should see the first value.

Also, we should re-use the iterator if it's the same chunk, we are decompressing the chunk from beginning every single Seek, which should NOT happen :) https://github.com/prometheus/tsdb/blob/master/querier.go#L800

Finally, if you want to add SeekBefore to TSDB, I don't we would want that. I think the premise was that we have a streaming interface with only reading forward. And SeekBefore semantics would mean we might have to go back in time and decompress chunks from the past. I am actually not sure, what you are trying to achieve, but do you mind opening a new issue with the proposal?

Thanks for volunteering to do the work, I look forward to reviewing the PRs and let me know if you need help.

@free
Copy link
Contributor Author

free commented May 9, 2018

PR coming up soon (still have some documentation to go through).

Regarding question (4) though, my example might not have been the best. What I am worried about is that one might request a chunkedSeries that covers the past 10 minutes and then Seek() for the value 12 minutes ago: due to the mint limitation the result will be from the past 10 minutes, even though other samples might exist between 12 and 10 minutes ago and given a different mint the result would be different.

More importantly I believe that requesting the last 10 minutes and then seeking back beyond that is very likely to be a bug and it is more likely go unnoticed if a value is returned than if it consistently fails.

@free
Copy link
Contributor Author

free commented May 9, 2018

Oh, and about SeekBefore(), the idea is to find the last value at or before time t, not to seek backwards.

Currently the way the instant value of a series at time t is retrieved is to do a Seek(t - 5min) and then iterate forward, caching all values on the Prometheus side and use that cache to find the last value. This is inefficient (particularly with high resolution data) because t - 5min often falls within a previous chunk which must all be parsed and iterated over only to find the actual value in the next chunk. A SeekBefore() would first attempt to find the value at or before t in the chunk that contains t (where it will be found in 119/120 cases) and only if that fails go back a chunk. But it would all happen within the iterator, so from the client's point of view the iterator itself never goes backwards.

It is simply an optimization for retrieving the instant value of a series without always iterating over the previous 5 minutes. The chunkedSeries still gets created with mint = t - 5min, maxt = t, but instead of blindly iterating over all chunks covering that range, it will only iterate over the last chunk 99%+ of the time. I remember seeing a TODO from @fabxc somewhere in the code regarding this particular optimization.

free referenced this issue in free/tsdb May 9, 2018
Signed-off-by: Alin Sinpalean <alin.sinpalean@gmail.com>
@free
Copy link
Contributor Author

free commented May 9, 2018

Anyway, created PR prometheus-junkyard/tsdb#329 which attempts to fix (1), (2) and (3).

I also see (4) as an issue, as well as what At() should do when called either before Next()/Seek() or after they have returned false. But I leave that decision up to you guys.

@bwplotka
Copy link
Member

bwplotka commented Aug 9, 2019

I added more info to Seek interface, refactored tests to check the mentioned behaviors and adjusted all implementations in prometheus-junkyard/tsdb#665

I belive this PR fixes this issue.

@bwplotka bwplotka changed the title SeriesIterator.Seek() is vaguely defined tsdb: SeriesIterator.Seek() is vaguely defined Aug 13, 2019
@bwplotka bwplotka transferred this issue from prometheus-junkyard/tsdb Aug 13, 2019
@brian-brazil
Copy link
Contributor

@bwplotka @free Was this resolved in the end?

@bboreham
Copy link
Member

bboreham commented Jul 6, 2021

Is there anything left to do here?

@beorn7
Copy link
Member

beorn7 commented Feb 27, 2024

Hello from the bug scrub.

Is there anything left to do here?

We take three years of silence as "no". Closing.

@beorn7 beorn7 closed this as completed Feb 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment