querier: bugfix - data gaps when switching iterators #3010

Merged
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -16,6 +16,7 @@ We use *breaking :warning:* word for marking changes that are not backward compa
* [#3095](https://github.com/thanos-io/thanos/pull/3095) Rule: update manager when all rule files are removed.
* [#3098](https://github.com/thanos-io/thanos/pull/3098) ui: Fix Block Viewer for Compactor and Store
* [#3105](https://github.com/thanos-io/thanos/pull/3105) Query: Fix overwriting maxSourceResolution when auto downsampling is enabled.
* [#3010](https://github.com/thanos-io/thanos/pull/3010) Querier: Added the `--query.lookback-delta` flag to override the default lookback delta in PromQL. Set the flag to at least 2 times the slowest scrape interval, or leave it unset to use the Prometheus default of 5m.

## [v0.15.0-rc.0](https://github.com/thanos-io/thanos/releases/tag/v0.15.0-rc.0) - 2020.08.26

5 changes: 5 additions & 0 deletions cmd/thanos/query.go
@@ -70,6 +70,8 @@ func registerQuery(m map[string]setupFunc, app *kingpin.Application) {
maxConcurrentQueries := cmd.Flag("query.max-concurrent", "Maximum number of queries processed concurrently by query node.").
Default("20").Int()

lookbackDelta := cmd.Flag("query.lookback-delta", "The maximum lookback duration for retrieving metrics during expression evaluations. PromQL always evaluates a query at a specific timestamp (query range timestamps are deduced from the step). Since scrape intervals might differ, PromQL looks back for a given amount of time to get the latest sample. If the lookback exceeds the maximum lookback delta, it assumes the series is stale and returns none (a gap). This is why the lookback delta should be set to at least 2 times the slowest scrape interval. If unset, it will use the PromQL default of 5m.").Duration()

maxConcurrentSelects := cmd.Flag("query.max-concurrent-select", "Maximum number of select requests made concurrently per a query.").
Default("4").Int()

@@ -175,6 +177,7 @@ func registerQuery(m map[string]setupFunc, app *kingpin.Application) {
*maxConcurrentQueries,
*maxConcurrentSelects,
time.Duration(*queryTimeout),
*lookbackDelta,
time.Duration(*defaultEvaluationInterval),
time.Duration(*storeResponseTimeout),
*queryReplicaLabels,
@@ -222,6 +225,7 @@ func runQuery(
maxConcurrentQueries int,
maxConcurrentSelects int,
queryTimeout time.Duration,
lookbackDelta time.Duration,
defaultEvaluationInterval time.Duration,
storeResponseTimeout time.Duration,
queryReplicaLabels []string,
@@ -312,6 +316,7 @@ func runQuery(
NoStepSubqueryIntervalFn: func(rangeMillis int64) int64 {
return defaultEvaluationInterval.Milliseconds()
},
LookbackDelta: lookbackDelta,
},
)
)
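The flag above is declared without a `Default(...)`, so an unset flag reaches the engine options as a zero duration. A minimal sketch of the resulting behavior, assuming the engine treats zero as "use the 5m default" (as the flag help text states); `effectiveLookbackDelta` and `recommendedLookbackDelta` are hypothetical helpers for illustration and are not part of this PR:

```go
package main

import (
	"fmt"
	"time"
)

// defaultLookbackDelta mirrors the 5m PromQL default named in the flag help text.
const defaultLookbackDelta = 5 * time.Minute

// effectiveLookbackDelta is a hypothetical helper: a zero value (flag unset)
// falls back to the default, any other value is used as given.
func effectiveLookbackDelta(flagValue time.Duration) time.Duration {
	if flagValue == 0 {
		return defaultLookbackDelta
	}
	return flagValue
}

// recommendedLookbackDelta is a hypothetical helper that applies the rule of
// thumb from the help text: 2 times the slowest scrape interval.
func recommendedLookbackDelta(scrapeIntervals ...time.Duration) time.Duration {
	var slowest time.Duration
	for _, iv := range scrapeIntervals {
		if iv > slowest {
			slowest = iv
		}
	}
	return 2 * slowest
}

func main() {
	// Flag unset: the default applies.
	fmt.Println(effectiveLookbackDelta(0)) // 5m0s

	// Slowest scrape interval is 4m, so the rule of thumb gives 8m.
	rec := recommendedLookbackDelta(15*time.Second, 30*time.Second, 4*time.Minute)
	fmt.Println(effectiveLookbackDelta(rec)) // 8m0s
}
```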
13 changes: 13 additions & 0 deletions docs/components/query.md
@@ -367,6 +367,19 @@ Flags:
--query.timeout=2m Maximum time to process query by query node.
--query.max-concurrent=20 Maximum number of queries processed
concurrently by query node.
--query.lookback-delta=QUERY.LOOKBACK-DELTA
The maximum lookback duration for retrieving
metrics during expression evaluations. PromQL
always evaluates a query at a specific
timestamp (query range timestamps are deduced
from the step). Since scrape intervals might
differ, PromQL looks back for a given amount
of time to get the latest sample. If the
lookback exceeds the maximum lookback delta,
it assumes the series is stale and returns
none (a gap). This is why the lookback delta
should be set to at least 2 times the slowest
scrape interval. If unset, it will use the
PromQL default of 5m.
--query.max-concurrent-select=4
Maximum number of select requests made
concurrently per a query.
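The lookback behavior described for `--query.lookback-delta` can be illustrated with a small sketch. This is not the PromQL engine's code: `sample` and `latestWithin` are hypothetical names, timestamps are in milliseconds, and treating a sample exactly `lookback` old as still valid is an assumption of this sketch.

```go
package main

import "fmt"

// sample is a hypothetical (timestamp, value) pair; t is in milliseconds.
type sample struct {
	t int64
	v float64
}

// latestWithin returns the newest sample at or before evalT that is no older
// than lookbackMs. It reports false when no such sample exists, which is the
// "stale series, return a gap" case from the flag description.
// samples must be sorted by ascending timestamp.
func latestWithin(samples []sample, evalT, lookbackMs int64) (sample, bool) {
	for i := len(samples) - 1; i >= 0; i-- {
		s := samples[i]
		if s.t > evalT {
			continue // sample is in the future relative to the evaluation time
		}
		if evalT-s.t > lookbackMs {
			break // newest candidate is already too old: the series is stale
		}
		return s, true
	}
	return sample{}, false
}

func main() {
	// A series scraped every 4 minutes.
	series := []sample{{0, 1}, {240000, 2}, {480000, 3}}

	// Evaluate at minute 7 with a 2m lookback: the last sample is 3m old.
	_, ok := latestWithin(series, 420000, 120000)
	fmt.Println(ok) // false

	// Same evaluation with an 8m lookback (2 times the scrape interval).
	s, ok := latestWithin(series, 420000, 480000)
	fmt.Println(s.v, ok) // 2 true
}
```

With a lookback shorter than the scrape interval, some evaluation steps fall between samples and produce gaps, which is why the help text recommends at least 2 times the slowest scrape interval.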
7 changes: 4 additions & 3 deletions pkg/query/iter.go
@@ -302,7 +302,7 @@ type chunkSeriesIterator struct {
func newChunkSeriesIterator(cs []chunkenc.Iterator) chunkenc.Iterator {
if len(cs) == 0 {
// This should not happen. StoreAPI implementations should not send empty results.
- return errSeriesIterator{}
+ return errSeriesIterator{err: errors.Errorf("store returned an empty result")}
}
return &chunkSeriesIterator{chunks: cs}
}
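A minimal sketch of why attaching an error matters here, assuming the error iterator's `Err()` simply returns its `err` field: with a zero-valued iterator, a caller that drains the iterator and then checks `Err()` cannot tell an empty store result from a series that legitimately has no samples. The `iterator` interface below is a trimmed-down, hypothetical stand-in for `chunkenc.Iterator`, and the standard library `errors.New` replaces `errors.Errorf` to keep the example dependency-free.

```go
package main

import (
	"errors"
	"fmt"
)

// iterator is a hypothetical, reduced version of a sample iterator.
type iterator interface {
	Next() bool
	Err() error
}

// errIterator never yields samples and reports the stored error.
type errIterator struct{ err error }

func (errIterator) Next() bool    { return false }
func (it errIterator) Err() error { return it.err }

// newSeriesIterator mirrors the shape of the constructor in the diff:
// an empty input is reported as an explicit error.
func newSeriesIterator(chunks []iterator) iterator {
	if len(chunks) == 0 {
		return errIterator{err: errors.New("store returned an empty result")}
	}
	return chunks[0] // sketch only: real code would chain all chunks
}

// drain consumes the iterator and returns the sample count and final error.
func drain(it iterator) (int, error) {
	n := 0
	for it.Next() {
		n++
	}
	return n, it.Err()
}

func main() {
	n, err := drain(newSeriesIterator(nil))
	fmt.Println(n, err) // 0 store returned an empty result

	// A zero-valued error iterator looks like a clean, empty series.
	n, err = drain(errIterator{})
	fmt.Println(n, err) // 0 <nil>
}
```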
@@ -503,7 +503,8 @@ func (it noopAdjustableSeriesIterator) adjustAtValue(float64) {}
// Replica 1 counter scrapes: 20 30 40 Nan - 0 5
// Replica 2 counter scrapes: 25 35 45 Nan - 2
//
- // Now for downsampling purposes we are accounting the resets so our replicas before going to dedup iterator looks like this:
+ // Now for downsampling purposes we account for the resets (rewriting the sample values),
+ // so our replicas before going to the dedup iterator look like this:
//
// Replica 1 counter total: 20 30 40 - - 40 45
// Replica 2 counter total: 25 35 45 - - 47
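The reset accounting in the comment above can be reproduced with a short sketch. `counterTotals` is a hypothetical helper, not the Thanos implementation; it assumes NaN entries are staleness markers to skip, and it simply omits the missed scrapes shown as `-` in the comment.

```go
package main

import (
	"fmt"
	"math"
)

// Raw counter scrapes from the comment, without the missed scrapes ("-").
var (
	replica1 = []float64{20, 30, 40, math.NaN(), 0, 5}
	replica2 = []float64{25, 35, 45, math.NaN(), 2}
)

// counterTotals rewrites raw counter values into running totals that keep
// growing across counter resets. NaN values are skipped.
func counterTotals(scrapes []float64) []float64 {
	var (
		totals      []float64
		total, prev float64
		seen        bool
	)
	for _, v := range scrapes {
		if math.IsNaN(v) {
			continue
		}
		if seen && v >= prev {
			// Normal increase: add the delta since the previous scrape.
			total += v - prev
		} else {
			// First sample, or the value dropped (a reset): the counter
			// started again from zero, so the whole value is new.
			total += v
		}
		prev, seen = v, true
		totals = append(totals, total)
	}
	return totals
}

func main() {
	fmt.Println(counterTotals(replica1)) // [20 30 40 40 45]
	fmt.Println(counterTotals(replica2)) // [25 35 45 47]
}
```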
@@ -648,7 +649,7 @@ func (it *dedupSeriesIterator) Seek(t int64) bool {
// Don't use underlying Seek, but iterate over next to not miss gaps.
for {
ts, _ := it.At()
- if ts > 0 && ts >= t {
+ if ts >= t {
return true
}
if !it.Next() {
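A minimal sketch of a `Seek` built on `Next`, following the loop in the hunk above. The diff itself does not state why the `ts > 0` guard was dropped; this sketch only shows one observable difference, for samples at timestamp 0 or below. `nextOnlyIterator`, its `oldGuard` switch, and the `math.MinInt64` "no sample yet" sentinel are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"math"
)

type sample struct {
	t int64
	v float64
}

// nextOnlyIterator is a hypothetical iterator whose Seek walks forward
// with Next instead of jumping, so no sample is skipped silently.
type nextOnlyIterator struct {
	samples  []sample
	idx      int  // -1 before the first Next
	oldGuard bool // true emulates the removed "ts > 0 &&" condition
}

func newIter(samples []sample, oldGuard bool) *nextOnlyIterator {
	return &nextOnlyIterator{samples: samples, idx: -1, oldGuard: oldGuard}
}

// At returns the current sample, or a MinInt64 timestamp when there is none.
func (it *nextOnlyIterator) At() (int64, float64) {
	if it.idx < 0 || it.idx >= len(it.samples) {
		return math.MinInt64, 0
	}
	return it.samples[it.idx].t, it.samples[it.idx].v
}

func (it *nextOnlyIterator) Next() bool {
	if it.idx+1 >= len(it.samples) {
		it.idx = len(it.samples)
		return false
	}
	it.idx++
	return true
}

// Seek advances until the current timestamp is at or after t.
func (it *nextOnlyIterator) Seek(t int64) bool {
	for {
		ts, _ := it.At()
		if (!it.oldGuard || ts > 0) && ts >= t {
			return true
		}
		if !it.Next() {
			return false
		}
	}
}

func main() {
	data := []sample{{-10, 1}, {0, 2}, {10, 3}}

	fixed := newIter(data, false)
	fixed.Seek(-5)
	ts, _ := fixed.At()
	fmt.Println(ts) // 0

	old := newIter(data, true)
	old.Seek(-5)
	ts, _ = old.At()
	fmt.Println(ts) // 10
}
```

In this sketch the old condition skips the sample at timestamp 0 even though it satisfies `ts >= t`, while the new condition stops on it.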