Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support histograms in pkg/querier #4376

Merged
merged 36 commits into from
Mar 14, 2023
Merged

Support histograms in pkg/querier #4376

merged 36 commits into from
Mar 14, 2023

Conversation

codesome
Copy link
Member

@codesome codesome commented Mar 6, 2023

What this PR does

Syncs the changes in pkg/querier (except remote_read, see #4425) from sparsehistogram branch.

In general terms: adds support to further iterators (beyond Batch iterator) for the querier to be able to server PromQL.

This does not turn native histograms on in the read path. We'll do that together with all the e2e test updates.

Related to: #3478
Fixes: #4405

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@codesome codesome marked this pull request as ready for review March 6, 2023 10:07
@codesome codesome requested a review from a team as a code owner March 6, 2023 10:07
@codesome codesome requested a review from krajorama March 6, 2023 10:07
pkg/querier/iterators/chunk_merge_iterator.go Show resolved Hide resolved
pkg/querier/querier_test.go Outdated Show resolved Hide resolved
pkg/querier/querier_test.go Outdated Show resolved Hide resolved
pkg/querier/distributor_queryable_test.go Outdated Show resolved Hide resolved
pkg/querier/distributor_queryable_test.go Outdated Show resolved Hide resolved
pkg/querier/querier_test.go Outdated Show resolved Hide resolved
pkg/querier/remote_read.go Outdated Show resolved Hide resolved
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
Allow time series where there are both float and histogram samples.
Add test to verify it.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Copy link
Member Author

@codesome codesome left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lots of small comments

pkg/querier/iterators/chunk_iterator.go Outdated Show resolved Hide resolved
if iT == jT {
iTyp := (*h)[i].AtType()
jTyp := (*h)[j].AtType()
return iTyp > jTyp
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth a comment here. I guess we are assuming that float type has a lower numerical value than histogram type, so we are making use of that to prefer histograms over float.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes. It's about as arbitrary as the current way of working which can ignore some float values. We're assuming that we pretty much never end up in such situation as the storage will return consistent series.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To illustrate: create two float (XOR) chunks with the same timestamps, but different values. The merge iterator simply ignores one of them.

}
assert.Equal(t, chunkenc.ValNone, iter.Next())
})
}
}

func TestChunkMergeIteratorSeek(t *testing.T) {
func TestChunkMergeIteratorMixed(t *testing.T) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also be checking if the result type are right for given time ranges. For example if at timestamp 80 if we were expecting Histogram, and if we get a float, then this test does not catch it.

With how the test case is, it can be as simple as 3 sequential for loops, each expecting a fixed type, and we don't need switch case anymore.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made something a bit more sophisticated while reusing code

@@ -224,10 +270,90 @@ func TestQuerier_QueryableReturnsChunksOutsideQueriedRange(t *testing.T) {
}, m[0].Points)
}

func mockTSDB(t *testing.T, mint model.Time, samples int, step, chunkOffset time.Duration, samplesPerChunk int) (storage.Queryable, model.Time) {
func TestBatchMergeChunks(t *testing.T) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test does not have histograms. Was it to test something else or some issue with how GitHub is showing the diff?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The batch merger has its own tests, so I think this is ok, as it needs to just check that the querier can use the batch merger-er.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a comment to explain why this test is there in the first place

pkg/querier/remote_read.go Outdated Show resolved Hide resolved
pkg/querier/timeseries_series_set.go Outdated Show resolved Hide resolved
pkg/querier/timeseries_series_set.go Outdated Show resolved Hide resolved
},
Histograms: []mimirpb.Histogram{
mimirpb.FromHistogramToHistogramProto(1232, generateTestHistogram(7)),
mimirpb.FromFloatHistogramToHistogramProto(1233, generateTestFloatHistogram(8)),
},
},
}

it := ts.Iterator(nil)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel we could use a slice of expected stuff and use a loop to do the matching to avoid a lot of duplication below. In case of seek, we could make it check against a partial slice. But if you feel it is easier this way, then I have no problem :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't want to reinvent this testcase right now, plus I've added a case where we test two things after Next call, so I'd rather defer this question.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rewrote using test util functions to be more concise

krajorama and others added 16 commits March 8, 2023 08:04
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
The iterator was not doing do same type check as the others.
The merge iterator can be used directly in the querier for feeding
PromQL so it needs to be able to auto convert histogram to float
histogram.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Replace panic with assertion.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Prepare to merge the float and histogram tests and add more cases

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Let the mock TSDB generate samples based on supplied type function

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
…togram

with test

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Explain why this test is there.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Write a distributed querier test that involves both float and histogram
samples at the same time.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Went to #4425

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
@pstibrany
Copy link
Member

The CHANGELOG has just been cut to prepare for the next Mimir release. Please rebase main and eventually move the CHANGELOG entry added / updated in this PR to the top of the CHANGELOG document. Thanks!

Use the iterator test functions.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
If we cache an integer histogram but then read a float histogram,
the returned float histogram is invalid. Fix by using the correct
cache.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
pkg/querier/block_test.go Outdated Show resolved Hide resolved
# Conflicts:
#	pkg/querier/iterators/chunk_iterator.go
#	pkg/querier/iterators/chunk_iterator_test.go
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Copy link
Contributor

@dimitarvdimitrov dimitarvdimitrov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, i left mostly nitpicks and one or two minor bugs

pkg/querier/iterators/chunk_iterator.go Outdated Show resolved Hide resolved
pkg/querier/iterators/chunk_merge_iterator_test.go Outdated Show resolved Hide resolved
pkg/querier/iterators/chunk_merge_iterator_test.go Outdated Show resolved Hide resolved
Comment on lines +143 to +151
if i < 50 {
test.RequireIteratorIthFloat(t, i, iter, valueType)
return
}
if i < 125 {
test.RequireIteratorIthHistogram(t, i, iter, valueType)
return
}
test.RequireIteratorIthFloatHistogram(t, i, iter, valueType)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do i have this right - because the chunks overlap and the timestamps of their samples are exactly the same, then the samples from the chunk with "more data" takes precedence. Can you maybe leave a comment for this? It looks like TestChunkMergeIteratorMixed is written in the same way.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I'll add a comment

pkg/querier/block_test.go Outdated Show resolved Hide resolved
pkg/querier/querier_test.go Outdated Show resolved Hide resolved
pkg/querier/querier_test.go Outdated Show resolved Hide resolved
pkg/querier/querier_test.go Outdated Show resolved Hide resolved
pkg/querier/timeseries_series_set.go Show resolved Hide resolved
pkg/querier/timeseries_series_set_test.go Outdated Show resolved Hide resolved
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Copy link
Contributor

@dimitarvdimitrov dimitarvdimitrov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you for addressing my comments!

Copy link
Member Author

@codesome codesome left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I glanced over the non-test files. Just have 2 non-critical comments.

Comment on lines +82 to +84
// use cache as histogram
t, h := i.AtHistogram()
return t, h.ToFloat()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we not using the cache for this case? It will be common to call AtFloatHistogram() on integer histograms (I don't know how often the cache is hit here).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can cache the float histogram value here as well as the integer histogram

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In which case it is safer to set nil to the cached histograms when we set cacheValid as false.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created issue: #4489

if h.IsFloatHistogram() {
return h.Timestamp, mimirpb.FromHistogramProtoToFloatHistogram(&h)
}
return h.Timestamp, mimirpb.FromHistogramProtoToHistogram(&h).ToFloat()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a TODO: We can directly do the conversion to float instead of getting integer histogram first and then converting to float. It will save some computation and allocation. Also for other places where we do it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created issue: #4488

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants