-
Notifications
You must be signed in to change notification settings - Fork 8.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ValidateHistogram: strict Count check in absence of NaNs #12977
ValidateHistogram: strict Count check in absence of NaNs #12977
Conversation
package histogram | ||
|
||
// GenerateBigTestHistograms generates a slice of histograms with given number of buckets each. | ||
func GenerateBigTestHistograms(numHistograms, numBuckets int) []*Histogram { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved from tsdb/head_test.go
model/histogram/validate.go
Outdated
ErrHistogramCountNotBigEnough = errors.New("histogram's observation count should be at least the number of observations found in the buckets") | ||
ErrHistogramCountMismatch = errors.New("histogram's observation count should equal the number of observations found in the buckets (in absence of NaN)") | ||
ErrHistogramNegativeBucketCount = errors.New("histogram has a bucket whose observation count is negative") | ||
ErrHistogramSpanNegativeOffset = errors.New("histogram has a span whose offset is negative") | ||
ErrHistogramSpansBucketsMismatch = errors.New("histogram spans specify different number of buckets than provided") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved from storage/interface.go to avoid dependency cycle.
model/histogram/validate.go
Outdated
ErrHistogramSpansBucketsMismatch = errors.New("histogram spans specify different number of buckets than provided") | ||
) | ||
|
||
func ValidateHistogram(h *Histogram) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved from tsdb/head_append.go
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this could be a method of Histogram
now, just called Validate
?
Would also be good to give it a doc comment to explain what is validated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, makes sense. Changed from func ValidateHistogram(h *Histogram) error
to func (h *Histogram) Validate() error
and added doc comments.
model/histogram/validate.go
Outdated
return nil | ||
} | ||
|
||
func ValidateFloatHistogram(h *FloatHistogram) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved from tsdb/head_append.go
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Corresponding comment here:
Maybe this could be a method of FloatHistogram
now, just called Validate
?
Would also be good to give it a doc comment to explain what is validated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here - changed from func ValidateFloatHistogram(h *FloatHistogram) error
to func (h *FloatHistogram) Validate() error
and added doc comments.
model/histogram/validate.go
Outdated
return nil | ||
} | ||
|
||
func checkHistogramSpans(spans []Span, numBuckets int) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved from tsdb/head_append.go
return nil | ||
} | ||
|
||
func checkHistogramBuckets[BC histogram.BucketCount, IBC histogram.InternalBucketCount](buckets []IBC, count *BC, deltas bool) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved to model/histogram/validate.go
@@ -4827,170 +4823,6 @@ func TestReplayAfterMmapReplayError(t *testing.T) { | |||
require.NoError(t, h.Close()) | |||
} | |||
|
|||
func TestHistogramValidation(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved to model/histogram/validate_test.go
} | ||
} | ||
|
||
func BenchmarkHistogramValidation(b *testing.B) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved to model/histogram/validate_test.go
} | ||
} | ||
|
||
func generateBigTestHistograms(numHistograms, numBuckets int) []*histogram.Histogram { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved to model/histogram/test_utils.go
{30, 696}, | ||
{30, 700}, | ||
{30, 708}, | ||
{30, 693}, | ||
{40, 896}, | ||
{40, 899}, | ||
{40, 896}, | ||
{30, 690}, | ||
{30, 691}, | ||
{30, 692}, | ||
{30, 695}, | ||
{30, 694}, | ||
{30, 693}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like these changes in the encoded chunks were triggered by fixing how the observationCount
is calculated when generating those synthetic histograms:
2acb51e#diff-7c1dfea5c568897587621510c04d322f908a8ed40751e592d63d2f8422b6a78dL4961-R4969
5d23a1d
to
de67dde
Compare
On my list. Sorry for not noticing this is the last 2 weeks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Just an idea about making the validation function methods. Do you think that would make sense?
model/histogram/validate.go
Outdated
ErrHistogramSpansBucketsMismatch = errors.New("histogram spans specify different number of buckets than provided") | ||
) | ||
|
||
func ValidateHistogram(h *Histogram) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this could be a method of Histogram
now, just called Validate
?
Would also be good to give it a doc comment to explain what is validated.
model/histogram/validate.go
Outdated
return nil | ||
} | ||
|
||
func ValidateFloatHistogram(h *FloatHistogram) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Corresponding comment here:
Maybe this could be a method of FloatHistogram
now, just called Validate
?
Would also be good to give it a doc comment to explain what is validated.
Signed-off-by: Linas Medziunas <linas.medziunas@gmail.com>
Signed-off-by: Linas Medziunas <linas.medziunas@gmail.com>
de67dde
to
184d142
Compare
Signed-off-by: Linas Medziunas <linas.medziunas@gmail.com>
184d142
to
ebed7d0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much.
This change was inspired by this comment: #12739 (comment)
Prior to this change, a non-strict (
>
) validation ofHistogram.Count
had to be done to accommodate edge cases resulting fromNaN
observations.After this change, those edge cases will be detected by checking
Sum
for beingNaN
, and if it is not, then a strict (!=
) validation ofHistogram.Count
will be performed instead. I assume this makes sense (but really open to feedback in case I have missed something).Note that I also took the liberty to move the validation code from
tsdb
tomodel/histogram
package. IMHO validation code being next to the domain object that it validates makes it easier to discover and use (avoiding extra package dependency). Also, consistency: I see that some other domain objects (Labels
,RuleNode
) have their validation functions undermodel
package. I did this move in a separate commit to make the actual logic change easier to review (also, easy to revert if I'm suggested that this move is not a good idea for some reason, or suggested to separate it out).