-
Notifications
You must be signed in to change notification settings - Fork 8.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
promql(histograms): Change sample total calculation for histograms #12609
promql(histograms): Change sample total calculation for histograms #12609
Conversation
f60d015
to
fd4895f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much.
Very cool calculation of the actual size of a histogram. However, as explained in the comment, we have to do something less expensive and safer (which ideally still yields the same result).
promql/value.go
Outdated
@@ -168,6 +168,21 @@ func (p HPoint) MarshalJSON() ([]byte, error) { | |||
return json.Marshal([...]interface{}{float64(p.T) / 1000, h}) | |||
} | |||
|
|||
// histogramSamples returns the number of samples representing the histogram. | |||
// each sample is 4 bytes. | |||
func (p HPoint) histogramSamples() int { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would call it more like histogramSize
and explain it as "measured in relation to a conventional FPoint
i.e. 16 bytes.
promql/value.go
Outdated
// histogramSamples returns the number of samples representing the histogram. | ||
// each sample is 4 bytes. | ||
func (p HPoint) histogramSamples() int { | ||
return p.H.Size() / 4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should be (p.H.Size() + 8) / 16
.
In TotalSamples
below, we count each FPoint
as "one sample". One FPoint
is 16 bytes, a 64bit int timestamp and a 64bit float sample value. So "one sample" so far has been 16 bytes.
For an HPoint
, we have the same timestamp (8 bytes), and then a number of bytes in the histogram, the sum of which we need to divide by the 16 bytes we considered "one sample" so far.
promql/value.go
Outdated
} | ||
|
||
// totalHistogramSamples returns the total number of samples in the given slice of histograms. | ||
func totalHistogramSamples(histograms []HPoint) int { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same naming thoughts as above.
@@ -267,7 +282,7 @@ func (m Matrix) String() string { | |||
func (m Matrix) TotalSamples() int { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The doc comment here should be updated to reflect that histograms are taken into account in a weighted fashion.
Another code path that needs to change: https://github.com/prometheus/prometheus/blob/d6e1b1acdb1231f5bb694bc482b568ccea2b40af/promql/engine.go#L1339C6-L1339C6
|
And another one: Lines 1813 to 1824 in d6e1b1a
|
And another one: Lines 1862 to 1869 in d6e1b1a
|
Another one: Lines 2034 to 2036 in d6e1b1a
|
Sorry, this whole sample tracking now appears way more complicated than I have anticipated. |
380076c
to
ff69c62
Compare
b9189f0
to
a2b2106
Compare
Apologies for long delays. Still fighting my way through the review backlog… |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very good. I only have a few naming and comment nits, and I think the size calculation needs a fix. See my comments. Maybe I'm missing something.
Otherwise, this looks fine.
Apologies again for the review delay.
model/histogram/float_histogram.go
Outdated
// NOTE: this is only valid for 64 bit architectures. | ||
func (fh *FloatHistogram) Size() int { | ||
// Size of each slice separately | ||
posSpanSize := len(fh.PositiveSpans) * 8 // 8 bytes - int64 * 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
int64 * 2
isn't really a Span
. It's int32 + uint32
. Which still results in 8 bytes, so the number is correct.
model/histogram/float_histogram.go
Outdated
func (fh *FloatHistogram) Size() int { | ||
// Size of each slice separately | ||
posSpanSize := len(fh.PositiveSpans) * 8 // 8 bytes - int64 * 2 | ||
negSpanSize := len(fh.NegativeSpans) * 8 // 8 bytes - int64 * 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here.
model/histogram/float_histogram.go
Outdated
|
||
// Total size of the struct | ||
|
||
// fh is 32 bytes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where are these coming from?
model/histogram/float_histogram.go
Outdated
// fh.NegativeSpans is 24 bytes | ||
// fh.PositiveBuckets is 24 bytes | ||
// fh.NegativeBuckets is 24 bytes | ||
structSize := 165 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I run the following on my (amd64) machine:
fh := histogram.FloatHistogram{}
fmt.Println(unsafe.Sizeof(fh))
I get 136 bytes.
Which makes sense to me. All your numbers above seem fine, except the fh is 32 bytes
, which I don't understand. If I add them all up, I get 133 bytes. However, the 1 byte CounterResetHint takes effectively 4 bytes because of alignment in the struct, so 133 + 3 = 136.
So the question is where you got the 32 bytes from at the top?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I honestly can't remember....perhaps when I had the function calculating everything automatically? You are right, I'm getting the same result, changing now.
promql/value.go
Outdated
return (p.H.Size() + 8) / 16 | ||
} | ||
|
||
// totalHistogramSize returns the total number of samples in the given slice of histograms. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
histograms → HPoints
promql/value.go
Outdated
@@ -168,6 +168,23 @@ func (p HPoint) MarshalJSON() ([]byte, error) { | |||
return json.Marshal([...]interface{}{float64(p.T) / 1000, h}) | |||
} | |||
|
|||
// histogramSize returns the number of samples representing the histogram. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about: "histogramSize returns the size of the HPoint compared to the size of an FPoint. Or in other words: This HPoint is n times larger than an FPoint, where n is the returned number."
promql/value.go
Outdated
} | ||
|
||
// totalHistogramSize returns the total number of samples in the given slice of histograms. | ||
func totalHistogramSize(histograms []HPoint) int { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would call this totalHPointSize
as it includes the timestamp and the pointer to the actual histogram.
@marctc is this still on your radar? I think it's very little left to do, and it would be nice to get this in soon. |
Yep, looking into it today 👍 |
cbe95f9
to
27d9497
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One remaining substantial change (structSize
is still 3 bytes off), and otherwise just comment and naming nits. Almost there…
model/histogram/float_histogram.go
Outdated
@@ -334,6 +334,33 @@ func (h *FloatHistogram) Equals(h2 *FloatHistogram) bool { | |||
return true | |||
} | |||
|
|||
// Size returns the size of the whole fields histogram in bytes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should make clear that it includes the pointer to the histogram.
How about "Size returns the total size of the FloatHistogram, which includes the size of the pointer to FloatHistogram, all its fields, and all elements contained in slices."
model/histogram/float_histogram.go
Outdated
// Total size of the struct | ||
|
||
// fh is 8 bytes | ||
// fh.CounterResetHint is 1 byte |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But it takes 4 bytes because af alignment.
model/histogram/float_histogram.go
Outdated
// fh.NegativeSpans is 24 bytes | ||
// fh.PositiveBuckets is 24 bytes | ||
// fh.NegativeBuckets is 24 bytes | ||
structSize := 141 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this should be 144. (I suggested 136 in the last round of review, but I was not aware that your intention is to include the size of the pointer to FloatHistogram).
model/histogram/float_histogram.go
Outdated
|
||
// Total size of the struct | ||
|
||
// fh is 8 bytes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: This is a sentence and should end with a period. (Same everywhere else. Generally, I would even finish longer comments that aren't formally sentences with a period, like the "Total size of the struct" above. Comments are plain English and should have punctuation as if you are writing an English text.)
promql/value.go
Outdated
@@ -264,10 +296,11 @@ func (m Matrix) String() string { | |||
} | |||
|
|||
// TotalSamples returns the total number of samples in the series within a matrix. | |||
// It takes into account the number of samples in the histograms using the histogramSize method. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would just repeat the same sentence as you have done above for func (vec Vector) TotalSamples()
, i.e.
// Float samples have a weight of 1 in this number, while histogram samples have a higher
// weight according to their size compared with the size of a float sample.
// See HPoint.histogramSize for details.
promql/value.go
Outdated
// The total size is calculated considering the histogram timestamp (p.T - 8 bytes), | ||
// and then a number of bytes in the histogram. | ||
// This sum is divided by 16, as samples are 16 bytes. | ||
func (p HPoint) histogramSize() int { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for more naming nits. But why not call this just size()
? It returns the size of the whole HPoint
, not just of the histogram that is part of the HPoint.
Signed-off-by: Marc Tuduri <marctc@protonmail.com>
27d9497
to
98bb3d0
Compare
Signed-off-by: Marc Tuduri <marctc@protonmail.com>
Signed-off-by: Marc Tuduri <marctc@protonmail.com>
98bb3d0
to
1ce066e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much. And sorry for this being way more involved that I initially thought.
This PR changes the way of number samples are calculated when doing query operations with
histograms.
Added method for
FloatHistogram
s that calculates the size in bytes of the whole structand its fields.
Added method that calculates number of samples (4 bytes per sample) of histogram in
HPoint
struct.Fixes #11555