metricvec: handle hash collision for labeled metrics #221
Conversation
prometheus/vec.go
Outdated
```go
if ok && len(metrics) == 1 {
	return metrics[0].metric, nil
}
m.mtx.RUnlock()
```
How does the lock get released in case of the happy path above?
Fixing.
Force-pushed from e8e3d37 to 1ac1158.
Take a peek at 9b464a4. It provides some further fixes around collision handling and greatly increases the unit test coverage. Zero-alloc behavior is preserved at the cost of some performance.
```go
// deleteByHash removes the metric from the hash bucket h. If there are
// multiple matches in the bucket, use lvs to select a metric and remove only
// that metric.
func (m *MetricVec) deleteByHash(h uint64, values interface{}) bool {
```
I'm not really a fan of interface type arguments. We're losing the advantages of the type system. Is this really necessary?
To maintain correctness and zero-alloc behavior without repeating code, it is necessary. Any errors are caught during test time.
What is the issue? This is private code. Are you just afraid?
We could, of course, repeat the code. Go isn't really afraid of that. But in this case, it would be a lot of code. I'm fine with the interface usage as it avoids the allocation in L117 of the old code.
It might be good, though, to document the requirements for `values` at the top of the call stack (even if it is an unexported method). I'll leave a comment where I think it should be documented.
You're right @stevvooe, I couldn't come up with a typed implementation either without causing more allocations. We'll need to wait until Go's escape analysis becomes better.
@grobie @beorn7 Just made a polymorphic implementation and confirmed we get allocations.
Up to you, but it might be better to allow allocations and wait for better escape analysis. It would avoid the needless complexity in this code. A simple string slice header allocation on each lookup would let us delete a massive swath of code.
I guess we're more interested in as little overhead as possible for the client than in a nice and short code base. @beorn7 is the gatekeeper of this repository, though.
Yes, avoiding allocations in this code path is essential. It is very likely to be called very frequently by users.
Current code is the best solution, I guess.
Note: I did not do a full review. Only commented on two things I saw while on mobile.
Spotting deadlocks while looking at code on his mobile phone — that's Super-Grobie's super power...
Force-pushed from 9b464a4 to db87834.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
While hash collisions are quite rare, the current state of the client library carries a risk of merging two separate label values into a single metric bucket. The effects are nearly impossible to detect and the result will be missing or incorrect counters.

This changeset handles hash collisions by falling back to collision resolution if multiple label values hash to the same value. This works similarly to separate chaining using a slice. Extra storage is minimized to only the label value key slice so that metrics can be differentiated within a bucket.

In general, the cost of handling collisions is completely minimized under normal operation. Performance does show slight increases in certain areas, but these are more likely statistical anomalies. More importantly, zero-allocation behavior for metrics is preserved on the fast path. Minimal allocations may be made during collision handling, but this has minimal effect.

Benchmark comparisons with and without collision resolution follow.

```
benchmark                                                      old ns/op  new ns/op  delta
BenchmarkCounterWithLabelValues-4                              99.0       107        +8.08%
BenchmarkCounterWithLabelValuesConcurrent-4                    79.6       91.0       +14.32%
BenchmarkCounterWithMappedLabels-4                             518        542        +4.63%
BenchmarkCounterWithPreparedMappedLabels-4                     127        137        +7.87%
BenchmarkCounterNoLabels-4                                     19.5       19.1       -2.05%
BenchmarkGaugeWithLabelValues-4                                97.4       110        +12.94%
BenchmarkGaugeNoLabels-4                                       12.4       10.3       -16.94%
BenchmarkSummaryWithLabelValues-4                              1204       915        -24.00%
BenchmarkSummaryNoLabels-4                                     936        847        -9.51%
BenchmarkHistogramWithLabelValues-4                            147        147        +0.00%
BenchmarkHistogramNoLabels-4                                   50.6       49.3       -2.57%
BenchmarkHistogramObserve1-4                                   37.9       37.5       -1.06%
BenchmarkHistogramObserve2-4                                   122        137        +12.30%
BenchmarkHistogramObserve4-4                                   310        352        +13.55%
BenchmarkHistogramObserve8-4                                   691        729        +5.50%
BenchmarkHistogramWrite1-4                                     3374       3097       -8.21%
BenchmarkHistogramWrite2-4                                     5310       5051       -4.88%
BenchmarkHistogramWrite4-4                                     12094      10690      -11.61%
BenchmarkHistogramWrite8-4                                     19416      17755      -8.55%
BenchmarkHandler-4                                             11934304   13765894   +15.35%
BenchmarkSummaryObserve1-4                                     1119       1105       -1.25%
BenchmarkSummaryObserve2-4                                     3679       3430       -6.77%
BenchmarkSummaryObserve4-4                                     10678      7982       -25.25%
BenchmarkSummaryObserve8-4                                     22974      16689      -27.36%
BenchmarkSummaryWrite1-4                                       25962      14680      -43.46%
BenchmarkSummaryWrite2-4                                       38019      35073      -7.75%
BenchmarkSummaryWrite4-4                                       78027      56816      -27.18%
BenchmarkSummaryWrite8-4                                       117220     132248     +12.82%
BenchmarkMetricVecWithLabelValuesBasic-4                       138        133        -3.62%
BenchmarkMetricVecWithLabelValues2Keys10ValueCardinality-4     150        144        -4.00%
BenchmarkMetricVecWithLabelValues4Keys10ValueCardinality-4     263        256        -2.66%
BenchmarkMetricVecWithLabelValues2Keys100ValueCardinality-4    145        155        +6.90%
BenchmarkMetricVecWithLabelValues10Keys100ValueCardinality-4   606        634        +4.62%
BenchmarkMetricVecWithLabelValues10Keys1000ValueCardinality-4  746        641        -14.08%

benchmark                                                      old allocs  new allocs  delta
BenchmarkCounterWithLabelValues-4                              0           0           +0.00%
BenchmarkCounterWithLabelValuesConcurrent-4                    0           0           +0.00%
BenchmarkCounterWithMappedLabels-4                             2           2           +0.00%
BenchmarkCounterWithPreparedMappedLabels-4                     0           0           +0.00%
BenchmarkCounterNoLabels-4                                     0           0           +0.00%
BenchmarkGaugeWithLabelValues-4                                0           0           +0.00%
BenchmarkGaugeNoLabels-4                                       0           0           +0.00%
BenchmarkSummaryWithLabelValues-4                              0           0           +0.00%
BenchmarkSummaryNoLabels-4                                     0           0           +0.00%
BenchmarkHistogramWithLabelValues-4                            0           0           +0.00%
BenchmarkHistogramNoLabels-4                                   0           0           +0.00%
BenchmarkMetricVecWithLabelValuesBasic-4                       0           0           +0.00%
BenchmarkMetricVecWithLabelValues2Keys10ValueCardinality-4     0           0           +0.00%
BenchmarkMetricVecWithLabelValues4Keys10ValueCardinality-4     0           0           +0.00%
BenchmarkMetricVecWithLabelValues2Keys100ValueCardinality-4    0           0           +0.00%
BenchmarkMetricVecWithLabelValues10Keys100ValueCardinality-4   0           0           +0.00%
BenchmarkMetricVecWithLabelValues10Keys1000ValueCardinality-4  0           0           +0.00%

benchmark                                                      old bytes  new bytes  delta
BenchmarkCounterWithLabelValues-4                              0          0          +0.00%
BenchmarkCounterWithLabelValuesConcurrent-4                    0          0          +0.00%
BenchmarkCounterWithMappedLabels-4                             336        336        +0.00%
BenchmarkCounterWithPreparedMappedLabels-4                     0          0          +0.00%
BenchmarkCounterNoLabels-4                                     0          0          +0.00%
BenchmarkGaugeWithLabelValues-4                                0          0          +0.00%
BenchmarkGaugeNoLabels-4                                       0          0          +0.00%
BenchmarkSummaryWithLabelValues-4                              0          0          +0.00%
BenchmarkSummaryNoLabels-4                                     0          0          +0.00%
BenchmarkHistogramWithLabelValues-4                            0          0          +0.00%
BenchmarkHistogramNoLabels-4                                   0          0          +0.00%
BenchmarkMetricVecWithLabelValuesBasic-4                       0          0          +0.00%
BenchmarkMetricVecWithLabelValues2Keys10ValueCardinality-4     0          0          +0.00%
BenchmarkMetricVecWithLabelValues4Keys10ValueCardinality-4     0          0          +0.00%
BenchmarkMetricVecWithLabelValues2Keys100ValueCardinality-4    0          0          +0.00%
BenchmarkMetricVecWithLabelValues10Keys100ValueCardinality-4   0          0          +0.00%
BenchmarkMetricVecWithLabelValues10Keys1000ValueCardinality-4  0          0          +0.00%
```

Signed-off-by: Stephen J Day <stephen.day@docker.com>
Force-pushed from db87834 to 0992f45.
prometheus/vec.go
Outdated
```go
// getOrCreateMetric retrieves the metric by hash and label value or creates it
// and returns the new one.
//
// This function takes a lock, so avoid
```
Unfinished sentence.
And here is the other entry point where the requirements for `interface{}` should be documented.
Unfinished sentence.
Whoops. I was trying to get this pushed for review.
Looks good. Just a few naming and comment nits. I'll try to dig out an fnv64a hash collision.
```diff
 h := hashNew()
 for _, val := range vals {
-	h = hashAdd(h, val)
+	h = m.hashAdd(h, val)
```
This should actually be followed by `hashAddByte(h, separatorByte)`. (This is a bug in the original code already: hash collisions could be created with label values like `{"foo", "bar"}` and `{"foob", "ar"}`.)
Good catch. I'll use `0x00` to keep the sort order of the tuple in case we decide to optimize this differently.
`separatorByte` (= 0xFF) has the advantage that it is not a valid byte anywhere in a UTF-8 string, while 0x00 is (I believe). It is also consistently used in other hashing in Prometheus code. Since the hashing is completely internal here, it doesn't really matter: even if somebody constructed a collision, the detection you have implemented here handles it. Still, I'd prefer consistency.
> separatorByte (= 0xFF) has the advantage that it is not a valid byte value anywhere in a UTF-8 string, while 0x00 is (I believe). It is also consistently used in other hashing in Prometheus code.
The main problem there is that if you're debugging and print in the wrong place, things explode. Also, if you ever want to use these keys for binary search, which may make sense in certain cases, you'll have to generate a key with a different separator that doesn't mess with tuple-oriented sort order. `0x00`-separated records are near-standard; I'm really surprised that one would go with anything else. Where else in the code base is this used?
This is about hashing, not about tuples. Only add this character while creating the hash, nowhere else.
https://github.com/prometheus/common/blob/master/model/signature.go
Good sir, we are hashing tuples.
For a certain definition of tuples, certainly. But we are not modifying the "tuple": the separator byte is only added during hashing, so there is no way it will ever be printed or used in sorting.
The null byte is a valid UTF-8 character, so I could legally put it into a label value. The 0xFF byte can never be part of a valid UTF-8 string, which qualifies it as a good separator character during hashing.
> The 0xFF character can never be part of a valid UTF-8 string. That qualifies it as a good separator character during hashing.
I don't see this link, but I've used the constant `model.SeparatorByte`.
To give you context: we had a sort error in the Docker registry API due to the use of a `/`, which sorts earlier than a `-`. Changing to `0x00` gives you the expected lexical sort order.
Thanks for hunting this down, but this won't test the 2+N search case. That extra test caught a number of bugs in both the new and original implementations.
More strings with hash value 0: =. hx"iX<; QvXtM>@Fp% _"kWk=-v$c
Force-pushed from 0992f45 to 27c4f00.
After increasing unit test coverage, it was found that the split-function-call approach to metric matching wasn't working well in many cases. By increasing test coverage, we've ensured that both the fast path and the fallback collision path work correctly. With these changes there is a further performance hit, but the results are now ensured to be correct.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
Force-pushed from 27c4f00 to 4db77b0.
Most commentary is dealt with. I have not added the new collision tests yet, as I don't see a way to get the coverage needed. If you're fine with getting rid of the extra coverage, I'll re-tool to use the provided values. If we want to keep it, I'll add a separate test with the colliding values.
Tests are fine. I'm merging this now, as I want to have it in the release planned for today. Feel free to add a test with a true collision in a separate PR. I can also do it myself. Thanks for your contribution.
Only now do I get it: the benchmarks in the commit description are superseded by the following commit. We indeed have to rework to have no allocations in the … Thanks again.
@beorn7 Alright, thanks for taking the PR. Let me know if you need anything else.
While hash collisions are quite rare, the current state of the client
library carries a risk of merging two separate label values into a
single metric bucket. The effects are nearly impossible to detect and the
result will be missing or incorrect counters.
This changeset handles hash collisions by falling back to collision
resolution if multiple label values hash to the same value. This works
similarly to separate chaining using a slice. Extra storage is minimized
to only the label value key slice so that metrics can be differentiated
within a bucket.
In general, the cost of handling collisions is completely minimized
under normal operation. Performance does show slight increases in
certain areas, but these are more likely statistical anomalies. More
importantly, zero-allocation behavior for metrics is preserved on the
fast path. Minimal allocations may be made during collision handling,
but this has minimal effect.
Benchmark comparisons with and without collision resolution follow.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
Closes #220