stream aggregation: fix possible duplicated aggregation results #7118

Haleygo · 2024-09-27T10:05:19Z

When ingesting samples with the same labels (duplicated samples, or samples whose labels become identical after the by or without options are applied), concurrent pushes could register different entries for the same labelset in LabelsCompressor.
For example, both index 99 and index 100 can be assigned to label foo=1 in two concurrent pushes. Because the encoded keys then carry different label indexes, the samples appear as distinct entries in aggrState, which produces duplicated results after the label indexes are decompressed.

VictoriaMetrics/lib/streamaggr/streamaggr.go

Line 933 in fbde238

buf = compressLabels(buf, inputLabels.Labels, outputLabels.Labels)

In this pull request, the idxToLabel entry must be stored first, so that the idx can always be looked up once it becomes visible in lc.labelToIdx.
As a result, lc.idxToLabel could still contain a duplicated entry such as [100]="foo=1". Given the low likelihood of this race and the size of idxToLabel, this should be fine.

lib/promutils/labelscompressor.go

AndrewChubatiuk

LGTM

Previously, the same labelset could get multiple indexes in LabelsCompressor under concurrent sample ingestion, leading to duplicated states in aggrState and duplicated results.

hagen1778

LGTM

f41gh7

LGTM

Haleygo requested review from valyala, hagen1778 and AndrewChubatiuk September 27, 2024 10:05

Haleygo force-pushed the fix-duplicated-aggregation-results branch 2 times, most recently from 183c227 to 1f0f380 September 27, 2024 12:15

hagen1778 reviewed Sep 30, 2024

lib/promutils/labelscompressor.go (review thread resolved)

hagen1778 assigned AndrewChubatiuk Sep 30, 2024

hagen1778 added the bug label Sep 30, 2024

AndrewChubatiuk approved these changes Sep 30, 2024

Haleygo added 3 commits September 30, 2024 18:29

stream aggregation: fix possible duplicated aggregation results

74c5780

Previously, the same labelset could get multiple indexes in LabelsCompressor under concurrent sample ingestion, leading to duplicated states in aggrState and duplicated results.

rephrase changelog

954c61e

add more comments

7b63105

Haleygo force-pushed the fix-duplicated-aggregation-results branch from d74c77f to 7b63105 September 30, 2024 10:29

Haleygo requested a review from hagen1778 September 30, 2024 10:30

hagen1778 approved these changes Sep 30, 2024

f41gh7 approved these changes Sep 30, 2024

f41gh7 merged commit 664f337 into master Sep 30, 2024
8 checks passed

f41gh7 deleted the fix-duplicated-aggregation-results branch September 30, 2024 12:25
