stream aggregation: fix possible duplicated aggregation results #7118
Merged
Haleygo force-pushed the fix-duplicated-aggregation-results branch 2 times, most recently from 183c227 to 1f0f380 (September 27, 2024 12:15)
hagen1778 reviewed Sep 30, 2024
AndrewChubatiuk approved these changes Sep 30, 2024
LGTM
Previously, the same labelset could be assigned multiple indexes in LabelsCompressor during concurrent sample ingestion, leading to duplicated states in aggrState and duplicated results.
Haleygo force-pushed the fix-duplicated-aggregation-results branch from d74c77f to 7b63105 (September 30, 2024 10:29)
hagen1778 approved these changes Sep 30, 2024
LGTM
f41gh7 approved these changes Sep 30, 2024
LGTM
valyala pushed a commit that referenced this pull request Sep 30, 2024
When ingesting samples with the same labels (duplicated samples, or samples whose labels become identical after the `by` or `without` options are applied), concurrent pushes could register different entries for the same labelset in LabelsCompressor. For example, both index 99 and index 100 could be assigned to the label `foo=1` in two concurrent pushes. Because the encoded keys then carry different label indexes, the samples appear as distinct in aggrState, which results in duplicated results after the label indexes are decompressed.

https://github.com/VictoriaMetrics/VictoriaMetrics/blob/fbde238cdcdf4e2c892d85a3e9e2be6e54e69cef/lib/streamaggr/streamaggr.go#L933

In this pull request, the `idxToLabel` entry has to be stored first so that the idx can be looked up as soon as it becomes visible via `lc.labelToIdx.Store`. As a consequence, `lc.idxToLabel` can still contain a duplicated entry such as [100]="foo=1". Given the low likelihood of this race and the size of idxToLabel, this should be fine.
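The race and the store ordering described above can be illustrated with a minimal sketch. It is not the code from this pull request: the `labelsCompressor` type, its string-typed labels, and the `compress`, `decompress` and `compressConcurrently` helpers are hypothetical, and the use of `sync.Map.LoadOrStore` to pick a single winning index is an assumption about one way to get the described behavior.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// labelsCompressor is a simplified, hypothetical stand-in for LabelsCompressor.
// Labels are plain strings here to keep the sketch short.
type labelsCompressor struct {
	labelToIdx sync.Map // string -> uint64
	idxToLabel sync.Map // uint64 -> string
	nextIdx    atomic.Uint64
}

// compress returns the index for label, allocating one on first use.
func (lc *labelsCompressor) compress(label string) uint64 {
	if v, ok := lc.labelToIdx.Load(label); ok {
		return v.(uint64)
	}
	idx := lc.nextIdx.Add(1)
	// Store the reverse mapping first, so that any goroutine which later
	// sees idx in labelToIdx is already able to decompress it.
	lc.idxToLabel.Store(idx, label)
	// LoadOrStore (an assumption of this sketch) lets exactly one goroutine
	// win; the losers reuse the winner's index, so encoded keys stay equal.
	v, loaded := lc.labelToIdx.LoadOrStore(label, idx)
	if loaded {
		// idxToLabel keeps an unused duplicate entry for idx,
		// e.g. [100]="foo=1". It is never referenced by any key.
		return v.(uint64)
	}
	return idx
}

// decompress maps an index back to its label.
func (lc *labelsCompressor) decompress(idx uint64) (string, bool) {
	v, ok := lc.idxToLabel.Load(idx)
	if !ok {
		return "", false
	}
	return v.(string), true
}

// compressConcurrently compresses the same label from n goroutines
// and returns the index each goroutine observed.
func compressConcurrently(lc *labelsCompressor, label string, n int) []uint64 {
	idxs := make([]uint64, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			idxs[i] = lc.compress(label)
		}(i)
	}
	wg.Wait()
	return idxs
}

func main() {
	lc := &labelsCompressor{}
	// All entries are equal; which index wins depends on scheduling.
	fmt.Println(compressConcurrently(lc, "foo=1", 8))
}
```

With a plain `Store` in place of `LoadOrStore`, two goroutines that both miss the initial `Load` would each return their own freshly allocated index, which is the 99 versus 100 situation described above.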