Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resort store response set on internal label dedup #6529

Closed
wants to merge 2 commits into from

Conversation

saswatamcode
Copy link
Member

  • I added CHANGELOG entry for this change.
  • Change is not relevant to the end user.

Changes

This PR does a couple of things,

  • Modifies TestQueryStoreDedup to start a new StoreGW and minio instance on every testcase. The main motivation for doing this is that each testcase needs to allow StoreGW to advertise a different set of labels as external labels and that no two cases interfere with each other.
  • Resorts the store response set on internal label deduplication. When deduplicating on labels which are stored internally in TSDB, the store response set needs to be resorted after replica labels are removed. This PR detects when internal labels are being used for dedup, by checking for their presence in the Store client's external labels.

Aims to fix #6257. Takes after @fpetkovski's earlier work in #6317 (but inverts the approach).

Verification

Modified regression test cases for Store, Sidecar and Receive dedup and tested locally.

…s own

minio and StoreGW. The main motivation for doing this is that each testcase
needs to allow StoreGW to advertise a different set of labels as external
labels, and that no two cases interfere with each other.

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
When deduplicating on labels which are stored internally in TSDB,
the store response set needs to be resorted after replica labels
are removed. This commit detects when internal labels are being used for
dedup, by checking for their presence in the Store client's external labels.

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
@sonatype-lift
Copy link

sonatype-lift bot commented Jul 13, 2023

Sonatype Lift is retiring

Sonatype Lift will be retiring on Sep 12, 2023, with its analysis stopping on Aug 12, 2023. We understand that this news may come as a disappointment, and Sonatype is committed to helping you transition off it seamlessly. If you’d like to retain your data, please export your issues from the web console.
We are extremely grateful and thank you for your support over the years.

📖 Read about the impacts and timeline

@@ -618,6 +618,25 @@ func newAsyncRespSet(
}
}

// hasInternalReplicaLabels returns true if any replica label in the series request is not an
// external label for the given Client.
func hasInternalReplicaLabels(st Client, req *storepb.SeriesRequest) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will always lead to re-sorting if stores have different external labels right? Like prometheus_replica on a sidecar and rule_replica on a ruler.

@saswatamcode saswatamcode marked this pull request as draft July 13, 2023 07:56
@douglascamata
Copy link
Contributor

  • Modifies TestQueryStoreDedup to start a new StoreGW and minio instance on every testcase. The main motivation for doing this is that each testcase needs to allow StoreGW to advertise a different set of labels as external labels and that no two cases interfere with each other.

Is this relevant for the test? Is there any interference? Spinning up all these Store GWs, MinIOs, and uploading on each test case will take much more time relative to the way this is done right now. I was intentionally trying to save time, especially as I noticed using a single Store GW I was able to reproduce all scenarios and didn't identify any bad interference.

I think we might potentially want to keep both scenarios.

@saswatamcode
Copy link
Member Author

Yup it was interfering, but I think that's due to the nature of this fix. We match replica labels to store external labels exactly, and fall back otherwise. I modified it to per test case, so that the store gw exposes the exact replica labels needed for the test.

But yeah this can probably be reverted in the bloom filter approach.

@saswatamcode
Copy link
Member Author

@douglascamata reverted this test change in #6317. All these tests now pass with correct values there 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Deduplication returning deduped and non-deduped results in 0.31.0+
3 participants