Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Distributor/Ingestor: Replace out of order validation with ingestion window #2452

Conversation

simonswine
Copy link
Contributor

@simonswine simonswine commented Sep 25, 2023

This change removes of order validation and instead make sure the ingested profiles are within the ingestion window, which will by default reach from now-1h to now+10m.

This will ensure a smoother experience with workloads that don't have unique labels sets

@simonswine simonswine force-pushed the 20230925_replace-out-of-order-with-ingestion-window branch from 48d4a31 to f14b9ba Compare September 26, 2023 16:24
@simonswine simonswine changed the title 20230925 replace out of order with ingestion window Replace out of order with ingestion window Sep 26, 2023
@simonswine simonswine changed the title Replace out of order with ingestion window Distributor/Ingestor: Replace out of order validation with ingestion window Sep 26, 2023
@simonswine simonswine marked this pull request as ready for review September 26, 2023 16:26
@simonswine simonswine requested a review from a team as a code owner September 26, 2023 16:26
@simonswine simonswine force-pushed the 20230925_replace-out-of-order-with-ingestion-window branch from f14b9ba to 9bbb6f4 Compare September 26, 2023 21:09
@kolesnikovae
Copy link
Collaborator

kolesnikovae commented Sep 27, 2023

I'm wondering if we should do anything about the order of the profiles in memory because I think it's necessary for deduplication. I see that profiles are sorted when we flush a row group on disk, but I couldn't find if we sort in-memory profiles explicitly at query time. I expected to find it here, but maybe I don't fully understand how it works.

Copy link
Collaborator

@korniltsev korniltsev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good

should we remove delta as well? just read the design doc

@simonswine simonswine force-pushed the 20230925_replace-out-of-order-with-ingestion-window branch from 9bbb6f4 to fa5f8e9 Compare September 28, 2023 08:10
@simonswine simonswine force-pushed the 20230925_replace-out-of-order-with-ingestion-window branch from fa5f8e9 to 06b45e4 Compare September 28, 2023 08:10
@simonswine simonswine force-pushed the 20230925_replace-out-of-order-with-ingestion-window branch from 06b45e4 to fb00961 Compare September 28, 2023 08:11
@simonswine
Copy link
Contributor Author

I'm wondering if we should do anything about the order of the profiles in memory because I think it's necessary for deduplication. I see that profiles are sorted when we flush a row group on disk, but I couldn't find if we sort in-memory profiles explicitly at query time. I expected to find it here, but maybe I don't fully understand how it works.

@kolesnikovae you have raised a good point, I have added a binary search for the cases when profiles get added out of order.

Copy link
Contributor

@cyriltovena cyriltovena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Yep we need to keep the in memory sorted.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants