TSID index: switch to per-day index instead of global one #4563
hagen1778 added the enhancement (New feature or request) and performance (Performance-related issue) labels on Jul 3, 2023
cc @valyala
f41gh7 added a commit that referenced this issue on Jul 13, 2023
indexDB rotation
Previously, during indexDB rotation the dateMetricID cache was reset, which caused a lot of new records to be created. This could saturate memory usage, since lookups for existing entries were made. With the new logic, daily index records are pre-created during the hour before indexDB rotation. There is no need to reset the dateMetricID cache, since it belongs to indexDB. This greatly improves performance. It should help to implement the next feature #4563
f41gh7 added a commit that referenced this issue on Jul 17, 2023
During the hour before indexDB rotation, start creating records in the next indexDB. This should improve performance during the switch to the next indexDB and remove ingestion issues, since there is no need to create new index records for time series already ingested into the current indexDB. #4563
f41gh7 added a commit that referenced this issue on Jul 18, 2023
valyala added four commits that referenced this issue on Jul 22, 2023
valyala added a commit that referenced this issue on Jul 22, 2023
* lib/storage: pre-create timeseries before indexDB rotation
  During the hour before indexDB rotation, start creating records in the next indexDB. This should improve performance during the switch to the next indexDB and remove ingestion issues, since there is no need to create new index records for time series already ingested into the current indexDB. #4563
* lib/storage: further work on indexdb rotation optimization
  - Document the change at docs/CHANGELOG.md
  - Move back various caches from indexDB to Storage. This makes the change less intrusive. The dateMetricIDCache now takes into account indexDB generation, so it stores (date, metricID) entries for both the current and the next indexDB.
  - Consolidate the code responsible for idbNext pre-filling into prefillNextIndexDB() function. This improves code readability and maintainability a bit.
  - Rewrite and simplify the code responsible for calculating the next retention timestamp. Add various tests for corner cases of this code.
  - Remove indexdb pre-filling from RegisterMetricNames() function, since this function is rarely called. It is OK to add indexdb entries on demand in this function. This simplifies the code.
  Updates #1401
* docs/CHANGELOG.md: refer to #4563
---------
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
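To illustrate the point about dateMetricIDCache taking the indexDB generation into account, here is a minimal sketch of a set keyed by (generation, date, metricID). It is an assumption-laden toy, not the actual cache from lib/storage: the type and method names (`cacheKey`, `dateMetricIDSet`, `Has`, `Set`) are hypothetical, and the sketch is not safe for concurrent use.

```go
package main

import "fmt"

// cacheKey identifies a (date, metricID) entry within one indexDB generation.
// Hypothetical layout chosen for illustration.
type cacheKey struct {
	generation uint64 // which indexDB (current or next) the entry belongs to
	date       uint64 // days since the Unix epoch
	metricID   uint64
}

// dateMetricIDSet is a toy stand-in for a cache of already-registered
// (date, metricID) pairs. It is not goroutine-safe.
type dateMetricIDSet struct {
	m map[cacheKey]struct{}
}

func newDateMetricIDSet() *dateMetricIDSet {
	return &dateMetricIDSet{m: make(map[cacheKey]struct{})}
}

// Has reports whether the entry is known for the given generation.
func (s *dateMetricIDSet) Has(generation, date, metricID uint64) bool {
	_, ok := s.m[cacheKey{generation, date, metricID}]
	return ok
}

// Set records the entry for the given generation.
func (s *dateMetricIDSet) Set(generation, date, metricID uint64) {
	s.m[cacheKey{generation, date, metricID}] = struct{}{}
}

func main() {
	c := newDateMetricIDSet()
	c.Set(1, 19560, 42) // registered in the current indexDB (generation 1)

	fmt.Println(c.Has(1, 19560, 42)) // true
	fmt.Println(c.Has(2, 19560, 42)) // false: not yet pre-filled in the next indexDB
}
```

Because the generation is part of the key, one cache can answer lookups for both the current and the next indexDB without being reset at rotation.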
The commit 7094fa3 adds per-day index for
valyala added a commit that referenced this issue on Jul 22, 2023
VictoriaMetrics uses per-day index for
Closing this feature request as done.
Is your feature request related to a problem? Please describe
The TSID index is the most vulnerable to high churn rate issues. A high churn rate not only makes the index bigger but also makes index/* caches less efficient, as they depend on the size of data blocks on disk.
Describe the solution you'd like
It would be nice to switch to per-day partitioning for TSID indexes. This may result in bigger disk usage in the long run, but would significantly reduce memory usage and increase the cache hit rate in environments with a high churn rate.
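The trade-off can be pictured with a small sketch of a per-day index. This is a simplified model under assumptions, not how VictoriaMetrics stores its index: `perDayIndex`, `dayFromMillis` and the other names are hypothetical, and in-memory maps stand in for on-disk index blocks.

```go
package main

import "fmt"

// tsid is a stand-in for a time series identifier.
type tsid uint64

// dayFromMillis converts a Unix timestamp in milliseconds to a day number.
func dayFromMillis(tsMillis int64) uint64 {
	return uint64(tsMillis / (24 * 3600 * 1000))
}

// perDayIndex maps day -> metric name -> TSID. Hypothetical structure used
// only to illustrate per-day partitioning.
type perDayIndex struct {
	days map[uint64]map[string]tsid
}

func newPerDayIndex() *perDayIndex {
	return &perDayIndex{days: make(map[uint64]map[string]tsid)}
}

// add registers a series for the given day.
func (idx *perDayIndex) add(day uint64, name string, id tsid) {
	if idx.days[day] == nil {
		idx.days[day] = make(map[string]tsid)
	}
	idx.days[day][name] = id
}

// lookup searches only the partition of the given day.
func (idx *perDayIndex) lookup(day uint64, name string) (tsid, bool) {
	id, ok := idx.days[day][name]
	return id, ok
}

// entries returns the total number of stored entries across all days.
func (idx *perDayIndex) entries() int {
	n := 0
	for _, m := range idx.days {
		n += len(m)
	}
	return n
}

func main() {
	idx := newPerDayIndex()
	day1 := dayFromMillis(1690000000000)
	day2 := day1 + 1

	// A long-lived series is stored once per day: more disk usage over time.
	idx.add(day1, `up{job="node"}`, 1)
	idx.add(day2, `up{job="node"}`, 1)
	// A short-lived (churned) series only occupies the day it existed in.
	idx.add(day1, `up{pod="web-1"}`, 2)

	fmt.Println(idx.entries()) // 3
	_, ok := idx.lookup(day2, `up{pod="web-1"}`)
	fmt.Println(ok) // false: day2 lookups never touch churned day1 entries
}
```

In this model, lookups for recent data touch only the recent day's entries, which is the intuition behind the expected cache hit rate improvement, while long-lived series pay with one extra entry per day.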
Describe alternatives you've considered
No response
Additional information
No response