TSID index: switch to per-day index instead of global one #4563

hagen1778 · 2023-07-03T08:55:22Z

Is your feature request related to a problem? Please describe

The TSID index is the one most vulnerable to the high churn rate issue. High churn not only makes the index bigger but also makes the index/* caches less efficient, as they rely on the size of data blocks on disk.

Describe the solution you'd like

It would be nice to switch to per-day partitioning for TSID indexes. This may result in bigger disk usage in the long run, but would significantly reduce memory usage and increase the cache hit rate in environments with a high churn rate.
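To make the trade-off concrete, here is a minimal Go sketch of a per-day MetricName -> TSID mapping. It is a toy in-memory model, not the lib/storage code: the names `perDayIndex`, `dayKey`, `dateOf` and `WorkingSet`, and the use of a plain map, are assumptions made for illustration.

```go
package main

// TSID stands in for the real time series ID; here it is just a number.
type TSID uint64

const msecPerDay = 24 * 3600 * 1000

// dayKey identifies an index entry by UTC day and metric name.
type dayKey struct {
	date uint64 // days since the Unix epoch
	name string
}

// perDayIndex is a toy per-day MetricName -> TSID index (hypothetical type).
type perDayIndex struct {
	entries map[dayKey]TSID
}

func newPerDayIndex() *perDayIndex {
	return &perDayIndex{entries: make(map[dayKey]TSID)}
}

// dateOf converts a Unix timestamp in milliseconds to days since the epoch.
func dateOf(tsMillis int64) uint64 {
	return uint64(tsMillis / msecPerDay)
}

// Register records that the series was seen on the day containing tsMillis.
// A long-lived series gets one entry per day, which is the disk usage cost.
func (idx *perDayIndex) Register(tsMillis int64, name string, tsid TSID) {
	idx.entries[dayKey{dateOf(tsMillis), name}] = tsid
}

// Lookup consults only the entries registered for the day containing tsMillis.
func (idx *perDayIndex) Lookup(tsMillis int64, name string) (TSID, bool) {
	tsid, ok := idx.entries[dayKey{dateOf(tsMillis), name}]
	return tsid, ok
}

// WorkingSet returns the number of entries for the day containing tsMillis.
// Series that stopped reporting on earlier days do not count towards it.
func (idx *perDayIndex) WorkingSet(tsMillis int64) int {
	n := 0
	date := dateOf(tsMillis)
	for k := range idx.entries {
		if k.date == date {
			n++
		}
	}
	return n
}
```

In this model a series that disappeared yesterday adds nothing to today's working set, while a stable series is stored once per day. That mirrors the expected outcome: more entries on disk over time, but a smaller hot set under high churn.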

Describe alternatives you've considered

No response

Additional information

No response


hagen1778 · 2023-07-07T08:34:02Z

cc @valyala

indexDB rotation. Previously, during indexDB rotation, the dateMetricID cache was reset, which caused the creation of a lot of new records. This could saturate memory usage, since lookups for existing entries were made. With the new logic, daily index records are pre-created starting 1 hour before indexDB rotation. There is no need to reset the dateMetricID cache, since it belongs to indexDB. This greatly improves performance. It should help to implement the next feature #4563

During the hour before indexDB rotation, start creating records in the next indexDB. This should improve performance during the switch to the next indexDB and remove ingestion issues, since there is no need to create new index records for time series already ingested into the current indexDB #4563
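The pre-creation idea can be sketched in Go as follows. This is a simplified model under stated assumptions, not the actual lib/storage code: the `storage` and `indexDB` types, the `add` and `rotate` methods, and the `nextRotationSecs` field are hypothetical, and the real implementation may spread the pre-fill work differently within the hour.

```go
package main

// indexDB is a toy stand-in for one index generation.
type indexDB struct {
	generation uint64
	series     map[string]uint64 // metric name -> metricID
}

func newIndexDB(generation uint64) *indexDB {
	return &indexDB{generation: generation, series: make(map[string]uint64)}
}

// prefillWindowSecs is the window before rotation during which new entries
// are also written to the next indexDB (one hour, per the description).
const prefillWindowSecs = 3600

// storage holds the current and the next indexDB (hypothetical layout).
type storage struct {
	idbCurr          *indexDB
	idbNext          *indexDB
	nextRotationSecs int64 // Unix seconds of the upcoming rotation
}

func newStorage(nextRotationSecs int64) *storage {
	return &storage{
		idbCurr:          newIndexDB(0),
		idbNext:          newIndexDB(1),
		nextRotationSecs: nextRotationSecs,
	}
}

// add registers a series seen at nowSecs in the current indexDB.
// Within the last hour before rotation it is registered in the next one too.
func (s *storage) add(nowSecs int64, name string, metricID uint64) {
	s.idbCurr.series[name] = metricID
	if s.nextRotationSecs-nowSecs <= prefillWindowSecs {
		s.idbNext.series[name] = metricID
	}
}

// rotate switches to the pre-filled indexDB and prepares an empty successor.
func (s *storage) rotate(retentionSecs int64) {
	s.idbCurr = s.idbNext
	s.idbNext = newIndexDB(s.idbCurr.generation + 1)
	s.nextRotationSecs += retentionSecs
}
```

After `rotate`, series that were active during the last hour are already present in the new current indexDB, so ingestion does not have to re-create all of them at once.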

* lib/storage: pre-create timeseries before indexDB rotation

  During the hour before indexDB rotation, start creating records in the next indexDB. This should improve performance during the switch to the next indexDB and remove ingestion issues, since there is no need to create new index records for time series already ingested into the current indexDB #4563

* lib/storage: further work on indexdb rotation optimization

  - Document the change at docs/CHANGELOG.md
  - Move back various caches from indexDB to Storage. This makes the change less intrusive. The dateMetricIDCache now takes into account indexDB generation, so it stores (date, metricID) entries for both the current and the next indexDB.
  - Consolidate the code responsible for idbNext pre-filling into the prefillNextIndexDB() function. This improves code readability and maintainability a bit.
  - Rewrite and simplify the code responsible for calculating the next retention timestamp. Add various tests for corner cases of this code.
  - Remove indexdb pre-filling from the RegisterMetricNames() function, since this function is rarely called. It is OK to add indexdb entries on demand in this function. This simplifies the code.

  Updates #1401

* docs/CHANGELOG.md: refer to #4563

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
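The note about dateMetricIDCache taking the indexDB generation into account can be illustrated with a small Go sketch. The key layout and the `Set`/`Has` methods below are assumptions for illustration only; the real cache in lib/storage is structured differently and handles concurrency, which this sketch does not.

```go
package main

// dateMetricIDKey scopes a (date, metricID) pair to one indexDB generation.
type dateMetricIDKey struct {
	generation uint64
	date       uint64
	metricID   uint64
}

// dateMetricIDCache is a toy, non-thread-safe cache of known entries.
type dateMetricIDCache struct {
	m map[dateMetricIDKey]struct{}
}

func newDateMetricIDCache() *dateMetricIDCache {
	return &dateMetricIDCache{m: make(map[dateMetricIDKey]struct{})}
}

// Set marks the (date, metricID) entry as present in the given generation.
func (c *dateMetricIDCache) Set(generation, date, metricID uint64) {
	c.m[dateMetricIDKey{generation, date, metricID}] = struct{}{}
}

// Has reports whether the entry is known to exist in the given generation.
func (c *dateMetricIDCache) Has(generation, date, metricID uint64) bool {
	_, ok := c.m[dateMetricIDKey{generation, date, metricID}]
	return ok
}
```

Because the generation is part of the key, one cache can hold entries for both the current and the next indexDB, and an entry known in one generation is not mistaken for an entry in the other.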

valyala · 2023-07-22T22:25:33Z

The commit 7094fa3 adds per-day index for MetricName -> TSID. This commit will be included in the next release.


valyala · 2023-07-28T00:10:13Z

VictoriaMetrics uses per-day index for MetricName -> TSID mapping starting from v1.92.0.

Closing this feature request as done.


hagen1778 added the enhancement (New feature or request) and performance (Performance-related issue) labels Jul 3, 2023

hagen1778 assigned f41gh7 Jul 3, 2023

hagen1778 added the TBD (To Be Done) label Jul 7, 2023

f41gh7 mentioned this issue Jul 13, 2023

lib/storage: adds nextIndexDB to mitigate performance degradation on #4626

Closed

f41gh7 mentioned this issue Jul 17, 2023

lib/storage: pre-create timeseries before indexDB rotation #4652

Merged

valyala added a commit that referenced this issue Jul 22, 2023

Merge branch 'public-single-node' into gh-4563

e778188

valyala added a commit that referenced this issue Jul 22, 2023

Merge branch 'public-single-node' into gh-4563

dacef65

valyala added a commit that referenced this issue Jul 22, 2023

Merge branch 'public-single-node' into gh-4563

146eff0

valyala added a commit that referenced this issue Jul 22, 2023

docs/CHANGELOG.md: refer to #4563

3a3647e

valyala closed this as completed Jul 28, 2023

valyala removed the TBD (To Be Done) label Jul 28, 2023

valyala added a commit that referenced this issue Jul 29, 2023

lib/storage: update nextRotationTimestamp relative to the timestamp of the indexdb rotation

9082a84

Updates #1401 Updates #4563

valyala added a commit that referenced this issue Jul 29, 2023

lib/storage: update nextRotationTimestamp relative to the timestamp of the indexdb rotation

89ccf19

Updates #1401 Updates #4563

valyala mentioned this issue Sep 11, 2023

Some metrics are lost and I don't know how to debug it #4972

Closed

