Add ability to limit the maximum number of CPU cores used for queries #291

Closed
valyala opened this issue Jan 20, 2020 · 3 comments
Labels
enhancement (New feature or request)

Comments


valyala commented Jan 20, 2020

Is your feature request related to a problem? Please describe.
A single heavy query in VictoriaMetrics can occupy all the available CPU cores. This is good for returning query results as soon as possible, but it can negatively affect the data ingestion pipeline, which may starve for CPU resources while heavy queries are running. This has been observed in production.

Describe the solution you'd like
It would be great to add a -search.maxCPUs command-line flag for limiting the maximum number of CPU cores that can be used by the query pipeline. This would leave a guaranteed number of CPU cores for the data ingestion pipeline. For instance, if a system has 64 CPU cores, then -search.maxCPUs=60 would guarantee that at least 4 CPU cores are always available for the data ingestion path.
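
As a rough illustration of the proposed limit, here is a minimal sketch of capping the query pipeline with a worker semaphore; `queryWorkerSem`, `initQueryCPULimit` and `runQueryTask` are hypothetical names for illustration, not VictoriaMetrics internals:

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// queryWorkerSem is a hypothetical semaphore that caps the number of
// goroutines doing query work. Ingestion goroutines never acquire it,
// so the remaining CPU cores stay available for the ingestion path.
var queryWorkerSem chan struct{}

// initQueryCPULimit reserves maxCPUs slots for query workers. With 64 cores
// and maxCPUs=60, at least 4 cores remain free for ingestion even when
// queries saturate all of their slots.
func initQueryCPULimit(maxCPUs int) {
	if maxCPUs <= 0 || maxCPUs > runtime.NumCPU() {
		maxCPUs = runtime.NumCPU()
	}
	queryWorkerSem = make(chan struct{}, maxCPUs)
}

// runQueryTask executes f only after acquiring a query worker slot.
func runQueryTask(f func()) {
	queryWorkerSem <- struct{}{}        // acquire a slot
	defer func() { <-queryWorkerSem }() // release it when done
	f()
}

func main() {
	initQueryCPULimit(2) // leave the remaining cores for data ingestion

	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			runQueryTask(func() { fmt.Println("query task", n) })
		}(i)
	}
	wg.Wait()
}
```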

Describe alternatives you've considered
An alternative is to reduce -search.maxConcurrentRequests. This reduces the number of concurrently running queries, leaving more chances for the ingestion path to get the CPU resources it needs. Note that -search.maxConcurrentRequests doesn't limit the number of CPU cores that can be used by queries, since a single heavy query can still take all the available CPU cores, as described above.
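
For contrast, here is a sketch of why a request-level limit alone doesn't bound CPU usage; the names are illustrative only:

```go
package main

import "sync"

// concurrentRequestsCh mirrors the idea behind -search.maxConcurrentRequests:
// it bounds how many queries run at once, not how many cores each query uses.
var concurrentRequestsCh = make(chan struct{}, 4)

// handleQuery admits at most 4 queries concurrently, yet each admitted query
// still fans out one goroutine per data block, so a single heavy query can
// keep every CPU core busy.
func handleQuery(processBlock func(), blocks int) {
	concurrentRequestsCh <- struct{}{} // wait for a request slot
	defer func() { <-concurrentRequestsCh }()

	var wg sync.WaitGroup
	for i := 0; i < blocks; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			processBlock()
		}()
	}
	wg.Wait()
}

func main() {
	// A single admitted query with thousands of blocks can still saturate all cores.
	handleQuery(func() {}, 1000)
}
```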

@valyala valyala added the enhancement label on Jan 20, 2020
valyala added a commit that referenced this issue Jul 5, 2020
Heavy queries could result in a lack of CPU resources for processing the current data ingestion stream.
Prevent this by delaying query execution until free resources are available for data ingestion.

Expose the `vm_search_delays_total` metric, which may be used for alerting when there aren't enough CPU resources
for data ingestion and/or for executing heavy queries.

Updates #291
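
A minimal sketch of this kind of pacing, assuming a hypothetical `ingestionNeedsCPU` signal and a local counter in the spirit of `vm_search_delays_total`; the names are assumptions for illustration, not the actual VictoriaMetrics internals:

```go
package main

import (
	"sync/atomic"
	"time"
)

// searchDelaysTotal counts postponed query starts, in the spirit of the
// vm_search_delays_total metric.
var searchDelaysTotal atomic.Uint64

// ingestionNeedsCPU is a hypothetical signal set by the ingestion path when
// it is starved for CPU (for example, when its pending-rows buffer grows).
var ingestionNeedsCPU atomic.Bool

// pauseSearchIfNeeded delays the start of query work while ingestion reports
// CPU starvation, for at most maxWait.
func pauseSearchIfNeeded(maxWait time.Duration) {
	if !ingestionNeedsCPU.Load() {
		return // ingestion is healthy, start the query immediately
	}
	searchDelaysTotal.Add(1)
	deadline := time.Now().Add(maxWait)
	for ingestionNeedsCPU.Load() && time.Now().Before(deadline) {
		time.Sleep(10 * time.Millisecond)
	}
}

func main() {
	pauseSearchIfNeeded(time.Second)
	// ... run the query ...
}
```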

valyala commented Jul 8, 2020

FYI, VictoriaMetrics implements a mechanism for prioritizing data ingestion over querying starting from v1.38.0.

valyala added a commit that referenced this issue Jul 23, 2020
Also prioritize small merges over big merges.

Updates #291
Updates #648

valyala commented Jul 24, 2020

FYI, the v1.39.0 release should further improve the prioritization of data ingestion over heavy queries.

valyala added a commit that referenced this issue Jan 16, 2023
…s during assisted merges

Updates #3647
Updates #3641
Updates #648
Updates #291
valyala added a commit that referenced this issue Jan 26, 2024
…lable

- Maintain a separate worker pool for each part type (in-memory, file, big and small).
  Previously a shared pool was used for merging all the part types.
  A single merge worker could merge parts of mixed types at once. For example,
  it could simultaneously merge an in-memory part and a big file part.
  Such a merge could take hours for a big file part. For the duration of this merge
  the in-memory part was pinned in memory and couldn't be persisted to disk
  within the configured -inmemoryDataFlushInterval.

  Another common issue, which could happen when parts of mixed types are merged,
  is uncontrolled growth of in-memory parts or small parts when all the merge workers
  are busy with merging big files. Such growth could lead to significant performance
  degradation for queries, since every query needs to check an ever-growing list of parts.
  This could also slow down the registration of new time series, since VictoriaMetrics
  searches for the internal series_id in the indexdb for every new time series.

  The third issue is graceful shutdown duration, which could be very long when a background
  merge is running on in-memory parts plus big file parts. This merge couldn't be interrupted,
  since it merges in-memory parts.

  A separate pool of merge workers per part type elegantly resolves all three issues
  (a rough sketch of this pooling appears after this commit message):
  - In-memory parts are merged to file-based parts in a timely manner, since the maximum
    size of in-memory parts is limited.
  - Long-running merges for big parts do not block merges for in-memory parts and small parts.
  - Graceful shutdown duration is now limited by the time needed for flushing in-memory parts to files.
    Merging of file parts is now instantly canceled on graceful shutdown.

- Deprecate -smallMergeConcurrency command-line flag, since the new background merge algorithm
  should automatically self-tune according to the number of available CPU cores.

- Deprecate the -finalMergeDelay command-line flag, since it wasn't working correctly.
  It is better to run a forced merge when needed - https://docs.victoriametrics.com/#forced-merge

- Tune the number of shards for pending rows and items before the data goes to in-memory parts
  and becomes visible for search. This improves the maximum data ingestion rate and the maximum rate
  for registration of new time series. This should reduce the duration of data ingestion slowdowns
  in a VictoriaMetrics cluster during e.g. re-routing events, when some of the vmstorage nodes
  become temporarily unavailable.

- Prevent a possible "sync: WaitGroup misuse" panic on graceful shutdown.

This is a follow-up for fa566c6.
Thanks to @misutoth for the inspiration at #5212

Updates #5190
Updates #3790
Updates #3551
Updates #3337
Updates #3425
Updates #3647
Updates #3641
Updates #648
Updates #291
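
A minimal sketch of the per-part-type merge pools described in the commit message above, assuming hypothetical `mergeSet` and `startMergeWorkers` names rather than the actual VictoriaMetrics internals:

```go
package main

import (
	"fmt"
	"sync"
)

// partType distinguishes the kinds of parts that are merged independently.
type partType int

const (
	partInMemory partType = iota
	partSmallFile
	partBigFile
)

// mergeSet is a hypothetical queue of pending merges for a single part type.
type mergeSet struct {
	typ   partType
	tasks chan func()
}

// startMergeWorkers launches a dedicated pool of workers for one part type,
// so a long-running big-file merge can never block in-memory or small merges.
func startMergeWorkers(ms *mergeSet, workers int, wg *sync.WaitGroup) {
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for task := range ms.tasks {
				task()
			}
		}()
	}
}

func main() {
	var wg sync.WaitGroup
	pools := map[partType]*mergeSet{
		partInMemory:  {typ: partInMemory, tasks: make(chan func(), 16)},
		partSmallFile: {typ: partSmallFile, tasks: make(chan func(), 16)},
		partBigFile:   {typ: partBigFile, tasks: make(chan func(), 16)},
	}
	for _, ms := range pools {
		startMergeWorkers(ms, 2, &wg) // per-type pool sized from available CPU cores
	}
	pools[partInMemory].tasks <- func() { fmt.Println("merge in-memory parts") }
	pools[partBigFile].tasks <- func() { fmt.Println("merge big file parts") }
	for _, ms := range pools {
		close(ms.tasks)
	}
	wg.Wait()
}
```

Because every part type gets its own task queue and workers in this sketch, a slow big-file merge can only exhaust its own pool.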

valyala commented Feb 7, 2024

FYI, VictoriaMetrics supports the -search.maxWorkersPerQuery command-line flag starting from the v1.95.0 release - see this pull request for details.

This allows limiting the number of CPU cores that can be used by a single query. VictoriaMetrics also provides the ability to configure the maximum number of concurrent queries via the -search.maxConcurrentRequests command-line flag. See these docs for details. Together, these two command-line flags limit the number of CPU cores used for queries.
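
As a rough illustration of how the two flags compose (the flag values below are hypothetical, not defaults):

```go
package main

import "fmt"

func main() {
	// Hypothetical settings, not defaults:
	//   -search.maxConcurrentRequests=8
	//   -search.maxWorkersPerQuery=4
	maxConcurrentRequests := 8
	maxWorkersPerQuery := 4

	// Upper bound on CPU cores that query processing can keep busy at once.
	fmt.Println(maxConcurrentRequests * maxWorkersPerQuery) // 32
}
```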

Closing the feature request as done.

@valyala valyala closed this as completed Feb 7, 2024