
Mutex contention in vmselect #5087

Closed

misutoth opened this issue Sep 28, 2023 · 11 comments
@misutoth

Describe the bug

We are using the VictoriaMetrics cluster version with vmselect v1.93.3-cluster. Some queries execute fast, but some take orders of magnitude longer. See Varying executions.
We use HPA, but apparently the system does not want to scale up; it seems something prevents parallelization. There are "only" 6 vmselect pods, so there is no lack of CPU resources. And indeed, when I execute the following on vmselect:

$ go tool pprof -web http://localhost:53808/debug/pprof/goroutine

We can see that the goroutines are waiting on the mutex that guards the InternString cache. Please see
Mutex contention.pdf
Goroutines waiting for mutex

When I disable caching with internStringDisableCache=true, the contention disappears, the vmselect instance count increases to 9, and response time drops and stabilizes:
No contention.pdf
Response time drops

To Reproduce

We have a performance test environment to simulate the production load. We simulate about 430 million unique time series, stored with replication factor of 2. The ingestion speed is 6.5 million samples per second. We have about 700 recording rules generating queries quite frequently.

Version

/ # /vmselect-prod --version
vmselect-20230902-002725-tags-v1.93.3-cluster-0-gf78d8b994d

/ # /vmstorage-prod --version
vmstorage-20230902-002932-tags-v1.93.3-cluster-0-gf78d8b994d

/ # /vminsert-prod --version
vminsert-20230902-002549-tags-v1.93.3-cluster-0-gf78d8b994d

Logs

may not be needed

Screenshots

(Eight screenshots attached, along with Mutex contention.pdf and No contention.pdf; captions: "Goroutines waiting for mutex", "Varying executions".)

Used command-line flags

vmselect:
dedup.minScrapeInterval=1ms
envflag.enable=true
envflag.prefix=VM_
http.connTimeout=15s
http.maxGracefulShutdownDuration=2m
loggerFormat=json
memory.allowedBytes=40GiB
replicationFactor=2
search.maxConcurrentRequests=150
search.maxLookback=6m
search.maxQueryDuration=300s
search.maxSamplesPerQuery=5000000000
search.maxSeries=500000
search.maxStalenessInterval=6m
search.maxUniqueTimeseries=100000000
search.minStalenessInterval=5m
vmalert.proxyURL=http://alert-rules:8880
vmstorageDialTimeout=1s

vmstorage:
retentionPeriod=1
storageDataPath=/storage
envflag.enable=true
envflag.prefix=VM_
http.connTimeout=15s
loggerFormat=json
memory.allowedBytes=220GiB
search.maxUniqueTimeseries=100000000
smallMergeConcurrency=4

vminsert:
envflag.enable=true
envflag.prefix=VM_
http.connTimeout=15s
insert.maxQueueDuration=30s
loggerFormat=json
maxConcurrentInserts=50
maxInsertRequestSize=1GiB
maxLabelsPerTimeseries=70
memory.allowedBytes=20GiB
opentsdbhttp.maxInsertRequestSize=512MiB
replicationFactor=2
vmstorageDialTimeout=2s

Additional information

While writing this report and looking at the source code, I realized that I can disable this kind of caching using the internStringDisableCache flag. Originally I wanted to report this as a regression and compare it with v1.81.2.

Disabling this cache is a good workaround, but the cache was presumably meant to tolerate high concurrency, so this bug report may still be useful.

@misutoth misutoth added the bug Something isn't working label Sep 28, 2023
@zekker6 zekker6 self-assigned this Sep 29, 2023
@zekker6
Contributor

zekker6 commented Sep 29, 2023

Hello @misutoth , thank you for the detailed report!

Currently, VictoriaMetrics uses Go's sync.Map, which is optimized for this use case (1):

The Map type is optimized for two common use cases: (1) when the entry for a given key is only ever written once but read many times, as in caches that only grow, or (2) when multiple goroutines read, write, and overwrite entries for disjoint sets of keys.

So there is not much we can optimize on our side without removing sync.Map.
Is it possible that during alert / recording rules calculation a lot of unique series are fetched?
If so, a lot of unique strings will be fetched at every evaluation, which makes it almost harmful to intern and cache all these values, as they will not be reused later on.

@valyala Could you advise if it would make sense to try replacing sync.Map with fastcache here?

@hagen1778 hagen1778 added the TBD To Be Done label Oct 2, 2023
@f41gh7 f41gh7 assigned f41gh7 and unassigned zekker6 Oct 2, 2023
@misutoth
Author

misutoth commented Oct 2, 2023

Hello all. Thanks for the quick response.

Is it possible that during alert / recording rules calculation a lot of unique series is fetched?

Yes, a lot of time series are used. However, there are not that many new time series on each execution of the queries.

I am thinking that the internStringCacheExpireDuration default of 6 minutes may be too short in our use case. We have around 8-12 instances of vmselect which are load balanced. Also, many of the rules have a 3 minute interval. All of this increases the chance that the interned strings get evicted. That could lead to such inefficiency, couldn't it?

@hagen1778
Collaborator

@misutoth if I'm not mistaken, it is a bug that interning is still used in vmselect. It shouldn't be there.
AFAIK, @f41gh7 is going to remove it as soon as he has time.

f41gh7 added a commit that referenced this issue Oct 3, 2023
Previously, lock contention could happen on machines with a large number of CPUs due to enabled string interning: sync.Map was a choke point for all aggregation requests.
Now, instead of interning, a new string is created. This may increase CPU and memory usage in some cases.
#5087
@misutoth
Author

misutoth commented Oct 4, 2023

@misutoth if I'm not mistaken, it is a bug that interning is still used in vmselect. It shouldn't be there. AFAIK, @f41gh7 is going to remove it as soon as he has time.

I haven't reached that conclusion yet, but I believe you. 😄 I was just thinking that with a more generous eviction time the cache might still be useful, even in an environment with many CPU cores and multiple vmselects, because in most deployments churn is supposed to be low most of the time. So the interned strings may be read multiple times throughout their lifecycle. If you give me a reasonable eviction time, I can run the performance test in our environment with that.

f41gh7 added a commit that referenced this issue Oct 4, 2023
hagen1778 pushed a commit that referenced this issue Oct 9, 2023
hagen1778 pushed a commit that referenced this issue Oct 10, 2023
hagen1778 pushed a commit that referenced this issue Oct 10, 2023
valyala added a commit that referenced this issue Oct 15, 2023
… string when storing a value by map key

The assigned map key shouldn't change over time, otherwise the map won't work properly.

This is a follow-up for 1f91f22
Updates #5087
valyala pushed a commit that referenced this issue Oct 16, 2023
valyala pushed a commit that referenced this issue Oct 16, 2023
valyala added a commit that referenced this issue Oct 16, 2023
valyala added a commit that referenced this issue Oct 16, 2023
valyala added a commit that referenced this issue Oct 16, 2023
@valyala
Collaborator

valyala commented Oct 16, 2023

@misutoth , could you build vmselect from the commit 4278b00 according to these instructions and verify whether it resolves the mutex contention during querying under your workload?

@valyala valyala removed the TBD To Be Done label Oct 16, 2023
@valyala
Collaborator

valyala commented Oct 16, 2023

FYI, the fix, which reduces mutex contention at vmselect on machines with a large number of CPU cores, has been included in VictoriaMetrics v1.93.6. @misutoth, could you verify whether this release fixes the high mutex contention at vmselect under your workload?

P.S. The fix will be also included in the upcoming v1.95.0 release.

@misutoth
Author

All looks good with v1.93.6. Thank you all! ❤️

@valyala
Collaborator

valyala commented Oct 18, 2023

@misutoth, thanks for the confirmation that the fix resolves scalability issues for vmselect running on systems with a large number of CPU cores. FYI, there is yet another pull request, which can further improve vmselect scalability on such systems - #5036 .

@valyala
Collaborator

valyala commented Oct 18, 2023

Closing the issue as fixed

@valyala valyala closed this as completed Oct 18, 2023
valyala added a commit that referenced this issue Oct 18, 2023
- The `-search.maxWorkersPerQuery` command-line flag doesn't limit resource usage,
  so move it from the `resource usage limits` to `troubleshooting` chapter at docs/Single-server-VictoriaMetrics.md

- Make more clear the description for the `-search.maxWorkersPerQuery` command-line flag

- Add the description of `-search.maxWorkersPerQuery` to docs/Cluster-VictoriaMetrics.md

- Limit the maximum value, which can be passed to `-search.maxWorkersPerQuery`, to GOMAXPROCS,
  because bigger values may worsen query performance and increase CPU usage

- Improve the description of the change at docs/CHANGELOG.md. Mark it as FEATURE instead of BUGFIX,
  since it is closer to a feature than to a bugfix.

Updates #5087
valyala added a commit that referenced this issue Oct 18, 2023
* app/vmselect: limit the number of parallel workers to 32

The change should improve performance and memory usage during query processing
on machines with big number of CPU cores. The number of parallel workers for
query processing is controlled via `-search.maxWorkersPerQuery` command-line flag.
By default, the number of workers is limited by the number of available CPU cores,
but not more than 32. The limit can be increased via `-search.maxWorkersPerQuery`.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* wip

- The `-search.maxWorkersPerQuery` command-line flag doesn't limit resource usage,
  so move it from the `resource usage limits` to `troubleshooting` chapter at docs/Single-server-VictoriaMetrics.md

- Make more clear the description for the `-search.maxWorkersPerQuery` command-line flag

- Add the description of `-search.maxWorkersPerQuery` to docs/Cluster-VictoriaMetrics.md

- Limit the maximum value, which can be passed to `-search.maxWorkersPerQuery`, to GOMAXPROCS,
  because bigger values may worsen query performance and increase CPU usage

- Improve the description of the change at docs/CHANGELOG.md. Mark it as FEATURE instead of BUGFIX,
  since it is closer to a feature than to a bugfix.

Updates #5087

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
valyala added a commit that referenced this issue Oct 26, 2023
valyala added a commit that referenced this issue Oct 26, 2023
valyala added a commit that referenced this issue Oct 26, 2023
@valyala
Collaborator

valyala commented Oct 27, 2023

FYI, the next release of VictoriaMetrics will support -search.maxWorkersPerQuery command-line flag, which can be used for fine-tuning query performance on systems with more than 16 CPU cores for particular workloads. See the description of this flag in the Troubleshooting docs for more details.

AndrewChubatiuk pushed a commit to AndrewChubatiuk/VictoriaMetrics that referenced this issue Nov 15, 2023
AndrewChubatiuk pushed a commit to AndrewChubatiuk/VictoriaMetrics that referenced this issue Nov 15, 2023
AndrewChubatiuk pushed a commit to AndrewChubatiuk/VictoriaMetrics that referenced this issue Nov 15, 2023
@valyala
Collaborator

valyala commented Nov 15, 2023

The -search.maxWorkersPerQuery command-line flag is available starting from v1.95.0 release.
