Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lock Contention #2060

Closed
maxsmythe opened this issue May 24, 2022 · 4 comments · Fixed by #2272
Closed

Lock Contention #2060

maxsmythe opened this issue May 24, 2022 · 4 comments · Fixed by #2272
Assignees
Labels
bug Something isn't working

Comments

@maxsmythe
Copy link
Contributor

What steps did you take and what happened:

When there is a config that syncs large numbers of objects with frequent updates, it can cause performance issues due to lock contention with the validation webhook.

We should find the lock contention (perhaps sync acquires a write lock and validation a read lock) and resolve it.

Here is where the write lock appears to be acquired:

https://github.com/open-policy-agent/frameworks/blob/9db6a6b27c9741bcf9c5f4d8c6e3ef230b49f41d/constraint/pkg/client/drivers/local/storages.go#L116-L123

Without changing how the storage backend works, perhaps the best solution is to throttle the rate at which write locks are acquired.

What did you expect to happen:

Anything else you would like to add:
[Miscellaneous information that will assist in solving the issue.]

Environment:

  • Gatekeeper version:
  • Kubernetes version: (use kubectl version):
@willbeason
Copy link
Member

Added a feature request to OPA. The three line change I benchmarked reduced time to add data to OPA storage by >99%.

@willbeason
Copy link
Member

open-policy-agent/opa#4709 is out for review.

We'll probably want to deepcopy the object before passing it to OPA storage, just to be safe, so for Gatekeeper the CPU time reduction will be 90%, but the time the transaction holds on the lock will be reduced 99%. This should make our max throughput (as far as the lock is concerned) about 100x higher if AddData is called in parallel.

@stale
Copy link

stale bot commented Aug 1, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Aug 1, 2022
@maxsmythe
Copy link
Contributor Author

This still needs to be integrated. It's a scaling bottleneck for syncing high-throughput kinds

@stale stale bot removed the stale label Aug 2, 2022
@maxsmythe maxsmythe assigned maxsmythe and unassigned willbeason Aug 2, 2022
maxsmythe added a commit to maxsmythe/frameworks that referenced this issue Sep 15, 2022
This disables the round-tripping of cached data between
serialized/deserialized JSON when caching in OPA. This
reduces the amount of time a lock is held on OPA's cache, which
helps mitigate the lock contention mentioned in
open-policy-agent/gatekeeper#2060

Signed-off-by: Max Smythe <smythe@google.com>
maxsmythe added a commit to maxsmythe/frameworks that referenced this issue Sep 15, 2022
This disables the round-tripping of cached data between
serialized/deserialized JSON when caching in OPA. This
reduces the amount of time a lock is held on OPA's cache, which
helps mitigate the lock contention mentioned in
open-policy-agent/gatekeeper#2060

Signed-off-by: Max Smythe <smythe@google.com>
maxsmythe added a commit to maxsmythe/frameworks that referenced this issue Sep 15, 2022
This disables the round-tripping of cached data between
serialized/deserialized JSON when caching in OPA. This
reduces the amount of time a lock is held on OPA's cache, which
helps mitigate the lock contention mentioned in
open-policy-agent/gatekeeper#2060

Signed-off-by: Max Smythe <smythe@google.com>
maxsmythe added a commit to maxsmythe/frameworks that referenced this issue Sep 16, 2022
This disables the round-tripping of cached data between
serialized/deserialized JSON when caching in OPA. This
reduces the amount of time a lock is held on OPA's cache, which
helps mitigate the lock contention mentioned in
open-policy-agent/gatekeeper#2060

Signed-off-by: Max Smythe <smythe@google.com>
maxsmythe added a commit to open-policy-agent/frameworks that referenced this issue Sep 16, 2022
This disables the round-tripping of cached data between
serialized/deserialized JSON when caching in OPA. This
reduces the amount of time a lock is held on OPA's cache, which
helps mitigate the lock contention mentioned in
open-policy-agent/gatekeeper#2060

Signed-off-by: Max Smythe <smythe@google.com>

Signed-off-by: Max Smythe <smythe@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants