Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

clustermesh-apiserver: rework services synchronization to improve performance #25260

Merged

Conversation

giorio94
Copy link
Member

@giorio94 giorio94 commented May 4, 2023

This PR reworks the logic used by the clustermesh-apiserver to synchronize the services and the associated backends to the kvstore. Specifically, it introduces the SyncStore abstraction, which models the operations required to synchronize key/value pairs into a kvstore, and represents a simplified "write-only" version of the SharedStore abstraction currently available. The implementation relies on a workqueue to allow coalescence multiple updates of the same key into one operation only, as well as transparently handling temporary failures through retries. Notably, periodic update of all known keys is not performed, to reduce the load on the kvstore (it was already not performed for CiliumEndpoints and CiliumIdentities, which are also synchronized by the clustermesh-apiserver).

Please, refer to the commit descriptions for additional details about the different changes.

clustermesh-apiserver: rework services synchronization to improve performance

@giorio94 giorio94 added kind/performance There is a performance impact of this. area/clustermesh Relates to multi-cluster routing functionality in Cilium. release-note/misc This PR makes changes that have no direct user impact. labels May 4, 2023
@giorio94 giorio94 force-pushed the mio/clustermesh-apiserver-services-sync branch 2 times, most recently from a4d5321 to fa231e7 Compare May 5, 2023 08:40
// by SyncStore implementations.
type SyncStoreBackend interface {
// Update creates or updates a key.
Update(ctx context.Context, key string, value []byte, lease bool) error
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another possibility here would be to leverage UpdateIfDifferent, which first performs a read operation and then the update if the key does not exist, the value is different, or the lease ID changed. Given that we already know through the local cache whether the entry is up-to-date or not (and we don't perform the update if not necessary), my opinion is that plain Update is more efficient, as we save an additional read operation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@giorio94
Copy link
Member Author

giorio94 commented May 5, 2023

/test

Job 'Cilium-PR-K8s-1.26-kernel-net-next' failed:

Click to show.

Test Name

K8sDatapathServicesTest Checks N/S loadbalancing With host policy Tests NodePort

Failure Output

FAIL: Can not connect to service "http://192.168.56.12:30228" from outside cluster (1/10)

Jenkins URL: https://jenkins.cilium.io/job/Cilium-PR-K8s-1.26-kernel-net-next/2091/

If it is a flake and a GitHub issue doesn't already exist to track it, comment /mlh new-flake Cilium-PR-K8s-1.26-kernel-net-next so I can create one.

Then please upload the Jenkins artifacts to that issue.

@giorio94 giorio94 marked this pull request as ready for review May 5, 2023 08:47
@giorio94 giorio94 requested review from a team as code owners May 5, 2023 08:47
@giorio94 giorio94 requested review from tklauser and marseel May 5, 2023 08:47
pkg/kvstore/store/syncstore.go Outdated Show resolved Hide resolved
pkg/kvstore/store/syncstore.go Outdated Show resolved Hide resolved
// by SyncStore implementations.
type SyncStoreBackend interface {
// Update creates or updates a key.
Update(ctx context.Context, key string, value []byte, lease bool) error
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

pkg/kvstore/store/syncstore.go Outdated Show resolved Hide resolved
pkg/kvstore/store/syncstore.go Show resolved Hide resolved
pkg/kvstore/store/syncstore.go Outdated Show resolved Hide resolved
@giorio94 giorio94 force-pushed the mio/clustermesh-apiserver-services-sync branch from fa231e7 to 81c923a Compare May 5, 2023 12:35
@giorio94 giorio94 requested a review from a team as a code owner May 5, 2023 12:35
Comment on lines 25 to 33
// UpsertKey upserts a key/value pair into the kvstore.
UpsertKey(ctx context.Context, key Key) error

// DeleteLocalKey removes a key from the kvstore.
DeleteKey(ctx context.Context, key NamedKey) error
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concerning the interface definition, I'm wondering whether to keep the context as parameter for the UpsertKey and DeleteKey methods. The current workqueue-based implementation ignores it, but it might make sense in a possible synchronous implementations.

@giorio94
Copy link
Member Author

giorio94 commented May 8, 2023

/ci-multicluster

@giorio94 giorio94 force-pushed the mio/clustermesh-apiserver-services-sync branch from 25697b8 to ebf9a4a Compare May 8, 2023 16:42
@giorio94 giorio94 requested review from a team as code owners May 8, 2023 16:42
@giorio94 giorio94 force-pushed the mio/clustermesh-apiserver-services-sync branch from ebf9a4a to 1270ba8 Compare May 9, 2023 08:14
pkg/kvstore/store/syncstore.go Outdated Show resolved Hide resolved
pkg/kvstore/store/syncstore.go Outdated Show resolved Hide resolved
@giorio94 giorio94 force-pushed the mio/clustermesh-apiserver-services-sync branch 2 times, most recently from ae17593 to ca6365f Compare May 9, 2023 10:38
@giorio94
Copy link
Member Author

giorio94 commented May 10, 2023

/test-1.26-net-next

Hit issues during VM provisioning: #24964

@giorio94 giorio94 requested a review from marseel May 10, 2023 07:53
@marseel
Copy link
Contributor

marseel commented May 10, 2023

The implementation looks great 👍
I would only try to improve tests to make it a little bit less coupled with implementation.

Thanks for this improvement!

Copy link
Member

@nebril nebril left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall, one nitpick left inline about metric value.

// processing this item.
defer func() {
wss.workqueue.Done(key)
wss.queued.Set(float64(wss.workqueue.Len()))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a possibility of processNextItem, UpsertKey and DeleteKey running concurrently? If yes, maybe we should use wss.queued.Inc and wss.queued.Dec instead of Set to make sure that gauge value is correct?

Copy link
Member Author

@giorio94 giorio94 May 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UpsertKey/DeleteKey might instead run concurrently with processNextItem. We cannot use wss.queued.Inc and wss.queued.Dec though, since the fact that a new element was added to the workqueue does not necessarily mean that the size of the workqueue grew by one (since the same element might be already present). Using Set instead, we set the gauge to the current workqueue length (retrieving the length is thread-safe: https://github.com/cilium/cilium/blob/main/vendor/k8s.io/client-go/util/workqueue/queue.go#L144-L148).

@giorio94 giorio94 force-pushed the mio/clustermesh-apiserver-services-sync branch from 3483c1e to b94daea Compare May 10, 2023 11:30
@giorio94
Copy link
Member Author

The implementation looks great +1 I would only try to improve tests to make it a little bit less coupled with implementation.

Thanks for the feedback! I've updated the tests to check for the correct behavior, rather than the setting of the corresponding flags.

I've also added a new test to assert the correct update of the additional metric. In that respect, I've made a small change to the implementation, to correctly report the value once a new item has been extracted from the workqueue.

@giorio94
Copy link
Member Author

/test

@giorio94 giorio94 requested a review from marseel May 10, 2023 11:36
Copy link
Contributor

@marseel marseel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, looks great.

@giorio94 giorio94 force-pushed the mio/clustermesh-apiserver-services-sync branch from b94daea to aa53007 Compare May 10, 2023 12:49
@giorio94
Copy link
Member Author

/test

@giorio94
Copy link
Member Author

giorio94 commented May 11, 2023

Last push (actually next one, since GitHub had issues in the meanwhile) renamed the queued variable name for better clarity and consistency with another that will be added in a subsequent commit.

@giorio94 giorio94 force-pushed the mio/clustermesh-apiserver-services-sync branch from 774efa1 to b0e6e3f Compare May 11, 2023 14:54
This commit introduces the SyncStore abstraction, which models the
operations required to synchronize key/value pairs into a kvstore,
and represents a simplified "write-only" version of the SharedStore
abstraction currently available.

The implementation which is provided in this commit leverages a
workqueue to asynchronously handle the upsert/delete operations
against the kvstore, hence allowing to coalescence multiple updates
of the same key into one operation only, as well as transparently
handling temporary failures through retries.

Additional relevant aspects about the current implementation include:
* Differently from the SharedStore implementation, it does not include a
  periodic update of all known keys, to reduce the load on the kvstore.
* Upsert and delete operations are short-circuited in case they would be
  no-ops according to locally known state (i.e., in case the value did
  not change, or it had already been deleted). This is again to foster
  efficiency, assuming that the kvstore content is not modified externally.
* For consistency with the current implementation, every key is
  associated with a lease. Since it needs to be updated in case of
  restart, we always update the key in case we haven't done that so far,
  even if it might be already present in the kvstore (i.e., we do not
  perform an initial range operation to detect all the keys already
  present that would not require to be updated).
* Stale keys are not explicitly deleted following a restart, but they
  will eventually be removed when the associated lease expires. This is
  consistent with the current SharedStore implementation.

Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
This commit introduces a new `cilium_kvstore_sync_queue_size` metric,
which represents the number of elements queued for synchronization in
the kvstore through the SyncStore abstraction. This metric complements
the ones already existing for the kvstore subsystem, and it is labeled
by `scope` (matching the same format adopted for the other kvstore
metrics) and `source_cluster` (identifying the source cluster of the
information to be synchronized).

Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
This commit updates the service synchronization logic to use the newly
introduced SyncStore implementation, to benefit from the underlying
workqueue to coalescence multiple updates concerning the same service
and automatically handle temporary failures concerning kvstore
operations.

Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
@giorio94 giorio94 force-pushed the mio/clustermesh-apiserver-services-sync branch from b0e6e3f to a814b9d Compare May 11, 2023 15:01
@giorio94
Copy link
Member Author

Last push fixed a log statement which didn't use the correct scoped log.

@giorio94
Copy link
Member Author

Travis CI hit known flake #25235. Restarting

Copy link
Member

@tklauser tklauser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat!

@giorio94
Copy link
Member Author

/test

Copy link
Contributor

@tommyp1ckles tommyp1ckles left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Metric changes LGTM

Copy link
Member

@qmonnet qmonnet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks.

Maybe change the release-note label to release-note/minor, given that there's a new metric?

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label May 16, 2023
@giorio94 giorio94 added release-note/minor This PR changes functionality that users may find relevant to operating Cilium. and removed release-note/misc This PR makes changes that have no direct user impact. labels May 16, 2023
@borkmann borkmann merged commit 087b502 into cilium:main May 16, 2023
57 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/clustermesh Relates to multi-cluster routing functionality in Cilium. kind/performance There is a performance impact of this. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/minor This PR changes functionality that users may find relevant to operating Cilium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants