clustermesh-apiserver: rework services synchronization to improve performance #25260
Conversation
a4d5321 to fa231e7 (Compare)
// by SyncStore implementations.
type SyncStoreBackend interface {
	// Update creates or updates a key.
	Update(ctx context.Context, key string, value []byte, lease bool) error
Another possibility here would be to leverage UpdateIfDifferent, which first performs a read operation and then the update only if the key does not exist, the value is different, or the lease ID changed. Given that we already know through the local cache whether the entry is up to date (and we skip the update when it is not necessary), my opinion is that plain Update is more efficient, as it saves the additional read operation.
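A minimal sketch of the trade-off discussed above (names are illustrative, not the actual Cilium API): when a local cache already mirrors what was last written, a plain Update costs one write, whereas UpdateIfDifferent costs a read plus a conditional write.

```go
// Hypothetical sketch: a local cache makes the read phase of
// UpdateIfDifferent redundant, so plain Update suffices.
package main

import (
	"bytes"
	"fmt"
)

// localCache tracks the last value written for each key.
type localCache map[string][]byte

// shouldWrite short-circuits the kvstore update when the cached value
// already matches, so the subsequent Update never needs a prior read.
func (c localCache) shouldWrite(key string, value []byte) bool {
	cached, ok := c[key]
	return !ok || !bytes.Equal(cached, value)
}

func main() {
	c := localCache{"cilium/state/services/v1/foo": []byte("v1")}
	fmt.Println(c.shouldWrite("cilium/state/services/v1/foo", []byte("v1"))) // false: skip the write
	fmt.Println(c.shouldWrite("cilium/state/services/v1/foo", []byte("v2"))) // true: value changed
}
```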
👍
/test

Job 'Cilium-PR-K8s-1.26-kernel-net-next' failed.

Jenkins URL: https://jenkins.cilium.io/job/Cilium-PR-K8s-1.26-kernel-net-next/2091/

If it is a flake and a GitHub issue doesn't already exist to track it, comment the flake-tracking command, then please upload the Jenkins artifacts to that issue.
// by SyncStore implementations.
type SyncStoreBackend interface {
	// Update creates or updates a key.
	Update(ctx context.Context, key string, value []byte, lease bool) error
👍
fa231e7 to 81c923a (Compare)
pkg/kvstore/store/syncstore.go (Outdated)

// UpsertKey upserts a key/value pair into the kvstore.
UpsertKey(ctx context.Context, key Key) error

// DeleteKey removes a key from the kvstore.
DeleteKey(ctx context.Context, key NamedKey) error
Concerning the interface definition, I'm wondering whether to keep the context as a parameter for the UpsertKey and DeleteKey methods. The current workqueue-based implementation ignores it, but it might make sense in a possible synchronous implementation.
81c923a to 25697b8 (Compare)
/ci-multicluster
25697b8 to ebf9a4a (Compare)
ebf9a4a to 1270ba8 (Compare)
ae17593 to ca6365f (Compare)
/test-1.26-net-next

Hit issues during VM provisioning: #24964

The implementation looks great 👍 Thanks for this improvement!
LGTM overall, one nitpick left inline about metric value.
pkg/kvstore/store/syncstore.go (Outdated)

// processing this item.
defer func() {
	wss.workqueue.Done(key)
	wss.queued.Set(float64(wss.workqueue.Len()))
Is there a possibility of processNextItem, UpsertKey and DeleteKey running concurrently? If yes, maybe we should use wss.queued.Inc and wss.queued.Dec instead of Set to make sure that the gauge value is correct?
UpsertKey/DeleteKey might indeed run concurrently with processNextItem. We cannot use wss.queued.Inc and wss.queued.Dec though, since the fact that a new element was added to the workqueue does not necessarily mean that the size of the workqueue grew by one (the same element might already be present). Using Set instead, we set the gauge to the current workqueue length (retrieving the length is thread-safe: https://github.com/cilium/cilium/blob/main/vendor/k8s.io/client-go/util/workqueue/queue.go#L144-L148).
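A minimal sketch of why a deduplicating queue breaks Inc/Dec gauge accounting (this is a toy structure, not the client-go workqueue): adding an element that is already pending does not grow the queue, so a per-Add increment drifts away from the real length, while Set after each operation stays correct.

```go
// Toy deduplicating queue showing the gauge-drift problem discussed above.
package main

import "fmt"

type dedupQueue struct {
	present map[string]struct{}
	items   []string
}

func newDedupQueue() *dedupQueue {
	return &dedupQueue{present: map[string]struct{}{}}
}

// Add enqueues item unless it is already pending (coalescing).
func (q *dedupQueue) Add(item string) {
	if _, ok := q.present[item]; ok {
		return // coalesced: length unchanged, a blind Inc would drift
	}
	q.present[item] = struct{}{}
	q.items = append(q.items, item)
}

func (q *dedupQueue) Len() int { return len(q.items) }

func main() {
	q := newDedupQueue()
	naive := 0 // what an Inc-per-Add gauge would report
	for _, k := range []string{"svc/a", "svc/a", "svc/b"} {
		q.Add(k)
		naive++
	}
	fmt.Println(naive, q.Len()) // the two counts diverge
}
```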
3483c1e to b94daea (Compare)
Thanks for the feedback! I've updated the tests to check for the correct behavior, rather than the setting of the corresponding flags. I've also added a new test to assert the correct update of the additional metric. In that respect, I've made a small change to the implementation, to correctly report the value once a new item has been extracted from the workqueue.

/test
Thanks, looks great.
b94daea to aa53007 (Compare)
/test

Last push (actually the next one, since GitHub had issues in the meantime) renamed the
774efa1 to b0e6e3f (Compare)
This commit introduces the SyncStore abstraction, which models the operations required to synchronize key/value pairs into a kvstore, and represents a simplified "write-only" version of the SharedStore abstraction currently available. The implementation provided in this commit leverages a workqueue to asynchronously handle the upsert/delete operations against the kvstore, allowing it to coalesce multiple updates of the same key into a single operation, as well as transparently handling temporary failures through retries. Additional relevant aspects of the current implementation include:

* Unlike the SharedStore implementation, it does not perform a periodic update of all known keys, to reduce the load on the kvstore.
* Upsert and delete operations are short-circuited in case they would be no-ops according to the locally known state (i.e., the value did not change, or the key had already been deleted). This again fosters efficiency, assuming that the kvstore content is not modified externally.
* For consistency with the current implementation, every key is associated with a lease. Since the lease needs to be updated in case of restart, we always update the key if we haven't done so yet, even if it might already be present in the kvstore (i.e., we do not perform an initial range operation to detect the keys already present that would not require an update).
* Stale keys are not explicitly deleted following a restart, but they will eventually be removed when the associated lease expires. This is consistent with the current SharedStore implementation.

Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
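A rough sketch of the coalescing pattern the commit message describes (an assumed structure, not the actual Cilium code): the queue stores only key names and the latest value is looked up when the worker processes the key, so multiple updates of the same key collapse into one kvstore write.

```go
// Toy model of a write-only sync store with per-key coalescing.
package main

import "fmt"

type syncStore struct {
	pending map[string]string   // latest value per key
	queue   []string            // deduplicated key names
	queued  map[string]struct{} // keys currently in the queue
}

func newSyncStore() *syncStore {
	return &syncStore{pending: map[string]string{}, queued: map[string]struct{}{}}
}

// UpsertKey records the newest value and enqueues the key name once.
func (s *syncStore) UpsertKey(key, value string) {
	s.pending[key] = value // remember only the most recent value
	if _, ok := s.queued[key]; !ok {
		s.queued[key] = struct{}{}
		s.queue = append(s.queue, key)
	}
}

// drain simulates the worker: one kvstore write per distinct key.
func (s *syncStore) drain() (writes int) {
	for _, key := range s.queue {
		_ = s.pending[key] // value that would be written to the kvstore
		writes++
	}
	s.queue = nil
	s.queued = map[string]struct{}{}
	return writes
}

func main() {
	s := newSyncStore()
	s.UpsertKey("svc/foo", "v1")
	s.UpsertKey("svc/foo", "v2") // coalesced with the previous update
	s.UpsertKey("svc/bar", "v1")
	fmt.Println(s.drain()) // 2 writes for 3 upserts
}
```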
This commit introduces a new `cilium_kvstore_sync_queue_size` metric, which represents the number of elements queued for synchronization in the kvstore through the SyncStore abstraction. This metric complements the ones already existing for the kvstore subsystem, and it is labeled by `scope` (matching the same format adopted for the other kvstore metrics) and `source_cluster` (identifying the source cluster of the information to be synchronized). Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
This commit updates the service synchronization logic to use the newly introduced SyncStore implementation, to benefit from the underlying workqueue to coalesce multiple updates concerning the same service and to automatically handle temporary failures in kvstore operations. Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
b0e6e3f to a814b9d (Compare)
Last push fixed a log statement which didn't use the correct scoped log.

Travis CI hit known flake #25235. Restarting.
Neat!
/test
Metric changes LGTM
Looks good, thanks.
Maybe change the release-note label to release-note/minor, given that there's a new metric?
This PR reworks the logic used by the clustermesh-apiserver to synchronize the services and the associated backends to the kvstore. Specifically, it introduces the SyncStore abstraction, which models the operations required to synchronize key/value pairs into a kvstore, and represents a simplified "write-only" version of the SharedStore abstraction currently available. The implementation relies on a workqueue to coalesce multiple updates of the same key into a single operation, as well as to transparently handle temporary failures through retries. Notably, the periodic update of all known keys is not performed, to reduce the load on the kvstore (it was already not performed for CiliumEndpoints and CiliumIdentities, which are also synchronized by the clustermesh-apiserver).
Please, refer to the commit descriptions for additional details about the different changes.