clustermesh-apiserver: expose information about completion of initial synchronization through etcd #25388

giorio94 · 2023-05-11T15:29:20Z

This PR extends the SyncStore abstraction to add the support for synced canaries, which are used to convey the information that the initial synchronization from an external source (e.g., k8s or another kvstore) of the keys hosted at a different prefix has completed. This information is required when using an ephemeral etcd instance (e.g., the clustermesh-apiserver sidecar container), because from a client perspective the completion of the initial range operation is not sufficient in this case to guarantee that an up-to-date snapshot of the state has been retrieved. Specifically, this information will be leveraged in a subsequent PR to address #24740.

At the moment, "synced" canaries are enabled only for service entries. The introduction for the other resources depends on #25049, and I temporarily removed that part to allow reviewing both in parallel.

Please, refer to the individual commits for more details.

I've tagged this PR as release-note/minor since it introduces a new metric.

Related: #24740

clustermesh-apiserver: expose information about completion of initial synchronization through etcd

giorio94 · 2023-05-16T16:29:46Z

/test

pkg/clustermesh/types/types.go

tommyp1ckles

Changes make sense from a metrics perspective.

christarazi

I'm not sure I fully understand the role of the metric. The description says "Whether the initial synchronization from/to the kvstore has completed", but that to me sounds like it represents a boolean state. The metric is a gauge, so what is it actually representing?

giorio94 · 2023-05-17T07:35:42Z

I'm not sure I fully understand the role of the metric. The description says "Whether the initial synchronization from/to the kvstore has completed", but that to me sounds like it represents a boolean state. The metric is a gauge, so what is it actually representing?

The metric is currently of gauge type mainly considering the parallel usage when reading data from the kvstore (to be introduced in a subsequent commit not part of this PR). In that case, and specifically in the clustermesh context, we might watch the same (cluster, kvstore prefix) pair different times following a connection restart. The current implementation resets the metric to false any time we start listing the given prefix, and turns it to true once listing has completed. Another alternative might be to configure it to true only the first time, but that seemed less informative to me.

I'm more doubtful about the description, because I agree that initial might sound misleading. It is there because I would say that in general synchronization from/to the kvstore is an ongoing process which cannot complete. Only the initial synchronization (I couldn't come up with a better term, but I was referring to the list part of a list-and-watch process) can.

giorio94 · 2023-05-23T10:07:24Z

/test

YutaroHayakawa

LGTM 👍

giorio94 · 2023-05-24T06:52:29Z

/test-runtime

The link leads to a 404, I cannot tell why this failed.

giorio94 · 2023-05-24T06:52:44Z

/test-1.26-net-next

The link leads to a 404, I cannot tell why this failed.

giorio94 · 2023-05-24T06:55:51Z

A few Cilium Conformance E2E jobs failed during provisioning: #25483. Manually rerunning.

giorio94 · 2023-05-24T08:42:49Z

/test-runtime

Hit #25637

giorio94 · 2023-05-24T11:41:38Z

/test-1.26-net-next

Failed with known flakes: #25605, #25524

pkg/kvstore/kvstore.go

Introduce the new `cilium/synced` kvstore prefix, which will be used to convey the information that the initial synchronization from an external source (e.g., k8s or another kvstore) of the keys hosted at a different prefix has completed. For instance, `cilium/synced/foo/cilium/state/node/v1` means that synchronization from cluster `foo` of the keys hosted under `cilium/state/node/v1` has finished. This information is required when using an ephemeral etcd instance (e.g., the clustermesh-apiserver sidecar container), because from a client perspective the completion of the initial range operation is not sufficient in this case to guarantee that an up-to-date snapshot of the state has been retrieved. We use a separate prefix (rather than using a special key under the original path) to prevent compatibility issues, as it would not be interpreted correctly by older cilium agent versions. Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>

Extend the CiliumClusterConfig struct to allow expressing cluster-specific capabilities. Currently, the only supported capability concerns the support of per-prefix "synced" canaries. When enabled, the clustermesh-apiserver writes keys in the form `cilium/synced/$name/$prefix` into etcd to state that the initial synchronization of the keys under the `$prefix` prefix for cluster `$name` has completed successfully. Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>

giorio94 · 2023-05-24T16:12:26Z

/test

chancez

Metric looks good overall, had a question about whether or not we need the action label.

pkg/kvstore/store/syncstore.go

christarazi · 2023-05-24T16:51:48Z

This metric currently conveys two pieces of information, depending on the action type:

* `write`: the initial synchronization from an external source (which might be k8s or a kvstore) to the kvstore has completed (this is the part which is implemented in this PR). In other words, the initial list has completed, and all the returned elements have been successfully written into the kvstore. This is why I used _sync_ rather than _list_ for the name.

* `read`: this is currently implemented in [Fix missed deletion events when reconnecting to/disconnecting from remote clusters (nodes and services) #25499](https://github.com/cilium/cilium/pull/25499), and it is set by the agent when the initial list operation for a given prefix has completed, and all the entries received up to that moment have been processed.

Additionally, will this metric have any utility after the "initial list" completes? What is the role of the metric after the user has already "observed" the metric reach this state?

Depending on the action type:

* `write`: it can be used to infer the time required to complete the synchronization when the clustermesh-apiserver is restarted, which can be potentially large with the default rate-limiting settings if thousands of entries need to be sync'ed. This operation is always performed since the etcd database used by the clustermesh-apiserver is ephemeral, and it needs to be reinitialized from scratch after any restart.

* `read`: it conveys that the agent has completed the initial synchronization from the kvstore for a given cluster, and it is now observing an up-to-date state. The metric would again transition to false if the connection needs to be re-established (mainly because the remote clustermesh-apiserver has been restarted) or the agent restarted. In other words, when the metric is false, the agent is operating with stale data about the given remote cluster.

Thanks for the explanation! Could you include the description of these in the commit msg that introduces the metrics?

christarazi

Pending my comment above.

The SyncStore interface is extended with an additional `Synced()` method, which triggers the asynchronous creation of the appropriate "synced" key in the kvstore once all the previously started upsertion operations completed successfully. This information can be used by readers to confirm that the initial synchronization from an external source to the kvstore has completed. Hence, they can read an up-to-date state. Additional callbacks can be registered to be executed when the initial synchronization completed, which might be used for instance to gate other operations or to implement readiness handlers. Finally, the new `kvstore_initial_sync_completed` metric is introduced to convey the synchronization status. It is labeled by `scope` (matching the same format adopted for the other kvstore metrics), `source_cluster` (identifying the source cluster of the information) and `action` (either write or read, the SyncStore always does write operations). More specifically, the new metric conveys two different pieces of information, depending on the `action` type: * `write`: the initial synchronization from an external source (which might be k8s or a kvstore) to the kvstore has completed. In other words, the initial list has completed, and all the returned elements have been successfully written into the kvstore. It can be used to infer the time required to complete the synchronization when the clustermesh-apiserver is restarted. * `read` (to be supported in an upcoming commit): set by the agent when the initial list operation for a given prefix has completed, and all the entries received up to that moment have been processed. It conveys that the agent has completed the initial synchronization from the kvstore for a given cluster, and it is now observing an up-to-date state. The metric would again transition to false if the connection needs to be re-established (mainly because the remote clustermesh-apiserver has been restarted) or the agent restarted. In other words, when the metric is false, the agent is operating with stale data about the given remote cluster. Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>

This commit includes the glue logic to write the "synced" canary when the initial synchronization from k8s completes. The corresponding capability is not yet turned on in the CiliumClusterConfig, because "synced" canaries need still to be enabled for the other resource types. Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>

giorio94 · 2023-05-24T17:05:48Z

Thanks for the explanation! Could you include the description of these in the commit msg that introduces the metrics?

Sure, I've extended the commit message with the information mentioned above.

giorio94 · 2023-05-24T17:05:54Z

/test

giorio94 · 2023-05-25T06:56:59Z

/ci-l4lb

Failure tracked in #25663

giorio94 · 2023-05-25T07:03:11Z

k8s-1.26-kernel-net-next failed again due to known flake #25605. Given the currently high failure rate (>50%) it wouldn't be much useful to rerun it again. Marking this PR as ready to merge, since all the other tests are green, and all reviews are in.

giorio94 force-pushed the mio/clustermesh-apiserver-sync-canaries branch 4 times, most recently from 9f07b96 to 80a3ba7 Compare May 16, 2023 16:22

giorio94 marked this pull request as ready for review May 16, 2023 16:29

giorio94 requested review from a team as code owners May 16, 2023 16:29

giorio94 requested review from tommyp1ckles, aanm, YutaroHayakawa, christarazi, qmonnet and marseel May 16, 2023 16:29

tommyp1ckles reviewed May 16, 2023

View reviewed changes

pkg/clustermesh/types/types.go Outdated Show resolved Hide resolved

tommyp1ckles approved these changes May 16, 2023

View reviewed changes

christarazi reviewed May 16, 2023

View reviewed changes

giorio94 force-pushed the mio/clustermesh-apiserver-sync-canaries branch from 80a3ba7 to f8e2543 Compare May 17, 2023 07:58

giorio94 requested a review from christarazi May 23, 2023 17:01

YutaroHayakawa approved these changes May 24, 2023

View reviewed changes

aanm approved these changes May 24, 2023

View reviewed changes

pkg/kvstore/kvstore.go Outdated Show resolved Hide resolved

giorio94 added 2 commits May 24, 2023 18:11

giorio94 force-pushed the mio/clustermesh-apiserver-sync-canaries branch from 60be128 to bdbfca5 Compare May 24, 2023 16:11

chancez reviewed May 24, 2023

View reviewed changes

pkg/kvstore/store/syncstore.go Show resolved Hide resolved

chancez approved these changes May 24, 2023

View reviewed changes

christarazi approved these changes May 24, 2023

View reviewed changes

giorio94 added 2 commits May 24, 2023 18:54

giorio94 force-pushed the mio/clustermesh-apiserver-sync-canaries branch from bdbfca5 to 37b4940 Compare May 24, 2023 17:04

giorio94 mentioned this pull request May 25, 2023

CI: NAT46x64 - Failed to connect to endpoint due to connection refused #25663

Closed

giorio94 added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label May 25, 2023

squeed merged commit 751ab54 into cilium:main May 25, 2023
57 of 58 checks passed

This was referenced Jun 5, 2023

clustermesh-apiserver: write synced information into etcd #25905

Merged

Fix missed deletion events when reconnecting to/disconnecting from remote clusters (identities) #25677

Merged

hemanthmalla mentioned this pull request May 3, 2024

[1.14] operator: propagate CiliumClusterConfig when in kvstore mode #32349

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

clustermesh-apiserver: expose information about completion of initial synchronization through etcd #25388

clustermesh-apiserver: expose information about completion of initial synchronization through etcd #25388

giorio94 commented May 11, 2023 •

edited

giorio94 commented May 16, 2023

tommyp1ckles left a comment

christarazi left a comment

giorio94 commented May 17, 2023 •

edited

giorio94 commented May 23, 2023

YutaroHayakawa left a comment

giorio94 commented May 24, 2023 •

edited

giorio94 commented May 24, 2023 •

edited

giorio94 commented May 24, 2023

giorio94 commented May 24, 2023 •

edited

giorio94 commented May 24, 2023 •

edited

giorio94 commented May 24, 2023

chancez left a comment

christarazi commented May 24, 2023

christarazi left a comment

giorio94 commented May 24, 2023

giorio94 commented May 24, 2023

giorio94 commented May 25, 2023 •

edited

giorio94 commented May 25, 2023

clustermesh-apiserver: expose information about completion of initial synchronization through etcd #25388

clustermesh-apiserver: expose information about completion of initial synchronization through etcd #25388

Conversation

giorio94 commented May 11, 2023 • edited

giorio94 commented May 16, 2023

tommyp1ckles left a comment

Choose a reason for hiding this comment

christarazi left a comment

Choose a reason for hiding this comment

giorio94 commented May 17, 2023 • edited

giorio94 commented May 23, 2023

YutaroHayakawa left a comment

Choose a reason for hiding this comment

giorio94 commented May 24, 2023 • edited

giorio94 commented May 24, 2023 • edited

giorio94 commented May 24, 2023

giorio94 commented May 24, 2023 • edited

giorio94 commented May 24, 2023 • edited

giorio94 commented May 24, 2023

chancez left a comment

Choose a reason for hiding this comment

christarazi commented May 24, 2023

christarazi left a comment

Choose a reason for hiding this comment

giorio94 commented May 24, 2023

giorio94 commented May 24, 2023

giorio94 commented May 25, 2023 • edited

giorio94 commented May 25, 2023

giorio94 commented May 11, 2023 •

edited

giorio94 commented May 17, 2023 •

edited

giorio94 commented May 24, 2023 •

edited

giorio94 commented May 24, 2023 •

edited

giorio94 commented May 24, 2023 •

edited

giorio94 commented May 24, 2023 •

edited

giorio94 commented May 25, 2023 •

edited