bf2fc6cc711aee1a0c2a · shawkins · Oct 17, 2022 · Feb 1, 2023 · Feb 1, 2023 · Feb 22, 2023
diff --git a/_ap/17/index.adoc b/_ap/17/index.adoc
@@ -0,0 +1,111 @@
+---
+num: 17
+category: "Fleet manager, fleet shard and instance operators"
+title: "Fleet shard status"
+status: "Accepted"
+authors:
+  - "Steven Hawkins"
+  - "Luca Burgazzoli"
+tags:
+  - "kafka"
+---
+
+## Intent
+
+Define a pattern for fleet shard status representation and interpretation.
+
+## Motivation
+
+A fleet shard must convey the status of the data plane resources it is responsible for back to the control plane.  This is typically done using mechanisms built around the Kubernetes status sub-resource.  There should be consistency around the design and usage of that status information that conforms to Kuberentes guidelines and so that control planes, SREs, and developers can quickly and appropriately reason over the status. 
+
+## Status Guidelines
+
+### Conditions
+
+Kubernetes provides a significant amount of guidance for https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#typical-status-properties[status conditions].
+
+Highlights:
+Condition types should be single word or short camel-cased names.  They should be unique - it’s more of a map of conditions by type, rather than just an array.
+The reason field should be a short word or short camel-cased phase.
+The message should be human readable.
+The status field should be a three-valued boolean.
+
+More specific to fleet shards:
+Since the fleet shard will typically send a full status object back to the control plane periodically or with every change, it is best to include only a minimally necessary set of conditions.  However that does not mean you can simply omit conditions.  If a condition is not present, it should be interpreted as having status Unknown.
+It is fine to have a control plane reason over status conditions - in particular the type, reason, and status fields.   However the parsing of a condition message should be avoided, and rather additional status fields should be used to provide specialized information.
+
+### Other Status Fields
+
+Beyond conditions, other status fields provide the ability to convey information via a custom schema.
+
+For example the ManagedKafka status once Ready will include route and other information, which are not part of a condition:
+
+[source,yaml]
+----
+status: 
+  conditions:
+    - lastTransitionTime: '2022-08-02T08:24:04.818994Z'
+      message: ''
+      status: 'True'
+      type: Ready
+  routes:
+    - name: admin-server
+      prefix: admin-server
+      router: ingresscontroller.kas.mk-0419-204008.t6kh.p1.openshiftapps.com
+    - name: bootstrap
+      prefix: ''
+      router: ingresscontroller.kas.mk-0419-204008.t6kh.p1.openshiftapps.com
+…  
+----
+
+### Interpretation
+
+The meaning of any condition or status field is up to the controller / resource and should be a documented part of the API contract.  
+
+For example the ManagedKafka status Ready condition refers to the operator’s view of the resource.  A ManagedKafka is only ready when all of the dependent resources are observed to be in their desired / ready state.  This will not always be the same notion of readiness from an end user’s observation of managed kafka service.  In particular, the restart of a broker pods, a temporary networking issue, etc. may not be reflected in ManagedKafka status.  It follows that a ManagedKafka status Ready condition alone is insufficient to show the user the state of their service.  ManagedKafka canary and other kafka metrics provide a more exact representation of cluster functioning.
+
+### Error States
+
+An operator in a fleetshard _usually_ lacks sufficient context to determine whether a valid resource is in a terminal state. The operator’s job is simply to implement the desired state whenever possible. For example, if at a given time additional resources are needed, other necessary system parts aren’t installed, etc. - it doesn’t mean the conditions won’t be correct at a later time as a result of some action outside the context of the operator.
+
+In cases where the operator does have sufficient context to determine terminality, for example violation of preconditions in installed versions of APIs, then it is acceptable for the operator and the status handling to assume the resource is in a terminal state.
+
+When status is conveying an error that is not known to be terminal, it follows that the error may resolve on its own with additional time.  Control planes should assume such errors are ephemeral and not immediately react as if a hard failure has occurred.  If there are further actions that the control plane may take, the condition reason or other status fields should make that easy to determine.  This allows the data plane logic to remain straightforward and not have to fully understand every possible error condition.
+
+For example with managed kafka and auto-scaling enabled it may take longer for a node to become available than Strimzi’s timeout allows.  That error is propagated onto the ManagedKafka status.  However there is no further action expected by the control plane - and  with additional time the auto-scaled node will be available and the placement will succeed.  For example from https://issues.redhat.com/browse/MGDSTRM-9288 the control plane observed:
+
+[source,yaml]
+----
+status: 
+  conditions:
+    - lastTransitionTime: '2022-08-02T08:24:04.818994Z'
+      message: 'Exceeded timeout of 420000ms while waiting for Pods resource ninth-zookeeper-0 in namespace kafka-cbfv5rnfnecdu9rb4gc0 to be ready'
+      status: 'False'
+      reason: 'Error'
+      type: Ready
+----
+
+Rather than interpreting this as a hard failure, it should be assumed that with additional time and reconciliations that it may resolve on its own.
+
+How much time to give an error to resolve is up to the control plane and relevant SLOs.  During installation this may include waiting until the SLO has been exceeded.  For other operations it may mean waiting at least 1 additional resolution time window if the operator or dependent operators uses a time based mechanism.
+
+### Current vs. Desired State
+
+Especially in instances where spec changes are rolled into dependent resources the desired state will not be fully realized until some point in the future.  It is appropriate for the status to provide additional information about this transition.  As per the Kubernetes guidelines that could be represented via a condition, or similar to the standard Deployment resource additional status fields such as ready or available replicas, allow for the control plane or a user to infer the progress of the transition.
+
+For example the upgrade of ManagedKafka kafka version requires a rolling update of brokers where each broker is taken off-line and replace with one that has an updated version.  This is a relatively long-running process.  The status ManagedKafka Ready condition during this time will have a reason of KafkaUpdating, rather than simply indicating Ready based upon the Strimzi Kafka resource being updated to the desired state.
+
+### generation and observedGeneration
+
+In kubernetes, each object should have a `metadata.generation` field which is defined as a sequence number - set by the system and monotonically increased per resource on a spec change. Some resources include a field named `status.observedGeneration`, which is the generation most recently observed by the component responsible for acting upon changes to the desired state of the resource. This can be used, for instance, to ensure that the reported status reflects the most recent desired state. The observedGeneration field is also part of metav1.Conditions and represents the spec generation that the condition was set based upon.
+
+Similar concepts with fields that mirror their kubernetes couterparts can be introduced to MAS resources to offer a standard way for clients and control planes to figure out when a specific desired state has been taken into account. 
+
+### Frequency Of Status Changes
+
+Rapid changes to a status should be avoided.  Each status change is an update that the local Kuberentes instance must process and is generally also relayed to the control plane, which can lead to a substantial amount of overhead for a large number of resources.  For controllers that implement fixed interval resolving this is generally not an issue.  Event driven controllers though should be designed to minimize unnecessary updates - in particular updates which do not imply a status change should leave the existing status unmodified.
+
+## Participants
+* Control Plane -- development team for the KAS Fleet Manager API.
+* Kafka Services -- team developing the kafka fleet shard operator.
+* MAS Connectors -- team developing the connector fleet shard operator.