diff --git a/modules/checking-mco-status.adoc b/modules/checking-mco-status.adoc index 153fb7e2d985..a0693741c0ba 100644 --- a/modules/checking-mco-status.adoc +++ b/modules/checking-mco-status.adoc @@ -6,43 +6,64 @@ [id="checking-mco-status_{context}"] = Checking machine config pool status -To see the status of the Machine Config Operator, its sub-components, and the resources it manages, use the following `oc` commands: +To see the status of the Machine Config Operator (MCO), its sub-components, and the resources it manages, use the following `oc` commands: .Procedure -. To see the number of MCO-managed nodes available on your cluster for each pool, type: +. To see the number of MCO-managed nodes available on your cluster for each machine config pool (MCP), run the following command: + [source,terminal] ---- $ oc get machineconfigpool -NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE -master rendered-master-dd… True False False 3 3 3 0 4h42m -worker rendered-worker-fde… True False False 3 3 3 0 4h42m ---- + -In the previous output, there are three master and three worker nodes. All machines are updated and none are currently updating. Because all nodes are `Updated` and `Ready` and none are `Degraded`, you can tell that there are no issues. +.Example output +[source,terminal] +---- +NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE +master rendered-master-06c9c4… True False False 3 3 3 0 4h42m +worker rendered-worker-f4b64… False True False 3 2 2 0 4h42m +---- ++ +-- +where: -. To see each existing `machineconfig`, type: +UPDATED:: The `True` status indicates that the MCO has applied the current machine config to the nodes in that MCP. The current machine config is specified in the `STATUS` field in the `oc get mcp` output. The `False` status indicates a node in the MCP is updating. +UPDATING:: The `True` status indicates that the MCO is applying the desired machine config, as specified in the `MachineConfigPool` custom resource, to at least one of the nodes in that MCP. The desired machine config is the new, edited machine config. Nodes that are updating might not be available for scheduling. The `False` status indicates that all nodes in the MCP are updated. +DEGRADED:: A `True` status indicates the MCO is blocked from applying the current or desired machine config to at least one of the nodes in that MCP, or the configuration is failing. Nodes that are degraded might not be available for scheduling. A `False` status indicates that all nodes in the MCP are ready. +MACHINECOUNT:: Indicates the total number of machines in that MCP. +READYMACHINECOUNT:: Indicates the total number of machines in that MCP that are ready for scheduling. +UPDATEDMACHINECOUNT:: Indicates the total number of machines in that MCP that have the current machine config. +DEGRADEDMACHINECOUNT:: Indicates the total number of machines in that MCP that are marked as degraded or unreconcilable. +-- + +In the previous output, there are three control plane (master) nodes and three worker nodes. The control plane MCP and the associated nodes are updated to the current machine config. The nodes in the worker MCP are being updated to the desired machine config. Two of the nodes in the worker MCP are updated and one is still updating, as indicated by the `UPDATEDMACHINECOUNT` being `2`. There are no issues, as indicated by the `DEGRADEDMACHINECOUNT` being `0` and `DEGRADED` being `False`. ++ +While the nodes in the MCP are updating, the machine config listed under `CONFIG` is the current machine config, which the MCP is being updated from. When the update is complete, the listed machine config is the desired machine config, which the MCP was updated to. ++ +[NOTE] +==== +If a node is being cordoned, that node is not included in the `READYMACHINECOUNT`, but is included in the `MACHINECOUNT`. Also, the MCP status is set to `UPDATING`. Because the node has the current machine config, it is counted in the `UPDATEDMACHINECOUNT` total: + +.Example output [source,terminal] ---- -$ oc get machineconfigs -NAME GENERATEDBYCONTROLLER IGNITIONVERSION AGE -00-master 2c9371fbb673b97a6fe8b1c52... 3.2.0 5h18m -00-worker 2c9371fbb673b97a6fe8b1c52... 3.2.0 5h18m -01-master-container-runtime 2c9371fbb673b97a6fe8b1c52... 3.2.0 5h18m -01-master-kubelet 2c9371fbb673b97a6fe8b1c52… 3.2.0 5h18m -... -rendered-master-dde... 2c9371fbb673b97a6fe8b1c52... 3.2.0 5h18m -rendered-worker-fde... 2c9371fbb673b97a6fe8b1c52... 3.2.0 5h18m +NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE +master rendered-master-06c9c4… True False False 3 3 3 0 4h42m +worker rendered-worker-c1b41a… False True False 3 2 3 0 4h42m ---- -+ -Note that the `machineconfigs` listed as `rendered` are not meant to be changed or deleted. Expect them to be hidden at some point in the future. +==== -. Check the status of worker (or change to master) to see the status of that pool of nodes: +. To check the status of the nodes in an MCP by examining the `MachineConfigPool` custom resource, run the following command: +: + [source,terminal] ---- $ oc describe mcp worker +---- ++ +.Example output +[source,terminal] +---- ... Degraded Machine Count: 0 Machine Count: 3 @@ -52,13 +73,58 @@ $ oc describe mcp worker Updated Machine Count: 3 Events: ---- ++ +[NOTE] +==== +If a node is being cordoned, the node is not included in the `Ready Machine Count`. It is included in the `Unavailable Machine Count`: -. You can view the contents of a particular machine config (in this case, `01-master-kubelet`). The trimmed output from the following `oc describe` command shows that this `machineconfig` contains both configuration files (`cloud.conf` and `kubelet.conf`) and a systemd service -(Kubernetes Kubelet): +.Example output +[source,terminal] +---- +... + Degraded Machine Count: 0 + Machine Count: 3 + Observed Generation: 2 + Ready Machine Count: 2 + Unavailable Machine Count: 1 + Updated Machine Count: 3 +---- +==== + +. To see each existing `MachineConfig` object, run the following command: ++ +[source,terminal] +---- +$ oc get machineconfigs +---- ++ +.Example output +[source,terminal] +---- +NAME GENERATEDBYCONTROLLER IGNITIONVERSION AGE +00-master 2c9371fbb673b97a6fe8b1c52... 3.2.0 5h18m +00-worker 2c9371fbb673b97a6fe8b1c52... 3.2.0 5h18m +01-master-container-runtime 2c9371fbb673b97a6fe8b1c52... 3.2.0 5h18m +01-master-kubelet 2c9371fbb673b97a6fe8b1c52… 3.2.0 5h18m +... +rendered-master-dde... 2c9371fbb673b97a6fe8b1c52... 3.2.0 5h18m +rendered-worker-fde... 2c9371fbb673b97a6fe8b1c52... 3.2.0 5h18m +---- ++ +Note that the `MachineConfig` objects listed as `rendered` are not meant to be changed or deleted. + +. To view the contents of a particular machine config (in this case, `01-master-kubelet`), run the following command: + [source,terminal] ---- $ oc describe machineconfigs 01-master-kubelet +---- ++ +The output from the command shows that this `MachineConfig` object contains both configuration files (`cloud.conf` and `kubelet.conf`) and a systemd service (Kubernetes Kubelet): ++ +.Example output +[source,terminal] +---- Name: 01-master-kubelet ... Spec: @@ -89,7 +155,7 @@ ExecStart=/usr/bin/hyperkube \ --config=/etc/kubernetes/kubelet.conf \ ... ---- -If something goes wrong with a machine config that you apply, you can always back out that change. For example, if you had run `oc create -f ./myconfig.yaml` to apply a machine config, you could remove that machine config by typing: +If something goes wrong with a machine config that you apply, you can always back out that change. For example, if you had run `oc create -f ./myconfig.yaml` to apply a machine config, you could remove that machine config by running the following command: [source,terminal] ----