110 changes: 88 additions & 22 deletions modules/checking-mco-status.adoc
@@ -6,43 +6,64 @@
[id="checking-mco-status_{context}"]
= Checking machine config pool status

To see the status of the Machine Config Operator, its sub-components, and the resources it manages, use the following `oc` commands:
To see the status of the Machine Config Operator (MCO), its sub-components, and the resources it manages, use the following `oc` commands:

.Procedure
. To see the number of MCO-managed nodes available on your cluster for each pool, type:
. To see the number of MCO-managed nodes available on your cluster for each machine config pool (MCP), run the following command:
+
[source,terminal]
----
$ oc get machineconfigpool
NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
master rendered-master-dd… True False False 3 3 3 0 4h42m
worker rendered-worker-fde… True False False 3 3 3 0 4h42m
----
+
In the previous output, there are three master and three worker nodes. All machines are updated and none are currently updating. Because all nodes are `Updated` and `Ready` and none are `Degraded`, you can tell that there are no issues.
.Example output
[source,terminal]
Contributor
You could add an ".Example output" line before this one. For more information, see Code blocks, command syntax, and example output.

Contributor Author
@pneedle-rh Thanks! I split a few command/output code blocks, but forgot to add the .Example output!

----
NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
master rendered-master-06c9c4… True False False 3 3 3 0 4h42m
worker rendered-worker-f4b64… False True False 3 2 2 0 4h42m
----
+
--
where:

. To see each existing `machineconfig`, type:
UPDATED:: The `True` status indicates that the MCO has applied the current machine config to the nodes in that MCP. The current machine config is specified in the `STATUS` field in the `oc get mcp` output. The `False` status indicates a node in the MCP is updating.
UPDATING:: The `True` status indicates that the MCO is applying the desired machine config, as specified in the `MachineConfigPool` custom resource, to at least one of the nodes in that MCP. The desired machine config is the new, edited machine config. Nodes that are updating might not be available for scheduling. The `False` status indicates that all nodes in the MCP are updated.
DEGRADED:: A `True` status indicates the MCO is blocked from applying the current or desired machine config to at least one of the nodes in that MCP, or the configuration is failing. Nodes that are degraded might not be available for scheduling. A `False` status indicates that all nodes in the MCP are ready.
MACHINECOUNT:: Indicates the total number of machines in that MCP.
READYMACHINECOUNT:: Indicates the total number of machines in that MCP that are ready for scheduling.
UPDATEDMACHINECOUNT:: Indicates the total number of machines in that MCP that have the current machine config.
DEGRADEDMACHINECOUNT:: Indicates the total number of machines in that MCP that are marked as degraded or unreconcilable.
--
+
In the previous output, there are three control plane (master) nodes and three worker nodes. The control plane MCP and the associated nodes are updated to the current machine config. The nodes in the worker MCP are being updated to the desired machine config. Two of the nodes in the worker MCP are updated and one is still updating, as indicated by the `UPDATEDMACHINECOUNT` being `2`. There are no issues, as indicated by the `DEGRADEDMACHINECOUNT` being `0` and `DEGRADED` being `False`.
+
While the nodes in the MCP are updating, the machine config listed under `CONFIG` is the current machine config, which the MCP is being updated from. When the update is complete, the listed machine config is the desired machine config, which the MCP was updated to.
+
[NOTE]
====
If a node is being cordoned, that node is not included in the `READYMACHINECOUNT`, but is included in the `MACHINECOUNT`. Also, the MCP status is set to `UPDATING`. Because the node has the current machine config, it is counted in the `UPDATEDMACHINECOUNT` total:

.Example output
[source,terminal]
----
$ oc get machineconfigs
NAME GENERATEDBYCONTROLLER IGNITIONVERSION AGE
00-master 2c9371fbb673b97a6fe8b1c52... 3.2.0 5h18m
00-worker 2c9371fbb673b97a6fe8b1c52... 3.2.0 5h18m
01-master-container-runtime 2c9371fbb673b97a6fe8b1c52... 3.2.0 5h18m
01-master-kubelet 2c9371fbb673b97a6fe8b1c52… 3.2.0 5h18m
...
rendered-master-dde... 2c9371fbb673b97a6fe8b1c52... 3.2.0 5h18m
rendered-worker-fde... 2c9371fbb673b97a6fe8b1c52... 3.2.0 5h18m
NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
master rendered-master-06c9c4… True False False 3 3 3 0 4h42m
worker rendered-worker-c1b41a… False True False 3 2 3 0 4h42m
----
+
Note that the `machineconfigs` listed as `rendered` are not meant to be changed or deleted. Expect them to be hidden at some point in the future.
====

. Check the status of worker (or change to master) to see the status of that pool of nodes:
. To check the status of the nodes in an MCP by examining the `MachineConfigPool` custom resource, run the following command:
+
[source,terminal]
----
$ oc describe mcp worker
----
+
.Example output
[source,terminal]
----
...
Degraded Machine Count: 0
Machine Count: 3
@@ -52,13 +73,58 @@ $ oc describe mcp worker
Updated Machine Count: 3
Events: <none>
----
+
[NOTE]
====
If a node is being cordoned, the node is not included in the `Ready Machine Count`. It is included in the `Unavailable Machine Count`:

. You can view the contents of a particular machine config (in this case, `01-master-kubelet`). The trimmed output from the following `oc describe` command shows that this `machineconfig` contains both configuration files (`cloud.conf` and `kubelet.conf`) and a systemd service
(Kubernetes Kubelet):
.Example output
[source,terminal]
----
...
Degraded Machine Count: 0
Machine Count: 3
Observed Generation: 2
Ready Machine Count: 2
Unavailable Machine Count: 1
Updated Machine Count: 3
----
====

. To see each existing `MachineConfig` object, run the following command:
+
[source,terminal]
----
$ oc get machineconfigs
----
+
.Example output
[source,terminal]
----
NAME GENERATEDBYCONTROLLER IGNITIONVERSION AGE
00-master 2c9371fbb673b97a6fe8b1c52... 3.2.0 5h18m
00-worker 2c9371fbb673b97a6fe8b1c52... 3.2.0 5h18m
01-master-container-runtime 2c9371fbb673b97a6fe8b1c52... 3.2.0 5h18m
01-master-kubelet 2c9371fbb673b97a6fe8b1c52… 3.2.0 5h18m
...
rendered-master-dde... 2c9371fbb673b97a6fe8b1c52... 3.2.0 5h18m
rendered-worker-fde... 2c9371fbb673b97a6fe8b1c52... 3.2.0 5h18m
----
+
Note that the `MachineConfig` objects listed as `rendered` are not meant to be changed or deleted.

. To view the contents of a particular machine config (in this case, `01-master-kubelet`), run the following command:
+
[source,terminal]
----
$ oc describe machineconfigs 01-master-kubelet
----
+
The output from the command shows that this `MachineConfig` object contains both configuration files (`cloud.conf` and `kubelet.conf`) and a systemd service (Kubernetes Kubelet):
+
.Example output
[source,terminal]
----
Name: 01-master-kubelet
...
Spec:
@@ -89,7 +155,7 @@ ExecStart=/usr/bin/hyperkube \
--config=/etc/kubernetes/kubelet.conf \ ...
----
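
The column definitions given for the `oc get machineconfigpool` output can be read as a small decision rule: a pool is degraded if `DEGRADED` is `True` or `DEGRADEDMACHINECOUNT` is above zero, otherwise it is updating if `UPDATING` is `True`, otherwise it is updated. The following Python sketch applies that rule to the example output from this module. It is an illustration only, not part of the MCO or `oc` tooling: the names `SAMPLE`, `parse_mcp_table`, and `summarize` are hypothetical, and the sketch assumes the table is whitespace-separated with one header row.

```python
# Illustrative sketch only: parse_mcp_table and summarize are hypothetical
# helpers, not part of oc or the Machine Config Operator.

# Example output copied from this module (truncated config names kept as-is).
SAMPLE = """\
NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
master rendered-master-06c9c4… True False False 3 3 3 0 4h42m
worker rendered-worker-f4b64… False True False 3 2 2 0 4h42m
"""


def parse_mcp_table(text):
    """Turn whitespace-separated table text into one dict per pool."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


def summarize(pool):
    """Describe one pool using the column meanings listed in this module."""
    name = pool["NAME"]
    total = int(pool["MACHINECOUNT"])
    updated = int(pool["UPDATEDMACHINECOUNT"])
    degraded = int(pool["DEGRADEDMACHINECOUNT"])

    # Degraded takes priority: the MCO is blocked on at least one node.
    if pool["DEGRADED"] == "True" or degraded > 0:
        return f"{name}: degraded ({degraded} of {total} machines)"
    # Updating: the desired machine config is still being applied.
    if pool["UPDATING"] == "True":
        return f"{name}: updating ({updated} of {total} machines updated)"
    # Updated: every node has the current machine config.
    if pool["UPDATED"] == "True":
        return f"{name}: updated ({updated} of {total} machines)"
    return f"{name}: unknown state"


if __name__ == "__main__":
    for pool in parse_mcp_table(SAMPLE):
        print(summarize(pool))
```

Checking degradation first is a design choice of this sketch, so that a blocked pool is never reported as merely updating. To use it against a live cluster, you would replace `SAMPLE` with the captured text of `oc get machineconfigpool`.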

If something goes wrong with a machine config that you apply, you can always back out that change. For example, if you had run `oc create -f ./myconfig.yaml` to apply a machine config, you could remove that machine config by typing:
If something goes wrong with a machine config that you apply, you can always back out that change. For example, if you had run `oc create -f ./myconfig.yaml` to apply a machine config, you could remove that machine config by running the following command:

[source,terminal]
----