From e2a975c630e6dd5fd1c725ad5f6397f96bddfd88 Mon Sep 17 00:00:00 2001 From: Steven Smith Date: Wed, 5 Jun 2024 14:05:13 -0400 Subject: [PATCH] Updates rollback to SDN procedure to add live migration --- .../how-the-live-migration-process-works.adoc | 54 ++++++++++++ .../nw-network-plugin-migration-process.adoc | 53 +---------- modules/nw-ovn-kubernetes-rollback-live.adoc | 88 +++++++++++++++++++ modules/nw-ovn-kubernetes-rollback.adoc | 52 +++++++++-- .../rollback-to-ovn-kubernetes.adoc | 2 +- .../rollback-to-openshift-sdn.adoc | 11 ++- 6 files changed, 200 insertions(+), 60 deletions(-) create mode 100644 modules/how-the-live-migration-process-works.adoc create mode 100644 modules/nw-ovn-kubernetes-rollback-live.adoc diff --git a/modules/how-the-live-migration-process-works.adoc b/modules/how-the-live-migration-process-works.adoc new file mode 100644 index 000000000000..3fd6101019b1 --- /dev/null +++ b/modules/how-the-live-migration-process-works.adoc @@ -0,0 +1,54 @@ +// Module included in the following assemblies: +// +// * networking/ovn_kubernetes_network_provider/migrate-from-openshift-sdn.adoc + +ifeval::["{context}" == "migrate-to-openshift-sdn"] +:sdn: OpenShift SDN +:previous-sdn: OVN-Kubernetes +:type: OpenShiftSDN +endif::[] +ifeval::["{context}" == "migrate-from-openshift-sdn"] +:sdn: OVN-Kubernetes +:previous-sdn: OpenShift SDN +:type: OVNKubernetes +endif::[] + +[id="how-the-live-migration-process-works_{context}"] += How the limited live migration process works + +The following table summarizes the limited live migration process by segmenting between the user-initiated steps in the process and the actions that the migration script performs in response. + +.Limited live migration to OVNKubernetes from OpenShiftSDN +[cols="1,1a",options="header"] +|=== +|User-initiated steps|Migration activity +ifdef::openshift-rosa,openshift-dedicated[] +| Add the `unsupported-red-hat-internal-testing` annotation to the cluster-level network configuration. 
+| The Cluster Network Operator (CNO) acknowledges the unsupported testing environment. +endif::[] + +| Patch the cluster-level networking configuration by changing the `networkType` from `OpenShiftSDN` to `OVNKubernetes`. +| +Cluster Network Operator (CNO):: ++ +-- +* Sets migration-related fields in the `network.operator` custom resource (CR) and waits for routable MTUs to be applied to all nodes. +* Patches the `network.operator` CR to set the migration mode to `Live` for OVN-Kubernetes and deploys the OpenShift SDN network plugin in migration mode. +* Deploys OVN-Kubernetes with hybrid overlay enabled, ensuring that no race conditions occur. +* Waits for the OVN-Kubernetes deployment and updates the conditions in the status of the `network.config` CR. +* Triggers the Machine Config Operator (MCO) to apply the new machine config to each machine config pool, which includes node cordoning, draining, and rebooting. +* OVN-Kubernetes adds nodes to the appropriate zones and recreates pods using OVN-Kubernetes as the default CNI plugin. +* Removes migration-related fields from the `network.operator` CR and performs cleanup actions, such as deleting OpenShift SDN resources and redeploying OVN-Kubernetes in normal mode with the necessary configurations. +* Waits for the OVN-Kubernetes redeployment and updates the status conditions in the `network.config` CR to indicate migration completion. If your migration is blocked, see "Checking limited live migration metrics" for information on troubleshooting the issue. 
+-- +|=== + +ifdef::sdn[] +:!sdn: +endif::[] +ifdef::previous-sdn[] +:!previous-sdn: +endif::[] +ifdef::type[] +:!type: +endif::[] \ No newline at end of file diff --git a/modules/nw-network-plugin-migration-process.adoc b/modules/nw-network-plugin-migration-process.adoc index c69d1123a464..8610d06915e3 100644 --- a/modules/nw-network-plugin-migration-process.adoc +++ b/modules/nw-network-plugin-migration-process.adoc @@ -15,11 +15,11 @@ ifeval::["{context}" == "migrate-from-openshift-sdn"] endif::[] [id="how-the-migration-process-works_{context}"] -= How the migration process works += How the offline migration process works The following table summarizes the migration process by segmenting between the user-initiated steps in the process and the actions that the migration performs in response. -.Migrating to {sdn} from {previous-sdn} +.Offline migration to {sdn} from {previous-sdn} [cols="1,1a",options="header"] |=== @@ -38,7 +38,7 @@ CNO:: Performs the following actions: -- * Destroys the {previous-sdn} control plane pods. * Deploys the {sdn} control plane pods. -* Updates the Multus objects to reflect the new network plugin. +* Updates the Multus daemon sets and config map objects to reflect the new network plugin. -- | @@ -48,51 +48,6 @@ Cluster:: As nodes reboot, the cluster assigns IP addresses to pods on the {sdn} |=== -ifeval::["{context}" == "migrate-from-openshift-sdn"] -If a rollback to OpenShift SDN is required, the following table describes the process. - -[IMPORTANT] -==== -You must wait until the migration process from OpenShift SDN to OVN-Kubernetes network plugin is successful before initiating a rollback. -==== - -.Performing a rollback to OpenShift SDN -[cols="1,1a",options="header"] -|=== - -|User-initiated steps|Migration activity - -|Suspend the MCO to ensure that it does not interrupt the migration. -|The MCO stops. - -| -Set the `migration` field of the `Network.operator.openshift.io` custom resource (CR) named `cluster` to `OpenShiftSDN`. 
Make sure the `migration` field is `null` before setting it to a value. -| -CNO:: Updates the status of the `Network.config.openshift.io` CR named `cluster` accordingly. - -|Update the `networkType` field. -| -CNO:: Performs the following actions: -+ --- -* Destroys the OVN-Kubernetes control plane pods. -* Deploys the OpenShift SDN control plane pods. -* Updates the Multus objects to reflect the new network plugin. --- - -| -Reboot each node in the cluster. -| -Cluster:: As nodes reboot, the cluster assigns IP addresses to pods on the OpenShift-SDN network. - -| -Enable the MCO after all nodes in the cluster reboot. -| -MCO:: Rolls out an update to the systemd configuration necessary for OpenShift SDN; the MCO updates a single machine per pool at a time by default, so the total time the migration takes increases with the size of the cluster. - -|=== -endif::[] - ifdef::sdn[] :!sdn: endif::[] @@ -101,4 +56,4 @@ ifdef::previous-sdn[] endif::[] ifdef::type[] :!type: -endif::[] +endif::[] \ No newline at end of file diff --git a/modules/nw-ovn-kubernetes-rollback-live.adoc b/modules/nw-ovn-kubernetes-rollback-live.adoc new file mode 100644 index 000000000000..ce0e9a4b5aee --- /dev/null +++ b/modules/nw-ovn-kubernetes-rollback-live.adoc @@ -0,0 +1,88 @@ +// Module included in the following assemblies: +// +// * networking/ovn_kubernetes_network_provider/rollback-to-openshift-sdn.adoc + +:_mod-docs-content-type: PROCEDURE +[id="nw-ovn-kubernetes-rollback-live_{context}"] += Using the limited live migration method to roll back to the OpenShift SDN network plugin + +As a cluster administrator, you can roll back to the OpenShift SDN Container Network Interface (CNI) network plugin by using the limited live migration method. During the migration with this method, nodes are automatically rebooted and service to the cluster is not interrupted. 
+ +[IMPORTANT] +==== +You must wait until the migration process from OpenShift SDN to the OVN-Kubernetes network plugin is successful before initiating a rollback. +==== + +If a rollback to OpenShift SDN is required, the following table describes the process. + +.Performing a rollback to OpenShift SDN +[cols="1,1a",options="header"] +|=== + +|User-initiated steps|Migration activity +ifdef::openshift-rosa,openshift-dedicated[] +| Add the `unsupported-red-hat-internal-testing` annotation to the cluster-level network configuration. +| The Cluster Network Operator (CNO) acknowledges the unsupported testing environment. +endif::[] + +| Patch the cluster-level networking configuration by changing the `networkType` from `OVNKubernetes` to `OpenShiftSDN`. +| +Cluster Network Operator (CNO):: Performs the following actions: ++ +-- +* Sets migration-related fields in the `network.operator` custom resource (CR) and waits for routable MTUs to be applied to all nodes by the Machine Config Operator (MCO). +* Patches the `network.operator` CR to set the migration mode to `Live` for OpenShift SDN and deploys the OpenShift SDN network plugin in migration mode. +* Deploys OVN-Kubernetes with hybrid overlay enabled. +* Waits for both CNI plugins to be deployed and updates the conditions in the status of the `network.config` CR. +* Triggers the MCO to apply the new machine config to each machine config pool, which includes node cordoning, draining, and rebooting. +* Removes migration-related fields from the `network.operator` CR and performs cleanup actions, such as deleting OVN-Kubernetes resources and redeploying OpenShift SDN in normal mode with the necessary configurations. +* Waits for the OpenShift SDN redeployment and updates the status conditions in the `network.config` CR to indicate migration completion. +-- + +|=== + +.Prerequisites + +* The {oc-first} is installed. +* Access to the cluster as a user with the `cluster-admin` role is available. 
+* The cluster is installed on infrastructure configured with the OVN-Kubernetes network plugin. +* A recent backup of the etcd database is available. +* A manual reboot can be triggered for each node. +* The cluster is in a known good state, without any errors. + +.Procedure + +. To initiate the rollback to OpenShift SDN, enter the following command: ++ +[source,terminal] +---- +$ oc patch Network.config.openshift.io cluster --type='merge' --patch '{"metadata":{"annotations":{"network.openshift.io/network-type-migration":""}},"spec":{"networkType":"OpenShiftSDN"}}' +---- + +. To watch the progress of your migration, enter the following command: ++ +[source,terminal] +---- +$ watch -n1 'oc get network.config/cluster -o json | jq ".status.conditions[]|\"\\(.type) \\(.status) \\(.reason) \\(.message)\"" -r | column --table --table-columns NAME,STATUS,REASON,MESSAGE --table-columns-limit 4; echo; oc get mcp -o wide; echo; oc get node -o "custom-columns=NAME:metadata.name,STATE:metadata.annotations.machineconfiguration\\.openshift\\.io/state,DESIRED:metadata.annotations.machineconfiguration\\.openshift\\.io/desiredConfig,CURRENT:metadata.annotations.machineconfiguration\\.openshift\\.io/currentConfig,REASON:metadata.annotations.machineconfiguration\\.openshift\\.io/reason"' +---- ++ +The command prints the following information every second: ++ +* The conditions on the status of the `network.config.openshift.io/cluster` object, reporting the progress of the migration. +* The status of different nodes with respect to the `machine-config-operator` resource, including whether they are upgrading or have been upgraded, as well as their current and desired configurations. + +. Complete the following steps only if the migration succeeds and your cluster is in a good state: + +.. 
Remove the `network.openshift.io/network-type-migration=` annotation from the `network.config` custom resource by entering the following command: ++ +[source,terminal] +---- +$ oc annotate network.config cluster network.openshift.io/network-type-migration- +---- + +.. Remove the OVN-Kubernetes network provider namespace by entering the following command: ++ +[source,terminal] +---- +$ oc delete namespace openshift-ovn-kubernetes +---- \ No newline at end of file diff --git a/modules/nw-ovn-kubernetes-rollback.adoc b/modules/nw-ovn-kubernetes-rollback.adoc index af0481f0c5cf..85d7cd13f16d 100644 --- a/modules/nw-ovn-kubernetes-rollback.adoc +++ b/modules/nw-ovn-kubernetes-rollback.adoc @@ -6,24 +6,60 @@ // This procedure applies to both a roll back and a migration :_mod-docs-content-type: PROCEDURE [id="nw-ovn-kubernetes-rollback_{context}"] -= Migrating to the OpenShift SDN network plugin += Using the offline migration method to roll back to the OpenShift SDN network plugin Cluster administrators can roll back to the OpenShift SDN Container Network Interface (CNI) network plugin by using the offline migration method. During the migration you must manually reboot every node in your cluster. With the offline migration method, there is some downtime, during which your cluster is unreachable. -ifeval::["{context}" == "rollback-to-openshift-sdn"] [IMPORTANT] ==== You must wait until the migration process from OpenShift SDN to OVN-Kubernetes network plugin is successful before initiating a rollback. ==== -endif::[] + +If a rollback to OpenShift SDN is required, the following table describes the process. + +.Performing a rollback to OpenShift SDN +[cols="1,1a",options="header"] +|=== + +|User-initiated steps|Migration activity + +|Suspend the MCO to ensure that it does not interrupt the migration. +|The MCO stops. + +| +Set the `migration` field of the `Network.operator.openshift.io` custom resource (CR) named `cluster` to `OpenShiftSDN`. 
Make sure the `migration` field is `null` before setting it to a value. +| +CNO:: Updates the status of the `Network.config.openshift.io` CR named `cluster` accordingly. + +|Update the `networkType` field. +| +CNO:: Performs the following actions: ++ +-- +* Destroys the OVN-Kubernetes control plane pods. +* Deploys the OpenShift SDN control plane pods. +* Updates the Multus objects to reflect the new network plugin. +-- + +| +Reboot each node in the cluster. +| +Cluster:: As nodes reboot, the cluster assigns IP addresses to pods on the OpenShift SDN network. + +| +Enable the MCO after all nodes in the cluster reboot. +| +MCO:: Rolls out an update to the systemd configuration necessary for OpenShift SDN; the MCO updates a single machine per pool at a time by default, so the total time the migration takes increases with the size of the cluster. + +|=== .Prerequisites -* Install the OpenShift CLI (`oc`). -* Access to the cluster as a user with the `cluster-admin` role. -* A cluster installed on infrastructure configured with the OVN-Kubernetes network plugin. +* The {oc-first} is installed. +* Access to the cluster as a user with the `cluster-admin` role is available. +* The cluster is installed on infrastructure configured with the OVN-Kubernetes network plugin. * A recent backup of the etcd database is available. -* A reboot can be triggered manually for each node. +* A manual reboot can be triggered for each node. * The cluster is in a known good state, without any errors. 
.Procedure @@ -357,4 +393,4 @@ $ oc patch Network.operator.openshift.io cluster --type='merge' \ [source,terminal] ---- $ oc delete namespace openshift-ovn-kubernetes ----- +---- \ No newline at end of file diff --git a/networking/openshift_sdn/rollback-to-ovn-kubernetes.adoc b/networking/openshift_sdn/rollback-to-ovn-kubernetes.adoc index 661017548f4e..bdf2ec36b6c5 100644 --- a/networking/openshift_sdn/rollback-to-ovn-kubernetes.adoc +++ b/networking/openshift_sdn/rollback-to-ovn-kubernetes.adoc @@ -6,7 +6,7 @@ include::_attributes/common-attributes.adoc[] toc::[] -As a cluster administrator, you can rollback to the OVN-Kubernetes network plugin from the OpenShift SDN network plugin if the migration to OpenShift SDN is unsuccessful. +As a cluster administrator, you can roll back to the OVN-Kubernetes network plugin from the OpenShift SDN network plugin if the migration to OpenShift SDN is unsuccessful. include::snippets/sdn-deprecation-statement.adoc[] diff --git a/networking/ovn_kubernetes_network_provider/rollback-to-openshift-sdn.adoc b/networking/ovn_kubernetes_network_provider/rollback-to-openshift-sdn.adoc index 033c309d7673..6460cb629b87 100644 --- a/networking/ovn_kubernetes_network_provider/rollback-to-openshift-sdn.adoc +++ b/networking/ovn_kubernetes_network_provider/rollback-to-openshift-sdn.adoc @@ -6,8 +6,15 @@ include::_attributes/common-attributes.adoc[] toc::[] -As a cluster administrator, you can rollback to the OpenShift SDN from the OVN-Kubernetes network plugin only after the migration to the OVN-Kubernetes network plugin is completed and successful. +As a cluster administrator, you can roll back to the OpenShift SDN network plugin from the OVN-Kubernetes network plugin by using either the _offline_ migration method or the _limited live_ migration method. You can roll back only after the migration to the OVN-Kubernetes network plugin has completed successfully. 
+ +[NOTE] +==== +* If you used the offline migration method to migrate to the OVN-Kubernetes network plugin from the OpenShift SDN network plugin, use the offline migration rollback method. +* If you used the limited live migration method to migrate to the OVN-Kubernetes network plugin from the OpenShift SDN network plugin, use the limited live migration rollback method. +==== include::snippets/sdn-deprecation-statement.adoc[] -include::modules/nw-ovn-kubernetes-rollback.adoc[leveloffset=+1] \ No newline at end of file +include::modules/nw-ovn-kubernetes-rollback.adoc[leveloffset=+1] +include::modules/nw-ovn-kubernetes-rollback-live.adoc[leveloffset=+1] \ No newline at end of file