From d5bf349ad4d4b5be154c438f5bb679b0b3678bde Mon Sep 17 00:00:00 2001
From: Michelle Noorali
Date: Thu, 1 Apr 2021 13:30:46 -0400
Subject: [PATCH] ref(docs): update uninstall guide (#3003)

+ update uninstall guide
+ update uninstall troubleshooting guide
+ resolves #2880

Signed-off-by: Michelle Noorali
---
 .../docs/install/uninstallation_guide.md      | 134 +++++++++++++++---
 docs/content/docs/troubleshooting/install.md  |  31 ++++
 .../content/docs/troubleshooting/uninstall.md |  19 +++
 .../docs/troubleshooting/uninstall/_index.md  |  22 ---
 docs/content/docs/troubleshooting/upgrade.md  |   3 +-
 5 files changed, 165 insertions(+), 44 deletions(-)
 create mode 100644 docs/content/docs/troubleshooting/install.md
 create mode 100644 docs/content/docs/troubleshooting/uninstall.md
 delete mode 100644 docs/content/docs/troubleshooting/uninstall/_index.md

diff --git a/docs/content/docs/install/uninstallation_guide.md b/docs/content/docs/install/uninstallation_guide.md
index e2fce05eb2..1a76be4dba 100644
--- a/docs/content/docs/install/uninstallation_guide.md
+++ b/docs/content/docs/install/uninstallation_guide.md
@@ -8,44 +8,136 @@ weight: 3

 # Uninstallation Guide

-This guide describes how to uninstall Open Service Mesh (OSM) from a Kubernetes cluster using the `osm` CLI.
+This guide describes how to uninstall Open Service Mesh (OSM) from a Kubernetes cluster. It assumes there is a single OSM control plane (mesh) running. If there are multiple meshes in the cluster, repeat the process described for each control plane before uninstalling any cluster-wide resources at the end of the guide. Covering both the control plane and the data plane, this guide walks through uninstalling all remnants of OSM with minimal downtime.
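To check how many OSM control planes exist before starting, one option is to look for `osm-controller` deployments. The sketch below assumes the controller deployment carries an `app=osm-controller` label; verify that label against your installed OSM version.

```shell
# Sketch: list namespaces that contain an OSM control plane.
# Assumes controller deployments carry the "app=osm-controller" label (verify
# against your OSM version before relying on this).

list_mesh_namespaces() {
  # Reads "NAMESPACE NAME ..." rows on stdin; prints the namespace column.
  awk '{ print $1 }'
}

# Only query the cluster when kubectl is available.
if command -v kubectl >/dev/null 2>&1; then
  kubectl get deployments --all-namespaces -l app=osm-controller --no-headers 2>/dev/null \
    | list_mesh_namespaces || true
fi
```

Each namespace printed corresponds to one control plane whose uninstall steps below must be repeated.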
 ## Prerequisites

 - Kubernetes cluster with OSM installed
-- The osm CLI
+- The `kubectl` CLI
+- The `osm` CLI

-## Uninstall OSM
+## Remove Envoy Sidecars from Application Pods and Envoy Secrets

-Use the `osm` CLI to uninstall the OSM control plane from a Kubernetes cluster.
+The first step to uninstalling OSM is to remove the Envoy sidecar containers from application pods. The sidecar containers enforce traffic policies. Without them, traffic will flow to and from pods in accordance with default Kubernetes networking unless [Kubernetes network policies](https://kubernetes.io/docs/concepts/services-networking/network-policies/) are applied.

-Run `osm mesh uninstall`.
+OSM Envoy sidecars and related secrets will be removed in the following steps:
+
+1. [Disable automatic sidecar injection](#disable-automatic-sidecar-injection)
+1. [Restart pods](#restart-pods)
+1. [Update ingress resources](#update-ingress-resources)
+1. [Delete Envoy bootstrap secrets](#delete-envoy-bootstrap-secrets)
+
+### Disable Automatic Sidecar Injection
+
+OSM automatic sidecar injection is most commonly enabled by adding namespaces to the mesh via the `osm` CLI. Use the `osm` CLI to see which namespaces have sidecar injection enabled. If there are multiple control planes installed, be sure to specify the `--mesh-name` flag.
+
+View namespaces in a mesh:

 ```console
-# Uninstall osm control plane components
-$ osm mesh uninstall
-Uninstall OSM [mesh name: osm] ? [y/n]: y
-OSM [mesh name: osm] uninstalled
+$ osm namespace list --mesh-name=<mesh-name>
+NAMESPACE          MESH           SIDECAR-INJECTION
+<namespace1>       <mesh-name>    enabled
+<namespace2>       <mesh-name>    enabled
 ```

-Run `osm mesh uninstall --help` for more options.
+Remove each namespace from the mesh:
+
+```console
+$ osm namespace remove <namespace> --mesh-name=<mesh-name>
+Namespace [<namespace>] successfully removed from mesh [<mesh-name>]
+```
+
+Alternatively, if sidecar injection is enabled via annotations on pods instead of per namespace, modify the pod or deployment spec to remove the sidecar injection annotation.
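For the per-pod annotation case, the annotation can be removed with a JSON patch. A minimal sketch, assuming a hypothetical `bookstore` deployment in a hypothetical `bookstore-ns` namespace and OSM's `openservicemesh.io/sidecar-injection` annotation; note that a `/` inside a JSON Pointer key is escaped as `~1`:

```shell
# Sketch: remove OSM's sidecar injection annotation from a deployment's pod template.
# "bookstore" and "bookstore-ns" are hypothetical names.
# In the JSON Pointer path, the "/" in the annotation key is escaped as "~1".
PATCH='[{"op": "remove", "path": "/spec/template/metadata/annotations/openservicemesh.io~1sidecar-injection"}]'

# Only attempt the patch when kubectl (and a reachable cluster) is available.
if command -v kubectl >/dev/null 2>&1; then
  kubectl patch deployment bookstore -n bookstore-ns --type=json -p "$PATCH" || true
fi
```

The patch only changes the pod template; the restart step below is still needed for existing pods to drop their sidecars.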
+
+### Restart Pods
+
+Restart all pods running with a sidecar:
+
+```console
+# If pods are running as part of a Kubernetes deployment
+# (the same strategy can be used for a DaemonSet as well)
+$ kubectl rollout restart deployment <deployment-name> -n <namespace>
+
+# If a pod is running standalone (not part of a deployment or replica set)
+$ kubectl delete pod <pod-name> -n <namespace>
+$ kubectl apply -f <pod-spec> # if the pod is not restarted as part of a replica set
+```
+
+Now, there should be no OSM Envoy sidecar containers running as part of the applications that were once part of the mesh. Traffic is no longer managed by the OSM control plane with the `mesh-name` used above. During this process, your applications may experience some downtime as all the pods are restarting.
+
+### Update Ingress Resources
+
+There may be ingress resources in the cluster that are configured to allow traffic from outside the cluster to an application that was once part of the mesh. Identify these resources by using the `kubectl get ingress -n <namespace>` command in each namespace that was removed from the mesh earlier.
+
+If there are ingress resources and they are configured to allow HTTPS traffic, SSL-related annotations will need to be updated appropriately.
+
+Applications may be unavailable from outside the cluster for some time if ingress resources need to be reconfigured.
+
+### Delete Envoy Bootstrap Secrets

-## Resource Management
+Once the sidecar is removed, there is no need for the Envoy bootstrap config secrets OSM created. These are stored in the application namespace and can be deleted manually with `kubectl`. These secrets have the prefix `envoy-bootstrap-config` followed by a unique ID: `envoy-bootstrap-config-<some-id>`.

-The following sections detail which Kubernetes resources are cleaned up and which remain after uninstalling OSM.
+## Uninstall OSM Control Plane and Remove User Provided Resources

-### Removed during OSM uninstallation
+The OSM control plane and related components will be uninstalled in the following steps:
+
+1. 
[Uninstall the OSM control plane](#uninstall-the-osm-control-plane)
+1. [Remove User Provided Resources](#remove-user-provided-resources)
+1. [Delete OSM Namespace](#delete-osm-namespace)
+
+### Uninstall the OSM control plane
+
+Use the `osm` CLI to uninstall the OSM control plane from a Kubernetes cluster. The following step will remove:

 1. OSM controller resources (deployment, service, config map, and RBAC)
 1. Prometheus, Grafana, Jaeger, and Fluentbit resources installed by OSM
 1. Mutating webhook and validating webhook

-### Remaining after OSM uninstallation
+Run `osm mesh uninstall`:
+
+```console
+# Uninstall osm control plane components
+$ osm mesh uninstall --mesh-name=<mesh-name>
+Uninstall OSM [mesh name: <mesh-name>] ? [y/n]: y
+OSM [mesh name: <mesh-name>] uninstalled
+```
+
+Run `osm mesh uninstall --help` for more options.
+
+### Remove User Provided Resources
+
+If any resources were provided or created for OSM at install time, they can be deleted at this point.
+
+For example, if [HashiCorp Vault](https://github.com/openservicemesh/osm/blob/main/docs/content/docs/tasks_usage/certificates.md#installing-hashi-vault) was deployed for the sole purpose of managing certificates for OSM, all related resources can be deleted.
+
+### Delete OSM Namespace
+
+When installing a mesh, the `osm` CLI creates the namespace the control plane is installed into if it does not already exist. However, when uninstalling the same mesh, the `osm` CLI does not automatically delete that namespace. This behavior occurs because the namespace may contain resources the user created and does not want deleted automatically.
+
+If the namespace was used only for OSM and nothing in it needs to be kept, the namespace can be deleted at this time with `kubectl`:
+
+```console
+$ kubectl delete namespace <namespace>
+namespace "<namespace>" deleted
+```
+
+Repeat the steps above for each mesh installed in the cluster. After there are no OSM control planes remaining, move on to the following step.
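The Envoy bootstrap secret cleanup described in the sidecar removal steps above can also be scripted per application namespace. This is a sketch under assumptions: `bookstore-ns` is a hypothetical namespace, and only the `envoy-bootstrap-config-` prefix comes from the guide itself.

```shell
# Sketch: delete leftover Envoy bootstrap config secrets in one application namespace.
# "bookstore-ns" is hypothetical; repeat for each namespace removed from the mesh.

bootstrap_secrets() {
  # Reads secret names on stdin; keeps only OSM's bootstrap config secrets.
  grep '^envoy-bootstrap-config-' || true
}

# Only touch the cluster when kubectl is available.
if command -v kubectl >/dev/null 2>&1; then
  kubectl get secrets -n bookstore-ns -o name 2>/dev/null \
    | sed 's|^secret/||' \
    | bootstrap_secrets \
    | xargs -n1 -r kubectl delete secret -n bookstore-ns || true
fi
```

Filtering on the exact prefix avoids deleting unrelated secrets (for example, TLS certificates) that live in the same namespace.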
+
+## Remove OSM Cluster-Wide Resources
+
+OSM ensures that the Service Mesh Interface (SMI) Custom Resource Definitions (CRDs) exist in the cluster at install time. If they are not already installed, the `osm` CLI will install them before installing the rest of the control plane components. The same behavior applies when installing OSM with the Helm charts. CRDs are cluster-wide resources and may be used by other instances of OSM in the same cluster or by other service meshes running in the same cluster. If there are no other instances of OSM or other service meshes running in the cluster, these CRDs and the instances of the SMI custom resources can be removed from the cluster using `kubectl`. When a CRD is deleted, all instances of that CRD are also deleted.
+
+Run the following `kubectl` commands:

-1. Existing Envoy sidecar containers
-   - Redeploy application pods to delete sidecars
-1. Envoy bootstrap config secrets (stored in the application namespace)
-1. Namespace annotations, including but not limited to `openservicemesh.io/monitored-by`
-1. Custom resource definitions ([CRDs](https://github.com/openservicemesh/osm/tree/release-v0.8/charts/osm/crds))
-1. Vault resources provided by the user
-1. 
The namespace in which OSM was installed
+
+```console
+kubectl delete -f https://raw.githubusercontent.com/openservicemesh/osm/release-v0.8/charts/osm/crds/access.yaml
+kubectl delete -f https://raw.githubusercontent.com/openservicemesh/osm/release-v0.8/charts/osm/crds/specs.yaml
+kubectl delete -f https://raw.githubusercontent.com/openservicemesh/osm/release-v0.8/charts/osm/crds/split.yaml
+```
diff --git a/docs/content/docs/troubleshooting/install.md b/docs/content/docs/troubleshooting/install.md
new file mode 100644
index 0000000000..d65f317965
--- /dev/null
+++ b/docs/content/docs/troubleshooting/install.md
@@ -0,0 +1,31 @@
+---
+title: "Install Troubleshooting"
+description: "OSM Mesh Install Troubleshooting Guide"
+type: docs
+---
+
+# OSM Mesh Install Troubleshooting Guide
+
+## Leaked Resources
+
+During an improper or incomplete uninstallation, OSM resources can be left behind in a Kubernetes cluster.
+
+For example, if the Helm release, OSM controller, or their respective namespaces are deleted, then the `osm` CLI won't be able to uninstall any remaining resources, particularly if they are cluster scoped.
+
+As a result, one may see this error during a subsequent install of a new mesh with the same name but a different namespace:
+
+```console
+Error: rendered manifests contain a resource that already exists.
Unable to continue with install: ClusterRole "<mesh-name>" in namespace "" exists and cannot be imported into the current release: invalid ownership metadata; annotation validation error: key "meta.helm.sh/release-namespace" must equal "<new-namespace>": current value is "<existing-namespace>"
+```
+
+In the case of this error, use the [cleanup script](https://github.com/openservicemesh/osm/blob/release-v0.8/scripts/cleanup/osm-cleanup.sh) located in the OSM repository to delete any remaining resources.
+
+To run the script, create a `.env` environment variable file to set the values specified at the top of the script. These values should match the values used to deploy the mesh.
+
+In the root directory of the OSM repository, run:
+
+```console
+./scripts/cleanup/osm-cleanup.sh
+```
+
+Then, try installing OSM again on the cluster.
diff --git a/docs/content/docs/troubleshooting/uninstall.md b/docs/content/docs/troubleshooting/uninstall.md
new file mode 100644
index 0000000000..453eb6ced9
--- /dev/null
+++ b/docs/content/docs/troubleshooting/uninstall.md
@@ -0,0 +1,19 @@
+---
+title: "Uninstall Troubleshooting"
+description: "OSM Mesh Uninstall Troubleshooting Guide"
+type: docs
+---
+
+# OSM Mesh Uninstall Troubleshooting Guide
+
+## Unsuccessful Uninstall
+
+If for any reason `osm mesh uninstall` is unsuccessful, run the [cleanup script](https://github.com/openservicemesh/osm/blob/release-v0.8/scripts/cleanup/osm-cleanup.sh), which will delete any OSM-related resources.
+
+To run the script, create a `.env` environment variable file to set the values specified at the top of the script. These values should match the values used to deploy the mesh.
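For illustration, a `.env` file for the cleanup script might look like the sketch below. The variable names here are assumptions, not taken from the script; copy the exact names listed at the top of `osm-cleanup.sh` and match the values to the deployed mesh.

```shell
# Hypothetical .env contents; the authoritative variable names are listed
# at the top of scripts/cleanup/osm-cleanup.sh. Values must match the mesh
# that was deployed.
export MESH_NAME=osm
export K8S_NAMESPACE=osm-system
```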
+
+In the root directory of the OSM repository, run:
+
+```console
+./scripts/cleanup/osm-cleanup.sh
+```
diff --git a/docs/content/docs/troubleshooting/uninstall/_index.md b/docs/content/docs/troubleshooting/uninstall/_index.md
deleted file mode 100644
index 167e961fd9..0000000000
--- a/docs/content/docs/troubleshooting/uninstall/_index.md
+++ /dev/null
@@ -1,22 +0,0 @@
----
-title: "Uninstall Troubleshooting"
-description: "OSM Uninstall Troubleshooting Guide"
-type: docs
----
-
-# OSM Uninstall Troubleshooting Guide
-
-## Leaked Resources
-If the [uninstallation guide](../../install/uninstallation_guide.md) was not followed, it is possible that resources could be leaked.
-
-If the Helm release, OSM controller, or their respective namespaces are deleted, then the `osm` CLI won't be able to uninstall any remaining resources, particularly if they are cluster scoped.
-
-These leaked resources result in an error when trying to install a new mesh with the same name but different namespace.
-
-```
-Error: rendered manifests contain a resource that already exists. Unable to continue with install: ClusterRole "osm" in namespace "" exists and cannot be imported into the current release: invalid ownership metadata; annotation validation error: key "meta.helm.sh/release-namespace" must equal "osm-system2": current value is "osm-system"
-```
-
-In the `./scripts/cleanup` directory we have included a helper script to delete those leaked resources: `./scripts/cleanup/osm-cleanup.sh`
-
-To run the script, create a `.env` environment variable file to set the values specified at the top of the script. These values should match the values used to deploy the mesh.
diff --git a/docs/content/docs/troubleshooting/upgrade.md b/docs/content/docs/troubleshooting/upgrade.md
index 2cf0b17a0c..6e4f5ad35f 100644
--- a/docs/content/docs/troubleshooting/upgrade.md
+++ b/docs/content/docs/troubleshooting/upgrade.md
@@ -1,12 +1,13 @@
 ---
 title: "Upgrade Troubleshooting"
-description: "OSM Upgrade Troubleshooting Guide"
+description: "OSM Mesh Upgrade Troubleshooting Guide"
 type: docs
 ---

 # OSM Upgrade Troubleshooting Guide

 ## Server could not find requested resource
+
 If the [upgrade CRD guide](../upgrade_guide.md#crd-upgrades) was not followed, it is possible that the installed CRDs are out of sync with the OSM controller. The OSM controller will then crash with errors similar to this: