From f56ca7b818430d7c541ac3f6b74fcf92bf6ee5f3 Mon Sep 17 00:00:00 2001 From: Matthew Christopher Date: Tue, 9 Apr 2024 14:28:26 -0700 Subject: [PATCH] Add best practices documentation and update other docs * Added general and security best practices sections. * Updated FAQ to discuss CVE patching. * Moved to use more visible warnings/notes in ASOv1 -> ASOv2 migration guide. This fixes #3261. --- docs/hugo/content/_index.md | 152 ++++++++++++------ docs/hugo/content/guide/_index.md | 4 + .../guide/asov1-asov2-migration/_index.md | 104 +++++++----- .../guide/asov1-asov2-migration/storage.md | 5 - .../content/guide/authentication/_index.md | 2 +- .../guide/authentication/credential-scope.md | 6 + .../content/guide/best-practices/_index.md | 55 +++++++ .../content/guide/best-practices/security.md | 83 ++++++++++ .../guide/frequently-asked-questions.md | 39 +++-- 9 files changed, 339 insertions(+), 111 deletions(-) create mode 100644 docs/hugo/content/guide/best-practices/_index.md create mode 100644 docs/hugo/content/guide/best-practices/security.md diff --git a/docs/hugo/content/_index.md b/docs/hugo/content/_index.md index 8aeef8d9581..e6e5ffe5c56 100644 --- a/docs/hugo/content/_index.md +++ b/docs/hugo/content/_index.md @@ -52,7 +52,52 @@ cert-manager-webhook-c4b5687dc-x66bj 1/1 Running 0 1m (Alternatively, you can wait for cert-manager to be ready with `cmctl check api --wait=2m` - see the [cert-manager documentation](https://cert-manager.io/docs/usage/cmctl/) for more information about `cmctl`.) -2. Create an Azure Service Principal. You'll need this to grant Azure Service Operator permissions to create resources in your subscription. +2. 
Install [the latest **v2+** Helm chart](https://github.com/Azure/azure-service-operator/tree/main/v2/charts): + +{{< tabpane text=true left=true >}} +{{% tab header="**Shell**:" disabled=true /%}} +{{% tab header="bash" %}} +``` bash +$ helm repo add aso2 https://raw.githubusercontent.com/Azure/azure-service-operator/main/v2/charts +$ helm upgrade --install aso2 aso2/azure-service-operator \ + --create-namespace \ + --namespace=azureserviceoperator-system \ + --set crdPattern='resources.azure.com/*;containerservice.azure.com/*;keyvault.azure.com/*;managedidentity.azure.com/*;eventhub.azure.com/*' +``` +Note: **bash** requires the value for `crdPattern` to be quoted with `'` to avoid expansion of the wildcards. +{{% /tab %}} +{{% tab header="PowerShell" %}} +``` powershell +PS> helm repo add aso2 https://raw.githubusercontent.com/Azure/azure-service-operator/main/v2/charts +PS> helm upgrade --install aso2 aso2/azure-service-operator ` + --create-namespace ` + --namespace=azureserviceoperator-system ` + --set crdPattern=resources.azure.com/*;containerservice.azure.com/*;keyvault.azure.com/*;managedidentity.azure.com/*;eventhub.azure.com/* +``` +{{% /tab %}} +{{% tab header="CMD" %}} +``` cmd +C:\> helm repo add aso2 https://raw.githubusercontent.com/Azure/azure-service-operator/main/v2/charts +C:\> helm upgrade --install aso2 aso2/azure-service-operator ^ + --create-namespace ^ + --namespace=azureserviceoperator-system ^ + --set crdPattern=resources.azure.com/*;containerservice.azure.com/*;keyvault.azure.com/*;managedidentity.azure.com/*;eventhub.azure.com/* +``` +{{% /tab %}} +{{< /tabpane >}} + +{{% alert title="Warning" color="warning" %}} +Make sure to set the `crdPattern` variable to include the CRDs you are interested in using. +You can use `--set crdPattern=*` to install all the CRDs, but be aware of the +[limits of the Kubernetes you are running](https://github.com/Azure/azure-service-operator/issues/2920). 
`*` is **not** +recommended on AKS Free-tier clusters. +See [CRD management](https://azure.github.io/azure-service-operator/guide/crd-management/) for more details. +{{% /alert %}} + + Alternatively you can install from the [release YAML directly](https://azure.github.io/azure-service-operator/guide/installing-from-yaml/). + +3. Create an Azure Service Principal. You'll need this to grant Azure Service Operator permissions to create resources + in your subscription. First, set the following environment variables to your Azure Tenant ID and Subscription ID with your values: @@ -78,13 +123,13 @@ C:\> SET AZURE_SUBSCRIPTION_ID= {{% /tab %}} {{< /tabpane >}} - You can find these values by using the Azure CLI: `az account show` +You can find these values by using the Azure CLI: `az account show` - Next, create a service principal with Contributor permissions for your subscription. +Next, create a service principal with Contributor permissions for your subscription. - You can optionally use a service principal with a more restricted permission set - (for example contributor to just a Resource Group), but that will restrict what you can - do with ASO. See [using reduced permissions](https://azure.github.io/azure-service-operator/guide/authentication/reducing-access/#using-a-credential-for-aso-with-reduced-permissions) for more details. +You can optionally use a service principal with a more restricted permission set +(for example contributor to just a Resource Group), but that will restrict what you can +do with ASO. See [using reduced permissions](https://azure.github.io/azure-service-operator/guide/authentication/reducing-access/#using-a-credential-for-aso-with-reduced-permissions) for more details. {{< tabpane text=true left=true >}} {{% tab header="**Shell**:" disabled=true /%}} @@ -144,63 +189,69 @@ C:\> SET AZURE_CLIENT_SECRET= {{% /tab %}} {{< /tabpane >}} -3. 
Install [the latest **v2+** Helm chart](https://github.com/Azure/azure-service-operator/tree/main/v2/charts): +Then create a secret named `aso-credential` in the namespace you'd like to create ASO resources in. {{< tabpane text=true left=true >}} {{% tab header="**Shell**:" disabled=true /%}} {{% tab header="bash" %}} -``` bash -$ helm repo add aso2 https://raw.githubusercontent.com/Azure/azure-service-operator/main/v2/charts -$ helm upgrade --install aso2 aso2/azure-service-operator \ - --create-namespace \ - --namespace=azureserviceoperator-system \ - --set azureSubscriptionID=$AZURE_SUBSCRIPTION_ID \ - --set azureTenantID=$AZURE_TENANT_ID \ - --set azureClientID=$AZURE_CLIENT_ID \ - --set azureClientSecret=$AZURE_CLIENT_SECRET \ - --set crdPattern='resources.azure.com/*;containerservice.azure.com/*;keyvault.azure.com/*;managedidentity.azure.com/*;eventhub.azure.com/*' +```bash +cat < helm repo add aso2 https://raw.githubusercontent.com/Azure/azure-service-operator/main/v2/charts -PS> helm upgrade --install aso2 aso2/azure-service-operator ` - --create-namespace ` - --namespace=azureserviceoperator-system ` - --set azureSubscriptionID=$AZURE_SUBSCRIPTION_ID ` - --set azureTenantID=$AZURE_TENANT_ID ` - --set azureClientID=$AZURE_CLIENT_ID ` - --set azureClientSecret=$AZURE_CLIENT_SECRET ` - --set crdPattern=resources.azure.com/*;containerservice.azure.com/*;keyvault.azure.com/*;managedidentity.azure.com/*;eventhub.azure.com/* +```powershell +@" +apiVersion: v1 +kind: Secret +metadata: + name: aso-credential + namespace: default +stringData: + AZURE_SUBSCRIPTION_ID: "$AZURE_SUBSCRIPTION_ID" + AZURE_TENANT_ID: "$AZURE_TENANT_ID" + AZURE_CLIENT_ID: "$AZURE_CLIENT_ID" + AZURE_CLIENT_SECRET: "$AZURE_CLIENT_SECRET" +"@ | kubectl apply -f - ``` {{% /tab %}} + {{% tab header="CMD" %}} -``` cmd -C:\> helm repo add aso2 https://raw.githubusercontent.com/Azure/azure-service-operator/main/v2/charts -C:\> helm upgrade --install aso2 aso2/azure-service-operator ^ - 
--create-namespace ^ - --namespace=azureserviceoperator-system ^ - --set azureSubscriptionID=%AZURE_SUBSCRIPTION_ID% ^ - --set azureTenantID=%AZURE_TENANT_ID% ^ - --set azureClientID=%AZURE_CLIENT_ID% ^ - --set azureClientSecret=%AZURE_CLIENT_SECRET% ^ - --set crdPattern=resources.azure.com/*;containerservice.azure.com/*;keyvault.azure.com/*;managedidentity.azure.com/*;eventhub.azure.com/* + +Create a file named `secret.yaml` with the following content. Replace each of the variables such as +`%AZURE_SUBSCRIPTION_ID%` with the subscription ID, `%AZURE_CLIENT_SECRET%` with the client secret and so on. + +``` +apiVersion: v1 +kind: Secret +metadata: + name: aso-credential + namespace: default +stringData: + AZURE_SUBSCRIPTION_ID: "%AZURE_SUBSCRIPTION_ID%" + AZURE_TENANT_ID: "%AZURE_TENANT_ID%" + AZURE_CLIENT_ID: "%AZURE_CLIENT_ID%" + AZURE_CLIENT_SECRET: "%AZURE_CLIENT_SECRET%" ``` -{{% /tab %}} -{{< /tabpane >}} -{{% alert title="Warning" color="warning" %}} -Make sure to set the `crdPattern` variable to include the CRDs you are interested in using. -You can use `--set crdPattern=*` to install all the CRDs, but be aware of the -[limits of the Kubernetes you are running](https://github.com/Azure/azure-service-operator/issues/2920). `*` is **not** -recommended on AKS Free-tier clusters. -See [CRD management](https://azure.github.io/azure-service-operator/guide/crd-management/) for more details. -{{% /alert %}} +Then run: `kubectl apply -f secret.yaml` - Alternatively you can install from the [release YAML directly](https://azure.github.io/azure-service-operator/guide/installing-from-yaml/). +{{% /tab %}} +{{< /tabpane >}} - To learn more about other authentication options, see the [authentication documentation](https://azure.github.io/azure-service-operator/guide/authentication/). +To learn more about other authentication options, see the +[authentication documentation](https://azure.github.io/azure-service-operator/guide/authentication/). 
### Usage @@ -218,7 +269,8 @@ To view the logs for the running ASO controller, take note of the pod name shown $ kubectl logs -n azureserviceoperator-system azureserviceoperator-controller-manager-5b4bfc59df-lfpqf manager ``` -Let's create an Azure ResourceGroup in westcentralus with the name "aso-sample-rg". Create a file called `rg.yaml` with the following contents: +Let's create an Azure ResourceGroup in westcentralus with the name "aso-sample-rg". Create a file called `rg.yaml` +with the following contents: ```yaml apiVersion: resources.azure.com/v1api20200601 diff --git a/docs/hugo/content/guide/_index.md b/docs/hugo/content/guide/_index.md index 438d58d24e6..576c6c2c7ca 100644 --- a/docs/hugo/content/guide/_index.md +++ b/docs/hugo/content/guide/_index.md @@ -94,5 +94,9 @@ How does ASO manage a [large number of CRDs]( {{< relref "crd-management" >}})? ASO exposes [metrics]( {{< relref "metrics" >}}) for Prometheus. +### Best practices + +See [best practices]( {{< relref "best-practices" >}}) for ASO best practices. + {{< /card >}} {{< /cardpane >}} diff --git a/docs/hugo/content/guide/asov1-asov2-migration/_index.md b/docs/hugo/content/guide/asov1-asov2-migration/_index.md index 8b48c8f197d..e57ceb7c55d 100644 --- a/docs/hugo/content/guide/asov1-asov2-migration/_index.md +++ b/docs/hugo/content/guide/asov1-asov2-migration/_index.md @@ -33,23 +33,34 @@ kubectl api-resources --api-group cert-manager.io` ``` should show `v1` resources. -> ℹ️ **Note:** We strongly recommend ensuring that you're running the latest version of cert-manager -> (1.14.4 at the time this article was written). - -> ℹ️ **Note:** We also recommend ensuring that ASOv1 is configured to use `cert-manager.io/v1` resources. -> You can run `helm upgrade` and pass `--set certManagerResourcesAPIVersion=cert-manager.io/v1` to ensure ASOv1 is using -> the v1 versions of the cert-manager resources. 
+{{% alert title="Warning" color="warning" %}} +Ensure that you've verified each ASOv1 resource you delete has the `skipreconcile=true` annotation before you delete it +in Kubernetes or else the deletion will propagate to Azure and delete the underlying Azure resource as well, which you do not want. +{{% /alert %}} + +{{% alert title="Note" %}} +We strongly recommend ensuring that you're running the latest version of cert-manager +(1.14.4 at the time this article was written). +{{% /alert %}} + +{{% alert title="Note" %}} +We also recommend ensuring that ASOv1 is configured to use `cert-manager.io/v1` resources. +You can run `helm upgrade` and pass `--set certManagerResourcesAPIVersion=cert-manager.io/v1` to ensure ASOv1 is using +the v1 versions of the cert-manager resources. +{{% /alert %}} ### Install ASOv2 Follow the [standard instructions](../../#installation). We recommend you use the same credentials as ASOv1 is currently using. -> ℹ️ **Note:** Make sure to follow the [guidelines](../crd-management/) for setting the `--crdPattern`. Configure ASOv2 -> to only manage the CRDs which you need. -> -> For example: if you are using storage accounts, resource groups, redis cache's, cosmosdbs, eventhubs, and SQL Azure, -> then you would configure -> `--set crdPattern='resources.azure.com/*;cache.azure.com/*;documentdb.azure.com/*;eventhub.azure.com/*;sql.azure.com/*;storage.azure.com/*'` +{{% alert title="Note" %}} +Make sure to follow the [guidelines](../crd-management/) for setting the `--crdPattern`. Configure ASOv2 +to only manage the CRDs which you need. 
+ +For example: if you are using storage accounts, resource groups, Redis caches, Cosmos DB accounts, Event Hubs, and Azure SQL, +then you would configure +`--set crdPattern='resources.azure.com/*;cache.azure.com/*;documentdb.azure.com/*;eventhub.azure.com/*;sql.azure.com/*;storage.azure.com/*'` +{{% /alert %}} ### Stop ASOv1 reconciliation @@ -68,8 +79,10 @@ kubectl annotate $(kubectl api-resources -o name | grep azure.microsoft.com | pa Once you have annotated the resources, double-check that they all have the `skipreconcile` annotation. -> ⚠️ **Important**: If you delete an ASOv1 resource that does not have this annotation it will delete the underlying -> Azure resource which may cause downtime or an outage. +{{% alert title="Warning" color="warning" %}} +If you delete an ASOv1 resource that does not have this annotation, it _**will**_ delete the underlying +Azure resource, which may cause downtime or an outage. +{{% /alert %}} ### Use `asoctl` to import the resources from Azure into ASOv2 @@ -147,13 +160,17 @@ spec: Once you have `resources.yaml` locally, examine it to ensure that it has the resources you expect. -> ℹ️ **Note**: `asoctl` is importing the resources from Azure. This means it will not preserve any Kubernetes labels/annotations -> you have on your ASOv1 resources. If your resources need certain labels or annotations, add those to `resources.yaml` -> manually or use the `--annotation` and `--label` arguments to `asoctl`. +{{% alert title="Note" %}} +`asoctl` is importing the resources from Azure. This means it cannot preserve any Kubernetes labels/annotations +you have on your ASOv1 resources. If your resources need certain labels or annotations, add those to `resources.yaml` +manually or use the `--annotation` and `--label` arguments to `asoctl`. +{{% /alert %}} -> ℹ️ **Note**: By default `asoctl` imports everything into the `default` namespace. Before applying any YAML -> created by `asoctl`, ensure that the namespaces for the resources are correct. 
You can do this by manually -> modifying the YAML, using a tool such as Kustomize, or using the `--namespace` argument to `asoctl`. +{{% alert title="Note" %}} +By default `asoctl` imports everything into the `default` namespace. Before applying any YAML +created by `asoctl`, ensure that the namespaces for the resources are correct. You can do this by manually +modifying the YAML, using a tool such as Kustomize, or using the `--namespace` argument to `asoctl`. +{{% /alert %}} ### Configure `resources.yaml` to export the secrets you need from Azure @@ -164,11 +181,13 @@ ASOv1 exports secrets from Azure according to the rules. ASOv1 always exports these secrets, there is no user configuration required on either the name of the secret or its values. In contrast, ASOv2 requires the user to opt-in to secret export, via the `spec.operatorSpec.secrets` property. -> ℹ️ **Note**: ASOv2 also splits secrets/configuration into two categories: Secrets provided by you, and secrets generated -> by Azure. Exporting secrets from Azure is done via the `spec.operatorSpec.secrets`, while supplying secrets to Azure -> is done by passing the reference to a secret key/value, such as via the -> [administratorLoginPassword field of Azure SQL](../../reference/sql/v1api20211101#sql.azure.com/v1api20211101.Server_Spec). -> This means that there may be cases where a single secret in ASOv1 becomes 2 secrets in ASOv2, one for inputs and one for outputs. +{{% alert title="Note" %}} +ASOv2 also splits secrets into two categories: Secrets provided by you, and secrets generated +by Azure. Exporting secrets from Azure is done via the `spec.operatorSpec.secrets`, while supplying secrets to Azure +is done by passing the reference to a secret key/value, such as via the +[administratorLoginPassword field of Azure SQL](../../reference/sql/v1api20211101#sql.azure.com/v1api20211101.Server_Spec). 
+This means that there may be cases where a single secret in ASOv1 becomes 2 secrets in ASOv2, one for inputs and one for outputs. +{{% /alert %}} You may need to configure `resouces.yaml` to export corresponding secrets. To determine if, for a given namespace, there are secrets being written by ASOv1 and consumed by your applications, check the following two things: @@ -206,13 +225,16 @@ Once you've identified the set of secrets which are exported by ASOv1 and which configure ASOv2 to export similar secrets by using the `spec.operatorSpec.secrets` field. See [examples](#examples) for examples of various resource types. -> ℹ️ **Note:** It is strongly recommended that you export the ASOv2 secrets to a _different_ secret name than the -> ASOv1 secret. ASOv2 will not allow you to overwrite an existing secret with `spec.operatorSpec.secrets`. -> You'll get an error that looks like -> "cannot overwrite Secret ns1/storageaccount-cutoverteststorage1 which is not owned by StorageAccount.storage.azure.com -> ns1/aso-migration-test-cutoverteststorage1". -> Instead of overwriting the same secret, create a different secret with ASOv2 (recommend including a suffix to -> identify it such as -asov2). Then, when you're ready, swap your deployment to use the new ASOv2 secrets. + +{{% alert title="Note" %}} +It is strongly recommended that you export the ASOv2 secrets to a _different_ secret name than the +ASOv1 secret. ASOv2 will not allow you to overwrite an existing secret with `spec.operatorSpec.secrets`. +You'll get an error that looks like +"cannot overwrite Secret ns1/storageaccount-cutoverteststorage1 which is not owned by StorageAccount.storage.azure.com +ns1/aso-migration-test-cutoverteststorage1". +Instead of overwriting the same secret, create a different secret with ASOv2 (recommend including a suffix to +identify it such as -asov2). Then, when you're ready, swap your deployment to use the new ASOv2 secrets. 
+{{% /alert %}} ### Configure `resources.yaml` to source secret values from Kubernetes @@ -246,11 +268,13 @@ This can be done automatically by `asoctl` during resource import with the `-annotation serviceoperator.azure.com/reconcile-policy=skip` argument or manually by modifying `resources.yaml` afterward. -> ⚠️ **Important:** Be _very_ careful issuing deletions of ASOv2 resources once you've imported them if they do not have -> the serviceoperator.azure.com/reconcile-policy=skip annotation. Just like ASOv1, -> ASOv2 will delete the underlying Azure resource by default! -> It is _strongly_ recommended that you use asoctl's `-annotation serviceoperator.azure.com/reconcile-policy=skip` flag -> and only remove that annotation a few resources at a time to ensure things are working as you expect. +{{% alert title="Warning" color="warning" %}} +Be _very_ careful issuing deletions of ASOv2 resources once you've imported them if they do not have +the `serviceoperator.azure.com/reconcile-policy=skip` annotation. Just like ASOv1, +ASOv2 _**will**_ delete the underlying Azure resource by default! +It is _strongly_ recommended that you use asoctl's `-annotation serviceoperator.azure.com/reconcile-policy=skip` flag +and only remove that annotation a few resources at a time to ensure things are working as you expect. +{{% /alert %}} ### Apply `resources.yaml` to your cluster @@ -276,8 +300,10 @@ to rely on the ASOv2 secrets. Once you've migrated the ASOv1 resources to ASOv2 and been running successfully for a while, you can delete the ASOv1 resources in Kubernetes with `kubectl delete`. -> ⚠️ **Important:** Make sure that each ASOv1 resource you delete has the `skipreconcile=true` annotation before you delete it -> in Kubernetes or else the deletion will propagate to Azure and delete the underlying Azure resource as well, which you do not want. 
+{{% alert title="Warning" color="warning" %}} +Make sure that each ASOv1 resource you delete has the `skipreconcile=true` annotation before you delete it +in Kubernetes or else the deletion will propagate to Azure and delete the underlying Azure resource as well, which you do not want. +{{% /alert %}} ## Examples diff --git a/docs/hugo/content/guide/asov1-asov2-migration/storage.md b/docs/hugo/content/guide/asov1-asov2-migration/storage.md index a9d96c3c291..ab753ad8826 100644 --- a/docs/hugo/content/guide/asov1-asov2-migration/storage.md +++ b/docs/hugo/content/guide/asov1-asov2-migration/storage.md @@ -39,8 +39,3 @@ spec: Once you've applied the above, make sure to update your applications to depend on the new secret written by ASOv2. - -TODO: How to work around connection string issues? -1. init container? https://stackoverflow.com/questions/77748779/how-to-prepare-environment-variables-with-init-containers-in-a-kubernetes-deploy -2. Support https://github.com/Azure/azure-service-operator/issues/3446? -3. Modify the command of the container itself to run a script first? diff --git a/docs/hugo/content/guide/authentication/_index.md b/docs/hugo/content/guide/authentication/_index.md index 6b66605d1ec..8917aa07931 100644 --- a/docs/hugo/content/guide/authentication/_index.md +++ b/docs/hugo/content/guide/authentication/_index.md @@ -22,7 +22,7 @@ Azure Service Operator supports four different styles of authentication today. Each supported credential type can be specified at one of three supported scopes: -1. [Global]( {{< relref "credential-scope#global-scope" >}} ) - The credential applies to all resources managed by ASO. +1. [Not recommended] [Global]( {{< relref "credential-scope#global-scope" >}} ) - The credential applies to all resources managed by ASO. 2. [Namespace]( {{< relref "credential-scope#namespace-scope" >}} ) - The credential applies to all resources managed by ASO in that namespace. 3. 
[Resource]( {{< relref "credential-scope#resource-scope" >}} ) - The credential applies to only the specific resource it is referenced on. diff --git a/docs/hugo/content/guide/authentication/credential-scope.md b/docs/hugo/content/guide/authentication/credential-scope.md index 6b15b016898..871b857c471 100644 --- a/docs/hugo/content/guide/authentication/credential-scope.md +++ b/docs/hugo/content/guide/authentication/credential-scope.md @@ -8,6 +8,12 @@ _resource scope_ takes precedence over _namespace scope_ which takes precedence ## Global scope +{{% alert title="Warning" color="warning" %}} +Be careful when using the global scope credential. A user in any namespace in your cluster will have the ability +to do everything that the global credential can do. For security best practice we recommend using namespace scoped +or resource scoped credentials. See [security best practices]({{< relref "/guide/best-practices/security" >}}) for more details. +{{% /alert %}} + The global credential resides in the `aso-controller-settings` secret deployed as part of operator deployment in operator's namespace. This is the scope configured when configuring credentials via the Helm chart installation. diff --git a/docs/hugo/content/guide/best-practices/_index.md b/docs/hugo/content/guide/best-practices/_index.md new file mode 100644 index 00000000000..cb88565a23a --- /dev/null +++ b/docs/hugo/content/guide/best-practices/_index.md @@ -0,0 +1,55 @@ +--- +title: "Best Practices" +linkTitle: "Best Practices" +--- + +## Managing multiple copies of the same resource + +### Transferring resources from one cluster to another + +There are two important tenets to remember when transferring resources between clusters: +1. Don't accidentally delete the resources in Azure during the transfer. +2. Don't have two instances of ASO fighting to reconcile the same resource to different states. 
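
Both tenets rest on the `serviceoperator.azure.com/reconcile-policy: skip` annotation, which stops ASO from updating or deleting the Azure resource behind an object. As a minimal sketch, an annotated resource in the source cluster might look like the following. The resource group name, namespace, and location are illustrative values borrowed from the quick-start sample, not requirements.

```yaml
# Illustrative only: a ResourceGroup that ASO will neither update nor delete in Azure.
apiVersion: resources.azure.com/v1api20200601
kind: ResourceGroup
metadata:
  name: aso-sample-rg          # sample name, assumed for illustration
  namespace: default
  annotations:
    # With this annotation set, deleting the object in Kubernetes
    # leaves the backing Azure resource group in place.
    serviceoperator.azure.com/reconcile-policy: skip
spec:
  location: westcentralus
```

The copy applied to the destination cluster or namespace would omit the annotation so that exactly one ASO instance actively reconciles the resource.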
+ +If you want to migrate all of your ASO resources from cluster A to cluster B, we recommend the following +pattern: + +1. Annotate the resources in cluster A with + [serviceoperator.azure.com/reconcile-policy: skip]( {{< relref "annotations#serviceoperatorazurecomreconcile-policy" >}} ). + This prevents ASO in that cluster from updating or deleting those resources. +2. Ensure that cluster B has ASO installed. +3. `kubectl apply` the resources into cluster B. We strongly recommend an infrastructure-as-code approach where you + keep your original/goal-state ASO YAMLs around. + - Ensure these resources do not have the `serviceoperator.azure.com/reconcile-policy: skip` annotation set. +4. Delete the resources in cluster A. Note that because of the `skip` annotation, this will not delete the backing + Azure resources. + +### Transferring resources from one namespace to another + +See [above](#transferring-resources-from-one-cluster-to-another). The process is +the same for moving between namespaces. + +## Common cluster architectures + +It's easiest to reason about ASO and what it is managing in Azure if you set up a simple mapping between +Kubernetes entities (clusters and namespaces) and the Azure resources being managed (subscriptions and resource groups). +These architectures are often used with GitOps tools such as Flux or ArgoCD. + +See also: +- [Reducing access]({{< relref "reducing-access" >}}). +- [Security best practices]({{< relref "security" >}}). + +### Cluster per environment + +Environments like `dev`, `test`, and `prod` each have dedicated clusters with their own copy of ASO installed. ASO is +configured with a [global credential]({{< relref "credential-scope#global-scope" >}}) that has permissions to manage the +environment in question. + +### Namespace per environment + +Environments like `dev`, `test`, and `prod` each have dedicated namespaces within your Kubernetes cluster. 
Each namespace +has an `aso-credential` with permissions to manage the environment in question. + +A variant on this is **namespace per developer**, where each developer gets their own `dev-alice` or `dev-bob` namespace, +rather than the whole team sharing a single `dev` namespace. Each `dev` namespace can point to a separate dev +subscription or share the same dev subscription. diff --git a/docs/hugo/content/guide/best-practices/security.md b/docs/hugo/content/guide/best-practices/security.md new file mode 100644 index 00000000000..e1a062ae6f7 --- /dev/null +++ b/docs/hugo/content/guide/best-practices/security.md @@ -0,0 +1,83 @@ +--- +title: Security Best Practices +linktitle: Security +--- + +## Securing ASO in your cluster + +ASO has 3 levers that allow you to manage access to your Azure resources: + +1. Controlling [which CRDs are installed]({{< relref "crd-management" >}}). +2. Controlling the Azure identities used by ASO at each [scope]({{< relref "credential-scope" >}}), + including the Azure RBAC permissions assigned to those identities. +3. Controlling the Kubernetes identities that use the cluster and their Kubernetes RBAC permissions. + +We recommend making use of all 3 of these levers to fully secure a cluster running ASO. + +## Dos and Don'ts + +> ✅ DO adopt this pattern. +> +> ⛔ DO NOT adopt this pattern. + +### General guidance + +✅ DO use [Azure Workload Identity]( {{< relref "credential-format#azure-workload-identity" >}} ) for all +credentials. Other supported identity types are called out +[in the authentication documentation]( {{< relref "authentication#credential-type" >}} ). + +✅ DO use namespace-scoped ASO credentials, rather than global scope. Note that the global scope credential is _optional_ +and may be omitted when installing ASO. 
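
As a sketch of what a namespace-scoped credential can look like, the manifest below places an `aso-credential` secret in a hypothetical `dev` namespace. It reuses the service principal secret shape from the installation instructions; the upper-case values are placeholders to replace, and a Workload Identity setup would differ (see the authentication documentation for that variant).

```yaml
# Illustrative only: a namespace-scoped ASO credential for a hypothetical "dev" namespace.
apiVersion: v1
kind: Secret
metadata:
  name: aso-credential
  namespace: dev               # the credential applies only to ASO resources in this namespace
stringData:
  AZURE_SUBSCRIPTION_ID: "REPLACE_WITH_DEV_SUBSCRIPTION_ID"
  AZURE_TENANT_ID: "REPLACE_WITH_TENANT_ID"
  AZURE_CLIENT_ID: "REPLACE_WITH_DEV_IDENTITY_CLIENT_ID"
  AZURE_CLIENT_SECRET: "REPLACE_WITH_DEV_IDENTITY_CLIENT_SECRET"
```

A separate secret with a different identity in each namespace keeps the blast radius of any one credential limited to the environment that namespace manages.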
+ +✅ DO follow the [principle of least privilege](https://learn.microsoft.com/entra/identity/role-based-access-control/best-practices#1-apply-principle-of-least-privilege) +when assigning roles to identities which will be used by ASO. Remember, users with access to the namespace the +ASO credential is in can do everything that credential can do via ASO. This means that if users in namespace `a` +are supposed to have broad permissions only to resources in resourceGroup `a`, then the ASO +identity for namespace `a` should have **Contributor** only on resourceGroup `a` and not the whole +subscription. See [reducing access]( {{< relref "reducing-access" >}} ) for more details on managing Azure access. + +✅ DO restrict access to sensitive namespaces in the cluster. On AKS, you can use a combination of +[AAD (now Entra) integration](https://learn.microsoft.com/en-us/azure/aks/enable-authentication-microsoft-entra-id), +[disabling local users](https://learn.microsoft.com/en-us/azure/aks/manage-local-accounts-managed-azure-ad), and +defining [JIT/Conditional access policies](https://learn.microsoft.com/en-us/azure/aks/access-control-managed-azure-ad). +We strongly recommend setting up conditional access policies for sensitive namespaces such as `production`. Doubly so +if the ASO credential for that namespace has broad scope. + +✅ DO use tools like ArgoCD or Flux to perform code review of changes to ASO CRs before allowing them to be +merged and applied. + +✅ DO only install the ASO CRDs you need, no more. + +⛔ DO NOT install the `RoleAssignment` CRD if you don't need it. This CRD can enable escalation of privilege if not +used carefully. If using the `RoleAssignment`, follow the other DOs in this guide to do it safely. + +## An Example Setup + +A possible setup might be a `dev` namespace set up as a development environment pointing at a development subscription, +and a `prod` namespace set up as a production env. 
The `dev` namespace might point to a development subscription +and the `prod` namespace to a production subscription. + +The `dev` namespace has Azure credentials with the Contributor role on the `dev` subscription, +and the `prod` namespace has the same on the production subscription. Developers in `dev` are members of an Azure AD group with roles that give +access to CRUD ASO CRDs and other Kubernetes resources (Pods, etc.) in the `dev` namespace, but _not_ the production +namespace. + +This means that developers can do basically whatever they want in the dev namespace, including assigning roles to +themselves at the Azure level in the dev subscription. + +The `prod` namespace _also_ has an Azure AD group with roles that give access to CRUD ASO CRDs and other +Kubernetes resources, but that group is empty by default. +Users can use [JIT/Conditional access policies](https://learn.microsoft.com/azure/aks/access-control-managed-azure-ad) +to escalate into that group. This means that by default, nobody can do anything in the `prod` namespace to +either the Kubernetes resources (Pods, etc.) or the Azure resources via ASO or the portal. + +When a user needs ad-hoc access to the `prod` namespace, they can go through the JIT process to get +it. Standard deployments to `prod` should be done through a CI/CD tool +like Argo or Flux. This has the advantage of ensuring that proposed changes to `prod` need to first meet the merge bar +(pass through code review and other processes) to make it into the repo before Argo/Flux will deploy them to `prod`. +The conditional access/JIT path is a break-glass mechanism that should be used rarely. + +Note that the above is just _one_ way to lay things out. The same ideas can be applied to a `dev` and `prod` resource group +within a single subscription and also to other more complex topologies. 
`test` or `int` can be added in the middle with a more +locked-down set of rules than dev but less locked down than prod (or maybe `test` and `prod` have very similar +lockdowns to force the same procedures across both). diff --git a/docs/hugo/content/guide/frequently-asked-questions.md b/docs/hugo/content/guide/frequently-asked-questions.md index 4b1ca8b1b82..081646afc30 100644 --- a/docs/hugo/content/guide/frequently-asked-questions.md +++ b/docs/hugo/content/guide/frequently-asked-questions.md @@ -8,7 +8,27 @@ weight: -2 We ship updates to ASO as needed, with an eye towards releasing every 1-2 months. If there are urgent fixes, a release may happen more quickly than that. If there haven't been any major changes (or there are ongoing major changes that are taking a long time) a -release may happen more slowly. +release may happen more slowly. For an up-to-date plan, check the +[milestone tracker](https://github.com/Azure/azure-service-operator/milestones). + +### How are CVEs dealt with? + +The ASO controller container is built on [distroless](https://github.com/GoogleContainerTools/distroless), and as such +has a relatively minimal surface area. Most CVEs that impact ASO are related to the Go packages used by +the controller. + +We scan for new CVEs weekly using [Trivy](https://github.com/aquasecurity/trivy) and also get proactive updates via +[Dependabot](https://docs.github.com/code-security/dependabot/dependabot-alerts/about-dependabot-alerts). + +CVEs are triaged according to their severity (Low, Moderate, High, Critical) and whether they are exploitable in ASO. +Low and moderate severity issues will be fixed in the next minor release of ASO; high and critical severity CVEs that +can be exploited in ASO will have a patch released for them. + +Note that we cannot patch CVEs for which there is no upstream fix. Only once an upstream fix has been released will ASO +fix the CVE. + +Fixes are _not_ backported to older versions of ASO. 
If you're running v2.5.0 and a CVE is fixed in v2.7.0, you must +upgrade to v2.7.0 to get the fix. ### What is the support model? @@ -30,22 +50,9 @@ If the underlying Azure Resource doesn't support DR (or the story is more compli There's also a proposal for [more general upstream support](https://github.com/kubernetes/kubernetes/issues/10179) on this topic, although there hasn't been movement on it in a while. -### What is the best practice for transferring ASO resources from one cluster to another? - -There are two important tenets to remember when transferring resources between clusters: -1. Don't accidentally delete the resources in Azure during the transfer. -2. Don't have two instances of ASO fighting to reconcile the same resource to different states. - -Let's say that you want to migrate all of your ASO resources from cluster A to cluster B. We recommend the following pattern: - -1. Annotate the resources in cluster A with [serviceoperator.azure.com/reconcile-policy: skip]( {{< relref "annotations#serviceoperatorazurecomreconcile-policy" >}} ). This prevents ASO in that cluster from updating or deleting those resources. -2. Ensure that cluster B has ASO installed. -3. `kubectl apply` the resources into cluster B. We strongly recommend an infrastructure-as-code approach where you keep your original/goal-state ASO YAMLs around. -4. Delete the resources in cluster A. Note that because of the `skip` annotation, this will not delete the backing Azure resources. - -### What is the best practice for transferring ASO resources from one namespace to another? +### What are some ASO best practices? -See [above](#what-is-the-best-practice-for-transferring-aso-resources-from-one-cluster-to-another). The process is the same for moving between namespaces. +See [best practices]( {{< relref "best-practices" >}} ). ### Can I run ASO in active-active mode?