docs: refactor AKS installation instructions #23304

Merged — 1 commit (Feb 1, 2023)
71 changes: 12 additions & 59 deletions Documentation/gettingstarted/k8s-install-default.rst
@@ -56,14 +56,7 @@ to create a Kubernetes cluster locally or using a managed Kubernetes service:

Please make sure to read and understand the documentation page on :ref:`taint effects and unmanaged pods<taint_effects>`.

.. group-tab:: AKS (BYOCNI)

.. note::

BYOCNI is the preferred way to run Cilium on AKS, however integration
with the Azure stack via the :ref:`Azure IPAM<ipam_azure>` is not
available. If you require Azure IPAM, refer to the AKS (Azure IPAM)
installation.
.. group-tab:: AKS

The following commands create a Kubernetes cluster using `Azure
Kubernetes Service <https://docs.microsoft.com/en-us/azure/aks/>`_ with
@@ -74,11 +67,6 @@ to create a Kubernetes cluster locally or using a managed Kubernetes service:
<https://docs.microsoft.com/en-us/azure/aks/use-byo-cni?tabs=azure-cli>`_
for more details about BYOCNI prerequisites / implications.

.. note::

BYOCNI requires the ``aks-preview`` CLI extension with version >=
0.5.55, which itself requires an ``az`` CLI version >= 2.32.0 .

.. code-block:: bash

export NAME="$(whoami)-$RANDOM"
@@ -94,43 +82,6 @@ to create a Kubernetes cluster locally or using a managed Kubernetes service:
# Get the credentials to access the cluster with kubectl
az aks get-credentials --resource-group "${AZURE_RESOURCE_GROUP}" --name "${NAME}"
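
As a quick sanity check (not part of this change), one can confirm that the
cluster really was created without a managed CNI, reusing the
``AZURE_RESOURCE_GROUP`` and ``NAME`` variables from the snippet above:

.. code-block:: bash

   # For BYOCNI clusters this prints "none".
   az aks show \
     --resource-group "${AZURE_RESOURCE_GROUP}" \
     --name "${NAME}" \
     --query "networkProfile.networkPlugin" \
     --output tsv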

.. group-tab:: AKS (Azure IPAM)

.. note::

:ref:`Azure IPAM<ipam_azure>` offers integration with the Azure stack
but is not the preferred way to run Cilium on AKS. If you do not
require Azure IPAM, we recommend you to switch to the AKS (BYOCNI)
installation.

The following commands create a Kubernetes cluster using `Azure
Kubernetes Service <https://docs.microsoft.com/en-us/azure/aks/>`_. See
`Azure Cloud CLI
<https://docs.microsoft.com/en-us/cli/azure/install-azure-cli?view=azure-cli-latest>`_
for instructions on how to install ``az`` and prepare your account.

.. code-block:: bash

export NAME="$(whoami)-$RANDOM"
export AZURE_RESOURCE_GROUP="${NAME}-group"
az group create --name "${AZURE_RESOURCE_GROUP}" -l westus2

# Create AKS cluster
az aks create \
--resource-group "${AZURE_RESOURCE_GROUP}" \
--name "${NAME}" \
--network-plugin azure \
--node-count 2

# Get the credentials to access the cluster with kubectl
az aks get-credentials --resource-group "${AZURE_RESOURCE_GROUP}" --name "${NAME}"

.. attention::

Do NOT specify the ``--network-policy`` flag when creating the
cluster, as this will cause the Azure CNI plugin to install unwanted
iptables rules.

.. group-tab:: EKS

The following commands create a Kubernetes cluster with ``eksctl``
@@ -271,9 +222,7 @@ You can install Cilium on any Kubernetes cluster. Pick one of the options below:

cilium install

.. group-tab:: AKS (BYOCNI)

.. include:: ../installation/requirements-aks-byocni.rst
.. group-tab:: AKS

**Install Cilium:**

@@ -283,17 +232,21 @@ You can install Cilium on any Kubernetes cluster. Pick one of the options below:

cilium install --azure-resource-group "${AZURE_RESOURCE_GROUP}"

.. group-tab:: AKS (Azure IPAM)
The Cilium CLI will automatically install Cilium using one of the
following installation modes based on the ``--network-plugin``
configuration detected from the AKS cluster:

.. include:: ../installation/requirements-aks-azure-ipam.rst
.. include:: ../installation/requirements-aks.rst

**Install Cilium:**
.. tabs::

Install Cilium into the AKS cluster:
.. tab:: BYOCNI

.. code-block:: shell-session
.. include:: ../installation/requirements-aks-byocni.rst

cilium install --azure-resource-group "${AZURE_RESOURCE_GROUP}"
.. tab:: Legacy Azure IPAM

.. include:: ../installation/requirements-aks-azure-ipam.rst
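
Whichever of the two modes the CLI detects, a minimal follow-up check (an
editorial sketch, not part of this diff) is to wait for Cilium to report
ready and, optionally, run the built-in connectivity test:

.. code-block:: shell-session

   # Block until all Cilium components are up.
   cilium status --wait

   # Optional end-to-end validation of the datapath.
   cilium connectivity test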

.. group-tab:: EKS

104 changes: 55 additions & 49 deletions Documentation/installation/k8s-install-helm.rst
@@ -81,73 +81,79 @@ Install Cilium
* Reconfigure kubelet to run in CNI mode
* Mount the eBPF filesystem

.. group-tab:: AKS (BYOCNI)
.. group-tab:: AKS

.. include:: requirements-aks-byocni.rst
.. include:: ../installation/requirements-aks.rst

**Install Cilium:**
.. tabs::

Deploy Cilium release via Helm:
.. tab:: BYOCNI

.. parsed-literal::
.. include:: ../installation/requirements-aks-byocni.rst

helm install cilium |CHART_RELEASE| \\
--namespace kube-system \\
--set aksbyocni.enabled=true \\
--set nodeinit.enabled=true
**Install Cilium:**

.. group-tab:: AKS (Azure IPAM)
Deploy Cilium release via Helm:

.. include:: requirements-aks-azure-ipam.rst
.. parsed-literal::

**Create a Service Principal:**
helm install cilium |CHART_RELEASE| \\
--namespace kube-system \\
--set aksbyocni.enabled=true \\
--set nodeinit.enabled=true
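
Once the Helm release above is installed, a minimal verification sketch
(assuming ``kubectl`` is pointed at the AKS cluster; the DaemonSet name
``cilium`` is the chart default) is:

.. code-block:: shell-session

   kubectl -n kube-system rollout status daemonset/cilium
   kubectl -n kube-system get pods -l k8s-app=cilium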

In order to allow cilium-operator to interact with the Azure API, a
Service Principal with ``Contributor`` privileges over the AKS cluster is
required (see :ref:`Azure IPAM required privileges <ipam_azure_required_privileges>`
for more details). It is recommended to create a dedicated Service
Principal for each Cilium installation with minimal privileges over the
AKS node resource group:
.. tab:: Legacy Azure IPAM

.. code-block:: shell-session
.. include:: ../installation/requirements-aks-azure-ipam.rst

AZURE_SUBSCRIPTION_ID=$(az account show --query "id" --output tsv)
AZURE_NODE_RESOURCE_GROUP=$(az aks show --resource-group ${RESOURCE_GROUP} --name ${CLUSTER_NAME} --query "nodeResourceGroup" --output tsv)
AZURE_SERVICE_PRINCIPAL=$(az ad sp create-for-rbac --scopes /subscriptions/${AZURE_SUBSCRIPTION_ID}/resourceGroups/${AZURE_NODE_RESOURCE_GROUP} --role Contributor --output json --only-show-errors)
AZURE_TENANT_ID=$(echo ${AZURE_SERVICE_PRINCIPAL} | jq -r '.tenant')
AZURE_CLIENT_ID=$(echo ${AZURE_SERVICE_PRINCIPAL} | jq -r '.appId')
AZURE_CLIENT_SECRET=$(echo ${AZURE_SERVICE_PRINCIPAL} | jq -r '.password')
**Create a Service Principal:**

.. note::
In order to allow cilium-operator to interact with the Azure API, a
Service Principal with ``Contributor`` privileges over the AKS cluster is
required (see :ref:`Azure IPAM required privileges <ipam_azure_required_privileges>`
for more details). It is recommended to create a dedicated Service
Principal for each Cilium installation with minimal privileges over the
AKS node resource group:

The ``AZURE_NODE_RESOURCE_GROUP`` node resource group is *not* the
resource group of the AKS cluster. A single resource group may hold
multiple AKS clusters, but each AKS cluster regroups all resources in
an automatically managed secondary resource group. See `Why are two
resource groups created with AKS? <https://docs.microsoft.com/en-us/azure/aks/faq#why-are-two-resource-groups-created-with-aks>`__
for more details.
.. code-block:: shell-session

This ensures the Service Principal only has privileges over the AKS
cluster itself and not any other resources within the resource group.
AZURE_SUBSCRIPTION_ID=$(az account show --query "id" --output tsv)
AZURE_NODE_RESOURCE_GROUP=$(az aks show --resource-group ${RESOURCE_GROUP} --name ${CLUSTER_NAME} --query "nodeResourceGroup" --output tsv)
AZURE_SERVICE_PRINCIPAL=$(az ad sp create-for-rbac --scopes /subscriptions/${AZURE_SUBSCRIPTION_ID}/resourceGroups/${AZURE_NODE_RESOURCE_GROUP} --role Contributor --output json --only-show-errors)
AZURE_TENANT_ID=$(echo ${AZURE_SERVICE_PRINCIPAL} | jq -r '.tenant')
AZURE_CLIENT_ID=$(echo ${AZURE_SERVICE_PRINCIPAL} | jq -r '.appId')
AZURE_CLIENT_SECRET=$(echo ${AZURE_SERVICE_PRINCIPAL} | jq -r '.password')
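
Before installing, the scope of the freshly created Service Principal can be
reviewed (an illustrative check, not part of this change); it should list a
single ``Contributor`` assignment scoped to the node resource group:

.. code-block:: shell-session

   az role assignment list \
     --assignee "${AZURE_CLIENT_ID}" \
     --output table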

**Install Cilium:**
.. note::

Deploy Cilium release via Helm:
The ``AZURE_NODE_RESOURCE_GROUP`` node resource group is *not* the
resource group of the AKS cluster. A single resource group may hold
multiple AKS clusters, but each AKS cluster regroups all resources in
an automatically managed secondary resource group. See `Why are two
resource groups created with AKS? <https://docs.microsoft.com/en-us/azure/aks/faq#why-are-two-resource-groups-created-with-aks>`__
for more details.

.. parsed-literal::
This ensures the Service Principal only has privileges over the AKS
cluster itself and not any other resources within the resource group.

helm install cilium |CHART_RELEASE| \\
--namespace kube-system \\
--set azure.enabled=true \\
--set azure.resourceGroup=$AZURE_NODE_RESOURCE_GROUP \\
--set azure.subscriptionID=$AZURE_SUBSCRIPTION_ID \\
--set azure.tenantID=$AZURE_TENANT_ID \\
--set azure.clientID=$AZURE_CLIENT_ID \\
--set azure.clientSecret=$AZURE_CLIENT_SECRET \\
--set tunnel=disabled \\
--set ipam.mode=azure \\
--set enableIPv4Masquerade=false \\
--set nodeinit.enabled=true
**Install Cilium:**

Deploy Cilium release via Helm:

.. parsed-literal::

helm install cilium |CHART_RELEASE| \\
--namespace kube-system \\
--set azure.enabled=true \\
--set azure.resourceGroup=$AZURE_NODE_RESOURCE_GROUP \\
--set azure.subscriptionID=$AZURE_SUBSCRIPTION_ID \\
--set azure.tenantID=$AZURE_TENANT_ID \\
--set azure.clientID=$AZURE_CLIENT_ID \\
--set azure.clientSecret=$AZURE_CLIENT_SECRET \\
--set tunnel=disabled \\
--set ipam.mode=azure \\
--set enableIPv4Masquerade=false \\
--set nodeinit.enabled=true
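
After deploying the release above, one way (an illustrative sketch, not part
of this diff) to double-check that the legacy Azure IPAM mode was applied is
to inspect the rendered agent configuration:

.. code-block:: shell-session

   # Expect to see the IPAM mode set to azure and tunneling disabled.
   kubectl -n kube-system get configmap cilium-config -o yaml | grep -E 'ipam|tunnel'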

.. group-tab:: EKS

67 changes: 22 additions & 45 deletions Documentation/installation/requirements-aks-azure-ipam.rst
@@ -1,65 +1,42 @@
To install Cilium on `Azure Kubernetes Service (AKS) <https://docs.microsoft.com/en-us/azure/aks/>`_
with Azure integration via :ref:`Azure IPAM<ipam_azure>`, perform the following
steps:

**Default Configuration:**

=============== =================== ==============
Datapath        IPAM                Datastore
=============== =================== ==============
Direct Routing  Azure IPAM          Kubernetes CRD
=============== =================== ==============

.. note::

:ref:`Azure IPAM<ipam_azure>` offers integration with the Azure stack but is
not the preferred way to run Cilium on AKS. If you do not require Azure IPAM,
we recommend you to switch to the AKS (BYOCNI) installation.

.. tip::

If you want to chain Cilium on top of the Azure CNI, refer to the guide
:ref:`chaining_azure`.

**Requirements:**

* The AKS cluster must be created with ``--network-plugin azure`` for
compatibility with Cilium. The Azure network plugin will be replaced with
Cilium by the installer.
* The AKS cluster must be created with ``--network-plugin azure``. The
Azure network plugin will be replaced with Cilium by the installer.

**Limitations:**

* All VMs and VM scale sets used in a cluster must belong to the same resource
group.
* All VMs and VM scale sets used in a cluster must belong to the same
resource group.

* Adding new nodes to node pools might result in application pods being
scheduled on the new nodes before Cilium is ready to properly manage them.
The only way to fix this is either by making sure application pods are not
scheduled on new nodes before Cilium is ready, or by restarting any unmanaged
pods on the nodes once Cilium is ready.
scheduled on the new nodes before Cilium is ready to properly manage
them. The only way to fix this is either by making sure application pods
are not scheduled on new nodes before Cilium is ready, or by restarting
any unmanaged pods on the nodes once Cilium is ready.

Ideally we would recommend node pools should be tainted with
``node.cilium.io/agent-not-ready=true:NoExecute`` to ensure application pods
will only be scheduled/executed once Cilium is ready to manage them (see
:ref:`Considerations on node pool taints and unmanaged pods <taint_effects>`
``node.cilium.io/agent-not-ready=true:NoExecute`` to ensure application
pods will only be scheduled/executed once Cilium is ready to manage them
(see :ref:`Considerations on node pool taints and unmanaged pods <taint_effects>`
for more details), however this is not an option on AKS clusters:

* It is not possible to assign custom node taints such as
``node.cilium.io/agent-not-ready=true:NoExecute`` to system node pools,
cf. `Azure/AKS#2578 <https://github.com/Azure/AKS/issues/2578>`_: only
``CriticalAddonsOnly=true:NoSchedule`` is available for our use case. To
make matters worse, it is not possible to assign taints to the initial node
pool created for new AKS clusters, cf.
``node.cilium.io/agent-not-ready=true:NoExecute`` to system node
pools, cf. `Azure/AKS#2578 <https://github.com/Azure/AKS/issues/2578>`_:
only ``CriticalAddonsOnly=true:NoSchedule`` is available for our use
case. To make matters worse, it is not possible to assign taints to
the initial node pool created for new AKS clusters, cf.
`Azure/AKS#1402 <https://github.com/Azure/AKS/issues/1402>`_.

* Custom node taints on user node pools cannot be properly managed at will
anymore, cf. `Azure/AKS#2934 <https://github.com/Azure/AKS/issues/2934>`_.
* Custom node taints on user node pools cannot be properly managed at
will anymore, cf. `Azure/AKS#2934 <https://github.com/Azure/AKS/issues/2934>`_.

* These issues prevent usage of our previously recommended scenario via
replacement of initial system node pool with
``CriticalAddonsOnly=true:NoSchedule`` and usage of additional user
node pools with ``node.cilium.io/agent-not-ready=true:NoExecute``.

We do not have a standard and foolproof alternative to recommend, hence the
only solution is to craft a custom mechanism that will work in your
environment to handle this scenario when adding new nodes to AKS clusters.
We do not have a standard and foolproof alternative to recommend, hence
the only solution is to craft a custom mechanism that will work in your
environment to handle this scenario when adding new nodes to AKS
clusters.
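
For the "restarting any unmanaged pods" mitigation described in the
limitations above, a commonly used sketch (assuming the unmanaged pods are
the non-host-network pods started before Cilium was ready) is to delete every
pod not running in host networking so it gets recreated under Cilium:

.. code-block:: shell-session

   kubectl get pods --all-namespaces \
     -o custom-columns=NAMESPACE:.metadata.namespace,NAME:.metadata.name,HOSTNETWORK:.spec.hostNetwork \
     --no-headers=true | grep '<none>' | awk '{print "-n "$1" "$2}' \
     | xargs -L 1 -r kubectl delete pod
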
24 changes: 3 additions & 21 deletions Documentation/installation/requirements-aks-byocni.rst
@@ -1,23 +1,5 @@
To install Cilium on `Azure Kubernetes Service (AKS) <https://docs.microsoft.com/en-us/azure/aks/>`_
in `Bring your own CNI <https://docs.microsoft.com/en-us/azure/aks/use-byo-cni?tabs=azure-cli>`_
mode, perform the following steps:

**Default Configuration:**

=============== =================== ==============
Datapath        IPAM                Datastore
=============== =================== ==============
Encapsulation   Cluster Pool        Kubernetes CRD
=============== =================== ==============

.. note::

BYOCNI is the preferred way to run Cilium on AKS, however integration with
the Azure stack via the :ref:`Azure IPAM<ipam_azure>` is not available. If
you require Azure IPAM, refer to the AKS (Azure IPAM) installation.

**Requirements:**

* The AKS cluster must be created with ``--network-plugin none`` (BYOCNI). See
the `Bring your own CNI documentation <https://docs.microsoft.com/en-us/azure/aks/use-byo-cni?tabs=azure-cli>`_
for more details about BYOCNI prerequisites / implications.
* The AKS cluster must be created with ``--network-plugin none``. See the
`Bring your own CNI <https://docs.microsoft.com/en-us/azure/aks/use-byo-cni?tabs=azure-cli>`_
documentation for more details about BYOCNI prerequisites / implications.
14 changes: 14 additions & 0 deletions Documentation/installation/requirements-aks.rst
@@ -0,0 +1,14 @@
**Default Configuration:**

============================= =============== =================== ==============
Mode (``--network-plugin``)   Datapath        IPAM                Datastore
============================= =============== =================== ==============
BYOCNI (``none``)             Encapsulation   Cluster Pool        Kubernetes CRD
Legacy Azure IPAM (``azure``) Direct Routing  Azure IPAM          Kubernetes CRD
============================= =============== =================== ==============

Using `Bring your own CNI <https://docs.microsoft.com/en-us/azure/aks/use-byo-cni?tabs=azure-cli>`_
is the preferred way to run Cilium on `Azure Kubernetes Service (AKS) <https://docs.microsoft.com/en-us/azure/aks/>`_.
Integration with the Azure stack via :ref:`Azure IPAM<ipam_azure>` is not
available in BYOCNI mode and only works with clusters not using BYOCNI. While
still maintained for now, the Azure IPAM mode is considered legacy.
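
To make the table above concrete: the mode is selected at cluster-creation
time through the ``--network-plugin`` flag of ``az aks create``. A minimal
sketch (with placeholder resource group and cluster names) looks like this;
the full creation steps are covered in the installation guides:

.. code-block:: bash

   # BYOCNI (preferred): AKS installs no CNI, Cilium provides it.
   az aks create --resource-group my-group --name my-byocni-cluster \
     --network-plugin none

   # Legacy Azure IPAM: AKS installs the Azure CNI, which Cilium then replaces.
   az aks create --resource-group my-group --name my-azure-ipam-cluster \
     --network-plugin azure
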
4 changes: 3 additions & 1 deletion Documentation/network/concepts/ipam/azure.rst
@@ -12,7 +12,9 @@ Azure IPAM

.. note::

Azure IPAM is not compatible with AKS clusters created in BYOCNI mode.
While still maintained for now, Azure IPAM is considered legacy and is not
compatible with AKS clusters created in `Bring your own CNI <https://docs.microsoft.com/en-us/azure/aks/use-byo-cni?tabs=azure-cli>`_
mode. Using BYOCNI is the preferred way to install Cilium on AKS.

The Azure IPAM allocator is specific to Cilium deployments running in the Azure
cloud and performs IP allocation based on `Azure Private IP addresses