feat: support cilium cni (#287)
* feat: support cilium cni

* fix lint error

* set version in manifest file

* use helm for cilium

* fix vendir

* add vendir in hack

* set cilium tag

* update makefile for vendir install

* remove cni assertion

* add validation step for network drivers

* fix tests

* add executable perm to vendir install script

* fix functional test

* resolve conflict

* fix typo

* add zuul ci jobs for cilium network driver

* enable sessionAffinity in cilium

For users who run with kube-proxy (i.e. with Cilium's kube-proxy replacement
disabled), ClusterIP service load balancing for a request sent from a pod
running in a non-host network namespace is still performed at the pod network
interface (until cilium/cilium#16197 is fixed).
In this case, session affinity support is disabled by default, so it is enabled explicitly here.

* fix flake8 errors

* ignore the test failure below until the upstream issue is fixed

The failure of `[sig-network] HostPort validates that there is no conflict between pods with same hostPort but different hostIP and protocol [LinuxOnly] [Conformance]` is expected until cilium/cilium#14287 is fixed.

* use portmap chaining mode

* fix lint error

---------

Co-authored-by: okozachenko1203 <okozachenko1203@users.noreply.github.com>
okozachenko1203 and okozachenko1203 committed Jun 20, 2024
1 parent 5131a77 commit 4f922d0
Showing 16 changed files with 236 additions and 37 deletions.
4 changes: 1 addition & 3 deletions .gitignore
@@ -1,7 +1,5 @@
 .direnv
 __pycache__
 dist
-magnum_cluster_api/charts/*
-!magnum_cluster_api/charts/.gitkeep
-!magnum_cluster_api/charts/k8s-keystone-auth
+magnum_cluster_api/charts/vendor/*
 site
18 changes: 13 additions & 5 deletions Earthfile
@@ -1,15 +1,23 @@
 VERSION 0.7

+vendir:
+  FROM github.com/vexxhost/atmosphere/images/curl+image
+  ARG TARGETOS
+  ARG TARGETARCH
+  ARG VERSION=v0.40.0
+  RUN curl -Lo vendir https://github.com/carvel-dev/vendir/releases/download/${VERSION}/vendir-${TARGETOS}-${TARGETARCH}
+  RUN chmod +x vendir && ./vendir version
+  SAVE ARTIFACT vendir
+
 build:
   FROM github.com/vexxhost/atmosphere/images/magnum+build
+  COPY +vendir/vendir /usr/local/bin/vendir
   COPY github.com/vexxhost/atmosphere/images/helm+binary/helm /usr/local/bin/helm
-  RUN helm repo add autoscaler https://kubernetes.github.io/autoscaler
-  RUN helm repo update
-  COPY --dir magnum_cluster_api/ pyproject.toml README.md /src
+  COPY --dir magnum_cluster_api/ pyproject.toml README.md vendir.yml /src
   WORKDIR /src
-  RUN helm fetch autoscaler/cluster-autoscaler --version 9.29.1 --untar --untardir magnum_cluster_api/charts
+  RUN vendir sync
   COPY hack/add-omt-to-clusterrole.patch /hack/
-  RUN patch -p0 magnum_cluster_api/charts/cluster-autoscaler/templates/clusterrole.yaml < /hack/add-omt-to-clusterrole.patch
+  RUN patch -p0 magnum_cluster_api/charts/vendor/cluster-autoscaler/templates/clusterrole.yaml < /hack/add-omt-to-clusterrole.patch
   DO github.com/vexxhost/atmosphere/images/openstack-service+PIP_INSTALL --PACKAGES /src
   SAVE ARTIFACT /var/lib/openstack venv

15 changes: 9 additions & 6 deletions Makefile
@@ -1,11 +1,14 @@
 clean:
-	rm -rfv magnum_cluster_api/charts/cluster-autoscaler
+	rm -rfv magnum_cluster_api/charts/vendor

-vendor: clean
-	helm repo add autoscaler https://kubernetes.github.io/autoscaler
-	helm repo update
-	helm fetch autoscaler/cluster-autoscaler --version 9.29.1 --untar --untardir magnum_cluster_api/charts
-	patch -p0 magnum_cluster_api/charts/cluster-autoscaler/templates/clusterrole.yaml < hack/add-omt-to-clusterrole.patch
+vendir:
+	curl -Lo vendir https://github.com/carvel-dev/vendir/releases/download/v0.40.0/vendir-linux-amd64
+	chmod +x vendir && ./vendir version
+	sudo mv vendir /usr/local/bin/vendir
+
+vendor: clean vendir
+	vendir sync
+	patch -p0 magnum_cluster_api/charts/vendor/cluster-autoscaler/templates/clusterrole.yaml < hack/add-omt-to-clusterrole.patch

 poetry:
 	pipx install poetry
4 changes: 3 additions & 1 deletion hack/run-integration-tests.sh
@@ -23,6 +23,7 @@ source /opt/stack/openrc
 OS_DISTRO=${OS_DISTRO:-ubuntu}
 IMAGE_OS=${IMAGE_OS:-ubuntu-2204}
 NODE_COUNT=${NODE_COUNT:-2}
+NETWORK_DRIVER=${NETWORK_DRIVER:-calico}
 SONOBUOY_VERSION=${SONOBUOY_VERSION:-0.56.16}
 SONOBUOY_ARCH=${SONOBUOY_ARCH:-amd64}
 DNS_NAMESERVER=${DNS_NAMESERVER:-1.1.1.1}
@@ -52,7 +53,7 @@ openstack coe cluster template create \
   --master-lb-enabled \
   --master-flavor m1.large \
   --flavor m1.large \
-  --network-driver calico \
+  --network-driver ${NETWORK_DRIVER} \
   --docker-storage-driver overlay2 \
   --coe kubernetes \
   --label kube_tag=${KUBE_TAG} \
@@ -113,6 +114,7 @@ RESULTS_FILE=$(./sonobuoy retrieve --filename sonobuoy-results.tar.gz)
 # Print results
 ./sonobuoy results ${RESULTS_FILE}

+
 # Fail if the Sonobuoy tests failed
 if ! ./sonobuoy results --plugin e2e ${RESULTS_FILE} | grep -q "Status: passed"; then
   echo "Sonobuoy tests failed"
23 changes: 23 additions & 0 deletions hack/setup-vendir.sh
@@ -0,0 +1,23 @@
+#!/bin/bash -xe
+
+# Copyright (c) 2024 VEXXHOST, Inc.
+#
+# Licensed under the Apache License, Version 2.0 (the "License"); you may
+# not use this file except in compliance with the License. You may obtain
+# a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+# License for the specific language governing permissions and limitations
+# under the License.
+
+# Versions to test
+VENDIR_VERSION=${VENDIR_VERSION:-v0.40.0}
+
+# Install `vendir` CLI
+curl -Lo /tmp/vendir https://github.com/carvel-dev/vendir/releases/download/${VENDIR_VERSION}/vendir-linux-amd64
+chmod +x /tmp/vendir
+sudo mv /tmp/vendir /usr/local/bin/vendir
5 changes: 4 additions & 1 deletion hack/stack.sh
@@ -85,7 +85,7 @@ MANILA_USE_SERVICE_INSTANCE_PASSWORD=True
 [[post-config|/etc/magnum/magnum.conf]]
 [cluster_template]
-kubernetes_allowed_network_drivers = calico
+kubernetes_allowed_network_drivers = calico,cilium
 kubernetes_default_network_driver = calico
 EOF

@@ -107,6 +107,9 @@ EOF
 # Install CAPI/CAPO
 ./hack/setup-capo.sh

+# Install vendir
+./hack/setup-vendir.sh
+
 # Vendor the chart
 make vendor

4 changes: 4 additions & 0 deletions magnum_cluster_api/exceptions.py
@@ -57,6 +57,10 @@ class ClusterMasterCountEven(Exception):
     pass


+class UnsupportedCNI(Exception):
+    pass
+
+
 class OpenstackFlavorInvalidName(exception.InvalidName):
     message = _("Expected a flavor name but received flavor id %(flavor)s.")

103 changes: 86 additions & 17 deletions magnum_cluster_api/resources.py
@@ -45,6 +45,7 @@

 CONF = cfg.CONF
 CALICO_TAG = "v3.24.2"
+CILIUM_TAG = "v1.15.3"

 CLUSTER_CLASS_VERSION = pkg_resources.require("magnum_cluster_api")[0].version
 CLUSTER_CLASS_NAME = f"magnum-v{CLUSTER_CLASS_VERSION}"
@@ -55,6 +56,8 @@
 AUTOSCALE_ANNOTATION_MIN = "cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size"
 AUTOSCALE_ANNOTATION_MAX = "cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size"

+DEFAULT_POD_CIDR = "10.100.0.0/16"
+

 class ClusterAutoscalerHelmRelease:
     def __init__(self, api, cluster) -> None:
@@ -72,7 +75,7 @@ def apply(self):
             release_name=self.cluster.stack_id,
             chart_ref=os.path.join(
                 pkg_resources.resource_filename("magnum_cluster_api", "charts"),
-                "cluster-autoscaler/",
+                "vendor/cluster-autoscaler/",
             ),
             values={
                 "fullnameOverride": f"{self.cluster.stack_id}-autoscaler",
@@ -170,6 +173,7 @@ def get_object(self) -> pykube.ConfigMap:
             "magnum_cluster_api", "manifests"
         )
         calico_version = self.cluster.labels.get("calico_tag", CALICO_TAG)
+        cilium_version = self.cluster.labels.get("cilium_tag", CILIUM_TAG)

         repository = utils.get_cluster_container_infra_prefix(self.cluster)

@@ -190,15 +194,65 @@ def get_object(self) -> pykube.ConfigMap:
                 )
                 for manifest in glob.glob(os.path.join(manifests_path, "ccm/*.yaml"))
             },
-            **{
-                "calico.yml": image_utils.update_manifest_images(
-                    self.cluster.uuid,
-                    os.path.join(manifests_path, f"calico/{calico_version}.yaml"),
-                    repository=repository,
-                )
-            },
         }

+        if self.cluster.cluster_template.network_driver == "cilium":
+            data = {
+                **data,
+                **{
+                    "cilium.yml": helm.TemplateReleaseCommand(
+                        namespace="kube-system",
+                        release_name="cilium",
+                        chart_ref=os.path.join(
+                            pkg_resources.resource_filename(
+                                "magnum_cluster_api", "charts"
+                            ),
+                            "vendor/cilium/",
+                        ),
+                        values={
+                            "cni": {"chainingMode": "portmap"},
+                            "image": {
+                                "tag": cilium_version,
+                            },
+                            "operator": {
+                                "image": {
+                                    "tag": cilium_version,
+                                },
+                            },
+                            # NOTE(okozachenko1203): For users who run with kube-proxy (i.e. with Cilium's kube-proxy
+                            # replacement disabled), the ClusterIP service loadbalancing when a
+                            # request is sent from a pod running in a non-host network namespace
+                            # is still performed at the pod network interface (until
+                            # https://github.com/cilium/cilium/issues/16197 is fixed). For this
+                            # case the session affinity support is disabled by default.
+                            "sessionAffinity": "true",
+                            "ipam": {
+                                "operator": {
+                                    "clusterPoolIPv4PodCIDRList": [
+                                        self.cluster.labels.get(
+                                            "cilium_ipv4pool",
+                                            DEFAULT_POD_CIDR,
+                                        ),
+                                    ],
+                                },
+                            },
+                        },
+                    )(repository=repository)
+                },
+            }
+
+        if self.cluster.cluster_template.network_driver == "calico":
+            data = {
+                **data,
+                **{
+                    "calico.yml": image_utils.update_manifest_images(
+                        self.cluster.uuid,
+                        os.path.join(manifests_path, f"calico/{calico_version}.yaml"),
+                        repository=repository,
+                    )
+                },
+            }
+
         if cinder.is_enabled(self.cluster):
             volume_types = osc.cinder().volume_types.list()
             default_volume_type = osc.cinder().volume_types.default()
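As a reading aid, the Helm values passed to the Cilium chart in the hunk above can be sketched as a standalone function. `build_cilium_values` is a hypothetical helper (the commit builds the dictionary inline, inside `helm.TemplateReleaseCommand`); the two constants mirror `CILIUM_TAG` and `DEFAULT_POD_CIDR` from this diff, and `labels` is assumed to be a plain dictionary of cluster labels.

```python
CILIUM_TAG = "v1.15.3"
DEFAULT_POD_CIDR = "10.100.0.0/16"


def build_cilium_values(labels: dict) -> dict:
    """Hypothetical helper: build the Cilium Helm values from cluster labels."""
    # The image tag falls back to the pinned default when no label is set.
    cilium_version = labels.get("cilium_tag", CILIUM_TAG)
    return {
        "cni": {"chainingMode": "portmap"},
        "image": {"tag": cilium_version},
        "operator": {"image": {"tag": cilium_version}},
        # Enabled explicitly; see the NOTE in the diff about kube-proxy setups.
        "sessionAffinity": "true",
        "ipam": {
            "operator": {
                "clusterPoolIPv4PodCIDRList": [
                    labels.get("cilium_ipv4pool", DEFAULT_POD_CIDR),
                ],
            },
        },
    }
```

Keeping the values in a pure function like this would make the label fallbacks easy to unit test without rendering the chart, though the commit itself does not structure the code this way.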
@@ -2316,10 +2370,17 @@ def __init__(

     @property
     def labels(self) -> dict:
-        cni_version = self.cluster.labels.get("calico_tag", CALICO_TAG)
-        labels = {
-            "cni": f"calico-{cni_version}",
-        }
+        labels = {}
+        if self.cluster.cluster_template.network_driver == "calico":
+            cni_version = self.cluster.labels.get("calico_tag", CALICO_TAG)
+            labels = {
+                "cni": f"calico-{cni_version}",
+            }
+        if self.cluster.cluster_template.network_driver == "cilium":
+            cni_version = self.cluster.labels.get("cilium_tag", CILIUM_TAG)
+            labels = {
+                "cni": f"cilium-{cni_version}",
+            }

         return {**super().labels, **labels}

@@ -2331,6 +2392,18 @@ def get_or_none(self) -> objects.Cluster:
     def get_object(self) -> objects.Cluster:
         osc = clients.get_openstack_api(self.context)
         default_volume_type = osc.cinder().volume_types.default()
+        pod_cidr = DEFAULT_POD_CIDR
+        if self.cluster.cluster_template.network_driver == "calico":
+            pod_cidr = self.cluster.labels.get(
+                "calico_ipv4pool",
+                DEFAULT_POD_CIDR,
+            )
+        if self.cluster.cluster_template.network_driver == "cilium":
+            pod_cidr = self.cluster.labels.get(
+                "cilium_ipv4pool",
+                DEFAULT_POD_CIDR,
+            )
+
         return objects.Cluster(
             self.api,
             {
@@ -2347,11 +2420,7 @@ def get_object(self) -> objects.Cluster:
                             "dns_cluster_domain", "cluster.local"
                         ),
                         "pods": {
-                            "cidrBlocks": [
-                                self.cluster.labels.get(
-                                    "calico_ipv4pool", "10.100.0.0/16"
-                                )
-                            ],
+                            "cidrBlocks": [pod_cidr],
                         },
                         "services": {
                             "cidrBlocks": [
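The per-driver branching in the `labels` property and in `get_object` follows the same pattern: pick a tag label and a pod CIDR label based on the network driver, then fall back to defaults. A minimal sketch of that pattern, assuming a hypothetical helper `resolve_cni` and a hypothetical lookup table `DRIVER_SETTINGS` (neither exists in the commit, which uses explicit `if` blocks):

```python
from typing import Dict, Tuple

CALICO_TAG = "v3.24.2"
CILIUM_TAG = "v1.15.3"
DEFAULT_POD_CIDR = "10.100.0.0/16"

# driver -> (tag label, default tag, pod CIDR label), using the names in this diff
DRIVER_SETTINGS = {
    "calico": ("calico_tag", CALICO_TAG, "calico_ipv4pool"),
    "cilium": ("cilium_tag", CILIUM_TAG, "cilium_ipv4pool"),
}


def resolve_cni(network_driver: str, labels: Dict[str, str]) -> Tuple[Dict[str, str], str]:
    """Hypothetical helper: return (CNI labels, pod CIDR) for a network driver."""
    if network_driver not in DRIVER_SETTINGS:
        # Mirrors the diff: no "cni" label and the default pod CIDR.
        return {}, DEFAULT_POD_CIDR
    tag_label, default_tag, pool_label = DRIVER_SETTINGS[network_driver]
    cni_version = labels.get(tag_label, default_tag)
    pod_cidr = labels.get(pool_label, DEFAULT_POD_CIDR)
    return {"cni": f"{network_driver}-{cni_version}"}, pod_cidr
```

For example, `resolve_cni("cilium", {"cilium_ipv4pool": "10.200.0.0/16"})` yields the label `cilium-v1.15.3` together with the overridden CIDR. A table-driven form is only one possible design; the explicit branches in the commit are arguably easier to read with just two drivers.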
5 changes: 5 additions & 0 deletions magnum_cluster_api/utils.py
@@ -362,9 +362,14 @@ def validate_flavor_name(cli: clients.OpenStackClients, flavor: str):


 def validate_cluster(ctx: context.RequestContext, cluster: magnum_objects.Cluster):
+    # Check network driver
+    if cluster.cluster_template.network_driver not in ["cilium", "calico"]:
+        raise mcapi_exceptions.UnsupportedCNI
+
     # Check master count
     if (cluster.master_count % 2) == 0:
         raise mcapi_exceptions.ClusterMasterCountEven
+
     # Validate flavors
     osc = clients.get_openstack_api(ctx)
     validate_flavor_name(osc, cluster.master_flavor_id)
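The first two checks in `validate_cluster` can be exercised in isolation. The sketch below is an assumption-laden stand-in, not the project's code: the exception classes are redefined locally, and `validate_network_and_masters` is a hypothetical function that takes plain values instead of a Magnum cluster object.

```python
SUPPORTED_NETWORK_DRIVERS = ("calico", "cilium")


class UnsupportedCNI(Exception):
    """Local stand-in for magnum_cluster_api.exceptions.UnsupportedCNI."""


class ClusterMasterCountEven(Exception):
    """Local stand-in for magnum_cluster_api.exceptions.ClusterMasterCountEven."""


def validate_network_and_masters(network_driver: str, master_count: int) -> None:
    """Hypothetical helper: reject unsupported drivers and even master counts."""
    # Check network driver first, as in the diff.
    if network_driver not in SUPPORTED_NETWORK_DRIVERS:
        raise UnsupportedCNI(network_driver)
    # Check master count: an even number of control plane nodes is rejected.
    if master_count % 2 == 0:
        raise ClusterMasterCountEven(master_count)
```

With this shape, `validate_network_and_masters("cilium", 3)` returns without error, while a driver such as `"flannel"` raises `UnsupportedCNI` before the master count is looked at.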
13 changes: 13 additions & 0 deletions vendir.lock.yml
@@ -0,0 +1,13 @@
+apiVersion: vendir.k14s.io/v1alpha1
+directories:
+- contents:
+  - helmChart:
+      appVersion: 1.27.2
+      version: 9.29.1
+    path: cluster-autoscaler
+  - helmChart:
+      appVersion: 1.15.3
+      version: 1.15.3
+    path: cilium
+  path: magnum_cluster_api/charts/vendor
+kind: LockConfig
18 changes: 18 additions & 0 deletions vendir.yml
@@ -0,0 +1,18 @@
+apiVersion: vendir.k14s.io/v1alpha1
+kind: Config
+directories:
+  - path: magnum_cluster_api/charts/vendor
+    excludePaths: k8s-keystone-auth
+    contents:
+      - path: cluster-autoscaler
+        helmChart:
+          name: cluster-autoscaler
+          version: 9.29.1
+          repository:
+            url: https://kubernetes.github.io/autoscaler
+      - path: cilium
+        helmChart:
+          name: cilium
+          version: 1.15.3
+          repository:
+            url: https://helm.cilium.io/
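The Cilium version now appears in two places: the chart version `1.15.3` in `vendir.yml` and the default image tag `CILIUM_TAG = "v1.15.3"` in `resources.py`. A small consistency check could look like the sketch below. It is purely illustrative and not part of the commit: `VENDIR_CONFIG` is a hand-written dictionary standing in for the parsed YAML, and `chart_version` and `tag_matches_chart` are hypothetical helpers.

```python
CILIUM_TAG = "v1.15.3"

# Hand-written stand-in for the parsed contents of vendir.yml.
VENDIR_CONFIG = {
    "directories": [
        {
            "path": "magnum_cluster_api/charts/vendor",
            "contents": [
                {
                    "path": "cluster-autoscaler",
                    "helmChart": {"name": "cluster-autoscaler", "version": "9.29.1"},
                },
                {
                    "path": "cilium",
                    "helmChart": {"name": "cilium", "version": "1.15.3"},
                },
            ],
        }
    ]
}


def chart_version(config: dict, chart_name: str) -> str:
    """Return the pinned version of a Helm chart listed in the config."""
    for directory in config["directories"]:
        for content in directory["contents"]:
            chart = content.get("helmChart", {})
            if chart.get("name") == chart_name:
                return chart["version"]
    raise KeyError(chart_name)


def tag_matches_chart(tag: str, version: str) -> bool:
    """Compare an image tag such as "v1.15.3" with a chart version "1.15.3"."""
    bare_tag = tag[1:] if tag.startswith("v") else tag
    return bare_tag == version
```

Here `tag_matches_chart(CILIUM_TAG, chart_version(VENDIR_CONFIG, "cilium"))` holds for the versions in this commit. Note that users can still override the tag through the `cilium_tag` label, so a mismatch between chart and image is possible at runtime regardless of such a check.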
15 changes: 14 additions & 1 deletion zuul.d/jobs-flatcar.yaml
@@ -28,9 +28,22 @@
     vars:
       kube_tag: v1.27.8

+- job:
+    name: magnum-cluster-api-sonobuoy-flatcar-v1.27.8-calico
+    parent: magnum-cluster-api-sonobuoy-flatcar-v1.27.8
+    vars:
+      network_driver: calico
+
+- job:
+    name: magnum-cluster-api-sonobuoy-flatcar-v1.27.8-cilium
+    parent: magnum-cluster-api-sonobuoy-flatcar-v1.27.8
+    vars:
+      network_driver: cilium
+
 - project-template:
     name: magnum-cluster-api-flatcar
     check:
       jobs:
         - magnum-cluster-api-image-build-flatcar-v1.27.8
-        - magnum-cluster-api-sonobuoy-flatcar-v1.27.8
+        - magnum-cluster-api-sonobuoy-flatcar-v1.27.8-calico
+        - magnum-cluster-api-sonobuoy-flatcar-v1.27.8-cilium
15 changes: 14 additions & 1 deletion zuul.d/jobs-rockylinux-8.yaml
@@ -28,9 +28,22 @@
     vars:
       kube_tag: v1.27.8

+- job:
+    name: magnum-cluster-api-sonobuoy-rockylinux-8-v1.27.8-calico
+    parent: magnum-cluster-api-sonobuoy-rockylinux-8-v1.27.8
+    vars:
+      network_driver: calico
+
+- job:
+    name: magnum-cluster-api-sonobuoy-rockylinux-8-v1.27.8-cilium
+    parent: magnum-cluster-api-sonobuoy-rockylinux-8-v1.27.8
+    vars:
+      network_driver: cilium
+
 - project-template:
     name: magnum-cluster-api-rockylinux-8
     check:
       jobs:
         - magnum-cluster-api-image-build-rockylinux-8-v1.27.8
-        - magnum-cluster-api-sonobuoy-rockylinux-8-v1.27.8
+        - magnum-cluster-api-sonobuoy-rockylinux-8-v1.27.8-calico
+        - magnum-cluster-api-sonobuoy-rockylinux-8-v1.27.8-cilium