Update the policy-controller release build process (#6672)
We can't use the typical multiarch docker build with the proxy:
qemu-hosted arm64/arm builds take 45+ minutes before failing due to
missing tooling--specifically `protoc`. (While there is a `protoc`
binary available for arm64, there are no binaries available for 32-bit
arm hosts.)

To fix this, this change updates the release process to cross-build the
policy-controller on an amd64 host to the target architecture. We
separate the policy-controller's dockerfiles as `amd64.dockerfile`,
`arm64.dockerfile`, and `arm.dockerfile`. Then, in CI we build and push
each of these images individually (in parallel, via a build matrix).
Once all of these are complete, we use the `docker manifest` CLI tools
to unify these images into a single multi-arch manifest.
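
The flow described above can be sketched as a dry run that only prints the
commands involved. This is a minimal sketch, not the release workflow itself:
the function names `arch_image` and `plan_release` and the tag `dev-example`
are hypothetical, and the real workflow derives the tag from `bin/root-tag`.

```shell
#!/bin/sh
set -eu

# Image repository used by the release workflow in this commit.
repo="ghcr.io/linkerd/policy-controller"
# Architectures built by the CI matrix.
archs="amd64 arm64 arm"

# arch_image TAG ARCH: print the per-architecture image reference.
# (Hypothetical helper, for illustration only.)
arch_image() {
    echo "$repo:$1-$2"
}

# plan_release TAG: print, without running them, the commands that build
# and push each per-architecture image and then join the images into a
# single multi-arch manifest. (Hypothetical helper.)
plan_release() {
    tag="$1"
    images=""
    for arch in $archs; do
        image=$(arch_image "$tag" "$arch")
        echo "docker buildx build --push ./policy-controller/ -f ./policy-controller/$arch.dockerfile -t $image"
        images="$images $image"
    done
    echo "docker manifest create $repo:$tag$images"
    for arch in $archs; do
        echo "docker manifest annotate $repo:$tag $(arch_image "$tag" "$arch") --os=linux --arch=$arch"
    done
    echo "docker manifest push $repo:$tag"
}

# "dev-example" is a placeholder tag, not a real release.
plan_release dev-example
```

Printing the commands rather than executing them makes the ordering easy to
inspect: all three images must be pushed before `docker manifest create` can
reference them.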

This cross-building approach requires that we move from using
`native-tls` to `rustls`, as we cannot build against the platform-
appropriate native TLS libraries. The policy-controller is now feature-
flagged to use `rustls` by default, though it may be necessary to use
`native-tls` in local development, as `rustls` cannot validate TLS
connections that target IP addresses.
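
For local development, switching TLS backends might look like the sketch
below. The feature name `native-tls` and the helper `cargo_tls_args` are
assumptions made for illustration; the actual feature names are defined in the
policy-controller's `Cargo.toml`, which is not shown here. The
`--no-default-features` and `--features` flags are standard Cargo options.

```shell
#!/bin/sh
set -eu

# cargo_tls_args BACKEND: print the cargo flags that select a TLS backend.
# Hypothetical helper; the feature name "native-tls" is an assumption.
cargo_tls_args() {
    case "$1" in
        # rustls is the default feature, so no extra flags are needed.
        rustls) echo "" ;;
        # Disable the default (rustls) feature and opt in to native TLS.
        native-tls) echo "--no-default-features --features native-tls" ;;
        *) echo "unknown TLS backend: $1" >&2; return 1 ;;
    esac
}

# Print the command a developer would run against a cluster that is
# reached by IP address.
echo "cargo build $(cargo_tls_args native-tls)"
```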

The policy-controller has also been updated to pull in `tracing-log` for
compatibility with crates that do not use `tracing` natively. This was
helpful while debugging a connectivity issue with the Kubernetes cluster.

The `bin/docker-build-policy-controller` helper script now *only* builds
the amd64 variant of the policy controller. It fails when asked to build
multiarch images.
olix0r committed Aug 13, 2021
1 parent 75774b9 commit 79a5849
Showing 9 changed files with 344 additions and 16 deletions.
67 changes: 63 additions & 4 deletions .github/workflows/release.yml
@@ -13,7 +13,7 @@ jobs:
strategy:
matrix:
# Keep in sync with integration_tests.yaml matrix build
-target: [proxy, controller, policy-controller, metrics-api, web, cni-plugin, debug, cli-bin, grafana, jaeger-webhook, tap]
+target: [proxy, controller, metrics-api, web, cni-plugin, debug, cli-bin, grafana, jaeger-webhook, tap]
name: Docker build (${{ matrix.target }})
timeout-minutes: 30
steps:
@@ -68,6 +68,59 @@ jobs:
with:
name: image-archives
path: /home/runner/archives

+  policy_controller_build:
+    runs-on: ubuntu-20.04
+    timeout-minutes: 30
+    strategy:
+      matrix:
+        arch: [amd64, arm64, arm]
+    name: Policy controller build (${{ matrix.arch }})
+    steps:
+      - name: Set up Docker Buildx
+        uses: docker/setup-buildx-action@abe5d8f79a1606a2d3e218847032f3f2b1726ab0
+      - name: Checkout code
+        uses: actions/checkout@5a4ac9002d0be2fb38bd78e4b4dbde5606d7042f
+      - run: |
+          . bin/_tag.sh
+          echo "TAG=$(CI_FORCE_CLEAN=1 bin/root-tag)" >> $GITHUB_ENV
+      - name: Build ${{ matrix.arch }}
+        run: |
+          echo "${{ secrets.DOCKER_GHCR_PAT }}" | docker login ghcr.io -u "${{ secrets.DOCKER_GHCR_USERNAME }}" --password-stdin
+          docker buildx build --push ./policy-controller/ \
+            -f ./policy-controller/${{ matrix.arch }}.dockerfile \
+            -t ghcr.io/linkerd/policy-controller:$TAG-${{ matrix.arch }}
+  policy_controller_manifest:
+    runs-on: ubuntu-20.04
+    timeout-minutes: 30
+    needs: [policy_controller_build]
+    name: Policy controller manifest
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@5a4ac9002d0be2fb38bd78e4b4dbde5606d7042f
+      - run: |
+          . bin/_tag.sh
+          echo "TAG=$(CI_FORCE_CLEAN=1 bin/root-tag)" >> $GITHUB_ENV
+      - name: Create multiarch manifest
+        run: |
+          docker manifest create ghcr.io/linkerd/policy-controller:${TAG} \
+            ghcr.io/linkerd/policy-controller:${TAG}-amd64 \
+            ghcr.io/linkerd/policy-controller:${TAG}-arm64 \
+            ghcr.io/linkerd/policy-controller:${TAG}-arm
+      - name: Annotate multiarch manifest
+        run: |
+          docker manifest annotate ghcr.io/linkerd/policy-controller:$TAG \
+            ghcr.io/linkerd/policy-controller:${TAG}-amd64 --os=linux --arch=amd64
+          docker manifest annotate ghcr.io/linkerd/policy-controller:$TAG \
+            ghcr.io/linkerd/policy-controller:${TAG}-arm64 --os=linux --arch=arm64
+          docker manifest annotate ghcr.io/linkerd/policy-controller:$TAG \
+            ghcr.io/linkerd/policy-controller:${TAG}-arm --os=linux --arch=arm
+      - name: Push multiarch manifest
+        run: |
+          echo "${{ secrets.DOCKER_GHCR_PAT }}" | docker login ghcr.io -u "${{ secrets.DOCKER_GHCR_USERNAME }}" --password-stdin
+          docker manifest push ghcr.io/linkerd/policy-controller:$TAG
# todo: Keep in sync with `integration_tests.yml`
windows_static_cli_tests:
name: Static CLI tests (windows)
@@ -105,7 +158,7 @@ jobs:
- upgrade-edge
- upgrade-stable
- cni-calico-deep
-needs: [docker_build]
+needs: [docker_build, policy_controller_manifest]
name: Integration tests (${{ matrix.integration_test }})
timeout-minutes: 60
runs-on: ubuntu-20.04
@@ -131,11 +184,12 @@ jobs:
# Validate the CLI version matches the current build tag.
[[ "$TAG" == "$($CMD version --short --client)" ]]
bin/tests --images preload --name ${{ matrix.integration_test }} "$CMD"
arm64_integration_tests:
name: ARM64 integration tests
timeout-minutes: 60
runs-on: ubuntu-20.04
-needs: [docker_build]
+needs: [docker_build, policy_controller_manifest]
steps:
- name: Checkout code
#if: startsWith(github.ref, 'refs/tags/stable')
@@ -180,12 +234,13 @@ jobs:
if: ${{ always() }}
# will fail if other steps didn't run, so ignore error
run: bin/test-cleanup "$CMD" || true

choco_pack:
# only runs for stable tags. The conditionals are at each step level instead of the job level
# otherwise the jobs below that depend on this one won't run
name: Pack Chocolatey release
timeout-minutes: 30
-needs: [integration_tests, arm64_integration_tests]
+needs: [integration_tests]
runs-on: windows-2019
steps:
- name: Checkout code
@@ -207,6 +262,7 @@ jobs:
with:
name: choco
path: ./linkerd.*.nupkg

gh_release:
name: Create GH release
timeout-minutes: 30
@@ -249,6 +305,7 @@ jobs:
./target/release/linkerd2-cli-*-linux-*
./target/release/linkerd2-cli-*-windows.*
./target/release/linkerd2-cli-*.nupkg
website_publish:
name: Linkerd website publish
timeout-minutes: 30
@@ -264,6 +321,7 @@ jobs:
token: ${{ secrets.RELEASE_TOKEN }}
repository: linkerd/website
event-type: release

website_publish_check:
name: Linkerd website publish check
timeout-minutes: 30
@@ -294,6 +352,7 @@ jobs:
echo "::error::The version '$TAG' was NOT found published in the website"
exit 1
fi
chart_deploy:
name: Helm chart deploy
timeout-minutes: 30
7 changes: 6 additions & 1 deletion bin/docker-build-policy-controller
@@ -14,6 +14,11 @@ bindir=$( cd "${BASH_SOURCE[0]%/*}" && pwd )
 # shellcheck source=_tag.sh
 . "$bindir"/_tag.sh
 
+if [ -n "$DOCKER_MULTIARCH" ]; then
+    echo "DOCKER_MULTIARCH may not be set with $0" >&2
+    exit 1
+fi
+
 tag=$(head_root_tag)
 ROOTDIR=$( cd "$bindir"/../policy-controller && pwd )
-docker_build policy-controller "$tag" "$ROOTDIR/Dockerfile"
+docker_build policy-controller "$tag" "$ROOTDIR/amd64.dockerfile"