chore: Containers reduced to ~100MB total. ~30s installation. (#3487)
TLDR: Want to test?
```
for img in aztec-sandbox cli noir; do docker pull aztecprotocol/${img}:cl_sandbox_cli_layer_share; done
```

More wonderful improvements:
* We build a common prod image (`yarn-project-prod`) shared between
sandbox and cli, so we only pull the data once for both images. The
actual cli/sandbox images now just specify how to run. This gets the
compressed image size down to about 110MB. It took about 50s to pull
and extract sandbox, cli and nargo on a crappy office connection (30s
on a home connection).
* We introduce `buildx` for the creation of multiarch images, allowing
us to ditch the doubling of jobs and manifest creation phases of
multiarch images.
* This makes most sense for small containers. It doesn't replace the
traditional flow for large project builds such as `noir`, because buildx
uses virtualisation, which suits only builds that are not too intensive.
* Multiarch is configured by a new `multiarch` property added to the
build manifest. If set to `buildx`, the build uses buildx virtualisation;
if set to `host`, two host machines with different architectures are
expected to do the build (as traditionally).
* This also goes back to a pattern of there being *no* arch postfix by
default on image tags; we only apply the postfix to images where
`multiarch: host` is enabled. The ecr manifest job is then expected to
immediately create the postfixless version.
* We separate some `calculate_content_hash` stuff into
`calculate_rebuild_files` to aid debugging rebuild pattern stuff.
* `deploy_dockerhub` is extensively simplified and sped up by using
`skopeo` to perform direct registry to registry multiarch image copies
and retags.
* The concept of commit messages with `[ci force-release]` is
introduced. At present, if enabled, images will be pushed to dockerhub
with the normalised branch name as a tag. Npm stuff still does nothing.
* We separate the concept of `deploy` and `release`, where a release is
a publishing of artefacts, and a deploy involves deployment of services
etc in AWS.
* `noir` has stricter rebuild patterns specified to prevent aggressive
rebuilding.
* Change some sync fs calls to async.
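
The release flow described in the bullets above (a `[ci force-release]` marker, a branch-derived tag, and a direct registry-to-registry copy) might be sketched as follows. This is an illustration only, not the `deploy_dockerhub` script from this commit: the three function names are hypothetical, and the normalisation rule (any character outside `[A-Za-z0-9_.-]` becomes `_`) is an assumption. The one real tool behaviour relied on is that `skopeo copy --all` copies every image in a manifest list rather than only the host architecture.

```shell
#!/bin/bash
set -eu

# Hypothetical helper: turn a branch name into something usable as an image tag.
# Assumption: every character outside [A-Za-z0-9_.-] is replaced with "_".
normalise_branch() {
  printf '%s\n' "$1" | sed 's/[^A-Za-z0-9_.-]/_/g'
}

# Hypothetical helper: succeed if the commit message carries the force-release marker.
has_force_release() {
  case "$1" in
    *"[ci force-release]"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Print (rather than run) the skopeo command for a multiarch registry-to-registry copy.
# $1 is the source image reference, $2 the destination image reference.
release_command() {
  local src=$1 dst=$2
  echo "skopeo copy --all docker://$src docker://$dst"
}
```

Printing the command instead of executing it keeps the sketch safe to run without registry credentials; a real script would execute it and probably wrap it in a retry.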
charlielye authored Dec 4, 2023
1 parent 63dd0c8 commit b49cef2
Showing 27 changed files with 319 additions and 288 deletions.
144 changes: 56 additions & 88 deletions .circleci/config.yml
@@ -346,6 +346,17 @@ jobs:
name: Build
command: build yarn-project | add_timestamps

yarn-project-prod:
machine:
image: ubuntu-2204:2023.07.2
resource_class: large
steps:
- *checkout
- *setup_env
- run:
name: Build
command: build yarn-project-prod | add_timestamps

yarn-project-formatting:
machine:
image: ubuntu-2204:2023.07.2
@@ -368,7 +379,7 @@ jobs:
name: Test
command: cond_spot_run_container yarn-project 64 test | add_timestamps

aztec-sandbox-x86_64:
aztec-sandbox:
machine:
image: ubuntu-2204:2023.07.2
resource_class: large
@@ -379,22 +390,7 @@ jobs:
name: "Build and test"
command: build aztec-sandbox

aztec-sandbox-arm64:
machine:
image: ubuntu-2204:2023.07.2
resource_class: arm.large
steps:
- *checkout
- *setup_env
- run:
name: "Build and test"
# We need to force not to use docker buildkit because for some reason on arm only, it ends up making a call
# out to eu-west2 despite the image being locally tagged, resulting in unauthorized 401. Weird docker bug?
command: |
echo "export DOCKER_BUILDKIT=" > $BASH_ENV
build aztec-sandbox
cli-x86_64:
cli:
machine:
image: ubuntu-2204:2023.07.2
resource_class: large
@@ -405,21 +401,6 @@ jobs:
name: "Build and test"
command: build cli

cli-arm64:
machine:
image: ubuntu-2204:2023.07.2
resource_class: arm.large
steps:
- *checkout
- *setup_env
- run:
name: "Build and test"
# We need to force not to use docker buildkit because for some reason on arm only, it ends up making a call
# out to eu-west2 despite the image being locally tagged, resulting in unauthorized 401. Weird docker bug?
command: |
echo "export DOCKER_BUILDKIT=" > $BASH_ENV
build cli
mainnet-fork:
machine:
image: ubuntu-2204:2023.07.2
@@ -442,21 +423,6 @@ jobs:
name: "Build and test"
command: build aztec-faucet | add_timestamps

ecr-manifest:
machine:
image: ubuntu-2204:2023.07.2
resource_class: large
steps:
- *checkout
- *setup_env
- run:
name: "Create ECR manifest"
command: |
create_ecr_manifest aztec-sandbox x86_64,arm64
create_ecr_manifest cli x86_64,arm64
create_ecr_manifest aztec-faucet x86_64
create_ecr_manifest mainnet-fork x86_64
boxes-blank-react:
machine:
image: ubuntu-2204:2023.07.2
@@ -938,63 +904,64 @@ jobs:
name: "Assemble benchmark summary from uploaded logs"
command: ./scripts/ci/assemble_e2e_benchmark.sh

# Deploy jobs.
deploy-mainnet-fork:
# Release jobs.
release-npm:
machine:
image: ubuntu-2204:2023.07.2
resource_class: medium
steps:
- *checkout
- *setup_env
- run:
name: "Deploy mainnet fork"
name: "yarn-project"
command: |
should_deploy || exit 0
deploy mainnet-fork
should_release || exit 0
yarn-project/deploy_npm.sh latest
deploy-contracts:
release-dockerhub:
machine:
image: ubuntu-2204:2023.07.2
resource_class: medium
steps:
- *checkout
- *setup_env
- run:
name: "Deploy L1 contracts to mainnet fork"
working_directory: l1-contracts
name: "Release to dockerhub"
command: |
should_deploy || exit 0
./scripts/ci_deploy_contracts.sh
should_release || exit 0
deploy_dockerhub noir
deploy_dockerhub aztec-sandbox
deploy_dockerhub cli
deploy_dockerhub aztec-faucet
deploy_dockerhub mainnet-fork
deploy-npm:
# Deploy jobs.
deploy-mainnet-fork:
machine:
image: ubuntu-2204:2023.07.2
resource_class: medium
steps:
- *checkout
- *setup_env
- run:
name: "yarn-project"
name: "Deploy mainnet fork"
command: |
should_deploy || exit 0
yarn-project/deploy_npm.sh latest
deploy mainnet-fork
deploy-dockerhub:
deploy-contracts:
machine:
image: ubuntu-2204:2023.07.2
resource_class: medium
steps:
- *checkout
- *setup_env
- run:
name: "Deploy to dockerhub"
name: "Deploy L1 contracts to mainnet fork"
working_directory: l1-contracts
command: |
should_deploy || exit 0
deploy_dockerhub noir x86_64,arm64
deploy_dockerhub aztec-sandbox x86_64,arm64
deploy_dockerhub cli x86_64,arm64
deploy_dockerhub aztec-faucet x86_64
deploy_dockerhub mainnet-fork x86_64
./scripts/ci_deploy_contracts.sh
deploy-devnet:
machine:
@@ -1036,6 +1003,11 @@ defaults_yarn_project: &defaults_yarn_project
- yarn-project
<<: *defaults

defaults_yarn_project_prod: &defaults_yarn_project_prod
requires:
- yarn-project-prod
<<: *defaults

defaults_deploy: &defaults_deploy
requires:
- end
@@ -1122,42 +1094,37 @@ workflows:
requires:
- yarn-project-base
<<: *defaults
- yarn-project-prod: *defaults_yarn_project
- yarn-project-formatting: *defaults_yarn_project
- yarn-project-tests: *defaults_yarn_project
- end-to-end: *defaults_yarn_project
- build-docs: *defaults_yarn_project
- aztec-sandbox-x86_64: *defaults_yarn_project
- aztec-sandbox-arm64: *defaults_yarn_project
- cli-x86_64: *defaults_yarn_project
- cli-arm64: *defaults_yarn_project
- aztec-faucet: *defaults_yarn_project
- ecr-manifest:
requires:
- aztec-sandbox-x86_64
- aztec-sandbox-arm64
- cli-x86_64
- cli-arm64
<<: *defaults

# Artifacts
- aztec-sandbox: *defaults_yarn_project_prod
- cli: *defaults_yarn_project_prod
- aztec-faucet: *defaults_yarn_project_prod

# Boxes.
- boxes-blank-react:
requires:
- aztec-sandbox-x86_64
- aztec-sandbox
<<: *defaults
- boxes-blank:
requires:
- aztec-sandbox-x86_64
- aztec-sandbox
<<: *defaults
- boxes-token:
requires:
- aztec-sandbox-x86_64
- aztec-sandbox
<<: *defaults

# End to end tests.
- e2e-join:
requires:
- end-to-end
- ecr-manifest
- aztec-sandbox
- cli
<<: *defaults
- e2e-2-pxes: *e2e_test
- e2e-deploy-contract: *e2e_test
@@ -1240,12 +1207,14 @@ workflows:
- bench-process-history
<<: *defaults

# Production deployment
- deploy-dockerhub: *defaults_deploy
- deploy-npm: *defaults_deploy
# Production releases.
- release-dockerhub: *defaults_deploy
- release-npm: *defaults_deploy

# Production deployment.
- deploy-mainnet-fork:
requires:
- deploy-dockerhub
- release-dockerhub
<<: *defaults_deploy
- deploy-contracts:
requires:
@@ -1255,4 +1224,3 @@ workflows:
requires:
- deploy-contracts
<<: *defaults_deploy

65 changes: 47 additions & 18 deletions build-system/scripts/build
@@ -86,30 +86,59 @@ if [ -d $ROOT_PATH/$PROJECT_DIR/terraform ]; then
popd
fi

# For each dependency, pull in the latest image and give it correct tag.
# For each dependency, substitute references to the dependency in dockerfile, with the relevant built image uri.
# We have to perform a bit of probing to determine which actual image we want to use.
# When we used buildx to create a multiarch image, there will be no images with "-$ARCH" suffixes (normalise this?).
# Also we sometimes build an arm image from an x86 parent, so there won't always be an arm parent, and we fallback.
for PARENT_REPO in $(query_manifest dependencies $REPOSITORY); do
PARENT_IMAGE_URI=$(calculate_image_uri $PARENT_REPO)
echo "Pulling dependency $PARENT_IMAGE_URI..."
if ! fetch_image $PARENT_IMAGE_URI; then
# This is a *bit* of a hack maybe. Some of our arm images can be built from x86 dependents.
# e.g. node projects are architecture independent.
# This may not hold true if we start introducing npm modules that are backed by native code.
# But for now, to avoid building some projects twice, we can fallback onto x86 variant.
PARENT_IMAGE_URI=$(calculate_image_uri $PARENT_REPO x86_64)
echo "Falling back onto x86 build. Pulling dependency $PARENT_IMAGE_URI..."
fetch_image $PARENT_IMAGE_URI
# We want the parent image tag without any arch suffix.
PARENT_IMAGE_TAG=$(calculate_image_tag $PARENT_REPO "")

# Attempt to locate multiarch image.
if ! image_exists $PARENT_REPO $PARENT_IMAGE_TAG; then
# Attempt to locate our specific arch image.
PARENT_IMAGE_TAG=$(calculate_image_tag $PARENT_REPO)
if ! image_exists $PARENT_REPO $PARENT_IMAGE_TAG; then
# Finally attempt to locate x86_64 image tag, as sometimes we build arch specific images from x86_64 images.
PARENT_IMAGE_TAG=$(calculate_image_tag $PARENT_REPO x86_64)
if ! image_exists $PARENT_REPO $PARENT_IMAGE_TAG; then
echo "Failed to locate multiarch image, arch specific image, or x86_64 image. Aborting."
exit 1
fi
fi
fi
# Tag it to look like an official release as that's what we use in Dockerfiles.
TAG=$ECR_DEPLOY_URL/$PARENT_REPO
docker tag $PARENT_IMAGE_URI $TAG

# Substitute references to parent repo, with the relevant built image uri.
DEPLOY_URI=$ECR_DEPLOY_URL/$PARENT_REPO
PARENT_IMAGE_URI=$ECR_URL/$PARENT_REPO:$PARENT_IMAGE_TAG
awk '{if ($1 == "FROM" && $2 == "'$DEPLOY_URI'") $2 = "'$PARENT_IMAGE_URI'"; print $0}' $DOCKERFILE > _temp && mv _temp $DOCKERFILE
done

COMMIT_TAG_VERSION=$(extract_tag_version $REPOSITORY false)
echo "Commit tag version: $COMMIT_TAG_VERSION"

# Build the actual image and give it a commit tag.
IMAGE_COMMIT_URI=$(calculate_image_uri $REPOSITORY)
echo "Building image: $IMAGE_COMMIT_URI"
docker build -t $IMAGE_COMMIT_URI -f $DOCKERFILE --build-arg COMMIT_TAG=$COMMIT_TAG_VERSION --build-arg ARG_CONTENT_HASH=$CONTENT_HASH .
echo "Pushing image: $IMAGE_COMMIT_URI"
retry docker push $IMAGE_COMMIT_URI > /dev/null 2>&1

MULTIARCH=$(query_manifest multiarch $REPOSITORY)

# Build the image.
if [ "$MULTIARCH" == "buildx" ]; then
# We've requested to use buildx. This will build both arch containers on the host machine using virtualization.
# The result is a single image tag that supports multiarch.
# This is the simplest approach for build jobs that are not too intensive.
docker buildx create --name builder --use
docker buildx inspect --bootstrap
docker buildx build -t $IMAGE_COMMIT_URI -f $DOCKERFILE --build-arg COMMIT_TAG=$COMMIT_TAG_VERSION --build-arg ARG_CONTENT_HASH=$CONTENT_HASH --platform linux/amd64,linux/arm64 . --push
else
# If multiarch is set to "host", the assumption is that we're doing multiple builds on different machine architectures
# in parallel, and that there is another job that runs afterwards to combine them into a manifest.
# In this case we need to augment the image tag with the host's architecture to ensure its uniqueness.
if [ "$MULTIARCH" == "host" ]; then
IMAGE_COMMIT_URI=$(calculate_image_uri $REPOSITORY host)
fi

docker build -t $IMAGE_COMMIT_URI -f $DOCKERFILE --build-arg COMMIT_TAG=$COMMIT_TAG_VERSION --build-arg ARG_CONTENT_HASH=$CONTENT_HASH .
echo "Pushing image: $IMAGE_COMMIT_URI"
retry docker push $IMAGE_COMMIT_URI > /dev/null 2>&1
fi
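
The parent-image probing and `FROM` substitution in the script above can be exercised in isolation with a sketch like the one below. It is not the build system's code: `image_exists` here is a stub that consults a hypothetical `AVAILABLE_TAGS` variable instead of a registry, `resolve_parent_tag` and `rewrite_from` are made-up names, and the `-$ARCH` suffix format for tags is an assumption taken from the comments in the diff.

```shell
#!/bin/bash
set -eu

# Stub for the build system's image_exists: succeeds if the tag ($2) appears in the
# space-separated AVAILABLE_TAGS list (a hypothetical variable used only by this sketch).
image_exists() {
  local tag=$2
  case " ${AVAILABLE_TAGS:-} " in
    *" $tag "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Print the first tag that exists: multiarch (no suffix), then host arch, then x86_64.
# $1 is the repository, $2 the unsuffixed tag, $3 the host architecture.
resolve_parent_tag() {
  local repo=$1 base=$2 arch=$3 candidate
  for candidate in "$base" "$base-$arch" "$base-x86_64"; do
    if image_exists "$repo" "$candidate"; then
      echo "$candidate"
      return 0
    fi
  done
  echo "Failed to locate multiarch image, arch specific image, or x86_64 image." >&2
  return 1
}

# Rewrite "FROM <deploy uri>" lines read from stdin to point at the resolved image uri.
# $1 is the uri to look for, $2 its replacement; all other lines pass through unchanged.
rewrite_from() {
  awk -v from="$1" -v to="$2" '$1 == "FROM" && $2 == from { $2 = to } { print }'
}
```

Passing the URIs with `awk -v` avoids splicing shell variables into the awk program text, which is one way to sidestep quoting surprises; the script in the diff splices them in directly instead.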
2 changes: 1 addition & 1 deletion build-system/scripts/build_local
@@ -98,7 +98,7 @@ for E in "${PROJECTS[@]}"; do
echo -e "${YELLOW}Project or dependency has local modifications! Building...${RESET}"
docker build ${ADDITIONAL_ARGS:-} --build-arg ARG_COMMIT_HASH=$COMMIT_HASH -f $DOCKERFILE -t $DEPLOY_IMAGE_URI .
else
if [ -z "$NO_CACHE" ] && docker image ls --format "{{.Repository}}:{{.Tag}}" | grep -q -w $CACHE_IMAGE_URI; then
if [ -z "$NO_CACHE" ] && docker image ls --format "{{.Repository}}:{{.Tag}}" | grep -q -w "$CACHE_IMAGE_URI$"; then
echo -e "${GREEN}Image exists locally. Tagging as $DEPLOY_IMAGE_URI${RESET}"
docker tag $CACHE_IMAGE_URI $DEPLOY_IMAGE_URI
else
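
Why the trailing `$` in the `grep` pattern matters can be seen with a small experiment. The image names below are invented for illustration. `grep -w` only requires a non-word character after the match, and `-` counts as one, so an unsuffixed URI also matches an arch-suffixed image unless the pattern is anchored to the end of the line.

```shell
#!/bin/bash
set -eu

# Hypothetical image list, in the shape `docker image ls --format "{{.Repository}}:{{.Tag}}"` prints.
images='registry.example/cli:cache-abc-arm64'
uri='registry.example/cli:cache-abc'

# Unanchored: the "-" after "cache-abc" satisfies -w, so the suffixed image matches.
if echo "$images" | grep -q -w "$uri"; then unanchored=match; else unanchored=nomatch; fi

# Anchored: the match must run to the end of the line, so the suffixed image is rejected.
if echo "$images" | grep -q -w "$uri$"; then anchored=match; else anchored=nomatch; fi

echo "unanchored=$unanchored anchored=$anchored"
# prints: unanchored=match anchored=nomatch
```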
21 changes: 4 additions & 17 deletions build-system/scripts/calculate_content_hash
@@ -1,21 +1,8 @@
#!/bin/bash

[ -n "${BUILD_SYSTEM_DEBUG:-}" ] && set -x # conditionally trace
set -eu

REPOSITORY=$1
COMMIT_HASH=${2:-${COMMIT_HASH:-$(git rev-parse HEAD)}}

# Get list of rebuild patterns, concat them with regex 'or' (|), and double escape \ for awk -v.
AWK_PATTERN=$(query_manifest rebuildPatterns $REPOSITORY | tr '\n' '|' | sed 's/\\/\\\\/g')
# Remove the trailing '|'.
AWK_PATTERN=${AWK_PATTERN%|}

cd "$(git rev-parse --show-toplevel)"
set -euo pipefail

# an example line is
# An example line is:
# 100644 da9ae2e020ea7fe3505488bbafb39adc7191559b 0 yarn-project/world-state/tsconfig.json
# this format is beneficial as it grabs the hashes from git efficiently
# we will next filter by our rebuild patterns
# then we pipe the hash portion of each file to git hash-object to produce our content hash
git ls-tree -r $COMMIT_HASH | awk -v pattern="($AWK_PATTERN)" '$4 ~ pattern {print $3}' | git hash-object --stdin
# Extract the hashes and pipe the hash portion of each file to git hash-object to produce our content hash.
calculate_rebuild_files "$@" | awk '{print $3}' | git hash-object --stdin
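
`calculate_rebuild_files` itself is not shown in this excerpt. Judging by the lines removed from `calculate_content_hash`, its filtering step could look roughly like the sketch below. The function name `filter_rebuild_files` is hypothetical, and the input is assumed to be `git ls-tree -r` output (mode, type, hash, path), with patterns passed as arguments rather than read from the build manifest.

```shell
#!/bin/bash
set -eu

# Keep only the `git ls-tree -r` style lines (read from stdin) whose path matches one of
# the rebuild patterns given as arguments. Paths containing spaces are not handled.
filter_rebuild_files() {
  local pattern
  # Join patterns with regex "or" (|) and double-escape backslashes for awk -v.
  pattern=$(printf '%s\n' "$@" | tr '\n' '|' | sed 's/\\/\\\\/g')
  # Remove the trailing "|".
  pattern=${pattern%|}
  # Field 4 is the path; print the whole line when it matches.
  awk -v pattern="($pattern)" '$4 ~ pattern'
}
```

A call such as `git ls-tree -r HEAD | filter_rebuild_files '^yarn-project/' '^noir/'` would list the matching files, which makes it easier to see which paths a rebuild pattern actually catches. Piping that through `awk '{print $3}' | git hash-object --stdin`, as the script above does, would then yield a single content hash.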