Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(ci): use Depot for Docker builds in CI #10002

Merged
merged 8 commits into from
Jun 7, 2022

Conversation

jacobwgillespie
Copy link
Contributor

@jacobwgillespie jacobwgillespie commented May 25, 2022

(context for this is an email exchange with @timgl, who requested this PR)

Problem / Background

Faster Docker builds!

When building Docker images in GitHub Actions, the lack of a native Docker build cache means builds can be quite slow, and even options like cache-to / cache-from can eat into build time since they must serialize / deserialize the cache to tarballs which comes at a performance cost. And Actions currently only offers hosted Intel environments, if you need to build Arm images, you need to use emulation/QEMU, which can be slow.

Depot is a hosted Docker build service that we've created to make this faster - we spawn managed VMs running BuildKit on-demand to handle container builds. Those VMs have 4 CPUs (2 more than Actions today), 8GB of memory, and a persistent 50GB local SSD disk. This allows all of BuildKit's native cache features to just work and persist across builds. Our builders start in just a few seconds with the SSD disk already ready, no need to load the cache before the build can start.

We have a CLI, depot build, that embeds a copy of docker buildx build that routes builds to our infrastructure. Since it's using Buildx as a library, it's typically just a matter of swapping docker build for depot build to send builds to the remote builders.

We've been benchmarking several open-source projects in GitHub Actions that you can find on our homepage, and PostHog is one of them. Every commit to master here, we've been running a docker/build-push-action job and a depot build job and saving the timings. We've been seeing build times with Depot of <5 minutes, often around 2.5 minutes, and sometimes as fast as 20 seconds!

@kylegalbraith reached out to Tim to see if you all might be interested in those speedups here in this repo, Tim asked for a PR, and here we are 🙂

Changes

To use Depot here, I've made the following changes:

  1. The Depot CLI is installed with depot/setup-action
  2. docker/build-push-action is replaced with depot/build-push-action, which implements the same inputs but executes depot build instead of docker buildx build
  3. There's a depot.json file with an ID - this is the ID of your project, an abstraction around a builder VM and cache disk
  4. Authentication with your Depot organization is done via a DEPOT_TOKEN secret - this secret does not yet exist, you will have to generate it and save it here (I'll ping Tim about this via email too)

I'll add a few code comments below for some other minor details.

The tl;dr for how all this works:

  • Previously docker/setup-buildx-action launched a local copy of Buildx, and docker/build-push-action used the docker buildx build CLI to interact with it to build images
  • Now, depot/setup-action installs the Depot CLI, called by depot/build-push-action, which embeds the docker buildx build client to tunnel to a remote builder VM that's launched on demand, with a persistent SSD cache

If you have any questions, feel free to ping me (@jacobwgillespie) or Kyle (@kylegalbraith), we'd be happy to help in any way!

👉 Stay up-to-date with PostHog coding conventions for a smoother review.

How did you test this code?

Yes I've run depot build . locally against this project ID. I have not yet tested these specific workflows as they don't execute for repo forks, and the DEPOT_TOKEN secret doesn't exist yet in this repo.


- name: Login to DockerHub
if: github.repository == 'PostHog/posthog'
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The whole workflow is already guarded with if: github.repository == 'PostHog/posthog', so I've removed the ifs from individual steps.

@@ -10,6 +10,7 @@ on:
jobs:
build-push:
name: Build & push Docker release image
if: github.repository == 'PostHog/posthog'
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this one, it was missing the if guard for the whole workflow like some of the others, so I've added it to match, though let me know if this was in error and I can revert.

depot.json Outdated
@@ -0,0 +1 @@
{"id":"x19jffd9zf"}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the ID of your Depot project in the PostHog org - Kyle has sent @timgl an invite to the org, and he can invite anyone else who needs access 👍

with:
push: false
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

push: false is the default (for both Depot's and Docker's build-push-action)

@timgl
Copy link
Collaborator

timgl commented May 26, 2022

Thanks @jacobwgillespie! I have added the secret token. @fuziontech any thoughts/feedback?

@jacobwgillespie
Copy link
Contributor Author

I think we might have a workflow issue to solve first - workflows triggered via pull_request don't have access to Actions secrets, intentionally so for security reasons, and the docker-image-test.yml workflow runs on the pull_request event.

I think we have two options:

  1. We could move that workflow back to use local Docker/Buildx - that would unblock this PR, Depot would be used after PRs merged. But PR builds would not experience the same speedups as the rest.
  2. We can accept GitHub's OIDC token as authentication for depot build - that would let us securely validate that the build was originating from GitHub, including for pull_request events. We'd likely want to run those builds against a different project ID, so that they ran on separate infrastructure from your production build.

I think (2) is preferable, let me sync with Kyle on that idea, and definitely let me know if you have thoughts as well.

(and thanks for your patience, this is our first OSS project integration)

@jacobwgillespie
Copy link
Contributor Author

Alright, we have implemented option (2), we now support acquiring temporary Depot build tokens by exchanging GitHub Actions OIDC tokens, assuming they match the project trust relationship configuration. 🎉

I've created a new posthog-pull-request project and set it up to trust pull_request workflows run from this repo (you can see this in the project settings), and the docker-image-test.yml workflow now uses that project ID specifically.

And we've released a new version of depot/build-push-action that supports OIDC.

This should be good to re-test. 👍


If this works and we like the approach, we could update the other workflows to use OIDC, then you'd have one less static API token to manage. 🧹

@posthog-bot
Copy link
Contributor

This PR hasn't seen activity in a week! Should it be merged, closed, or further worked on? If you want to keep it open, post a comment or remove the stale label – otherwise this will be closed in another week.

@jacobwgillespie
Copy link
Contributor Author

I've updated the branch with the latest from master, hopefully that fixes the E2E tests.

The Docker image build job failed with Unable to exchange GitHub OIDC token for temporary Depot token - I'm not seeing any token exchange API request in the logs around that time, so I've published v1.1.1 of the depot/build-push-action with more logging around OIDC exchange failures, I suspect it was a local network issue given that the API request did not make it to the backend.

@jacobwgillespie
Copy link
Contributor Author

Alright, well it appears OIDC does not work with pull_request workflows when they are submitted from forked repositories. 😞 It's a security feature on GitHub's part, they restrict the permissions from forks to id-token: read, which is functionally equivalent to id-token: none (both prevent access to the OIDC token).

I've pushed a commit to revert the docker-image-test.yml workflow back to using local Buildx, implementing option (1) from above. PRs from forks will run using local Docker, which will be slower, but after merged, all other workflows run using Depot for the speedups and better caching.

I've also pushed a commit to use OIDC on all the other workflows, there's no reason you should need to embed a static access token anymore. I'll add a comment on one bit of Actions code below.

with:
context: .
file: prod.web.Dockerfile
load: true
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the moment, this change reflects the previous behavior, ordering the steps like so:

  1. Build the image, loading the result in local Docker
  2. Push the image to ECR
  3. Extract static assets by copying out of the built container

With Depot, I think it would be faster to reorder it such that the image it pushed to ECR during build (via the remote builder), then pulled locally for asset extraction:

  1. Build the image and push to ECR from the remote builder
  2. docker pull the image from ECR
  3. Extract static assets by copying out of the built container

That would add some additional network transfer costs, since you'd be pulling the container from ECR, if that's a concern, hence why I did not make that change. There might also be a way to extract and push the static assets as part of the Docker build itself, to avoid needing to pull the container again at all.

@posthog-bot posthog-bot removed the stale label Jun 7, 2022
@@ -0,0 +1 @@
{ "id": "x19jffd9zf" }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some non-blocking (tiny) feedback, I'm always annoyed when a service forces us to add a file with their name into our main repo.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally understand, I feel the same! I've started using VS Code's file nesting feature just to hide those files I can't remove from the repo root. 🙂

We support three ways of getting the project ID to depot:

  1. The depot.json file with an id key
  2. A DEPOT_PROJECT_ID environment variable
  3. Passing it as a CLI option: depot --project xxxxxx ...

The benefit to (1) is mostly that the CLI will find that file up from the current directory, so you can cd into the project, run depot and everything works with no additional config.

But if Depot is mostly useful in CI and/or you have some other way of providing env vars in your local shell (direnv, etc.), we could drop the config file for specifying it that way instead, it'd save one additional file in your repo root.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I'd prefer either env variable or CLI I think!

Again not blocking though. Just getting a final look from someone in our platform team but then we're good to merge. Thanks for your work here!

@timgl timgl merged commit be605ee into PostHog:master Jun 7, 2022
@timgl
Copy link
Collaborator

timgl commented Jun 7, 2022

@jacobwgillespie
Copy link
Contributor Author

Interesting, it failed to find the project token - let me take a look...

@jacobwgillespie
Copy link
Contributor Author

@timgl fix is here: #10182. It's trying to build PostHog/posthog-cloud, which doesn't have a depot.json, so I've updated it with an explicit project ID now.

timgl added a commit that referenced this pull request Jun 7, 2022
jacobwgillespie added a commit to depot/posthog that referenced this pull request Jun 7, 2022
timgl added a commit that referenced this pull request Jul 4, 2022
* Revert "Revert "feat(ci): use Depot for Docker builds in CI (#10002)" (#10179)"

This reverts commit efb1eb2.

* Set posthog-cloud project

Co-authored-by: Jacob Gillespie <jacobwgillespie@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants