Always deploy the correct Docker image version (tag) #813

Closed
seanh opened this issue Oct 25, 2022 · 2 comments · Fixed by #814
@seanh
Contributor

seanh commented Oct 25, 2022

The deploy.yml workflow always deploys the latest image from Docker Hub, rather than the image corresponding to the workflow run's Git commit.

Details

The deploy.yml workflow calls the shared dockerhub.yml workflow, which generates a version-number Docker tag:

    run: |
      GIT_SHORT=$(git rev-parse --short HEAD)
      GIT_COMMIT_TIMESTAMP=$(git --no-pager show -s --format=%ct $GIT_SHORT)
      GIT_TS_CONVERTED=$(date -d @$GIT_COMMIT_TIMESTAMP +"%Y%m%d")
      VERSION=$(echo "$GIT_TS_CONVERTED-g$GIT_SHORT")
      export TAG=${VERSION//+/-}
      echo "::set-env name=TAG::$TAG"
      git archive HEAD | docker build -t "hypothesis/${name}:${TAG}" -

You can see the generated tags on Hypothesis's Docker Hub account. The tag names contain the date (YYYYMMDD) followed by the letter g followed by the first few characters of the Git commit SHA. For example: 20221013-g3604296. (The date is the date of the Git commit, not the date when the Docker image was built, so if you rebuild the Docker image from the same Git commit it should get the same date again.)
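
The tag format described above can be reproduced in isolation. This is a minimal sketch, not the code in dockerhub.yml: `make_tag` is a hypothetical helper, and pinning the date to UTC with `-u` is an assumption (the original step uses the runner's local time zone). It relies on GNU `date`.

```shell
#!/bin/sh
# make_tag is a hypothetical helper (not part of dockerhub.yml).
# It builds a tag such as 20221013-g3604296 from a commit timestamp
# (Unix seconds) and an abbreviated commit SHA.
make_tag() {
    commit_ts=$1
    short_sha=$2
    # -u pins the date to UTC (an assumption); `-d @SECONDS` is GNU date syntax.
    commit_date=$(date -u -d "@${commit_ts}" +%Y%m%d)
    printf '%s-g%s\n' "$commit_date" "$short_sha"
}

# 1665662400 is 2022-10-13 12:00:00 UTC.
make_tag 1665662400 3604296   # prints 20221013-g3604296
```

Inside a checkout the two arguments could come from `git show -s --format=%ct HEAD` and `git rev-parse --short HEAD`. Because both inputs depend only on the commit, rebuilding the same commit yields the same tag.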

The deploy.yml caller workflow then deploys the app to its QA and production environments using the hard-coded Docker tag latest. For example, this is the deploy.yml job for the production public Via (notice the Version: latest):

    prod-via:
      needs: qa-via
      name: ${{ github.event.repository.name }}
      uses: hypothesis/workflows/.github/workflows/eb-update.yml@main
      with:
        Application: ${{ github.event.repository.name }}
        Environment: prod
        Region: us-west-1
        Operation: deploy
        Version: latest
      secrets: inherit

Why is this a problem?

In most cases the Docker image for the deploy.yml workflow run's Git commit will be the same thing as the latest image and all will be well: the workflow run just built the image for its commit and published it to Docker Hub so that image is now the latest. But I think the workflow could deploy the wrong version of the Docker image to QA and/or production in various edge cases. For example:

  1. Developer A merges pull request A, triggering deploy.yml workflow run A

  2. Workflow run A runs the tests, builds Docker image A, and publishes the image to Docker Hub. At this point the latest image on Docker Hub is image A

  3. Workflow run A deploys the latest image to QA and then stops and waits for approval. Developer A starts testing PR A on QA

  4. Meanwhile another developer merges pull request B, triggering deploy.yml workflow run B

  5. Workflow run B runs the tests, builds Docker image B, and publishes the image to Docker Hub. At this point the latest image on Docker Hub is image B

  6. Developer A finishes testing PR A on QA and approves the PR to be deployed to production

  7. Workflow run A now continues and deploys the latest Docker image to production. latest is Docker image B, not A, so the wrong image has been deployed. B hasn't been tested on QA yet but is now deployed to production.

    GitHub's various UIs relating to workflows/deployments/environments will display that commit A has been deployed to production, whereas in fact B was.
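
The sequence above can be replayed as a toy model. This is only an illustrative sketch: the registry is reduced to a single shell variable, `publish` is a hypothetical stand-in for pushing an image, and the tag for commit B is made up.

```shell
#!/bin/sh
# Toy model of the registry: "latest" is one mutable pointer that every
# publish moves. Nothing here talks to Docker Hub.
latest=""
publish() { latest=$1; }

tag_a="20221013-g3604296"   # tag for commit A (example tag from this issue)
tag_b="20221014-g0000000"   # hypothetical tag for commit B

publish "$tag_a"        # step 2: run A publishes image A
qa_deploy=$latest       # step 3: run A deploys "latest" to QA
publish "$tag_b"        # step 5: run B publishes image B
prod_deploy=$latest     # step 7: run A resumes and deploys "latest"

echo "QA got:           $qa_deploy"
echo "production got:   $prod_deploy"
echo "pinned would get: $tag_a"
```

Run A tests one image on QA and ships a different one to production, whereas a deployment pinned to run A's own tag would not be affected by run B's publish.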

I believe there are other cases where the wrong commit might be deployed as well. For example:

  • When re-running the deploy.yml workflow for an old commit. This is something that the GitHub UI lets you do. It'll deploy the latest image, not the image for the commit whose workflow you just ran. But GitHub will think you deployed the old commit's image
  • In cases where deploying to production fails due to some random error and a developer comes along later and re-runs the workflow's failed deploy job(s). Again: it's not going to deploy that workflow run's Docker image, it's going to deploy the latest image from Docker Hub, and they might not always be the same thing
  • I think there's a race condition: even though the workflow has uploaded its image to Docker Hub, the Docker Hub API might not be serving that image for download yet by the time Elastic Beanstalk tries to fetch it. In this case we would want the deployment to crash because the image can't be found. Instead, it will deploy the wrong image: the latest image will be the one for the previous commit
    • Similarly if publishing to Docker Hub seems to succeed and the workflow continues, but actually the image somehow wasn't published
  • Maybe other edge cases I'm not thinking of
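
For the cases where the image may not be available yet, one option is to fail before deploying if the exact tag cannot be fetched from the registry. The sketch below is an assumption rather than anything in the existing workflows: `require_image` is a hypothetical helper and the repository name in the example is illustrative. It uses `docker manifest inspect`, which exits non-zero when the registry has no manifest for the tag.

```shell
#!/bin/sh
# require_image is a hypothetical helper: fail loudly if the exact tag is not
# available from the registry, rather than falling back to another image.
require_image() {
    image=$1
    # `docker manifest inspect` asks the registry for the manifest and exits
    # non-zero when the tag cannot be found.
    if docker manifest inspect "$image" >/dev/null 2>&1; then
        echo "found $image"
    else
        echo "image $image is not available, refusing to deploy" >&2
        return 1
    fi
}

# Example (needs Docker and network access; the repository name is illustrative):
# require_image "hypothesis/via:20221013-g3604296" || exit 1
```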

Solution

Instead of Version: latest, all the QA and production deployment jobs in deploy.yml need to say Version: ${{ docker_tag }}, where ${{ docker_tag }} is the generated tag (for example 20221013-g3604296). Unfortunately this tag name is generated in the shared dockerhub.yml workflow, not in the deploy.yml workflow, so we're going to have to find a way to pass it back.

@indigobravo
Member

I agree with the synopsis, there are some edge cases here that could catch us out.

I like the idea of using the true ${{ docker_tag }} to lock deployments. At the time of writing I am struggling to think of ways this could catch us out.

Thinking further, we could use ${{ docker_tag }} to validate the deploy has worked correctly and that version is running in Elastic Beanstalk, but we could choose to do that further down the line.
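
A post-deploy check along these lines might look like the sketch below. `verify_version` is a hypothetical helper, and it assumes that the version label Elastic Beanstalk reports for the environment is the same string as the Docker tag, which depends on how eb-update.yml names application versions.

```shell
#!/bin/sh
# verify_version is a hypothetical helper that compares the tag this workflow
# run built with the version label the environment reports after deploying.
verify_version() {
    expected=$1
    actual=$2
    if [ "$expected" = "$actual" ]; then
        echo "OK: environment is running $actual"
    else
        echo "MISMATCH: expected $expected but environment is running $actual" >&2
        return 1
    fi
}

# In a workflow the second argument could come from the AWS CLI, for example
# (ENVIRONMENT_NAME is a placeholder):
#   aws elasticbeanstalk describe-environments \
#     --environment-names "$ENVIRONMENT_NAME" \
#     --query 'Environments[0].VersionLabel' --output text
verify_version "20221013-g3604296" "20221013-g3604296"
```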

Wondering if we might be able to find a solution with GHA outputs...

https://docs.github.com/en/actions/using-jobs/defining-outputs-for-jobs
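
As a rough sketch of the outputs approach, the step that computes the tag could also record it as a step output by appending a `name=value` line to the file named by `GITHUB_OUTPUT`. The output name `docker_tag` and the fixed tag value are assumptions for illustration; the real step would use the tag it has just computed.

```shell
#!/bin/sh
# Stand-in for the tag computed earlier in the step (example tag from this issue).
TAG="20221013-g3604296"

# On a GitHub Actions runner GITHUB_OUTPUT points at a file that collects step
# outputs. It is unset elsewhere, so fall back to a temp file for local runs.
: "${GITHUB_OUTPUT:=$(mktemp)}"

# Record the tag as a step output named docker_tag (the name is an assumption).
echo "docker_tag=${TAG}" >> "$GITHUB_OUTPUT"
```

From there the value would still need to be surfaced at each level: the job maps it under its `outputs:` key, the reusable workflow declares it under `on.workflow_call.outputs`, and the caller reads it as `${{ needs.<job_id>.outputs.docker_tag }}` in place of Version: latest.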

@seanh
Contributor Author

seanh commented Oct 25, 2022
