@glasnt glasnt commented Jul 23, 2023

description originally from #125, updated for this variation

This repo is designed to be a single Terraform module that deploys a complete sample application. In this case, there are database migrations and Firebase deployments that also need to happen. These have been containerised into Cloud Run jobs, but the Terraform resource for Cloud Run jobs only creates them; it doesn't execute them.

So, we have to use Terraform in unusual ways to get this execution happening.

A way we previously did this was to deploy a Compute Engine instance, with a metadata startup script that performs actions for us, ensuring this is provisioned at the end of the apply process through use of resource dependencies.

However, given the number of regions this solution could be deployed to, there's no one-size-fits-all Compute Engine instance that would work in every available region. As a result, we encountered issues in testing and deployments where the Compute Engine instance couldn't be successfully deployed.

Compute Engine isn't required in this solution, just the ability to start jobs and call APIs.

This PR implements this functionality by creating Cloud Build triggers and using data "http" sources to execute them. Because these triggers aren't connected to a codebase, they use a placeholder Pub/Sub topic as the event type. The triggers are never fired through this topic; instead, they are called via the API (as happens when clicking the "Run" button in the console). These triggers can perform any arbitrary tasks we want.
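As a rough sketch of the pattern described above (not the module's actual code), the following shows a trigger attached to a placeholder topic and a data "http" call that starts it through the Cloud Build API. All resource names, the variables, and the build step are hypothetical. It assumes a version of the hashicorp/http provider that supports `method`, and it assumes the `:run` endpoint accepts a request without a body; depending on the project, the trigger may also need an explicit `service_account`.

```hcl
# Hypothetical inputs; these names are not taken from the PR.
variable "project_id" {
  type = string
}

variable "region" {
  type    = string
  default = "us-central1"
}

# Placeholder topic: gives the trigger an event type. Nothing is published to it.
resource "google_pubsub_topic" "placeholder" {
  project = var.project_id
  name    = "example-placeholder-topic"
}

# Trigger with an inline build, not connected to any repository.
resource "google_cloudbuild_trigger" "setup" {
  project  = var.project_id
  location = var.region
  name     = "example-setup"

  pubsub_config {
    topic = google_pubsub_topic.placeholder.id
  }

  build {
    step {
      name       = "ubuntu"
      entrypoint = "bash"
      args       = ["-c", "echo 'arbitrary setup tasks go here'"]
    }
  }
}

# Access token of the identity running Terraform.
data "google_client_config" "current" {}

# Starts the trigger via the API, like clicking "Run" in the console.
# Assumption: no request body is needed for a trigger without a source repo.
data "http" "run_setup" {
  url    = "https://cloudbuild.googleapis.com/v1/projects/${var.project_id}/locations/${var.region}/triggers/${google_cloudbuild_trigger.setup.trigger_id}:run"
  method = "POST"

  request_headers = {
    Authorization = "Bearer ${data.google_client_config.current.access_token}"
  }
}
```

Because the URL references the trigger's `trigger_id`, Terraform reads the data source only after the trigger exists, which places the call late in the apply.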

Terraform normally presumes that data calls are inert and have no side effects. In this case, any time Terraform inspects the state (including on destroy), these triggers are started. Earlier iterations in #123 had failing test cases: on terraform destroy, Terraform tried to delete the jobs, but because the same run had also started new executions of those jobs, the delete failed while those executions were still running. This isn't an issue when the jobs are executed from a Compute Engine metadata startup script, because that script is part of a defined resource and isn't started as part of Terraform's state inspection process.

But Terraform can't delete resources that it doesn't know about.


This PR replaces the Compute Engine instance, sub/network, and VPC with Cloud Build triggers that:

  • use the created placeholder image as a step in the Cloud Build trigger, to apply the placeholder Firebase image, and
  • create a Cloud Run job for the database setup within the trigger itself, then execute it, then perform other setup steps.

The database setup requires an authorised connection to the Cloud SQL instance. This is possible but complicated to set up in Cloud Build, and simpler to perform in a Cloud Run job.

The placeholder image execution doesn't require this complexity, so instead of creating a job to run this container, it can be run directly as steps in the Cloud Build trigger itself.
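The two approaches above could look roughly like the sketch below: one trigger runs the placeholder image directly as a build step, and another creates and then executes a Cloud Run job from within the build, so Terraform never tracks that job. Every name and variable here is hypothetical, and the flags a real job would need for its Cloud SQL connection and secrets are omitted.

```hcl
# Hypothetical inputs; none of these names come from the PR.
variable "project_id" {
  type = string
}

variable "region" {
  type = string
}

variable "placeholder_topic_id" {
  type        = string
  description = "ID of the placeholder Pub/Sub topic (projects/PROJECT/topics/NAME)."
}

variable "placeholder_image" {
  type        = string
  description = "Container image that deploys the placeholder Firebase site."
}

variable "setup_image" {
  type        = string
  description = "Container image that performs the database setup."
}

# Placeholder deployment: the image runs directly as a build step.
resource "google_cloudbuild_trigger" "placeholder" {
  project  = var.project_id
  location = var.region
  name     = "example-placeholder"

  pubsub_config {
    topic = var.placeholder_topic_id
  }

  build {
    step {
      name = var.placeholder_image
    }
  }
}

# Database setup: the job is created and executed inside the build,
# so it never appears in Terraform state.
resource "google_cloudbuild_trigger" "init" {
  project  = var.project_id
  location = var.region
  name     = "example-init"

  pubsub_config {
    topic = var.placeholder_topic_id
  }

  build {
    step {
      id         = "create-job"
      name       = "gcr.io/google.com/cloudsdktool/cloud-sdk:slim"
      entrypoint = "gcloud"
      args       = ["run", "jobs", "create", "example-setup", "--image", var.setup_image, "--region", var.region]
    }

    step {
      id         = "execute-job"
      name       = "gcr.io/google.com/cloudsdktool/cloud-sdk:slim"
      entrypoint = "gcloud"
      args       = ["run", "jobs", "execute", "example-setup", "--region", var.region, "--wait"]
    }
  }
}
```

Note that in this sketch a second run of the init trigger would fail at the create step if the job already exists; handling that case is left out for brevity.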

The client job could also use this method, but the Terraform infra tests wait for this job to execute, so the job was retained to avoid having to change the tests.

The migrate job still exists, but isn't used by the deployment process. It is referenced in walkthrough tutorials, and is useful as helper functionality.

Additionally, a new fast-follow trigger "activate gcb" (gcb == Google Cloud Build) has been created, to help optimize the chances of success with the deployment. This and additional wait times ensure that IAM propagation, async API enabling, and other processes have time to settle before applying changes. The testing infra for this project creates new projects and applies the Terraform, so these processes would not be required for a human-provisioned project, but they help ensure success in this automated setup.
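One common way to add such a wait in Terraform is the `time_sleep` resource from the hashicorp/time provider. The sketch below is an assumption about how a settle period could be expressed, not a copy of what this PR does; the resource names and the 60-second duration are hypothetical.

```hcl
# Hypothetical input; not taken from the PR.
variable "project_id" {
  type = string
}

# Enabling an API is asynchronous; it may not be usable immediately.
resource "google_project_service" "cloudbuild" {
  project            = var.project_id
  service            = "cloudbuild.googleapis.com"
  disable_on_destroy = false
}

# Pause after the API is enabled so that enablement and IAM changes can settle.
# The duration is an illustrative guess, not a measured value.
resource "time_sleep" "wait_for_settle" {
  create_duration = "60s"

  depends_on = [google_project_service.cloudbuild]
}

# Anything that needs the settled state depends on the sleep, not on the API resource.
resource "google_pubsub_topic" "example" {
  project = var.project_id
  name    = "example-topic"

  depends_on = [time_sleep.wait_for_settle]
}
```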

glasnt added 30 commits July 19, 2023 14:37
these steps weren't tested; you can't do interactive steps in distroless

ubuntu latest is sufficiently small, allows installing/executing utils
@glasnt glasnt changed the title from "temporary: restore testing in grayside experiements" to "feat: replace Compute Engine metadata startup scripts with Cloud Build" Jul 24, 2023
@glasnt glasnt marked this pull request as ready for review July 24, 2023 23:10
@glasnt glasnt requested review from a team and donmccasland as code owners July 24, 2023 23:10

@grayside grayside left a comment

🥳

@glasnt glasnt merged commit 6f80860 into main Jul 25, 2023
@glasnt glasnt deleted the feat/postjsstrigger-grayside-glasnt branch July 25, 2023 00:41