Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IaC: Terraform and ACM configurations for Multi-Autopilot GKE environment setup with CI/CD, Anthos & CloudSQL #1006

Merged

Conversation

aablsk
Copy link
Contributor

@aablsk aablsk commented Oct 10, 2022

Background

Initially the fork that this PR is based on was created to showcase Google Cloud's CICD story encompassing Cloud Build / Cloud Deploy / kustomize / skaffold. In order to maximize value from the work done for the showcase we decided to automate the setup with terraform utilising Cloud Foundation Toolkit.

Change Summary

Added configuration that includes:

  • Terraform for all infrastructure (with exception of Github repo connection which is a one-time manual step)
  • CI per team
    • CloudBuild /w skaffold (includes caching)
  • CD (staging, production) per team
    • CloudDeploy /w skaffold profiles per environment
  • 3 Environments:
    • Development (skaffold dev only)
      • GKE
      • Autopilot
      • ASM
      • ACM for service accounts, namespaces, rolebindings…
      • Workload identity
      • Namespace: bank-of-anthos-development
    • Staging (CD deploy)
      • GKE
      • Autopilot
      • ASM
      • ACM for service accounts, namespaces, rolebindings…
      • Workload identity
      • Namespace: bank-of-anthos-staging
      • CloudSQL
    • Production (CD deploy)
      • GKE
      • Autopilot
      • ASM
      • ACM for service accounts, namespaces, rolebindings…
      • Workload Identity
      • Namespace: bank-of-anthos-production
      • CloudSQL
  • Heavy use of kustomize components to stay DRY
  • Minimized service account permissions
  • Uses CloudFoundation Toolkit for Kubernetes (which restricts modularization of environments due to kubernetes provider circular dependency)

Additional Notes

This PR does NOT include the changes to the src/ folder, which include kustomizations for microservices, skaffold profiles and cloudbuild.yaml definitions for individual teams.
The terraform code included in this PR does however reference these files, which leads to this IaC not being immediately compatible with the current application source.

Testing Procedure

Follow steps here using the prepared shell scripts for easiest setup.

Please note that script 4 and 5 will log errors as they require the second PR (which includes changes to src/ folder)

Related PRs or Issues

I stole shamelessly from these PRs:
#881
#792

Changes to src/ can be seen in this fork

@aablsk aablsk requested a review from a team as a code owner October 10, 2022 14:33
@snippet-bot
Copy link

snippet-bot bot commented Oct 10, 2022

Here is the summary of changes.

You are about to add 4 region tags.

This comment is generated by snippet-bot.
If you find problems with this result, please file an issue at:
https://github.com/googleapis/repo-automation-bots/issues.
To update this comment, add snippet-bot:force-run label or use the checkbox below:

  • Refresh this comment

@aablsk
Copy link
Contributor Author

aablsk commented Oct 10, 2022

@bourgeoisor What's our requirement on empty lines at end of file? (need to fix this still)

@bourgeoisor bourgeoisor self-assigned this Oct 17, 2022
Copy link
Member

@bourgeoisor bourgeoisor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you Alex for the PR, added a handful of comments and questions :)

@@ -0,0 +1,48 @@
# Copyright 2020 Google LLC
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these ACM-managed files "copies" or "moved" of e.g. https://github.com/GoogleCloudPlatform/bank-of-anthos/blob/main/kubernetes-manifests/config.yaml

If copied: how will we make sure they're kept in sync?

If moved: how will non-TF and/or non-ACM deployments deploy the required ConfigMap (for example)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I propose to 'move' these files instead of 'copying' them. Since this base configuration is a kustomize component, it can be easily added to a skaffold configuration. As discussed I have prepared a skaffold profile ('development') which is compatible with a vanilla k8s cluster for the next PR that will include this.

Source to include these manifests specifically can be found here.

@@ -0,0 +1,19 @@
# Copyright 2022 Google LLC
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does ACM work with Kustomize?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 24 to 27
patch: |-
- op: add
path: /metadata/annotations/iam.gke.io~1gcp-service-account
value: gke-workload-development@bank-of-anthos-aablsk.iam.gserviceaccount.com
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see a leftover "aablsk", is that intentional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was previously replaced by the 01-patch-acm-config.sh script. As a proposal I have removed all bash scripts and instead added step-by-step instructions for the initial (one-time) setup to the README.md

metadata:
name: cloud-sql-admin
data:
connectionName: bank-of-anthos-aablsk:europe-west1:bank-of-anthos-db-production
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hardcoded values?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was previously replaced by the 01-patch-acm-config.sh script. As a proposal I have removed all bash scripts and instead added step-by-step instructions for the initial (one-time) setup to the README.md

patch: |-
- op: add
path: /metadata/annotations/iam.gke.io~1gcp-service-account
value: gke-workload-production@bank-of-anthos-aablsk.iam.gserviceaccount.com
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hardcoded values?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was previously replaced by the 01-patch-acm-config.sh script. As a proposal I have removed all bash scripts and instead added step-by-step instructions for the initial (one-time) setup to the README.md

@@ -0,0 +1,28 @@
# Copyright 2022 Google LLC
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not super sure about this script -- I think I'd prefer if we'd hardcode (or default a variable) to this current GCP project / region and not have this script at all. I'd rather git commands (besides git clone) not be involved.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps instead of a script, it could be part of the README, where in the prereq you state that if you're deploying this TF outside the context of BoA maintainers, you should:

  • fork the repo
  • substitute some stuff like the GCP project name and commit those in your now forked repo (perhaps that bit can be a quick script, but it should not be part of the TF family of script) -- though I'd personally lean towards simplifying and not scripting it at all... Just stating it in the README
  • THEN you can run the TF scripts

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. These scripts were mostly created to support me in testing reproducibility of the setup with little effort. For this PR I have therefore removed all bash scripts and instead added step-by-step instructions for the initial (one-time) setup to the README.md.

Comment on lines 16 to 19
/* backend "gcs" {
bucket = "tf-state-boa-tf"
prefix = "bank-of-anthos"
} */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commented out?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have removed the comment and added instructions for setting this up to the README.md

iac/tf-multienv-cicd-anthos-autopilot/env-staging.tf Outdated Show resolved Hide resolved
1. Terraform will set up your GCP infrastructure
1. Waits for provisioning of managed ASM to complete
1. Deploys jobs to staging and production k8s clusters which initialise the Cloud SQL databases.
1. Triggers CICD for accounts, frontend & ledger teams
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means BoA gets fully deployed as well, correct? Does it get these things? https://github.com/GoogleCloudPlatform/bank-of-anthos/tree/main/.github/prod-cluster

Ideally we'd dogfood this entire TF flow for real when we develop on BoA and the prod cluster would be the https://bank-of-anthos.xyz/ cluster :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I was not aware of this configuration, but happy to add it with the second PR. (will include terraform for CloudArmor, IP... and adapted production skaffold profile with a fitting kustomization)

@aablsk
Copy link
Contributor Author

aablsk commented Oct 30, 2022

I've responded to most questions and have made some adjustments in line with your comments. I'm also happy make myself available for a synchronous meeting from November 3rd if that would be helpful.

@bourgeoisor Could you please provide me with the following values that you'd like to use for your setup, so I can include them in the configuration:

  1. Google Cloud ProjectId in which you are planning to deploy this
  2. Region in which the resources should be deployed (if applicable)
  3. Name of the GCS bucket in which you'd like to store the shared terraform state

Thanks in advance!

Please note that work on the second PR is still in progress:

"iam.googleapis.com",
"sqladmin.googleapis.com"
]
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Praise: I like that we're dedicating a separate apis.tf file for GCP API enablement. This is clean!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🙈 Thanks for the kind words, Nim!

]
}

# we cannot use a custom service account with autopilot clusters in terraform https://github.com/hashicorp/terraform-provider-google/issues/9505 🤷
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thought: After merging this pull-request, we should consider tracking our desire to use a custom service account here in a GitHub issue (i.e., create a new GitHub Issue to add visibility).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed!

Copy link
Member

@bourgeoisor bourgeoisor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@bourgeoisor bourgeoisor merged commit 5ca01c4 into GoogleCloudPlatform:main Nov 23, 2022
@aablsk aablsk deleted the iac-multienv-cicd-anthos-autopilot branch November 24, 2022 07:22
big-appled pushed a commit to big-appled/bank-of-anthos that referenced this pull request Jun 9, 2023
…ment setup with CI/CD, Anthos & CloudSQL (GoogleCloudPlatform#1006)

* add ACM base

* add ACM overlays for development, staging, production environments

* add tf infrastructure definition

* add cicd pipeline custom tf module

* add shell scripts for easy setup

* add README documenting how to setup multienv-cicd-anthos-autopilot

* substitute projectId and region references in ACM config

* substitute projectId and region references in ACM config

* add configurable sync_branch

* remove obsolete file

* fix license header

* replace hard-coded script names with variable

* make EOF consistent empty line

* remove commented master_authorized_networks

* substitute projectId and region references in ACM config

* remove ZONE variable

* move targets, teams to tf.locals

* remove shell scripts, update README.md to contain manual steps

* add instructions for terraform state bucket

* fix terraform, update README.md, reintroduce zone for CloudSQL master instance

Co-authored-by: Olivier Bourgeois <3271352+bourgeoisor@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants