We describe the shortest sequence of steps required to bootstrap into an infrastructure-as-code tool so that further configuration of the Google Cloud Platform organization is managed through the tool.
During my research into all this, I used Terraform; although I did stumble onto some mentions of Pulumi, it was only after I completed the Terraform setup three times and wrote it up that I realized that Pulumi is actually an alternative to Terraform - and I like it better ;)
With Pulumi, instead of a Terraform-specific language (HCL), I can use my preferred language (Scala) and my preferred build tool (Gradle) to build the description of the state I want and update the state to the desired state.
To some extent, I agree that configuration languages should not be Turing complete (e.g., Dhall) and that a build file should be data, not a program; but having the full power of my programming language at my disposal when describing the desired state of my cloud infrastructure feels great ;)
Later still, I found out that there exists a Scala Pulumi SDK: Besom; although young, the project is already quite capable and is being actively developed by a team of very smart and friendly people. Besom facilitates writing Pulumi code in a native Scala style and integrates with effect systems!
Unlike Terraform, Pulumi does not have a Google Workspace provider, so some things cannot be managed using Pulumi.
Creation and deletion of Workspace users and changing their settings cannot be done with Pulumi, but I do not miss this functionality.
Creation and deletion of groups and management of group membership can be done using the GCP provider (which Pulumi, of course, does have).
Changing group settings and aliases cannot be done with Pulumi at this point, and I do miss this functionality.
In the sample Terraform files, fragments that rely on the Google Workspace provider are commented out and marked with `using Workspace provider`.
A number of UI 'consoles' are used to manage the configuration.
Manual steps that can be done either in various UI consoles or using the command line are given in the command line form.
The tools that need to be installed:
- For interacting with Google Cloud: `gcloud` (`google-cloud-cli`) from cloud.google.com/sdk.
- For key management etc.: `direnv` from direnv.net.
- Pulumi from pulumi.com.
- Besom from Besom.
- Terraform - if you want to use it - from hashicorp.com (or `google-cloud-cli-terraform-tools`).
Sample setup for both Pulumi and Terraform is in the domain-infra folder.
In the following and in the sample files for Pulumi and Terraform:
- `domain.tld` is the domain of the organization involved
- `admin@domain.tld` is the super-admin of the domain
In Google Domains:
- transfer Workspace subscription from Google Domains to Google Workspace
- GCP organization gets auto-created upon login (?)
- start GCP trial if applicable
- set up billing
In Admin Console:
- set up billing
- turn off automatic Google Workspace licensing
- activate Google Groups for Business (optional)
Here we:
- create project
- enable services in it
- create service account
- assign roles to it
# log in as a super-admin
$ gcloud auth login admin@domain.tld
# create project
$ gcloud projects create "domain-infra" \
--name="Domain Cloud Infrastructure" --no-enable-cloud-apis
# find out billing `ACCOUNT_ID` (and `NAME`)
$ gcloud beta billing accounts list
# link the project to the billing account
$ gcloud beta billing projects link "domain-infra" \
--billing-account ACCOUNT_ID
$ gcloud config set project "domain-infra"
# enable APIs used by Pulumi or Terraform
$ gcloud services list --available # all
$ gcloud services list # enabled
# "Cloud Billing API": for working with billing accounts
$ gcloud services enable cloudbilling.googleapis.com
# "Cloud Resource Manager API": for project operations
$ gcloud services enable cloudresourcemanager.googleapis.com
# "Identity and Access Management (IAM) API": for Service Account creation
# also enables iamcredentials.googleapis.com
$ gcloud services enable iam.googleapis.com
# "Service Usage API": listing/enabling/disabling services
$ gcloud services enable serviceusage.googleapis.com
# create a Service Account for running Pulumi or Terraform
$ gcloud iam service-accounts create terraform \
--display-name="terraform" --description="Service Account for Terraform"
# obtain the organization id (org_id)
$ gcloud organizations list
# grant the Service Account roles needed to bootstrap the rest
# for working with billing accounts
$ gcloud organizations add-iam-policy-binding org_id \
--member="serviceAccount:terraform@domain-infra.iam.gserviceaccount.com" \
--role="roles/billing.admin"
# for Service Account creation
$ gcloud organizations add-iam-policy-binding org_id \
--member="serviceAccount:terraform@domain-infra.iam.gserviceaccount.com" \
--role="roles/iam.serviceAccountAdmin"
# for project operations
$ gcloud organizations add-iam-policy-binding org_id \
--member="serviceAccount:terraform@domain-infra.iam.gserviceaccount.com" \
--role="roles/resourcemanager.organizationAdmin"
# remove default roles from the domain
$ gcloud organizations remove-iam-policy-binding org_id \
--member=domain:domain.tld \
--role=roles/billing.creator
$ gcloud organizations remove-iam-policy-binding org_id \
--member=domain:domain.tld \
--role=roles/resourcemanager.projectCreator
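The three role grants for the service account above differ only in the role name, so they can be generated in a loop. The following is a minimal sketch, not part of the sample setup: the function name `grant_role_commands` is hypothetical, `org_id` is the same placeholder used above, and the script only prints the commands so that they can be reviewed before being run.

```shell
# Sketch: print (rather than run) the role-granting commands for the bootstrap service account.
# ORG_ID defaults to the placeholder used in the text; substitute the id
# reported by `gcloud organizations list`.
ORG_ID="${ORG_ID:-org_id}"
SERVICE_ACCOUNT="terraform@domain-infra.iam.gserviceaccount.com"

# Hypothetical helper: emits one gcloud command per bootstrap role.
grant_role_commands() {
  for role in billing.admin iam.serviceAccountAdmin resourcemanager.organizationAdmin; do
    printf 'gcloud organizations add-iam-policy-binding %s --member="serviceAccount:%s" --role="roles/%s"\n' \
      "$ORG_ID" "$SERVICE_ACCOUNT" "$role"
  done
}

grant_role_commands
```

Printing instead of executing keeps the sketch safe to run anywhere; once the output looks right, the printed lines can be pasted into a shell where `gcloud` is logged in as the super-admin.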
Create and retrieve service account key:
$ gcloud iam service-accounts keys create \
/path/to/terraform-domain-infra.json \
--iam-account=terraform@domain-infra.iam.gserviceaccount.com
In addition to running `pulumi` or `terraform` from the command line locally, it should be possible to run it from `gradle` and from GitHub Actions.
Giving the service account key to the tool in an environment variable should enable all the scenarios of running it.
On a local machine, we use an `.envrc` file in the project repository that `direnv` processes to set the appropriate environment variables; see .envrc.
In GitHub Actions, environment variables are set from secrets.
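For illustration only (the actual `.envrc` from the repository is not reproduced here), such a file might look like the sketch below. It assumes the tools pick up the key through the standard Application Default Credentials variable `GOOGLE_APPLICATION_CREDENTIALS`, and it reuses the placeholder key path from the key-creation step; the sample setup may well use a different variable.

```shell
# .envrc (hypothetical sketch): direnv evaluates this file on entering the directory.
# Assumption: the tools locate the service account key via Application Default
# Credentials; the path is the placeholder used when the key was created.
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/terraform-domain-infra.json"
```

After creating or changing the file, `direnv allow` has to be run once in that directory before direnv will load it.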
To be able to work with subdomain-like Google Storage Buckets like `state.domain.tld`, the service account `terraform@domain-infra.iam.gserviceaccount.com` has to be added to the owners of `domain.tld` in Google Search Central at https://www.google.com/webmasters/verification/details?hl=en&domain=domain.tld.
This is required even with the domain in Google Cloud Domains.
To be able to do this, one needs to first add the property in the Google Search Console - which is not a bad idea regardless, and is also needed to later create organization, account and properties in the Google Marketing Platform.
If using the Google Workspace Terraform provider to manage users and groups, assign "User Management Admin" and "Group Admin" roles to the Terraform service account `terraform@domain-infra.iam.gserviceaccount.com` in Admin Console.
Pulumi does not have a provider for Google Workspace, so this step does not apply :)
Since the Pulumi setup uses Gradle, appropriate Gradle files need to be added to the project:
- gradle/wrapper/gradle-wrapper.jar
- gradle/wrapper/gradle-wrapper.properties
- gradlew
- gradlew.bat
Setup also requires Gradle build files for the project:
In `build.gradle`, we declare dependencies:
- Scala standard library
- Pulumi helper classes (`org.podval.tools:org.podval.tools.pulumi`) published from this repository
If using Besom:
- Besom (`org.virtuslab:besom-core`)
- Besom Google Cloud Platform provider (`org.virtuslab:besom-gcp`)
If using Pulumi:
- Pulumi (`com.pulumi:pulumi`)
- Pulumi Google Cloud Platform provider (`com.pulumi:gcp`)
Also, we need to add Pulumi project file Pulumi.yaml and stack file Pulumi.dev.yaml.
The latter specifies the Google Cloud Platform project id of the infrastructure project; the former specifies the Google Cloud Storage bucket to use to store Pulumi state - until the state migrates into the bucket, those lines need to be commented out.
The code is packaged as an application with `tld.domain.infra.Main` as the main class: the `pulumi` command detects the presence of the Gradle build file and runs the application with `gradlew run --console=plain`.
Sample Pulumi code is in the domain-infra/src folder; all of it is contained in one Scala file - tld/domain/infra/MainBesom.scala if using Besom or tld/domain/infra/MainPulumi.scala if not. The code uses Pulumi helper classes.
Sample Terraform files are in the domain-infra/terraform folder.
No additional setup is needed - just run the `terraform` command in that folder.
The looping approach using `for_each` is borrowed from a blog post by Yevgeniy Brikman.
Sample files:
- .gitignore - do not check the state in
- main.tf - overall setup
- project-infra.tf - project and its services
- sa-terraform.tf - service account and its roles
- group-gcp-organization-admins.tf - administrators group and its roles
- user-admin.tf - administrator
- bucket-state.domain.tld.tf - bucket to store state
In `main.tf`, we specify the Google Cloud Storage bucket to use to store Terraform state - until the state migrates into the bucket, those lines need to be commented out.
Now we are ready to initialize Pulumi:
$ pulumi login --local
$ pulumi stack init dev --secrets-provider=passphrase
$ pulumi config set gcp:project domain-infra
Now, we import existing resources: TODO any differences between Besom and Pulumi?
# project
$ pulumi import "gcp:organizations/project:Project" "project:domain-infra" "projects/domain-infra"
# project services
$ pulumi import "gcp:projects/service:Service" \
"project:domain-infra/service:cloudbilling" "domain-infra/cloudbilling.googleapis.com"
$ pulumi import "gcp:projects/service:Service" \
"project:domain-infra/service:cloudresourcemanager" "domain-infra/cloudresourcemanager.googleapis.com"
$ pulumi import "gcp:projects/service:Service" \
"project:domain-infra/service:iam" "domain-infra/iam.googleapis.com"
$ pulumi import "gcp:projects/service:Service" \
"project:domain-infra/service:serviceusage" "domain-infra/serviceusage.googleapis.com"
# service account
$ pulumi import "gcp:serviceAccount/account:Account" "serviceAccount:terraform@domain-infra" "projects/domain-infra/serviceAccounts/terraform@domain-infra.iam.gserviceaccount.com"
# service account roles
$ pulumi import "gcp:organizations/iAMMember:IAMMember" \
"serviceAccount:terraform@domain-infra/role:billing.admin" \
"<ORG ID> roles/billing.admin serviceAccount:terraform@domain-infra.iam.gserviceaccount.com"
$ pulumi import "gcp:organizations/iAMMember:IAMMember" \
"serviceAccount:terraform@domain-infra/role:iam.serviceAccountAdmin" \
"<ORG ID> roles/iam.serviceAccountAdmin serviceAccount:terraform@domain-infra.iam.gserviceaccount.com"
$ pulumi import "gcp:organizations/iAMMember:IAMMember" \
"serviceAccount:terraform@domain-infra/role:resourcemanager.organizationAdmin" \
"<ORG ID> roles/resourcemanager.organizationAdmin serviceAccount:terraform@domain-infra.iam.gserviceaccount.com"
TODO
- project billing info
- service account keys (create new service account keys via Pulumi and delete the old ones?)
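The four project-service imports above follow one pattern, so they can be generated rather than typed. This is a sketch under assumptions: the function name `service_import_commands` is hypothetical, the resource names mirror the ones used in the import commands above, and the commands are only printed, not executed.

```shell
# Sketch: print the `pulumi import` commands for the enabled project services.
PROJECT="domain-infra"

# Hypothetical helper: one import command per service enabled during bootstrap.
service_import_commands() {
  for service in cloudbilling cloudresourcemanager iam serviceusage; do
    printf 'pulumi import "gcp:projects/service:Service" "project:%s/service:%s" "%s/%s.googleapis.com"\n' \
      "$PROJECT" "$service" "$PROJECT" "$service"
  done
}

service_import_commands
```

Extending the list of service names in the loop is then the only change needed when more APIs are enabled by hand before the first import.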
Now, the state described by the code is applied:
$ pulumi up
Now that the state bucket exists, we migrate the state into it:
- export the state:
$ pulumi stack export --show-secrets --file dev.stack.json
- in `Pulumi.yaml`, uncomment the state bucket configuration
- initialize and import the stack:
$ pulumi stack init
$ pulumi stack import --file dev.stack.json
Now we are ready to initialize Terraform:
$ cd terraform
$ terraform init
Existing Google Cloud Platform resources can be bulk-exported in Terraform format if desired:
$ gcloud beta resource-config bulk-export --path=entire-tf-output \
--organization=org_id --resource-format=terraform
Now, we import existing resources:
# project
$ terraform import google_project.infra "projects/domain-infra"
# service account
$ terraform import google_service_account.terraform \
"projects/domain-infra/serviceAccounts/terraform@domain-infra.iam.gserviceaccount.com"
# if using Workspace provider to manage Google Workspace user(s)
$ terraform import googleworkspace_user.admin admin@domain.tld
Instead of importing enabled services of the infrastructure project individually like this:
$ terraform import google_project_service.cloudbilling_googleapis_com \
domain-infra/cloudbilling.googleapis.com
I rely on idempotency and just Terraform the whole map `google_project_service.project["…"]` over; as a result, the initial `terraform apply` might fail and will need to be repeated - depending on the order of modifications.
The same applies to the service account roles.
Now, the state described by the configuration is applied:
$ terraform apply
Now that the state bucket exists, we migrate the state into it:
In `main.tf`, uncomment `backend "gcs" {…}`.
Then, move the state to the bucket (see documentation):
$ terraform init -migrate-state
Domains can be imported from Google Domains into Cloud Domains by the owner of the domains (not by the Terraform Service Account). Prices in Cloud Domains are the same as in Google Domains. Domains can be exported out of the Cloud Domains.
Once imported, the domain disappears from Google Domains' list, but is visible at https://domains.google.com/registrar?d=domain.tld, and can be added back by clicking "Add Project".
Website forwarding can still be set up in the Google Domains UI even if the domain is managed by Google Cloud Domains.
Google Terraform provider does not support Cloud Domains - but it does support management of the DNS records for the domains configured to use Google Cloud DNS. For each such domain a zone must be Terraformed and then associated with the domain. I do not see enough benefits in using Cloud DNS.
Google Domains goes away at the end of 2023, and all the domains from Cloud Domains go with it, so I am not sure if it makes sense to move the domains from Google Domains to Cloud Domains either - but I think I’ll do it just in case, and once the domains move, I’ll look into the benefits of managing DNS as code again.
$ gcloud auth login admin@domain.tld
$ gcloud domains registrations list-importable-domains
$ gcloud domains registrations import domain.tld
# assuming zones are terraformed:
$ gcloud domains registrations configure dns domain.tld \
--cloud-dns-zone=domain-tld
# TODO import a zone into Terraform:
$ terraform import google_dns_managed_zone.domain_tld \
projects/domain-infra/managedZones/domain-tld
# disable DNSSEC
$ gcloud domains registrations configure dns domain.tld \
--disable-dnssec
# switch back from Google Cloud DNS to Google Domains
$ gcloud domains registrations configure dns domain.tld \
--use-google-domains-dns
With the Pulumi GCP provider upgrade from 6.x to 7.x, `serviceAccount` got renamed to `serviceaccount`, which broke my existing stacks, and the only way I found to fix the breakage requires manual local changes to the Pulumi state of the stack:
# bring the stack to the local machine:
$ pulumi stack export --show-secrets --file dev.stack.json
# delete the stack and thus its state files from the state bucket
# in `Pulumi.yaml`, comment out the state bucket configuration
# tell Pulumi to place its files under `.pulumi`
$ pulumi login file://.
$ pulumi stack init dev
$ pulumi stack import --file dev.stack.json
# fix up the state file:
# - change the GCP provider version
# - fix up the `gcp:serviceAccount` to `gcp:serviceaccount`
# once `pulumi up` works again, move the state back to the bucket:
$ pulumi stack export --show-secrets --file dev.stack.json
# in `Pulumi.yaml`, uncomment the state bucket configuration
$ pulumi stack rm --force dev
$ pulumi stack init dev
# restore whatever configuration disappeared from the `Pulumi.dev.yaml` file
$ pulumi stack import --file dev.stack.json
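The rename step in the fix-up above can be done with a single `sed` substitution instead of editing the exported file by hand. This is a sketch, not the procedure actually used: the function name `fix_service_account_tokens` is hypothetical, and it assumes that every occurrence of `gcp:serviceAccount/` in the exported stack file should become `gcp:serviceaccount/`. The provider version still has to be changed separately.

```shell
# Sketch: rewrite the old module token in an exported stack file, keeping a .bak copy.
# The pattern is anchored on the "gcp:" prefix and the trailing "/", so resource
# names such as "serviceAccount:terraform@domain-infra" are left untouched.
fix_service_account_tokens() {
  sed -i.bak 's|gcp:serviceAccount/|gcp:serviceaccount/|g' "$1"
}

# usage (not executed here): fix_service_account_tokens dev.stack.json
```

The `.bak` copy left next to the file makes it easy to compare the two versions with `diff` before importing the modified stack.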
Cloud Setup Checklist creates some groups that we do not need right now; here is the record of them.
"Billing administrators are responsible for setting up billing accounts and monitoring their usage"
Roles:
- billing.admin
- billing.creator
- resourcemanager.organizationViewer
"Security administrators are responsible for establishing and managing security policies for the entire organization, including access management and organization constraint policies"
Roles:
- compute.viewer
- container.viewer
- iam.organizationRoleViewer
- iam.securityReviewer
- logging.configWriter
- logging.privateLogViewer
- orgpolicy.policyAdmin
- resourcemanager.folderIamAdmin
- securitycenter.admin
"Network administrators are responsible for creating networks, subnets, firewall rules, and network devices such as cloud routers, Cloud VPN instances, and load balancers"
Roles:
- compute.networkAdmin
- compute.securityAdmin
- compute.xpnAdmin
- resourcemanager.folderViewer
"Monitoring administrators have access to use and configure all features of Cloud Monitoring"
Roles:
- monitoring.admin
"Logging administrators have access to all features of Cloud Logging"
Roles:
- logging.admin
"Logging viewers have read-only access to a specific subset of logs ingested into Cloud Logging"
"DevOps practitioners create or manage end-to-end pipelines that support continuous integration and delivery, monitoring, and system provisioning"
Roles:
- resourcemanager.folderViewer
In Admin Console:
- activate Cloud Identity Free (optional)
References:
- Cloud Identity
- Identity Setup
In the olden days of GSuite, it was possible to:
- add an `*@domain.tld` email alias for the user responsible for the mis-addressed messages
- configure Apps | Google Workspace | Settings for Gmail | Routing | Catch-All
Nowadays, the procedure is as described in Get misaddressed email in a catch-all mailbox.
It would be nice - but not pressing - to use groups for this.
Allegedly, there are pre-defined groups `postmaster` and `abuse` (at least when the domain is handled by Cloud Domains/DNS).
Those groups are invisible as Workspace groups and in https://admin.google.com/ac/groups.
They are visible to the Cloud Identity API - if the service account has Group Admin Role:
$ gcloud identity groups search --customer=... \
--labels="cloudidentity.googleapis.com/groups.discussion_forum"
$ gcloud identity groups describe postmaster@domain.tld
An attempt to add a user to such a group:
- fails in Terraform
- fails in Google Cloud Console with `permission denied`
- succeeds in Google Groups
I can make a group for this purpose (neither `postmaster` nor `abuse`; say, `catch-all`) and configure it as a catch-all mailbox, but I need to configure this group to accept email from outside the organization, and that requires changing a default setting for the Groups application in the Admin Console…
To remove one more UI-based step, I tried to use Terraform to assign _GROUPS_ADMIN_ROLE and _USER_MANAGEMENT_ADMIN_ROLE roles to the Terraform Service Account; even if it worked, it is probably easier to use the Admin Console - but it didn’t work:
$ gcloud auth application-default login \
--scopes "https://www.googleapis.com/auth/admin.directory.rolemanagement"
results in:
This app is blocked
This app tried to access sensitive info in your Google Account.
To keep your account safe, Google blocked this access.
and `terraform apply` (with all the scopes enabled in the Google Workspace provider!) of
data "googleworkspace_role" "groups-admin" {
  name = "_GROUPS_ADMIN_ROLE"
}
resource "googleworkspace_role_assignment" "terraform-groups-admin" {
  role_id = data.googleworkspace_role.groups-admin.id
  assigned_to = google_service_account.terraform.unique_id
  scope_type = "CUSTOMER"
}
data "googleworkspace_role" "user-management-admin" {
  name = "_USER_MANAGEMENT_ADMIN_ROLE"
}
resource "googleworkspace_role_assignment" "terraform-user-management-admin" {
  role_id = data.googleworkspace_role.user-management-admin.id
  assigned_to = google_service_account.terraform.unique_id
  scope_type = "CUSTOMER"
}
results in:
Error: googleapi: Error 403: Request had insufficient authentication scopes.
Details:
[{
"@type": "type.googleapis.com/google.rpc.ErrorInfo",
"domain": "googleapis.com",
"metadata": {
"method": "ccc.hosted.frontend.directory.v1.DirectoryRoles.List",
"service": "admin.googleapis.com"
},
"reason": "ACCESS_TOKEN_SCOPE_INSUFFICIENT"
}]
Insufficient Permission ... in data "googleworkspace_role" "groups-admin"
References: