Add documentation and machine type variables for gcp.
1 parent 0c225da, commit abee88f
Showing 9 changed files with 239 additions and 32 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,35 +1,187 @@ | ||
# Native Link's Terraform Deployment | ||
This directory contains a reference starting point for creating a full GCP | ||
[Terraform](https://www.terraform.io/downloads) deployment of Native Link's | ||
cache and remote execution system. | ||
|
||
## Prerequisites | ||
|
||
1. A Google Cloud project with billing enabled. | ||
2. A domain whose name servers can be pointed to Google Cloud DNS. | ||
|
||
## Terraform Setup | ||
|
||
Setup is done in two configurations: a **global** configuration and a **dev** | ||
configuration. The dev configuration depends on the global configuration. | ||
The global configuration is a one-time setup that requires an out-of-band step | ||
of updating the name servers managed by your registrar. This step is required | ||
so that Certificate Manager is authorized to generate the certificate chain. | ||
|
||
### Global Setup | ||
|
||
Set up the basic configuration for DNS, certificates, the Compute API and the | ||
Terraform state storage bucket. The global setup should be a one-time process: | ||
once properly configured, it does not need to be redone. | ||
|
||
After these configurations are applied, the managed name servers for the DNS | ||
zone must be configured at your registrar. If Certificate Manager fails to | ||
generate the certificates, the entire process might need to be redone. | ||
|
||
After the configuration is applied, go to the | ||
[Cloud DNS settings](https://console.cloud.google.com/net-services/dns/zones), | ||
click into the domain zone, find the record of type NS and copy the name | ||
servers listed in its `Data` field into your domain registrar's configuration page. | ||
For example, Google-managed domains have a section called DNS where you can | ||
choose "Custom name servers". The `Data` field lists name servers in the format | ||
`ns-cloud-XX.googledomains.com`; enter those four name servers on the owning | ||
domain's registrar configuration page. | ||
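
Once the registrar has been updated, the delegation can be checked from a terminal. The following is a minimal sketch, not part of the deployment scripts: `ns_delegated` is a hypothetical helper name, and it assumes the name servers follow the `ns-cloud-XX.googledomains.com` pattern described above.

```sh
# Sketch: succeed only if every name server read from stdin looks like a
# Cloud DNS name server (ns-cloud-XX.googledomains.com).
# Input is one name server per line, e.g. the output of `dig +short NS <zone>`.
ns_delegated() {
  count=0
  while IFS= read -r ns; do
    [ -z "$ns" ] && continue
    case "$ns" in
      ns-cloud-*.googledomains.com|ns-cloud-*.googledomains.com.)
        count=$((count + 1)) ;;
      *)
        echo "unexpected name server: $ns" >&2
        return 1 ;;
    esac
  done
  # An empty answer means the delegation has not propagated yet.
  [ "$count" -gt 0 ]
}

# Usage (requires network access and the `dig` tool):
#   dig +short NS "$DNS_ZONE" | ns_delegated && echo "delegation looks correct"
```

If the check fails right after the registrar change, waiting and retrying is usually enough, since name server updates take time to propagate.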
|
||
Before moving on to the dev step, confirm that the certificates were generated: the | ||
[Certificate Manager](https://cloud.google.com/certificate-manager/docs/overview) | ||
page in the [Google Cloud Console](https://console.cloud.google.com) should show | ||
their status as Active. | ||
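
The same check can be scripted instead of refreshing the console. This is a sketch under assumptions: `wait_for_state` is a hypothetical helper, `CERT_NAME` stands for whatever certificate name the global configuration created, and the `managed.state` field is assumed to be where `gcloud` reports the certificate state.

```sh
# Sketch: re-run a command until it prints the wanted state or attempts run out.
# usage: wait_for_state WANTED ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_for_state() {
  want=$1
  attempts=$2
  delay=$3
  shift 3
  i=0
  state=
  while [ "$i" -lt "$attempts" ]; do
    state=$("$@" 2>/dev/null)
    if [ "$state" = "$want" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "gave up waiting for $want (last state: ${state:-unknown})" >&2
  return 1
}

# Usage (CERT_NAME is a placeholder, not a name taken from the scripts):
#   wait_for_state ACTIVE 60 30 \
#     gcloud certificate-manager certificates describe "$CERT_NAME" \
#       --project "$PROJECT_ID" --format='value(managed.state)'
```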
|
||
```sh | ||
PROJECT_ID=example-sandbox | ||
DNS_ZONE=example-sandbox.example.com | ||
REGION=us-central1 | ||
ZONE=us-central1-a | ||
PREFIX=exdev | ||
|
||
# First, apply the global config. This config | ||
# is unlikely to change much. The "dev" section below | ||
# depends on this "global" section being applied first. | ||
# It is done this way to reduce the cost of development: | ||
# SSL certs cost ~$20 each time they are generated, so we | ||
# generate them only once and keep using the same one. | ||
# | ||
# Important: once it is applied, immediately create an NS | ||
# record for the domain specified in "gcp_dns_zone" in | ||
# whatever DNS service you are using, and point it to the name | ||
# servers listed in the NS record of the GCP DNS zone it created. | ||
cd deployment-examples/terraform/GCP/deployments/global | ||
|
||
terraform init | ||
terraform apply \ | ||
-var gcp_project_id=$PROJECT_ID \ | ||
-var gcp_dns_zone=$DNS_ZONE \ | ||
-var gcp_region=$REGION \ | ||
-var gcp_zone=$ZONE \ | ||
-var project_prefix=$PREFIX | ||
``` | ||
|
||
### Dev Setup | ||
|
||
After the global configuration is applied, set up and deploy the `native-link` | ||
servers and dependencies. This is the majority of the configuration. The general | ||
layout is similar to the | ||
[Native Link AWS Terraform Diagram](https://user-images.githubusercontent.com/1831202/176286845-ff683266-3f23-489c-b58a-3eda49e484be.png) | ||
from the | ||
[AWS deployment example](https://github.com/TraceMachina/native-link/blob/main/deployment-examples/terraform/AWS/README.md). | ||
The deployment has additional flags in `variables.tf` for controlling the machine | ||
type and other template parameters. | ||
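
Since the individual variable names are not listed here, one way to discover them is to read them out of `variables.tf`. The helper below is a rough sketch (`list_tf_variables` is a hypothetical name); it only assumes the standard Terraform `variable "name" {` block syntax.

```sh
# Sketch: print the name of every variable declared in a Terraform file.
# Matches lines of the form:  variable "some_name" {
list_tf_variables() {
  sed -n 's/^[[:space:]]*variable[[:space:]][[:space:]]*"\([^"]*\)".*/\1/p' "$1"
}

# Usage, from the dev deployment directory:
#   list_tf_variables variables.tf
```

Any name it prints can then be overridden with an additional `-var name=value` argument to `terraform apply`.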
|
||
```sh | ||
PROJECT_ID=example-sandbox | ||
REGION=us-central1 | ||
ZONE=us-central1-a | ||
PREFIX=exdev | ||
cd deployment-examples/terraform/GCP/deployments/dev | ||
terraform init | ||
terraform apply \ | ||
-var gcp_project_id=$PROJECT_ID \ | ||
-var gcp_region=$REGION \ | ||
-var gcp_zone=$ZONE \ | ||
-var project_prefix=$PREFIX | ||
``` | ||
|
||
After a complete and successful deployment, you should be able to run remote | ||
execution commands from Bazel (or other supported build systems). | ||
|
||
## Example Test | ||
|
||
A simple way to test as a client is to | ||
[create](https://cloud.google.com/sdk/gcloud/reference/compute/instances/create) | ||
a "workstation" instance on Google Cloud Platform, install Bazel, clone | ||
`native-link` and run its tests using the deployed remote cache and remote executor. | ||
|
||
```sh | ||
# Example gcloud command for bootstrapping the instance. | ||
# The Google Cloud Console can generate an equivalent command for you. | ||
# Use ubuntu-2204 x86_64 as the base image, as it is compatible | ||
# with the remote execution environment set up by the Terraform scripts. | ||
NAME=dev-workstation-001 | ||
PROJECT_ID=example-sandbox | ||
REGION=us-central1 | ||
ZONE=us-central1-a | ||
SERVICE_ACCOUNT=123-compute@developer.gserviceaccount.com | ||
OS_IMAGE=projects/ubuntu-os-cloud/global/images/ubuntu-2204-jammy-v20231201 | ||
DISK=projects/${PROJECT_ID}/zones/${ZONE}/diskTypes/pd-standard | ||
|
||
gcloud compute instances create $NAME \ | ||
--project=$PROJECT_ID \ | ||
--zone=$ZONE \ | ||
--machine-type=e2-standard-8 \ | ||
--network-interface=network-tier=PREMIUM,stack-type=IPV4_ONLY,subnet=default \ | ||
--maintenance-policy=MIGRATE \ | ||
--provisioning-model=STANDARD \ | ||
--service-account=$SERVICE_ACCOUNT \ | ||
--scopes=https://www.googleapis.com/auth/devstorage.read_only,https://www.googleapis.com/auth/logging.write,https://www.googleapis.com/auth/monitoring.write,https://www.googleapis.com/auth/servicecontrol,https://www.googleapis.com/auth/service.management.readonly,https://www.googleapis.com/auth/trace.append \ | ||
--create-disk=auto-delete=yes,boot=yes,device-name=instance-1,image=${OS_IMAGE},mode=rw,size=30,type=$DISK \ | ||
--no-shielded-secure-boot \ | ||
--shielded-vtpm \ | ||
--shielded-integrity-monitoring \ | ||
--labels=goog-ec-src=vm_add-gcloud \ | ||
--reservation-affinity=any | ||
``` | ||
|
||
[SSH](https://cloud.google.com/sdk/gcloud/reference/compute/ssh) into the workstation | ||
instance, install the dependencies and clone `native-link` (which has a Bazel-compatible | ||
remote execution setup). | ||
|
||
```sh | ||
# On local machine | ||
NAME=dev-workstation-001 | ||
PROJECT_ID=example-sandbox | ||
ZONE=us-central1-a | ||
gcloud compute ssh --zone $ZONE $NAME --project $PROJECT_ID | ||
|
||
# On gcp workstation | ||
git clone https://github.com/TraceMachina/native-link.git | ||
sudo apt install -y npm | ||
sudo npm install -g @bazel/bazelisk | ||
cd native-link | ||
|
||
DNS_ZONE=example-sandbox.example.com | ||
CAS="cas.${DNS_ZONE}" | ||
EXECUTOR="scheduler.${DNS_ZONE}" | ||
|
||
bazel test //... --experimental_remote_execution_keepalive \ | ||
--remote_instance_name=main \ | ||
--remote_cache=$CAS \ | ||
--remote_executor=$EXECUTOR \ | ||
--remote_default_exec_properties=cpu_count=1 \ | ||
--remote_timeout=3600 \ | ||
--remote_download_minimal \ | ||
--verbose_failures | ||
``` | ||
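
To avoid retyping the remote flags on every invocation, they can be grouped under a Bazel `--config` name. The sketch below generates such a fragment from the DNS zone; `remote_bazelrc` and the config name `native_link_remote` are illustrative choices, not names used by the repository, while the flags are the ones from the command above.

```sh
# Sketch: print .bazelrc lines that bundle the remote flags under one config.
# usage: remote_bazelrc DNS_ZONE
remote_bazelrc() {
  dns_zone=$1
  cat <<EOF
build:native_link_remote --experimental_remote_execution_keepalive
build:native_link_remote --remote_instance_name=main
build:native_link_remote --remote_cache=cas.${dns_zone}
build:native_link_remote --remote_executor=scheduler.${dns_zone}
build:native_link_remote --remote_default_exec_properties=cpu_count=1
build:native_link_remote --remote_timeout=3600
build:native_link_remote --remote_download_minimal
EOF
}

# Usage:
#   remote_bazelrc "$DNS_ZONE" >> ~/.bazelrc
#   bazel test --config=native_link_remote //... --verbose_failures
```

Options declared for `build` are inherited by `test`, so a single config covers both commands.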
|
||
### Developing/Testing | ||
|
||
[Visual Studio Code](https://code.visualstudio.com/) can be used to actively | ||
work on the cloned native-link code through | ||
[Visual Studio Code Remote Development](https://code.visualstudio.com/docs/remote/remote-overview). | ||
This setup runs Visual Studio Code on a local machine connected to | ||
a remote workstation, with access to the remote file system and terminal. | ||
It allows working on native-link or testing different | ||
workloads without having to match environment expectations locally. Install the | ||
[Visual Studio Remote Development Extension Pack](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.vscode-remote-extensionpack), | ||
connect over SSH to the workstation instance and open the `native-link` folder | ||
(or any other cloned project). | ||
|
||
## Notes | ||
|
||
### DNS Issues | ||
|
||
Updating the managed name servers can be slow on some registrars, taking 24-72 | ||
hours. To work around the wait and get | ||
[certificates authorized](https://cloud.google.com/certificate-manager/docs/dns-authorizations#gcloud) | ||
sooner, set a CNAME entry with your registrar containing the data field | ||
provided by the DNS authorization. Once the status of the certificate is Active, | ||
switching the name servers to Google's name servers will work. | ||
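
The CNAME values can also be read from the command line rather than the console. The following is a sketch under assumptions: `format_dns_record` is a hypothetical helper, `AUTHORIZATION_NAME` is a placeholder for the DNS authorization created by the global configuration, the `dnsResourceRecord` fields are assumed to hold the record, and the default TTL of 300 is arbitrary.

```sh
# Sketch: read "NAME<TAB>TYPE<TAB>DATA" lines on stdin and print each one as a
# zone-file style record that can be copied into a registrar's DNS settings.
# usage: format_dns_record [TTL]
format_dns_record() {
  ttl=${1:-300}
  while IFS="$(printf '\t')" read -r name type data; do
    [ -z "$name" ] && continue
    printf '%s %s IN %s %s\n' "$name" "$ttl" "$type" "$data"
  done
}

# Usage (AUTHORIZATION_NAME is a placeholder):
#   gcloud certificate-manager dns-authorizations describe "$AUTHORIZATION_NAME" \
#     --project "$PROJECT_ID" \
#     --format='value(dnsResourceRecord.name,dnsResourceRecord.type,dnsResourceRecord.data)' \
#     | format_dns_record
```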
|
||
### Teardown | ||
|
||
If things need to be torn down due to misconfiguration or mistakes, the | ||
`project_prefix` scopes resource names so that they can easily be searched for | ||
or deleted manually. `terraform apply -destroy` will work in most cases, but | ||
some resources, such as cloud buckets or service accounts, may require manual | ||
deletion because the scripts don't handle them at the moment. | ||
There is also a dependency between global and dev, so when destroying, | ||
start with dev and then global. |
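
The teardown order can be captured in a small wrapper. This is a sketch, not part of the deployment scripts: `destroy_all` is a hypothetical helper and the `TERRAFORM` override exists only so the function can be dry-run with a stand-in command.

```sh
# Sketch: destroy the dev configuration first, then global, stopping on the
# first failure. Extra arguments are passed to both `terraform apply -destroy`
# calls, so only pass variables that both configurations declare.
# usage: destroy_all DEPLOYMENTS_DIR [TERRAFORM_ARGS...]
destroy_all() {
  deployments_dir=$1
  shift
  for cfg in dev global; do
    echo "destroying $cfg" >&2
    ( cd "$deployments_dir/$cfg" &&
      "${TERRAFORM:-terraform}" apply -destroy "$@" ) || return 1
  done
}

# Usage, from the repository root:
#   destroy_all deployment-examples/terraform/GCP/deployments \
#     -var gcp_project_id=$PROJECT_ID -var gcp_region=$REGION \
#     -var gcp_zone=$ZONE -var project_prefix=$PREFIX
# Terraform prompts for any required variable that was not passed, such as
# gcp_dns_zone for the global configuration.
```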