Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add deployment examples to docs #584

Merged
merged 1 commit into from
Dec 31, 2023

Conversation

blakehatch
Copy link
Contributor

@blakehatch blakehatch commented Dec 30, 2023

Description

Add terraform deployment examples to docs.

Fixes #581

Type of change

Please delete options that are not relevant.

  • This change requires a documentation update

How Has This Been Tested?

Please also list any relevant details for your test configuration

Checklist


This change is Reviewable

Copy link
Member

@aaronmondal aaronmondal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r1, all commit messages.
Reviewable status: 0 of 1 LGTMs obtained


nativelink-docs/deployment-examples.mdx line 6 at r1 (raw file):

---

# AWS Terraform Deployment

nit: Prefer surrounding headings with blank lines.


nativelink-docs/deployment-examples.mdx line 7 at r1 (raw file):

# AWS Terraform Deployment
This directory contains a reference/starting point on creating a full AWS terraform deployment of Native Link's cache and remote execution system.

nit: The common standard line length for markdown is 80 characters.


nativelink-docs/deployment-examples.mdx line 26 at r1 (raw file):

It will take some time to apply, when it is finished everything should be running. The endpoints are:

nit: In cases where you don't want syntax highlighting you can use txt as the code block language.


nativelink-docs/deployment-examples.mdx line 32 at r1 (raw file):

As a reference you should be able to compile this project using bazel with something like:
```sh

nit: Prefer surrounding code blocks with blank lines.


nativelink-docs/deployment-examples.mdx line 49 at r1 (raw file):

#### More optimal configuration
You can reduce cost and increase reliability by moving the CAS onto the same machine that invokes the remote execution protocol (like bazel). Then point the configuration to `localhost` and it will translate the S3 calls into the Bazel Remote Execution Protocol API.

nit: A bunch of terms in these sections sometimes aren't capitalized. For instance "Bazel" and "Python".


nativelink-docs/deployment-examples.mdx line 62 at r1 (raw file):

## Security
The security permissions of each instance group is very strict. The major volnerabilitys are that the instances by default are all public ip instances and we allow incoming traffic on all instances to port 22 (SSH) for debugging reaons.

nit

Suggestion:

vulnerabilities

nativelink-docs/deployment-examples.mdx line 73 at r1 (raw file):

## Useful tips
You can add `-var terminate_ami_builder=false` to the `teraform apply` command and it will make it easier to modify/apply/test your changes to these `.tf` files.

nit: Prefer makes over will make, causes over will cause etc. It might not always be possible, but try to avoid future tense as that's more in line with most styleguides such as the Google guide https://github.com/errata-ai/Google/blob/ebf4c2c5a42863faff423049e402557352c8dd5d/features/rules.feature#L28


nativelink-docs/deployment-examples.mdx line 124 at r1 (raw file):

Dev Setup

nit

Suggestion:

Development

nativelink-docs/deployment-examples.mdx line 182 at r1 (raw file):

SSH into workstation

nit: missing "the"?


nativelink-docs/deployment-examples.mdx line 216 at r1 (raw file):

### Developing/Testing

[Visual Studio Code](https://code.visualstudio.com/) could be used to actively

nit: Is it a good idea to tell users how to use SSH in this guide?

Copy link
Collaborator

@MarcusSorealheis MarcusSorealheis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the actual documentation site, the bar for clarity is very high, hence many small things that would seem nit-like in ordinary code comments or even early GitHub README's that are not actually nits.

The CAS is only used as a public interface to the S3 data. All the services will talk to S3 directly, so they don't need to talk to the CAS instance.

#### More optimal configuration
You can reduce cost and increase reliability by moving the CAS onto the same machine that invokes the remote execution protocol (like bazel). Then point the configuration to `localhost` and it will translate the S3 calls into the Bazel Remote Execution Protocol API.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
You can reduce cost and increase reliability by moving the CAS onto the same machine that invokes the remote execution protocol (like bazel). Then point the configuration to `localhost` and it will translate the S3 calls into the Bazel Remote Execution Protocol API.
You can reduce cost and increase reliability by moving the CAS onto the same machine that invokes the remote execution protocol (like Bazel). Then point the configuration to `localhost` and it will translate the S3 calls into the Bazel Remote Execution Protocol API.


#### More optimal configuration
You can reduce cost and increase reliability by moving the CAS onto the same machine that invokes the remote execution protocol (like bazel). Then point the configuration to `localhost` and it will translate the S3 calls into the Bazel Remote Execution Protocol API.
In bazel you can do this by making an executable file at `tools/bazel` in your WORKSPACE directory. This file can be a scripting language (like bash or python), then start the local proxy before starting bazel as a background service and then invoke the actual bazel executable with the proper flags configured.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
In bazel you can do this by making an executable file at `tools/bazel` in your WORKSPACE directory. This file can be a scripting language (like bash or python), then start the local proxy before starting bazel as a background service and then invoke the actual bazel executable with the proper flags configured.
In Bazel you can do this by making an executable file at `tools/bazel` in your WORKSPACE directory. This file can be a scripting language (like Bash or Python), then start the local proxy before starting Bazel as a background service and then invoke the actual Bazel executable with the proper flags configured.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bazel is a proper noun. So our Bash and Python. These details are always important for reader clarity, but even more important in documentation. No one is going to tag you as 0x1337 for writing fire docs sadly, but they will get confused by such errors. These rules are in the language to foster clarity, akin to some formatting rules in whitespace-forgiving programing languages.

---

# AWS Terraform Deployment
This directory contains a reference/starting point on creating a full AWS terraform deployment of Native Link's cache and remote execution system.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
This directory contains a reference/starting point on creating a full AWS terraform deployment of Native Link's cache and remote execution system.
This directory contains a reference/starting point on creating a full AWS Terraform deployment of Native Link's cache and remote execution system.

It may take a few mins to propagate

## Terraform Setup
1. [Install terraform](https://www.terraform.io/downloads)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
1. [Install terraform](https://www.terraform.io/downloads)
1. [Install Terraform](https://www.terraform.io/downloads)

The CAS is only used as a public interface to the S3 data. All the services will talk to S3 directly, so they don't need to talk to the CAS instance.

#### More optimal configuration
You can reduce cost and increase reliability by moving the CAS onto the same machine that invokes the remote execution protocol (like bazel). Then point the configuration to `localhost` and it will translate the S3 calls into the Bazel Remote Execution Protocol API.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
You can reduce cost and increase reliability by moving the CAS onto the same machine that invokes the remote execution protocol (like bazel). Then point the configuration to `localhost` and it will translate the S3 calls into the Bazel Remote Execution Protocol API.
You can reduce cost and increase reliability by moving the CAS onto the same machine that invokes the remote execution protocol (like Bazel). Then point the configuration to `localhost` and it will translate the S3 calls into the Bazel's Remote Build Execution Protocol API.


## Useful tips
You can add `-var terminate_ami_builder=false` to the `teraform apply` command and it will make it easier to modify/apply/test your changes to these `.tf` files.
This command will cause the AMI builder instances to not be terminated, which costs more money, but makes it so that terraform will not create a new AMI each time you call the command.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
This command will cause the AMI builder instances to not be terminated, which costs more money, but makes it so that terraform will not create a new AMI each time you call the command.
This command will cause the AMI builder instances to not be terminated, which costs more money, but makes it so that Terraform will not create a new AMI each time you call the command.


### Global Setup

Setup basic configurations for DNS, certificates, Compute API and terraform
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Setup basic configurations for DNS, certificates, Compute API and terraform
Setup basic configurations for DNS, certificates, Compute API and Terraform

managed name servers for the DNS zone need to be configured. If the certificate
management fails to generate the entire process might need to be redone.

After this is applied grab the name servers from the terraform state and enter
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
After this is applied grab the name servers from the terraform state and enter
After this is applied grab the name servers from the Terraform state and enter

```

A complete and successful deployment should be able to run remote execution
commands from bazel (or other supported build systems).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
commands from bazel (or other supported build systems).
commands from Bazel (or other supported build systems).

# Example of using gcloud generated cli command bootstrap instance.
# Using google cloud console is easy to generate this command.
# Use ubuntu-2204 x86_64 as the base image as it is compatible
# with remote execution environment setup by the terraform scripts.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# with remote execution environment setup by the terraform scripts.
# with remote execution environment setup by the Terraform scripts.

@blakehatch
Copy link
Contributor Author

For the actual documentation site, the bar for clarity is very high, hence many small things that would seem nit-like in ordinary code comments or even early GitHub README's that are not actually nits.

Agreed if we're gonna automate keeping these in-sync later we're gonna have to employ the msame standards for both. I realize these are older READMEs though

Copy link
Contributor Author

@blakehatch blakehatch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 1 LGTMs obtained


nativelink-docs/deployment-examples.mdx line 6 at r1 (raw file):

Previously, aaronmondal (Aaron Siddhartha Mondal) wrote…

nit: Prefer surrounding headings with blank lines.

Done.


nativelink-docs/deployment-examples.mdx line 7 at r1 (raw file):

Previously, aaronmondal (Aaron Siddhartha Mondal) wrote…

nit: The common standard line length for markdown is 80 characters.

Done.


nativelink-docs/deployment-examples.mdx line 7 at r1 (raw file):

Previously, MarcusSorealheis (Marcus Eagan) wrote…
This directory contains a reference/starting point on creating a full AWS Terraform deployment of Native Link's cache and remote execution system.

Done.


nativelink-docs/deployment-examples.mdx line 21 at r1 (raw file):

Previously, MarcusSorealheis (Marcus Eagan) wrote…
1. [Install Terraform](https://www.terraform.io/downloads)

Done.


nativelink-docs/deployment-examples.mdx line 26 at r1 (raw file):

Previously, aaronmondal (Aaron Siddhartha Mondal) wrote…

nit: In cases where you don't want syntax highlighting you can use txt as the code block language.

Done.


nativelink-docs/deployment-examples.mdx line 32 at r1 (raw file):

Previously, aaronmondal (Aaron Siddhartha Mondal) wrote…

nit: Prefer surrounding code blocks with blank lines.

Done.


nativelink-docs/deployment-examples.mdx line 49 at r1 (raw file):

Previously, aaronmondal (Aaron Siddhartha Mondal) wrote…

nit: A bunch of terms in these sections sometimes aren't capitalized. For instance "Bazel" and "Python".

Done.


nativelink-docs/deployment-examples.mdx line 49 at r1 (raw file):

Previously, MarcusSorealheis (Marcus Eagan) wrote…
You can reduce cost and increase reliability by moving the CAS onto the same machine that invokes the remote execution protocol (like Bazel). Then point the configuration to `localhost` and it will translate the S3 calls into the Bazel's Remote Build Execution Protocol API.

Done.


nativelink-docs/deployment-examples.mdx line 49 at r1 (raw file):

Previously, MarcusSorealheis (Marcus Eagan) wrote…
You can reduce cost and increase reliability by moving the CAS onto the same machine that invokes the remote execution protocol (like Bazel). Then point the configuration to `localhost` and it will translate the S3 calls into the Bazel Remote Execution Protocol API.

Done.


nativelink-docs/deployment-examples.mdx line 50 at r1 (raw file):

Previously, MarcusSorealheis (Marcus Eagan) wrote…

Bazel is a proper noun. So our Bash and Python. These details are always important for reader clarity, but even more important in documentation. No one is going to tag you as 0x1337 for writing fire docs sadly, but they will get confused by such errors. These rules are in the language to foster clarity, akin to some formatting rules in whitespace-forgiving programing languages.

Yes I should've not assumed the README was already run through these formatting rules, will open a ticket to fix these errors in README as well


nativelink-docs/deployment-examples.mdx line 62 at r1 (raw file):

Previously, aaronmondal (Aaron Siddhartha Mondal) wrote…

nit

Done.


nativelink-docs/deployment-examples.mdx line 65 at r1 (raw file):

Previously, MarcusSorealheis (Marcus Eagan) wrote…
If you would like to use `terraform destroy`, you will need to manually purge these buckets or change the Terraform files to force destroy them.

Done.


nativelink-docs/deployment-examples.mdx line 73 at r1 (raw file):

Previously, aaronmondal (Aaron Siddhartha Mondal) wrote…

nit: Prefer makes over will make, causes over will cause etc. It might not always be possible, but try to avoid future tense as that's more in line with most styleguides such as the Google guide https://github.com/errata-ai/Google/blob/ebf4c2c5a42863faff423049e402557352c8dd5d/features/rules.feature#L28

Done.


nativelink-docs/deployment-examples.mdx line 74 at r1 (raw file):

Previously, MarcusSorealheis (Marcus Eagan) wrote…
This command will cause the AMI builder instances to not be terminated, which costs more money, but makes it so that Terraform will not create a new AMI each time you call the command.

Done.


nativelink-docs/deployment-examples.mdx line 97 at r1 (raw file):

Previously, MarcusSorealheis (Marcus Eagan) wrote…
Setup basic configurations for DNS, certificates, Compute API and Terraform

Done.


nativelink-docs/deployment-examples.mdx line 105 at r1 (raw file):

Previously, MarcusSorealheis (Marcus Eagan) wrote…
After this is applied grab the name servers from the Terraform state and enter

Done.


nativelink-docs/deployment-examples.mdx line 124 at r1 (raw file):

Previously, aaronmondal (Aaron Siddhartha Mondal) wrote…

nit

Done.


nativelink-docs/deployment-examples.mdx line 143 at r1 (raw file):

Previously, MarcusSorealheis (Marcus Eagan) wrote…
commands from Bazel (or other supported build systems).

Done.


nativelink-docs/deployment-examples.mdx line 156 at r1 (raw file):

Previously, MarcusSorealheis (Marcus Eagan) wrote…
# with remote execution environment setup by the Terraform scripts.

Done.


nativelink-docs/deployment-examples.mdx line 182 at r1 (raw file):

Previously, aaronmondal (Aaron Siddhartha Mondal) wrote…

nit: missing "the"?

Done.


nativelink-docs/deployment-examples.mdx line 216 at r1 (raw file):

Previously, aaronmondal (Aaron Siddhartha Mondal) wrote…

nit: Is it a good idea to tell users how to use SSH in this guide?

Probably not necessary esp for the public facing docs. Will remove for now and we can decide what to do with these in the README.

PROJECT_ID=example-sandbox
DNS_ZONE=example-sandbox.example.com

cd deployment-examples/Terraform/experimental_GCP/deployments/global
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
cd deployment-examples/Terraform/experimental_GCP/deployments/global
cd deployment-examples/terraform/experimental_GCP/deployments/global

the directory is lowercase

Copy link
Collaborator

@MarcusSorealheis MarcusSorealheis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

almost perfect. One minor issue that would trip users up. Could be sourced from me, also.

Copy link
Contributor Author

@blakehatch blakehatch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Reviewable status: 0 of 1 LGTMs obtained, and pending CI: Analyze (javascript-typescript), Remote / large-ubuntu-22.04, asan / ubuntu-22.04, docker-compose-compiles-nativelink (20.04), pre-commit-checks, publish-image, ubuntu-20.04, ubuntu-20.04 / stable


nativelink-docs/deployment-examples.mdx line 125 at r2 (raw file):

Previously, MarcusSorealheis (Marcus Eagan) wrote…
cd deployment-examples/terraform/experimental_GCP/deployments/global

the directory is lowercase

Done.

Copy link
Collaborator

@MarcusSorealheis MarcusSorealheis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: 1 of 1 LGTMs obtained, and pending CI: Bazel Dev / ubuntu-22.04, Cargo Dev / ubuntu-22.04, Remote / large-ubuntu-22.04, asan / ubuntu-22.04, docker-compose-compiles-nativelink (20.04), docker-compose-compiles-nativelink (22.04), integration-tests (20.04), integration-tests (22.04), pre-commit-checks, publish-image, ubuntu-20.04, ubuntu-20.04 / stable, ubuntu-22.04, ubuntu-22.04 / stable, windows-2022 / stable

Copy link
Collaborator

@MarcusSorealheis MarcusSorealheis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: 1 of 1 LGTMs obtained, and pending CI: Bazel Dev / ubuntu-22.04, Cargo Dev / ubuntu-22.04, Remote / large-ubuntu-22.04, asan / ubuntu-22.04, docker-compose-compiles-nativelink (20.04), docker-compose-compiles-nativelink (22.04), integration-tests (20.04), integration-tests (22.04), pre-commit-checks, publish-image, ubuntu-20.04, ubuntu-20.04 / stable, ubuntu-22.04, ubuntu-22.04 / stable, windows-2022 / stable

Copy link
Contributor Author

@blakehatch blakehatch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dismissed @MarcusSorealheis from 12 discussions.
Reviewable status: 1 of 1 LGTMs obtained, and pending CI: Analyze (javascript-typescript), Analyze (python), Bazel Dev / ubuntu-22.04, Cargo Dev / ubuntu-22.04, Local / ubuntu-22.04, Remote / large-ubuntu-22.04, asan / ubuntu-22.04, docker-compose-compiles-nativelink (20.04), docker-compose-compiles-nativelink (22.04), integration-tests (20.04), integration-tests (22.04), pre-commit-checks, publish-image, ubuntu-20.04, ubuntu-20.04 / stable, ubuntu-22.04, ubuntu-22.04 / stable, windows-2022 / stable

Copy link
Contributor Author

@blakehatch blakehatch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r4, all commit messages.
Reviewable status: 1 of 1 LGTMs obtained, and pending CI: Analyze (javascript-typescript), Analyze (python), Bazel Dev / ubuntu-22.04, Cargo Dev / ubuntu-22.04, Local / ubuntu-22.04, Remote / large-ubuntu-22.04, asan / ubuntu-22.04, docker-compose-compiles-nativelink (20.04), docker-compose-compiles-nativelink (22.04), integration-tests (20.04), integration-tests (22.04), pre-commit-checks, publish-image, ubuntu-20.04, ubuntu-20.04 / stable, ubuntu-22.04, ubuntu-22.04 / stable, windows-2022 / stable

@blakehatch blakehatch merged commit 546484b into TraceMachina:main Dec 31, 2023
20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add Terraform Examples to Docs
3 participants