docs: AWS setup information #16

ethanholz · 2024-06-27T21:30:58Z

This PR aims to provide the minimum docs needed to set up your AWS account for use with this action. This is not finished because I think there may be some value in providing a more turn-key experience but I think this works well enough for now.

Added spelling and formatting changes.

ethanholz · 2024-06-28T20:51:00Z

One thing to consider is the addition of some IaC section for provisioning (either Terraform, CloudFormation, or Pulumi). While this is not very common in molecular software, it does represent best practices in broader DevOps spaces. Not sure how we would add this but wanted to note it here.

dwhswenson

Some initial thoughts on this (I know it isn't marked as ready for review yet, but I thought I'd give it a quick look already.)

dwhswenson · 2024-07-05T15:04:53Z

docs/aws.md

+    2. Go to the IAM console.
+    3. In the navigation pane, choose Policies.
+    4. Create a new policy by clicking "Create Policy".
+    5. Next, click JSON and paste the following JSON with the required permissions and then click "Next".


Is there an AWS Managed policy that matches this? That might simplify a few of these steps.

The closest Managed Policy for this is AmazonEC2FullAccess which contains the following permission set:

{ "Version": "2012-10-17", "Statement": [ { "Action": "ec2:*", "Effect": "Allow", "Resource": "*" }, { "Effect": "Allow", "Action": "elasticloadbalancing:*", "Resource": "*" }, { "Effect": "Allow", "Action": "cloudwatch:*", "Resource": "*" }, { "Effect": "Allow", "Action": "autoscaling:*", "Resource": "*" }, { "Effect": "Allow", "Action": "iam:CreateServiceLinkedRole", "Resource": "*", "Condition": { "StringEquals": { "iam:AWSServiceName": [ "autoscaling.amazonaws.com", "ec2scheduled.amazonaws.com", "elasticloadbalancing.amazonaws.com", "spot.amazonaws.com", "spotfleet.amazonaws.com", "transitgateway.amazonaws.com" ] } } } ] }

dwhswenson · 2024-07-05T15:07:19Z

docs/aws.md

+    4. Select an Amazon Linux 2 AMI.
+    5. Choose an instance type.
+    6. Configure the instance details.
+    7. Connect to the instance and install `docker`, `git`, and enable the `docker` service. This will be distro dependent.


probably worth us looking for an AMI that already meets this? Or we could even ship our own AMI (although that's a region-specific hassle).

From my understanding, the Deep Learning AMI should work well here but has some limitation in terms of instances that we can ship with it.

Since I am running into my own issues with this (regarding the openmm-gpu-test repo), we might use a tool like packer for building AMIs across cloud providers.

With the current pipeline setup for AMIs, I will be able to install CUDA and docker quite easily. Ubuntu comes shipped with git so we should be good there. We may need to setup pipelines with Docker as well.

dwhswenson · 2024-07-05T15:08:43Z

docs/aws.md

+    8. Feel free to install any other useful tools that you may need to get started.
+    9. Create an image of the instance using the following [AWS Docs](https://docs.aws.amazon.com/toolkit-for-visual-studio/latest/user-guide/tkv-create-ami-from-instance.html) (this may take a while)
+    10. Make sure to remove the instance once you are done creating the image.
+5. Optional: Create a VPC with subnet and security group.


It might be worth explaining why you would/wouldn't need a VPC/security group. Am I correct that the main reason to specify the VPC is if you're using the AWS account for other things, and was to keep a separate VPC for different account use cases?

You are correct on a VPC. Your security group is more focused on access and what machines might be able to SSH in your machines. I go back and forth on whether or not the VPC/subnet docs should be included because most of our community doesn't have a need for it right now. Those who have a need, generally know they need to set it up.

These have been removed in dcf41b5

codecov · 2024-07-11T19:51:22Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 94.64%. Comparing base (81c4f29) to head (a944a4d).
Report is 21 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #16      +/-   ##
==========================================
+ Coverage   94.11%   94.64%   +0.53%     
==========================================
  Files           3        3              
  Lines         289      299      +10     
==========================================
+ Hits          272      283      +11     
+ Misses         17       16       -1

Flag	Coverage Δ
unittests	`94.64% <ø> (+0.53%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

ethanholz · 2024-07-12T19:42:07Z

One thing to consider is the addition of some IaC section for provisioning (either Terraform, CloudFormation, or Pulumi). While this is not very common in molecular software, it does represent best practices in broader DevOps spaces. Not sure how we would add this but wanted to note it here.

I don't think this will be published in the initial release. I have some of these resources available but not necessary at this point in time.

ethanholz · 2024-07-16T21:07:02Z

Added some more notes on using P2 instances, this section might be a moot point of we can use G4dn instances instead.

ethanholz · 2024-07-17T20:07:23Z

Two notes of things I would like to expand on for use:

Security concerns and what steps should be taken for use
Refactor the setup to reference and use the openmm-gpu-test repo

Although, this might be enough to do that in a separate PR

docs/aws.md

j-wags · 2024-07-18T00:20:57Z

docs/aws.md

+    1. This can be done with either a Personal Access Token or a Fine-Grained Personal Access Token.
+    2. Go to your GitHub account settings.
+    3. Click on "Developer settings".
+    4. Create a new token with `repo` scope.


I'm having a little trouble with this - Our repos are owned by our org, so making a personal access token, I don't have access to openforcefield org repos (in this case I only have access to openforcefield/openff-evaluator - my personal fork of its feedstock.)

Oh, the org level settings have to be updated to allow create of PATs for the org:

You may also be able to set the resource owner to the org as well.

Yeah, it looks like I can set the org as the resource owner now that I've allowed PATs at the org level.

The next issue I hit is that the "repo" scope is an option on the "classic" token creator page, but that page doesn't let me select the org as the resource owner (I'm thinking that this would work, but the token would then cover BOTH the org and my personal repos).

On the other hand, the "beta" token creator page does let me assign the org as resource owner, but it's not clear which permissions on this page are equivalent to "repo" from the classic token creator page. I'd love some guidance on that.

I'm heading out for the night but will try to pick this back up tomorrow.

I largely opt to use fine-grained Tokens instead of classic tokens as that is the direction that GitHub seems to be moving. I will update these docs to include the fine-grained access token permissions needed (I have what needs to be added just have to write it up).

Thanks for working on this!!

One annoyance about fine-grained tokens is that they expire. I wrote a little actions workflow that will fail when your token should be replaced: https://github.com/omsf-eco-infra/scripts/blob/main/.github/workflows/check_token.yml

Co-authored-by: Jeff Wagner <jwagnerjpl@gmail.com>

ethanholz · 2024-07-18T20:36:38Z

Was able to test AWS using g4dn instances and the AWS Deep Learning AMI and it succeeded out of the box. Going to provide edits that recommend this route instead as it removes maintenance burden and works OOTB with CUDA 12.

Migrating to support `g4dn` instances instead of `p2` since they are a better value for most of our use cases.

ethanholz · 2024-07-18T21:18:40Z

@dwhswenson I believe this is ready for review. I have a couple minor README updates that will need to be made but are not within scope for this PR and can be made outside of this.

dwhswenson

LGTM! Merging. Anything else needed before we cut a release?

ethanholz · 2024-07-20T15:55:26Z

LGTM! Merging. Anything else needed before we cut a release?

We will add a security advisory to the README and a reference to this page in the docs. Will get to this first thing Monday and push v0.2.0.

ethanholz added 4 commits June 27, 2024 10:52

docs(aws): initial AWS setup docs

d396ae7

docs: clean up AWS docs

f415c7b

Added spelling and formatting changes.

docs: spelling change for "set up" from "setup"

bdd2da1

fix: fixed README typos

7538f6f

dwhswenson reviewed Jul 5, 2024

View reviewed changes

ethanholz added 2 commits July 11, 2024 14:38

clean-up AMI creation and update VPC info

dcf41b5

reworked IAM and added more references

4e50a49

docs: add quota limit docs + how to request increase

468e47a

ethanholz added 3 commits July 15, 2024 12:05

docs: add GPU AMI info

0b3034a

docs: note pitfalls of P2 instances

cd4c264

docs(aws): added info on p2 instances, refactored to own section

03a8253

ethanholz marked this pull request as ready for review July 16, 2024 21:07

ethanholz added 2 commits July 17, 2024 15:13

fix(docs): added correct resource arn for AWS docs

aba3298

docs(aws): added a "hello world" example

dd0df2d

j-wags reviewed Jul 18, 2024

View reviewed changes

docs/aws.md Outdated Show resolved Hide resolved

j-wags reviewed Jul 18, 2024

View reviewed changes

docs: change from GITHUB_TOKEN to GH_PAT

83bba36

Co-authored-by: Jeff Wagner <jwagnerjpl@gmail.com>

docs(aws): reworded to note the use of g4dn instances

a944a4d

Migrating to support `g4dn` instances instead of `p2` since they are a better value for most of our use cases.

ethanholz requested a review from dwhswenson July 18, 2024 21:17

dwhswenson approved these changes Jul 19, 2024

View reviewed changes

dwhswenson merged commit 59f5698 into omsf-eco-infra:main Jul 19, 2024
5 checks passed

ethanholz deleted the docs/aws branch July 20, 2024 15:55

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

docs: AWS setup information #16

docs: AWS setup information #16

ethanholz commented Jun 27, 2024

ethanholz commented Jun 28, 2024

dwhswenson left a comment

dwhswenson Jul 5, 2024

ethanholz Jul 11, 2024

dwhswenson Jul 5, 2024

ethanholz Jul 8, 2024

ethanholz Jul 11, 2024

dwhswenson Jul 5, 2024

ethanholz Jul 10, 2024

ethanholz Jul 12, 2024

codecov bot commented Jul 11, 2024 •

edited

Loading

ethanholz commented Jul 12, 2024

ethanholz commented Jul 16, 2024

ethanholz commented Jul 17, 2024 •

edited

Loading

j-wags Jul 18, 2024

j-wags Jul 18, 2024

ethanholz Jul 18, 2024

j-wags Jul 18, 2024

ethanholz Jul 18, 2024

dwhswenson Jul 18, 2024

ethanholz commented Jul 18, 2024

ethanholz commented Jul 18, 2024

dwhswenson left a comment

ethanholz commented Jul 20, 2024

docs: AWS setup information #16

docs: AWS setup information #16

Conversation

ethanholz commented Jun 27, 2024

ethanholz commented Jun 28, 2024

dwhswenson left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

codecov bot commented Jul 11, 2024 • edited Loading

Codecov Report

ethanholz commented Jul 12, 2024

ethanholz commented Jul 16, 2024

ethanholz commented Jul 17, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ethanholz commented Jul 18, 2024

ethanholz commented Jul 18, 2024

dwhswenson left a comment

Choose a reason for hiding this comment

ethanholz commented Jul 20, 2024

codecov bot commented Jul 11, 2024 •

edited

Loading

ethanholz commented Jul 17, 2024 •

edited

Loading