Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: AWS setup information #16

Merged
merged 14 commits into from
Jul 19, 2024
Merged

Conversation

ethanholz
Copy link
Contributor

This PR aims to provide the minimum docs needed to set up your AWS account for use with this action. This is not finished because I think there may be some value in providing a more turn-key experience but I think this works well enough for now.

@ethanholz
Copy link
Contributor Author

One thing to consider is the addition of some IaC section for provisioning (either Terraform, CloudFormation, or Pulumi). While this is not very common in molecular software, it does represent best practices in broader DevOps spaces. Not sure how we would add this but wanted to note it here.

Copy link
Contributor

@dwhswenson dwhswenson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some initial thoughts on this (I know it isn't marked as ready for review yet, but I thought I'd give it a quick look already.)

docs/aws.md Outdated
2. Go to the IAM console.
3. In the navigation pane, choose Policies.
4. Create a new policy by clicking "Create Policy".
5. Next, click JSON and paste the following JSON with the required permissions and then click "Next".
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an AWS Managed policy that matches this? That might simplify a few of these steps.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The closest Managed Policy for this is AmazonEC2FullAccess which contains the following permission set:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": "ec2:*",
            "Effect": "Allow",
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": "elasticloadbalancing:*",
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": "cloudwatch:*",
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": "autoscaling:*",
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": "iam:CreateServiceLinkedRole",
            "Resource": "*",
            "Condition": {
                "StringEquals": {
                    "iam:AWSServiceName": [
                        "autoscaling.amazonaws.com",
                        "ec2scheduled.amazonaws.com",
                        "elasticloadbalancing.amazonaws.com",
                        "spot.amazonaws.com",
                        "spotfleet.amazonaws.com",
                        "transitgateway.amazonaws.com"
                    ]
                }
            }
        }
    ]
}

docs/aws.md Outdated
4. Select an Amazon Linux 2 AMI.
5. Choose an instance type.
6. Configure the instance details.
7. Connect to the instance and install `docker`, `git`, and enable the `docker` service. This will be distro dependent.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably worth us looking for an AMI that already meets this? Or we could even ship our own AMI (although that's a region-specific hassle).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my understanding, the Deep Learning AMI should work well here but has some limitation in terms of instances that we can ship with it.

Since I am running into my own issues with this (regarding the openmm-gpu-test repo), we might use a tool like packer for building AMIs across cloud providers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the current pipeline setup for AMIs, I will be able to install CUDA and docker quite easily. Ubuntu comes shipped with git so we should be good there. We may need to setup pipelines with Docker as well.

docs/aws.md Outdated
8. Feel free to install any other useful tools that you may need to get started.
9. Create an image of the instance using the following [AWS Docs](https://docs.aws.amazon.com/toolkit-for-visual-studio/latest/user-guide/tkv-create-ami-from-instance.html) (this may take a while)
10. Make sure to remove the instance once you are done creating the image.
5. Optional: Create a VPC with subnet and security group.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be worth explaining why you would/wouldn't need a VPC/security group. Am I correct that the main reason to specify the VPC is if you're using the AWS account for other things, and was to keep a separate VPC for different account use cases?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct on a VPC. Your security group is more focused on access and what machines might be able to SSH in your machines. I go back and forth on whether or not the VPC/subnet docs should be included because most of our community doesn't have a need for it right now. Those who have a need, generally know they need to set it up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These have been removed in dcf41b5

Copy link

codecov bot commented Jul 11, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 94.64%. Comparing base (81c4f29) to head (a944a4d).
Report is 21 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #16      +/-   ##
==========================================
+ Coverage   94.11%   94.64%   +0.53%     
==========================================
  Files           3        3              
  Lines         289      299      +10     
==========================================
+ Hits          272      283      +11     
+ Misses         17       16       -1     
Flag Coverage Δ
unittests 94.64% <ø> (+0.53%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@ethanholz
Copy link
Contributor Author

One thing to consider is the addition of some IaC section for provisioning (either Terraform, CloudFormation, or Pulumi). While this is not very common in molecular software, it does represent best practices in broader DevOps spaces. Not sure how we would add this but wanted to note it here.

I don't think this will be published in the initial release. I have some of these resources available but not necessary at this point in time.

@ethanholz
Copy link
Contributor Author

Added some more notes on using P2 instances, this section might be a moot point of we can use G4dn instances instead.

@ethanholz ethanholz marked this pull request as ready for review July 16, 2024 21:07
@ethanholz
Copy link
Contributor Author

ethanholz commented Jul 17, 2024

Two notes of things I would like to expand on for use:

  1. Security concerns and what steps should be taken for use
  2. Refactor the setup to reference and use the openmm-gpu-test repo

Although, this might be enough to do that in a separate PR

docs/aws.md Outdated Show resolved Hide resolved
1. This can be done with either a Personal Access Token or a Fine-Grained Personal Access Token.
2. Go to your GitHub account settings.
3. Click on "Developer settings".
4. Create a new token with `repo` scope.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm having a little trouble with this - Our repos are owned by our org, so making a personal access token, I don't have access to openforcefield org repos (in this case I only have access to openforcefield/openff-evaluator - my personal fork of its feedstock.)

Screenshot 2024-07-17 at 5 18 44 PM

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, the org level settings have to be updated to allow create of PATs for the org:

Screenshot 2024-07-17 at 5 25 38 PM

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may also be able to set the resource owner to the org as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it looks like I can set the org as the resource owner now that I've allowed PATs at the org level.

The next issue I hit is that the "repo" scope is an option on the "classic" token creator page, but that page doesn't let me select the org as the resource owner (I'm thinking that this would work, but the token would then cover BOTH the org and my personal repos).

On the other hand, the "beta" token creator page does let me assign the org as resource owner, but it's not clear which permissions on this page are equivalent to "repo" from the classic token creator page. I'd love some guidance on that.

I'm heading out for the night but will try to pick this back up tomorrow.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I largely opt to use fine-grained Tokens instead of classic tokens as that is the direction that GitHub seems to be moving. I will update these docs to include the fine-grained access token permissions needed (I have what needs to be added just have to write it up).

Thanks for working on this!!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One annoyance about fine-grained tokens is that they expire. I wrote a little actions workflow that will fail when your token should be replaced: https://github.com/omsf-eco-infra/scripts/blob/main/.github/workflows/check_token.yml

Co-authored-by: Jeff Wagner <jwagnerjpl@gmail.com>
@ethanholz
Copy link
Contributor Author

Was able to test AWS using g4dn instances and the AWS Deep Learning AMI and it succeeded out of the box. Going to provide edits that recommend this route instead as it removes maintenance burden and works OOTB with CUDA 12.

Migrating to support `g4dn` instances instead of `p2` since they are a
better value for most of our use cases.
@ethanholz
Copy link
Contributor Author

@dwhswenson I believe this is ready for review. I have a couple minor README updates that will need to be made but are not within scope for this PR and can be made outside of this.

Copy link
Contributor

@dwhswenson dwhswenson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Merging. Anything else needed before we cut a release?

@dwhswenson dwhswenson merged commit 59f5698 into omsf-eco-infra:main Jul 19, 2024
5 checks passed
@ethanholz
Copy link
Contributor Author

LGTM! Merging. Anything else needed before we cut a release?

We will add a security advisory to the README and a reference to this page in the docs. Will get to this first thing Monday and push v0.2.0.

@ethanholz ethanholz deleted the docs/aws branch July 20, 2024 15:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants