Pilot an Infrastructure as Code tool for onboarding hubs to the cloud #15

Closed
bsweger opened this issue Feb 8, 2024 · 5 comments
Labels: cloud (work related to cloud-enabled hubs), infra (Infrastructure)

Comments

@bsweger
Collaborator

bsweger commented Feb 8, 2024

There are several options for adopting an infrastructure as code (IaC) approach to managing our cloud resources. To start this journey, we need to pick one and create an initial repo that hooks it up to the Hubverse's AWS account.

The goal of this issue is not to create a specific resource, but rather to establish a baseline for adding resources as needed.

@bsweger added the infra (Infrastructure) label on Feb 8, 2024
@bsweger self-assigned this on Feb 8, 2024
@bsweger
Collaborator Author

bsweger commented Feb 9, 2024

After some research, I'm going to give Pulumi a try as our IaC tooling.

Terraform, the industry-standard IaC tool, is a great product, but it has a steep learning curve and uses its own domain-specific configuration language (HCL). Given the team's lack of full-time DevOps staff and our fairly simple AWS usage, Terraform seems like overkill.

I haven't used Pulumi before, but it has reasonable adoption and supports Python as our IaC language.

Detailed notes on the research (links to internal Reich Lab Confluence...if anyone outside the lab is interested, please let me know): https://reichlab.atlassian.net/wiki/spaces/RLD/pages/4325787/Infrastructure+as+Code+IaC

Will go through the Pulumi tutorial and report back.

@annakrystalli
Contributor

Would Ansible be a potential candidate for our purposes?

I have worked with it before and found it reasonably easy to pick up.
The fact that it is effectively coded in YAML makes it more language-agnostic, which I find quite appealing. Of course, that's because I'm not a Python person. 😜

https://stackshare.io/stackups/ansible-vs-pulumi#:~:text=In%20Summary%2C%20Ansible%20and%20Pulumi,and%20community%20and%20ecosystem%20size.

@bsweger
Collaborator Author

bsweger commented Feb 13, 2024

Heya @annakrystalli, sorry for not seeing your note sooner. I don't have GitHub notifications going to my main inbox because they're so noisy; I'll try to fix that so I see the important pings!

I didn't have Ansible on my "things to look at list" because the practice of "provisioning infrastructure with Terraform and doing config management/deployments/orchestration with Ansible" is so common that I hadn't considered Ansible for the former.

Assuming that Ansible can configure every type of AWS resource we'll need (it probably can, our needs are simple: S3 buckets, IAM roles, IAM policies, OIDC identity provider), the main difference between it and a tool like Terraform or Pulumi is state management.

IaC tools that track state (usually by storing it in the vendor's cloud or by self-hosting it in your own cloud) make it much easier to 1) audit changes to infrastructure and 2) manage "drift", which happens when someone modifies a managed resource outside of the tool (e.g., using the AWS console).

For example, tools that manage state can show a "diff" before you actually apply infrastructure changes (this is from Pulumi, but Terraform works the same way):

pulumi up
Previewing update (hubverse)

View in Browser (Ctrl+O): https://app.pulumi.com/bsweger/hubverse-aws/hubverse/previews/283337c1-0892-431f-8618-00938007ca32

     Type                              Name                   Plan
 +   pulumi:pulumi:Stack               hubverse-aws-hubverse  create
 +   └─ aws:iam:OpenIdConnectProvider  github-actions         create

Resources:
    + 2 to create

Do you want to perform this update?

Ansible doesn't track state; it just creates or changes infrastructure as defined in the playbook. That is easier in some ways, because there's no state to track, but you lose the advantages described above.
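
To make the distinction concrete, here is a minimal sketch in plain Python of what a stateful tool does conceptually. It is not how Pulumi or Terraform are implemented; the function names (`plan`, `detect_drift`), the dictionary-based resource model, and the `hub-bucket` resource are all assumptions made for illustration.

```python
"""Toy model of stateful IaC: plan a diff, then detect drift.

Resources are modeled as {name: properties} dictionaries. This is an
illustration only, not how Pulumi or Terraform work internally.
"""


def plan(desired: dict, recorded: dict) -> dict:
    """Compare the code's desired resources with the last recorded state."""
    changes = {}
    for name, props in desired.items():
        if name not in recorded:
            changes[name] = "create"
        elif recorded[name] != props:
            changes[name] = "update"
    for name in recorded:
        if name not in desired:
            changes[name] = "delete"
    return changes


def detect_drift(recorded: dict, actual: dict) -> list:
    """List resources whose real-world properties differ from recorded state."""
    return sorted(
        name for name, props in recorded.items() if actual.get(name) != props
    )


# Hypothetical resources, loosely based on the preview output above.
desired = {
    "github-actions": {"type": "aws:iam:OpenIdConnectProvider"},
    "hub-bucket": {"type": "aws:s3:Bucket", "versioning": True},
}
recorded = {"hub-bucket": {"type": "aws:s3:Bucket", "versioning": True}}
# Someone turned off versioning in the AWS console, outside the tool.
actual = {"hub-bucket": {"type": "aws:s3:Bucket", "versioning": False}}

print(plan(desired, recorded))         # changes the tool would apply
print(detect_drift(recorded, actual))  # resources modified outside the tool
```

Both comparisons depend on having a recorded state to compare against, which is the extra moving part a stateful tool asks you to maintain.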

All things considered, I'd vote for something with state management. Despite the extra moving part, having a mechanism to know when the state of our infrastructure as defined via code differs from the state of our actual infrastructure is worthwhile for a smaller team.

@bsweger
Collaborator Author

bsweger commented Feb 13, 2024

@annakrystalli given that the lab as a whole is trying to level up on Python, my original thought was that using Python here might be more accessible than Terraform's domain-specific language. But maybe I'm making unwarranted assumptions? FWIW, my experience with YAML in this space is that it's hard to test, maintain, and debug once you reach a certain level of complexity.

What do you think about something like this: https://github.com/Infectious-Disease-Modeling-Hubs/hubverse-infrastructure/blob/main/__main__.py

This is an experimental repo to get a feel for Pulumi, which I've never used before.

Will be adding a README soon, but the upshot is that you define your resources in the Python app and then apply them either:

  • via a GitHub action which adds the diff (see above comment) to PRs for review
  • via command line, which is what I'm doing to experiment

@bsweger changed the title from "Decide on infrastructure-as-code and create initial connection to Hubverse AWS" to "Pilot an Infrastructure as Code tool for onboarding hubs to the cloud" on Feb 15, 2024
@bsweger
Collaborator Author

bsweger commented Feb 15, 2024

I changed the title of this issue after realizing that "deciding" on a tool isn't something we can do via a discussion here.

I did a test drive of Pulumi and got a process working that will provision the AWS infrastructure required to mirror hub data to S3. The repo is here: https://github.com/Infectious-Disease-Modeling-Hubs/hubverse-infrastructure
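
For readers unfamiliar with the AWS side, the sketch below builds the kind of IAM trust policy that typically lets a GitHub Actions workflow assume a role through an OIDC identity provider. It is not taken from the hubverse-infrastructure repo: the function name is hypothetical, and the account ID, organization, and repo names are placeholders.

```python
import json


def github_oidc_trust_policy(account_id: str, org: str, repo: str) -> dict:
    """Build a trust policy allowing one GitHub repo's workflows to assume a role."""
    provider = "token.actions.githubusercontent.com"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{provider}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    # Only accept tokens issued for AWS STS...
                    "StringEquals": {f"{provider}:aud": "sts.amazonaws.com"},
                    # ...and only from workflows in the named repository.
                    "StringLike": {f"{provider}:sub": f"repo:{org}/{repo}:*"},
                },
            }
        ],
    }


# Placeholder values, not the Hubverse's real account or repos.
policy = github_oidc_trust_policy("123456789012", "example-org", "example-hub")
print(json.dumps(policy, indent=2))
```

A document like this would be attached to the IAM role that a hub's workflow assumes; a separate permissions policy on that role would then grant the S3 access needed for mirroring.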

It looks somewhat intimidating (every tool will look intimidating in its own way as we learn about it). It's important, however, to note that the learning curve for IaC tools is separate from the learning curve for understanding the AWS resources themselves. The former we can control to some extent; the latter we can't. It's part of the cost of being on the cloud.

As a next step, I propose a Pulumi demo so we can gather feedback and decide whether we'd like to explore an alternative.

Projects
Status: Done
2 participants