
Version and TF module updates #13

Closed · wants to merge 4 commits

Conversation

@git-ival (Collaborator) commented Sep 13, 2023

The changes can be summarized as follows:

  • ignore tfvars in VCS
  • add input vars for ami and instance_types
  • add options to upstream and downstream cluster objects
    • refactor of downstream_clusters local var
    • added optional datastore field
  • add more specific version constraints to tf providers
  • default to rancher-eng aws profile for aws tf provider auth
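
For the last two items, a minimal sketch of the shape of the change (the exact version constraint is an assumption; see the diff for the real values):

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # assumed constraint, more specific than an open-ended one
    }
  }
}

provider "aws" {
  region  = var.region
  profile = "rancher-eng" # default AWS profile used for provider auth, per this PR
}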

@moio (Owner) left a comment:

👋 @git-ival, thanks for this contribution!

Sorry for bringing in quite a lot of questions to your first PR 😇

Don't be scared, I like most of what I am seeing here 😉

terraform/main/aws/inputs.tf (review thread: outdated, resolved)
terraform/main/aws/main.tf (review thread: outdated, resolved)
variable "downstream_config" {
type = object({
name = optional(string)
server_count = number
@moio (Owner):

This opens an interesting design question: if we make those parameters variable, then a commit id will not suffice to describe the whole test environment - one will need the id plus all specified variables.

My original intent was to go all-in on hardcoding parameters in code, so that the commit id is all one needs to know to repeat the test. Of course there are bits that can't be hardcoded - namely credentials and other security material - but those would be the only exceptions.

That intent comes from the past experience that those parameters then have to be tracked, and frequently end up in another Git repo that contains workflow definitions specifying environment variables for the tests. So one needs two Git repos, and their commit ids, before knowing the full picture.

What are your reasons to keep those variables variable?

@git-ival (Collaborator, PR author):

I see, I hadn't realized that was the method being used for keeping track of the test environment specifications. I like the flexibility that input variables can offer, but I understand the desire to be able to quickly and easily reproduce an env.

I see a potential use case for corral and its "corrals" (corral-packages). Or at the very least, the concept of a sort of manifest for an env that could be updated per desired config (see an example).

Alternatively we could track the actual testing, observations/results and config (input vars, etc.) via GH issues if that suffices and is desirable.

Otherwise, we can revert this change and leave things hardcoded.

@moio (Owner):

Two aspects: first, I understand the need to override any piece of configuration as a "temporary" measure during development. At the moment that is possible via tfvars, but not all useful aspects in main.tfs are actually variables - I have no problem accepting patches to expose any useful aspect as a variable with a default value, so that it can be overridden when needed. In future, I can imagine this project running in CI always on defaults, and engineers being able to override any aspect in their setups during development. I am also open to considering alternative overriding mechanisms to make DX better.
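
A minimal sketch of that override pattern (the variable name and default value are hypothetical): every hardcoded value becomes a variable whose default preserves the current behavior, so running on defaults, as CI would, reproduces the committed configuration exactly.

variable "server_instance_type" {
  description = "EC2 instance type for cluster nodes"
  type        = string
  default     = "t3.large" # hypothetical: the previously hardcoded value becomes the default
}

A developer can then deviate temporarily with terraform apply -var-file=dev.tfvars, without that deviation ever needing to be committed.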

Second: in a future where this project runs in CI, and where there is more than one set of "configuration parameters" (eg. the testsuite running on both AWS and Azure at any given time), where should such parameters be stored? Choosing where they are stored is important because we will need to record that information long term along with test results, in order to make them repeatable/verifiable. The obvious option in my mind is some git repo, as we will need tracking/PRs/blame/etc.

I see two options:

  1. any such configuration parameter lives in the testsuite repo (this) alongside code. Implication: "environment files"/"corral files"/"tfvar files" are stored in a directory here, and a commit id to this repo contains everything needed to reproduce (modulo security-sensitive parameters such as AWS credentials). Other implication: a new test environment implies creating a new branch and changing parameters there. Further development will go on in master, and if new features (eg. new tests) are implemented, a rebase of test branches is needed to "sign off" that they now run on the new code - along with any related parameter adjustment
  2. there is a separate repo with "environment files"/"corral files"/"tfvar files". Thus a build is defined by a combination of one commit ID from this repo and one commit ID with the "parameters repo". More pragmatically the parameters repo could contain multiple environments, so a build will be defined by two commit IDs and one "identifier" of some sort (eg. directory name or manifest path). Advantage: no need to branch off from master in this repo to create a new environment. Disadvantage: two repos to keep in sync.

In my experience I have seen projects using 2., and the bookkeeping kind of bored me. In this project I tried to go with 1., and found that rebasing is also meh, but I still like it a little bit better. Still, I don't want to take a decision based on my feelings/tastes alone.

What is your experience and what do you think could serve us best?
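
To make option 1. concrete: a hypothetical "environment file" would be just a tfvars file committed next to the code, so a single commit id pins both code and parameters (path, variable names and values below are made up for illustration):

# environments/aws-medium.tfvars (hypothetical path within this repo)
num_downstreams      = 5
server_instance_type = "t3.xlarge"
ami                  = "ami-0123456789abcdef0" # placeholder AMI id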

@git-ival (Collaborator, PR author) commented Sep 21, 2023:

Yeah, the problem of where to store test env configs has been on my mind for some time, even outside of this repo's concerns. I think, like you said, it would likely be easiest to store them in a VCS/repo in order to more easily track the point-in-time configs, etc.

As an alternative to the more "brute force" git branching/commit ID method, corral is actually capable of creating and storing packages in an OCI registry. See here for an example of some packages created via corral and pushed to GitHub Packages for the repo. Corral has a command to easily publish a given package to a given registry: corral package publish </path/to/local/corral/package> <OCI registry path:tag>. I think this could help solve the config storage issue by publishing packages to an OCI registry (tagged with the commit id and/or datetime stamp); this would also make things very easily reproducible for anyone who is less dev-oriented.

@git-ival (Collaborator, PR author):

In the meantime, I think leaving the downstream cluster config as an input variable is still feasible since I have maintained the defaults for each field, and I will make num_downstreams default to 5 to stay in line with the terraform config that was in place before these changes.
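
A sketch of what that looks like (the object fields beyond those visible in the diff, and their default values, are assumed; optional() requires Terraform >= 1.3):

variable "num_downstreams" {
  type    = number
  default = 5 # matches the terraform config in place before these changes
}

variable "downstream_config" {
  type = object({
    name         = optional(string)
    server_count = number
  })
  default = {
    server_count = 1 # assumed value; the point is that every field keeps a default
  }
}

With defaults on every field, applying with no tfvars at all reproduces the previous hardcoded setup.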

@moio (Owner) commented Sep 21, 2023:

> In the meantime, I think leaving the downstream cluster config as an input variable is still feasible since I have maintained the defaults for each field, and I will make num_downstreams default to 5 to stay in line with the terraform config that was in place before these changes.

Yes, that makes total sense regardless of the rest of this discussion. Thanks.

Back to your point:

> corral is actually capable of creating and storing packages in an OCI registry

That is an interesting feature, and it helps with an important problem (ease of reproducibility), but unless I am missing something I see that as a separate design problem to solve.

Specifically, I do not see corral helping answer the question "where are test env vars tracked". Let's assume we develop a benchmark in order to, say, stress test the Steve API. That will be parametric on the Rancher version, the number of downstream clusters, the distro used underneath Rancher, the number of nodes and their sizing, the number of concurrent users hammering the API, and probably many more parameters.

IIUC corral will not help in storing those parameters, nor in giving a certain "set of environment variable values" an ID, nor in having a way to reference that ID from test results (eg. stored Prometheus metrics). Am I missing anything?

If not, where do you envision storing, and how do you envision referencing, those per-test sets of environment variable values?

@git-ival (Collaborator, PR author):

I think this might be a bit easier to discuss via Teams or Slack but:

> Specifically, I do not see corral helping answer the question "where are test env vars tracked".

To this point, I would imagine that this is where the corral package manifest comes in. Say we converted an existing module to be more "dynamic" (various input vars, maybe even some logic switches/optional resource creation if desired/needed); we could then create different corral package templates that utilize this module and pass different input vars to modify exactly what sort of infra is spun up. Thus, rather than relying on various git branches, we could have a single git branch with multiple corral package templates that have the input var values baked into the manifest. All in the same repository.

In this way we could create various templates with specific input vars baked into the manifest (much the same way as you are doing via hardcoding inputs in git branches), and we could publish them to an OCI registry (like GitHub Packages), using the git commit ID to tie them to the point-in-time versions of the modules, etc.
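
On the "logic switches/optional resource creation" idea, the usual Terraform idiom is a resource count toggled by a boolean input. A minimal sketch, with all names assumed for illustration:

variable "create_loadbalancer" {
  type    = bool
  default = false # defaults reproduce the current setup
}

resource "aws_lb" "ingress" {
  count    = var.create_loadbalancer ? 1 : 0 # resource only created when toggled on
  internal = false
  subnets  = var.public_subnet_ids # assumed input variable
}

Different corral package templates would then just bake different values for inputs like this into their manifests.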

terraform/modules/aws_loadbalancer/inputs.tf (review thread: outdated, resolved)
terraform/modules/aws_loadbalancer/main.tf (review thread: resolved)
@moio (Owner) left a comment:

I left some questions.

One more general note: this PR and its commits have now lost the original intent, and contain fixes and improvements that go in many different directions.

In future, please consider separate PRs (or at least separate commits) to make granular merging easier. Thanks.

@@ -137,7 +138,7 @@ resource "aws_route_table_association" "secondary_private" {
 }

 resource "aws_vpc_dhcp_options" "dhcp_options" {
-  domain_name = var.region == "us-east-1" ? "ec2.internal" : "${var.region}.compute.internal"
+  domain_name = contains(["us-east-1", "us-west-1"], var.region) ? "ec2.internal" : "${var.region}.compute.internal"
@moio (Owner):

It is my understanding that the original is the default behavior:

https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-naming.html#instance-naming-rbn

    Format for an instance in us-east-1: ec2-instance-id.ec2.internal
    Example: i-0123456789abcdef.ec2.internal
    Format for an instance in any other AWS Region: ec2-instance-id.region.compute.internal
    Example: i-0123456789abcdef.us-west-2.compute.internal

What is the reason to set .ec2.internal for us-west-1?

@git-ival (Collaborator, PR author):

It seems like you're correct, although I remember facing some issues when trying to use this module with us-west-1 without this change. I will revert this for now.

  tags = {
    Project = var.project_name
    Name    = "${var.project_name}-nat-eip"
  }
  depends_on = [aws_internet_gateway.main]
@moio (Owner):

What is the reason to add this dependency? Did you encounter errors without it?

@git-ival (Collaborator, PR author):

Yes, see the documentation for aws_eip

The relevant note: [screenshot of the aws_eip documentation, noting that an EIP may require the IGW to exist prior to association, and to use depends_on to set an explicit dependency on the IGW]
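
For context, a minimal sketch of the pattern that documentation note describes (resource and reference names assumed):

resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id # assumes a VPC named "main" elsewhere in the module
}

resource "aws_eip" "nat" {
  domain = "vpc"
  # Explicit dependency: no attribute reference ties these resources together,
  # and the EIP may fail to associate if the IGW does not exist yet.
  depends_on = [aws_internet_gateway.main]
}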

@@ -0,0 +1,21 @@
+variable "subdomain" {
@moio (Owner):

What is the reason to add a load balancer to the Terraform configuration, rather than having that managed by Kubernetes?

@git-ival (Collaborator, PR author):

No particular reason other than it is what I'm accustomed to. I've personally never used k8s in order to provision and manage cloud-provider loadbalancers 🤷‍♂️. This module can be removed if it's not particularly useful.

@moio (Owner):

My problem with this is that it is not really realistic in terms of what actual prod environments look like, and it's significantly more complex than just exposing a node's IP. I proposed the latter approach in #17. Please remove this part.

@moio (Owner) commented May 3, 2024:

Work will continue as part of project dartboard; I am closing this PR for the time being.

@moio closed this May 3, 2024