Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Terraform love #126

Merged
merged 26 commits into from
Apr 20, 2022
Merged

Terraform love #126

merged 26 commits into from
Apr 20, 2022

Conversation

mmlb
Copy link
Contributor

@mmlb mmlb commented Mar 18, 2022

Description

All sorts of things mainly aimed at terraform setup. Quick one-liners for each change follow, more info/context usually found in the commit message.

direnv: use has from stdlib

Simplification of .envrc by making use of stdlib.

git: Manage git ignores in just one .gitignore file

I was originally updating .gitignore in the terraform folder, but the root file had relevant entries so I decided to just use one.

tf: Get rid of mention of ewr1 in comment

This bit of comment was unhelpful and actually very wrong.

tf: Add output value for the provisioner ssh hostname

Makes for easier ssh'ing to the provisioner, can do ssh $(tf output -raw provisioner_ssh).

tf/setup: Configure bash to be stricter/safer

Speaks for itself I think.

tf/setup: Ensure all functions use same execution mode

Some functions were running in a subshell, this is not necessary and relatively unknown.

tf/setup: Use apt-get helper function

Pulling common apt-get args into a function instead of copy/pasting in each call.

tf/setup: Only explicitly install the docker packages

Why are we manuall installing deps when apt is better at it than humans?

tf/setup: Don't hard code the arch when adding docker apt-repository

Pretty self explanatory, bit me when I tried a aarch64 machine.

tf/setup: Install docker-compose using pip

Also ran into no file (binary ?) for non x86_64 machines in GitHub releases.

tf/setup: Make main function actually functional

main (as a func) wasn't really doing anything useful before, now the file can be sourced vs executed.

tf/setup: Do not restart docker service

No need to restart docker, just because we installed docker-compose.

tf/setup: Persist 2 separate network config

This way machine stays useful after reboots.

tf/setup: Improve correctness of get_second_interface_from_bond0

Mostly a nit/theoretical fix, but nice to have imo.

tf/setup: Persist iptables gw rules

This way machine stays useful after reboots.

tf: Add local variable for worker_macs

So lines that need the mac read better.

tf: Add output for worker_macs

This way we can get to watching boots logs more quickly.

tf: Add outputs for provisioner and worker ids

For look up in EMAPI/Portal.

tf: Modify compose/.env file for repeat docker-compose runs

Before this docker-compose would do the wrong thing if ran manually because the env vars were not around.

tf: Put all setup logic in setup.sh

I originally did this because there was a race between userdata and tf remote-exec, but that is no longer true.
This is good to have anyway so that we only need to look at one file to grok the setup/run process instead of two.

tf: Add some interactive user goodies

Similar to the ones for vagrant, makes for nicer interactive use.

tf: Use format instead of formatlist for worker_sos output

This way we can ssh into sos more easily.

vagrant: Move all provisioner code into just one script

One shell script is easier to read than a bunch of shell blocks in a ruby file imo.

vagrant: Install docker and docker-compose via setup.sh

Running vagrant w/o these plugins causes vagrant to fetch them and exit, which means we need to bring vagrant up again.
This is pretty poor experience imo, by installing in setup.sh we don't need anything from the host os.

deploy: Use the same folder path on both terraform and vagrant

Who cares that we're running in tf vs vagrant anyway, this way mental models/paths are valid for both.
This also lets the setup.sh scripts have more things in common that are semantically common, which hopefully means they'll be easier to keep in sync.

Why is this needed

I tried running sandbox on some arm machines and ran into all sorts of hard coded assumptions and race conditions that don't seem to hit in x86 land. This got me to spend some time running lots of tf setups and hitting the a few bugs. Having to cd and run full command names also got boring.

How Has This Been Tested?

Lots of terraform apply.

How are existing users impacted? What migration steps/scripts do we need?

More reliable/generic tf setup.

Checklist:

I have:

  • updated the documentation and/or roadmap (if required)
  • added unit or e2e tests
  • provided instructions on how to upgrade

@displague
Copy link
Member

Please hold on to this until #96.

@mmlb
Copy link
Contributor Author

mmlb commented Mar 22, 2022

Please hold on to this until #96.

Sounds good, will do. I noticed you add more to the cloud-init/userdata where as I got rid of it completely. I did that because it was easier to see what was going on while using the remote-exec provisioner. I think its good to see this considering that this repo is meant as a demonstration of setup. its also easier to debug if something does end up going wrong. How do you feel about what I did @displague?

@displague
Copy link
Member

@mmlb I haven't reviewed your PR yet. We could have an SSH provisioner wait for the end of the userdata by watching the cloud init logs for the end of scripting, reporting output over SSH while it waits.

Perhaps the Terraform module and the sandbox Terraform configuration should be refactored some more. Sandbox would call upon the module (we would move this elsewhere) with arguments.

@mmlb
Copy link
Contributor Author

mmlb commented Mar 23, 2022

@mmlb I haven't reviewed your PR yet. We could have an SSH provisioner wait for the end of the userdata by watching the cloud init logs for the end of scripting, reporting output over SSH while it waits.

I had this initially but then after a while I realized that there's not really a good reason to do both userdata and a remote-exec, that it made most sense to have the execution in just one script. I moved everything over to remote-exec as it was a better experience for me.

Perhaps the Terraform module and the sandbox Terraform configuration should be refactored some more. Sandbox would call upon the module (we would move this elsewhere) with arguments.

That feels like a separate thing altogether. Sandbox and the terraform setup is pretty opionoated and relatively simple, makes sense for dev/messing around. Having a module seems like it'll increase the complexity a bunch, worth it for more serious cases but too much for sandbox to me.

@mmlb mmlb force-pushed the terraform-love branch 10 times, most recently from 9509dfd to 09f505d Compare April 4, 2022 20:12
@mmlb mmlb marked this pull request as ready for review April 4, 2022 20:28
@mmlb mmlb requested a review from displague April 4, 2022 20:28
@mmlb
Copy link
Contributor Author

mmlb commented Apr 4, 2022

@jacobweinstock would love your 👁️ here too

More ergonomic.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
Lots of dupes here and it was confusing, better to just have one place.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
It's incorrect, surprise surprise...

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
Makes it nice and easy to connect to the provisioner using cli.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
Signed-off-by: Manuel Mendez <mmendez@equinix.com>
func() is not the same as func{}, the latter is run in a new subshell. There
does not seem to be a need for the subshell behavior so I'm changing to the
more common (in this file, and also generally speaking) {} form.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
This helper function lets us avoid the need to specify the env and extra args
every single time we want to interact with packages. This uses apt-get and not apt
because apt prints the following warning message:

> # WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

and there's no overwhelming reason to favor apt > apt-get, so lets not risk it.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
Let dpkg package dependencies and apt-get figure out dependencies.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
mmlb added 17 commits April 18, 2022 15:43
This is not needed and would mess up someone trying to run the tf stack on
aarch64 (.../me whistles innocently...) or other arches.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
GitHub releases only has binaries for x86, which sort of don't work on other
architectures. Luckily the cheese shop has all the flavors of docker-compose
any one would be interested in.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
Previuosly main didn't really offer much over just doing the same work outside
of a function, main would always run. With this new setup we can source
setup.sh and have nothing run or just have a full run if executed instead.

Very much like Python's `if __name__ == '__main__': # call main func` pattern.

Also add some spaces to group functionality together for easier
reading/scanning.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
Its not necessary.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
Without this the network settings are lost on reboot, which doesn't make for a
very useful development machine. I put the layer2 iface config within a
file in interfaces.d to make reruns of this script work correctly after the next
commit.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
I noticed that the slaves names in the bonding/slaves file was not guaranteed
to be in any specific order so added a sort call. I also wanted to future proof
the function in case there's ever a type that has more than 2 nics in the bond
by default by ensuring we use the second nic. Finally, I made the function
behave correctly if the script is being re-run and the bond has already been
broken up, this was useful while debugging breakages and fixing this script.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
Otherwise gateway functionality is broken after a reboot.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
This variable doesn't assume that port[1] == "eth0", instead it grabs the mac
for the port whos  name is eth0. It also keeps track of the mac(s) in an array
because I was messing with multiple workers at one point and the logic to
figure this out was tricky and I don't want to lose it.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
Makes it easy to get the macs and make use of, for example look at boots logs.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
Makes using them much easier.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
Later/manual runs of docker-compose will do the wrong thing without this
because the env settings won't be populated for those runs. This fixes that
error by first modifying the .env file which docker-compose will use.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
No sense in splitting up the setup logic into 2 different files.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
Alias of dc=docker-compose alias for 700% effeciency gains. Alias tink for
even more gains!

Automatically cd'ing to /root/sandbox/compose for interactive logins because
that probably makes a lot of sense.

And a `tink` wrapper/helper script that calls docker-compose underneath. Much
nicer than `dc` calls everywhere and will even work with `watch`.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
Makes it easier to use since we can avoid jq.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
This just seems easier to manage one script instead of many ruby blocks. Bonus
points for having it look similar to terraform/setup.sh so common things look
the same(ish).

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
Without this a user needs to run vagrant more than once the first time they
try this out, once to fetch the vagrant-docker-compose plugin and then again
to actually run through. Why go through that when we already have code for
install docker and docker-compose that we have to maintain seprately anyway
and leads to a better first experience too?

Another minor benefit is that we increase the code sharing between terraform
and vagrant's setup.sh, which is nice.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
Better to have everything look the same for easier diffing/comparing
vs having the tool name in the path.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
@displague
Copy link
Member

provisioned successfully - looks good. I like the simplifications.

Perhaps in the future, we can go all-in on cloud-config format rather than using shell scripts. The devils in the details though.

@displague displague added the ready-to-merge Signal to Mergify to merge the PR. label Apr 20, 2022
@mergify mergify bot merged commit 18b0444 into tinkerbell:main Apr 20, 2022
@mmlb mmlb deleted the terraform-love branch April 20, 2022 13:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready-to-merge Signal to Mergify to merge the PR.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants