Lots of refactors and cleanups #124

mmlb · 2022-03-14T22:53:51Z

Description

A bunch of QoL fixes for compose and vagrant.

Why is this needed

I was having trouble following some of the things done in compose especially when comparing vagrant vs terraform. I had a hunch that somethings weren't necessary and set about to check and started unraveling some threads.

How Has This Been Tested?

Lots of:

vagrant destroy -f; vagrant up provisioner; ... wait ...; vagrant up machine1; reboot

throughout deving and at the end.

How are existing users impacted? What migration steps/scripts do we need?

Should be easier to contribute to sandbox as there's even less special-case-stuff in vagrant and in the compose dirs.
The "one-off-services" directories being named after the services themselves was nice to be able to go back and forth easily without having to a mental-map-lookup.

Checklist:

I have:

updated the documentation and/or roadmap (if required)
added unit or e2e tests
provided instructions on how to upgrade

Signed-off-by: Manuel Mendez <mmendez@equinix.com>

We can totally test docker-compose stuff without vagrant or terraform so lets add it to shell.nix to ensure its around. Signed-off-by: Manuel Mendez <mmendez@equinix.com>

Signed-off-by: Manuel Mendez <mmendez@equinix.com>

It's unnecessary and makes some changes I've got lined up to Vagrantfile more complex. Signed-off-by: Manuel Mendez <mmendez@equinix.com>

Its completely unnecessary. Docker-compose handles relative files relative to the project folder, which by default is the folder that contains the docker-compose.yml file. This makes the docker-compose.yml easier to parse for humans. Verified with `docker-compose -f $file config`, in and out of vagrant, in and out of the compose dir, and by running a vagrant run. Signed-off-by: Manuel Mendez <mmendez@equinix.com>

jacobweinstock · 2022-03-15T16:45:35Z

Hmm....there's a lot in this PR. step 4 in the vagrant-virtualbox quickstart failed for me with the following:

Every 2.0s: tink workflow events ; tink workflow state                                                                                                                              2022-03-15 16:38:19

Error: tink workflow events [id] takes an arguments
Usage:
  tink workflow events [id]

Examples:
tink workflow events [id]

Flags:
  -h, --help   help for events

Global Flags:
  -f, --facility string   used to build grpc and http urls

tink workflow events [id] takes an arguments
Error: tink workflow state [id] [flags] requires an argument
Usage:
  tink workflow state [id] [flags]

Examples:
tink workflow state [id]

Flags:
  -h, --help   help for state

Global Flags:
  -f, --facility string   used to build grpc and http urls

tink workflow state [id] [flags] requires an argument

A little further investigation shows:

$ cat /vagrant/compose/create-tink-records/workflow/workflow_id.txt
cat: /vagrant/compose/create-tink-records/workflow/workflow_id.txt: No such file or directory

I am still working through more of this PR.

mmlb · 2022-03-15T16:49:33Z

Hmm....there's a lot in this PR. step 4 in the vagrant-virtualbox quickstart failed for me with the following:

Thanks for linking this, I have not gone through that file top-to-bottom, so not surprised something isn't working. I'll see what else is wrong.

I am still working through more of this PR.

Thanks for looking 😄

jacobweinstock · 2022-03-15T16:56:06Z

the machine1 vm, in the vagrant-virtualbox quickstart, is successfully booting into Hook, but failing to pull container images and therefore failing to run any workflows on the machine1.

mmlb · 2022-03-15T16:57:21Z

the machine1 vm, in the vagrant-virtualbox quickstart, is successfully booting into Hook, but failing to pull container images and therefore failing to run any workflows on the machine1.

bummer, guess I'll have to dust off a machine and fire up virtualbox to debug.

This is better than calling `provisioner.vm.provision :docker_compose` because it allows the user to continue messing with the compose stack after the first run and not lose the variables. Before this change the manifests would definitely wrong if using libvirt on subsequent interactions. Signed-off-by: Manuel Mendez <mmendez@equinix.com>

I noticed that I could avoid provision_provisioner by just defining the manifest_suffix in the per-provider override block and then just use it afterwards in common code. Then I moved all the provision logic together because it doesn't really make sense to have it all over the place. This included `$script` since there's no point in it being a global variable if its only used in the provisioner provision. Signed-off-by: Manuel Mendez <mmendez@equinix.com>

Nothing else is needed in vagrant so let's not sync them. Signed-off-by: Manuel Mendez <mmendez@equinix.com>

Alias of dc=docker-compose alias for 700% effeciency gains. Alais tink for even more gains! Automatically cd'ing to /vagrant/compose for interactive logins because that probably makes a lot of sense. Signed-off-by: Manuel Mendez <mmendez@equinix.com>

osie/hook stuff on its own, followed by container images, then manifests and finally mac/ip stuff. Signed-off-by: Manuel Mendez <mmendez@equinix.com>

This way we have long running services separated from one-off executions so its quicker to navigate through the file. Within each "section" the services are ordered alphabetically, once again for quicker navigation. Signed-off-by: Manuel Mendez <mmendez@equinix.com>

Scanning/navigation/knowing where you are wrt the rest of the file is just easier if things are sorted. Signed-off-by: Manuel Mendez <mmendez@equinix.com>

Lets organize it by having service container image/exec fields first (including volumes needed), then its "outputs" (ports), then runtime monitoring. This seems to keep the important/usefull info up top of the service definition vs having to search around (in different relative locations in different services...). Signed-off-by: Manuel Mendez <mmendez@equinix.com>

This is easier to copy/paste, and I'l soon be messing with these so that they won't be so long that we feel the need to break them up into multiple lines. Signed-off-by: Manuel Mendez <mmendez@equinix.com>

This way we don't have to keep the mapping in context, the service name is the directory name. Signed-off-by: Manuel Mendez <mmendez@equinix.com>

Signed-off-by: Manuel Mendez <mmendez@equinix.com>

This is a pretty big commit, which I don't like, but I don't feel like splitting up, its easier to read as a whole vs as a series of commits (IMO). The changes are as follows: Get rid of `exec_in_bash`: tink-cli recently added bash to the image, along with adding the needed packages lazily allows us to let shebang lines do its thing. Use `set -euxo pipefail` instead of just `set -xo pipefail`: -o pipefail w/o -e is basically useless since it won't stop execution, just stops the command early, and -u adds to the safety. Use jq to modify json files instead of sed: This is just easier to read imo and is much less brittle. Output modifications to stdout: This way we can see what happened using docker-compose logs. Use tmpfiles + atomic renames: Avoids having a "corrupt" file if we interrupt in the middle modification. Not really a big deal but good to do imo. Redirect `tink` stderr to /dev/null: I don't care to see useless debug messages when I'm trying to debug other things. Simplify the IMG_URL modification: Only modify the IP address, and avoid assuming the name of the image which is better for future modification. Only create template, workflow entries if needed: This way we don't have to drop all the records just to run a new workflow for example. Overall I think this ends up being a very nice simplification of the create-tink-records implementation. Signed-off-by: Manuel Mendez <mmendez@equinix.com>

mmlb · 2022-03-15T22:08:43Z

the machine1 vm, in the vagrant-virtualbox quickstart, is successfully booting into Hook, but failing to pull container images and therefore failing to run any workflows on the machine1.

bummer, guess I'll have to dust off a machine and fire up virtualbox to debug.

@jacobweinstock can you fetch and try again? I had a typo before that I've fixed. I don't think that would explain what you saw though. I was able to try with virtualbox on mac just now and the workflow completed successfully. Happy to report the ubuntu virtualbox VM still has networking and did in fact grow its partition.

nshalman

Code keeps changing but I was trying to use it just now and the terraform code is throwing
null_resource.setup (remote-exec): bash: generate-tls-certs/trust.sh: No such file or directory which make me wonder if this is pulling in some of the no more tls stuff, but in an incomplete way?

deploy/compose/.env

mmlb · 2022-03-16T18:27:03Z

Code keeps changing but I was trying to use it just now and the terraform code is throwing null_resource.setup (remote-exec): bash: generate-tls-certs/trust.sh: No such file or directory which make me wonder if this is pulling in some of the no more tls stuff, but in an incomplete way?

I figured out the trust.sh wasn't actually necessary so dropped it, but missed the ref in main.tf. Should be fixed now.

Another commit where more than one thing is happening but I don't really feel like splitting into many. Changes are as follows: Use `set -euxo pipefail` instead of just `set -xo pipefail`: -o pipefail w/o -e is basically useless since it won't stop execution, just stops the command early, and -u adds to the safety. Get rid of code for tinkerbell/osie: This simplifies almost all of the file since hook is pretty much ready to go as is. Get rid of the whole indexing stuff: No need for indices when the file names are unique anyway. If there's no indices the filenames alone work and we don't need to figure out how to "build" up a filename for the downloaded file. I noticed that the /destination parameter is never used so dropped it. All in all, a *very* nice simplification/cleanup of this code. Signed-off-by: Manuel Mendez <mmendez@equinix.com>

Another commit where more than one thing is happening but I don't really feel like splitting into many. Changes are as follows: Use `set -euo pipefail` instead of just `set -o pipefail`: -o pipefail w/o -e is basically useless since it won't stop execution, just stops the command early, and -u adds to the safety. I got rid of -x because it was making the script pretty noisy. Nicer log/debug messages: It was kind of hard to see what skopeo was doing for each sync call so I added some nicer to find messages. Signed-off-by: Manuel Mendez <mmendez@equinix.com>

Another commit where more than one thing is happening but I don't really feel like splitting into many. Changes are as follows: Use bash:4.4 instead of ubuntu container image: No need for ubuntu image when alpine works. Get rid of most of the functions for simple command invocations: Most of the functions were just function arg setup that would then do a simple call to the binary. No need for all the over head when just reusing variables and calling the binaries work. I got rid of the output file arg since a simple file extension swap seemed to be the way to go, if we ever want something else I'd say just pipe to stdout. Use pigz instead of gzip: pigz is a better gzip (works in parallel) which is helpful when converting large files. Signed-off-by: Manuel Mendez <mmendez@equinix.com>

This way changes do not need to involve a human mentally ignoring it. This also paves the way for mounting volumes as read-only. Signed-off-by: Manuel Mendez <mmendez@equinix.com>

Lets get rid of explicit `rw` as thats default anyway and its confusing to have some `:rw` and some not specified, but both are `rw`. I've gone and made all source script mounts be `ro` so that none of the containers modify any of the files in git. I've also made the auth and certs volumes ro for the services that don't need to modify any data. Signed-off-by: Manuel Mendez <mmendez@equinix.com>

Was only used by vagrant and it seems to not actually be needed. I was able to bring up the stack from scratch and run the ubuntu.yaml template to completion without it. Signed-off-by: Manuel Mendez <mmendez@equinix.com>

This way we don't need to hard code any nic names or update if things change. Signed-off-by: Manuel Mendez <mmendez@equinix.com>

Not sure why only the EM template got this, the VMs can use it too. Signed-off-by: Manuel Mendez <mmendez@equinix.com>

nshalman

This is some really nice work.
I think I need to do another round of testing of this code before I approve. Random thoughts sprinkled in comments.

Update: Tested with Vagrant / Libvirt.

nshalman · 2022-03-18T14:32:24Z

deploy/compose/docker-compose.yml

+  ##### Actual services first #####
+  boots:
+    image: ${BOOTS_SERVER_IMAGE}
+    command: -dhcp-addr 0.0.0.0:67 -tftp-addr $TINKERBELL_HOST_IP:69 -http-addr $TINKERBELL_HOST_IP:80 -log-level DEBUG


I imagine this is correct for the current boots image being used, but there will be changes when we move to the latest boots.

yep. I think I want to do a bump-to-latest-versions after this PR and a TF one I've got lined up.

nshalman · 2022-03-18T14:33:29Z

deploy/compose/fetch-and-convert-ubuntu-img/fetch.sh

@@ -0,0 +1,21 @@
+#!/usr/bin/env bash
+# This script is designed to download a cloud image file (.img) and then convert it to a .raw.gz file.
+# This is purpose built so non-raw cloud image files can be used with the "image2disk" action.


Some day if we can get qemuimg2disk action images shipped it would be nice to not need this step.

nshalman · 2022-03-18T14:36:15Z

deploy/vagrant/tink

@@ -0,0 +1,3 @@
+#!/usr/bin/env bash
+
+exec docker-compose -f /vagrant/compose/docker-compose.yml exec tink-cli tink "$@"


It might be nice to ship something similar for the terraform side too. No action needed in this PR.

you've read my mind. I've got some more changes to the tf setup I did yesterday that will have this and the cd for interactive logins too. Its coming. :D

#126 needs a test against libvirt, virtualbox, and tf before making it ready but 👁️ if you'd like.

nshalman · 2022-03-18T14:36:50Z

docs/CONTRIBUTING.md

@@ -54,37 +54,36 @@ The sandbox architecture can be broken down into 3 distinct groups.

 2. Support Services

-   - [OSIE/Hook File Server](https://docs.tinkerbell.org/services/osie/)
+   - [Hook File Server](https://docs.tinkerbell.org/services/osie/)


I think this change suggests we need to ship Hook docs sometime soon...

Yep all the docs need updating re hook.

The differences between libvirt and virtualbox are actually pretty minor and not worth carrying different files for each. The template just needs sda -> vda and the hardware.json differed only by the gateway, which seems isn't even necessary. Signed-off-by: Manuel Mendez <mmendez@equinix.com>

mmlb added 7 commits March 14, 2022 18:50

nix: Format shell.nix with nixfmt

ddced19

Signed-off-by: Manuel Mendez <mmendez@equinix.com>

nix: Add docker-compose package

d8944b5

We can totally test docker-compose stuff without vagrant or terraform so lets add it to shell.nix to ensure its around. Signed-off-by: Manuel Mendez <mmendez@equinix.com>

nix: Add rufo

136ea9c

Signed-off-by: Manuel Mendez <mmendez@equinix.com>

vagrant: Format Vagrantfile with rufo

5d81f49

Signed-off-by: Manuel Mendez <mmendez@equinix.com>

ci: Check Vagrantfile format

4d093a0

Signed-off-by: Manuel Mendez <mmendez@equinix.com>

compose: Get rid of quotes around values in .env file

5f0d931

It's unnecessary and makes some changes I've got lined up to Vagrantfile more complex. Signed-off-by: Manuel Mendez <mmendez@equinix.com>

mmlb requested review from jacobweinstock and nshalman March 15, 2022 16:14

mmlb marked this pull request as ready for review March 15, 2022 16:14

mmlb added the ci-check/vagrant-setup This label trigger a GitHub action that tests the Vagrant Setup guide https://tinkerbell.org/setup/ label Mar 15, 2022

mmlb added 9 commits March 15, 2022 14:32

vagrant: Sync only the compose folder

bbaa534

Nothing else is needed in vagrant so let's not sync them. Signed-off-by: Manuel Mendez <mmendez@equinix.com>

compose: Reorganize env vars to be in common groups

33e05c1

osie/hook stuff on its own, followed by container images, then manifests and finally mac/ip stuff. Signed-off-by: Manuel Mendez <mmendez@equinix.com>

compose: Sort all the things (that make sense)

198a9a5

Scanning/navigation/knowing where you are wrt the rest of the file is just easier if things are sorted. Signed-off-by: Manuel Mendez <mmendez@equinix.com>

compose: Unjsonify command attributes

0f9ed2a

This is easier to copy/paste, and I'l soon be messing with these so that they won't be so long that we feel the need to break them up into multiple lines. Signed-off-by: Manuel Mendez <mmendez@equinix.com>

mmlb force-pushed the lots-of-refactors-and-cleanups branch from 1e59400 to 8b51b8e Compare March 15, 2022 19:09

mmlb removed the ci-check/vagrant-setup This label trigger a GitHub action that tests the Vagrant Setup guide https://tinkerbell.org/setup/ label Mar 15, 2022

mmlb added 3 commits March 15, 2022 16:21

compose: Rename local directories to match service name

7c3578a

This way we don't have to keep the mapping in context, the service name is the directory name. Signed-off-by: Manuel Mendez <mmendez@equinix.com>

compose: Update tink-cli image to version with bash

20656e9

Signed-off-by: Manuel Mendez <mmendez@equinix.com>

mmlb force-pushed the lots-of-refactors-and-cleanups branch from 8b51b8e to 700e5f8 Compare March 15, 2022 20:22

mmlb added ci-check/vagrant-setup This label trigger a GitHub action that tests the Vagrant Setup guide https://tinkerbell.org/setup/ and removed ci-check/vagrant-setup This label trigger a GitHub action that tests the Vagrant Setup guide https://tinkerbell.org/setup/ labels Mar 15, 2022

mmlb force-pushed the lots-of-refactors-and-cleanups branch from 700e5f8 to bb10656 Compare March 16, 2022 14:25

nshalman reviewed Mar 16, 2022

View reviewed changes

deploy/compose/.env Outdated Show resolved Hide resolved

mmlb force-pushed the lots-of-refactors-and-cleanups branch from bb10656 to 024165e Compare March 16, 2022 18:24

mmlb added 8 commits March 16, 2022 14:29

compose: Update scripts so they do not mess with git tracked files

d701431

This way changes do not need to involve a human mentally ignoring it. This also paves the way for mounting volumes as read-only. Signed-off-by: Manuel Mendez <mmendez@equinix.com>

compose: Drop generate-tls-certs/trust.sh

1453052

Was only used by vagrant and it seems to not actually be needed. I was able to bring up the stack from scratch and run the ubuntu.yaml template to completion without it. Signed-off-by: Manuel Mendez <mmendez@equinix.com>

compose: Make use netplan's nic glob support

e0a0bd7

This way we don't need to hard code any nic names or update if things change. Signed-off-by: Manuel Mendez <mmendez@equinix.com>

compose: Add grow-partition to vagrant templates

92e8d1c

Not sure why only the EM template got this, the VMs can use it too. Signed-off-by: Manuel Mendez <mmendez@equinix.com>

mmlb force-pushed the lots-of-refactors-and-cleanups branch from 024165e to 1cbd643 Compare March 16, 2022 18:29

nshalman reviewed Mar 18, 2022

View reviewed changes

nshalman approved these changes Mar 18, 2022

View reviewed changes

mmlb removed the request for review from jacobweinstock March 18, 2022 16:33

mmlb force-pushed the lots-of-refactors-and-cleanups branch from 1cbd643 to a99f1f1 Compare March 18, 2022 16:40

mmlb added the ready-to-merge Signal to Mergify to merge the PR. label Mar 18, 2022

mergify bot merged commit d828f14 into tinkerbell:main Mar 18, 2022

mmlb deleted the lots-of-refactors-and-cleanups branch March 18, 2022 16:44

displague mentioned this pull request Mar 22, 2022

fix terraform resource paths, use templatefile, update metal version #96

Merged

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Lots of refactors and cleanups #124

Lots of refactors and cleanups #124

mmlb commented Mar 14, 2022 •

edited

jacobweinstock commented Mar 15, 2022

mmlb commented Mar 15, 2022

jacobweinstock commented Mar 15, 2022

mmlb commented Mar 15, 2022

mmlb commented Mar 15, 2022 •

edited

nshalman left a comment

mmlb commented Mar 16, 2022

nshalman left a comment •

edited

nshalman Mar 18, 2022

mmlb Mar 18, 2022

nshalman Mar 18, 2022

nshalman Mar 18, 2022

mmlb Mar 18, 2022

mmlb Mar 18, 2022

nshalman Mar 18, 2022

mmlb Mar 18, 2022

		@@ -0,0 +1,3 @@
		#!/usr/bin/env bash

		exec docker-compose -f /vagrant/compose/docker-compose.yml exec tink-cli tink "$@"

Lots of refactors and cleanups #124

Lots of refactors and cleanups #124

Conversation

mmlb commented Mar 14, 2022 • edited

Description

Why is this needed

How Has This Been Tested?

How are existing users impacted? What migration steps/scripts do we need?

Checklist:

jacobweinstock commented Mar 15, 2022

mmlb commented Mar 15, 2022

jacobweinstock commented Mar 15, 2022

mmlb commented Mar 15, 2022

mmlb commented Mar 15, 2022 • edited

nshalman left a comment

Choose a reason for hiding this comment

mmlb commented Mar 16, 2022

nshalman left a comment • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mmlb commented Mar 14, 2022 •

edited

mmlb commented Mar 15, 2022 •

edited

nshalman left a comment •

edited