Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lots of refactors and cleanups #124

Merged
merged 28 commits into from
Mar 18, 2022

Conversation

mmlb
Copy link
Contributor

@mmlb mmlb commented Mar 14, 2022

Description

A bunch of QoL fixes for compose and vagrant.

Why is this needed

I was having trouble following some of the things done in compose especially when comparing vagrant vs terraform. I had a hunch that somethings weren't necessary and set about to check and started unraveling some threads.

How Has This Been Tested?

Lots of:

vagrant destroy -f; vagrant up provisioner; ... wait ...; vagrant up machine1; reboot

throughout deving and at the end.

How are existing users impacted? What migration steps/scripts do we need?

Should be easier to contribute to sandbox as there's even less special-case-stuff in vagrant and in the compose dirs.
The "one-off-services" directories being named after the services themselves was nice to be able to go back and forth easily without having to a mental-map-lookup.

Checklist:

I have:

  • updated the documentation and/or roadmap (if required)
  • added unit or e2e tests
  • provided instructions on how to upgrade

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
We can totally test docker-compose stuff without vagrant or terraform
so lets add it to shell.nix to ensure its around.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
Signed-off-by: Manuel Mendez <mmendez@equinix.com>
Signed-off-by: Manuel Mendez <mmendez@equinix.com>
Signed-off-by: Manuel Mendez <mmendez@equinix.com>
It's unnecessary and makes some changes I've got lined up to Vagrantfile more
complex.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
Its completely unnecessary. Docker-compose handles relative files relative
to the project folder, which by default is the folder that contains the
docker-compose.yml file. This makes the docker-compose.yml easier to parse
for humans.

Verified with `docker-compose -f $file config`, in and out of vagrant, in and
out of the compose dir, and by running a vagrant run.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
@mmlb mmlb marked this pull request as ready for review March 15, 2022 16:14
@mmlb mmlb added the ci-check/vagrant-setup This label trigger a GitHub action that tests the Vagrant Setup guide https://tinkerbell.org/setup/ label Mar 15, 2022
@jacobweinstock
Copy link
Member

Hmm....there's a lot in this PR. step 4 in the vagrant-virtualbox quickstart failed for me with the following:

Every 2.0s: tink workflow events ; tink workflow state                                                                                                                              2022-03-15 16:38:19

Error: tink workflow events [id] takes an arguments
Usage:
  tink workflow events [id]

Examples:
tink workflow events [id]

Flags:
  -h, --help   help for events

Global Flags:
  -f, --facility string   used to build grpc and http urls

tink workflow events [id] takes an arguments
Error: tink workflow state [id] [flags] requires an argument
Usage:
  tink workflow state [id] [flags]

Examples:
tink workflow state [id]

Flags:
  -h, --help   help for state

Global Flags:
  -f, --facility string   used to build grpc and http urls

tink workflow state [id] [flags] requires an argument

A little further investigation shows:

$ cat /vagrant/compose/create-tink-records/workflow/workflow_id.txt
cat: /vagrant/compose/create-tink-records/workflow/workflow_id.txt: No such file or directory

I am still working through more of this PR.

@mmlb
Copy link
Contributor Author

mmlb commented Mar 15, 2022

Hmm....there's a lot in this PR. step 4 in the vagrant-virtualbox quickstart failed for me with the following:

Thanks for linking this, I have not gone through that file top-to-bottom, so not surprised something isn't working. I'll see what else is wrong.

I am still working through more of this PR.

Thanks for looking 😄

@jacobweinstock
Copy link
Member

the machine1 vm, in the vagrant-virtualbox quickstart, is successfully booting into Hook, but failing to pull container images and therefore failing to run any workflows on the machine1.

image

@mmlb
Copy link
Contributor Author

mmlb commented Mar 15, 2022

the machine1 vm, in the vagrant-virtualbox quickstart, is successfully booting into Hook, but failing to pull container images and therefore failing to run any workflows on the machine1.

image

bummer, guess I'll have to dust off a machine and fire up virtualbox to debug.

This is better than calling `provisioner.vm.provision :docker_compose` because
it allows the user to continue messing with the compose stack after the first
run and not lose the variables. Before this change the manifests would
definitely wrong if using libvirt on subsequent interactions.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
I noticed that I could avoid provision_provisioner by just defining the
manifest_suffix in the per-provider override block and then just use it
afterwards in common code. Then I moved all the provision logic together
because it doesn't really make sense to have it all over the place. This
included `$script` since there's no point in it being a global variable if
its only used in the provisioner provision.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
Nothing else is needed in vagrant so let's not sync them.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
Alias of dc=docker-compose alias for 700% effeciency gains.
Alais tink for even more gains!

Automatically cd'ing to /vagrant/compose for interactive logins because that
probably makes a lot of sense.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
osie/hook stuff on its own, followed by container images, then manifests and
finally mac/ip stuff.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
This way we have long running services separated from one-off executions
so its quicker to navigate through the file.

Within each "section" the services are ordered alphabetically, once again for
quicker navigation.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
Scanning/navigation/knowing where you are wrt the rest of the file
is just easier if things are sorted.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
Lets organize it by having service container image/exec fields first
(including volumes needed), then its "outputs" (ports), then runtime
monitoring. This seems to keep the important/usefull info up top of the
service definition vs having to search around (in different relative locations
in different services...).

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
This is easier to copy/paste, and I'l soon be messing with these so that they
won't be so long that we feel the need to break them up into multiple lines.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
@mmlb mmlb force-pushed the lots-of-refactors-and-cleanups branch from 1e59400 to 8b51b8e Compare March 15, 2022 19:09
@mmlb mmlb removed the ci-check/vagrant-setup This label trigger a GitHub action that tests the Vagrant Setup guide https://tinkerbell.org/setup/ label Mar 15, 2022
This way we don't have to keep the mapping in context, the service name is the
directory name.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
Signed-off-by: Manuel Mendez <mmendez@equinix.com>
This is a pretty big commit, which I don't like, but I don't feel like
splitting up, its easier to read as a whole vs as a series of commits
(IMO). The changes are as follows:

Get rid of `exec_in_bash`:
tink-cli recently added bash to the image, along with adding the needed
packages lazily allows us to let shebang lines do its thing.

Use `set -euxo pipefail` instead of just `set -xo pipefail`:
-o pipefail w/o -e is basically useless since it won't stop execution, just
stops the command early, and -u adds to the safety.

Use jq to modify json files instead of sed:
This is just easier to read imo and is much less brittle.

Output modifications to stdout:
This way we can see what happened using docker-compose logs.

Use tmpfiles + atomic renames:
Avoids having a "corrupt" file if we interrupt in the middle modification. Not
really a big deal but good to do imo.

Redirect `tink` stderr to /dev/null:
I don't care to see useless debug messages when I'm trying to debug other things.

Simplify the IMG_URL modification:
Only modify the IP address, and avoid assuming the name of the image which is
better for future modification.

Only create template, workflow entries if needed:
This way we don't have to drop all the records just to run a new workflow for
example.

Overall I think this ends up being a very nice simplification of the
create-tink-records implementation.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
@mmlb mmlb force-pushed the lots-of-refactors-and-cleanups branch from 8b51b8e to 700e5f8 Compare March 15, 2022 20:22
@mmlb
Copy link
Contributor Author

mmlb commented Mar 15, 2022

the machine1 vm, in the vagrant-virtualbox quickstart, is successfully booting into Hook, but failing to pull container images and therefore failing to run any workflows on the machine1.

bummer, guess I'll have to dust off a machine and fire up virtualbox to debug.

@jacobweinstock can you fetch and try again? I had a typo before that I've fixed. I don't think that would explain what you saw though. I was able to try with virtualbox on mac just now and the workflow completed successfully. Happy to report the ubuntu virtualbox VM still has networking and did in fact grow its partition.

@mmlb mmlb added ci-check/vagrant-setup This label trigger a GitHub action that tests the Vagrant Setup guide https://tinkerbell.org/setup/ and removed ci-check/vagrant-setup This label trigger a GitHub action that tests the Vagrant Setup guide https://tinkerbell.org/setup/ labels Mar 15, 2022
@mmlb mmlb force-pushed the lots-of-refactors-and-cleanups branch from 700e5f8 to bb10656 Compare March 16, 2022 14:25
Copy link
Member

@nshalman nshalman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code keeps changing but I was trying to use it just now and the terraform code is throwing
null_resource.setup (remote-exec): bash: generate-tls-certs/trust.sh: No such file or directory which make me wonder if this is pulling in some of the no more tls stuff, but in an incomplete way?

deploy/compose/.env Outdated Show resolved Hide resolved
@mmlb mmlb force-pushed the lots-of-refactors-and-cleanups branch from bb10656 to 024165e Compare March 16, 2022 18:24
@mmlb
Copy link
Contributor Author

mmlb commented Mar 16, 2022

Code keeps changing but I was trying to use it just now and the terraform code is throwing null_resource.setup (remote-exec): bash: generate-tls-certs/trust.sh: No such file or directory which make me wonder if this is pulling in some of the no more tls stuff, but in an incomplete way?

I figured out the trust.sh wasn't actually necessary so dropped it, but missed the ref in main.tf. Should be fixed now.

Another commit where more than one thing is happening but I don't really
feel like splitting into many. Changes are as follows:

Use `set -euxo pipefail` instead of just `set -xo pipefail`:
-o pipefail w/o -e is basically useless since it won't stop execution, just
stops the command early, and -u adds to the safety.

Get rid of code for tinkerbell/osie:
This simplifies almost all of the file since hook is pretty much ready to go
as is.

Get rid of the whole indexing stuff:
No need for indices when the file names are unique anyway. If there's no indices
the filenames alone work and we don't need to figure out how to "build" up a
filename for the downloaded file.

I noticed that the /destination parameter is never used so dropped it.

All in all, a *very* nice simplification/cleanup of this code.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
Another commit where more than one thing is happening but I don't really
feel like splitting into many. Changes are as follows:

Use `set -euo pipefail` instead of just `set -o pipefail`:
-o pipefail w/o -e is basically useless since it won't stop execution, just
stops the command early, and -u adds to the safety. I got rid of -x because
it was making the script pretty noisy.

Nicer log/debug messages:
It was kind of hard to see what skopeo was doing for each sync call so I added
some nicer to find messages.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
Another commit where more than one thing is happening but I don't really
feel like splitting into many. Changes are as follows:

Use bash:4.4 instead of ubuntu container image:
No need for ubuntu image when alpine works.

Get rid of most of the functions for simple command invocations:
Most of the functions were just function arg setup that would then do a simple
call to the binary. No need for all the over head when just reusing variables
and calling the binaries work.

I got rid of the output file arg since a simple file extension swap seemed to be
the way to go, if we ever want something else I'd say just pipe to stdout.

Use pigz instead of gzip:
pigz is a better gzip (works in parallel) which is helpful when converting
large files.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
This way changes do not need to involve a human mentally ignoring it. This also
paves the way for mounting volumes as read-only.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
Lets get rid of explicit `rw` as thats default anyway and its confusing to have
some `:rw` and some not specified, but both are `rw`. I've gone and made all
source script mounts be `ro` so that none of the containers modify any of the
files in git. I've also made the auth and certs volumes ro for the services
that don't need to modify any data.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
Was only used by vagrant and it seems to not actually be needed. I was able to
bring up the stack from scratch and run the ubuntu.yaml template to completion
without it.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
This way we don't need to hard code any nic names or update if things change.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
Not sure why only the EM template got this, the VMs can use it too.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
@mmlb mmlb force-pushed the lots-of-refactors-and-cleanups branch from 024165e to 1cbd643 Compare March 16, 2022 18:29
Copy link
Member

@nshalman nshalman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is some really nice work.
I think I need to do another round of testing of this code before I approve. Random thoughts sprinkled in comments.

Update: Tested with Vagrant / Libvirt.

##### Actual services first #####
boots:
image: ${BOOTS_SERVER_IMAGE}
command: -dhcp-addr 0.0.0.0:67 -tftp-addr $TINKERBELL_HOST_IP:69 -http-addr $TINKERBELL_HOST_IP:80 -log-level DEBUG
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I imagine this is correct for the current boots image being used, but there will be changes when we move to the latest boots.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep. I think I want to do a bump-to-latest-versions after this PR and a TF one I've got lined up.

@@ -0,0 +1,21 @@
#!/usr/bin/env bash
# This script is designed to download a cloud image file (.img) and then convert it to a .raw.gz file.
# This is purpose built so non-raw cloud image files can be used with the "image2disk" action.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some day if we can get qemuimg2disk action images shipped it would be nice to not need this step.

@@ -0,0 +1,3 @@
#!/usr/bin/env bash

exec docker-compose -f /vagrant/compose/docker-compose.yml exec tink-cli tink "$@"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be nice to ship something similar for the terraform side too. No action needed in this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you've read my mind. I've got some more changes to the tf setup I did yesterday that will have this and the cd for interactive logins too. Its coming. :D

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#126 needs a test against libvirt, virtualbox, and tf before making it ready but 👁️ if you'd like.

@@ -54,37 +54,36 @@ The sandbox architecture can be broken down into 3 distinct groups.

2. Support Services

- [OSIE/Hook File Server](https://docs.tinkerbell.org/services/osie/)
- [Hook File Server](https://docs.tinkerbell.org/services/osie/)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this change suggests we need to ship Hook docs sometime soon...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep all the docs need updating re hook.

@mmlb mmlb removed the request for review from jacobweinstock March 18, 2022 16:33
The differences between libvirt and virtualbox are actually pretty minor and not
worth carrying different files for each. The template just needs sda -> vda and
the hardware.json differed only by the gateway, which seems isn't even
necessary.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
@mmlb mmlb force-pushed the lots-of-refactors-and-cleanups branch from 1cbd643 to a99f1f1 Compare March 18, 2022 16:40
@mmlb mmlb added the ready-to-merge Signal to Mergify to merge the PR. label Mar 18, 2022
@mergify mergify bot merged commit d828f14 into tinkerbell:main Mar 18, 2022
@mmlb mmlb deleted the lots-of-refactors-and-cleanups branch March 18, 2022 16:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready-to-merge Signal to Mergify to merge the PR.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants