Fix empty docker-compose in basebox #81

Merged
merged 10 commits into from
Apr 29, 2021

Conversation

mmlb
Contributor

@mmlb mmlb commented Apr 27, 2021

Description

Ensures docker-compose is correctly downloaded.
Also adds some better debuggability to setup.sh and the vagrant provision script.
Also includes miscellaneous cleanups following the boy scout rule (leave things better than you found them).

Why is this needed

Fixes: #59

How Has This Been Tested?

vagrant up provisioner now works

How are existing users impacted? What migration steps/scripts do we need?

Fixes a bug where the vagrant sandbox wasn't working.

Checklist:

I have:

  • updated the documentation and/or roadmap (if required)
  • added unit or e2e tests
  • provided instructions on how to upgrade

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
More in line with the rest of scripts and is easier to mentally parse.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
Both [[ ]] and (( )) bashisms are better than the alternative
in POSIX sh, since they are builtin and don't suffer from quoting
or number-of-args issues.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
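For illustration, here is a minimal sketch of the quoting and number-of-args pitfalls this commit message refers to. The variable names are made up for the example and nothing here is taken from the sandbox scripts; it only assumes bash.

```shell
#!/usr/bin/env bash
# Sketch only: contrasts POSIX [ ] with the bash [[ ]] and (( )) builtins.
# All names below are illustrative assumptions.

empty=""
spaced="two words"
count=3

# Unquoted empty variable: `[` sees `[ = yes ]` and reports a usage error.
[ $empty = yes ] 2>/dev/null
rc_posix_empty=$?

# Unquoted variable with a space: `[` sees too many arguments.
[ $spaced = "two words" ] 2>/dev/null
rc_posix_spaced=$?

# [[ ]] does not word-split its operands, so both comparisons are well-formed.
[[ $empty = yes ]]
rc_bash_empty=$?
[[ $spaced = "two words" ]]
rc_bash_spaced=$?

# (( )) evaluates arithmetic directly, with no quoting involved.
(( count > 2 ))
rc_arith=$?

printf '[ ] empty=%s spaced=%s\n' "$rc_posix_empty" "$rc_posix_spaced"
printf '[[ ]] empty=%s spaced=%s\n' "$rc_bash_empty" "$rc_bash_spaced"
printf '(( )) count>2=%s\n' "$rc_arith"
```

In bash the two `[ ]` calls fail with a usage error (status 2) rather than a plain "false", which is the kind of surprise the bashisms avoid.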
Indentation is helpful to know a function's scope. Not indenting the
heredoc makes scanning harder.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
Better for adding/removing things this way.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
pipefail for more safety and xtrace for better debuggability.
The missing xtrace here is likely what led to the docker-compose
issue going unfixed for so long as the last bit of output was
from the gencerts container and did not make any sense (because it
wasn't the issue :D ).

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
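As a rough sketch of what these two options change (this is not the project's setup.sh, just a standalone bash demo with made-up commands):

```shell
#!/usr/bin/env bash
# Sketch only: demonstrates pipefail and xtrace in isolation.

# Without pipefail, a pipeline's status is that of its last command,
# so the failure of `false` is hidden by the successful `cat`.
set +o pipefail
false | cat
status_without=$?

# With pipefail, the pipeline fails if any command in it fails.
set -o pipefail
false | cat
status_with=$?

echo "without pipefail: $status_without, with pipefail: $status_with"

# xtrace writes each command to stderr, prefixed with one or more "+",
# before running it. Capture that trace from a subshell to show it.
trace=$( { set -x; : traced-command; } 2>&1 )
echo "$trace"
```

With xtrace on in the script itself, the last traced line before a failure points at the command that actually failed, instead of at unrelated output from a container.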
The tinkerbell.sh script does some other work after calling setup.sh
and has set -x enabled, so the whats_next message is likely to be
missed. Save the message instead and print it as the last thing done.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
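One way to picture the idea, as a hypothetical sketch: the temp file, the message text, and the placement of the steps are assumptions, not what setup.sh or tinkerbell.sh actually do.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: save a closing message early, print it last.
notes=$(mktemp)
trap 'rm -f "$notes"' EXIT

whats_next() {
    # Placeholder text; the real message lives in the project's scripts.
    echo "NEXT: follow-up instructions for the user"
}

# Early step: store the message instead of printing it right away.
whats_next >"$notes"

# Later steps run with xtrace on and produce noisy output.
set -x
: more provisioning work
set +x

# Last step: show the saved message once the noisy output is over.
cat "$notes"
```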
@mmlb mmlb requested review from gianarb and nshalman April 27, 2021 20:10
@mmlb mmlb added the do-not-merge Signal to Mergify to block merging of the PR. label Apr 27, 2021
@mmlb
Contributor Author

mmlb commented Apr 27, 2021

Note, this PR won't actually resolve anything for 2 reasons:

  1. We need to build a new basebox and update vagrant cloud and update this Vagrantfile with the new version.
  2. I can't promise that my basebox provision.sh script works. I haven't been able to figure out how to get packer to build my box correctly. I'm still seeing the empty docker-compose file even when I take drastic measures to ensure vagrant uses my built box.

@mmlb
Contributor Author

mmlb commented Apr 27, 2021

Here's what I'm seeing (some extra debugging thrown into provision.sh for ... debugging):

==> vagrant-libvirt: + setup_docker_compose
==> vagrant-libvirt: + which docker-compose
==> vagrant-libvirt: + :
==> vagrant-libvirt: + ls -l /usr/local/bin/docker-compose
==> vagrant-libvirt: ls: cannot access '/usr/local/bin/docker-compose': No such file or directory
==> vagrant-libvirt: + :
==> vagrant-libvirt: + local name url
==> vagrant-libvirt: ++ uname -s
==> vagrant-libvirt: ++ uname -m
==> vagrant-libvirt: + name=docker-compose-Linux-x86_64
==> vagrant-libvirt: + url=https://github.com/docker/compose/releases/download/1.26.0/docker-compose-Linux-x86_64
==> vagrant-libvirt: + curl -fsSLO https://github.com/docker/compose/releases/download/1.26.0/docker-compose-Linux-x86_64
==> vagrant-libvirt: + curl -fsSLO https://github.com/docker/compose/releases/download/1.26.0/docker-compose-Linux-x86_64.sha256
==> vagrant-libvirt: + sha256sum -c
    vagrant-libvirt: docker-compose-Linux-x86_64: OK
==> vagrant-libvirt: + rm -f docker-compose-Linux-x86_64.sha256
==> vagrant-libvirt: + chmod +x docker-compose-Linux-x86_64
==> vagrant-libvirt: + sudo mv docker-compose-Linux-x86_64 /usr/local/bin/docker-compose
==> vagrant-libvirt: + docker-compose -h
    vagrant-libvirt: Define and run multi-container applications with Docker.
    vagrant-libvirt:
    vagrant-libvirt: Usage:
    vagrant-libvirt:   docker-compose [-f <arg>...] [options] [COMMAND] [ARGS...]
    vagrant-libvirt:   docker-compose -h|--help
    vagrant-libvirt:
    vagrant-libvirt: Options:
    vagrant-libvirt:   -f, --file FILE             Specify an alternate compose file
    vagrant-libvirt:                               (default: docker-compose.yml)
    vagrant-libvirt:   -p, --project-name NAME     Specify an alternate project name
    vagrant-libvirt:                               (default: directory name)
    vagrant-libvirt:   -c, --context NAME          Specify a context name
    vagrant-libvirt:   --verbose                   Show more output
    vagrant-libvirt:   --log-level LEVEL           Set log level (DEBUG, INFO, WARNING, ERROR, CRITICAL)
    vagrant-libvirt:   --no-ansi                   Do not print ANSI control characters
    vagrant-libvirt:   -v, --version               Print version and exit
    vagrant-libvirt:   -H, --host HOST             Daemon socket to connect to
    vagrant-libvirt:
    vagrant-libvirt:   --tls                       Use TLS; implied by --tlsverify
    vagrant-libvirt:   --tlscacert CA_PATH         Trust certs signed only by this CA
    vagrant-libvirt:   --tlscert CLIENT_CERT_PATH  Path to TLS certificate file
    vagrant-libvirt:   --tlskey TLS_KEY_PATH       Path to TLS key file
    vagrant-libvirt:   --tlsverify                 Use TLS and verify the remote
    vagrant-libvirt:   --skip-hostname-check       Don't check the daemon's hostname against the
    vagrant-libvirt:                               name specified in the client certificate
    vagrant-libvirt:   --project-directory PATH    Specify an alternate working directory
    vagrant-libvirt:                               (default: the path of the Compose file)
    vagrant-libvirt:   --compatibility             If set, Compose will attempt to convert keys
    vagrant-libvirt:                               in v3 files to their non-Swarm equivalent
    vagrant-libvirt:   --env-file PATH             Specify an alternate environment file
    vagrant-libvirt:
    vagrant-libvirt: Commands:
    vagrant-libvirt:   build              Build or rebuild services
    vagrant-libvirt:   config             Validate and view the Compose file
    vagrant-libvirt:   create             Create services
    vagrant-libvirt:   down               Stop and remove containers, networks, images, and volumes
    vagrant-libvirt:   events             Receive real time events from containers
    vagrant-libvirt:   exec               Execute a command in a running container
    vagrant-libvirt:   help               Get help on a command
    vagrant-libvirt:   images             List images
    vagrant-libvirt:   kill               Kill containers
    vagrant-libvirt:   logs               View output from containers
    vagrant-libvirt:   pause              Pause services
    vagrant-libvirt:   port               Print the public port for a port binding
    vagrant-libvirt:   ps                 List containers
    vagrant-libvirt:   pull               Pull service images
    vagrant-libvirt:   push               Push service images
    vagrant-libvirt:   restart            Restart services
    vagrant-libvirt:   rm                 Remove stopped containers
    vagrant-libvirt:   run                Run a one-off command
    vagrant-libvirt:   scale              Set number of containers for a service
    vagrant-libvirt:   start              Start services
    vagrant-libvirt:   stop               Stop services
    vagrant-libvirt:   top                Display the running processes
    vagrant-libvirt:   unpause            Unpause services
    vagrant-libvirt:   up                 Create and start containers
    vagrant-libvirt:   version            Show the Docker-Compose version information
==> vagrant-libvirt: + which docker-compose
    vagrant-libvirt: /usr/local/bin/docker-compose
==> vagrant-libvirt: + ls -l /usr/local/bin/docker-compose
    vagrant-libvirt: -rwxrwxr-x 1 vagrant vagrant 12254032 Apr 27 20:51 /usr/local/bin/docker-compose

from the packer build but vagrant up still fails 😭

    provisioner: + check_command docker-compose
    provisioner: + command_exists docker-compose
    provisioner: + command -v docker-compose
    provisioner: ++ which docker-compose
    provisioner: + [[ -s /usr/local/bin/docker-compose ]]
    provisioner: + echo 'ERROR: Prerequisite command not installed: docker-compose'
    provisioner: ERROR: Prerequisite command not installed: docker-compose
    provisioner: + return 1
    provisioner: + failed=1

(I added a check to command_exists to check the size > 0)
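A check along those lines could look roughly like the sketch below. This is a hypothetical reconstruction based only on the trace above; the real `command_exists` in setup.sh may differ, and the demo tools are invented for the example.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a command_exists that also rejects empty files.
command_exists() {
    local path
    # For external commands, command -v prints the resolved path.
    # (For builtins and functions it prints only the name, so this
    # sketch is meant for external tools such as docker-compose.)
    path=$(command -v "$1") || return 1
    # -s is true only if the file exists and its size is greater than zero.
    [[ -s $path ]]
}

# Demo with throwaway tools on a temporary PATH entry.
demo_dir=$(mktemp -d)
trap 'rm -rf "$demo_dir"' EXIT
: >"$demo_dir/empty-tool"                         # zero-byte "binary"
printf '#!/bin/sh\nexit 0\n' >"$demo_dir/real-tool"
chmod +x "$demo_dir/empty-tool" "$demo_dir/real-tool"
PATH=$demo_dir:$PATH

for tool in real-tool empty-tool missing-tool; do
    if command_exists "$tool"; then
        echo "ok: $tool"
    else
        echo "ERROR: Prerequisite command not installed: $tool"
    fi
done
```

The zero-byte file is still found on PATH, which is why a plain `command -v` check passes on the broken box and the size test is needed.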

The steps I've done are basically:

vagrant box remove $(vagrant box list | awk '{print $1}')
sudo virsh vol-list default | awk '/\.img/ {print $1}' | xargs -I% sudo virsh vol-delete --pool default %
git clean -fxd
packer build --parallel-builds=1 template.json
vagrant box add develop output-vagrant-libvirt/package.box
vagrant destroy -f; vagrant up --provision --no-destroy-on-error provisioner

Note: yes, I changed the box in the sandbox's Vagrantfile to just develop, to make sure it doesn't pull from vagrant cloud.

Can someone please try to reproduce or point out what I'm doing wrong?

@mmlb mmlb added the ci-check/vagrant-setup This label trigger a GitHub action that tests the Vagrant Setup guide https://tinkerbell.org/setup/ label Apr 27, 2021
@gianarb
Contributor

gianarb commented Apr 28, 2021

I tried this PR with Vagrant+Virtualbox and it works.
As I have mentioned several times, the e2e test on sandbox runs on Vagrant+Virtualbox, so my feeling is that this is a libvirt-specific problem. I opened tinkerbell/infrastructure#10 to see if we can get libvirt on our self-hosted runners, so that the test runs twice, once with libvirt as well.

This fixes the Vagrant-based sandbox, which was not working. This was
particularly annoying to track down because `setup.sh` did not have `set -x`,
yet stderr contained what looked like xtrace output. That xtrace output
was actually from the `generate_certificates` container:

```
    provisioner: 2021/04/26 21:22:32 [INFO] signed certificate with serial number 142120228981443865252746731124927082232998754394
    provisioner: + cat
    provisioner:  server.pem
    provisioner:  ca.pem
    provisioner: + cmp
    provisioner:  -s
    provisioner:  bundle.pem.tmp
    provisioner:  bundle.pem
    provisioner: + mv
    provisioner:  bundle.pem.tmp
    provisioner:  bundle.pem
    provisioner: Error: No such object:
==> provisioner: Clearing any previously set forwarded ports...
==> provisioner: Removing domain...
The SSH command responded with a non-zero exit status. Vagrant
assumes that this means the command failed. The output for this command
should be in the log above. Please read the output to determine what
went wrong.
```
I ended up doubting the `if ! cmp` blocks until I added `set -euxo pipefail`,
at which point the issue was pretty obviously in docker-compose land.

```
$ vagrant destroy -f; vagrant up provisioner
==> worker: Domain is not created. Please run `vagrant up` first.
==> provisioner: Domain is not created. Please run `vagrant up` first.
Bringing machine 'provisioner' up with 'libvirt' provider...
==> provisioner: Checking if box 'tinkerbelloss/sandbox-ubuntu1804' version '0.1.0' is up to date...
==> provisioner: Creating image (snapshot of base box volume).
==> provisioner: Creating domain with the following settings...
...
    provisioner: 2021/04/27 18:20:13 [INFO] signed certificate with serial number 138080403356863347716407921665793913032297783787
    provisioner: + cat server.pem ca.pem
    provisioner: + cmp -s bundle.pem.tmp bundle.pem
    provisioner: + mv bundle.pem.tmp bundle.pem
    provisioner: + local certs_dir=/etc/docker/certs.d/192.168.1.1
    provisioner: + cmp --quiet /vagrant/deploy/state/certs/ca.pem /vagrant/deploy/state/webroot/workflow/ca.pem
    provisioner: + cp /vagrant/deploy/state/certs/ca.pem /vagrant/deploy/state/webroot/workflow/ca.pem
    provisioner: + cmp --quiet /vagrant/deploy/state/certs/ca.pem /etc/docker/certs.d/192.168.1.1/tinkerbell.crt
    provisioner: + [[ -d /etc/docker/certs.d/192.168.1.1/ ]]
    provisioner: + cp /vagrant/deploy/state/certs/ca.pem /etc/docker/certs.d/192.168.1.1/tinkerbell.crt
    provisioner: + setup_docker_registry
    provisioner: + local registry_images=/vagrant/deploy/state/registry
    provisioner: + [[ -d /vagrant/deploy/state/registry ]]
    provisioner: + mkdir -p /vagrant/deploy/state/registry
    provisioner: + start_registry
    provisioner: + docker-compose -f /vagrant/deploy/docker-compose.yml up --build -d registry
    provisioner: + check_container_status registry
    provisioner: + local container_name=registry
    provisioner: + local container_id
    provisioner: ++ docker-compose -f /vagrant/deploy/docker-compose.yml ps -q registry
    provisioner: + container_id=
    provisioner: + local start_moment
    provisioner: + local current_status
    provisioner: ++ docker inspect '' --format '{{ .State.StartedAt }}'
    provisioner: Error: No such object:
    provisioner: + start_moment=
    provisioner: + finish
    provisioner: + rm -rf /tmp/tmp.ve3XJ7qtgA
```

Notice that `container_id` is empty. This turns out to be because
`docker-compose` is an empty file!

```
vagrant@provisioner:/vagrant/deploy$ docker-compose up --build registry
vagrant@provisioner:/vagrant/deploy$ which docker-compose
/usr/local/bin/docker-compose
vagrant@provisioner:/vagrant/deploy$ docker-compose -h
vagrant@provisioner:/vagrant/deploy$ file /usr/local/bin/docker-compose
/usr/local/bin/docker-compose: empty
```

So with the following test patch:

```diff
diff --git a/deploy/vagrant/scripts/tinkerbell.sh b/deploy/vagrant/scripts/tinkerbell.sh
index 915f27f..dcb379c 100644
--- a/deploy/vagrant/scripts/tinkerbell.sh
+++ b/deploy/vagrant/scripts/tinkerbell.sh
@@ -34,6 +34,14 @@ setup_nat() (
 main() (
 	export DEBIAN_FRONTEND=noninteractive

+	local name=docker-compose-$(uname -s)-$(uname -m)
+	local url=https://github.com/docker/compose/releases/download/1.26.0/$name
+	curl -fsSLO "$url"
+	curl -fsSLO "$url.sha256"
+	sha256sum -c <"$name.sha256"
+	chmod +x "$name"
+	sudo mv "$name" /usr/local/bin/docker-compose
+
 	if ! [[ -f ./.env ]]; then
 		./generate-env.sh eth1 >.env
 	fi
```

We can try again and we're back to a working state:

```
$ vagrant destroy -f; vagrant up provisioner
==> worker: Domain is not created. Please run `vagrant up` first.
==> provisioner: Domain is not created. Please run `vagrant up` first.
Bringing machine 'provisioner' up with 'libvirt' provider...
==> provisioner: Checking if box 'tinkerbelloss/sandbox-ubuntu1804' version '0.1.0' is up to date...
==> provisioner: Creating image (snapshot of base box volume).
==> provisioner: Creating domain with the following settings...
...
    provisioner: + setup_docker_registry
    provisioner: + local registry_images=/vagrant/deploy/state/registry
    provisioner: + [[ -d /vagrant/deploy/state/registry ]]
    provisioner: + mkdir -p /vagrant/deploy/state/registry
    provisioner: + start_registry
    provisioner: + docker-compose -f /vagrant/deploy/docker-compose.yml up --build -d registry
    provisioner: Creating network "deploy_default" with the default driver
    provisioner: Creating volume "deploy_postgres_data" with default driver
    provisioner: Building registry
    provisioner: Step 1/7 : FROM registry:2.7.1
...
    provisioner: Successfully tagged deploy_registry:latest
    provisioner: Creating deploy_registry_1 ...
Creating deploy_registry_1 ... done
    provisioner: + check_container_status registry
    provisioner: + local container_name=registry
    provisioner: + local container_id
    provisioner: ++ docker-compose -f /vagrant/deploy/docker-compose.yml ps -q registry
    provisioner: + container_id=2e3d9557fd4c0d7f7e1c091b957a0033d23ebb93f6c8e5cdfeb8947b2812845c
...
    provisioner: + sudo -iu vagrant docker login --username=admin --password-stdin 192.168.1.1
    provisioner: WARNING! Your password will be stored unencrypted in /home/vagrant/.docker/config.json.
    provisioner: Configure a credential helper to remove this warning. See
    provisioner: https://docs.docker.com/engine/reference/commandline/login/#credentials-store
    provisioner: Login Succeeded
    provisioner: + set +x
    provisioner: NEXT:  1. Enter /vagrant/deploy and run: source ../.env; docker-compose up -d
    provisioner:        2. Try executing your fist workflow.
    provisioner:           Follow the steps described in https://tinkerbell.org/examples/hello-world/ to say 'Hello World!' with a workflow.
```

:toot:

Except that my results are not due to the way docker-compose is being installed
at all. Even when using a box built with the new install method, I was still
seeing empty docker-compose files. I ran a bunch of experiments to try to
figure out what is going on. The issue is strictly in vagrant-libvirt, since
vagrant-virtualbox works fine. It turns out data isn't being flushed back to
disk at shutdown. Either calling `sync` or writing multiple copies of the
binary to the fs (3x at least) ended up working. Then I was informed of a known
vagrant-libvirt issue which matches this behavior, vagrant-libvirt/vagrant-libvirt#1013!
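As a minimal sketch of the `sync` workaround described above: `install_and_flush` and the file paths are hypothetical names for this example, not what provision.sh actually contains.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: install a file, then flush it to disk explicitly.
install_and_flush() {
    local src=$1 dest=$2
    install -m 0755 "$src" "$dest"
    # Ask the kernel to write buffered data to disk now, rather than
    # relying on it being flushed when the machine shuts down.
    sync
    # Fail if the installed file is missing or empty.
    [[ -s $dest ]]
}

# Demo in a throwaway directory.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
printf '#!/bin/sh\necho hello\n' >"$work/tool"

if install_and_flush "$work/tool" "$work/installed-tool"; then
    echo "installed and synced"
fi
```

The size check only confirms what the running system sees; the `sync` call is the part intended to address the flush-at-shutdown behavior.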

Fixes #59

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
This way we can better guard against empty files as seen
in the previous commit's message.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
@mmlb mmlb added ci-check/vagrant-setup This label trigger a GitHub action that tests the Vagrant Setup guide https://tinkerbell.org/setup/ and removed ci-check/vagrant-setup This label trigger a GitHub action that tests the Vagrant Setup guide https://tinkerbell.org/setup/ labels Apr 28, 2021
@mmlb
Contributor Author

mmlb commented Apr 28, 2021

@gianarb looks like the vagrant test is failing because of a port conflict. Is there a vagrant guest still running on the box maybe?

@mmlb mmlb removed the do-not-merge Signal to Mergify to block merging of the PR. label Apr 28, 2021
@mmlb
Contributor Author

mmlb commented Apr 28, 2021

@gianarb / @nshalman PTAL

@nshalman
Member

@mmlb at this point, is the sync the only major functional change that fixes the underlying issue, with the rest all cleanup?

@mmlb
Contributor Author

mmlb commented Apr 28, 2021

@nshalman yep

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
@nshalman
Member

I have tested this and it worked for me.

Manual changes:

diff --git a/deploy/vagrant/Vagrantfile b/deploy/vagrant/Vagrantfile
index 61624ff..8a24ef7 100644
--- a/deploy/vagrant/Vagrantfile
+++ b/deploy/vagrant/Vagrantfile
@@ -25,8 +25,7 @@ end
 Vagrant.configure('2') do |config|

   config.vm.define :provisioner do |provisioner|
-    provisioner.vm.box = "tinkerbelloss/sandbox-ubuntu1804"
-    provisioner.vm.box_version = "0.1.0"
+    provisioner.vm.box = "develop"
     provisioner.vm.hostname = 'provisioner'
     provisioner.vm.synced_folder './../../', '/vagrant'
     provisioner.vm.provision :shell,

Commands run

[nix-shell:~/sandbox/deploy/vagrant/basebox/ubuntu1804]
$ packer build --parallel-builds=1 template.json
...
[nix-shell:~/sandbox/deploy/vagrant/basebox/ubuntu1804]
$ vagrant box add develop output-vagrant-libvirt/package.box
...
[nix-shell:~/sandbox/deploy/vagrant]
$ vagrant destroy -f; vagrant up --provision --no-destroy-on-error provisioner

@gianarb gianarb removed the ci-check/vagrant-setup This label trigger a GitHub action that tests the Vagrant Setup guide https://tinkerbell.org/setup/ label Apr 29, 2021
@gianarb gianarb added ci-check/vagrant-setup This label trigger a GitHub action that tests the Vagrant Setup guide https://tinkerbell.org/setup/ and removed ci-check/vagrant-setup This label trigger a GitHub action that tests the Vagrant Setup guide https://tinkerbell.org/setup/ labels Apr 29, 2021
@gianarb gianarb added the ready-to-merge Signal to Mergify to merge the PR. label Apr 29, 2021
@mergify mergify bot merged commit 4add7ee into master Apr 29, 2021
@mmlb mmlb deleted the fix-vagrant-deployment branch April 29, 2021 18:52
Labels
ci-check/vagrant-setup This label trigger a GitHub action that tests the Vagrant Setup guide https://tinkerbell.org/setup/ ready-to-merge Signal to Mergify to merge the PR.
Development

Successfully merging this pull request may close these issues.

"vagrant up provisioner" failing on Ubuntu 20.04.2 with libvirtd backend
3 participants