podman machine start fails after reboot (recurrence of #10824 but with the applehv provider.) #21288

Closed
kaorihinata opened this issue Jan 17, 2024 · 5 comments · Fixed by #21291
Labels
kind/bug Categorizes issue or PR as related to a bug. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments.

Comments

@kaorihinata
Contributor

Issue Description

It appears that the previous issue (#10824), which seemed to be specific to the qemu machine provider, is now recurring with the applehv machine provider. After podman machine init and podman machine start, rebooting the host purges $TMPDIR, and the paths in .config/containers/podman/machine/applehv/podman-machine-default.json that refer to the podman subdirectory of $TMPDIR become invalid. As requested in the previous ticket, I tried podman machine stop before rebooting; the result is the same.

Steps to reproduce the issue

  1. Install podman and dependencies (vfkit, gvproxy, etc.)
  2. Enable the applehv provider via .config/containers/containers.conf.
  3. podman machine init --now (or init + start.)
  4. podman machine stop (optionally.)
  5. Reboot the host.
  6. podman machine start.

Describe the results you received

podman --log-level=debug machine start fails to start gvproxy because of a missing path. Output below:

INFO[0000] podman filtering at log level debug
DEBU[0000] Using Podman machine with `applehv` virtualization provider
DEBU[0000] connection refused: http://localhost:8081/vm/state
Starting machine "podman-machine-default"
DEBU[0000] connection refused: http://localhost:8081/vm/state
DEBU[0000] gvproxy binary being used: /Users/nn/.sandbox/libexec/podman/gvproxy
DEBU[0000] [-debug -mtu 1500 -ssh-port 51540 -listen-vfkit unixgram:///var/folders/cv/w6b1bgq95x35fzl9y6c10kl40000gn/T/podman/gvproxy.sock -forward-dest /run/user/501/podman/podman.sock -forward-user core -forward-identity /Users/nn/.ssh/podman-machine-default -forward-sock /Users/nn/.local/share/containers/podman/machine/applehv/podman.sock -pid-file /var/folders/cv/w6b1bgq95x35fzl9y6c10kl40000gn/T/podman/gvproxy.pid]
DEBU[0000] gvproxy unixgram socket "/var/folders/cv/w6b1bgq95x35fzl9y6c10kl40000gn/T/podman/gvproxy.sock" not found: stat /var/folders/cv/w6b1bgq95x35fzl9y6c10kl40000gn/T/podman/gvproxy.sock: no such file or directory
Error: gvproxy exited unexpectedly with exit code 1
DEBU[0000] Shutting down engines

Describe the results you expected

podman machine start returns successfully.

podman info output

OS: darwin/arm64
provider: applehv
version: 4.8.3

This issue pertains to a `podman machine start` failure, so the above is the only non-error output.

Podman in a container

No

Privileged Or Rootless

Rootless

Upstream Latest Release

Yes

Additional environment details

Host is macOS Sonoma (14.3).
Host architecture is aarch64/arm64.
The variant is the Apple M3 Pro which necessitated the use of the applehv provider (due to upstream qemu issues with EDK2 and the shim.)

Additional information

No response

@kaorihinata kaorihinata added the kind/bug Categorizes issue or PR as related to a bug. label Jan 17, 2024
@kaorihinata
Contributor Author

As with #10824, I can confirm that simply shimming podman with a pre-check that creates "${TMPDIR%%/}/podman" works as a temporary workaround.
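
For anyone who wants to try the same workaround, here is a minimal sketch of such a shim. The function names ensure_podman_tmpdir and podman_shim are hypothetical (they are not part of podman), and the fallback to /tmp when TMPDIR is unset is an assumption, not something confirmed in this thread.

```shell
#!/bin/sh
# Hypothetical helper: recreate the podman subdirectory of $TMPDIR if a
# reboot purged it. Falling back to /tmp when TMPDIR is unset is an assumption.
ensure_podman_tmpdir() {
    base="${TMPDIR:-/tmp}"
    # macOS usually sets TMPDIR with a trailing slash; strip it before appending.
    dir="${base%%/}/podman"
    # mkdir -p succeeds if the directory already exists, so this is safe to repeat.
    mkdir -p "$dir" && printf '%s\n' "$dir"
}

# Hypothetical wrapper: run the pre-check, then hand all arguments to the
# real podman binary ("command" bypasses any shell function named podman).
podman_shim() {
    ensure_podman_tmpdir >/dev/null || return 1
    command podman "$@"
}
```

Sourcing this from a shell profile and calling podman_shim machine start in place of podman machine start should be enough to exercise the pre-check; it only papers over the missing directory and does not change podman itself.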

@kaorihinata
Contributor Author

kaorihinata commented Jan 18, 2024

Ahh, there's no attempt to create the podman temporary directory if it's missing. The qemu provider does it in pkg/machine/qemu/machine.go (in Start(), specifically), but pkg/machine/applehv/machine.go makes no similar attempt. As a first pass, the following patch works in my sandbox (based on what was done in the qemu provider):

Edit: Revised to build against HEAD not stable. Also noticed getRuntimeDir() does more than just get the runtime directory, so I've revised the patch to call getRuntimeDir() and nothing else as it already tries to create the podman directory.

--- machine.go.orig	2024-01-17 22:17:52
+++ machine.go	2024-01-17 22:19:07
@@ -572,6 +572,11 @@
 		return machine.ErrVMAlreadyRunning
 	}

+	_, err = m.getRuntimeDir()
+	if err != nil {
+		return err
+	}
+
 	// TODO handle returns from startHostNetworking
 	forwardSock, forwardState, err := m.startHostNetworking()
 	if err != nil {

@rhatdan
Member

rhatdan commented Jan 18, 2024

Care to open a PR to fix?

@jfrantzius

Hi @rhatdan, I tested the fix with brew install podman --HEAD, which seemed to work well:

podman --version
podman version 5.0.0-dev

So I'm now on commit 2ba3605 of the podman main branch:

brew info podman
==> podman: stable 4.9.0 (bottled), HEAD
Tool for managing OCI containers and pods
https://podman.io/
/opt/homebrew/Cellar/podman/HEAD-2ba3605 (198 files, 72.8MB) *
  Built from source on 2024-01-29 at 16:36:08

Stopping the machine, restarting the laptop and starting up the machine worked a few times, but today I got the same error again:

❯ podman --debug machine start
INFO[0000] podman filtering at log level debug
DEBU[0000] Using Podman machine with `applehv` virtualization provider
DEBU[0000] connection refused: http://localhost:50478/vm/state
Starting machine "podman-machine-default"
DEBU[0000] connection refused: http://localhost:50478/vm/state
DEBU[0000] creating runtimeDir: /var/folders/04/bfd6wk9n385174twdm8hx_dc0000gp/T/podman
DEBU[0000] gvproxy binary being used: /opt/homebrew/Cellar/podman/HEAD-2ba3605/libexec/podman/gvproxy
DEBU[0000] [-debug -mtu 1500 -ssh-port 50479 -listen-vfkit unixgram:///var/folders/04/bfd6wk9n385174twdm8hx_dc0000gp/T/podman/gvproxy.sock -forward-user root -forward-identity /Users/joerg.frantzius/.local/share/containers/podman/machine/machine -forward-sock /Users/joerg.frantzius/.local/share/containers/podman/machine/applehv/podman.sock -forward-dest /run/podman/podman.sock -pid-file /var/folders/04/bfd6wk9n385174twdm8hx_dc0000gp/T/podman/gvproxy.pid]
DEBU[0000] gvproxy unixgram socket "/var/folders/04/bfd6wk9n385174twdm8hx_dc0000gp/T/podman/gvproxy.sock" not found: stat /var/folders/04/bfd6wk9n385174twdm8hx_dc0000gp/T/podman/gvproxy.sock: no such file or directory
Error: gvproxy exited unexpectedly with exit code 0
DEBU[0000] Shutting down engines

My vfkit is this:

❯ vfkit --version
vfkit version: 0.5.1

Maybe this bug should be reopened?

@jfrantzius

OK, the error message is the same as in #21442, but as described there, my applehv machine from the current main branch does survive a reboot, so it looks like this issue is indeed fixed (on the main branch).

@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Apr 30, 2024
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 30, 2024