after winget upgrade to 5.0.0, all podman machine commands fail with json error #22144

Closed
alisonatwork opened this issue Mar 23, 2024 · 12 comments
Labels: kind/bug, machine

@alisonatwork

alisonatwork commented Mar 23, 2024

Issue Description

I just upgraded Podman to 5.0.0 using winget upgrade, and now I cannot access any of the machines or basic functionality.

The error when running any podman machine command is as follows:

Error: unable to load machine config file: "json: cannot unmarshal string into Go struct field MachineConfig.ImagePath of type define.VMFile"

Steps to reproduce the issue

  1. Upgrade to Podman 5.0.0 from 4.9.3 using winget (default --scope machine)
  2. Attempt to run podman machine list (or any other podman machine command)

Describe the results you received

Error: unable to load machine config file: "json: cannot unmarshal string into Go struct field MachineConfig.ImagePath of type define.VMFile"

Describe the results you expected

The podman machine commands should keep working after the upgrade.

podman info output

PS C:\Users\user> podman info
OS: windows/amd64
provider: wsl
version: 5.0.0

Cannot connect to Podman. Please verify your connection to the Linux system using `podman system connection list`, or try `podman machine init` and `podman machine start` to manage a new Linux VM
Error: unable to connect to Podman socket: failed to connect: dial tcp 127.0.0.1:49769: connectex: No connection could be made because the target machine actively refused it.
PS C:\Users\user> podman machine init
Error: unable to load machine config file: "json: cannot unmarshal string into Go struct field MachineConfig.ImagePath of type define.VMFile"
PS C:\Users\user> podman system connection list
Name                         URI                                                          Identity                                    Default     ReadWrite
podman-machine-default       ssh://user@127.0.0.1:49769/run/user/1000/podman/podman.sock  C:\Users\user\.ssh\podman-machine-default  true        false
podman-machine-default-root  ssh://root@127.0.0.1:49769/run/podman/podman.sock            C:\Users\user\.ssh\podman-machine-default  false       false
PS C:\Users\user> podman machine list
Error: unable to load machine config file: "json: cannot unmarshal string into Go struct field MachineConfig.ImagePath of type define.VMFile"
PS C:\Users\user> podman machine start
Error: unable to load machine config file: "json: cannot unmarshal string into Go struct field MachineConfig.ImagePath of type define.VMFile"
PS C:\Users\user> podman machine reset
Error: unable to load machine config file: "json: cannot unmarshal string into Go struct field MachineConfig.ImagePath of type define.VMFile"
PS C:\Users\user> wsl --list
Windows Subsystem for Linux Distributions:
podman-machine-default (Default)

Podman in a container

No

Privileged Or Rootless

None

Upstream Latest Release

Yes

Additional environment details

Windows 11 Version 23H2 (OS Build 22631.3371)

Additional information

Tested on a clean system with no other WSL machines installed.

Tested both as administrator and normal user, same error message.

alisonatwork added the kind/bug label on Mar 23, 2024
@alisonatwork
Author

alisonatwork commented Mar 23, 2024

Looking through the source code, it seems that the problem is that it is trying to parse the JSON files in ${Env:USERPROFILE}\.config\containers\podman\machine\wsl\, but the schema has changed so it can't load the existing podman-machine-default.json and thus nothing works. If I rename that file, then the podman machine commands run, but they do not recognize the already-existing machine from 4.9.3.

I see from the changelog that the VM format has changed, but if that is the case then I think the installer should either refuse to install if it encounters an existing VM, or it should rename the old ones so that at least podman will still work correctly after the upgrade and users can still manually get the data off their old machines using wsl. Right now I am not sure the best way to proceed without possibly losing data that is on the existing VM.
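As a rough sketch of the kind of pre-flight check suggested above, the following Go function probes a config file without decoding it fully and reports whether `ImagePath` is still a plain string. The function name `isLegacyConfig` and the sample JSON are hypothetical; this is not something Podman or its installer is known to do.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// isLegacyConfig reports whether the raw machine config stores ImagePath
// as a JSON string instead of an object. Only the first byte of the
// ImagePath value is inspected; nothing else is decoded.
func isLegacyConfig(raw []byte) (bool, error) {
	var probe struct {
		ImagePath json.RawMessage `json:"ImagePath"`
	}
	if err := json.Unmarshal(raw, &probe); err != nil {
		return false, fmt.Errorf("not valid JSON: %w", err)
	}
	value := bytes.TrimSpace(probe.ImagePath)
	// A JSON string value always starts with a double quote.
	return len(value) > 0 && value[0] == '"', nil
}

func main() {
	// Hypothetical sample contents for an old and a new config file.
	samples := []struct {
		name string
		raw  []byte
	}{
		{"old", []byte(`{"ImagePath": "C:\\images\\machine.tar"}`)},
		{"new", []byte(`{"ImagePath": {"Path": "C:\\images\\machine.tar"}}`)},
	}
	for _, s := range samples {
		legacy, err := isLegacyConfig(s.raw)
		if err != nil {
			fmt.Println(s.name, "error:", err)
			continue
		}
		fmt.Println(s.name, "legacy:", legacy)
	}
}
```

A caller could use the result to skip or rename an old file rather than aborting every command.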

@sjfke

sjfke commented Mar 23, 2024

I had the same error. It is caused by the v4.x podman machine not being cleared out properly. The sequence I used:

  • uninstall podman-desktop
    PS C:\Users\sjfke> podman machine stop
    PS C:\Users\sjfke> podman machine rm (y)
  • uninstall podman (v4.x)
  • reboot
  • install podman (v5.x)
  • reboot
    PS C:\Users\sjfke> podman machine info
    PS C:\Users\sjfke> podman machine init
    PS C:\Users\sjfke> podman machine start
  • install podman-desktop

@sjfke

sjfke commented Mar 23, 2024

Maybe all you need to do is run the following before upgrading:

PS C:\Users\sjfke> podman machine stop
PS C:\Users\sjfke> podman machine rm (y) # this removes all traces of the v4 podman from your account

@Luap99
Member

Luap99 commented Mar 24, 2024

This is expected, see the release notes and https://blog.podman.io/2024/03/migration-of-podman-4-to-podman-5-machines/

Luap99 closed this as not planned on Mar 24, 2024
@alisonatwork
Author

The linked blog is not very helpful, because it says:

"If you are working on a critical project where you rely on Podman machine for your development, you should consider waiting to upgrade to Podman 5."

How long should we wait? Is there any kind of fix scheduled for this?

If the only supported upgrade path is to delete all the virtual machines and start from scratch, then I think the installer should at least prompt the user to do that before silently "upgrading" the system to a broken state.

It's still not clear how to fix this once the system has entered the broken state. Do we need to downgrade back to the old version, back up all the data from the various machines, then delete them (as per the blog) and upgrade again?

@alisonatwork
Author

alisonatwork commented Mar 24, 2024

Okay, to fix this without having to juggle different versions of Podman, I took the following steps after a fresh reboot:

  1. wsl --export podman-machine-default podman-machine-default.tar
  2. wsl --unregister podman-machine-default
  3. rm ${Env:USERPROFILE}\.config\containers\podman\machine\wsl\podman-machine-default.json
  4. rm ${Env:USERPROFILE}\.config\containers\podman\machine\wsl\podman-machine-default.lock
  5. Repeat steps 1 through 4 for every other Podman machine on the system
  6. podman machine init
  7. podman machine start

At this point you can at least run containers again. The next step will be recreating every machine you need and manually copying across any custom stuff from the tar backups. The VMs still don't have rpm-ostree so the process of reconfiguring stuff in /etc and updating with dnf seems unchanged from the pre-5.0 dance.
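Since steps 1 through 4 have to be repeated for every machine, a small Go sketch like the one below can print the full command list for review before anything is run. It only builds strings and executes nothing. The function name `cleanupPlan` and the second machine name are hypothetical, and the paths are copied from the steps above.

```go
package main

import "fmt"

// cleanupPlan returns, in order, the commands from steps 1 to 4 for one
// machine. It only builds strings; nothing is executed.
func cleanupPlan(machine string) []string {
	// Config directory as written in the steps above (PowerShell syntax).
	const configDir = `${Env:USERPROFILE}\.config\containers\podman\machine\wsl\`
	return []string{
		fmt.Sprintf("wsl --export %s %s.tar", machine, machine),
		fmt.Sprintf("wsl --unregister %s", machine),
		fmt.Sprintf("rm %s%s.json", configDir, machine),
		fmt.Sprintf("rm %s%s.lock", configDir, machine),
	}
}

func main() {
	// Hypothetical machine names; take the real ones from `wsl --list`.
	machines := []string{"podman-machine-default", "my-second-machine"}
	for _, m := range machines {
		for _, cmd := range cleanupPlan(m) {
			fmt.Println(cmd)
		}
	}
	// Steps 6 and 7 run once, after every old machine has been handled.
	fmt.Println("podman machine init")
	fmt.Println("podman machine start")
}
```

Printing the plan first makes it easier to confirm that each export happens before the matching unregister, which is the step that destroys the old machine.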

If anyone is looking for the new VM source, I think it has moved over here now: https://github.com/containers/podman-machine-wsl-os (previously https://github.com/containers/podman-wsl-fedora).

@sjfke

sjfke commented Mar 24, 2024 via email

@alisonatwork
Author

alisonatwork commented Mar 25, 2024

Unfortunately the suggestion I posted above still does not clean out everything from the old Podman installation. podman system connection list still shows all of the old virtual machines, sockets and SSH configurations. Even worse, podman system connection remove -a does not remove these connections, even though the machines themselves might have been deleted via wsl --unregister.

...and of course podman system connection remove -a now makes it impossible to run podman machine reset again, because:

Unregistering...
Error: 1 error occurred:
        * unable to find connection named "podman-machine-default"

What a mess.

Update: the only fix I have found is to uninstall Podman completely (you can't downgrade, even with the winget --force flag), reinstall 4.9.3, remove all the machines, and then upgrade to 5.0.0 again.

@sjfke

sjfke commented Mar 26, 2024

Hi Alison

Essentially I was trying to do a v4.9.3 to v5.0.0 upgrade, and I concur that this is a mess...

Is 'podman-v5.0.0' considered stable enough to work with, or should I uninstall it, install 'podman-v4.9.4', and wait?

Sjfke

@NeffIsBack

Hi,
I have a fresh installation and am running into this error. What causes it and how can I resolve it? I also tried removing Podman 5.0.0 and installing Podman 4.9.x, but then I get an error that a default machine already exists, while podman machine rm tells me there isn't a default machine yet.

@dealer426

Try this: uninstall 5.0 and reinstall 4.9.3.

Then run the following from Windows PowerShell or bash:

$ podman system connection rm podman-machine-default
$ podman system connection rm podman-machine-default-root
$ wsl --list --all
$ wsl --unregister podman-machine-default

Go to the .ssh folder on Windows.

Delete the podman ssh keys.

Uninstall 4.9.3.

Install 5.0.

Go to this folder:

C:\Users\burns\.config\containers\podman\machine\wsl

Delete all files.

Run these commands again:

$ podman system connection rm podman-machine-default
$ podman system connection rm podman-machine-default-root

You are now ready to build a new podman machine:

$ podman machine init

@harryssuperman

For me, the easiest way was to open Podman Desktop and follow the suggested instructions. Maybe that helps other people.
