QEMU stops working with minikube #15021

Open
spowelljr opened this issue Sep 26, 2022 · 6 comments
Labels

  • co/qemu-driver: QEMU related issues
  • kind/bug: Categorizes issue or PR as related to a bug.
  • lifecycle/rotten: Denotes an issue or PR that has aged beyond stale and will be auto-closed.
  • os/macos
  • priority/important-soon: Must be staffed and worked on either currently, or very soon, ideally in time for the next release.

Comments

@spowelljr
Member

spowelljr commented Sep 26, 2022

QEMU version: 7.1.0
Machine: macOS 12.6 M1 (arm64)

The qemu driver was working fine and then all of a sudden it stopped working.

Ran minikube delete --all and then top | grep qemu to verify no QEMU instance was running.

Tried minikube start --driver qemu and it hangs with the following logs:
logs.txt

Ran minikube delete --all --purge and then started again; it hangs further along in the process with the following logs:
logs2.txt

Also tried different Kubernetes versions.

Tried uninstalling QEMU, restarting the computer, then reinstalling, but still hit the same error.
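
For reference, the sequence described above boils down to roughly the following (a sketch; only commands already mentioned in this report):

# remove any previous cluster state and make sure no stray QEMU process is left
$ minikube delete --all --purge
$ top | grep qemu
# start a fresh cluster with the qemu driver (this is the step that hangs)
$ minikube start --driver qemu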

@spowelljr added the kind/bug, priority/important-soon, and co/qemu-driver labels on Sep 26, 2022
@afbjorklund
Collaborator

Unfortunately, the "user" networking is still somewhat flaky when running on macOS.

But it seems related to DNS, which is supposed to be answering on the host side:
dial tcp: lookup k8s.gcr.io on 10.0.2.3:53: read udp 10.0.2.15:36748->10.0.2.3:53: i/o timeout

https://wiki.qemu.org/Documentation/Networking#User_Networking_.28SLIRP.29
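
A quick guest-side check (assuming nslookup is available in the minikube ISO) is to query the SLIRP virtual resolver at 10.0.2.3 directly:

# open a shell inside the guest VM
$ minikube ssh
# query the guest-visible SLIRP nameserver; when host-side forwarding is broken,
# this times out instead of returning an address
$ nslookup k8s.gcr.io 10.0.2.3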

@spowelljr
Member Author

https://unix.stackexchange.com/a/614603

In the default "user mode" networking, QEMU uses only the first DNS nameserver from the host machine.

It is a known QEMU behavior, which is not expected to be fixed in QEMU.

More details: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=625689
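
On the host side, the nameserver that matters is simply the first entry in /etc/resolv.conf (a quick check, assuming stock macOS tooling):

# SLIRP forwards guest DNS queries (to 10.0.2.3) only to the first "nameserver" line here
$ cat /etc/resolv.conf
# macOS's full resolver configuration, for comparison
$ scutil --dns | grep nameserver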

@spowelljr
Member Author

spowelljr commented Sep 28, 2022

To expand on the above comment, it's related to DNS and the user network.

In the default "user mode" networking, QEMU uses only the first DNS nameserver from the host machine.

If I look at /etc/resolv.conf on my Mac, the first nameserver is a corp one.

So when I try curling from inside the ISO, the DNS lookup is routed to the corp DNS server and fails.

When starting minikube with the qemu driver and the user network, we can confirm this from the following error:
❗ This VM is having trouble accessing https://registry.k8s.io

And if you SSH into the machine, you're not able to curl anything:

$ curl www.google.com
curl: (6) Could not resolve host: www.google.com

In the logs we can see the DNS errors holding everything up are from Docker:

Sep 28 20:07:30 minikube dockerd[860]: time="2022-09-28T20:07:30.312679028Z" level=warning msg="Error getting v2 registry: Get \"https://k8s.gcr.io/v2/\": dial tcp: lookup k8s.gcr.io on 10.0.2.3:53: read udp 10.0.2.15:58598->10.0.2.3:53: i/o timeout"
Sep 28 20:07:30 minikube dockerd[860]: time="2022-09-28T20:07:30.312819695Z" level=info msg="Attempting next endpoint for pull after error: Get \"https://k8s.gcr.io/v2/\": dial tcp: lookup k8s.gcr.io on 10.0.2.3:53: read udp 10.0.2.15:58598->10.0.2.3:53: i/o timeout"
Sep 28 20:07:30 minikube dockerd[860]: time="2022-09-28T20:07:30.320921361Z" level=error msg="Handler for POST /v1.40/images/create returned error: Get \"https://k8s.gcr.io/v2/\": dial tcp: lookup k8s.gcr.io on 10.0.2.3:53: read udp 10.0.2.15:58598->10.0.2.3:53: i/o timeout"
Sep 28 20:08:20 minikube dockerd[860]: time="2022-09-28T20:08:20.338281844Z" level=warning msg="Error getting v2 registry: Get \"https://k8s.gcr.io/v2/\": dial tcp: lookup k8s.gcr.io on 10.0.2.3:53: read udp 10.0.2.15:57092->10.0.2.3:53: i/o timeout"
Sep 28 20:08:20 minikube dockerd[860]: time="2022-09-28T20:08:20.338366219Z" level=info msg="Attempting next endpoint for pull after error: Get \"https://k8s.gcr.io/v2/\": dial tcp: lookup k8s.gcr.io on 10.0.2.3:53: read udp 10.0.2.15:57092->10.0.2.3:53: i/o timeout"
Sep 28 20:08:20 minikube dockerd[860]: time="2022-09-28T20:08:20.343149260Z" level=error msg="Handler for POST /v1.40/images/create returned error: Get \"https://k8s.gcr.io/v2/\": dial tcp: lookup k8s.gcr.io on 10.0.2.3:53: read udp 10.0.2.15:57092->10.0.2.3:53: i/o timeout"
Sep 28 20:08:40 minikube dockerd[860]: time="2022-09-28T20:08:40.342848186Z" level=warning msg="Error getting v2 registry: Get \"https://k8s.gcr.io/v2/\": dial tcp: lookup k8s.gcr.io on 10.0.2.3:53: read udp 10.0.2.15:44128->10.0.2.3:53: i/o timeout"
Sep 28 20:08:40 minikube dockerd[860]: time="2022-09-28T20:08:40.343528645Z" level=info msg="Attempting next endpoint for pull after error: Get \"https://k8s.gcr.io/v2/\": dial tcp: lookup k8s.gcr.io on 10.0.2.3:53: read udp 10.0.2.15:44128->10.0.2.3:53: i/o timeout"
Sep 28 20:08:40 minikube dockerd[860]: time="2022-09-28T20:08:40.349768811Z" level=error msg="Handler for POST /v1.40/images/create returned error: Get \"https://k8s.gcr.io/v2/\": dial tcp: lookup k8s.gcr.io on 10.0.2.3:53: read udp 10.0.2.15:44128->10.0.2.3:53: i/o timeout"
Sep 28 20:09:10 minikube dockerd[860]: time="2022-09-28T20:09:10.349929159Z" level=warning msg="Error getting v2 registry: Get \"https://k8s.gcr.io/v2/\": dial tcp: lookup k8s.gcr.io on 10.0.2.3:53: read udp 10.0.2.15:36394->10.0.2.3:53: i/o timeout"
Sep 28 20:09:10 minikube dockerd[860]: time="2022-09-28T20:09:10.350541201Z" level=error msg="Not continuing with pull after error: Get \"https://k8s.gcr.io/v2/\": dial tcp: lookup k8s.gcr.io on 10.0.2.3:53: read udp 10.0.2.15:36394->10.0.2.3:53: i/o timeout"
Sep 28 20:09:10 minikube dockerd[860]: time="2022-09-28T20:09:10.350810076Z" level=error msg="Handler for POST /v1.40/images/create returned error: Get \"https://k8s.gcr.io/v2/\": dial tcp: lookup k8s.gcr.io on 10.0.2.3:53: read udp 10.0.2.15:36394->10.0.2.3:53: i/o timeout"
Sep 28 20:09:40 minikube dockerd[860]: time="2022-09-28T20:09:40.357594007Z" level=warning msg="Error getting v2 registry: Get \"https://k8s.gcr.io/v2/\": dial tcp: lookup k8s.gcr.io on 10.0.2.3:53: read udp 10.0.2.15:50671->10.0.2.3:53: i/o timeout"
Sep 28 20:09:40 minikube dockerd[860]: time="2022-09-28T20:09:40.357712840Z" level=info msg="Attempting next endpoint for pull after error: Get \"https://k8s.gcr.io/v2/\": dial tcp: lookup k8s.gcr.io on 10.0.2.3:53: read udp 10.0.2.15:50671->10.0.2.3:53: i/o timeout"
Sep 28 20:09:40 minikube dockerd[860]: time="2022-09-28T20:09:40.364860423Z" level=error msg="Handler for POST /v1.40/images/create returned error: Get \"https://k8s.gcr.io/v2/\": dial tcp: lookup k8s.gcr.io on 10.0.2.3:53: read udp 10.0.2.15:50671->10.0.2.3:53: i/o timeout"

It then just hangs until it eventually fails.
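
The failing pull can also be reproduced by hand from inside the guest; the image name below is only illustrative, since the DNS lookup fails before any image data is fetched:

# any pull from a remote registry hits the same DNS timeout at 10.0.2.3
$ minikube ssh -- docker pull k8s.gcr.io/pause:3.6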

However, if I start minikube using --network=socket_vmnet (using #14989), it starts fine and I'm able to curl without issues. And if I start minikube using --network=user but with --container-runtime=containerd, it also starts successfully, as the Docker step is avoided, but I'm still unable to curl.
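
For reference, the two workarounds look roughly like this (a sketch; flag spellings as in the minikube CLI):

# workaround 1: switch from the default "user" network to socket_vmnet
$ minikube start --driver qemu --network socket_vmnet
# workaround 2: keep the user network but avoid the Docker pull step that hangs
# (DNS is still broken, so curl inside the guest still fails)
$ minikube start --driver qemu --network user --container-runtime containerd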

Thanks for pointing me in the right direction @afbjorklund

@medyagh
Member

medyagh commented Sep 29, 2022

Good job getting to the bottom of this, @spowelljr. We should add this to our documentation as a known issue.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label on Dec 28, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label on Jan 27, 2023