dns stops working after a while #779

Closed
1 of 5 tasks
brennan-airtime opened this issue Aug 16, 2023 · 7 comments
@brennan-airtime

brennan-airtime commented Aug 16, 2023

Description

DNS seems to have issues after the VM has been running for some time, maybe something to do with suspend/resume. Recreating the VM restores DNS:

colima delete ; colima start --cpu 6 --memory 24 --disk 800 --vm-type=vz

Resolving github.com fails during docker build, inside containers, and even when I ssh into the VM.

DNS works fine on my host.

Version

colima version 0.5.5
git commit: 6251dc2

runtime: docker
arch: x86_64
client: v24.0.2
server: v23.0.6

Operating System

  • macOS Intel <= 12 (Monterey)
  • macOS Intel >= 13 (Ventura)
  • macOS M1 <= 12 (Monterey)
  • macOS M1 >= 13 (Ventura)
  • Linux

Output of colima status

INFO[0000] colima is running using macOS Virtualization.Framework
INFO[0000] arch: x86_64
INFO[0000] runtime: docker
INFO[0000] mountType: virtiofs
INFO[0000] socket: unix:///Users/brennan/.colima/default/docker.sock

Reproduction Steps

  1. colima start --cpu 6 --memory 24 --disk 800 --vm-type=vz
  2. use it for a while
  3. DNS resolution becomes very intermittent

Expected behaviour

No response

Additional context

No response

@luisdavim

I have the same issue, but it only seems to happen when using --vm-type=vz

@SZChimp

SZChimp commented Aug 21, 2023

Hi, I don't know if this helps, but have you tried using another DNS server? You can set one using --dns=1.2.3.4. Maybe this helps to narrow down the error source.

@brennan-airtime
Author

brennan-airtime commented Aug 21, 2023

If I run a DNS query with nslookup or dig from inside a container and point it at, say, Cloudflare's DNS server 1.1.1.1, it works fine. I believe there is a DNS proxy or some other DNS component inside the colima VM that is having issues.

Inside containers:

# cat /etc/resolv.conf 
nameserver 192.168.5.3

That is on the same subnet as the eth0 interface inside the VM (checked via colima ssh), which itself has 192.168.5.1:

eth0      Link encap:Ethernet  HWaddr 52:55:55:C8:C3:AA
          inet addr:192.168.5.1  Bcast:0.0.0.0  Mask:255.255.255.0
          inet6 addr: fe80::5055:55ff:fec8:c3aa/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1028518 errors:0 dropped:0 overruns:0 frame:0
          TX packets:459468 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1226620327 (1.1 GiB)  TX bytes:524014950 (499.7 MiB)
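For anyone who wants to repeat this comparison, here is a minimal sketch that checks the same name against the resolver from /etc/resolv.conf and against a public one. The helper names `first_nameserver` and `check_dns` are made up for illustration, and it assumes the `nslookup` in the container or VM exits non-zero when a lookup fails, which may not hold for every build.

```shell
# Print the first nameserver listed in a resolv.conf-style file
# (defaults to /etc/resolv.conf).
first_nameserver() {
    awk '$1 == "nameserver" { print $2; exit }' "${1:-/etc/resolv.conf}"
}

# Look up a name against one specific DNS server and report the result.
# Assumption: nslookup returns a non-zero status on failure.
check_dns() {
    name=$1
    server=$2
    if nslookup "$name" "$server" >/dev/null 2>&1; then
        echo "ok   $name via $server"
    else
        echo "FAIL $name via $server"
    fi
}

# Example usage inside a container or the VM:
#   check_dns github.com "$(first_nameserver)"
#   check_dns github.com 1.1.1.1
```

If the first check fails while the second succeeds, that points at the resolver inside the VM rather than at upstream DNS.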

@michaeldiscala

My team is seeing this same issue periodically, using the vz vm type. My network config is as follows:

# Network configurations for the virtual machine.
network:
  # Assign reachable IP address to the virtual machine.
  # NOTE: this is currently macOS only and ignored on Linux.
  # Default: false
  address: true
  
  # Custom DNS resolvers for the virtual machine.
  #
  # EXAMPLE
  # dns: [8.8.8.8, 1.1.1.1]
  #
  # Default: []
  dns:
    - 8.8.8.8
    - 1.1.1.1
  
  # DNS hostnames to resolve to custom targets using the internal resolver.
  # This setting has no effect if a custom DNS resolver list is supplied above.
  # It does not configure the /etc/hosts files of any machine or container.
  # The value can be an IP address or another host.
  #
  # EXAMPLE
  # dnsHosts:
  #   example.com: 1.2.3.4
  dnsHosts: {}
  
  # Network driver to use (slirp, gvproxy), (requires vmType `qemu`)
  #   - slirp is the default user mode networking provided by Qemu
  #   - gvproxy is an alternative to VPNKit based on gVisor https://github.com/containers/gvisor-tap-vsock
  # Default: gvproxy
  driver: gvproxy

Please let me know if there's any diagnostic information that would be helpful for me to share. Thank you!

@brennan-airtime
Author

It seems the Docker daemon's DNS proxy is the component having the issue. Next time it happens I'll try restarting dockerd to see whether that resolves it.

@rawroland

I am also experiencing this issue and would like to add what triggers it, in case that helps. It happens both after my system goes to sleep and when I switch my network connection, that is, from LAN to Wi-Fi or vice versa. The only thing that helps for me is restarting colima.
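Until a fix lands, the check-then-restart workaround can be scripted. This is a rough sketch only: it assumes `colima ssh -- <command>` runs the command inside the VM and that `colima restart` brings the VM back with its existing settings. The function name `restart_if_dns_broken` and the github.com probe host are placeholders, not something colima provides.

```shell
# Probe DNS from inside the colima VM and restart the VM if the lookup fails.
# Assumption: nslookup inside the VM exits non-zero when resolution fails.
restart_if_dns_broken() {
    probe_host=${1:-github.com}
    if colima ssh -- nslookup "$probe_host" >/dev/null 2>&1; then
        echo "dns ok"
    else
        echo "dns broken, restarting colima"
        colima restart
    fi
}

# Example usage, e.g. after waking from sleep or switching networks:
#   restart_if_dns_broken github.com
```

A restart interrupts running containers, so this is something to run by hand or from a wake hook rather than on a tight timer.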

@rfay
Contributor

rfay commented Nov 1, 2023

abiosoft added this to the v0.6.0 milestone Nov 12, 2023
jesse-c added a commit to SeldonIO/MLServer that referenced this issue May 30, 2024
* build: Lock GitHub runners' OS

This was motivated by our macOS jobs failing [2] because
colima is missing. It looks like this is because the
latest versions of the macOS runner no longer have
colima installed by default [1].

colima is now explicitly installed.

[1] actions/runner-images#6216
[2] `/Users/runner/work/_temp/f19ffbff-27a9-4fc7-80b6-97791d2de141.sh: line 9: colima: command not found`

* build: Lock Colima

* build: Move macOS Docker installation to script

* build: Move macOS libomp activation to script

* build: Use latest Colima

The > 0.6.0 releases actually fix the issue we have linked [1][2][3].

[1] abiosoft/colima#577
[2] https://github.com/jesse-c/MLServer/blob/c3acd60995a72141027eff506e4fd330fe824179/hack/install-docker-macos.sh#L18-L20
[3] > Switch to new user-v2 network. Fixes abiosoft/colima#648, abiosoft/colima#603, abiosoft/colima#577, abiosoft/colima#779, abiosoft/colima#137, abiosoft/colima#740.