
Minikube tries to delete the default libvirt network #18919

Closed
nirs opened this issue May 18, 2024 · 3 comments · Fixed by #18920

Comments

@nirs
Contributor

nirs commented May 18, 2024

What Happened?

When running minikube with --driver kvm2 --network default, it uses the libvirt default network, but it does not create it. In fact, it fails if the default network does not exist.
Yet when deleting a minikube profile, it tries to delete this default network that it did not create.

Looking at the verbose logs, we see that minikube tries to delete the default network if the network is not used by any VMs:

I0518 02:51:47.824566 1219676 out.go:177] * Deleting "minikube" in kvm2 ...
I0518 02:51:47.824590 1219676 main.go:141] libmachine: (minikube) Calling .Remove
I0518 02:51:47.824675 1219676 main.go:141] libmachine: (minikube) DBG | Removing machine...
I0518 02:51:47.834997 1219676 main.go:141] libmachine: (minikube) DBG | Trying to delete the networks (if possible)
I0518 02:51:47.845017 1219676 main.go:141] libmachine: (minikube) DBG | Checking if network default exists...
I0518 02:51:47.845706 1219676 main.go:141] libmachine: (minikube) DBG | Network default exists
I0518 02:51:47.845718 1219676 main.go:141] libmachine: (minikube) DBG | Trying to list all domains...
I0518 02:51:47.845824 1219676 main.go:141] libmachine: (minikube) DBG | Listed all domains: total of 4 domains
I0518 02:51:47.845834 1219676 main.go:141] libmachine: (minikube) DBG | Trying to get name of domain...
I0518 02:51:47.845839 1219676 main.go:141] libmachine: (minikube) DBG | Got domain name: fedora39-base
I0518 02:51:47.845842 1219676 main.go:141] libmachine: (minikube) DBG | Getting XML for domain fedora39-base...
I0518 02:51:47.846001 1219676 main.go:141] libmachine: (minikube) DBG | Got XML for domain fedora39-base
I0518 02:51:47.846311 1219676 main.go:141] libmachine: (minikube) DBG | Unmarshaled XML for domain fedora39-base: kvm.result{Name:"fedora39-base", Interfaces:[]kvm.iface{kvm.iface{Source:kvm.source{Network:"default"}}}}
I0518 02:51:47.846351 1219676 main.go:141] libmachine: (minikube) Deleting of networks failed: network still in use at least by domain 'fedora39-base',

This is very wrong: minikube does not own the libvirt default network, so it must not try to remove it.

This is 100% reproducible when no other VM on the system is using the default network, and it fails randomly when deleting multiple profiles in parallel, since the check for a used network is racy (time of check vs. time of use).

Looking at the relevant code, the intent is clear: we do not want to delete the default network (d.Network), only the minikube private network (d.PrivateNetwork). So the root cause seems to be that d.PrivateNetwork is mistakenly set to "default" at some point.

    func (d *Driver) deleteNetwork() error {
        conn, err := getConnection(d.ConnectionURI)
        if err != nil {
            return errors.Wrap(err, "getting libvirt connection")
        }
        defer conn.Close()

        // network: default
        // It is assumed that the OS manages this network

        // network: private
        log.Debugf("Checking if network %s exists...", d.PrivateNetwork)
        network, err := conn.LookupNetworkByName(d.PrivateNetwork)
        if err != nil {
            if lvErr(err).Code == libvirt.ERR_NO_NETWORK {
                log.Warnf("Network %s does not exist. Skipping deletion", d.PrivateNetwork)
                return nil
            }
            return errors.Wrapf(err, "failed looking up network %s", d.PrivateNetwork)
        }
        defer func() { _ = network.Free() }()
        log.Debugf("Network %s exists", d.PrivateNetwork)

        err = d.checkDomains(conn)
        if err != nil {
            return err
        }

Checking the domain XML, we see two interfaces created on the default network, instead of one interface on the default network and one on the "minikube-net" private network:

    <interface type='network'>
      <mac address='52:54:00:16:16:55'/>
      <source network='default' portid='48a95c83-922c-45b1-974e-52feada03103' bridge='virbr0'/>
      <target dev='vnet0'/>
      <model type='virtio'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </interface>
    <interface type='network'>
      <mac address='52:54:00:76:58:ad'/>
      <source network='default' portid='aa7bbe43-62bc-466d-8da6-dcae3dbcf5c3' bridge='virbr0'/>
      <target dev='vnet1'/>
      <model type='virtio'/>
      <alias name='net1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>

We can also see that minikube is using only the default network in the config:

$ grep KVMNetwork /data/tmp/.minikube/profiles/minikube/config.json 
	"KVMNetwork": "default",

Maybe this is the intended behavior - if you use --network default, both interfaces are created on the specified network.

Based on this, I think we should skip deletion if the private network is "default".

Attach the log file

Logs in the description.

Operating System

Redhat/Fedora

Driver

KVM2

@nirs nirs changed the title Minikube try to delete the default libvirt network Minikube tries to delete the default libvirt network May 19, 2024
@medyagh
Member

medyagh commented May 20, 2024

Thanks @nirs for the PR, and that sounds reasonable. I am curious: what are some practical uses of the "default" network vs. letting minikube create its own custom network?

I ask because many years ago, when we shared the network instead of using a dedicated one, we faced a lot of issues such as IP conflicts, stuck networks, and cleanup problems... I would just like to learn your specific needs, if you don't mind sharing.

@nirs
Contributor Author

nirs commented May 20, 2024

> Thanks @nirs for the PR, and that sounds reasonable. I am curious: what are some practical uses of the "default" network vs. letting minikube create its own custom network?

We create a DR setup with 3 clusters (hub, dr1, dr2). The DR clusters run rook-ceph storage and use the host network. This makes it easier to set up storage replication between dr1 and dr2. With the storage replicated, when we create a workload on one cluster and enable DR protection, the ceph volume is replicated to the other cluster. Then we can simulate a disaster by suspending or destroying one of the clusters and starting the workload on the other cluster.

On a real setup, we connect the remote clusters using Submariner. For the testing setup we keep things simple; we have enough trouble without Submariner.

You can check this FOSDEM talk showing all this with virtual machine as workload:
https://fosdem.org/2024/schedule/event/fosdem-2024-3256-instant-ramen-quick-and-easy-multi-cluster-kubernetes-development-on-your-laptop/

@medyagh
Member

medyagh commented May 21, 2024

> Thanks @nirs for the PR, and that sounds reasonable. I am curious: what are some practical uses of the "default" network vs. letting minikube create its own custom network?
>
> We create a DR setup with 3 clusters (hub, dr1, dr2). The DR clusters run rook-ceph storage and use the host network. This makes it easier to set up storage replication between dr1 and dr2. With the storage replicated, when we create a workload on one cluster and enable DR protection, the ceph volume is replicated to the other cluster. Then we can simulate a disaster by suspending or destroying one of the clusters and starting the workload on the other cluster.
>
> On a real setup, we connect the remote clusters using Submariner. For the testing setup we keep things simple; we have enough trouble without Submariner.
>
> You can check this FOSDEM talk showing all this with virtual machines as the workload: https://fosdem.org/2024/schedule/event/fosdem-2024-3256-instant-ramen-quick-and-easy-multi-cluster-kubernetes-development-on-your-laptop/

Thank you for sharing, that's interesting. So in this case the other servers in the cluster already use the "default" network, and I assume you pass a flag to minikube to force it to use the default network as well, right?

I think as long as we make sure we have a good cleanup story (for when we delete minikube), this should be a good PR for minikube. Thanks for taking the time to contribute it.
