
macvlan: error killing pod: Link not found #953

Closed
cyclinder opened this issue Oct 10, 2023 · 11 comments · Fixed by #954

Comments

@cyclinder
Contributor

cyclinder commented Oct 10, 2023

Events:
  Type     Reason          Age                   From               Message
  ----     ------          ----                  ----               -------
  Normal   Scheduled       5m8s                  default-scheduler  Successfully assigned ns4740-192363920/ds-4bfa5c3fb5-8pwcf to spiderpool1008202421-control-plane
  Normal   AddedInterface  5m8s                  multus             Add eth0 [fc00:f853:ccd:e793:f::c7/64 172.18.40.6/16] from kube-system/macvlan-vlan0
  Normal   AddedInterface  5m6s                  multus             Add net1 [fd00:95b2::3/120 10.185.192.3/24] from ns4740-192363920/test-multus-6d8e8be07a
  Normal   Pulled          5m5s                  kubelet            Container image "alpine" already present on machine
  Normal   Created         5m5s                  kubelet            Created container samplepod
  Normal   Started         5m5s                  kubelet            Started container samplepod
  Normal   Killing         5m3s                  kubelet            Stopping container samplepod
  Warning  FailedKillPod   32s (x21 over 4m33s)  kubelet            error killing pod: failed to "KillPodSandbox" for "5c8fe27b-afa4-4ff2-9399-93773a4e1f5e" with KillPodSandboxError: "rpc error: code = Unknown desc = failed to destroy network for sandbox \"402c1fae47b140592e12cd11f3f5f1927788e6196e6f8d313ab09eee9cd4fedc\": delegateDel: error invoking ConflistDel - \"test-multus-6d8e8be07a\": conflistDel: error in getting result from DelNetworkList: Link not found"

I checked the code; this may have happened at

if err != ip.ErrLinkNotFound {

"Link not found" != "link not found"?

@cyclinder
Contributor Author

cyclinder commented Oct 10, 2023

But I tested the code, and it works well:

root@cyclinder3:~/cyclinder/test/linknofound# cat main.go
package main

import (
	"fmt"
	"log"

	"github.com/containernetworking/plugins/pkg/ip"
)

func main() {
	if err := ip.DelLinkByName("eth0"); err != nil {
		fmt.Println("del err: ", err)
		if err != ip.ErrLinkNotFound {
			log.Fatalln(err)
		}
	}

	fmt.Println("Del")
}
root@cyclinder3:~/cyclinder/test/linknofound# go run main.go
del err:  link not found
Del

@mlguerrero12
Member

I think the error is coming from the loadConf function of the del command, probably when trying to get the MTU of the master. "Link not found" with a capital L is an error string returned by netlink. I'll have a look.
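Here's a quick sketch of the mismatch (assuming netlink.LinkByName returns its LinkNotFoundError for a missing interface and ip.DelLinkByName maps that onto the pkg/ip sentinel, as described above):

package main

import (
	"fmt"

	"github.com/containernetworking/plugins/pkg/ip"
	"github.com/vishvananda/netlink"
)

func main() {
	// The lookup used for the master's MTU goes through netlink directly.
	_, err := netlink.LinkByName("does-not-exist")
	fmt.Println(err)                       // "Link not found" (capital L, from netlink)
	fmt.Println(err == ip.ErrLinkNotFound) // false: not the pkg/ip sentinel

	// ip.DelLinkByName translates the netlink error into the sentinel,
	// which is why the test in the previous comment behaves as expected.
	err = ip.DelLinkByName("does-not-exist")
	fmt.Println(err == ip.ErrLinkNotFound) // true
}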

@cyclinder
Contributor Author

Yeah, agreed. It looks like this is the only place where the error can happen. I would like to fix this, may I?

@mlguerrero12
Member

sure, thanks

@cyclinder
Contributor Author

It looks like we can't change the code of the loadConf function directly: if we ignore the LinkNotFound error at getMTU, that would affect cmdAdd. The best option is to check for LinkNotFound first in cmdDel and, if true, return directly.
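Roughly like this (a rough sketch of the idea, not the final patch; the loadConf call signature and the assumption that it passes the netlink error through unwrapped are mine):

func cmdDel(args *skel.CmdArgs) error {
	n, _, err := loadConf(args, "")
	if err != nil {
		// If the master link is already gone, the kernel has removed the
		// macvlan interfaces created on top of it, so there is nothing
		// left to clean up here.
		if _, ok := err.(netlink.LinkNotFoundError); ok {
			return nil
		}
		return err
	}

	// ... the rest of cmdDel (IPAM release, deleting args.IfName in the
	// netns) stays as it is today, using n as before ...
	_ = n
	return nil
}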

@mlguerrero12
Member

What happens to the macvlan interfaces after the master is deleted? Are they removed as well?

@cyclinder
Contributor Author

Yes:

root@cyclinder3:~# ip l show ens192.100
754: ens192.100@ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 00:50:56:b4:27:ca brd ff:ff:ff:ff:ff:ff
root@cyclinder3:~# ip netns exec net1 ip a
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
755: macvlan1@if754: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 86:8f:ec:8e:bd:e9 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fe80::848f:ecff:fe8e:bde9/64 scope link
       valid_lft forever preferred_lft forever
root@cyclinder3:~# ip l d ens192.100
root@cyclinder3:~# ip netns exec net1 ip a
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

@mlguerrero12
Member

cool

Another thing: what's the purpose of calling the loadConf function in the del command? It seems to me that we only need to access the n.IPAM parameter. If that's the case, we only need this:

n := &NetConf{}
if err := json.Unmarshal(args.StdinData, n); err != nil {
	return nil, "", fmt.Errorf("failed to load netconf: %v", err)
}

We could replace loadConf with this in the del command (changing the error message to "failed to unmarshal config" or something like that).

Just had a quick look. Please confirm this.
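For concreteness, this is roughly what the del command could look like with that change (a sketch only; the IPAM release and netns cleanup follow what I understand the current cmdDel does, and the exact error message wording is open):

func cmdDel(args *skel.CmdArgs) error {
	// Only the IPAM section of the netconf is needed here, so a plain
	// json.Unmarshal avoids loadConf's lookup of the master interface.
	n := &NetConf{}
	if err := json.Unmarshal(args.StdinData, n); err != nil {
		return fmt.Errorf("failed to unmarshal netconf: %v", err)
	}

	// Release the IPAM allocation, if an IPAM plugin is configured.
	if n.IPAM.Type != "" {
		if err := ipam.ExecDel(n.IPAM.Type, args.StdinData); err != nil {
			return err
		}
	}

	if args.Netns == "" {
		return nil
	}

	// The container-side macvlan interface may already be gone (for example
	// when its master was deleted), so a missing link is not an error.
	return ns.WithNetNSPath(args.Netns, func(_ ns.NetNS) error {
		if err := ip.DelLinkByName(args.IfName); err != nil && err != ip.ErrLinkNotFound {
			return err
		}
		return nil
	})
}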

@cyclinder
Contributor Author

Yeah, I just checked the code: we only need n.IPAM, so we can replace loadConf. But I'm not sure if additional parameters will be needed in the future.

@mlguerrero12
Member

And they will be there. What I'm trying to say is that we don't need the other parts of loadConf.

@cyclinder
Contributor Author

I mean we're not sure if we'll need the "other parts" of loadConf in the future (although most likely they won't be needed 😄).
