k8s network policy legacy e2e test fails #2285

Closed
zhangzujian opened this issue Feb 2, 2023 · 4 comments · Fixed by #2313 or #2322

zhangzujian commented Feb 2, 2023

Expected Behavior

Actual Behavior

IPv4:

Summarizing 8 Failures:
  [FAIL] [sig-network] NetworkPolicyLegacy [LinuxOnly] NetworkPolicy between server and client [It] should ensure an IP overlapping both IPBlock.CIDR and IPBlock.Except is allowed [Feature:NetworkPolicy]
  /home/runner/go/pkg/mod/k8s.io/kubernetes@v1.26.1/test/e2e/network/netpol/network_legacy.go:1957
  [FAIL] [sig-network] NetworkPolicyLegacy [LinuxOnly] NetworkPolicy between server and client [It] should enforce multiple egress policies with egress allow-all policy taking precedence [Feature:NetworkPolicy]
  /home/runner/go/pkg/mod/k8s.io/kubernetes@v1.26.1/test/e2e/network/netpol/network_legacy.go:1957
  [FAIL] [sig-network] NetworkPolicyLegacy [LinuxOnly] NetworkPolicy between server and client [It] should support a 'default-deny-all' policy [Feature:NetworkPolicy]
  /home/runner/go/pkg/mod/k8s.io/kubernetes@v1.26.1/test/e2e/network/netpol/network_legacy.go:1957
  [FAIL] [sig-network] NetworkPolicyLegacy [LinuxOnly] NetworkPolicy between server and client [It] should enforce egress policy allowing traffic to a server in a different namespace based on PodSelector and NamespaceSelector [Feature:NetworkPolicy]
  /home/runner/go/pkg/mod/k8s.io/kubernetes@v1.26.1/test/e2e/network/netpol/network_legacy.go:1957
  [FAIL] [sig-network] NetworkPolicyLegacy [LinuxOnly] NetworkPolicy between server and client [It] should work with Ingress,Egress specified together [Feature:NetworkPolicy]
  /home/runner/go/pkg/mod/k8s.io/kubernetes@v1.26.1/test/e2e/network/netpol/network_legacy.go:1957
  [FAIL] [sig-network] NetworkPolicyLegacy [LinuxOnly] NetworkPolicy between server and client [It] should enforce except clause while egress access to server in CIDR block [Feature:NetworkPolicy]
  /home/runner/go/pkg/mod/k8s.io/kubernetes@v1.26.1/test/e2e/network/netpol/network_legacy.go:1957
  [FAIL] [sig-network] NetworkPolicyLegacy [LinuxOnly] NetworkPolicy between server and client [It] should allow egress access to server in CIDR block [Feature:NetworkPolicy]
  /home/runner/go/pkg/mod/k8s.io/kubernetes@v1.26.1/test/e2e/network/netpol/network_legacy.go:1957
  [FAIL] [sig-network] NetworkPolicyLegacy [LinuxOnly] NetworkPolicy between server and client [It] should allow egress access on one named port [Feature:NetworkPolicy]
  /home/runner/go/pkg/mod/k8s.io/kubernetes@v1.26.1/test/e2e/network/netpol/network_legacy.go:1957

IPv6:

Summarizing 10 Failures:
  [FAIL] [sig-network] NetworkPolicyLegacy [LinuxOnly] NetworkPolicy between server and client [It] should enforce except clause while egress access to server in CIDR block [Feature:NetworkPolicy]
  /home/runner/go/pkg/mod/k8s.io/kubernetes@v1.26.1/test/e2e/network/netpol/network_legacy.go:1957
  [FAIL] [sig-network] NetworkPolicyLegacy [LinuxOnly] NetworkPolicy between server and client [It] should enforce policies to check ingress and egress policies can be controlled independently based on PodSelector [Feature:NetworkPolicy]
  /home/runner/go/pkg/mod/k8s.io/kubernetes@v1.26.1/test/e2e/network/netpol/network_legacy.go:1932
  [FAIL] [sig-network] NetworkPolicyLegacy [LinuxOnly] NetworkPolicy between server and client [It] should allow egress access on one named port [Feature:NetworkPolicy]
  /home/runner/go/pkg/mod/k8s.io/kubernetes@v1.26.1/test/e2e/network/netpol/network_legacy.go:1957
  [FAIL] [sig-network] NetworkPolicyLegacy [LinuxOnly] NetworkPolicy between server and client [It] should work with Ingress,Egress specified together [Feature:NetworkPolicy]
  /home/runner/go/pkg/mod/k8s.io/kubernetes@v1.26.1/test/e2e/network/netpol/network_legacy.go:1957
  [FAIL] [sig-network] NetworkPolicyLegacy [LinuxOnly] NetworkPolicy between server and client [It] should ensure an IP overlapping both IPBlock.CIDR and IPBlock.Except is allowed [Feature:NetworkPolicy]
  /home/runner/go/pkg/mod/k8s.io/kubernetes@v1.26.1/test/e2e/network/netpol/network_legacy.go:1957
  [FAIL] [sig-network] NetworkPolicyLegacy [LinuxOnly] NetworkPolicy between server and client [It] should support a 'default-deny-all' policy [Feature:NetworkPolicy]
  /home/runner/go/pkg/mod/k8s.io/kubernetes@v1.26.1/test/e2e/network/netpol/network_legacy.go:1957
  [FAIL] [sig-network] NetworkPolicyLegacy [LinuxOnly] NetworkPolicy between server and client [It] should enforce egress policy allowing traffic to a server in a different namespace based on PodSelector and NamespaceSelector [Feature:NetworkPolicy]
  /home/runner/go/pkg/mod/k8s.io/kubernetes@v1.26.1/test/e2e/network/netpol/network_legacy.go:1957
  [FAIL] [sig-network] NetworkPolicyLegacy [LinuxOnly] NetworkPolicy between server and client [It] should enforce multiple egress policies with egress allow-all policy taking precedence [Feature:NetworkPolicy]
  /home/runner/go/pkg/mod/k8s.io/kubernetes@v1.26.1/test/e2e/network/netpol/network_legacy.go:1957
  [FAIL] [sig-network] NetworkPolicyLegacy [LinuxOnly] NetworkPolicy between server and client [It] should allow egress access to server in CIDR block [Feature:NetworkPolicy]
  /home/runner/go/pkg/mod/k8s.io/kubernetes@v1.26.1/test/e2e/network/netpol/network_legacy.go:1957
  [FAIL] [sig-network] NetworkPolicyLegacy [LinuxOnly] NetworkPolicy between server and client [It] should enforce updated policy [Feature:NetworkPolicy]
  /home/runner/go/pkg/mod/k8s.io/kubernetes@v1.26.1/test/e2e/network/netpol/network_legacy.go:1957

@oilbeater oilbeater added this to To do in 2023-2 Feb 6, 2023
@oilbeater oilbeater added the bug Something isn't working label Feb 6, 2023
@changluyi changluyi moved this from To do to In progress in 2023-2 Feb 6, 2023
@changluyi
Collaborator

changluyi commented Feb 7, 2023

Almost every test case that calls testCannotConnect fails.

To verify that a pod cannot connect, the NetworkPolicyLegacy cases give the client pod a startup script:

  spec:
    containers:
    - args:
      - -c
      - for i in $(seq 1 5); do /agnhost connect 10.104.63.59:80 --protocol tcp --timeout
        8s && exit 0 || sleep 1; done; exit 1

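If the script can reach the IP, it exits 0 and the pod phase becomes Succeeded.
If the script cannot reach the IP, it exits 1 and the pod phase becomes Failed.
So when the test checks that a connection is blocked, the expected phase is Failed, but under kube-ovn the pod ends up Succeeded.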

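The cause:
kube-ovn implements network policy by adding each pod's LSP (logical switch port) to a port group and binding ACLs to that port group. If the pod is not up yet, its LSP is not added to the port group, so the ACLs have no effect on the pod until after it has started.

// fetchSelectedPorts checks isPodAlive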
	for _, pod := range pods {
		if !isPodAlive(pod) || pod.Spec.HostNetwork {
			continue
		}

We should decide whether this isPodAlive check needs to be removed.

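Also, the NetworkPolicyLegacy and NetworkPolicy test cases are exactly the same.
NetworkPolicy is essentially a refactoring of the NetworkPolicyLegacy test methods that adds a result matrix, retries, and so on, so I am not sure we have to satisfy the NetworkPolicyLegacy cases.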

@oilbeater
Collaborator

Let's remove isPodAlive. As it stands this is effectively a security risk: policies are not enforced for a period while the pod is starting.

@changluyi
Collaborator

changluyi commented Feb 7, 2023

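The timing of when the network policy ACLs get applied is also not very stable.
About 90% of the time the pod's startup script runs before the ACLs are configured, and about 10% of the time after.

For example, from the kube-ovn-controller log: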

I0207 08:17:02.671488      13 pod.go:407] handle add pod network-policy-694/client-a-7974b
...
I0207 08:17:02.775084      13 pod.go:484] take 49 ms to handle update pod network-policy-694/client-a-7974b
...
The network policy update is only triggered once status.PodIP has been updated, that is, after the CNI handleAdd has finished. This takes roughly 2s (about 1.5s in the log shown here).
I0207 08:17:04.159675      13 pod.go:136] namespace network-policy-694's namedPort portname is serve-80 with info &{80 map[server-h9vwr:]} // network policy processing starts

I0207 08:17:04.278666      13 network_policy.go:469] create egress acl cmd is: // ACL applied

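I captured packets on the server side. A packet appearing there proves the pod's startup script has started running. The packet below was captured at 08:17:03.73, before the ACL was applied at 08:17:04.28: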

08:17:03.730644 00:00:00:9d:63:e6 > 00:00:00:b7:23:6f, ethertype IPv4 (0x0800), length 66: 10.16.0.202.80 > 10.16.0.205.43280: Flags [.], ack 2, win 211, options [nop,nop,TS val 11269029 ecr 11269028], length 0
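
The size of the unprotected window can be read off the timestamps quoted above. The sketch below only does that arithmetic. gap is a hypothetical helper, and the comparison assumes the controller log and the packet capture use the same or synchronized clocks, which the thread does not state.

```go
package main

import (
	"fmt"
	"time"
)

// gap returns how long `later` comes after `earlier`. Both arguments are
// wall-clock times in HH:MM:SS.microseconds form, as they appear in the logs.
func gap(earlier, later string) (time.Duration, error) {
	const layout = "15:04:05.000000"
	a, err := time.Parse(layout, earlier)
	if err != nil {
		return 0, err
	}
	b, err := time.Parse(layout, later)
	if err != nil {
		return 0, err
	}
	return b.Sub(a), nil
}

func main() {
	podAdd := "08:17:02.671488"      // "handle add pod" log line
	firstPacket := "08:17:03.730644" // packet captured on the server
	aclCreated := "08:17:04.278666"  // "create egress acl" log line

	d1, _ := gap(firstPacket, aclCreated)
	d2, _ := gap(podAdd, aclCreated)
	fmt.Println(d1) // 548.022ms: traffic was already flowing before the ACL existed
	fmt.Println(d2) // 1.607178s: from pod add to ACL creation
}
```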

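I tested Calico and it does not show this timing problem, possibly because Calico applies its ACLs faster.

I also observed that the startup script begins running while the pod phase is still Pending.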

2023-2 automation moved this from In progress to Done Feb 15, 2023
@oilbeater oilbeater moved this from Done to In progress in 2023-2 Feb 16, 2023
@oilbeater oilbeater reopened this Feb 16, 2023
@oilbeater
Collaborator

Need backport to release-1.11

This was referenced Feb 20, 2023
@changluyi changluyi moved this from In progress to Done in 2023-2 Feb 21, 2023