List of eks failing tests #9678

Closed · 6 of 28 tasks
nebril opened this issue Nov 27, 2019 · 6 comments
Labels
area/CI (Continuous Integration testing issue or flake) · area/eni (Impacts ENI based IPAM) · stale (The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale.)

Comments

nebril (Member) commented Nov 27, 2019

See #9682 for how to run these
See #9675 for fixups and hacks

  • [Fail] K8sHealthTest [BeforeEach] checks cilium-health status between nodes
    [TODO] Skip on eks because endpoint-endpoint probe won't work with chaining

  • [Fail] K8sFQDNTest [BeforeEach] Restart Cilium validate that FQDN is still working
    [TODO] Hardcoded DNS bind service IP is wrong for this cluster. Need to switch to a template (see the templating sketch after this list).

  • [Fail] K8sFQDNTest [BeforeEach] Validate that multiple specs are working correctly
    [TODO] Hardcoded DNS bind service IP is wrong for this cluster. Need to switch to a template.

  • [Fail] K8sUpdates [It] Tests upgrade and downgrade from a Cilium stable image to master
    /Users/ray/covalent/gopath/src/github.com/cilium/cilium/test/k8sT/Updates.go:194

  • [Fail] K8sKafkaPolicyTest Kafka Policy Tests [It] KafkaPolicies
    /Users/ray/covalent/gopath/src/github.com/cilium/cilium/test/k8sT/KafkaPolicies.go:237

  • [Fail] K8sIstioTest [BeforeEach] Istio Bookinfo Demo Tests bookinfo inter-service connectivity
    /Users/ray/covalent/gopath/src/github.com/cilium/cilium/test/k8sT/istio.go:109

  • [Fail] K8sServicesTest Checks ClusterIP Connectivity [It] Checks service on same node
    Fix with GetNodeNames.

  • [Fail] K8sServicesTest Checks service across nodes [It] Tests NodePort (kube-proxy)
    [TODO] Hardcoded DNS bind service IP is wrong for this cluster. Need to switch to a template.
    Passes with fixed DNS IP

  • [Fail] K8sServicesTest Checks service across nodes with L7 policy [It] Tests NodePort with L7 Policy
    Fails on curls to cilium host internal IP and remote IP. test/k8sT/services.go:286

  • [Fail] K8sServicesTest Bookinfo Demo [It] Tests bookinfo demo
    503 Service Unavailable, probably because Envoy isn't able to reach the upstream.

  • [Fail] K8sDatapathConfig MonitorAggregation [It] Checks that monitor aggregation restricts notifications
    /Users/ray/covalent/gopath/src/github.com/cilium/cilium/test/k8sT/DatapathConfiguration.go:127

  • [Fail] K8sDatapathConfig Encapsulation [It] Check connectivity with sockops and VXLAN encapsulation
    /Users/ray/covalent/gopath/src/github.com/cilium/cilium/test/k8sT/DatapathConfiguration.go:168

  • [Fail] K8sDatapathConfig Encapsulation [It] Check connectivity with VXLAN encapsulation
    /Users/ray/covalent/gopath/src/github.com/cilium/cilium/test/k8sT/DatapathConfiguration.go:168

  • [Fail] K8sDatapathConfig Encapsulation [It] Check connectivity with Geneve encapsulation
    /Users/ray/covalent/gopath/src/github.com/cilium/cilium/test/k8sT/DatapathConfiguration.go:168

  • [Fail] K8sDatapathConfig Transparent encryption DirectRouting [It] Check connectivity with transparent encryption and direct routing
    /Users/ray/covalent/gopath/src/github.com/cilium/cilium/test/k8sT/assertionHelpers.go:135

  • [Fail] K8sDatapathConfig IPv4Only [It] Check connectivity with IPv6 disabled
    /Users/ray/covalent/gopath/src/github.com/cilium/cilium/test/k8sT/DatapathConfiguration.go:262

  • [Fail] K8sDatapathConfig ManagedEtcd [It] Check connectivity with managed etcd
    /Users/ray/covalent/gopath/src/github.com/cilium/cilium/test/k8sT/assertionHelpers.go:135

  • [Fail] K8sPolicyTest Basic Test [It] checks all kind of Kubernetes policies
    /Users/ray/covalent/gopath/src/github.com/cilium/cilium/test/k8sT/Policies.go:269
    [TODO] ingress proxy - skip

  • [Fail] K8sPolicyTest Basic Test [It] CNP test MatchExpressions key
    /Users/ray/covalent/gopath/src/github.com/cilium/cilium/test/k8sT/Policies.go:355
    [TODO] ingress proxy - skip

  • [Fail] K8sPolicyTest Basic Test Validate CNP update [It] Enforces connectivity correctly when the same L3/L4 CNP is updated
    /Users/ray/covalent/gopath/src/github.com/cilium/cilium/test/k8sT/Policies.go:674
    [TODO] Hardcoded DNS bind service IP is wrong for this cluster. Need to switch to a template.
    Passes with fixed DNS IP

  • [Fail] K8sPolicyTest Basic Test Redirects traffic to proxy when no policy is applied with proxy-visibility annotation [BeforeEach] Tests HTTP proxy visibility without policy
    /Users/ray/covalent/gopath/src/github.com/cilium/cilium/test/k8sT/Policies.go:786
    [TODO] Hardcoded DNS bind service IP is wrong for this cluster. Need to switch to a template.
    Passes with fixed DNS IP

  • [Fail] K8sPolicyTest Basic Test Redirects traffic to proxy when no policy is applied with proxy-visibility annotation [BeforeEach] Tests DNS proxy visibility without policy
    /Users/ray/covalent/gopath/src/github.com/cilium/cilium/test/ginkgo-ext/scopes.go:539
    [TODO] Hardcoded DNS bind service IP is wrong for this cluster. Need to switch to a template.
    Passes with fixed DNS IP

  • [Fail] K8sPolicyTest Basic Test Redirects traffic to proxy when no policy is applied with proxy-visibility annotation [BeforeEach] Tests proxy visibility interactions with policy lifecycle operations
    /Users/ray/covalent/gopath/src/github.com/cilium/cilium/test/ginkgo-ext/scopes.go:539
    [TODO] Hardcoded DNS bind service IP is wrong for this cluster. Need to switch to a template.
    Passes with fixed DNS IP

  • [Fail] K8sPolicyTest GuestBook Examples [It] checks policy example
    /Users/ray/covalent/gopath/src/github.com/cilium/cilium/test/k8sT/Policies.go:1019

  • [Fail] K8sPolicyTest Namespaces policies [BeforeEach] Tests the same Policy in different namespaces
    /Users/ray/covalent/gopath/src/github.com/cilium/cilium/test/k8sT/Policies.go:1113

  • [Fail] K8sPolicyTest Namespaces policies [BeforeEach] Kubernetes Network Policy by namespace selector
    /Users/ray/covalent/gopath/src/github.com/cilium/cilium/test/ginkgo-ext/scopes.go:539

  • [Fail] K8sPolicyTest Namespaces policies [BeforeEach] Cilium Network policy using namespace label and L7
    /Users/ray/covalent/gopath/src/github.com/cilium/cilium/test/ginkgo-ext/scopes.go:539
    [TODO] ingress proxy - skip

  • [Fail] K8sPolicyTest Clusterwide policies [BeforeEach] Test clusterwide connectivity with policies
    /Users/ray/covalent/gopath/src/github.com/cilium/cilium/test/k8sT/Policies.go:1324
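
Several of the failures above share the same hardcoded-DNS-IP TODO. A minimal sketch of the templating fix, assuming the test manifest gains a placeholder for the DNS service IP (the placeholder name and template filename below are hypothetical):

# Resolve the cluster's actual kube-dns ClusterIP instead of hardcoding one.
DNS_IP=$(kubectl -n kube-system get svc kube-dns -o jsonpath='{.spec.clusterIP}')

# Render the manifest from a template; @DNS_SERVICE_IP@ is a hypothetical placeholder.
sed "s/@DNS_SERVICE_IP@/${DNS_IP}/g" bind-deployment.yaml.tmpl > bind-deployment.yaml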

ungureanuvladvictor (Member) commented Dec 22, 2020

I took a new stab at this and the following tests are the ones left failing (at commit 9c0939a):

  • K8sDatapathConfig
    • Check connectivity with transparent encryption and direct routing
    • Check connectivity with transparent encryption and direct routing with bpf_host

I suspect we still skip some tests because they do not match the kernel or node setup conditions. I also think the BPF-based masquerading was not tested; I am not sure we have specific e2e tests for it.
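
To confirm what masquerading mode the agents are actually running with, something like this should work (a sketch; <cilium-pod> stands for any cilium agent pod, and the exact status wording varies across Cilium versions):

# Check the masquerading setting reported by one of the agents.
kubectl -n kube-system exec <cilium-pod> -- cilium status | grep -i masquerad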

All tests were run on:

  • OS: Amazon Linux 2
  • Kernel: 4.14.209-160.335.amzn2.x86_64

Tagging @joestringer since we talked about this on Slack a few weeks ago.

ungureanuvladvictor added the area/eni (Impacts ENI based IPAM) label Dec 23, 2020
pchaigno (Member) commented Jan 7, 2021

> I took a new stab at this and the following tests are the ones left failing (at commit 9c0939a):
>
>   • K8sDatapathConfig
>     • Check connectivity with transparent encryption and direct routing
>     • Check connectivity with transparent encryption and direct routing with bpf_host

Could they be failing for the same reason they were failing on GKE? See e0fba74.

ungureanuvladvictor (Member) commented Jan 7, 2021

I tried the tests with direct routing disabled, so that only the encryption functionality is exercised. What I found is that the connectivity check fails because Cilium on the server side detects the packets as coming from the remote-node identity rather than from the client-side pod. I then also disabled masquerading, since it is enabled by default in the ENI Helm integration, but had no success (same failure mode).
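
One way to double-check how the agent classifies the source address is to query the ipcache directly (a sketch using the pod name and source IP from the output below):

# Which identity does the agent's ipcache hold for the offending source IP?
kubectl exec -it -n kube-system cilium-btw9d -- cilium bpf ipcache get 10.0.183.243

# Map the numeric identity back to its labels.
kubectl exec -it -n kube-system cilium-btw9d -- cilium identity list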

Endpoint on one node:

kubectl exec -it -n kube-system cilium-btw9d -- cilium endpoint list
ENDPOINT   POLICY (ingress)   POLICY (egress)   IDENTITY   LABELS (source:key[=value])                                                                       IPv6   IPv4           STATUS
           ENFORCEMENT        ENFORCEMENT
183        Enabled            Disabled          46812      k8s:io.cilium.k8s.policy.cluster=default                                                                 10.0.99.221    ready
                                                           k8s:io.cilium.k8s.policy.serviceaccount=default
                                                           k8s:io.kubernetes.pod.namespace=202101071143k8sdatapathconfigtransparentencryptiondirectrouting
                                                           k8s:zgroup=testDS
777        Disabled           Disabled          1          k8s:alpha.eksctl.io/cluster-name=cilium                                                                                 ready
                                                           k8s:alpha.eksctl.io/instance-id=i-0a5cb1965fcdc953b
                                                           k8s:alpha.eksctl.io/nodegroup-name=workers
                                                           k8s:cilium.io/ci-node=k8s1
                                                           k8s:node-lifecycle=on-demand
                                                           k8s:node.kubernetes.io/instance-type=m5dn.xlarge
                                                           k8s:topology.kubernetes.io/region=us-east-1
                                                           k8s:topology.kubernetes.io/zone=us-east-1a
                                                           reserved:host
1747       Disabled           Disabled          2188       k8s:eks.amazonaws.com/component=coredns                                                                  10.0.116.73    ready
                                                           k8s:io.cilium.k8s.policy.cluster=default
                                                           k8s:io.cilium.k8s.policy.serviceaccount=coredns
                                                           k8s:io.kubernetes.pod.namespace=kube-system
                                                           k8s:k8s-app=kube-dns
1984       Disabled           Disabled          4          reserved:health                                                                                          10.0.114.226   ready
3318       Disabled           Disabled          32938      k8s:io.cilium.k8s.policy.cluster=default                                                                 10.0.126.78    ready
                                                           k8s:io.cilium.k8s.policy.serviceaccount=default
                                                           k8s:io.kubernetes.pod.namespace=202101071143k8sdatapathconfigtransparentencryptiondirectrouting
                                                           k8s:zgroup=testDSClient

Monitor output for the server on that node:

kubectl exec -it -n kube-system cilium-btw9d -- cilium monitor --related-to=183 -t drop
Listening for events on 4 CPUs with 64x4096 of shared memory
Press Ctrl-C to quit
level=info msg="Initializing dissection cache..." subsys=monitor
xx drop (Policy denied) flow 0x244ee541 to endpoint 183, identity remote-node->46812: 10.0.183.243:49881 -> 10.0.99.221:80 tcp SYN
xx drop (Policy denied) flow 0x492d83db to endpoint 183, identity remote-node->46812: 10.0.183.243:18866 -> 10.0.99.221:80 tcp SYN
xx drop (Policy denied) flow 0xa24b89b0 to endpoint 183, identity remote-node->46812: 10.0.183.243:30954 -> 10.0.99.221:80 tcp SYN
xx drop (Policy denied) flow 0x374f52b8 to endpoint 183, identity remote-node->46812: 10.0.183.243:15740 -> 10.0.99.221:80 tcp SYN
xx drop (Policy denied) flow 0xd0344b3a to endpoint 183, identity remote-node->46812: 10.0.183.243:59338 -> 10.0.99.221:80 tcp SYN
xx drop (Policy denied) flow 0xb865a38a to endpoint 183, identity remote-node->46812: 10.0.183.243:65317 -> 10.0.99.221:80 tcp SYN
xx drop (Policy denied) flow 0x374f52b8 to endpoint 183, identity remote-node->46812: 10.0.183.243:15740 -> 10.0.99.221:80 tcp SYN
xx drop (Policy denied) flow 0xa24b89b0 to endpoint 183, identity remote-node->46812: 10.0.183.243:30954 -> 10.0.99.221:80 tcp SYN
xx drop (Policy denied) flow 0x492d83db to endpoint 183, identity remote-node->46812: 10.0.183.243:18866 -> 10.0.99.221:80 tcp SYN
xx drop (Policy denied) flow 0x244ee541 to endpoint 183, identity remote-node->46812: 10.0.183.243:49881 -> 10.0.99.221:80 tcp SYN
xx drop (Policy denied) flow 0xb865a38a to endpoint 183, identity remote-node->46812: 10.0.183.243:65317 -> 10.0.99.221:80 tcp SYN
xx drop (Policy denied) flow 0xd0344b3a to endpoint 183, identity remote-node->46812: 10.0.183.243:59338 -> 10.0.99.221:80 tcp SYN
xx drop (Policy denied) flow 0xdb8dce59 to endpoint 183, identity remote-node->46812: 10.0.153.224:30037 -> 10.0.99.221:80 tcp SYN
xx drop (Policy denied) flow 0x8e827543 to endpoint 183, identity remote-node->46812: 10.0.153.224:1203 -> 10.0.99.221:80 tcp SYN
xx drop (Policy denied) flow 0xc63fc0da to endpoint 183, identity remote-node->46812: 10.0.153.224:6665 -> 10.0.99.221:80 tcp SYN
xx drop (Policy denied) flow 0xdb8dce59 to endpoint 183, identity remote-node->46812: 10.0.153.224:30037 -> 10.0.99.221:80 tcp SYN
xx drop (Policy denied) flow 0x8e827543 to endpoint 183, identity remote-node->46812: 10.0.153.224:1203 -> 10.0.99.221:80 tcp SYN
xx drop (Policy denied) flow 0xc63fc0da to endpoint 183, identity remote-node->46812: 10.0.153.224:6665 -> 10.0.99.221:80 tcp SYN
xx drop (Policy denied) flow 0x930d85d to endpoint 183, identity world->46812: 10.0.108.136:39974 -> 10.0.99.221:80 tcp SYN
xx drop (Policy denied) flow 0xa24b89b0 to endpoint 183, identity remote-node->46812: 10.0.183.243:30954 -> 10.0.99.221:80 tcp SYN
xx drop (Policy denied) flow 0x374f52b8 to endpoint 183, identity remote-node->46812: 10.0.183.243:15740 -> 10.0.99.221:80 tcp SYN
xx drop (Policy denied) flow 0x492d83db to endpoint 183, identity remote-node->46812: 10.0.183.243:18866 -> 10.0.99.221:80 tcp SYN
xx drop (Policy denied) flow 0x244ee541 to endpoint 183, identity remote-node->46812: 10.0.183.243:49881 -> 10.0.99.221:80 tcp SYN
xx drop (Policy denied) flow 0xd0344b3a to endpoint 183, identity remote-node->46812: 10.0.183.243:59338 -> 10.0.99.221:80 tcp SYN
xx drop (Policy denied) flow 0xb865a38a to endpoint 183, identity remote-node->46812: 10.0.183.243:65317 -> 10.0.99.221:80 tcp SYN
^C
Received an interrupt, disconnecting from monitor...

Policy entries for that endpoint (ingress only allows reserved:host and the testDSClient pod labels, so traffic arriving with the remote-node identity is dropped):

kubectl exec -it -n kube-system cilium-btw9d -- cilium bpf policy get 183
POLICY   DIRECTION   LABELS (source:key[=value])                                                                       PORT/PROTO   PROXY PORT   BYTES   PACKETS
Allow    Ingress     reserved:host                                                                                     ANY          NONE         13500   150
Allow    Ingress     k8s:io.cilium.k8s.policy.cluster=default                                                          ANY          NONE         0       0
                     k8s:io.cilium.k8s.policy.serviceaccount=default
                     k8s:io.kubernetes.pod.namespace=202101071143k8sdatapathconfigtransparentencryptiondirectrouting
                     k8s:zgroup=testDSClient
Allow    Egress      reserved:unknown                                                                                  ANY          NONE         28440   150

pchaigno (Member) commented Jan 7, 2021

Is 10.0.183.243 actually a remote node IP, or is it the IP of the client pod? If it's the IP of the remote node, that shouldn't happen, because masquerading is disabled, right? Happy to help debug this via Slack if that helps.
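
A quick way to check which it is, assuming cluster access (a sketch; note that on EKS with ENI IPAM the address could also be one of a node's secondary ENI IPs, which kubectl will not list):

# Is 10.0.183.243 a pod IP anywhere in the cluster?
kubectl get pods --all-namespaces -o wide | grep 10.0.183.243

# Or a node's primary IP?
kubectl get nodes -o wide | grep 10.0.183.243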

github-actions bot commented

This issue has been automatically marked as stale because it has not
had recent activity. It will be closed if no further activity occurs.

github-actions bot added the stale label Jul 16, 2023
github-actions bot commented

This issue has not seen any activity since it was marked stale.
Closing.

github-actions bot closed this as not planned (won't fix, can't repro, duplicate, stale) Jul 31, 2023