CI: K8sServicesTest Checks service across nodes Tests NodePort BPF BPF NAT engine handles unknown protocol packets Should not drop SCTP packets #16838

Closed
jrajahalme opened this issue Jul 9, 2021 · 1 comment · Fixed by #16895
Labels: area/CI, ci/flake

jrajahalme commented Jul 9, 2021

In https://jenkins.cilium.io/job/Cilium-PR-K8s-1.20-kernel-4.19/918/testReport/junit/Suite-k8s-1/20/K8sServicesTest_Checks_service_across_nodes_Tests_NodePort_BPF_BPF_NAT_engine_handles_unknown_protocol_packets_Should_not_drop_SCTP_packets/:

/home/jenkins/workspace/Cilium-PR-K8s-1.20-kernel-4.19/src/github.com/cilium/cilium/test/ginkgo-ext/scopes.go:518
Expected
    <int>: 0
to be >
    <int>: 0
/home/jenkins/workspace/Cilium-PR-K8s-1.20-kernel-4.19/src/github.com/cilium/cilium/test/k8sT/Services.go:1393

The problem here is that iperf3 transfers no data (0 bytes), so the test's requirement that the transferred byte count be greater than 0 fails.
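
The failure output above is Gomega's BeNumerically(">", ...) matcher message. As a minimal sketch of what the check around Services.go:1393 presumably amounts to (the surrounding parsing and variable names are assumptions, not copied from the test):

```go
package services_test

import (
	"testing"

	. "github.com/onsi/gomega"
)

// Hypothetical reconstruction of the failing assertion: the byte count parsed
// from the iperf3 client output must be strictly greater than zero. With
// 0 bytes transferred, BeNumerically produces exactly the message quoted
// above: "Expected <int>: 0 to be > <int>: 0".
func TestSCTPTransferredBytes(t *testing.T) {
	g := NewWithT(t)
	transferred := 0 // value reported by iperf3 in this flake
	g.Expect(transferred).To(BeNumerically(">", 0),
		"iperf3 transferred no data over SCTP")
}
```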

Test output (stderr) shows that iperf3 fails with an error:

iperf3: error - control socket has closed unexpectedly
command terminated with exit code 1

The iperf3 pod runs in host-networking mode, if that makes a difference.
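
For reference, the test drives an iperf3 client over SCTP toward the remote node, roughly along these lines (the flags, duration, and default port are assumptions for illustration, not taken from Services.go; only the target IP appears in the logs):

```go
package main

import (
	"fmt"
	"os/exec"
)

func main() {
	// Hypothetical approximation of the client command the test wraps:
	// an SCTP iperf3 run whose reported byte count is later asserted to be > 0.
	out, err := exec.Command("iperf3", "-c", "192.168.36.11", "--sctp", "-t", "2").CombinedOutput()
	fmt.Println(string(out))
	if err != nil {
		// This is the path hit in the flake: "control socket has closed
		// unexpectedly", exit code 1, and no bytes transferred.
		fmt.Println("iperf3 failed:", err)
	}
}
```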

Host endpoint has no policy enforcement, but has not seen any traffic either:

/sys/fs/bpf/tc/globals/cilium_policy_01097:

POLICY   DIRECTION   IDENTITY   PORT/PROTO   PROXY PORT   BYTES   PACKETS   
Allow    Ingress     0          ANY          NONE         0       0         
Allow    Egress      0          ANY          NONE         0       0         

e3cb6a69_K8sServicesTest_Checks_service_across_nodes_Tests_NodePort_BPF_BPF_NAT_engine_handles_unknown_protocol_packets_Should_not_drop_SCTP_packets (1).zip

pchaigno commented Jul 9, 2021

Initial Analysis

The source pod, iperf3-server-9pf5x, is on k8s2 with IP address 192.168.36.12. It is trying to reach 192.168.36.11, so we don't have any visibility into that traffic: BPF programs are attached to the native devices, but monitor aggregation is enabled, so there are no {to,from}-network traces.

According to test-output.log, the iperf3 client failed with:

iperf3: error - control socket has closed unexpectedly

The server failed with:

iperf3: error - unable to read from stream socket: Bad file descriptor

There are many reports of this error on the iperf repository, so it is probably fairly generic.

CI Dashboard Analysis

Via the CI Dashboard, I found one other case on master, dating from June 30th:
https://jenkins.cilium.io/job/cilium-master-k8s-1.16-kernel-net-next/549/testReport/junit/Suite-k8s-1/16/K8sServicesTest_Checks_service_across_nodes_Tests_NodePort_BPF_BPF_NAT_engine_handles_unknown_protocol_packets_Should_not_drop_SCTP_packets/
20394869_K8sServicesTest_Checks_service_across_nodes_Tests_NodePort_BPF_BPF_NAT_engine_handles_unknown_protocol_packets_Should_not_drop_SCTP_packets.zip

Next Steps

We are missing the SCTP ss output needed to check the listening/established connections. I'll send a PR to fix that. We can also temporarily disable monitor aggregation for that test.
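
As a sketch of the first item, something along these lines could be added to the failure dump so the SCTP socket state gets captured (the helper name is hypothetical and the exact ss flags are an assumption, not the actual PR):

```go
package main

import (
	"fmt"
	"os/exec"
)

// dumpSCTPSockets is a hypothetical helper for the CI failure dump: it captures
// listening and established SCTP sockets on the node, so flakes like this one
// show whether the iperf3 server was actually listening.
func dumpSCTPSockets() (string, error) {
	// ss --sctp limits the output to SCTP sockets; -a includes listening
	// sockets, -n skips name resolution.
	out, err := exec.Command("ss", "--sctp", "-a", "-n").CombinedOutput()
	return string(out), err
}

func main() {
	out, err := dumpSCTPSockets()
	if err != nil {
		fmt.Println("ss failed:", err)
	}
	fmt.Print(out)
}
```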
