CI: Cilium L4LB XDP - LoadBalancing test - Failed to connect to 10.0.0.2 port 80 #31167

Closed
giorio94 opened this issue Mar 5, 2024 · 5 comments
Labels
area/CI Continuous Integration testing issue or flake ci/flake This is a known failure that occurs in the tree. Please investigate me!

Comments

@giorio94
Member

giorio94 commented Mar 5, 2024

CI failure

Hit twice on #31013
Link: https://github.com/cilium/cilium/actions/runs/8152624430/job/22299599878

+[14:49:38] LB_VIP=10.0.0.2
++[14:49:38] docker inspect nginx -f '{{ .State.Pid }}'
+[14:49:38] nsenter -t 3500 -n /bin/sh -c 'ip a a dev eth0 10.0.0.2/32'
+[14:49:38] docker exec -t lb-node docker exec -t cilium-lb cilium-dbg service update --id 1 --frontend 10.0.0.2:80 --backends 172.18.0.3:80 --k8s-load-balancer
Deprecation warning: --id parameter will change from int to string in v1.14

Creating new service with id '1'

Added service with 1 backends

++[14:49:38] docker exec lb-node ip -o -4 a s eth0
++[14:49:38] awk '{print $4}'
++[14:49:38] cut -d/ -f1
+[14:49:39] LB_NODE_IP=172.18.0.2
+[14:49:39] ip r a 10.0.0.2/32 via 172.18.0.2
++[14:49:39] seq 1 10
+[14:49:39] for i in $(seq 1 10)
+[14:49:39] curl -o /dev/null 10.0.0.2:80
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:01 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:02 --:--:--     0
.....
  0     0    0     0    0     0      0      0 --:--:--  0:02:14 --:--:--     0
curl: (28) Failed to connect to 10.0.0.2 port 80 after 134353 ms: Connection timed out
+[14:51:53] echo 'Failed 1'
Failed 1
+[14:51:53] exit -1
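
Side note on the log above: the first `curl` ran for about 134 seconds before it reported the connect timeout. A minimal sketch of the same request loop with a bounded connect timeout follows, so that a dead VIP surfaces within seconds. The function name `probe_vip`, its defaults, and the timeout values are hypothetical and are not taken from the actual test script.

```shell
#!/usr/bin/env bash
# Sketch only: probe_vip and its defaults are hypothetical,
# not part of the Cilium L4LB test script.

# probe_vip URL [ATTEMPTS] [CONNECT_TIMEOUT_SECONDS]
# Returns 0 if every attempt succeeds, 1 on the first failure.
probe_vip() {
    local url="$1" attempts="${2:-10}" connect_timeout="${3:-5}"
    local i
    for i in $(seq 1 "$attempts"); do
        # --connect-timeout bounds the connection phase;
        # --max-time bounds the whole transfer.
        if ! curl --silent --show-error --output /dev/null \
                --connect-timeout "$connect_timeout" --max-time 10 "$url"; then
            echo "Failed $i"
            return 1
        fi
    done
    echo "All $attempts requests succeeded"
}

# Example invocation (needs the LB setup shown in the log):
# probe_vip 10.0.0.2:80
```

This only shortens the time to failure. It does not address why the VIP was unreachable.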

Looking at previously failed attempts, there are a few other recent occurrences of the same error:

@giorio94 giorio94 added area/CI Continuous Integration testing issue or flake ci/flake This is a known failure that occurs in the tree. Please investigate me! labels Mar 5, 2024
@danehans
Contributor

danehans commented Mar 5, 2024

@giorio94
Member Author

giorio94 commented Mar 6, 2024

Rebasing onto main seems to have helped in my case (the failure looked pretty consistent previously), so maybe the branch was missing some fixes that landed recently.

@tommyp1ckles
Contributor

@danehans did rebasing help in your case as well? The other two hits on this one appear to have been resolved by a rebase (I'm not sure which commit fixed it, however).

@danehans
Contributor

danehans commented Mar 9, 2024

@tommyp1ckles rebasing fixed this issue for #30921.

@tommyp1ckles
Contributor

@danehans Thanks, I'm going to close for now unless someone else hits it.

3 participants