CI: pod-to-ingress-service-allow-ingress-identity/pod-to-ingress-service: exit code 22 #29468

Open
joestringer opened this issue Nov 28, 2023 · 5 comments
Labels
area/CI Continuous Integration testing issue or flake ci/flake This is a known failure that occurs in the tree. Please investigate me!

Comments

@joestringer
Member

CI failure

Failure: https://github.com/cilium/cilium/actions/runs/7025266863/job/19115742712

Example failure:

[=] Test [pod-to-ingress-service-allow-ingress-identity]
.
  ℹ️  📜 Applying CiliumNetworkPolicy 'all-ingress-deny' to namespace 'cilium-test'..
  ℹ️  📜 Applying CiliumNetworkPolicy 'allow-from-cilium-ingress' to namespace 'cilium-test'..
  [-] Scenario [pod-to-ingress-service-allow-ingress-identity/pod-to-ingress-service]
  [.] Action [pod-to-ingress-service-allow-ingress-identity/pod-to-ingress-service/curl-0: cilium-test/client-75bff5f5b9-7mv47 (10.244.1.243) -> cilium-test/cilium-ingress-ingress-service (cilium-ingress-ingress-service.cilium-test:80)]
  ❌ command "curl -w %{local_ip}:%{local_port} -> %{remote_ip}:%{remote_port} = %{response_code} --silent --fail --show-error --output /dev/null --connect-timeout 2 --max-time 10 http://cilium-ingress-ingress-service.cilium-test:80" failed: command terminated with exit code 22

...

📋 Test Report
❌ 2/60 tests failed (4/591 actions), 4 tests skipped, 0 scenarios skipped:
Test [pod-to-ingress-service]:
connectivity test failed: 2 tests failed
  ❌ pod-to-ingress-service/pod-to-ingress-service/curl-0: cilium-test/client-75bff5f5b9-7mv47 (10.244.1.243) -> cilium-test/cilium-ingress-ingress-service (cilium-ingress-ingress-service.cilium-test:80)
  ❌ pod-to-ingress-service/pod-to-ingress-service/curl-1: cilium-test/client2-88575dbb7-fvpb4 (10.244.1.77) -> cilium-test/cilium-ingress-ingress-service (cilium-ingress-ingress-service.cilium-test:80)
Test [pod-to-ingress-service-allow-ingress-identity]:
  ❌ pod-to-ingress-service-allow-ingress-identity/pod-to-ingress-service/curl-0: cilium-test/client-75bff5f5b9-7mv47 (10.244.1.243) -> cilium-test/cilium-ingress-ingress-service (cilium-ingress-ingress-service.cilium-test:80)
  ❌ pod-to-ingress-service-allow-ingress-identity/pod-to-ingress-service/curl-1: cilium-test/client2-88575dbb7-fvpb4 (10.244.1.77) -> cilium-test/cilium-ingress-ingress-service (cilium-ingress-ingress-service.cilium-test:80)
Error: Process completed with exit code 1.

man curl describes exit code 22 as:

       22     HTTP page not retrieved. The requested url was not found or returned another error with the HTTP error code being 400 or above. This return code only appears if -f, --fail is used.
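To make the man page text concrete, here is a minimal sketch of the mapping it describes, assuming a response was actually received. The function name `curl_fail_exit_code` is hypothetical and only models the documented `--fail` behaviour; it does not call curl. The practical takeaway is that exit code 22 means the request got an HTTP answer with status 400 or above, rather than timing out or being dropped. The quoted log does not show which status code came back.

```python
def curl_fail_exit_code(http_status: int) -> int:
    """Model of curl's exit code with --fail once an HTTP response arrived.

    Per the man page excerpt above: status 400 or above yields exit code 22.
    Anything below 400 is treated here as success (exit code 0).
    """
    return 22 if http_status >= 400 else 0


if __name__ == "__main__":
    # Illustrative status codes only; none of these are taken from the CI logs.
    for status in (200, 301, 403, 404, 503):
        print(status, "->", curl_fail_exit_code(status))
```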

I have a copy of the zip files that I can share, but apparently they're 60MB per failure these days 🙈 .

@joestringer joestringer added area/CI Continuous Integration testing issue or flake ci/flake This is a known failure that occurs in the tree. Please investigate me! labels Nov 28, 2023
@joestringer
Member Author

@sayboras does this look familiar to you at all? I saw that you had written these tests originally.

@sayboras
Member

The tests were added a while back, but were not enabled in any of the GHA CI jobs until recently, so I am keeping an eye out for any flakes. Let me take a closer look, thanks for the ping.

Related PR: #29130

@pippolo84
Member

pippolo84 commented Feb 1, 2024

Another hit here

[=] Test [outside-to-ingress-service] [63/69]
.
  [-] Scenario [outside-to-ingress-service/outside-to-ingress-service]
  [.] Action [outside-to-ingress-service/outside-to-ingress-service/curl-ingress-service-0: cilium-test/host-netns-non-cilium-gr224 (172.18.0.5) -> cilium-test/cilium-ingress-ingress-service (cilium-ingress-ingress-service.cilium-test:80)]
  ❌ command "curl -w %{local_ip}:%{local_port} -> %{remote_ip}:%{remote_port} = %{response_code} --silent --fail --show-error --output /dev/null --connect-timeout 2 --max-time 30 http://172.18.0.4:31000/" failed: error with exec request (pod=cilium-test/host-netns-non-cilium-gr224, container=host-netns-non-cilium): command terminated with exit code 22

Logs: logs_2036808.zip
Sysdump: too big to be uploaded here, I'll keep a copy

@sayboras
Member

There is no reliable way to check whether a new policy has been applied in Envoy, and this causes the flake in our CI for L7/Ingress-related policies. One point worth mentioning is that Envoy updates are not incremental (i.e. each update carries the full state of the world), so an update can sometimes take longer even for a single policy.
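Since there is no signal to wait on, one possible mitigation is to retry the probe for a bounded time while it keeps returning curl's exit code 22. The sketch below is only an illustration under that assumption; it is not how the connectivity tests are implemented. The names `retry_until_policy_applied` and `probe` and the timeout values are all hypothetical. `probe` stands in for whatever runs the curl command and returns its exit code.

```python
import time
from typing import Callable

CURL_HTTP_ERROR = 22  # curl --fail exit code for HTTP status 400 or above


def retry_until_policy_applied(
    probe: Callable[[], int],
    timeout_s: float = 30.0,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Re-run `probe` while it returns exit code 22, up to `timeout_s`.

    Any other exit code (success or a different failure) is returned at once,
    so only the "HTTP error" case is treated as possibly transient.
    """
    deadline = clock() + timeout_s
    while True:
        code = probe()
        if code != CURL_HTTP_ERROR or clock() >= deadline:
            return code
        sleep(interval_s)


if __name__ == "__main__":
    # Fake probe: fails twice with 22, then succeeds, as if the policy landed late.
    results = iter([22, 22, 0])
    final = retry_until_policy_applied(lambda: next(results), sleep=lambda _s: None)
    print("final exit code:", final)
```

The trade-off is that a retry window can hide a real regression that happens to recover within the timeout, so the window should stay short and retried attempts are worth logging.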

@aditighag
Member

Hit on - https://github.com/cilium/cilium/actions/runs/8287906154/job/22681325788

❌ 1/66 tests failed (1/980 actions), 9 tests skipped, 0 scenarios skipped:
Test [pod-to-ingress-service]:
  ❌ pod-to-ingress-service/pod-to-ingress-service/curl-2: cilium-test/client3-868f7b8f6b-rgqld (10.244.3.75) -> cilium-test/cilium-ingress-ingress-service (cilium-ingress-ingress-service.cilium-test:80)
