CI: K8sServicesTest Checks service across nodes Tests NodePort BPF Tests with direct routing Tests NodePort with Maglev Tests NodePort #13839

Closed
Tracked by #13138
aditighag opened this issue Nov 2, 2020 · 8 comments
Assignees
Labels
area/CI: Continuous Integration testing issue or flake
ci/flake: This is a known failure that occurs in the tree. Please investigate me!
sig/datapath: Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
Projects

Comments

@aditighag
Member

aditighag commented Nov 2, 2020

/home/jenkins/workspace/Cilium-PR-K8s-1.12-net-next/src/github.com/cilium/cilium/test/ginkgo-ext/scopes.go:514
Request from testclient-sj9l5 pod to service http://10.107.142.225:10080 failed
Expected command: kubectl exec -n default testclient-sj9l5 -- /bin/bash -c 'fails=""; id=$RANDOM; for i in $(seq 1 10); do if curl --path-as-is -s -D /dev/stderr --fail --connect-timeout 5 --max-time 20 http://10.107.142.225:10080 -H "User-Agent: cilium-test-$id/$i"; then echo "Test round $id/$i exit code: $?"; else fails=$fails:$id/$i=$?; fi; done; if [ -n "$fails" ]; then echo "failed: $fails"; fi; cnt="${fails//[^:]}"; if [ ${#cnt} -gt 0 ]; then exit 42; fi' 
To succeed, but it failed:
Exitcode: 42 
Err: exit status 42
Stdout:

99c5ba2b_K8sServicesTest_Checks_service_across_nodes_Tests_NodePort_BPF_Tests_with_direct_routing_Tests_NodePort_with_Maglev_Tests_NodePort.zip

Hit in #13812

@aditighag aditighag added the area/CI and ci/flake labels Nov 2, 2020
@brb
Member

brb commented Nov 3, 2020

Just to add more context: the 5/10 request from a client pod to the svc failed with curl exit code 28 (operation timed out). This was testing bpf_sock, so it has nothing to do with Maglev.
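
For reference, the exit status 42 in the report comes from the bookkeeping at the end of the test command: each failed curl round appends `:$id/$i=$?` to `fails`, and the number of colons left after stripping every other character is the number of failed rounds. A minimal bash sketch of that counting follows. The helper names `count_fails` and `fail_codes` are made up for illustration and are not part of the Cilium test suite.

```shell
#!/usr/bin/env bash

# Hypothetical helper: count the failed rounds recorded in a "fails" string
# of the form ":id/round=exitcode:id/round=exitcode...".
count_fails() {
  local fails="$1"
  # Delete every character that is not a colon; one colon remains per failure.
  local cnt="${fails//[^:]}"
  echo "${#cnt}"
}

# Hypothetical helper: print the curl exit code of each recorded failure,
# one per line.
fail_codes() {
  local fails="$1" entry
  local -a entries
  # Split on ':'; the leading colon yields an empty first field, skipped below.
  IFS=':' read -r -a entries <<< "$fails"
  for entry in "${entries[@]}"; do
    if [ -n "$entry" ]; then
      # Keep only what follows the last '=' (the exit code).
      echo "${entry##*=}"
    fi
  done
}
```

With a string such as `:12345/5=28`, `count_fails` reports one failed round and `fail_codes` prints `28`, the curl exit code for a timeout, which is the case described above. Any non-zero count is what makes the test command exit with 42.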

@sayboras
Member

Found in #14113 as well.

@stale

stale bot commented Feb 22, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

@stale stale bot added the stale label Feb 22, 2021
@qmonnet qmonnet removed the stale label Apr 30, 2021
gandro added a commit to gandro/cilium that referenced this issue May 31, 2021
This increases the curl connection timeout from 5 to 15 seconds to avoid
issues with IPCache propagation delay. On Cilium master and 1.10, it
seems that IPCache updates in CI can take up to 4-8 seconds.

CI flakes likely caused by the increased IPCache propagation delay:

 - cilium#13839
 - cilium#14959
 - cilium#15103
 - cilium#16237

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
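
As an illustration of the change described in that commit message, the sketch below builds the same curl command line as the failing test, with the connection timeout as a parameter. The function name `curl_test_cmd` and its defaults are assumptions for this example only. The actual change lives in the Cilium test helpers and may be structured differently.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the curl command line for one test request.
# Arguments:
#   $1  target URL
#   $2  connection timeout in seconds (default 15, the value named in the commit)
#   $3  overall time limit in seconds (default 20, as in the failing command)
curl_test_cmd() {
  local url="$1"
  local connect_timeout="${2:-15}"
  local max_time="${3:-20}"
  printf 'curl --path-as-is -s -D /dev/stderr --fail --connect-timeout %s --max-time %s %s\n' \
    "$connect_timeout" "$max_time" "$url"
}
```

Calling `curl_test_cmd http://10.107.142.225:10080 5` reproduces the timeout from the original report, while omitting the second argument yields the 15 second value, which leaves headroom above the 4-8 second IPCache delay mentioned in the commit.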
@stale

stale bot commented Jul 1, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

@stale stale bot added the stale label Jul 1, 2021
@stale

stale bot commented Jul 21, 2021

This issue has not seen any activity since it was marked stale. Closing.

@stale stale bot closed this as completed Jul 21, 2021
@jrajahalme jrajahalme reopened this Jul 29, 2021
@stale stale bot removed the stale label Jul 29, 2021
@joestringer
Member

From community meeting: Sounds like when this fails, maybe local requests work but requests to remote backends fail.

@joestringer joestringer added this to To quarantine in 1.11 CI via automation Dec 1, 2021
@joestringer joestringer moved this from To quarantine to Investigating in 1.11 CI Dec 1, 2021
@brb brb added the sig/datapath label Feb 17, 2022
@brb
Member

brb commented Feb 17, 2022

The recent K8sServices refactoring might have resolved the flake. Closing for now.

@brb brb closed this as completed Feb 17, 2022
1.11 CI automation moved this from Investigating to Evaluate to exit quarantine Feb 17, 2022

7 participants