Unskip ETP=local+terminating endpoints #4253

Merged: 1 commit, Apr 30, 2024

tssurya
Member

@tssurya tssurya commented Apr 4, 2024

UPDATE:
Original fix here went into #4256
I am converting this PR to fix: #4253 (comment)

OUTDATED INFO ON OLD PR:
I think we should definitely spend some time fixing #4229

cc @jluhrsen this might be why you saw some lanes not running tests and finishing in 5mins?

- What this PR does and why is it needed
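An accidental extra | added while building the skip list turns the separator into ||. The empty alternative between the two pipes matches the empty string, so the skip pattern matches every test name and all tests get skipped :/
Not on all lanes, I think; only the shard, local, v6 ones: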

  • ovn-ci / e2e (shard-conformance, HA, local, ipv6, snatGW, 1br, ic-disabled) (pull_request)
  • ovn-ci / e2e (shard-conformance, noHA, local, dualstack, snatGW, 1br, ic-single-node-zones)
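
The failure mode described above (an empty alternative in the skip pattern matching every test name) could be caught before the pattern is handed to ginkgo. A minimal sketch, assuming bash; `has_empty_alternative` is a hypothetical helper that does not exist in the repository's scripts, and it only inspects literal `|` characters rather than parsing the regex, so an escaped `\|` at the start or end of the pattern would be flagged too:

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of the ovn-kubernetes scripts): returns 0 when
# a '|'-separated skip pattern contains an empty alternative, i.e. the pattern
# is empty, starts or ends with '|', or contains '||'.
has_empty_alternative() {
  local pattern="$1"
  case "$pattern" in
    ""|"|"*|*"|"|*"||"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example pattern with the accidental double pipe.
SKIPPED_TESTS='\[Serial\]||kube-proxy'

if has_empty_alternative "$SKIPPED_TESTS"; then
  echo "skip pattern has an empty alternative; it would match every test"
fi
```

A check like this would only report the problem; whether a CI script should abort or repair the pattern at that point is a separate decision.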

- Special notes for reviewers

- How to verify it

- Description for the changelog

@tssurya tssurya changed the title from "Remove extra || in SKIP formatting; Spme lanes were not running correctly and skipping ALL tests :/" to "Remove extra || in SKIP formatting; Some lanes were not running correctly and skipping ALL tests :/" Apr 4, 2024

netlify bot commented Apr 4, 2024

Deploy Preview for subtle-torrone-bb0c84 ready!

Name Link
🔨 Latest commit b7d827a
🔍 Latest deploy log https://app.netlify.com/sites/subtle-torrone-bb0c84/deploys/662a7ddb34e2a10008d2e33b
😎 Deploy Preview https://deploy-preview-4253--subtle-torrone-bb0c84.netlify.app

@tssurya tssurya added the kind/bug, area/e2e-testing, and area/ci labels Apr 4, 2024
@tssurya
Member Author

tssurya commented Apr 4, 2024

2024-04-04T09:20:58.0658246Z   [FAIL] [sig-network] [Feature:IPv6DualStack] Granular Checks: Services Secondary IP Family [LinuxOnly] [It] should function for node-Service: udp
2024-04-04T09:20:58.0660127Z   test/e2e/framework/network/utils.go:844
2024-04-04T09:20:58.0668448Z   [FAIL] [sig-network] [Feature:IPv6DualStack] Granular Checks: Services Secondary IP Family [LinuxOnly] [It] should update endpoints: udp
2024-04-04T09:20:58.0670475Z   test/e2e/framework/network/utils.go:844
2024-04-04T09:20:58.0673858Z   [FAIL] [sig-network] [Feature:IPv6DualStack] Granular Checks: Services Secondary IP Family [LinuxOnly] [It] should function for pod-Service: sctp [Feature:SCTPConnectivity]
2024-04-04T09:20:58.0675988Z   test/e2e/framework/network/utils.go:844
2024-04-04T09:20:58.0678092Z   [FAIL] [sig-network] [Feature:IPv6DualStack] Granular Checks: Services Secondary IP Family [LinuxOnly] [It] should function for node-Service: http
2024-04-04T09:20:58.0679908Z   test/e2e/framework/network/utils.go:844
2024-04-04T09:20:58.0683108Z   [FAIL] [sig-network] [Feature:IPv6DualStack] Granular Checks: Services Secondary IP Family [LinuxOnly] [It] should be able to handle large requests: http
2024-04-04T09:20:58.0684994Z   test/e2e/framework/network/utils.go:844
2024-04-04T09:20:58.0687307Z   [FAIL] [sig-network] [Feature:IPv6DualStack] Granular Checks: Services Secondary IP Family [LinuxOnly] [It] should function for service endpoints using hostNetwork
2024-04-04T09:20:58.0689246Z   test/e2e/framework/network/utils.go:844
2024-04-04T09:20:58.0691596Z   [FAIL] [sig-network] [Feature:IPv6DualStack] Granular Checks: Services Secondary IP Family [LinuxOnly] [It] should function for client IP based session affinity: udp [LinuxOnly]
2024-04-04T09:20:58.0693909Z   test/e2e/framework/network/utils.go:844
2024-04-04T09:20:58.0695970Z   [FAIL] [sig-network] [Feature:IPv6DualStack] Granular Checks: Services Secondary IP Family [LinuxOnly] [It] should function for pod-Service: udp
2024-04-04T09:20:58.0697749Z   test/e2e/framework/network/utils.go:844
2024-04-04T09:20:58.0699723Z   [FAIL] [sig-network] [Feature:IPv6DualStack] Granular Checks: Services Secondary IP Family [LinuxOnly] [It] should update endpoints: http
2024-04-04T09:20:58.0701457Z   test/e2e/framework/network/utils.go:844
2024-04-04T09:20:58.0703789Z   [FAIL] [sig-network] [Feature:IPv6DualStack] Granular Checks: Services Secondary IP Family [LinuxOnly] [It] should function for endpoint-Service: http
2024-04-04T09:20:58.0705625Z   test/e2e/framework/network/utils.go:844
2024-04-04T09:20:58.0707708Z   [FAIL] [sig-network] [Feature:IPv6DualStack] Granular Checks: Services Secondary IP Family [LinuxOnly] [It] should function for endpoint-Service: udp
2024-04-04T09:20:58.0709529Z   test/e2e/framework/network/utils.go:844
2024-04-04T09:20:58.0711551Z   [FAIL] [sig-network] [Feature:IPv6DualStack] Granular Checks: Services Secondary IP Family [LinuxOnly] [It] should function for pod-Service: http
2024-04-04T09:20:58.0713346Z   test/e2e/framework/network/utils.go:844
2024-04-04T09:20:58.0716177Z   [FAIL] [sig-network] [Feature:IPv6DualStack] Granular Checks: Services Secondary IP Family [LinuxOnly] [It] should function for client IP based session affinity: http [LinuxOnly]
2024-04-04T09:20:58.0718296Z   test/e2e/framework/network/utils.go:844
2024-04-04T09:20:58.0720396Z   [FAIL] [sig-network] [Feature:IPv6DualStack] Granular Checks: Services Secondary IP Family [LinuxOnly] [It] should be able to handle large requests: udp
2024-04-04T09:20:58.0722382Z   test/e2e/framework/network/utils.go:844

:P seems like we have regressed?

@@ -125,9 +125,6 @@ if [ "$DUALSTACK_CONVERSION" == true ]; then
fi

if [ "$OVN_GATEWAY_MODE" == "local" ]; then
if [ "$SKIPPED_TESTS" != "" ]; then
SKIPPED_TESTS+="|"
fi
SKIPPED_TESTS+="should fallback to local terminating endpoints when there are no ready endpoints with externalTrafficPolicy=Local"
Contributor

the other tests seem to have a newline that starts them, then groomTestList substitutes the newline for a |

Doesn't this need a newline as well?
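
The newline-then-join convention mentioned here can be sketched as follows. This is only an illustration of the idea, not the actual `groomTestList` from the repository, whose implementation is not shown in this thread; `join_skip_list` is a hypothetical name, and replacing spaces with `\s` is an assumption inferred from the ginkgo command line quoted further down in this conversation.

```shell
#!/usr/bin/env bash

# Hypothetical stand-in for the repository's groomTestList. Takes a
# newline-separated list of test names, drops blank lines, replaces spaces
# with '\s', and joins the entries with '|' so that no empty alternative
# can end up in the result.
join_skip_list() {
  local line out="" sep='\s'
  while IFS= read -r line; do
    [ -z "$line" ] && continue      # skip blank lines, e.g. a leading newline
    line="${line// /$sep}"          # "a b" becomes "a\sb"
    out+="${out:+|}${line}"         # add '|' only between existing entries
  done <<< "$1"
  printf '%s\n' "$out"
}

# Each entry starts on its own line, as the other skipped tests do.
SKIPPED_TESTS="
\[Serial\]
should fallback to local terminating endpoints"

join_skip_list "$SKIPPED_TESTS"
# prints: \[Serial\]|should\sfallback\sto\slocal\sterminating\sendpoints
```

Because the separator is added only between non-empty entries, appending a new test on its own line cannot produce a leading, trailing, or doubled `|`.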

Member Author

@tssurya tssurya Apr 4, 2024

speaking of which @ricky-rav I should not have to skip this test anymore right? don't we support terminating endpoints feature with ETP=local?

Member Author

@trozet FYI in lgw v6 lane:

ginkgo --nodes=10 '--focus=\[Conformance\]|\[sig-network\]' '--skip=\[Feature:Networking-Performance\]|\[Feature:PerformanceDNS\]|Disruptive|DisruptionController|\[sig-apps\]\sCronJob|\[sig-storage\]|\[Feature:Federation\]|should\shave\sipv4\sand\sipv6\sinternal\snode\sip|kube-proxy|should\sset\sTCP\sCLOSE_WAIT\stimeout|named\sport.+\[Feature:NetworkPolicy\]|should\screate\sa\sPod\swith\sSCTP\sHostPort|service.kubernetes.io/headless|should\sresolve\sconnection\sreset\sissue\s#74839|sig-api-machinery|\[Feature:NoSNAT\]|LoadBalancers\sshould|configMap\snameserver|ClusterDns\s\[Feature:Example\]|should\sset\sdefault\svalue\son\snew\sIngressClass|should\sprevent\sIngress\screation\sif\smore\sthan\s1\sIngressClass\smarked\sas\sdefault|validates\sthat\sthere\sis\sno\sconflict\sbetween\spods\swith\ssame\shostPort\sbut\sdifferent\shostIP\sand\sprotocol|\[Feature:Networking-IPv6\]|should\sprovider\sInternet\sconnection\sfor\scontainers\susing\sDNS|\[Feature:Networking-IPv4\]|IPBlock.CIDR\sand\sIPBlock.Except|Network.+should\sresolve\sconnection\sreset\sissue|NetworkPolicy.+should\sallow\segress\saccess\sto\sserver\sin\sCIDR\sblock|\[Feature:.*DualStack.*\]|should\sfallback\sto\slocal\sterminating\sendpoints\swhen\sthere\sare\sno\sready\sendpoints\swith\sexternalTrafficPolicy=Local|\[Serial\]' 

so I see the pipe getting added before the "should fallback" entry, so I think we are good there ^ ?
[Feature:.*DualStack.*\]|should\sfallback\sto\slocal\sterminating\sendpoints\swhen\sthere\sare\sno\sready\sendpoints\swith\sexternalTrafficPolicy=Local|\[Serial\]'

Contributor

@ricky-rav ricky-rav Apr 5, 2024

speaking of which @ricky-rav I should not have to skip this test anymore right? don't we support terminating endpoints feature with ETP=local?

Looking at the title of the test, I was going to say that we should wait for this PR of mine #4170 to be merged.

But then I ran the test on KIND both on the PR image and the master image. It works in both cases. :-o

So I looked at the test case and it's quite bizarre: it tests a terminating endpoint for a service with ETP=local, but the clients from which it connects to the service are all WITHIN the cluster, so ETP=local means nothing since we're not generating any external traffic and that's why my current fix isn't needed for this test after all.
https://github.com/kubernetes/kubernetes/blob/master/test/e2e/network/service.go#L3043

In the upcoming weeks I should probably write a proper e2e test for this, possibly with Antonio's cloud load balancer for KIND.

Member Author

aha! so I guess I can remove skipping this then!

Contributor

yeah, let's remove it!

Member Author

done! let's see if it passes on v4 and v6 lanes correctly.

@jluhrsen
Contributor

jluhrsen commented Apr 4, 2024

cc @jluhrsen this might be why you saw some lanes not running tests and finishing in 5mins?

thank you for looking.

@martinkennelly
Member

Looking into the failures but it's time-boxed to today.

@tssurya
Member Author

tssurya commented Apr 5, 2024

also saving link here for @martinkennelly : https://github.com/ovn-org/ovn-kubernetes/actions/runs/8551587549/job/23431934438?pr=4253

rerunning that lane to ensure it was not a flake.

@jluhrsen
Contributor

jluhrsen commented Apr 5, 2024

Looking into the failures but it's time-boxed to today.

the Granular Checks: Services Secondary IP Family [LinuxOnly] ...... tests are failing in our d/s bare-metal dualstack local GW job as well. Weirdly, they don't all fail every time, but at least a couple fail every time, so the job went perma-fail ever since 4.14. The job eventually got removed entirely, so the only notice we had was this bug assigned to me that I only recently started working on.

@tssurya tssurya changed the title from "Remove extra || in SKIP formatting; Some lanes were not running correctly and skipping ALL tests :/" to "Unskip ETP=local+terminating endpoints" Apr 5, 2024
@tssurya
Member Author

tssurya commented Apr 12, 2024

@ricky-rav PTAL

@tssurya
Member Author

tssurya commented Apr 23, 2024

unrelated flakes:
AdminNetworkPolicyIngressSCTP and [FAIL] e2e egress IP validation [OVN network] Using different methods to disable a node's availability for egress Should validate the egress IP functionality against remote hosts [It] disabling egress nodes impeding GRCP health check both have issues, adding this to review tracker board.

@ricky-rav
Contributor

I think there are still tests that somehow we don't run in IPv6, but we do in IPv4 and dual-stack. In particular I found a bunch (#4170 (comment)) that only failed when tested in the one IPv6 CI lane we have downstream. I'll investigate further once I'm done with the 4.12 backport for that same PR.

@tssurya
Member Author

tssurya commented Apr 25, 2024

@ricky-rav yeah that's fine, the goal of this PR is to remove the skip we have currently which you mentioned is no longer needed right? or am I missing something? (note I don't intend to unskip all here)

also there are a bunch of issues that we skip for v6 in general

Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>
@coveralls

Coverage Status

Changes unknown when pulling b7d827a on tssurya:fix-ci-skip-parsing into ovn-org:master.

@trozet trozet merged commit f8cadd0 into ovn-org:master Apr 30, 2024
39 checks passed
Labels
area/ci, area/e2e-testing, kind/bug