TRT-1637: Revert #2142 "OCPBUGS-32985,OCPBUGS-32925: Dockerfile: Bump OVS to 3.3.0-2" #2149
@dgoodwin: This pull request references TRT-1637 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.16.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
/payload-job periodic-ci-openshift-release-master-nightly-4.16-e2e-metal-ipi-ovn-ipv6 |
/hold Again, this is just a test; we're trying to find out what has taken out metal ipv6.
@dgoodwin: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/0c187890-07da-11ef-97a3-4dcdc5dad773-0 |
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: dgoodwin The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@dgoodwin: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
/payload-job periodic-ci-openshift-release-master-nightly-4.16-e2e-metal-ipi-ovn-ipv6 4 |
@dgoodwin: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/12125f20-0804-11ef-941d-a1bee7b8f6d3-0 |
/payload-job periodic-ci-openshift-release-master-nightly-4.16-e2e-metal-ipi-ovn-ipv6 |
@dgoodwin: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/263f9fd0-0804-11ef-8c61-f66db269ab06-0 |
/payload-aggregate periodic-ci-openshift-release-master-nightly-4.16-e2e-metal-ipi-ovn-ipv6 4 |
@dgoodwin: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/6912dcf0-0804-11ef-9f59-92dc784163d1-0 |
FWIW, it's not impossible, but it is very unlikely that this change caused the issue. This version bump only affects the OVN database servers and a few utilities; it doesn't affect the actual OVS processes, which come from RHCOS and didn't change. The exception would be if you see OVN databases crashing, of course, but I would expect normal CI and QE to catch such an issue.
It also looks like the job has been failing more often than it succeeds for a long time, with very similar symptoms of connection refused for the metalkube API. For example, this one from last Friday: https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-master-nightly-4.16-e2e-metal-ipi-ovn-ipv6/1783874819863875584 And another one from the week before: https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-master-nightly-4.16-e2e-metal-ipi-ovn-ipv6/1782797291946512384 (I didn't check all the failures in between; I just looked at a couple of random ones.)
Yes, this was just a shot in the dark. The job started failing payloads recently, but that seems to have been bad luck; the failure has been around for some time longer. We got 6 clean runs on this revert, but the failure rate is likely 10-20%, so that doesn't mean much. /close
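To put numbers on why 6 clean runs say little at a 10-20% failure rate, here is a minimal sketch. It assumes each run fails independently with a fixed probability, which is a simplification not stated in the thread, and the helper name `prob_all_clean` is hypothetical.

```python
def prob_all_clean(failure_rate: float, runs: int) -> float:
    """Chance that `runs` runs all pass, assuming independent runs
    with the same per-run failure rate (an illustrative assumption)."""
    return (1.0 - failure_rate) ** runs


# Failure rates taken from the 10-20% estimate in the comment above.
for rate in (0.10, 0.20):
    print(f"failure rate {rate:.0%}: P(6 clean runs) = {prob_all_clean(rate, 6):.2f}")
```

Under these assumptions, 6 clean runs would happen by chance roughly half the time at a 10% failure rate and about a quarter of the time at 20%, so the clean runs are weak evidence that the revert fixed anything.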
Thanks for taking a look and the analysis.
@dgoodwin: Closed this PR. In response to this:
Reverts #2142 ; tracked by TRT-1637
Per OpenShift policy, we are reverting this breaking change to get CI and/or nightly payloads flowing again.
This is a test revert only: we don't know yet whether this is our issue, but some metal jobs are now losing the apiserver during install.
To unrevert this, revert this PR and layer an additional, separate commit on top that addresses the problem. Before merging the unrevert, please run these jobs on the PR and check their results to confirm that the fix has corrected the problem:
CC: @igsilya