-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ovirt e2e: fix nodeport issue on CI bug 1794714 #7614
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: Gal-Zaidman The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
7edcdca
to
fe6c0d2
Compare
1bb6881
to
5d1df3f
Compare
This patch adds the workaround suggested on [1] to make nodeport work and avoid the conformance tests failures. [1] https://bugzilla.redhat.com/show_bug.cgi?id=1794714 Signed-off-by: Gal Zaidman <gzaidman@redhat.com>
5d1df3f
to
260124d
Compare
/test pj-rehearse |
/assign @wking |
${TEST_COMMAND} | ||
|
||
# Runs an install | ||
- name: setup | ||
# A midstep till we have the installer work merged, then we | ||
# can use the CI artifact |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this is your comment from #4340. Can you explain (ideally in the commit message), why you're removing it here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will edit the commit message
done | ||
oc patch configs.imageregistry.operator.openshift.io cluster --type merge --patch '{"spec":{"managementState":"Managed","storage":{"emptyDir":{}}}}' | ||
oc --insecure-skip-tls-verify patch configs.imageregistry.operator.openshift.io cluster --type merge --patch '{"spec":{"managementState":"Managed","storage":{"emptyDir":{}}}}' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are you adding these? Don't we have a kubeconfig with the CA for the Kube API? If that CA doesn't match what the cluster serves us, I'd rather error out here instead of ignoring the mismatch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
I will remove it, it worked fine before.
I have added it because for some reason the "get nodes" fail for me locally without it.
@@ -441,7 +438,11 @@ objects: | |||
wait "$!" | |||
install_exit_status=$? | |||
sleep 10m | |||
oc get co/image-registry | |||
# This is a workaround for https://bugzilla.redhat.com/show_bug.cgi?id=1794714 | |||
for n in $(oc --insecure-skip-tls-verify get nodes|awk '{print $1}'|grep worker);do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like operating on the table output of get nodes
. Can you use -o jsonpath=...
to have it spit out the field you want?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure didn't know the option, I will try it
oc get co/image-registry | ||
# This is a workaround for https://bugzilla.redhat.com/show_bug.cgi?id=1794714 | ||
for n in $(oc --insecure-skip-tls-verify get nodes|awk '{print $1}'|grep worker);do | ||
oc -n default --insecure-skip-tls-verify debug node/$n --image=centos/tools -- ethtool --offload vxlan_sys_4789 tx off |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like this is the sort of change that should be handled by a MachineConfig, although I don't know what you'd put in the config to apply this specific ethtool
tweak. Dropping into a debug session on each node and bumping things yourself seems very brittle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When we talked about it we said that we don't want to apply it by default on a cluster because it is a tweak that disables checksum and it seems to us like a decision that we don't want by default, plus it is a high priority bug which affects most platforms, happens only when using OpenshiftSDN , and has network people working on it.
We wanted to apply the fix in the CI because it made network test flaky during conformance.
Not relevant any more, decided to implement a MCO fix |
This patch adds the workaround suggested on [1]
to make nodeport work and avoid the conformance
tests failures.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1794714
Signed-off-by: Gal Zaidman gzaidman@redhat.com