OCPBUGS-27316: configure-ovs: generate profiles directly in /run #4042
Conversation
/retest
/test ?
@jcaamano: The following commands are available to trigger required jobs:
The following commands are available to trigger optional jobs:
Use
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/test e2e-metal-ipi
Issues go stale after 90d of inactivity. Mark the issue as fresh by commenting. If this issue is safe to close now please do so. /lifecycle stale
75d8b16
to
224b230
Compare
/test e2e-metal-ipi
Up until now, configure-ovs would use nmcli to generate the profiles in /etc/NetworkManager/system-connections and then, at the very end, move them to /run/NetworkManager/system-connections. The problem with this is that if configure-ovs was killed by a hard power off, it could leave half-baked profiles in /etc, preventing a subsequent reboot from reaching a network-online state and running configure-ovs again. This change uses the --temporary and --offline flags of nmcli to generate the profiles directly in /run, preventing that problem. If configure-ovs is explicitly configured to generate profiles in /etc, it still moves them there at the very end. That path is subject to the problem stated above, but since it is not the default let's not worry too much about it. Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
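The before/after behavior described here can be illustrated with a toy sketch, using throwaway directories in place of the real /etc and /run profile directories; `generate_profile` is a made-up stand-in for the actual nmcli invocations, not configure-ovs code:

```shell
etc_dir=$(mktemp -d)   # plays the role of /etc/NetworkManager/system-connections
run_dir=$(mktemp -d)   # plays the role of /run/NetworkManager/system-connections

# Hypothetical stand-in for the nmcli calls that write a keyfile profile.
generate_profile() { printf '[connection]\nid=%s\n' "$2" > "$1"; }

# Old scheme: generate in /etc, move to /run at the very end. A hard
# power-off between the two steps strands a half-baked profile in /etc.
generate_profile "$etc_dir/br-ex.nmconnection" br-ex
mv "$etc_dir/br-ex.nmconnection" "$run_dir/"

# New scheme: generate directly in /run (what --temporary/--offline
# enable), so a crash can never leave anything behind in /etc.
generate_profile "$run_dir/ovs-if-br-ex.nmconnection" ovs-if-br-ex
```

Since /run is a tmpfs, anything stranded there by a crash also disappears on the next boot, which is what makes the new scheme safe.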
224b230
to
2003da5
Compare
/title OCPBUGS-27316: configure-ovs: generate profiles directly in /run
/retitle OCPBUGS-27316: configure-ovs: generate profiles directly in /run
/test e2e-metal-ipi
@jcaamano: This pull request references Jira Issue OCPBUGS-27316, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
/jira refresh
@jcaamano: This pull request references Jira Issue OCPBUGS-27316, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug.
Requesting review from QA contact: In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
/retest
1 similar comment
/retest
@cybertron PTAL
/lgtm Makes sense to me and it seems like CI is mostly happy with it. The dual-stack failures don't particularly look related, but since that job runs so rarely I pushed #4265 to test it.
/test e2e-metal-ipi-ovn-dualstack
1 similar comment
/test e2e-metal-ipi-ovn-dualstack
/assign @yuqi-zhang
Changes seem fine, and CI looks good
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: cybertron, jcaamano, yuqi-zhang. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
@jcaamano: The following test failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
3493ce2
into
openshift:master
@jcaamano: Jira Issue OCPBUGS-27316: All pull requests linked via external trackers have merged: Jira Issue OCPBUGS-27316 has been moved to the MODIFIED state. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
Fix included in accepted release 4.16.0-0.nightly-2024-03-28-223620
local src_path
src_path=$(mktemp)
shift
cat "$dst_path" > "$src_path"
I assume the cat vs. cp vs. mv is due to the selinux issue?
just because I used mktemp I think
cat "$dst_path" > "$src_path"
rm -f "$dst_path"
nmcli --offline c mod "$@" < "$src_path" > "$dst_path"
rm -f "$src_path"
Do we need to rm -f "$src_path" in a trap handler in case nmcli --offline c mod fails?
hmm why?
maybe handle_exit cleans it up? The chmod 600 implies some secrecy in .nmconnection files, so we don't want them lying around after failure? Do we need to chmod 600 "${src_path}"?
good point, will take a look and fix if needed
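The cleanup idea discussed in this thread can be sketched as follows: run the modification in a subshell with an EXIT trap, so the mktemp scratch file is removed whether the command in the middle succeeds or fails. `mod_with_cleanup` and `mod_conn` are made-up names for illustration (`mod_conn` stands in for the real `nmcli --offline c mod` invocation); this is not the actual configure-ovs code.

```shell
mod_with_cleanup() (
    # Subshell body: the EXIT trap fires when this subshell ends,
    # on success or on failure, so the scratch file never leaks.
    src_path=$(mktemp)
    trap 'rm -f "$src_path"' EXIT
    chmod 600 "$src_path"   # match the secrecy of .nmconnection files

    # Same dance as the snippet above: stash the profile, then let the
    # stand-in modifier read it on stdin and rewrite the original path.
    cat "$1" > "$src_path"
    mod_conn < "$src_path" > "$1"
)
```

Using a subshell keeps the trap local: it cannot clobber an EXIT trap (such as a handle_exit) already installed by the surrounding script.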
mod_nm_conn() {
# the easiest thing to do here would be to use `nmcli c mod --temporary`
# but there is a bug in selinux profiles that denies NM from performing
# the operation
do we need to track this bug here?
I should file one yeah, will do.
shopt -s nullglob
new_conn_files=(${NM_CONN_CONF_PATH}/"${ovs_interface}"*)
new_conn_files=(${NM_CONN_RUN_PATH}/"${ovs_interface}"*)
quotes "${NM_CONN_RUN_PATH}"?
constant with no spaces so there shouldn't be a problem for now I guess
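The nullglob behavior the snippet relies on can be demonstrated quickly: with nullglob set, an unmatched glob expands to nothing instead of staying as the literal pattern. The directory and file names below are made up for illustration, and a throwaway directory stands in for the real NM_CONN_RUN_PATH constant.

```shell
shopt -s nullglob
NM_CONN_RUN_PATH=$(mktemp -d)   # stand-in for the real constant
touch "$NM_CONN_RUN_PATH/br-ex.nmconnection" \
      "$NM_CONN_RUN_PATH/br-ex-port.nmconnection"

# Quoting the constant prefix is harmless and future-proofs against
# spaces; the glob tail must stay unquoted so it still expands.
new_conn_files=("${NM_CONN_RUN_PATH}"/br-ex*)
echo "${#new_conn_files[@]}"   # number of matching profiles

# An unmatched pattern yields an empty array rather than one bogus
# element, so loops over the array simply do nothing.
missing=("${NM_CONN_RUN_PATH}"/no-such-iface*)
echo "${#missing[@]}"
```

Without nullglob, `missing` would contain the literal unexpanded pattern as its single element, which is the usual source of "file not found" surprises in glob loops.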