Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug 1971715: configure-ovs: fix nondeterministic master in slave profiles #2626

Merged
merged 1 commit into from
Jun 25, 2021

Conversation

jcaamano
Copy link
Contributor

@jcaamano jcaamano commented Jun 18, 2021

- What I did
configure-ovs sets the master device name as connection.master on slave
connection profiles when the default gateway being replaced is a master
device.

Those slave profiles may activate before the master profile and if so,
NM may chose non-deterministically any master profile that might provide
a connection for the master device name, which may be the old master
connection profile instead of the replacement connection profile. The
auto-connect priority does not seem to be considered as this might not
be regarded by NM as an auto-connect of the master profile in strict
sense.

Setting the master connection profile uuid as connection.master on slave
connection profiles ensures that the correct master profile is activated
when the slave auto-connects. This does require a bit more effort when
reverting the configuration.

Additionally, if there is any modification to the slave connection
profiles, those need to be persisted as well from
systemConnectionsMerged to system-connections

- How to verify it
Run ./configure-ovs OVNKubernetes on a device with a bond interface as default gateway, using the following NM configuration:

$ cat /etc/NetworkManager/conf.d/99-keyfiles.conf
[keyfile]
path=/etc/NetworkManager/systemConnectionsMerged

where /etc/NetworkManager/systemConnectionsMerged is an overlay fs as:

overlay on /etc/NetworkManager/systemConnectionsMerged type overlay (rw,relatime,seclabel,lowerdir=/etc/NetworkManager/system-connections,upperdir=/run/nm-system-connections,workdir=/run/nm-system-connections-work)

Verify that the applied network configuration persists accross reboots.

- Description for the changelog
Improved configuration of OVS with bond interfaces for OVNKuberentes

fixes: #2519
fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1971715

Signed-off-by: Jaime Caamaño Ruiz jcaamano@redhat.com

@openshift-ci openshift-ci bot added bugzilla/severity-high Referenced Bugzilla bug's severity is high for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. labels Jun 18, 2021
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jun 18, 2021

@jcaamano: This pull request references Bugzilla bug 1971715, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.9.0) matches configured target release for branch (4.9.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

No GitHub users were found matching the public email listed for the QA contact in Bugzilla (anusaxen@redhat.com), skipping review request.

In response to this:

Bug 1971715: configure-ovs: fix nondeterministic master in slave profiles

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@jcaamano
Copy link
Contributor Author

/cc @trozet @msherif1234

@jcaamano
Copy link
Contributor Author

/retest

Copy link
Contributor

@trozet trozet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems OK to me, but want @msherif1234 to take a look

local old="$1"
local new="$2"
for conn_uuid in $(nmcli -g UUID connection show) ; do
if [ "$(nmcli -g connection.master connection show uuid "$conn_uuid")" != "$old" ]; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jcaamano what if the old connection has a connection.master of device name instead of UUID, should we check that too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, we look for both things.

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jun 22, 2021

@jcaamano: This pull request references Bugzilla bug 1971715, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.9.0) matches configured target release for branch (4.9.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

No GitHub users were found matching the public email listed for the QA contact in Bugzilla (anusaxen@redhat.com), skipping review request.

In response to this:

Bug 1971715: configure-ovs: fix nondeterministic master in slave profiles

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@jcaamano
Copy link
Contributor Author

/retest

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jun 23, 2021

@jcaamano: An error was encountered querying GitHub for users with public email (anusaxen@redhat.com) for bug 1971715 on the Bugzilla server at https://bugzilla.redhat.com. No known errors were detected, please see the full error message for details.

Full error message. non-200 OK status code: 403 Forbidden body: "{\n \"documentation_url\": \"https://docs.github.com/en/free-pro-team@latest/rest/overview/resources-in-the-rest-api#abuse-rate-limits\",\n \"message\": \"You have triggered an abuse detection mechanism. Please wait a few minutes before you try again.\"\n}\n"

Please contact an administrator to resolve this issue, then request a bug refresh with /bugzilla refresh.

In response to this:

Bug 1971715: configure-ovs: fix nondeterministic master in slave profiles

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Copy link
Contributor

@trozet trozet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

# bring down old iface
nmcli device disconnect $iface

# bring up new connectionx
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit typo on the connectionx

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jun 23, 2021
@trozet
Copy link
Contributor

trozet commented Jun 23, 2021

/assign @kikisdeliveryservice

configure-ovs sets the master device name as connection.master on slave
connection profiles when the default gateway being replaced is a master
device.

Those slave profiles may activate before the master profile and if so,
NM may chose non-deterministically any master profile that might provide
a connection for the master device name, which may be the old master
connection profile instead of the replacement connection profile. The
auto-connect priority does not seem to be considered as this might not
be regarded by NM as an auto-connect of the master profile in strict
sense.

Setting the master connection profile uuid as connection.master on slave
connection profiles ensures that the correct master profile is activated
when the slave auto-connects. This requires a bit more effort when
reverting the configuration.

Additionally, if there is any modification to the slave connection
profiles, those need to be persisted as well from
systemConnectionsMerged to system-connections

fixes: openshift#2519
fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1971715

Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Jun 23, 2021
@trozet
Copy link
Contributor

trozet commented Jun 23, 2021

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jun 23, 2021
Copy link
Contributor

@sinnykumari sinnykumari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. OVS team has already reviewed the functionality

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jun 24, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jcaamano, sinnykumari, trozet

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 24, 2021
@sinnykumari
Copy link
Contributor

/test e2e-agnostic-upgrade

jcaamano added a commit to jcaamano/machine-config-operator that referenced this pull request Jul 6, 2021
Pull request openshift#2626 did not account for a possible scenario where
networking configuration would be backed with ifcfg files instead of
nmconnection files. Attempting to persist slave nmconnection changes in
case of an overlay setup of NM configuration directory would fail.

Additionally opted for a strategy of cloning slave connection profiles
instead of modifying them directly to avoid any interference with MCO
existing configuration.

Also fixed shellcheck reported issues in 4.8 backport pull request openshift#2639.

fixes: openshift#2626
fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1971715

Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
jcaamano added a commit to jcaamano/machine-config-operator that referenced this pull request Aug 30, 2021
Pull request openshift#2626 did not account for a possible scenario where
networking configuration would be backed with ifcfg files instead of
nmconnection files. Attempting to persist slave nmconnection changes in
case of an overlay setup of NM configuration directory would fail.

Additionally opted for a strategy of cloning slave connection profiles
instead of modifying them directly to avoid any interference with MCO
existing configuration.

Also fixed shellcheck reported issues in 4.8 backport pull request openshift#2639.

fixes: openshift#2626
fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1971715

Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
(cherry picked from commit 6654fd1)
jcaamano added a commit to jcaamano/machine-config-operator that referenced this pull request Sep 20, 2021
Pull request openshift#2626 did not account for a possible scenario where
networking configuration would be backed with ifcfg files instead of
nmconnection files. Attempting to persist slave nmconnection changes in
case of an overlay setup of NM configuration directory would fail.

Additionally opted for a strategy of cloning slave connection profiles
instead of modifying them directly to avoid any interference with MCO
existing configuration.

Also fixed shellcheck reported issues in 4.8 backport pull request openshift#2639.

fixes: openshift#2626
fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1971715

Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
(cherry picked from commit 6654fd1)
jcaamano added a commit to jcaamano/machine-config-operator that referenced this pull request Sep 22, 2021
Pull request openshift#2626 did not account for a possible scenario where
networking configuration would be backed with ifcfg files instead of
nmconnection files. Attempting to persist slave nmconnection changes in
case of an overlay setup of NM configuration directory would fail.

Additionally opted for a strategy of cloning slave connection profiles
instead of modifying them directly to avoid any interference with MCO
existing configuration.

Also fixed shellcheck reported issues in 4.8 backport pull request openshift#2639.

fixes: openshift#2626
fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1971715

Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
(cherry picked from commit 6654fd1)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. bugzilla/severity-high Referenced Bugzilla bug's severity is high for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants