OCPBUGS-24792: Make MC names deterministic #875

jmencak · 2023-12-12T08:39:02Z

OpenShift Nodes can be a part of only one custom MachineConfigPool. However, MCP configuration allows this not to be the case. This is caught by the machine-config-controller and reported as an error ( belongs to custom roles, cannot proceed with this Node).

In order to target an MCP with a configuration, NTO uses machineConfigLabels. However, one or more MCPs can select a particular single MC. This is due to the MCP's machineConfigSelector. This is another challenge scenario.

In the above two scenarios, it was possible for NTO to generate a randomly-named MC based on the membership of one of the matching MCPs. Then, a pruning function would mistakenly remove the other MCs, seemingly unused. This could result in a flip between the rendered MCs and cause a Node reboot.

This PR makes the process of establishing the names of the MC for the purposes of MachineConfigPool based matching deterministic.

Other changes/fixes:

Synced with MCO's latest getPoolsForNode() changes.
Logging in syncMachineConfigHyperShift().

Resolves: OCPBUGS-24792

OpenShift Nodes can be a part of only one custom MachineConfigPool. However, MCP configuration allows this not to be the case. This is caught by the machine-config-controller and reported as an error (<node> belongs to <N> custom roles, cannot proceed with this Node). In order to target an MCP with a configuration, NTO uses machineConfigLabels. However, one or more MCPs can select a particular single MC. This is due to the MCP's machineConfigSelector. This is another challenge scenario. In the above two scenarios, it was possible for NTO to generate a randomly-named MC based on the membership of one of the matching MCPs. Then, a pruning function would mistakenly remove the other MCs, seemingly unused. This could result in a flip between the rendered MCs and cause a Node reboot. This PR makes the process of establishing the names of the MC for the purposes of MachineConfigPool based matching deterministic. Other changes/fixes: - Synced with MCO's latest getPoolsForNode() changes. - Logging in syncMachineConfigHyperShift(). Resolves: OCPBUGS-24792

openshift-ci · 2023-12-12T08:39:18Z

@jmencak: GitHub didn't allow me to request PR reviews from the following users: jmencak.

Note that only openshift members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

OpenShift Nodes can be a part of only one custom MachineConfigPool. However, MCP configuration allows this not to be the case. This is caught by the machine-config-controller and reported as an error ( belongs to custom roles, cannot proceed with this Node).

In order to target an MCP with a configuration, NTO uses machineConfigLabels. However, one or more MCPs can select a particular single MC. This is due to the MCP's machineConfigSelector. This is another challenge scenario.

In the above two scenarios, it was possible for NTO to generate a randomly-named MC based on the membership of one of the matching MCPs. Then, a pruning function would mistakenly remove the other MCs, seemingly unused. This could result in a flip between the rendered MCs and cause a Node reboot.

This PR makes the process of establishing the names of the MC for the purposes of MachineConfigPool based matching deterministic.

Other changes/fixes:

Synced with MCO's latest getPoolsForNode() changes.

Logging in syncMachineConfigHyperShift().

Resolves: OCPBUGS-24792

/cc

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

openshift-ci · 2023-12-12T08:39:49Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jmencak

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [jmencak]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

jmencak · 2023-12-12T09:25:01Z

/cc @liqcui

openshift-ci-robot · 2023-12-12T11:34:43Z

@jmencak: This pull request references Jira Issue OCPBUGS-24792, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug

bug is open, matching expected state (open)
bug target version (4.16.0) matches configured target version for branch (4.16.0)
bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)

No GitHub users were found matching the public email listed for the QA contact in Jira (liqcui@redhat.com), skipping review request.

The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

OpenShift Nodes can be a part of only one custom MachineConfigPool. However, MCP configuration allows this not to be the case. This is caught by the machine-config-controller and reported as an error ( belongs to custom roles, cannot proceed with this Node).

In order to target an MCP with a configuration, NTO uses machineConfigLabels. However, one or more MCPs can select a particular single MC. This is due to the MCP's machineConfigSelector. This is another challenge scenario.

In the above two scenarios, it was possible for NTO to generate a randomly-named MC based on the membership of one of the matching MCPs. Then, a pruning function would mistakenly remove the other MCs, seemingly unused. This could result in a flip between the rendered MCs and cause a Node reboot.

This PR makes the process of establishing the names of the MC for the purposes of MachineConfigPool based matching deterministic.

Other changes/fixes:

Synced with MCO's latest getPoolsForNode() changes.

Logging in syncMachineConfigHyperShift().

Resolves: OCPBUGS-24792

/cc

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

jmencak · 2023-12-12T11:36:13Z

@yanirq if you have any free cycles, I'd appreciate a review. Also asked @liqcui for a review and premerge testing. I'd like to get this in and backport ASAP.

openshift-ci-robot · 2023-12-12T11:36:49Z

@jmencak: This pull request references Jira Issue OCPBUGS-24792, which is valid.

3 validation(s) were run on this bug

bug is open, matching expected state (open)
bug target version (4.16.0) matches configured target version for branch (4.16.0)
bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

No GitHub users were found matching the public email listed for the QA contact in Jira (liqcui@redhat.com), skipping review request.

In response to this:

OpenShift Nodes can be a part of only one custom MachineConfigPool. However, MCP configuration allows this not to be the case. This is caught by the machine-config-controller and reported as an error ( belongs to custom roles, cannot proceed with this Node).

In order to target an MCP with a configuration, NTO uses machineConfigLabels. However, one or more MCPs can select a particular single MC. This is due to the MCP's machineConfigSelector. This is another challenge scenario.

In the above two scenarios, it was possible for NTO to generate a randomly-named MC based on the membership of one of the matching MCPs. Then, a pruning function would mistakenly remove the other MCs, seemingly unused. This could result in a flip between the rendered MCs and cause a Node reboot.

This PR makes the process of establishing the names of the MC for the purposes of MachineConfigPool based matching deterministic.

Other changes/fixes:

Synced with MCO's latest getPoolsForNode() changes.

Logging in syncMachineConfigHyperShift().

Resolves: OCPBUGS-24792

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

yanirq · 2023-12-12T12:33:03Z

/cc @yanirq

pkg/operator/controller.go

yanirq · 2023-12-12T16:21:35Z

Code wise this looks good to me.
In terms of functionality, this kindof introduce a new ability to have more that one MCP for nodes with picking an MCP for applying tuned by an alphabetical rule.
The approach is not harmful but should be probably documented .

Would we like to have functional tests for this deterministic behavior ? Its a heavy test to perform so Im not saying Im in favor, just thinking out loud.

cc @MarSik In case you want to weigh in.

jmencak · 2023-12-12T16:38:20Z

Thank you for the review, much appreciated.

Code wise this looks good to me. In terms of functionality, this kindof introduce a new ability to have more that one MCP for nodes with picking an MCP for applying tuned by an alphabetical rule. The approach is not harmful but should be probably documented .

Good observation. However, that is not my aim. My aim was to stop the restart loop. I'd still like the customers did not use this, because then this basically wides the the pool of machines that need the same HW across all the pools that match. We should probably document that we strongly recommend they do not do this.

Would we like to have functional tests for this deterministic behavior ? Its a heavy test to perform so Im not saying Im in favor, just thinking out loud.

Possibly, but not sure whether in this repo too. We have similar tests in openshift/origin for this. And yes, it will be time consuming. Also, if as part of this PR, please note that I have a fix for OCPBUGS-24636 ready waiting for this to merge.

MarSik · 2023-12-13T07:48:03Z

pkg/operator/controller.go

+			return nil
+		}
+
+		if ok := c.allNodesAgreeOnBootcmdline(nodes); !ok {


If you want to support multiple pools matching the recommendation section of a single Tuned, then I think this loop has to extend further below this. Because you fail the whole sync on two MCPs not having the same kargs. Which they are not required to have, that condition only applies to each MCP separately.

If you want to support multiple pools matching the recommendation section of a single Tuned

@MarSik , I believe there is a fundamental misunderstanding what this change does. We are not adding support, we are fixing a potential cyclic reboot issue that was not fixed by previous code.

So my aim is not to support multiple MCPs, but we need to check all the matching MCPs affected by this. Would logging an error or warning when we see multiple matching MCPs be sufficient?

But it essentially does add the support for multiple MCPs.

You can block the sync when multiple MCPs are detected instead to make it clear that is is not supported. Not need to validate kargs in such case.

But it essentially does add the support for multiple MCPs.

Let's agree to disagree. It fixes a bug.

You can block the sync when multiple MCPs are detected instead to make it clear that is is not supported. Not need to validate kargs in such case.

No, we still need to check the kargs. You can have machines with different topology within a single MCP.

@MarSik , please see the "Enforce per-pool machineConfigLabels selectors" commit. Hopefully it resolves the issues you see with this PR.

MarSik · 2023-12-13T07:49:56Z

Hmm, we can either improve the support for multiple MCPs (like I commented in the code) or block the approach by either a check or by generating a sufficiently unique label set for the new MC so it is matched by a single MCP only.

jmencak · 2023-12-13T10:51:02Z

Hmm, we can either improve the support for multiple MCPs (like I commented in the code) or block the approach by either a check or by generating a sufficiently unique label set for the new MC so it is matched by a single MCP only.

As mentioned above, we are not adding support for multipe MCPs. Also, we cannot generate a MCP label, because that's what customers/PPC specifies.

jmencak · 2023-12-13T11:46:10Z

Liquan found an issue when manually running one test scenario. Cannot reproduce it, but let's see if it is real first.
/hold

liqcui · 2023-12-13T13:55:25Z

/lgtm
I reproduced the bug on ocp4.14 cluster, then tested the same steps using new pr code, the bug fixed using the new pr code. I also did a full regression testing using pr code, pre-merge testing passed

liqcui · 2023-12-13T13:56:21Z

Liquan found an issue when manually running one test scenario. Cannot reproduce it, but let's see if it is real first. /hold

It's not a real issue. re-test is ok

jmencak · 2023-12-13T14:03:12Z

It's not a real issue. re-test is ok

Thank you, still keeping the hold for now.

jmencak · 2023-12-14T07:25:48Z

Keeping a hold for now as I want to do more manual testing on SNO. Will unhold after that.

jmencak · 2023-12-14T09:23:15Z

Manual tests on SNO worked fine.
/hold cancel

MarSik · 2023-12-14T10:01:34Z

pkg/operator/controller.go

+		// Log an error and do not requeue, this is a configuration issue.
+		klog.Errorf("profile %v uses machineConfigLabels that match across multiple MCPs (%v); this is not supported",
+			profile.Name, printMachineConfigPoolsNames(pools))
+		return nil


Should this really return success? What was wrong on returning the error? (the other log update below on line 798)

In my view yes. You don't want to requeue this error 15 times until this is dropped and spam the logs. This is a configuration issue, user has been notified by logging an error. Requeing would not help anything.

Ack, so the only place we will report this at will be the NTO log?

Yes, for now.

MarSik · 2023-12-14T12:25:53Z

/lgtm

But we should get together and figure out a better error reporting scheme. Right now we only have the information about errors in logs only. Users do not notice those from the cluster management level.

jmencak · 2023-12-14T12:36:35Z

/lgtm

But we should get together and figure out a better error reporting scheme. Right now we only have the information about errors in logs only. Users do not notice those from the cluster management level.

I propose doing a similar thing MCO does when it detects issues with its MCPs, it raises alerts.

openshift-ci-robot · 2023-12-14T13:11:26Z

/retest-required

Remaining retests: 0 against base HEAD 29d73bd and 2 for PR HEAD 2963f66 in total

jmencak · 2023-12-14T18:24:04Z

/retest

jmencak · 2023-12-14T20:27:49Z

/retest

openshift-ci-robot · 2023-12-15T00:39:17Z

/retest-required

Remaining retests: 0 against base HEAD fd20f57 and 1 for PR HEAD 2963f66 in total

jmencak · 2023-12-15T06:23:17Z

/retest

openshift-ci · 2023-12-15T07:49:48Z

@jmencak: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

openshift-ci-robot · 2023-12-15T07:52:35Z

@jmencak: Jira Issue OCPBUGS-24792: All pull requests linked via external trackers have merged:

openshift/cluster-node-tuning-operator#875

Jira Issue OCPBUGS-24792 has been moved to the MODIFIED state.

In response to this:

OpenShift Nodes can be a part of only one custom MachineConfigPool. However, MCP configuration allows this not to be the case. This is caught by the machine-config-controller and reported as an error ( belongs to custom roles, cannot proceed with this Node).

In order to target an MCP with a configuration, NTO uses machineConfigLabels. However, one or more MCPs can select a particular single MC. This is due to the MCP's machineConfigSelector. This is another challenge scenario.

In the above two scenarios, it was possible for NTO to generate a randomly-named MC based on the membership of one of the matching MCPs. Then, a pruning function would mistakenly remove the other MCs, seemingly unused. This could result in a flip between the rendered MCs and cause a Node reboot.

This PR makes the process of establishing the names of the MC for the purposes of MachineConfigPool based matching deterministic.

Other changes/fixes:

Synced with MCO's latest getPoolsForNode() changes.

Logging in syncMachineConfigHyperShift().

Resolves: OCPBUGS-24792

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

openshift-bot · 2023-12-15T09:40:53Z

[ART PR BUILD NOTIFIER]

This PR has been included in build cluster-node-tuning-operator-container-v4.16.0-202312150911.p0.g778695d.assembly.stream for distgit cluster-node-tuning-operator.
All builds following this will include this PR.

* Make MC names deterministic OpenShift Nodes can be a part of only one custom MachineConfigPool. However, MCP configuration allows this not to be the case. This is caught by the machine-config-controller and reported as an error (<node> belongs to <N> custom roles, cannot proceed with this Node). In order to target an MCP with a configuration, NTO uses machineConfigLabels. However, one or more MCPs can select a particular single MC. This is due to the MCP's machineConfigSelector. This is another challenge scenario. In the above two scenarios, it was possible for NTO to generate a randomly-named MC based on the membership of one of the matching MCPs. Then, a pruning function would mistakenly remove the other MCs, seemingly unused. This could result in a flip between the rendered MCs and cause a Node reboot. This PR makes the process of establishing the names of the MC for the purposes of MachineConfigPool based matching deterministic. Other changes/fixes: - Synced with MCO's latest getPoolsForNode() changes. - Logging in syncMachineConfigHyperShift(). Resolves: OCPBUGS-24792 * Enforce per-pool machineConfigLabels selectors --------- Co-authored-by: Jiri Mencak <jmencak@users.noreply.github.com>

jmencak · 2023-12-18T08:26:13Z

/cherry-pick release-4.15

openshift-cherrypick-robot · 2023-12-18T08:26:56Z

@jmencak: new pull request created: #888

In response to this:

/cherry-pick release-4.15

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 12, 2023

openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 12, 2023

openshift-ci bot requested a review from liqcui December 12, 2023 09:25

jmencak changed the title ~~WiP: Make MC names deterministic~~ OCPBUGS-24792: Make MC names deterministic Dec 12, 2023

openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 12, 2023

openshift-ci bot requested a review from yanirq December 12, 2023 12:33

yanirq reviewed Dec 12, 2023

View reviewed changes

pkg/operator/controller.go Outdated Show resolved Hide resolved

MarSik reviewed Dec 13, 2023

View reviewed changes

openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 13, 2023

openshift-ci bot assigned liqcui Dec 13, 2023

openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Dec 13, 2023

Enforce per-pool machineConfigLabels selectors

2963f66

openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Dec 13, 2023

openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 14, 2023

MarSik reviewed Dec 14, 2023

View reviewed changes

openshift-ci bot assigned MarSik Dec 14, 2023

openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Dec 14, 2023

openshift-merge-bot bot merged commit 778695d into openshift:master Dec 15, 2023
14 checks passed

jmencak deleted the 4.16-OCPBUGS-24792-overlapping-mcp branch December 15, 2023 08:42

rbaturov mentioned this pull request Dec 18, 2023

OCPBUGS-25595: Make MC names deterministic (#875) #887

Closed

openshift-cherrypick-robot mentioned this pull request Dec 18, 2023

[release-4.15] OCPBUGS-25596: Make MC names deterministic #888

Merged

OCPBUGS-24792: Make MC names deterministic #875

OCPBUGS-24792: Make MC names deterministic #875

Conversation

jmencak commented Dec 12, 2023 • edited

openshift-ci bot commented Dec 12, 2023

openshift-ci bot commented Dec 12, 2023

jmencak commented Dec 12, 2023

openshift-ci-robot commented Dec 12, 2023

jmencak commented Dec 12, 2023

openshift-ci-robot commented Dec 12, 2023

yanirq commented Dec 12, 2023

yanirq commented Dec 12, 2023

jmencak commented Dec 12, 2023 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

MarSik commented Dec 13, 2023

jmencak commented Dec 13, 2023

jmencak commented Dec 13, 2023 • edited

liqcui commented Dec 13, 2023

liqcui commented Dec 13, 2023

jmencak commented Dec 13, 2023

jmencak commented Dec 14, 2023

jmencak commented Dec 14, 2023

Choose a reason for hiding this comment

jmencak Dec 14, 2023 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

MarSik commented Dec 14, 2023

jmencak commented Dec 14, 2023

openshift-ci-robot commented Dec 14, 2023

jmencak commented Dec 14, 2023

jmencak commented Dec 14, 2023

openshift-ci-robot commented Dec 15, 2023

jmencak commented Dec 15, 2023

openshift-ci bot commented Dec 15, 2023

openshift-ci-robot commented Dec 15, 2023

openshift-bot commented Dec 15, 2023

jmencak commented Dec 18, 2023

openshift-cherrypick-robot commented Dec 18, 2023

jmencak commented Dec 12, 2023 •

edited

jmencak commented Dec 12, 2023 •

edited

jmencak commented Dec 13, 2023 •

edited

jmencak Dec 14, 2023 •

edited