Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Bug OCPBUGS-13301: add support for cpu partitioning on site config #1441

Merged

Conversation

eggfoobar
Copy link
Contributor

add new flag for cpu partitioning mode
add deprecation warning for situations where cpuset is defined for nodes

/hold
/cc @imiller0

@openshift-ci openshift-ci bot requested a review from imiller0 March 21, 2023 16:34
@openshift-ci openshift-ci bot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Mar 21, 2023
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 21, 2023

Hi @eggfoobar. Thanks for your PR.

I'm waiting for a openshift-kni member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@eggfoobar
Copy link
Contributor Author

@imiller0 Would love feedback on the deprecation warning, not sure if you feel it's too much. Was of two minds if it made sense to only apply it if both cpuset and cpuPartitioningMode are defined or weather we do a blanket warning to give plenty of heads up time for older installs.

@imiller0
Copy link
Contributor

/ok-to-test

@openshift-ci openshift-ci bot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Mar 27, 2023
@imiller0
Copy link
Contributor

@serngawy @sabbir-47 can you please review

ztp/siteconfig-generator/siteConfig/siteConfigHelper.go Outdated Show resolved Hide resolved
@@ -16,6 +16,7 @@ metadata:
agent-install.openshift.io/install-config-overrides: '{"networking":{"networkType":"OVNKubernetes"}}'
argocd.argoproj.io/sync-wave: "1"
ran.openshift.io/ztp-gitops-generated: '{}'
ran.openshift.io/ztp-warning/field-deprecation/cpuset: siteConfig.clusters[].node[].cpuset will be deprecated after OCP 4.15, please use siteConfig.clusters[].cpuPartitioningMode for OCP versions >= 4.13
Copy link
Contributor

@serngawy serngawy Mar 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cpuPartitioningMode does not exist in the install-config-overrides but the warning annotation get generated ?
I think warning annotation get generated if the cpuSet is used otherwise no need for it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, since we won't know the version the customer chooses for openshift, figured if they include a cpuset we notify them that for 4.13 onwards they need to switch over to the new method.

@@ -273,3 +336,49 @@ func mergeJsonCommonKey(mergeWith, mergeTo, key string) (string, error) {
}
return string(newJson), nil
}

func applyWorkloadPinningInstallConfigOverrides(clusterSpec *Clusters) (result string, err error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I understand this func to replace the use of getWorkloadManifest is that correct ? if yes we need to remove the machineConfig OR the machineConfig for workload will till remain ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The siteConfig tool (binary) can be used to deploy clusters on prior releases as well. This new feature shows up in 4.13, so if the user is deploying a 4.12 cluster they will need the MachineConfig, but if they are deploying 4.13 they need this annotation (and do not need the MachineConfig)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that if the user leaves "cpuset" out of their config we will automatically leave out the workload partitioning manifests. We will need to update the documentation to note that in 4.13+ the user specifies cpuPartitioningMode and not cpuset.

const (
deprecationCR = "AgentClusterInstall"
deprecationField = "cpuset"
deprecationMessage = "siteConfig.clusters[].node[].cpuset will be deprecated after OCP 4.15, please use siteConfig.clusters[].cpuPartitioningMode for OCP versions >= 4.13"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better to have min message like "cpuset is deprecated instead use cpuPartitioningMode"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I minimized it but left the version information since we won't know which version the user chooses, would that be okay?

@serngawy
Copy link
Contributor

The ci job required the commit message to start with ztp

Copy link
Contributor

@imiller0 imiller0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Along with the changes to siteConfig generator, we need to update the reference siteconfigs to make use of the cpuPartitioningMode instead of cpuset. The PerformanceProfile in the reference configurations already set the cpu appropriately so only the SiteConfig CRs need to be updated.
https://github.com/openshift-kni/cnf-features-deploy/blob/master/ztp/gitops-subscriptions/argocd/example/siteconfig/example-sno.yaml
And the 3NC and standard clusters in the same directory.

add new flag for cpu partitioning mode
add deprecation warning for situations where cpuset is defined for nodes

Signed-off-by: ehila <ehila@redhat.com>
Signed-off-by: ehila <ehila@redhat.com>
@imiller0
Copy link
Contributor

/retest

Signed-off-by: ehila <ehila@redhat.com>
@eggfoobar
Copy link
Contributor Author

@imiller0 and @serngawy Tests are passing now, would mind giving this another look?

Copy link
Contributor

@imiller0 imiller0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label May 1, 2023
@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 2, 2023
@openshift-ci
Copy link
Contributor

openshift-ci bot commented May 2, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: eggfoobar, imiller0, serngawy

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@imiller0
Copy link
Contributor

imiller0 commented May 2, 2023

@eggfoobar can we remove the hold?

@eggfoobar
Copy link
Contributor Author

yup! We should be good to remove it.

/unhold

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 3, 2023
@openshift-merge-robot openshift-merge-robot merged commit df92582 into openshift-kni:master May 3, 2023
4 checks passed
@imiller0
Copy link
Contributor

imiller0 commented May 4, 2023

/cherry-pick release-4.13

@openshift-cherrypick-robot

@imiller0: #1441 failed to apply on top of branch "release-4.13":

Applying: ztp: add support for cpu partitioning on site config
Using index info to reconstruct a base tree...
M	ztp/siteconfig-generator/siteConfig/siteConfig.go
Falling back to patching base and 3-way merge...
Auto-merging ztp/siteconfig-generator/siteConfig/siteConfig.go
CONFLICT (content): Merge conflict in ztp/siteconfig-generator/siteConfig/siteConfig.go
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 ztp: add support for cpu partitioning on site config
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

In response to this:

/cherry-pick release-4.13

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@imiller0 imiller0 changed the title feat: add support for cpu partitioning on site config feat: Bug OCPBUGS-13301: add support for cpu partitioning on site config May 11, 2023
@openshift-ci-robot
Copy link
Collaborator

@eggfoobar: Jira Issue OCPBUGS-13301: Some pull requests linked via external trackers have merged:

The following pull requests linked via external trackers have not merged:

These pull request must merge or be unlinked from the Jira bug in order for it to move to the next state. Once unlinked, request a bug refresh with /jira refresh.

Jira Issue OCPBUGS-13301 has not been moved to the MODIFIED state.

In response to this:

add new flag for cpu partitioning mode
add deprecation warning for situations where cpuset is defined for nodes

/hold
/cc @imiller0

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@imiller0
Copy link
Contributor

/jira refresh

@openshift-ci-robot
Copy link
Collaborator

@imiller0: Jira Issue OCPBUGS-13301: All pull requests linked via external trackers have merged:

Jira Issue OCPBUGS-13301 has been moved to the MODIFIED state.

In response to this:

/jira refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants