Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix kata node selector handling #275

Merged
merged 4 commits into from Mar 10, 2023

Conversation

pmores
Copy link
Contributor

@pmores pmores commented Mar 10, 2023

This PR aims to fix a long-standing problem where kataConfigPoolSelector allows users to use MatchExpressions but then the controller code handling the selector cannot process the MatchExpressions part and ends up effectively ignoring it.

Please check out the initial commit's message for a detailed explanation of the problem and the fix.

Fixes: KATA-1860
Fixes: KATA-1966
Closes: KATA-1965

Pavel Mores added 4 commits March 10, 2023 09:09
KataConfig.spec.kataConfigPoolSelector allows users to install kata on
just a subset of workers by specifying requirements that worker labels
have to match.  The selector's type is metav1.LabelSelector which means
it supports both the MatchLabels and MatchExpressions way of specifying
label requirements.  However, a couple of k8s resources which this
controller creates (namely PodSpec in Daemonset pod template and
RuntimeClass) use node selectors that only support MatchLabels.  The
current controller resolves this mismatch by only passing the
MatchLabels part of kataConfigPoolSelector to the node selectors that
don't support MatchExpressions.  Obviously, ignoring the
MatchExpressions part of kataConfigPoolSelector like this causes
disappointing results for users who opt for MatchExpressions to specify
their kata-enabled worker subset.

To overcome this problem we introduce two kinds of kata node selector.
The first one, called KataConfigNodeSelector in this patch, corresponds
basically to a full set of label requirements coming from a KataConfig
instance (specifically its kataConfigPoolSelector and - if enabled -
checkNodeEligibility members).  This kind of selector is only used
internally in the controller code to label matching nodes with the
"kata-oc" node role.

This node role label then allows us to form the other kind of kata node
selector, called simply NodeSelector here (mostly for historical reasons),
which fits comfortably into MatchLabels and can be passed to any
node-selecting k8s resource without loss of information.

This commit introduces a pair of helper functions for each kind of kata
node selector, one that defines it and one convenience wrapper to convert
it to a commonly used type, respectively.

Signed-off-by: Pavel Mores <pmores@redhat.com>
The unlabelNodes() change is a bit more involved since it also changes
the function signature (there's no need anymore to perform the selector
type conversion from metav1.LabelSelector to labels.Selector internally)
and fixes a flaw in the previous version which swallowed possible errors
from the selector type conversion.

Signed-off-by: Pavel Mores <pmores@redhat.com>
Note that only the MCP-constructing code actually needed update as the
other NodeSelector users (Daemonset and RuntimeClass) use the correct
function already.

Fixes: KATA-1860
Fixes: KATA-1965
Fixes: KATA-1966

Signed-off-by: Pavel Mores <pmores@redhat.com>
getNodeSelectorAsMapNoKataOc() has always been a temporary function to
bridge the gap until the proper solution is in place.

getNodeSelectorAsSelector() is obsolete as NodeSelector is never needed
as labels.Selector.  labels.Selector is a type that, unlike the closely
related LabelSelector, supports actually matching a set of labels against
it.  This is now only done with KataConfigNodeSelector, NodeSelector is
only ever used as a value of k8s resources' nodeSelector fields.

Signed-off-by: Pavel Mores <pmores@redhat.com>
@openshift-ci openshift-ci bot requested review from bpradipt and jensfr March 10, 2023 08:45
Copy link
Contributor

@bpradipt bpradipt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
Thanks @pmores

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Mar 10, 2023
Copy link
Member

@gkurz gkurz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

Thanks @pmores ! This was easier to review than the previous one indeed 😄

@openshift-ci
Copy link

openshift-ci bot commented Mar 10, 2023

@pmores: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/sandboxed-containers-operator-e2e af1441c link false /test sandboxed-containers-operator-e2e

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@bpradipt bpradipt merged commit 070d85b into openshift:main Mar 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants