
KATA-1444: remove nodeSelector from controller deployment #185

Merged
merged 1 commit into openshift:master on Apr 4, 2022

Conversation

@jensfr (Contributor) commented Mar 25, 2022

Currently we restrict the operator to run on master nodes only.
However, there is no technical reason for this restriction. It
creates a problem when a cluster-admin sets the cluster-wide
defaultNodeSelector field to schedule everything to workers. In that
case our controller could not be placed anywhere by the scheduler and
would get stuck in the 'Pending' state.
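
For illustration, a cluster-wide default of this kind might look as follows (a sketch; the worker selector value is an assumed example, not taken from this PR). Combined with a pod-level nodeSelector pinning the controller to masters, no node satisfies both constraints:

```yaml
# Assumed example: a cluster-wide default node selector steering pods
# to worker nodes. A pod that also carries a master nodeSelector then
# has no schedulable node and stays Pending.
apiVersion: config.openshift.io/v1
kind: Scheduler
metadata:
  name: cluster
spec:
  defaultNodeSelector: node-role.kubernetes.io/worker=
```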

By removing the nodeSelector field from our controller manifest we let
the scheduler place the controller pod on worker nodes as well, so we
no longer run into the situation described above.
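
A minimal sketch of the resulting change to the controller Deployment (the manifest layout, names, and label value are assumptions for illustration; the actual diff in this PR is authoritative):

```yaml
# Sketch, assuming the manifest previously pinned the controller to
# control-plane nodes via a nodeSelector.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: controller-manager
spec:
  replicas: 1
  selector:
    matchLabels:
      control-plane: controller-manager
  template:
    metadata:
      labels:
        control-plane: controller-manager
    spec:
      # Removed by this PR -- running on masters is not required:
      # nodeSelector:
      #   node-role.kubernetes.io/master: ""
      containers:
        - name: manager
          image: controller:latest
```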

I tested this on clusters with

  • three control-plane and three worker nodes
  • a converged cluster where control-plane and worker nodes are on the
    same machines
  • a single-node cluster

and did not run into problems. The test procedure included deploying a
cluster, creating a KataConfig CR, running a workload with the kata
runtime class, deleting the KataConfig CR, and destroying the cluster.
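
For reference, the middle steps amount to roughly the following (a sketch; the resource names and test pod are assumed examples — only the KataConfig kind and the kata runtime class come from this PR):

```yaml
# Assumed example of the CR and workload used during testing.
apiVersion: kataconfiguration.openshift.io/v1
kind: KataConfig
metadata:
  name: example-kataconfig
---
# A test workload scheduled with the kata runtime class.
apiVersion: v1
kind: Pod
metadata:
  name: kata-test
spec:
  runtimeClassName: kata
  containers:
    - name: app
      image: registry.access.redhat.com/ubi8/ubi-minimal
      command: ["sleep", "infinity"]
```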

Signed-off-by: Jens Freimann <jfreimann@redhat.com>

@jensfr jensfr requested a review from bpradipt March 25, 2022 13:31
@jensfr jensfr changed the title KATA-1444: remvoe nodeSelector from controller deployment KATA-1444: remove nodeSelector from controller deployment Mar 25, 2022
@bpradipt (Contributor) left a comment

/lgtm
Thanks @jensfr

@bpradipt bpradipt closed this Apr 4, 2022
@bpradipt bpradipt reopened this Apr 4, 2022

openshift-ci bot commented Apr 4, 2022

@jensfr: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@bpradipt (Contributor) commented Apr 4, 2022

@jensfr closing and re-opening the PR did the trick and the tests ran successfully. Merging this

@bpradipt bpradipt merged commit 2fa0755 into openshift:master Apr 4, 2022