New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: fleet agent deployment configures tolerations from cluster CR #1154
Conversation
Hey thanks. I guess this will not conflict with the existing tolerations in https://github.com/rancher/fleet/blob/master/pkg/agent/manifest.go#L234-L244? |
@@ -93,6 +94,10 @@ func Manifest(namespace string, agentScope string, opts ManifestOptions) []runti | |||
propagateDebug, _ := strconv.ParseBool(os.Getenv("FLEET_PROPAGATE_DEBUG_SETTINGS_TO_AGENTS")) | |||
debug := logrus.IsLevelEnabled(logrus.DebugLevel) && propagateDebug | |||
dep := agentDeployment(namespace, DefaultName, image, opts.AgentImagePullPolicy, DefaultName, false, debug) | |||
|
|||
// additional tolerations | |||
dep.Spec.Template.Spec.Tolerations = append(dep.Spec.Template.Spec.Tolerations, opts.AgentTolerations...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@manno it will preserve the tolerations defined by the agentDeployment
func and append anything that the user sent. However, it is possible for the user defined tolerations to override the default ones if they have the same key, but I don't see why someone would do that intentionally as the default ones are rancher namespaced.
@rajiteh could you provide ci tests for your pr? We try to improve our code base and having automated tests for new non trivial code changes helps a lot. Thanks in advance. |
f5fd0a3
to
e8a7f9b
Compare
@thardeck @manno it was a little too tough for me to grok how the e2e stuff worked in this repo, I can't seem to figure out if there is any code that tests agent deployment behaviour and I'm too lost to come up with my own. I decided to add some unit tests to verify the functionality instead. Let me know if that's enough for now. Thanks! |
@rajiteh I don't think we have anything to really check agent deployment. Our E2E tests mostly apply Gitrepos and use kubectl to check the outcome. |
Related:
This PR introduces the ability for fleet admins to configure tolerations that must be included in the fleet agent deployment by setting them in each cluster CR's spec.
A new configuration field
.agentTolerations
provide a way to define a list ofcorev1.Toleration
objects.This solves the problem of not being able to use Fleet in clusters which all the nodes are tainted (no schedulable nodes for agent deployment)
Additional Information
This solution let's the user/system that manages fleet Cluster resources to configure the tolerations for that particular cluster's agent deployment.
Tradeoff
In agent initiated registration flow, the initial agent deployment will need to replicate the same set of tolerations.
ie: if using helm then we would have to set
.kubectl.tolerations
andfleetAgent.tolerations
fieldsHowever, it can be argued that one could simply set a catch all toleration
- operator: Exists
in the agent initiated flow which guarantees that the agent will get scheduled in the downstream cluster. Once it has communicated with the fleet controller, the new bundle pushed out by the controller will contain the proper tolerations as defined by the Cluster CR which schedule the pods in the appropriate nodes.Potential improvement