[RFE] Agent Deployment Customization Support #41035

Closed
3 tasks done
snasovich opened this issue Mar 31, 2023 · 8 comments
Labels
area/agent Issues that deal with the Rancher Agent area/imported-suc-managed K3s/RKE2 Imported Clusters area/provisioning-v2 Provisioning issues that are specific to the provisioningv2 generating framework internal JIRA To be used in correspondence with the internal ticketing system. kind/enhancement Issues that improve or augment existing functionality release-note Note this issue in the milestone's release notes security-required team/hostbusters The team that is responsible for provisioning/managing downstream clusters + K8s version support
Comments

@snasovich
Collaborator

snasovich commented Mar 31, 2023

Is your feature request related to a problem? Please describe.

Users want a way to configure node affinity and resource limits, and to schedule pods on specific nodes, for the cluster and fleet agent deployments.

Describe the solution you'd like

Add the following configurable fields in the Rancher backend and UI:

  • node affinity (override existing)
  • resource requests and limits (no existing values)
  • node tolerations (append to existing)

These fields will be defined in a generic struct for the agent called AgentDeploymentCustomization. Users will be able to override the defaults for node affinity and resource limits for the cluster or fleet agent in the Rancher UI when creating or updating a cluster. There will already be default tolerations defined for both agents, but users will be able to add additional tolerations via the UI. These fields will then get passed to the cluster object and picked up by each agent deployment that will deploy pods onto the downstream cluster.
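As a rough illustration of the override/append semantics described above, here is a minimal Go sketch. It is not the actual Rancher definition: the field names (`AppendTolerations`, `OverrideAffinity`, `OverrideResourceRequirements`), the helper functions, and the simplified `Toleration`, `Affinity`, and `ResourceRequirements` types are assumptions standing in for the real Kubernetes core/v1 types, and the default toleration shown is illustrative only.

```go
package main

import "fmt"

// Toleration is a simplified, hypothetical stand-in for the Kubernetes
// core/v1 Toleration type.
type Toleration struct {
	Key      string
	Operator string
	Effect   string
}

// Affinity is a simplified, hypothetical stand-in for core/v1 Affinity.
type Affinity struct {
	RequiredNodeLabels map[string]string
}

// ResourceRequirements is a simplified, hypothetical stand-in for
// core/v1 ResourceRequirements.
type ResourceRequirements struct {
	Requests map[string]string
	Limits   map[string]string
}

// AgentDeploymentCustomization groups the three user-configurable fields.
// The field names are assumptions chosen to mirror the semantics in the
// issue: tolerations are appended, affinity and resources are overridden.
type AgentDeploymentCustomization struct {
	AppendTolerations            []Toleration
	OverrideAffinity             *Affinity
	OverrideResourceRequirements *ResourceRequirements
}

// effectiveTolerations returns the default tolerations followed by any
// user-supplied ones. It copies into a new slice so the defaults are
// never modified.
func effectiveTolerations(defaults []Toleration, c *AgentDeploymentCustomization) []Toleration {
	out := make([]Toleration, 0, len(defaults))
	out = append(out, defaults...)
	if c != nil {
		out = append(out, c.AppendTolerations...)
	}
	return out
}

// effectiveResources returns the user-supplied resource requirements, or
// nil when none were set (there is no default to fall back to).
func effectiveResources(c *AgentDeploymentCustomization) *ResourceRequirements {
	if c == nil {
		return nil
	}
	return c.OverrideResourceRequirements
}

func main() {
	// Illustrative default only; the real agent defaults may differ.
	defaults := []Toleration{
		{Key: "node-role.kubernetes.io/control-plane", Operator: "Exists", Effect: "NoSchedule"},
	}
	custom := &AgentDeploymentCustomization{
		AppendTolerations: []Toleration{
			{Key: "dedicated", Operator: "Exists", Effect: "NoSchedule"},
		},
		OverrideResourceRequirements: &ResourceRequirements{
			Requests: map[string]string{"cpu": "100m", "memory": "128Mi"},
			Limits:   map[string]string{"cpu": "500m", "memory": "512Mi"},
		},
	}
	fmt.Println(len(effectiveTolerations(defaults, custom)), "tolerations")
	fmt.Println("limits:", effectiveResources(custom).Limits)
}
```

Passing a nil customization leaves the default tolerations untouched and yields no resource requirements, which matches the "append to existing" and "no existing" notes in the list above.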

This feature touches RKE1, v2prov (RKE2/K3s) and hosted provider clusters (AKS / EKS / GKE) but is currently prioritized for EKS and RKE2.

Further design docs are available internally.

Additional context

Sub-tasks:

  • Data structures and new settings for defaults in a separate PR to unblock UI, target merge of 4/12/2023. Includes:

    • AgentDeploymentCustomization struct and adding it to both provisioning and management Cluster objects
    • cluster-agent-default-affinity setting with the default value matching the current affinity used for cluster-agent deployment
    • fleet-agent-default-affinity setting with the default value matching the current affinity used for fleet-agent deployment
  • Add backend logic to use values passed in ClusterAgentDeploymentCustomization and FleetAgentDeploymentCustomization in a separate PR, target merge of 4/26/2023. Includes:

    • Updating YAML manifest generation for cluster-agent deployment to use values passed
    • Sync of these ClusterAgentDeploymentCustomization and FleetAgentDeploymentCustomization fields between management and provisioning cluster objects (also includes using cluster-agent-default-affinity setting for affinity)
    • Application of values from the FleetAgentDeploymentCustomization field of the provisioning cluster object to the Fleet cluster object (also includes using the fleet-agent-default-affinity setting for affinity)
  • Add validation tests (and possibly unit tests?)
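The affinity fallback mentioned in the sub-tasks (use the user's override when present, otherwise the `cluster-agent-default-affinity` or `fleet-agent-default-affinity` setting) could look roughly like the Go sketch below. Everything beyond the two setting names is an assumption: the in-memory `settings` map, the JSON encoding of the setting value, the simplified `Affinity` type, and the `resolveAffinity` helper are hypothetical, and the default values are illustrative rather than the agents' real defaults.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Affinity is a simplified, hypothetical stand-in for core/v1 Affinity.
type Affinity struct {
	RequiredNodeLabels map[string]string `json:"requiredNodeLabels,omitempty"`
}

// settings is a hypothetical in-memory stand-in for Rancher settings.
// Storing the value as a JSON string is an assumption, and the values
// themselves are illustrative only.
var settings = map[string]string{
	"cluster-agent-default-affinity": `{"requiredNodeLabels":{"kubernetes.io/os":"linux"}}`,
	"fleet-agent-default-affinity":   `{"requiredNodeLabels":{"kubernetes.io/os":"linux"}}`,
}

// resolveAffinity returns the user override when one is set; otherwise it
// decodes the default affinity from the named setting.
func resolveAffinity(settingName string, override *Affinity) (*Affinity, error) {
	if override != nil {
		return override, nil
	}
	raw, ok := settings[settingName]
	if !ok {
		return nil, fmt.Errorf("setting %q not found", settingName)
	}
	var def Affinity
	if err := json.Unmarshal([]byte(raw), &def); err != nil {
		return nil, fmt.Errorf("decoding %s: %w", settingName, err)
	}
	return &def, nil
}

func main() {
	// No override: fall back to the setting's default.
	def, err := resolveAffinity("cluster-agent-default-affinity", nil)
	if err != nil {
		panic(err)
	}
	fmt.Println("default:", def.RequiredNodeLabels)

	// Override present: the setting is ignored entirely.
	override := &Affinity{RequiredNodeLabels: map[string]string{"node-pool": "agents"}}
	got, err := resolveAffinity("fleet-agent-default-affinity", override)
	if err != nil {
		panic(err)
	}
	fmt.Println("override:", got.RequiredNodeLabels)
}
```

A unit test for this kind of helper would mainly need to cover three cases: override set, override unset with a valid setting, and a missing or malformed setting.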

SURE-4552

@snasovich snasovich added kind/enhancement Issues that improve or augment existing functionality area/agent Issues that deal with the Rancher Agent internal [zube]: Next Up team/hostbusters The team that is responsible for provisioning/managing downstream clusters + K8s version support JIRA To be used in correspondence with the internal ticketing system. labels Mar 31, 2023
@snasovich snasovich added this to the 2023-Q2-v2.7x milestone Mar 31, 2023
@Oats87 Oats87 added area/provisioning-v2 Provisioning issues that are specific to the provisioningv2 generating framework area/imported-suc-managed K3s/RKE2 Imported Clusters labels Apr 10, 2023
@a-blender
Contributor

a-blender commented Apr 13, 2023

@gaktive I've pushed a docker image annablender/rancher:cattle-agent-structs with updates from this PR that can be used to unblock UI. This is being done in lieu of merging schema changes into release/v2.7 without backend plumbing. The plan is to branch off the PR used to create this image, and add backend plumbing that way so all changes can be reviewed at once before deciding to merge.

@gaktive
Member

gaktive commented Apr 13, 2023

Thanks @Anna-Blendermann! cc: @mantis-toboggan-md @aalves08

@a-blender
Contributor

Cluster agent backend logic is now available in Rancher v2.7-head! Please wait for webhook updates as it needs the backend updates to persist data on the cluster object.

@aalves08
Contributor

aalves08 commented May 4, 2023

@a-blender If I understand correctly, there is still one last step to be done to persist the data, right?

@a-blender
Contributor

@aalves08 Webhook has been updated now, that was the last step for the backend.

@kinarashah
Member

kinarashah commented May 5, 2023

@aalves08 Updated rancher so it pulls the latest webhook, which should fix the data persistence issue. Once https://drone-publish.rancher.io/rancher/rancher/9586 or a later build passes, you can pick it up with rancher/rancher:v2.7-head.

Update: build passed, available with rancher/rancher:v2.7-head

@slickwarren
Contributor

QA has tested this feature with an internal test plan. All issues for this feature have been filed separately, and the main functionality is working as expected on v2.7-head (1eb478f)

@a-blender
Contributor

a-blender commented May 19, 2023

@slickwarren Amazing thank you!
