Flavors with matching names should have identical labels/taints #59

Closed
ahg-g opened this issue Feb 24, 2022 · 7 comments · Fixed by #133
Assignees
Labels
kind/feature Categorizes issue or PR as related to a new feature. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete.

Comments

@ahg-g
Contributor

ahg-g commented Feb 24, 2022

A capacity can borrow resources from flavors whose names match the ones defined in the capacity. Flavors with matching names should therefore also have identical labels and taints.

One solution is to define a cluster-scoped object API that represents resource flavors that capacities refer to by name when setting a quota. It would look like this:

type ResourceFlavorSpec struct {
  // The object name serves as the flavor name, e.g., nvidia-tesla-k80.

  // Resource is the resource name, e.g., nvidia.com/gpu.
  Resource v1.ResourceName

  // Labels associated with this flavor. These labels are matched against or
  // converted to node affinity constraints on the workload's pods.
  // For example, cloud.provider.com/accelerator: nvidia-tesla-k80.
  Labels map[string]string

  // Taints associated with this flavor that workloads must explicitly
  // "tolerate" to be able to use this flavor.
  // For example, cloud.provider.com/preemptible="true":NoSchedule.
  Taints []v1.Taint
}

This will avoid duplicating labels/taints on each capacity and so makes it easier to create a cohort of capacities with similar resources.

The downside, of course, is that we now have another resource that the batch admin needs to deal with. But I expect that the number of flavors will typically be small.

@ahg-g ahg-g added kind/feature Categorizes issue or PR as related to a new feature. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. labels Feb 24, 2022
@alculquicondor
Contributor

I would add that the ResourceFlavor shouldn't be mandatory.

OTOH, that means that we should proceed with scheduling even if we can't find a matching ResourceFlavor object. This can lead to jobs scheduling on flavors they shouldn't, especially during restarts. Is this acceptable?

@alculquicondor
Contributor

alculquicondor commented Mar 1, 2022

Are we settled on having a ResourceFlavor CRD?

@ahg-g
Contributor Author

ahg-g commented Mar 1, 2022

I think we should do that, yes.

@alculquicondor
Contributor

/assign

@alculquicondor
Contributor

/unassign

Leaving this for now (feel free to take it).

@ahg-g
Contributor Author

ahg-g commented Mar 8, 2022

/assign

@ahg-g
Contributor Author

ahg-g commented Mar 14, 2022

I would delay this until after 0.0.1; I think it is more important to focus on testing and verifying scale now.

2 participants