KEP of pod affinity/anti-affinity supports Gt and Lt operators
...heduling/20190312-pod-affinity-and-anti-affinity-support-gt-and-lt-operators.md
---
title: Pod affinity/anti-affinity supports Gt and Lt operators
authors:
  - "@wgliang"
owning-sig: sig-scheduling
reviewers:
  - "@bsalamat"
  - "@k82cn"
  - "@Huang-Wei"
approvers:
  - "@bsalamat"
  - "@k82cn"
creation-date: 2019-02-22
last-updated: 2019-03-12
status: provisional
---

# Pod affinity/anti-affinity supports Gt and Lt operators

## Table of Contents

* [Summary](#summary)
* [Motivation](#motivation)
    * [Goals](#goals)
    * [Non-Goals](#non-goals)
* [Proposal](#proposal)
    * [User Stories](#user-stories)
    * [Risks and Mitigations](#risks-and-mitigations)
* [Design Details](#design-details)
    * [Content](#content)
    * [Test Plan](#test-plan)
    * [Graduation Criteria](#graduation-criteria)
* [Implementation History](#implementation-history)

## Summary

Extend the `Pod` affinity/anti-affinity operators to support `Gt` and `Lt`, so that
users can select Pods by a numeric range of label values instead of enumerating every value.

## Motivation

`Node` affinity currently supports the operators `In`, `NotIn`, `Exists`,
`DoesNotExist`, `Gt`, and `Lt`. Pod affinity/anti-affinity, however, only works with the regular
label selector operators: `In`, `NotIn`, `Exists`, and `DoesNotExist`.

This is not ideal for users who want to place Pods based on a range of label values.
Supporting the `Gt` and `Lt` operators in `Pod` affinity/anti-affinity gives users more
control over placement.

### Goals

- `Pod` affinity/anti-affinity supports the `Gt` and `Lt` operators.
- `Gt` and `Lt` have the same status and influence as the existing operators (`In`,
  `NotIn`, `Exists`, `DoesNotExist`).
- `Gt` and `Lt` work with both `requiredDuringSchedulingIgnoredDuringExecution` (predicate, hard requirements) and `preferredDuringSchedulingIgnoredDuringExecution` (priority, soft requirements).

### Non-Goals

- Changing the existing definition of the `Gt` and `Lt` operators. This KEP only makes them available to Pod affinity/anti-affinity.
- Changing the behavior of other label selectors, such as those used by `ReplicaSets`, `DaemonSets`, etc.

## Proposal

### User Stories

As an application developer, I want my application Pods to be scheduled onto
a node that already runs a Pod with the label "foo" whose value is between "20" and "30".

With only the existing `In` operator, every value in the range has to be listed
explicitly. Label values are strings, so they are quoted here:

```yaml
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: foo
            operator: In
            values:
            - "20"
            - "21"
            - "22"
            - "23"
            ...
            - "28"
            - "29"
            - "30"
        topologyKey: failure-domain.beta.kubernetes.io/zone
```

Enumerating every value does not scale. A more promising solution is to provide the `Gt` and `Lt`
operators, which let users specify a range of label values directly. Assuming the operators keep
their node affinity semantics, both comparisons are strict, so the bounds below exclude 20 and 30
themselves (use 19 and 31 to include them):

```yaml
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: foo
            operator: Gt
            values:
            - "20"
          - key: foo
            operator: Lt
            values:
            - "30"
        topologyKey: failure-domain.beta.kubernetes.io/zone
```

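The range check expressed by the `Gt`/`Lt` manifest can be sketched in Go as follows. This is an illustrative sketch only, not scheduler code: `matchGtLt` and `inRange` are hypothetical helpers, and the sketch assumes Pod affinity would reuse the node affinity semantics, where `Gt` and `Lt` take a single value, both sides are parsed as integers, and the comparison is strict.

```go
package main

import (
	"fmt"
	"strconv"
)

// matchGtLt is a hypothetical helper. It reports whether labelValue satisfies
// "labelValue op bound", where op is "Gt" or "Lt". Values that do not parse as
// base-10 integers never match (an assumption borrowed from node affinity).
func matchGtLt(op, labelValue, bound string) bool {
	lv, err := strconv.ParseInt(labelValue, 10, 64)
	if err != nil {
		return false
	}
	b, err := strconv.ParseInt(bound, 10, 64)
	if err != nil {
		return false
	}
	switch op {
	case "Gt":
		return lv > b
	case "Lt":
		return lv < b
	default:
		return false
	}
}

// inRange combines the two requirements from the manifest:
// the label must exist, be greater than lower, and be less than upper.
func inRange(labels map[string]string, key, lower, upper string) bool {
	v, ok := labels[key]
	if !ok {
		return false
	}
	return matchGtLt("Gt", v, lower) && matchGtLt("Lt", v, upper)
}

func main() {
	for _, v := range []string{"20", "25", "30", "abc"} {
		pod := map[string]string{"foo": v}
		fmt.Printf("foo=%s matches: %t\n", v, inRange(pod, "foo", "20", "30"))
	}
}
```

With bounds 20 and 30, a Pod labeled `foo=25` matches, while `foo=20`, `foo=30`, and a non-numeric value do not.
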
### Risks and Mitigations

The biggest risk of this feature is performance. This may also be the reason
why the `Gt` and `Lt` operators were not supported when Pod affinity/anti-affinity was
first proposed. To understand the impact of this change, we will run performance/benchmark tests against it, as well as backward compatibility tests.

As a mitigation, the performance of the scheduler as a whole has improved considerably (https://github.com/kubernetes/kubernetes/pull/74041#issuecomment-466191359), and the processing of Pod affinity/anti-affinity has been substantially optimized (https://github.com/kubernetes/kubernetes/pull/67788).

## Design Details

### Content

We will change the type of the `LabelSelector` field of `PodAffinityTerm` from
`metav1.LabelSelector` to the newly implemented `PodSelector`:

```go
type PodAffinityTerm struct {
	LabelSelector *PodSelector
	// ...
}
```

The API of `PodSelector` is defined as follows:

```go
// PodSelector is a label query over a set of Pods.
type PodSelector struct {
	// +optional
	MatchLabels map[string]string `json:"matchLabels,omitempty" protobuf:"bytes,1,rep,name=matchLabels"`
	// +optional
	MatchExpressions []PodSelectorRequirement `json:"matchExpressions,omitempty" protobuf:"bytes,2,rep,name=matchExpressions"`
}

// PodSelectorRequirement is a single key/operator/values requirement.
type PodSelectorRequirement struct {
	Key      string              `json:"key" patchStrategy:"merge" patchMergeKey:"key" protobuf:"bytes,1,opt,name=key"`
	Operator PodSelectorOperator `json:"operator" protobuf:"bytes,2,opt,name=operator,casttype=PodSelectorOperator"`
	// +optional
	Values []string `json:"values,omitempty" protobuf:"bytes,3,rep,name=values"`
}

// PodSelectorOperator is the set of operators usable in a PodSelectorRequirement.
type PodSelectorOperator string

const (
	PodSelectorOpIn           PodSelectorOperator = "In"
	PodSelectorOpNotIn        PodSelectorOperator = "NotIn"
	PodSelectorOpExists       PodSelectorOperator = "Exists"
	PodSelectorOpDoesNotExist PodSelectorOperator = "DoesNotExist"
	PodSelectorOpGt           PodSelectorOperator = "Gt"
	PodSelectorOpLt           PodSelectorOperator = "Lt"
)
```

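To make the intended semantics of the six operators concrete, here is a minimal matching sketch. It is not the proposed implementation: the types are local stand-ins for the API above (tags omitted), `Matches` and `matchesRequirement` are hypothetical names, and the `Gt`/`Lt` behavior (exactly one value, integer parsing, strict comparison) is an assumption carried over from node affinity.

```go
package main

import (
	"fmt"
	"strconv"
)

// Local stand-ins for the proposed API types, trimmed to what matching needs.
type PodSelectorOperator string

const (
	PodSelectorOpIn           PodSelectorOperator = "In"
	PodSelectorOpNotIn        PodSelectorOperator = "NotIn"
	PodSelectorOpExists       PodSelectorOperator = "Exists"
	PodSelectorOpDoesNotExist PodSelectorOperator = "DoesNotExist"
	PodSelectorOpGt           PodSelectorOperator = "Gt"
	PodSelectorOpLt           PodSelectorOperator = "Lt"
)

type PodSelectorRequirement struct {
	Key      string
	Operator PodSelectorOperator
	Values   []string
}

type PodSelector struct {
	MatchLabels      map[string]string
	MatchExpressions []PodSelectorRequirement
}

// contains reports whether v is one of values.
func contains(values []string, v string) bool {
	for _, x := range values {
		if x == v {
			return true
		}
	}
	return false
}

// matchesRequirement evaluates one requirement against a Pod's labels.
func matchesRequirement(r PodSelectorRequirement, labels map[string]string) bool {
	v, ok := labels[r.Key]
	switch r.Operator {
	case PodSelectorOpIn:
		return ok && contains(r.Values, v)
	case PodSelectorOpNotIn:
		// An absent key also satisfies NotIn, as with regular label selectors.
		return !ok || !contains(r.Values, v)
	case PodSelectorOpExists:
		return ok
	case PodSelectorOpDoesNotExist:
		return !ok
	case PodSelectorOpGt, PodSelectorOpLt:
		// Assumed semantics: one integer bound, strict comparison.
		if !ok || len(r.Values) != 1 {
			return false
		}
		lv, err1 := strconv.ParseInt(v, 10, 64)
		bound, err2 := strconv.ParseInt(r.Values[0], 10, 64)
		if err1 != nil || err2 != nil {
			return false
		}
		if r.Operator == PodSelectorOpGt {
			return lv > bound
		}
		return lv < bound
	}
	return false
}

// Matches reports whether labels satisfy every MatchLabels entry and every
// MatchExpressions requirement (all conditions are ANDed).
func Matches(s PodSelector, labels map[string]string) bool {
	for k, want := range s.MatchLabels {
		if got, ok := labels[k]; !ok || got != want {
			return false
		}
	}
	for _, r := range s.MatchExpressions {
		if !matchesRequirement(r, labels) {
			return false
		}
	}
	return true
}

func main() {
	sel := PodSelector{MatchExpressions: []PodSelectorRequirement{
		{Key: "foo", Operator: PodSelectorOpGt, Values: []string{"20"}},
		{Key: "foo", Operator: PodSelectorOpLt, Values: []string{"30"}},
	}}
	fmt.Println(Matches(sel, map[string]string{"foo": "25"}))
	fmt.Println(Matches(sel, map[string]string{"foo": "30"}))
}
```

Because the four existing operators behave exactly as they do in a regular label selector, existing selectors would evaluate identically under this sketch; only `Gt` and `Lt` add new behavior.
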
### Test Plan

_To be filled in once targeted at a release._

### Graduation Criteria

_To be filled in once targeted at a release._

## Implementation History

- 2019-03-12: Initial KEP sent out for review.