ceph: merge toleration for osd/prepareOSD pod if specified both places #8566
Conversation
Test result on a PVC-based cluster when the toleration is applied at the cluster-wide `spec.placement.all` level.

Applied spec:

```yaml
placement:
  all:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: topology.kubernetes.io/zone
            operator: In
            values:
            - ap-southeast-1a
            - ap-southeast-1b
            - ap-southeast-1c
            - ap-southeast-1d
    tolerations:
    - key: "key1"
      operator: "Equal"
      value: "value1"
      effect: "NoSchedule"
```

Result:

```yaml
placement:
  all:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: topology.kubernetes.io/zone
            operator: In
            values:
            - ap-southeast-1a
            - ap-southeast-1b
            - ap-southeast-1c
            - ap-southeast-1d
    tolerations:
    - effect: NoSchedule
      key: key1
      operator: Equal
      value: value1
```

Pods:

```
rook-ceph-mgr-a-656447c65b-p6lbd               1/1   Running     0   15m
rook-ceph-mon-a-669b744887-phqh6               1/1   Running     0   19m
rook-ceph-mon-b-78f498cd68-mqq26               1/1   Running     0   19m
rook-ceph-mon-c-6fdff8fd4f-9hhd9               1/1   Running     0   17m
rook-ceph-operator-6768ff8467-7lbkd            1/1   Running     0   29m
rook-ceph-osd-0-546ff4bb8b-4g9zp               1/1   Running     0   14m
rook-ceph-osd-1-9c855657b-5st5z                1/1   Running     0   14m
rook-ceph-osd-2-7874bdbd85-ldqqd               1/1   Running     0   14m
rook-ceph-osd-prepare-set1-data-0cvh5z-z8mwn   0/1   Completed   0   15m
rook-ceph-osd-prepare-set1-data-1k765m-vmvd4   0/1   Completed   0   15m
rook-ceph-osd-prepare-set1-data-2gt4pb-9dbw7   0/1   Completed   0   15m
```
Test result on a PVC cluster when the tolerations are applied at the `storage.storageClassDeviceSets` level.

Applied spec:

```yaml
storage:
  storageClassDeviceSets:
  - placement:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: topology.kubernetes.io/zone
              operator: In
              values:
              - ap-southeast-1a
              - ap-southeast-1b
              - ap-southeast-1c
              - ap-southeast-1d
      tolerations:
      - key: "key1"
        operator: "Equal"
        value: "value1"
        effect: "NoSchedule"
    preparePlacement:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: topology.kubernetes.io/zone
              operator: In
              values:
              - ap-southeast-1a
              - ap-southeast-1b
              - ap-southeast-1c
              - ap-southeast-1d
      tolerations:
      - key: "key1"
        operator: "Equal"
        value: "value1"
        effect: "NoSchedule"
```

Result:

```yaml
storage:
  storageClassDeviceSets:
  - placement:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: topology.kubernetes.io/zone
              operator: In
              values:
              - ap-southeast-1a
              - ap-southeast-1b
              - ap-southeast-1c
              - ap-southeast-1d
      tolerations:
      - effect: NoSchedule
        key: key1
        operator: Equal
        value: value1
    preparePlacement:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: topology.kubernetes.io/zone
              operator: In
              values:
              - ap-southeast-1a
              - ap-southeast-1b
              - ap-southeast-1c
              - ap-southeast-1d
      tolerations:
      - effect: NoSchedule
        key: key1
        operator: Equal
        value: value1
```
nit
Can you also show the resulting OSD pod spec for a cluster with different tolerations on both `all` and `storageClassDeviceSets.placement`?
Force-pushed from 56c3a63 to 638937e.
Test result on a PVC cluster with the toleration specified at both levels.

Applied spec:

```yaml
spec:
  placement:
    all:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: topology.kubernetes.io/zone
              operator: In
              values:
              - ap-southeast-1a
              - ap-southeast-1b
              - ap-southeast-1c
              - ap-southeast-1d
      tolerations:
      - key: "key1"
        operator: "Equal"
        value: "value1"
        effect: "NoSchedule"
  storage:
    storageClassDeviceSets:
    - placement:
        tolerations:
        - key: "key1"
          operator: "Equal"
          value: "value1"
          effect: "NoSchedule"
```

Resulting OSD pod spec (the same toleration is merged from both levels, so it appears twice):

```yaml
tolerations:
- effect: NoSchedule
  key: key1
  operator: Equal
  value: value1
- effect: NoSchedule
  key: key1
  operator: Equal
  value: value1
- effect: NoExecute
  key: node.kubernetes.io/unreachable
  operator: Exists
  tolerationSeconds: 5
- effect: NoExecute
  key: node.kubernetes.io/not-ready
  operator: Exists
  tolerationSeconds: 300
topologySpreadConstraints:
- labelSelector:
    matchExpressions:
    - key: app
      operator: In
      values:
      - rook-ceph-osd
```
I'll test on a non-PVC cluster again tomorrow with the latest minor changes.
It doesn't hurt to have the same toleration twice, but the real scenario is that different tolerations would be specified at the two levels. If you test with a different toleration, it will also confirm that the same toleration isn't just being duplicated.
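For context, a minimal sketch of why an identical toleration at both levels shows up twice, assuming a simple append-based merge (hypothetical helper, not the actual Rook code):

```go
package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
)

// mergeTolerations mirrors an append-based merge: it concatenates both
// lists without deduplicating, so a toleration specified identically at
// both levels appears twice in the merged pod spec.
func mergeTolerations(deviceSet, all []v1.Toleration) []v1.Toleration {
	return append(deviceSet, all...)
}

func main() {
	same := v1.Toleration{
		Key:      "key1",
		Operator: v1.TolerationOpEqual,
		Value:    "value1",
		Effect:   v1.TaintEffectNoSchedule,
	}
	merged := mergeTolerations([]v1.Toleration{same}, []v1.Toleration{same})
	fmt.Println(len(merged)) // 2: both copies survive, as in the test output above
}
```

Testing with different keys distinguishes a genuine merge from accidental duplication, which is what the next run below demonstrates.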
Applied spec (different tolerations at the two levels):

```yaml
spec:
  placement:
    all:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: topology.kubernetes.io/zone
              operator: In
              values:
              - ap-southeast-1a
              - ap-southeast-1b
              - ap-southeast-1c
              - ap-southeast-1d
      tolerations:
      - key: "key1"
        operator: "Equal"
        value: "value1"
        effect: "NoSchedule"
  storage:
    storageClassDeviceSets:
    - placement:
        tolerations:
        - key: "key2"
          operator: "Equal"
          value: "value1"
          effect: "NoSchedule"
```

Resulting OSD pod spec (both the `key2` and `key1` tolerations are present):

```yaml
tolerations:
- effect: NoSchedule
  key: key2
  operator: Equal
  value: value1
- effect: NoSchedule
  key: key1
  operator: Equal
  value: value1
- effect: NoExecute
  key: node.kubernetes.io/unreachable
  operator: Exists
  tolerationSeconds: 5
- effect: NoExecute
  key: node.kubernetes.io/not-ready
  operator: Exists
  tolerationSeconds: 300
```

This validates the changes.
Force-pushed from b42e697 to 5a2f4b7.
```go
if with.NodeAffinity != nil {
	ret.NodeAffinity = with.NodeAffinity
```
Still, these two lines of code are not perfect and may have an issue: they override the node affinity when discovering nodes. But I don't think it will cause a problem for now.
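For illustration, a minimal sketch of the asymmetry being pointed out, using a hypothetical `Placement` type rather than the actual Rook implementation: the node affinity is replaced wholesale, while the tolerations are appended.

```go
package placement

import v1 "k8s.io/api/core/v1"

// Placement is a simplified stand-in for a placement type carrying
// node affinity and tolerations.
type Placement struct {
	NodeAffinity *v1.NodeAffinity
	Tolerations  []v1.Toleration
}

// Merge sketches the behavior under discussion: the NodeAffinity from
// `with` replaces the receiver's value entirely (last writer wins),
// whereas tolerations from both levels are kept by appending.
func (p Placement) Merge(with Placement) Placement {
	ret := p
	if with.NodeAffinity != nil {
		ret.NodeAffinity = with.NodeAffinity // override, not a field-level merge
	}
	ret.Tolerations = append(ret.Tolerations, with.Tolerations...)
	return ret
}
```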
Why do we need a separate method for `MergeOsd()`? It seems like merging the tolerations should always be done in the `Merge()` method so it would apply to all daemons; then there would be no need for this method.
Agreed, merging the tolerations in the `Merge()` method will be better, and it will also be applied to the other pods like the OSD, MGR, and MON.
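As a sketch of the agreed direction (hypothetical helper, reusing the `Placement` sketch above): once the toleration merge lives in `Merge()`, every daemon's effective placement picks it up automatically.

```go
// effectivePlacement derives a daemon's placement by merging the
// cluster-wide placement.all with the daemon-specific placement, so a
// toleration merge inside Merge() applies uniformly to the mon, mgr,
// osd, and prepare-osd pods alike.
func effectivePlacement(all, daemon Placement) Placement {
	return all.Merge(daemon)
}
```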
Test result on a non-PVC cluster.

Applied spec:

```yaml
spec:
  placement:
    all:
      tolerations:
      - key: "key1"
        operator: "Equal"
        value: "value1"
        effect: "NoSchedule"
    osd:
      tolerations:
      - key: "key2"
        operator: "Equal"
        value: "value1"
        effect: "NoSchedule"
    prepareosd:
      tolerations:
      - key: "key2"
        operator: "Equal"
        value: "value1"
        effect: "NoSchedule"
```

Resulting OSD pod spec:

```yaml
tolerations:
- effect: NoSchedule
  key: key2
  operator: Equal
  value: value1
- effect: NoSchedule
  key: key1
  operator: Equal
  value: value1
- effect: NoExecute
  key: node.kubernetes.io/not-ready
  operator: Exists
  tolerationSeconds: 300
- effect: NoExecute
  key: node.kubernetes.io/unreachable
  operator: Exists
  tolerationSeconds: 300
```
Force-pushed from f1909e6 to 61d2e7f.
Earlier, `ApplyToPodSpec()` was only taking one toleration and ignoring the tolerations from `placement.ALL()`. This commit merges the tolerations for the Mgr, Mon, and OSD pods. For example, for the OSD it will merge `spec.placement.all` with `storageClassDeviceSets.placement` (in the case of PVCs) or `spec.placement.osd` (in the case of non-PVCs). Signed-off-by: subhamkrai <srai@redhat.com>
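A sketch of the input selection the commit message describes (hypothetical names, reusing the `Placement` sketch from earlier): which daemon-level placement is merged on top of `placement.all` depends on whether the OSDs are PVC-backed.

```go
// osdPlacement picks the second merge input for OSD pods: the device
// set's placement for PVC-backed OSDs, otherwise spec.placement.osd.
func osdPlacement(onPVC bool, all, osd, deviceSet Placement) Placement {
	if onPVC {
		return all.Merge(deviceSet)
	}
	return all.Merge(osd)
}
```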
Force-pushed from 61d2e7f to e1f232e.
ceph: merge toleration for osd/prepareOSD pod if specified both places (backport #8566)
Earlier, `ApplyToPodSpec()` was only taking one toleration and ignoring the tolerations from `placement.ALL()`.

Description of your changes:

This commit merges the tolerations for the Mgr, Mon, and OSD pods. For example, for the OSD it will merge `spec.placement.all` with `storageClassDeviceSets.placement` (in the case of PVCs) or `spec.placement.osd` (in the case of non-PVCs).

Signed-off-by: subhamkrai <srai@redhat.com>

TODO:
- Test on PVC cluster
- Test on non-PVC cluster

Which issue is resolved by this Pull Request:
Resolves #

Checklist:
- Code generation (`make codegen`) has been run to update object specifications, if necessary.