New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sts per Pod Name Label #55329
Sts per Pod Name Label #55329
Conversation
@@ -37,6 +37,10 @@ import ( | |||
// maxUpdateRetries is the maximum number of retries used for update conflict resolution prior to failure | |||
const maxUpdateRetries = 10 | |||
|
|||
// StatefulSetPodNameLabel is label added to each Pod in a StatefulSet. Its purpose is to allow Services to select | |||
// individual Pods in the StatefulSet. | |||
const StatefulSetPodNameLabel = "statefulSetPodName" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have conventions for machine-managed label names? For controller revision, we use controller.kubernetes.io/hash
. For failure domains, we use failure-domain.beta.kubernetes.io/region
. For Deployment, we use pod-template-hash
.
Also, should we put this constant in types.go
for a given API version like the others?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, should we put this constant in types.go for a given API version like the others?
The label is not version specific, and, by convention, the versioning seems to only apply to annotations. If we were to move it, I think the only other appropriate home would well_know_labels.go in the apis package.
I just want the name to be easily usable statefulset.kubernetes.io/podName
is fine with me, but as you point out we don't have conventions. It would be good to start a trend though. Do you have thoughts on a better name?
@@ -112,7 +116,8 @@ func identityMatches(set *apps.StatefulSet, pod *v1.Pod) bool { | |||
return ordinal >= 0 && | |||
set.Name == parent && | |||
pod.Name == getPodName(set, ordinal) && | |||
pod.Namespace == set.Namespace | |||
pod.Namespace == set.Namespace && | |||
pod.Labels[StatefulSetPodNameLabel] == pod.Name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we lose anything by making the label just the ordinal, instead of the whole Pod name?
I think it's more flexible. For example, you could select the ordinal=0 Pod across multiple StatefulSets.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, the pod name is namespace unique. The ordinal is not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Selecting the same Pod across multiple sts's defeats the purpose of the label.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. The ordinal label would be insufficient, even if combined with the other selector labels, because controllers can have overlapping selectors. Agreed that selecting across multiple StatefulSets doesn't meet any need we've ever heard of.
func updateIdentity(set *apps.StatefulSet, pod *v1.Pod) { | ||
pod.Name = getPodName(set, getOrdinal(pod)) | ||
pod.Namespace = set.Namespace | ||
|
||
pod.Labels[StatefulSetPodNameLabel] = pod.Name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible that pod.Labels
has become nil
again at this point? Do we enforce anywhere that Pods must have at least one label?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it shouldn't but it doesn't hurt to check
/retest |
I found some written conventions that seem to apply here:
This indicates we should avoid adding a prefix for the label key, but we should allow the user to override the label key with a setting in StatefulSet.Spec. For backward compatibility, maybe we should set the default key as "" (meaning don't add a label at all) for pre-v1 versions, so the behavior stays the same. For v1, we can set a non-empty default key.
Given this I suggest |
This is not applicable as it pertains to auto generated labels based on hashes and UUIDs that have no semantic meaning. e.g. Job controllers override.
I don't think we should allow name overriding without a more compelling reason. We should prefer simplicity over the potential to introduce misconfiguration unless there is an actual use case to support. If we find one, we can add an override switch in a backward compatible way.
SGTM |
@@ -37,6 +37,10 @@ import ( | |||
// maxUpdateRetries is the maximum number of retries used for update conflict resolution prior to failure | |||
const maxUpdateRetries = 10 | |||
|
|||
// StatefulSetPodNameLabel is label added to each Pod in a StatefulSet. Its purpose is to allow Services to select | |||
// individual Pods in the StatefulSet. | |||
const StatefulSetPodNameLabel = "statefulset.kubernetes.io/podName" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably put this in types.go, like where we placed pod-template-hash
@@ -185,13 +190,18 @@ func initIdentity(set *apps.StatefulSet, pod *v1.Pod) { | |||
// Set these immutable fields only on initial Pod creation, not updates. | |||
pod.Spec.Hostname = pod.Name | |||
pod.Spec.Subdomain = set.Spec.ServiceName | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: remove this
@@ -37,6 +37,10 @@ import ( | |||
// maxUpdateRetries is the maximum number of retries used for update conflict resolution prior to failure | |||
const maxUpdateRetries = 10 | |||
|
|||
// StatefulSetPodNameLabel is label added to each Pod in a StatefulSet. Its purpose is to allow Services to select | |||
// individual Pods in the StatefulSet. | |||
const StatefulSetPodNameLabel = "statefulset.kubernetes.io/podName" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most workload controllers use xxx-xxx-xxx
format (instead of xxx.xxxx.xxx/xxxx
) as label key (e.g. pod-template-hash, controller-revision-hash).
Deployment hash label was named pod-template-hash
instead of something like deployment.kubernetes.io/podTemplateHash
, because the latter is lengthy and more suitable as annotations -- we expect users to type label name (e.g. kubectl get name -l
) more frequently than annotations.
This is not documented AFAIK. The convention may have changed.
Does this fix #28660? |
@kow3ns wrote:
That particular paragraph may be specific to hash-style labels, but it's an application of the general convention proposed above it:
Of course, it's only the author's opinion of what the convention should be. Also, it seems we have decided not to follow it for Job and Deployment, so I'm willing to chalk it up to out-of-date documentation. However, I still think it's important that this behavior be controlled by a field that's off by default for pre-v1. Unlike annotations, labels are the user's domain, so adding a new label is a change in behavior. What do you think? @janetkuo wrote:
I think it makes a workaround possible, but a full fix for that issue would include a feature that creates a Service per StatefulSet Pod for you. |
ffa1624
to
4aab534
Compare
@janetkuo I think it closes #28660 as will not fix. We will never materializes Services on behalf of the user. We will provide, as a part of the StatefulSet's Pods' identity, a label that allows Services to be attached to the Pods in order to allow users to configure the Pods' networking as appropriate to their use case. @enisoc The label should be considered to be part of the Pods identity. Having a unique identity that is not selectable has become tantamount to a defect for many users. Service per Pod is the right way to route traffic to StatefulSet Pods where each Pod needs to advertise a unique network identity, and, without this label as part of the Pods identity, there is no viable way to do this. If adding the label was a destructive update I would definitely want an opt in switch, but the addition of the label to running Pods is a non-destructive update that should not adversely impact users. |
I still think that adding the label is new behavior according to the letter of the API conventions, but I'm willing to believe that it's non-destructive in all but the most idiosyncratic of setups. In addition, we are only "breaking" pre-v1 APIs. @janetkuo Do you have an opinion on the above? |
/retest |
@@ -28,6 +28,7 @@ const ( | |||
StatefulSetRevisionLabel = ControllerRevisionHashLabelKey | |||
DeprecatedRollbackTo = "deprecated.deployment.rollback.to" | |||
DeprecatedTemplateGeneration = "deprecated.daemonset.template.generation" | |||
StatefulSetPodNameLabel = "statefulset-pod-name" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be prefixed. We have absolutlely no consistency across teh codebase but it should be either:
statefulset.kubernetes.io/pod-name
or
kubernetes.io/statefulset-pod-name
I lean towards the former.
revision hash should probably be, also, but it's harder to fix retroactively.
attach a Service to an individual Pod.
d7d5fcc
to
1ce4ef8
Compare
/approve |
/retest |
[MILESTONENOTIFIER] Milestone Pull Request Needs Attention @enisoc @janetkuo @kow3ns @thockin @kubernetes/sig-apps-misc Action required: During code slush, pull requests in the milestone should be in progress. Note: If this pull request is not resolved or labeled as Pull Request Labels
|
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: janetkuo, kow3ns, thockin Associated issue: 44103 The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing |
Automatic merge from submit-queue (batch tested with PRs 55340, 55329, 56168, 56170, 56105). If you want to cherry-pick this change to another branch, please follow the instructions here. |
StatefulSets have pod name labels since Kubernetes v1.9 [1] Minimum supportted version in the project is 1.17.2+ The sync pod-name hook is provided in the example repo [2] for older versions where Sts pod-name-labels are not set by default [1] kubernetes/kubernetes#55329 [2] https://github.com/GoogleCloudPlatform/metacontroller/tree/master/examples/service-per-pod
StatefulSets have pod name labels since Kubernetes v1.9 [1] Minimum supportted version in the project is 1.17.2+ The sync pod-name hook is provided in the example repo [2] for older versions where Sts pod-name-labels are not set by default [1] kubernetes/kubernetes#55329 [2] https://github.com/GoogleCloudPlatform/metacontroller/tree/master/examples/service-per-pod
What this PR does / why we need it:
StatefulSet controller will add a label for each Pod in the StatefulSet. The label is of the form
statefulset.kubernetes.io/pod-name: <pod.Name>
. This allows a unique service to be created for each Pod in the StatefulSet.Fixes #44103, #28660