Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix StatefulSet set-based selector bug #59365

Merged
merged 4 commits into from
Feb 27, 2018

Conversation

ayushpateria
Copy link
Contributor

@ayushpateria ayushpateria commented Feb 5, 2018

What this PR does / why we need it:
ControllerRevisions were using selectors as the labels, in case of set-based selectors, the helper function to convert selectors to labels would break. This PR uses pod labels for ControllerRevision labels instead of selectors.
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #59266

Special notes for your reviewer:
I'm trying to learn Kubernetes codebase and would be happy to make changes if anything is off.
Release note:

Fix StatefulSet to work with set-based selectors.

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Feb 5, 2018
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Feb 5, 2018
@lavalamp
Copy link
Member

lavalamp commented Feb 5, 2018

Please add a test :)

@lavalamp
Copy link
Member

lavalamp commented Feb 5, 2018

/assign @cheftako

func getPodLabelsFromData(data runtime.RawExtension) map[string]string {
var dat map[string]interface{}
labelMap := make(map[string]string)
if err := json.Unmarshal(data.Raw, &dat); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we know that the raw extension will always be json and not say protobuf?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cheftako Please have a look at my recent commit :)

var dat map[string]interface{}
labelMap := make(map[string]string)
if err := json.Unmarshal(data.Raw, &dat); err != nil {
return labelMap
Copy link
Member

@cheftako cheftako Feb 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the raw data does not parse we should not be silently swallowing that error. Additionally we clearly are missing tests for cases where either the key or value are invalid.

Copy link
Contributor Author

@ayushpateria ayushpateria Feb 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will a log warning work? I didn't want it to break like previously if the problem is only with parsing of labels, as they aren't being used anywhere anyway.

return labelMap
}
for key, value := range labels {
labelMap[key] = value.(string)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should make sure to validate both the key and value to make sure they are valid label key and label values.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we required to validate key and values again? Because they must have already been validated when creating the StatefulSet (or pods).

@0xmichalis
Copy link
Contributor

@kubernetes/sig-apps-pr-reviews

@k8s-ci-robot k8s-ci-robot added the sig/apps Categorizes an issue or PR as relevant to SIG Apps. label Feb 6, 2018
if err != nil {
return nil, err
}
labelMap := getPodLabelsFromData(data)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should require the caller of NewControllerRevision to provide the labels, rather than making assumptions about the contents of data.

cc @kow3ns @janetkuo

Copy link
Contributor Author

@ayushpateria ayushpateria Feb 8, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@enisoc please check :)

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Feb 8, 2018
@@ -453,4 +451,4 @@ func (fh *fakeHistory) ReleaseControllerRevision(parent metav1.Object, revision
}
}
return clone, fh.indexer.Update(clone)
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this diff

@@ -316,8 +316,13 @@ func newRevision(set *apps.StatefulSet, revision int64, collisionCount *int32) (
if err != nil {
return nil, err
}
podLabels := set.Spec.Template.Labels
if podLabels == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

podLabels should never be nil

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had this doubt, did the check anyway to be on safer side. Removed the check now. Please see the recent commit.

@0xmichalis
Copy link
Contributor

lgtm

also needs a unit test that uses a set-based selector

@ayushpateria
Copy link
Contributor Author

@Kargakis I've modified newStatefulSet() to include a set based selector, this should ensure controller revisions don't fail when stateful sets are having set-based selectors.

if err != nil {
t.Fatal(err)
}
ss1Rev2.Namespace = ss1.Namespace
ss1Rev3, err := NewControllerRevision(ss1, parentKind, sel1, rawTemplate(&ss1.Spec.Template), 2, ss1.Status.CollisionCount)
ss1Rev3, err := NewControllerRevision(ss1, parentKind, ss1.Spec.Template.Labels, rawTemplate(&ss1.Spec.Template), 3, ss1.Status.CollisionCount)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TestSortControllerRevisions was failing because of the incorrect version number (2 was being used for ss1Rev3).

@enisoc
Copy link
Member

enisoc commented Feb 19, 2018

/ok-to-test

@k8s-ci-robot k8s-ci-robot removed the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Feb 19, 2018
if err != nil {
return nil, err
}
labelMap := podLabels
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to make a copy here; otherwise, podLabels gets mutated when cr.Labels is mutated in L91.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And this means template labels (of the given StatefulSet) gets mutated.

@@ -56,20 +56,17 @@ func ControllerRevisionName(prefix string, hash uint32) string {
}

// NewControllerRevision returns a ControllerRevision with a ControllerRef pointing to parent and indicating that
// parent is of parentKind. The ControllerRevision has labels matching selector, contains Data equal to data, and
// parent is of parentKind. The ControllerRevision has labels matching pod, contains Data equal to data, and
Copy link
Member

@janetkuo janetkuo Feb 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CR's labels is a copy of template labels; "matching" is between selector and labels.

// has a Revision equal to revision. The collisionCount is used when creating the name of the ControllerRevision
// so the name is likely unique. If the returned error is nil, the returned ControllerRevision is valid. If the
// returned error is not nil, the returned ControllerRevision is invalid for use.
func NewControllerRevision(parent metav1.Object,
parentKind schema.GroupVersionKind,
selector labels.Selector,
podLabels map[string]string,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Name it templateLabels

cr, err := history.NewControllerRevision(set,
controllerKind,
selector,
podLabels,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

templateLabels

t.Fatal(err)
}
ss1Rev1, err := NewControllerRevision(ss1, parentKind, sel1, rawTemplate(&ss1.Spec.Template), 1, nil)
sel1 := labels.Set(ss1.Spec.Template.Labels).AsSelector()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think sel1 should remain what it was (pulling the actual selector from the StatefulSet). This selector ends up being used to list ControllerRevisions. If the selector has MatchExpressions, it's important that sel1 includes those as well.

Copy link
Contributor Author

@ayushpateria ayushpateria Feb 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The selector selects ControllerRevisions on the basis of labels, earlier because the labels of controller revisions and selectors of StatefulSet were same, we could use StatefulSet selector. But now the labels of CRs are the template labels, so I changed the CR's selector to that.
Every StatefulSet selector should have a matching pod label (which is also CR's label), so I think the previous version would work as well, but it's not consistent with labels logic.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's always safe to use the StatefulSet's selector, because we ensure through validation that it selects the pod template labels. Since we now apply the pod template labels to revisions too, we can also expect that the StatefulSet's selector will match those revisions.

On the other hand, generating a selector purely from the template labels will not always give the right result. Here is a contrived example of that:

...
metadata:
  name: ss1
spec:
  selector:
    matchLabels:
      component: mything
    matchExpressions:
    - {key: role, operator: DoesNotExist}
  template:
    metadata:
      labels:
        component: mything
...
---
...
metadata:
  name: ss2
spec:
  selector:
    matchLabels:
      component: mything
      role: test
  template:
    metadata:
      labels:
        component: mything
        role: test

If you generate a selector directly from the pod template labels, ss1 would end up also selecting ss2's revisions, whereas ss1's actual selector means that it will avoid selecting ss2's pods. We want the behavior for selecting pods and revisions to be consistent with each other, so we respect the user's intention.

I called this example contrived because it's probably a bad idea for the user to do this. However, the API allows it, so we need to make sure we do the right thing.

if err != nil {
return nil, err
}
labelMap := podLabels
Copy link
Member

@enisoc enisoc Feb 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This variable isn't needed anymore since there's no conversion happening. We can just directly use the input parameter as Labels: podLabels.

Nevermind, see comment above by @janetkuo.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

podLabels is not being used anywhere later, I agree with what you struck out. Am I missing something?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@janetkuo pointed out in the other thread that we mutate this map below on this line:

cr.Labels[ControllerRevisionHashLabel] = strconv.FormatInt(int64(hash), 10)

Since map types are references, this means we're mutating the map that the caller passed in to us. So it turns out we do need to make our own labelMap var after all, to hold a deep copy of podLabels. Unless you can find a utility helper for this, you'll need to just allocate a new map and then range loop over the input map to copy each entry.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I was thinking that even if it gets mutated it's not affecting anything. But yeah, we should not mutate anything that caller passes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's not affecting anything

It actually mutates template labels (set.Spec.Template.Labels) the caller passes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Thanks.

t.Fatal(err)
}
ss1Rev1, err := NewControllerRevision(ss1, parentKind, sel1, rawTemplate(&ss1.Spec.Template), 1, nil)
sel1 := labels.Set(ss1.Spec.Template.Labels).AsSelector()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

@@ -1695,6 +1604,13 @@ func newStatefulSet(replicas int, name string, uid types.UID, labels map[string]
Spec: apps.StatefulSetSpec{
Selector: &metav1.LabelSelector{
MatchLabels: labels,
MatchExpressions: []metav1.LabelSelectorRequirement{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this change to the test would have prevented the exact manifestation of the bug we hit (converting a new-style, set-based selector into a string and then trying to parse it as an old-style, map-only selector). However, there are other possible manifestations of the general problem (forgetting about set-based selectors) that this test won't prevent in the future. For example, if we regress to pulling the labels from MatchLabels without converting via string format/parse, this test won't catch that.

I think we'll catch more potential regressions if we include test cases with no MatchLabels at all, so we're sure it will break if any link in the chain ignores the set-based MatchExpressions. We could do that by converting all entries in labels into MatchExpressions (instead of just the first pair), and leaving MatchLabels empty.

Also, we should leave a comment here saying that we are intentionally using a selector with empty MatchLabels to make sure MatchExpressions works as expected, to reduce the chance that someone will weaken our test in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@enisoc Updated! :)

cr, err := history.NewControllerRevision(set,
controllerKind,
selector,
podLabels,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can directly put set.Spec.Template.Labels here.

@kow3ns
Copy link
Member

kow3ns commented Feb 26, 2018

/approve

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 26, 2018
@janetkuo
Copy link
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 26, 2018
Copy link
Member

@enisoc enisoc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ayushpateria, enisoc, janetkuo, kow3ns

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kow3ns kow3ns added this to the v1.10 milestone Feb 26, 2018
@kow3ns kow3ns added approved-for-milestone area/stateful-apps kind/bug Categorizes issue or PR as related to a bug. labels Feb 26, 2018
@kow3ns kow3ns added this to Done in Workloads Feb 26, 2018
@kow3ns kow3ns moved this from Done to In Progress in Workloads Feb 26, 2018
@enisoc
Copy link
Member

enisoc commented Feb 26, 2018

The verify failure looks like #60447. Looks like e2e-gce didn't even run.

/test pull-kubernetes-e2e-gce

@BenTheElder
Copy link
Member

fix PR for verify merged
/test pull-kubernetes-verify

@k8s-github-robot
Copy link

/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-github-robot
Copy link

Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here.

@k8s-github-robot k8s-github-robot merged commit 249ecab into kubernetes:master Feb 27, 2018
Workloads automation moved this from In Progress to Done Feb 27, 2018
cr, err := history.NewControllerRevision(set,
controllerKind,
selector,
set.Spec.Template.Labels,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if the selector doesn't match these labels?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Validation prevents a StatefulSet with a selector that doesn't match its own template labels from being created. apps/v1 StatefulSet's selector cannot be changed after creation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/stateful-apps cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/apps Categorizes an issue or PR as relevant to SIG Apps. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
Workloads
  
Done
Development

Successfully merging this pull request may close these issues.

set-based selectors are broken in statefulsets
10 participants