Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] fix probe-level termination grace period bug and add test cases #827

Conversation

raisaat
Copy link

@raisaat raisaat commented Jun 23, 2021

/kind feature

@openshift-ci-robot openshift-ci-robot added the backports/unvalidated-commits Indicates that not all commits come to merged upstream PRs. label Jun 23, 2021
@openshift-ci openshift-ci bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/feature Categorizes issue or PR as related to a new feature. labels Jun 23, 2021
@openshift-ci-robot
Copy link

@raisaat: the contents of this pull request could not be automatically validated.

The following commits could not be validated and must be approved by a top-level approver:

@openshift-ci
Copy link

openshift-ci bot commented Jun 23, 2021

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: raisaat
To complete the pull request process, please assign marun after the PR has been reviewed.
You can assign the PR to them by writing /assign @marun in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot requested review from mfojtik and soltysh June 23, 2021 22:53
Copy link

@ehashman ehashman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be opened against kubernetes/kubernetes, but I put a preliminary review here while we figure out the patch rebase.

// new pod should not have grace period if the feature gate is disabled
t.Errorf("new pod should not have grace period if the feature is disabled")

case reflect.DeepEqual(newPod, newPodInfo.pod()):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this makes sense as a case.

Let's write down our behaviour matrix and then figure out the test cases.

  • Feature flag is on:
    • Pod has grace period: unchanged
    • Pod does not have grace period: unchanged
  • Feature flag is off:
    • Pod has grace period: should be dropped
    • Pod does not have grace period: unchanged

Thus, I'd expect our cases to be:

  1. if enabled || !oldHasGracePeriod: should be unchanged
  2. Now we know it's disabled. if oldHasGracePeriod: should be dropped
  3. We must also check, if none of the above are triggered, if !enabled && newHasGracePeriod, as we'd always want to error in that case, since the field should have been blanked out.

Copy link
Author

@raisaat raisaat Jun 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think https://github.com/raisaat/kubernetes/blob/feature-probe-termination-grace-period-issue2238/pkg/api/pod/util_test.go#L1225 takes care of case 3, since we know that if case 1 fails, !enabled is true so !enabled && newPodHasGracePeriod can be written as just newPodHasGracePeriod.

I was thinking case 3 takes care of case 2, since case 3 makes sure that the new state of the old pod has the grace period blanked out (i.e. newPodHasGracePeriod is false) when the feature is disabled regardless of whether the old pod had a grace period or not. If this is not the case, which variable would check whether the grace period field has been dropped for case 2?


// old pod should never be changed
if !reflect.DeepEqual(oldPod, oldPodInfo.pod()) {
t.Errorf("old pod changed: %v", cmp.Diff(oldPod, oldPodInfo.pod()))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why change the diff function?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe I left this part unchanged. It was there already: https://github.com/kubernetes/kubernetes/blob/master/pkg/api/pod/util_test.go#L1208

// new pod should be changed
if reflect.DeepEqual(newPod, newPodInfo.pod()) {
t.Errorf("new pod was not changed")
defer featuregatetesting.SetFeatureGateDuringTest(t, utilfeature.DefaultFeatureGate, features.ProbeTerminationGracePeriod, enabled)()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great catch.

@openshift-ci
Copy link

openshift-ci bot commented Jun 24, 2021

@raisaat: The following tests failed, say /retest to rerun all failed tests:

Test name Commit Details Rerun command
ci/prow/verify-commits f330f5e link /test verify-commits
ci/prow/unit f330f5e link /test unit
ci/prow/k8s-e2e-gcp f330f5e link /test k8s-e2e-gcp
ci/prow/e2e-gcp f330f5e link /test e2e-gcp
ci/prow/e2e-gcp-upgrade f330f5e link /test e2e-gcp-upgrade

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-bot
Copy link

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 23, 2021
@openshift-ci
Copy link

openshift-ci bot commented Sep 23, 2021

@raisaat: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 23, 2021
@ehashman
Copy link

/close

@openshift-ci
Copy link

openshift-ci bot commented Sep 23, 2021

@ehashman: Closed this PR.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot closed this Sep 23, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backports/unvalidated-commits Indicates that not all commits come to merged upstream PRs. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/feature Categorizes issue or PR as related to a new feature. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants