Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Execution retry and timeout of lifecycle hooks #46426

Closed

Conversation

PengTaoWW
Copy link
Contributor

@PengTaoWW PengTaoWW commented May 25, 2017

What this PR does / why we need it:
I have an improvement on Kubernetes, in our production environment, run perstop hook when removing pod, but no matter how quickly the result of this pod is removed, my improvement is that when Perstop fails, it will retry until the execution s
ucceeds, or the timeout is removed. I have implemented a version of this feature myself. if there's a problem with my improvement, please let me know, thank you.
and this time i clear code .just my patch not auto-generated code
Which issue this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close that issue when PR gets merged): fixes #

Special notes for your reviewer:
please help me ,this is my first submit code .
Release note:

containers:
  - name: lifecycle
    image: busybox
    lifecycle:
      postStart:
        exec:
          command:
            - "touch"
            - "/var/log/lifecycle/post-start"
      preStop:
        httpGet:
          path: "/abort"
          port: 8080
        isRetryUntilSuccess: true
 This PR adds a switch to control whether Prestop runs multiple times until it succeeds, or the timeout exits
The handler adds a bool value switch, which is similar to the Exec, HTTP, and Tcpsocket. But after these three switches,

@k8s-github-robot
Copy link

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: PengTaoWW
We suggest the following additional approver: brendandburns

Assign the PR to them by writing /assign @brendandburns in a comment when ready.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-github-robot k8s-github-robot added kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels May 25, 2017
@k8s-ci-robot
Copy link
Contributor

Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please follow instructions at https://github.com/kubernetes/kubernetes/wiki/CLA-FAQ to sign the CLA.

It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. label May 25, 2017
@k8s-ci-robot
Copy link
Contributor

Hi @PengTaoWW. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with @k8s-bot ok to test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label May 25, 2017
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels May 25, 2017
@PengTaoWW
Copy link
Contributor Author

/assign @brendandburns

@PengTaoWW
Copy link
Contributor Author

@timstclair @brendandburns Is there anything else I need to solve?

},
"retryUntilSuccess": {
"$ref": "v1.Handler",
"description": "PreStop is called immediately before a container is terminated. The container is terminated after the handler completes. The reason for termination is passed to the handler. Regardless of the outcome of the handler, the container is eventually terminated. Other management of the container blocks until the hook completes. More info: http://kubernetes.io/docs/user-guide/container-environment#hook-details"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/PreStop/RetryUntilSuccess

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my bad ,i will fixed this

defer close(done)
defer utilruntime.HandleCrash()
if msg, err := m.runner.Run(containerID, pod, containerSpec, containerSpec.Lifecycle.RetryUntilSuccess); err != nil {
glog.Errorf("preStop hook for container %q failed: %v", containerSpec.Name, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/preStop/retryUntilSuccess

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i wil fixed this

}

}
}()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the hook never succeeds? The container will get stuck, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

after some times, code line 505 will run .not get stuck

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code line 505 will run , never get stuck

}()

select {
case <-time.After(time.Duration(gracePeriod) * time.Second):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xiangpengzhao this line code will run in after long time . not stuck

}

}
}()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

after some times, code line 505 will run .not get stuck

defer close(done)
defer utilruntime.HandleCrash()
if msg, err := m.runner.Run(containerID, pod, containerSpec, containerSpec.Lifecycle.RetryUntilSuccess); err != nil {
glog.Errorf("preStop hook for container %q failed: %v", containerSpec.Name, err)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i wil fixed this

@brendandburns
Copy link
Contributor

Hrm,
Rather than adding a brand new lifecycle event, why not add a boolean to enable this behavior for the existing pre-stop handler. (default false) That seems much cleaner.

@brendandburns
Copy link
Contributor

(oops, didn't mean to close...)

@PengTaoWW
Copy link
Contributor Author

PengTaoWW commented May 28, 2017

@brendandburns So you mean

containers:
  - name: lifecycle
    image: busybox
    lifecycle:
      postStart:
        exec:
          command:
            - "touch"
            - "/var/log/lifecycle/post-start"
      preStop:
        isRetryUntilSuccess: True
        httpGet:
          path: "/abort"
          port: 8080

like this ?

@PengTaoWW
Copy link
Contributor Author

@brendandburns i write a version code for add a boolean to enable this behavior for the existing pre-stop handler.
@xiangpengzhao old version code will never stop ,i fix it. and this version can stop.thank you for your help

@PengTaoWW
Copy link
Contributor Author

@PengTaoWW
Copy link
Contributor Author

@timstclair @brendandburns @xiangpengzhao Is this version all right?

@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 18, 2017
@k8s-github-robot k8s-github-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Jul 21, 2017
@k8s-github-robot k8s-github-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jul 24, 2017
@PengTaoWW
Copy link
Contributor Author

@timstclair @xiangpengzhao @brendandburns sorry about I lost the version of my local clean code, and now I have only the version after the code is generated. What am I going to do?

@PengTaoWW
Copy link
Contributor Author

@timstclair @xiangpengzhao @brendandburns sorry about I lost the version of my local clean code, and now I have only the version after the code is generated. What am I going to do?

1 similar comment
@PengTaoWW
Copy link
Contributor Author

@timstclair @xiangpengzhao @brendandburns sorry about I lost the version of my local clean code, and now I have only the version after the code is generated. What am I going to do?

@k8s-github-robot k8s-github-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 29, 2017
@PengTaoWW PengTaoWW force-pushed the retryUntilSuccessHookClean branch 2 times, most recently from 5cb7fa1 to e02fc71 Compare July 30, 2017 01:47
@PengTaoWW
Copy link
Contributor Author

/test pull-kubernetes-e2e-kops-aws
/test pull-kubernetes-verify
/test pull-kubernetes-unit

@PengTaoWW PengTaoWW force-pushed the retryUntilSuccessHookClean branch 2 times, most recently from 866acae to c35be2d Compare July 30, 2017 14:54
@k8s-ci-robot
Copy link
Contributor

k8s-ci-robot commented Jul 30, 2017

@PengTaoWW: The following tests failed, say /retest to rerun them all:

Test name Commit Details Rerun command
pull-kubernetes-e2e-kops-aws 0eb0bb4 link /test pull-kubernetes-e2e-kops-aws
pull-kubernetes-verify 0eb0bb4 link /test pull-kubernetes-verify

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-github-robot
Copy link

@PengTaoWW PR needs rebase

@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 1, 2017
@k8s-github-robot k8s-github-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Aug 23, 2017
@k8s-github-robot
Copy link

This PR hasn't been active in 90 days. Closing this PR. Please reopen if you would like to work towards merging this change, if/when the PR is ready for the next round of review.

cc @PengTaoWW @brendandburns @timstclair

You can add 'keep-open' label to prevent this from happening again, or add a comment to keep it open another 90 days

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants