Create 0013-graceful-sidecar-termination.md#191
Create 0013-graceful-sidecar-termination.md#191shahnewazrifat wants to merge 3 commits intotektoncd:masterfrom
Conversation
TEP for introducing the control to enable graceful sidecar termination (skipping Nop Image Swap)
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
| ## Summary | ||
|
|
||
| Currently, the sidecar feature in a tekton pipeline/taskrun is not behaving like a typical kubernetes sidecar should. | ||
| The sidecar gets terminated immediately when the pipeline fails or completes without giving the sidecar enough time to |
There was a problem hiding this comment.
nit: pipeline fails or completes -> taskrun fails or completes
|
|
||
| ### User Stories | ||
|
|
||
| A `Pipeline` author can add a Sidecar with `waitForSidecar` enabled (true) and expect `natural death` of his/her sidecar. |
There was a problem hiding this comment.
This is a design detail I think. User Stories might be something like:
As a Task author I want to add a sidecar that uploads the results of tests to external storage after the tests complete (even if they fail) so that I can review the results later.
As a Task author I want to add a sidecar that builds a report from the outcomes of some Steps and submits that report when the Task has reached a completed state so that the reports can be aggregated by an external system.
There was a problem hiding this comment.
Do you plan to update the User Stories with items like that?
|
|
||
| A `Pipeline` author can add a Sidecar with `waitForSidecar` enabled (true) and expect `natural death` of his/her sidecar. | ||
|
|
||
| ## Design Details |
There was a problem hiding this comment.
It would be good to include a description of how a running sidecar would be treated if a TaskRun timeout is reached.
There was a problem hiding this comment.
Current implementation doesn't have it, but I believe, in such cases it should be force terminated.
What does Tekton do for other containers at this point?
There was a problem hiding this comment.
https://github.com/tektoncd/pipeline/blob/master/docs/taskruns.md#configuring-the-failure-timeout
The TaskRun's Pod is deleted if a timeout is hit. So I think the only thing to do here would be to add a line that describes this behaviour remaining the same even if the user has waitForSidecar: true. Something like "If a TaskRun's timeout is reached then Sidecars are terminated regardless of the state of this flag."
ghost
left a comment
There was a problem hiding this comment.
I've left a bunch of comments. I'm generally in favor of LGTMing this TEP for the feature but don't totally agree with the motivations behind it. Would also like to see the formatting cleaned up to match others and a couple of design issues addressed. Also very curious to hear from other @tektoncd/core-maintainers !
| Having no way to alter the behavior was the second motivation. | ||
|
|
||
| Consider this task: | ||
| Task: |
| do what it originally intended to do (for example, upload logs, process artifacts etc.) | ||
| In some cases the sidecar should be able to choose when to terminate itself. A sidecar is just a kubernetes container. | ||
| It's able to terminate itself with the command completion. | ||
|
|
|
|
||
| ## Summary | ||
|
|
||
| Currently, the sidecar feature in a tekton taskrun is not behaving like a typical kubernetes sidecar should. |
There was a problem hiding this comment.
I have to confess I'm less swayed by discussion revolving around what "typical" Kubernetes patterns are than I am useful features for Tekton users. Tekton already takes a very opinionated and non-typical approach to all of the containers inside the Pods it creates. As just one example of this opinionated approach: the containers for Steps are forced to execute sequentially, which is very non-typical by k8s standards. Any broad statement like Tekton should adhere to typical Kubernetes workload semantics don't, in and of themselves, carry a ton of water when we're discussing a Tekton-focused feature (at least in my view, which probably isn't shared by everyone).
Having said that I think there are some very strong use cases supported by this TEP. I'd much prefer to see this doc focused on those usage-driven improvements.
If we are going to include an appeal to authority here and say that Tekton should support this feature simply because there's an existing precedent in Kubernetes then I'd like to see reference to a doc that outlines what the typical Kubernetes sidecar behavior is. A possible point of reference might be the seemingly-stalled Sidecars lifecycle KEP: https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/0753-sidecarcontainers.md which outlines what Sidecars might eventually become if they're ever officially adopted by Kubernetes.
There was a problem hiding this comment.
I'm with @sbwsg here, I think this sentence could go, there is not should or should not here, only "ways" to do things
There was a problem hiding this comment.
Yes, I think I was driven by the facts that came to my mind rather than thinking it from Tekton perspective 😄
Thank you guys!
Shall amend 🙇
|
|
||
| ### User Stories | ||
|
|
||
| A `Pipeline` author can add a Sidecar with `waitForSidecar` enabled (true) and expect `natural death` of his/her sidecar. |
There was a problem hiding this comment.
Do you plan to update the User Stories with items like that?
| ## Motivation | ||
|
|
||
| The first motivation is the counter intuitive nature of the sidecars. As a developer, I did not expect my sidecar to be | ||
| terminated in such way. I had to check the docs to find out that the sidecar was being terminated using a Nop Image. |
There was a problem hiding this comment.
The behaviour is documented and has been how sidecars work since they were added to Tekton. I can understand this being surprising but similarly to my comment above about "typical" k8s behaviours I don't totally agree that this should be the primary motivation for the feature. Ideally in my view we should frame this TEP primarily around usefulness to Tekton's users.
There was a problem hiding this comment.
Agreed with @sbwsg , sidecars are what they are now. What we should put in motivation here is what is the benefit for users of having a way to allow sidecar to gracefully terminate. The fact that the current behaviour of sidecar is confusing or not shouldn't be a motivation on its own 😝
There was a problem hiding this comment.
Thanks @sbwsg and @vdemeester ! This TEP is really helping me learn how to do this! I was mixing up a proposal with implementation details. I'll rethink the whole thing and try to make it better 😄
|
|
||
| **Pros/Cons:** | ||
| Pros: | ||
| Discussed above. |
There was a problem hiding this comment.
It would be great to follow the suggested format of the TEP, remove mention of Pros/Cons and just use the Advantages and Drawbacks sections.
| Discussed above. | ||
|
|
||
| Cons: | ||
| Can't think of any. Unless the pipeline author adds a sidecar with this flag enabled and the sidecars' cmd is an infinite one. Causing the pipeline to hang |
There was a problem hiding this comment.
I think this is a good drawback to call out. Suggest formatting it something like:
## Drawbacks
* When the waitForSidecar flag is used and a timeout of 0 is specified for the TaskRun, a Sidecar that fails to terminate could result in a TaskRun or PipelineRun executing forever until a user manually intervenes to stop the Pod.
| forever. | ||
|
|
||
| ## Alternatives | ||
| Add WaitForSidecar flag to TaskRunSpec |
There was a problem hiding this comment.
Suggest formatting this a bit differently just to make it a bit clearer on what's being proposed and rejected. Something like:
Alternatives
Add Sidecar stopping configuration to TaskRuns instead of Tasks
Add configuration to TaskRuns that allows per-TaskRun configuration of the sidecar stopping behavior. See example YAML below. This alternative is inferior to the proposal above because it does not allow per-Sidecar configuration of the stopping behavior. It also puts the decision to stop Sidecars into the hands of a Task's user instead of the Task's author.
apiVersion: tekton.dev/v1beta1 kind: TaskRun spec: # note: setting the waitForSidecar flag on the TaskRun instead of Task waitForSidecar: true taskRef: kind: Task name: task-test-gitops-0800ac6-c7298ca-28
|
|
||
| ## References | ||
|
|
||
| * https://github.com/tektoncd/pipeline/issues/1131 |
There was a problem hiding this comment.
Consider adding a link to your design doc as well for more context.
| name: e2e-report | ||
| ``` | ||
|
|
||
| Golang Example: |
There was a problem hiding this comment.
I'm not sure this adds much that isn't already covered by the yaml example. Is there anything specific here you're trying to demonstrate wrt the proposed behavior or syntax?
There was a problem hiding this comment.
Not really. It's just there as an example.
|
|
||
| ## Summary | ||
|
|
||
| Currently, the sidecar feature in a tekton taskrun is not behaving like a typical kubernetes sidecar should. |
There was a problem hiding this comment.
I'm with @sbwsg here, I think this sentence could go, there is not should or should not here, only "ways" to do things
| It's able to terminate itself with the command completion. | ||
|
|
||
|
|
||
| This proposal is introducing a flag `waitForSidecar` which is a per-sidecar configuration that will allow users to set |
There was a problem hiding this comment.
We shouldn't name a flag or field here.
This proposal is introducing a way to allow users to set, per-sidecar, the nature of termination of a sidecar inside a task should be enough. A flag or field is implementation detail, that can be tackle below 😉
| ## Motivation | ||
|
|
||
| The first motivation is the counter intuitive nature of the sidecars. As a developer, I did not expect my sidecar to be | ||
| terminated in such way. I had to check the docs to find out that the sidecar was being terminated using a Nop Image. |
There was a problem hiding this comment.
Agreed with @sbwsg , sidecars are what they are now. What we should put in motivation here is what is the benefit for users of having a way to allow sidecar to gracefully terminate. The fact that the current behaviour of sidecar is confusing or not shouldn't be a motivation on its own 😝
| The first motivation is the counter intuitive nature of the sidecars. As a developer, I did not expect my sidecar to be | ||
| terminated in such way. I had to check the docs to find out that the sidecar was being terminated using a Nop Image. | ||
|
|
||
| Having no way to alter the behavior was the second motivation. |
There was a problem hiding this comment.
This could turn into positive (aka, Having a way to alter …)
| ### Non-Goals | ||
| This proposal is not introducing any breaking change to current behavior. Sidecars still terminate using Nop Image by default when the pipeline is completed/failed. |
There was a problem hiding this comment.
This is not really a non-goal. A non-goal could be "this proposal doesn't aim to replace the current sidecar behavior" or something like that.
|
@shahnewazrifat just checking in here - are you still working on this TEP? Cheers! |
|
I'm going to close this for now. We can reopen if/when work on this TEP resumes! |
|
/close |
|
@sbwsg: Closed this PR. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |

TEP for introducing the control to enable graceful sidecar termination (skipping Nop Image Swap)