Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Phase out TaskRun's .status.podName #1689

Closed
imjasonh opened this issue Dec 4, 2019 · 24 comments
Closed

Proposal: Phase out TaskRun's .status.podName #1689

imjasonh opened this issue Dec 4, 2019 · 24 comments
Labels
area/api Indicates an issue or PR that deals with the API. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness.

Comments

@imjasonh
Copy link
Member

imjasonh commented Dec 4, 2019

Background

Today, a TaskRun identifies the name of its underlying Pod primarily using .status.podName field. This is used in the TaskRun controller to determine whether a Pod has already been created for a given TaskRun, and by the CLI (and probably the dashboard) to get logs from the Pod.

The Pod that gets created also gets an OwnerReference pointing to the TaskRun that owns it (that it was created from). Owner References are largely used to identify resources that can be garbage collected (e.g., when a TaskRun is deleted, also delete its Pod).

Proposal

Instead of storing the TaskRun<->Pod relationship on both objects, only store the OwnerReference on the Pod. In the TaskRun controller, we can use the PodsLister to list Pods owned by this TaskRun (of which there should be at most one). The CLI and dashboard can also perform this lookup when printing logs or other Pod details (resource utilization, etc.).

One benefit of this is that it removes a Kubernetesism from the Tekton API -- TaskRuns might not always be executed using Pods, and encouraging a reliance on .status.podName will make that change harder to adapt to in the future.

Other Kubernetesisms (e.g., Step States, embedding the corev1.Container type) can be phased out alongside .status.podName, to make the Tekton API more execution-platform-agnostic.

@bobcatfish
Copy link
Collaborator

I like the idea of phasing out kubernetesisms but also I use the pod name super frequently when trying to debug TasksRuns, is there a shorthand I can use to get the pod from the TaskRun if we remove this?

@imjasonh
Copy link
Member Author

imjasonh commented Dec 4, 2019

If tkn learns to use the OwnerReference instead, tkn tr logs should just keep working. Maybe tkn tr pod is in order?

There's probably some gross bash one-liner that'd work instead of the gross bash one-liner I use today (kubectl logs $(kubectl get taskrun <foo> -ojsonpath={.status.podName}))

@vdemeester
Copy link
Member

/area api
/kind feature
/kind api-change

@tekton-robot tekton-robot added area/api Indicates an issue or PR that deals with the API. kind/feature Categorizes issue or PR as related to a new feature. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API labels Dec 5, 2019
@dibyom dibyom added priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. and removed priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. labels Mar 12, 2020
@ghost
Copy link

ghost commented Apr 24, 2020

Was this fixed by #1709? Is it safe to close?

@bobcatfish
Copy link
Collaborator

I think so! I know I've stopped using relying on it at least XD

@vdemeester
Copy link
Member

I am pretty sure it's not, following #1709 we had #1971 and #1975

/cc @imjasonh because I don't think we re-re-verted the change ?

Also, the field is still their and maybe relied on (I know OpenShift Devconsole still relies on it for now).

@vdemeester vdemeester reopened this Apr 24, 2020
@AlanGreene
Copy link
Member

We're also using it in dashboard to construct the API for logs. If there's another approach we can take to do this (ideally without having to list pods to find a match) we could switch.

@vdemeester
Copy link
Member

We're also using it in dashboard to construct the API for logs. If there's another approach we can take to do this (ideally without having to list pods to find a match) we could switch.

Using tekton.dev/* labels on pods 👼

@AlanGreene
Copy link
Member

Using tekton.dev/* labels on pods 👼

That's exactly what I was hoping to avoid since it's another resource type we need to request and process rather than just using a field that's already available to us in the TaskRun. We can make it work though if the podName field is going away.

@imjasonh
Copy link
Member Author

It's not going away if people strongly depend on it. 😅

I guess this was more of a discussion/what-if, so if there's enough pushback I'm happy to reconsider.

@vdemeester
Copy link
Member

It's not going away if people strongly depend on it. sweat_smile

I guess this was more of a discussion/what-if, so if there's enough pushback I'm happy to reconsider.

I am not pushing back 😝 I was just saying that it's still there an some tool are still using it. We can still remove it (I wouldn't mind tbh), but we will need to re-follow the deprecation policy (and hopefully make sure we don't break ourselves 😝 )

@afrittoli
Copy link
Member

Along with this, I think we should add a new condition to the TaskRun - i.e. Running - so that at reconcile level we can know that we should pull updates from the underlying resource (i.e. pod).

@josephlewis42
Copy link
Contributor

I think it's worth keeping the podName on the status for two reasons:

  • The performance penalty of removing it and expecting a client to reach to the underlying resource to get logs is high. In my use-case this could require listing hundreds or thousands of Pods. This is especially true if the other K8s-isms are going to be removed in which case debugging will mean going to the Pod to investigate.
  • As mentioned in the proposal, OwnerReferences are largely used for GC. It's an assumption that there will never be more than one Pod that matches any criteria you put out. It's handy to be able to attach a resource to another one knowing it'll be GC'd when the owner is deleted.

If the Tekton API needs to be portable, you could use a TypedLocalObjectReference instead. That would allow the specificity of a pointer without tying it to any particular runtime.

@dibyom dibyom added this to the Pipelines v0.14 milestone May 18, 2020
@afrittoli afrittoli removed this from the Pipelines v0.14 milestone Jun 15, 2020
@bobcatfish
Copy link
Collaborator

@afrittoli mentioned custom tasks in this context as well (#215) - if we want to customtaskruns to have a similar structure to taskruns, maybe we dont want podName in the status?

@afrittoli
Copy link
Member

I will remove the label for now, until we have an agreement on it.
@imjasonh perhaps would you like to bring it back to the API WG for discussion?

@afrittoli afrittoli removed the priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. label Jun 15, 2020
@tekton-robot
Copy link
Collaborator

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

/close

Send feedback to tektoncd/plumbing.

@tekton-robot
Copy link
Collaborator

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.

/lifecycle rotten

Send feedback to tektoncd/plumbing.

@tekton-robot tekton-robot added the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Aug 14, 2020
@tekton-robot
Copy link
Collaborator

@tekton-robot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

/close

Send feedback to tektoncd/plumbing.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@vdemeester
Copy link
Member

/remove-lifecycle rotten
/remove-lifecycle stale
/reopen

@tekton-robot tekton-robot reopened this Aug 17, 2020
@tekton-robot
Copy link
Collaborator

@vdemeester: Reopened this issue.

In response to this:

/remove-lifecycle rotten
/remove-lifecycle stale
/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@tekton-robot tekton-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Aug 17, 2020
@tekton-robot
Copy link
Collaborator

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.

/lifecycle stale

Send feedback to tektoncd/plumbing.

@tekton-robot tekton-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 15, 2020
@vdemeester
Copy link
Member

/remove-lifecycle stale
/lifecycle frozen

@tekton-robot tekton-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Nov 16, 2020
@lbernick
Copy link
Member

Discussed in today's API WG: having the pod name in the TaskRun status is useful for retrieving logs and for dashboard/cli. Longer term better solution is to leverage the results API, to provide a way for people to retrieve statuses and logs of TaskRuns. Once we have something like this in place, we will revisit removing this field from the TaskRun API.

/close

@tekton-robot
Copy link
Collaborator

@lbernick: Closing this issue.

In response to this:

Discussed in today's API WG: having the pod name in the TaskRun status is useful for retrieving logs and for dashboard/cli. Longer term better solution is to leverage the results API, to provide a way for people to retrieve statuses and logs of TaskRuns. Once we have something like this in place, we will revisit removing this field from the TaskRun API.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/api Indicates an issue or PR that deals with the API. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness.
Projects
Status: Done
Development

No branches or pull requests

9 participants