Experimental Custom Task Proposal: Pipeline to TaskRun #447

Closed
bobcatfish opened this issue Jun 3, 2021 · 12 comments
Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@bobcatfish
Contributor

bobcatfish commented Jun 3, 2021

This is a proposal to create a new experimental custom task that will allow people to define a pipeline, but run it in one pod, by transforming the pipeline into a taskrun with an embedded task spec.

The proposed project would be available at tektoncd/experimental/pipeline-to-taskrun (or alternatively tektoncd/experimental/pipeline-in-pod, but I think that sounds a bit more ambitious than it actually is XD).

Only a subset of pipeline functionality will be supported, but it should be enough to give folks a similar experience to what pipelineresources provide, specifically being able to have containers (defined via tasks instead of pipelineresources hardcoded into the pipeline controller) that run before and after the steps of another task.

The initial owners would be: @bobcatfish @wlynch @jerop (lemme know if you'd like to join the owners!!)

Goals

The goals of this experimental custom task are:

  • To get feedback from users on whether or not this is a good direction for solving the problems in TEP-0044 (and/or the former TEP-0046 about colocation of tasks and workspaces)
  • To get feedback from users about whether or not this is enough functionality to pursue deprecating PipelineResources, and if not, what additional functionality they need (link to doc on the final chapter)
  • Or barring either of those, to see if there is desire for a feature like this in general, either as a top level pipelines feature or a standalone custom task

You can see a demo of the POC for this custom task in the API working group on May 3, 2021.

Supported pipeline features

Since this would be implemented by translating a Pipeline into a Task spec in a TaskRun, not all Pipeline features would be supported, and initially only a small subset would be supported.

The custom task would initially support:

  • Sequential tasks in a pipeline
  • exposing results at the pipeline level
  • params
  • workspaces
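
As a rough illustration of the translation described above, here is a minimal Python sketch that flattens a strictly sequential pipeline into one TaskRun with an embedded task spec. It is not the controller's code: the function name `pipeline_to_taskrun`, the convention of prefixing step and result names with the pipeline task name, and the example task names are all assumptions made for the sketch, and rewriting variable references inside steps is left out.

```python
from copy import deepcopy


def pipeline_to_taskrun(pipeline, tasks, run_name):
    """Flatten a strictly sequential Pipeline into one TaskRun with an embedded taskSpec.

    `pipeline` is a dict shaped like Tekton YAML and `tasks` maps a Task name
    to its spec. Hypothetical helper, for illustration only.
    """
    steps, results, workspaces = [], [], {}
    # Assumption: pipeline tasks are already listed in execution order.
    for ptask in pipeline["spec"]["tasks"]:
        spec = tasks[ptask["taskRef"]["name"]]
        for step in spec.get("steps", []):
            step = deepcopy(step)
            # Prefix with the pipeline task name so step names stay unique.
            step["name"] = f'{ptask["name"]}-{step["name"]}'
            steps.append(step)
        for result in spec.get("results", []):
            results.append({**result, "name": f'{ptask["name"]}-{result["name"]}'})
        for binding in ptask.get("workspaces", []):
            # Tasks bound to the same pipeline workspace share one declaration.
            workspaces[binding["workspace"]] = {"name": binding["workspace"]}
    return {
        "apiVersion": "tekton.dev/v1beta1",
        "kind": "TaskRun",
        "metadata": {"name": run_name},
        "spec": {
            "taskSpec": {
                "params": deepcopy(pipeline["spec"].get("params", [])),
                "steps": steps,
                "results": results,
                "workspaces": list(workspaces.values()),
            },
        },
    }


# Example data (made up): clone a repo, then test it, sharing one workspace.
example_pipeline = {"spec": {"tasks": [
    {"name": "clone", "taskRef": {"name": "git-clone"},
     "workspaces": [{"name": "output", "workspace": "src"}]},
    {"name": "test", "taskRef": {"name": "go-test"},
     "workspaces": [{"name": "source", "workspace": "src"}]},
]}}
example_tasks = {
    "git-clone": {"steps": [{"name": "clone", "image": "alpine/git"}],
                  "results": [{"name": "commit"}]},
    "go-test": {"steps": [{"name": "unit", "image": "golang"}]},
}
example_taskrun = pipeline_to_taskrun(example_pipeline, example_tasks, "demo")
```

In this sketch both tasks end up as steps of the same TaskRun (and therefore the same pod), and the `commit` result of the clone task is exposed as a result of the merged task.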

It may later support but would not initially support:

  • passing results between tasks
  • sidecars (if we support this, all would have to start up simultaneously which may not be the desired behavior)
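  • workspaces: combining mountPaths + volumeclaimtemplates (see tektoncd/pipeline#3440 - it is not possible to have two different workspace declarations in the taskspec which are mapped to one volumeClaimTemplate at runtime)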
  • these fields would be easy to support 1 of but not clear how to handle cases where more than one task declares them (since in the taskrun they would apply to the entire task):
    • step templates
    • timeout
    • retries
  • Specifying Pipelines via Bundles
  • limitranges: I think I was thinking of resource requests, and since those are per step they should just work
  • timeouts: there's no pipelinerun at play, so this would be up to the run to specify
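
To make the "easy to support 1 of" point concrete, here is a small Python sketch of one possible policy: accept a task-level field when at most one pipeline task declares it, and refuse to guess otherwise. The helper name `merge_task_level_field` and the flat dict shape of the tasks are assumptions for illustration, not part of the proposal.

```python
def merge_task_level_field(pipeline_tasks, field):
    """Return the single declared value of `field`, or None if no task sets it.

    Raises ValueError when more than one pipeline task declares the field,
    because the merged TaskRun could only carry one value for every step.
    Hypothetical helper, for illustration only.
    """
    declaring = [t for t in pipeline_tasks if field in t]
    if len(declaring) > 1:
        names = ", ".join(t["name"] for t in declaring)
        raise ValueError(f"{field} is declared by more than one task: {names}")
    return declaring[0][field] if declaring else None
```

For example, `merge_task_level_field([{"name": "clone"}, {"name": "build", "retries": 2}], "retries")` returns `2`, while two tasks that both set `timeout` raise an error rather than silently picking one.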

Since ultimately this is translating into a task, it would not support the following (unless we change the design substantially which could happen later, see potential progression below):

  • parallel tasks
  • pipelineresources
  • when expressions (and conditions)
  • custom tasks
  • finally tasks (would really like to figure out a way to make this work tho since not being able to do this kind of functionality with PipelineResources was always a big problem - maybe if we allow step failure (TEP-0040) we can use that to make finally steps work??)
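
One way to picture the unsupported list is as an up-front check that rejects a pipeline before any translation happens. The Python sketch below is only an assumption about how such a check might look: the function name `unsupported_features` is made up, it treats "sequential" as "each task lists exactly its predecessor in runAfter, in list order", and it guesses that a custom task can be recognised by a taskRef that sets an apiVersion.

```python
def unsupported_features(pipeline):
    """List reasons a Pipeline cannot run as a single TaskRun (empty if none).

    Hypothetical helper, for illustration only.
    """
    spec = pipeline["spec"]
    reasons = []
    if spec.get("finally"):
        reasons.append("finally tasks")
    if spec.get("resources"):
        reasons.append("pipelineresources")
    previous = None
    for ptask in spec.get("tasks", []):
        name = ptask["name"]
        if "when" in ptask or "conditions" in ptask:
            reasons.append(f"{name}: when expressions or conditions")
        # Assumption: a taskRef that sets apiVersion points at a custom task.
        if ptask.get("taskRef", {}).get("apiVersion"):
            reasons.append(f"{name}: custom task")
        # Sequential here means every task after the first runs after
        # exactly the task listed before it.
        if previous is not None and ptask.get("runAfter", []) != [previous]:
            reasons.append(f"{name}: not strictly sequential (parallel tasks)")
        previous = name
    return reasons
```

A pipeline whose second task has no `runAfter` would be reported as parallel under this check, since nothing orders it after the first task.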

What comes next?

I’d like to start with this custom task and then progress like so:

  1. Add this initial custom task with very few features
  2. Expand the custom task as described in the previous section (add more supported features, e.g. improving deduplication)
  3. Try this custom task out in dogfooding and gather feedback
  4. Assuming good feedback, propose promoting this to a top level Pipelines feature (at which point we’d have to decide on the syntax, and we'd have to decide on things like: how is this represented in the pipelinerun status)
  5. Add as a top level Pipelines feature (if folks agree)
  6. Decide whether to leave as-is (subset of functionality by translating into taskrun) or expand to translate directly into a pod, eventually expanding to support all pipeline features directly in a pod (could be a progression)
@bobcatfish
Contributor Author

PTAL @vdemeester @afrittoli @dibyom @abayer 🙏

@afrittoli
Member

This is a proposal to create a new experimental custom task that will allow people to define a pipeline, but run it in one pod, by transforming the pipeline into a taskrun with an embedded task spec.

I think it's great to experiment on this, thanks for driving this forward.

From a model point of view, I would prefer a solution where we maintain the existing relationships between Tekton resources (N Tasks -> 1 Pipeline -> 1 PipelineRun -> N TaskRun) and what changes is the ratio / relationship between those resources and the underlying k8s abstractions (Pods).

I'm not 100% sure how to implement that practically though, especially through custom tasks.
One idea could be to have a CustomTask that describes the chunk of pipeline that it implements (a subset of the pipeline).
The corresponding Run would manage the Pod directly and it would create one TaskRun for each Task. Those TaskRuns would read their status from the Pod but that might require changes to the TaskRun controller.
Eventually the group of tasks that can run in a single Pod would be managed by the PipelineRun controller.
For instance, when the pipeline run controller identifies a group of tasks that can co-exist in a Pod, it would create the Pod first and then the associated TaskRuns.

The proposed project would be available at tektoncd/experimental/pipeline-to-taskrun (or alternatively tektoncd/experimental/pipeline-in-pod, but I think that sounds a bit more ambitious than it actually is XD).

Only a subset of pipeline functionality will be supported, but it should be enough to give folks a similar experience to what pipelineresources provide, specifically being able to have containers (defined via tasks instead of pipelineresources hardcoded into the pipeline controller) that run before and after the steps of another task.

+1

I'm not sure we will / shall ever get to a point where any pipeline can run in a Pod.
I think we should focus on grouping chunks of Pipelines that meet special requirements and needs, and have the pipeline run controller decide to run those in a single pod. Ideally it would be mostly transparent to users.

The initial owners would be: @bobcatfish @wlynch @jerop (lemme know if you'd like to join the owners!!)

Goals

The goals of this experimental custom task are:

  • To get feedback from users on whether or not this is a good direction for solving the problems in TEP-0044 (and/or the former TEP-0046 about colocation of tasks and workspaces)
  • To get feedback from users about whether or not this is enough functionality to pursue deprecating PipelineResources, and if not, what additional functionality they need (link to doc on the final chapter)
  • Or barring either of those, to see if there is desire for a feature like this in general, either as a top level pipelines feature or a standalone custom task

You can see a demo of the POC for this custom task in the API working group on May 3, 2021.

Supported pipeline features

Since this would be implemented by translating a Pipeline into a Task spec in a TaskRun, not all Pipeline features would be supported, and initially only a small subset would be supported.

The custom task would initially support:

  • Sequential tasks in a pipeline
  • exposing results at the pipeline level
  • params
  • workspaces

It may later support but would not initially support:

  • passing results between tasks

  • sidecars (if we support this, all would have to start up simultaneously which may not be the desired behavior)

  • workspaces: combining mountPaths + volumeclaimtemplates (see tektoncd/pipeline#3440 - it is not possible to have two different workspace declarations in the taskspec which are mapped to one volumeClaimTemplate at runtime)

  • these fields would be easy to support 1 of but not clear how to handle cases where more than one task declares them (since in the taskrun they would apply to the entire task):

    • step templates
    • timeout
    • retries

Since ultimately this is translating into a task, it would not support the following (unless we change the design substantially which could happen later, see potential progression below):

  • parallel tasks
  • pipelineresources
  • when expressions (and conditions)

What comes next?

I’d like to start with this custom task and then progress like so:

  1. Add this initial custom task with very few features
  2. Expand the custom task as described in the previous section (add more supported features, e.g. improving deduplication)
  3. Try this custom task out in dogfooding and gather feedback
  4. Assuming good feedback, propose promoting this to a top level Pipelines feature (at which point we’d have to decide on the syntax)
  5. Add as a top level Pipelines feature (if folks agree)
  6. Decide whether to leave as-is (subset of functionality by translating into taskrun) or expand to translate directly into a pod, eventually expanding to support all pipeline features directly in a pod (could be a progression)

LGTM!!

@vdemeester
Member

LGTM 👍🏼

@dibyom
Member

dibyom commented Jun 4, 2021

LGTM!

@bobcatfish
Contributor Author

@afrittoli that's an interesting idea - I think it's definitely worth adding to our list of alternatives in TEP-0044 - I'll see if I can sketch out what it would look like to implement that and add it.

bobcatfish added a commit to bobcatfish/experimental that referenced this issue Jul 9, 2021
This commit adds a custom task that allows user to run simple sequential
Pipelines as one TaskRun - which means the Pipeline can refer to
multiple Tasks but run on only one pod.

It only supports a subset of Pipeline functionality (more detail on what
and why in the README) but is enough that folks can do a lot of what
they would have previously used PipelineResources for, e.g. doing a git
clone and then doing something with the data, in the same pod, and emit
results such as the exact commit sha used.

Next steps will be to expand the functionality supported, get feedback,
and if the feedback is good, promote this to a top level Pipeline API
feature.

Experimental project proposal: tektoncd/community#447
bobcatfish added a commit to bobcatfish/experimental that referenced this issue Jul 21, 2021
bobcatfish added a commit to bobcatfish/experimental that referenced this issue Jul 21, 2021
bobcatfish added a commit to bobcatfish/experimental that referenced this issue Aug 13, 2021
@mattmoor
Member

So I've been looking at Hermekton a bit, and it seems like there's a fundamental incompatibility with this because there isn't a way to blend 3 tasks (even sequential) that do: fetch inputs (non-hermetic) -> run build (hermetic) -> publish outputs (non-hermetic).

It's possible that I'm misunderstanding the API or how a sequence of taskruns is translated into a single taskrun, but this feels like a bit of a gap for what I'm feeling will become an increasingly interesting feature 🤔

I've been noodling on how to model exactly these kinds of flows with hermekton a bit, and this came up as an option, so I'd be curious if you'd thought about this and whether I'm missing something? thanks!

@bobcatfish
Contributor Author

hey @mattmoor ! looking at the hermekton docs it looks like what we've currently got implemented doesn't (yet - I'm guessing this is to enable some initial experimentation?) measure up to what TEP-0025 hermekton proposes, which would give you the ability to specify at least at a task level a hermetic execution mode.

So I'm thinking that if we update the hermekton support to match the TEP, then we could support what you're describing via this functionality. If we wanted to support it in this current form (pipeline to taskrun) we'd want to expand TEP-0025 to also supply the config option at a step level, but that seems like a reasonable addition to me, and it seems very doable given that this seems to be implemented at the level of the individual step (https://github.com/tektoncd/pipeline/blob/b7fa888082bd20e08edc3b89b6b4d52ed4f00651/cmd/entrypoint/runner.go#L62-L63).

Also (see "What comes next?" in the PR description) my hope is that eventually this would be done by translating directly from a Pipeline to a pod without the TaskRun in the middle which might make things a bit simpler.
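
To illustrate the step-level idea, here is a rough Python sketch of the fetch -> build -> publish case. The `executionMode` field, the `"default"` value, and both function names are hypothetical; they are not the TEP-0025 API, just a way to show why a single TaskRun-level setting cannot describe the merged steps while a per-step setting can.

```python
def assign_step_modes(pipeline_tasks, tasks):
    """Flatten tasks into steps, tagging each step with its task's execution mode.

    `executionMode` is a made-up field name used only for this sketch.
    """
    steps = []
    for ptask in pipeline_tasks:
        mode = ptask.get("executionMode", "default")
        for step in tasks[ptask["taskRef"]["name"]]["steps"]:
            steps.append({
                **step,
                "name": f'{ptask["name"]}-{step["name"]}',
                "executionMode": mode,
            })
    return steps


def taskrun_level_mode(steps):
    """Return the one mode shared by all steps, or None when modes are mixed."""
    modes = {s["executionMode"] for s in steps}
    return modes.pop() if len(modes) == 1 else None


# Example data (made up): only the build task is hermetic.
example_ptasks = [
    {"name": "fetch", "taskRef": {"name": "fetch-inputs"}},
    {"name": "build", "taskRef": {"name": "run-build"}, "executionMode": "hermetic"},
    {"name": "publish", "taskRef": {"name": "publish-outputs"}},
]
example_task_specs = {
    "fetch-inputs": {"steps": [{"name": "get"}]},
    "run-build": {"steps": [{"name": "compile"}]},
    "publish-outputs": {"steps": [{"name": "push"}]},
}
example_steps = assign_step_modes(example_ptasks, example_task_specs)
```

With this data the steps carry mixed modes, so `taskrun_level_mode(example_steps)` returns `None`: there is no single value that could be set once for the whole TaskRun.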

@mattmoor
Member

which would give you the ability to specify at least at a task level a hermetic execution mode

It already does this, just with an annotation instead of a field. IIRC the e2e tests for this put an annotation whose value is "hermetic" on the TaskRun.

we'd want to expand TEP-0025 to also supply the config option at a step level

This seems awfully verbose, but yes, you'd need the ability to specify things at a step level in order to continue to do this translation. This expands to effectively everything currently specified at the TaskRun level, which spirals fast (the kinds of things that scare me are things like serviceAccountName!).

my hope is that eventually this would be done by translating directly from a Pipeline to a pod without the TaskRun in the middle

While I agree this is possible, I suspect it will result in a lot of redundancy (which often leads to drift/rot), so I'm a little concerned if that's the intended direction of this.

Given that this is also a subset of pipeline functionality (and it's unclear whether it's even possible to reach 100% conformance), I'm also trying to understand the eventual goal of this work a bit better.

bobcatfish added a commit to bobcatfish/experimental that referenced this issue Aug 24, 2021
I had tried to add the custom task and OWNERS file all at once in
tektoncd#770 but @jerop pointed
out it's probably a better approach to just add the OWNERS file first,
which would allow getting the initial approval needed to kick off the
project and from then on the project OWNERS can take care of reviews
(and she documented this in
tektoncd#782 !)

So this commit adds just the OWNERS (more welcome if anyone else is
interested!) for the pipeline to taskrun custom task, and there will be
a follow-up PR with the initial controller logic.

Project proposal: tektoncd/community#447
bobcatfish added a commit to bobcatfish/experimental that referenced this issue Aug 24, 2021
tekton-robot pushed a commit to tektoncd/experimental that referenced this issue Aug 25, 2021
bobcatfish added a commit to bobcatfish/experimental that referenced this issue Aug 27, 2021
bobcatfish added a commit to bobcatfish/experimental that referenced this issue Sep 2, 2021
tekton-robot pushed a commit to tektoncd/experimental that referenced this issue Sep 2, 2021
@tekton-robot
Contributor

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale with a justification.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle stale

Send feedback to tektoncd/plumbing.

@tekton-robot tekton-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 18, 2021
@tekton-robot
Contributor

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten with a justification.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle rotten

Send feedback to tektoncd/plumbing.

@tekton-robot tekton-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Dec 18, 2021
@tekton-robot
Contributor

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen with a justification.
Mark the issue as fresh with /remove-lifecycle rotten with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/close

Send feedback to tektoncd/plumbing.

@tekton-robot
Contributor

@tekton-robot: Closing this issue.

