Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Design: Conditional execution #27

Closed
bobcatfish opened this issue Sep 10, 2018 · 35 comments
Closed

Design: Conditional execution #27

bobcatfish opened this issue Sep 10, 2018 · 35 comments
Assignees
Labels
design This task is about creating and discussing a design

Comments

@bobcatfish
Copy link
Collaborator

bobcatfish commented Sep 10, 2018

The work for this task is to design this feature and present one or more proposals (before implementing).

Expected Behavior

Pipelines need to be able to express Tasks that execute conditionally, e.g.:

  • If a Task fails, do X (e.g. notify slack)
  • If anything in the Pipeline fails, do Y (e.g. notify slack)
  • Regardless of success/failure, do Z (e.g. cleanup)

Additionally (as pointed out by @abayer !) Pipelines may need more complex logic, e.g.:

  • Only run this Task/follow this branch of tasks if we're not on a pull request (e.g. only publish an image, etc., from master branch)

This means we need to be able to express conditional execution generally against the state of the Pipeline. We can start with a specific subset, e.g.:

  • What Tasks have run
  • What Tasks have failed
  • The state of the Resources available to this Task (i.e. where you'd get information such as the branch)

Actual Behavior

At the moment, execution of a Pipeline will continue to traverse DAG the user has expressed until either the Pipeline completes or one of the Tasks fails (a step exits with a non-0 code).

Additional Info

For other designs:

This doc has a (very) rough start at expressing some of the requirements for this (note doc visible to members of knative-dev@googlegroups.com).

@bobcatfish
Copy link
Collaborator Author

One possibility: all Tasks and Pipelines could have a finally clause, which could be provided with params/env variables which indicate whether there was a success, failure, abort, etc. (vs. providing "on failure", "on success", and "finally" as separate concepts)

@abayer
Copy link
Contributor

abayer commented Oct 26, 2018

I think there's probably a need for both "do X after a Task or Pipeline if $SOMETHING is true" and "only run this Task or Pipeline if $SOMETHING is true before we start" - the most obvious use case that we see for the latter in Jenkins Declarative Pipelines is "only run this stage if we're not on a pull request" and the like. Having different environments, etc for PR runs does cut down on the need for that to some extent, but there's always going to be some fairly sizable chunks of work that people will want to do if-and-only-if some condition is met.

@bobcatfish
Copy link
Collaborator Author

Interesting point @abayer ! So it sounds like you would recommend that we make our requirements for this design a bit more general, e.g. instead of If a Task fails, do X (e.g. notify slack) it should be If Y is true, do X, with some mechanism to provide state information which includes things such as which Tasks have failed, succeeded, and maybe any other info available to a Pipeline (e.g. which git revision/branch is being used) ?

@abayer
Copy link
Contributor

abayer commented Oct 29, 2018

@bobcatfish - exactly! Otherwise, people will end up needing to have separate but nearly identical Pipelines for a PR build vs a master build etc. What other information needs to be available is a wider question, of course, and I've got a whole spiel I'm working up on extensibility in that area as well as others. =)

@bobcatfish
Copy link
Collaborator Author

Makes sense to me @abayer - I'll update the task description!

@bobcatfish bobcatfish added the design This task is about creating and discussing a design label Nov 6, 2018
@jchesterpivotal
Copy link

We hit a similar PRs -> many pipelines problem in Concourse, which has been a standing pain point. The solution being worked on is "spaces". A resource can pump versions into spaces which don't need to be defined when the pipeline is set.

Take git commits, since PRs are the motivating case. Right now, a git resource follows a single branch. A PR resource can follow commits in PRs. But what I don't get is a separate pipeline, unless I either have (a) a shared pipeline for everything, (b) separate pipelines for PRs vs main or (c) tooling that creates and destroys pipelines on need.

None of these is particularly palatable. The first is confusing and difficult to set up, the second introduces duplication and busywork, the third is cool but requires an investment in tooling and testing that might not be commensurate with the value. Most CF teams settle on (b), given the relative rareness of PR-based activity. Teams with a PR-based workflow gravitate towards (c). Teams working in private repos enjoy the luxury of ignoring the problem altogether.

The space concept would essentially add (c) as a feature to Concourse itself. A git resource can create a "space" for each PR that lands, which essentially allows a parallel pipeline to be created on demand, but which shares its history with the overall pipeline it belongs to.

Other examples of spaces I've seen discussed are upstream dependency versions and IaaSes. So you might create a new space when a new Ruby interpreter version lands (a problem buildpacks faces for literally hundreds of upstream dependencies), or have spaces to corral a single pipeline that runs deployment, upgrade and smoke tests across multiple platforms (Master Pipeline has this problem).

@abayer
Copy link
Contributor

abayer commented Nov 8, 2018

@jchesterpivotal - yeah, Jenkins' multibranch Pipelines are basically (c). I should really get more familiar with Concourse. =)

@bobcatfish
Copy link
Collaborator Author

We hit a similar PRs -> many pipelines problem in Concourse, which has been a standing pain point.

The way we've been talking about handling this is a decoupling Resources from Pipelines (if we can), discussion in #200 - so you can use the same Pipeline with different Resources.

@bobcatfish
Copy link
Collaborator Author

another use case here that @fejta brought up yesterday:

conditionally executing based on which files changed in a commit, e.g. if you only changed markdown files, maybe you don't want to run your whole test suite

my gut reaction is that the theoretical "metadata" that is provided to each Pipeline's Task can include information like this, and the Pipeline's Task can express if there are certain file changes it is interested in or not.

@abayer
Copy link
Contributor

abayer commented Nov 9, 2018

@bobcatfish - hey, yeah, I had a note for myself scribbled down to try to understand what role "changelog"s could play in this world. Actually, that's part of a broader area I'm not clear on - historical data from run to run. I'm still getting my head around the CRD and API model here, so I don't know if we'll be recording run metadata (like, in this example, the commit hash for the git resource used in a particular run, which we'd need to calculate the changes in a subsequent run) in the PipelineRun's CRD or some other data store. Either way, that metadata definitely needs to be available to the Tasks - not just SCM info, but, say, whether the previous run was successful or not (people do like their "build fixed!" notifications, and you can't do that if you can't tell what the previous run's result was), etc...

@bobcatfish
Copy link
Collaborator Author

Actually, that's part of a broader area I'm not clear on - historical data from run to run

This is something that probably isn't clear in our current docs or plans, but you can see there is a hint of it in the fact that there is a runs section in the results configuration:

https://github.com/knative/build-pipeline/blob/4270f16403e90cfd9f900110f37a078736fdfd9c/examples/pipelineparams.yaml#L9-L12

The idea was that this is a location where we would persist a copy of all of the Runs (i.e. TaskRun, PipelineRun) and probably all of the related resources that were used to execute that Run.

  • @tejal29 who is owning the results portion of the Pipeline design in general.

@abayer
Copy link
Contributor

abayer commented Nov 20, 2018

@tejal29 @bobcatfish - might be worth taking a look at Jenkins X's PipelineActivity CRD - see https://jenkins-x.io/architecture/custom-resources/ and https://github.com/jenkins-x/jx/blob/master/pkg/apis/jenkins.io/v1/types_pipeline.go. @rawlingsj can speak more to it - I'm just starting to dig in.

@abayer
Copy link
Contributor

abayer commented Jan 23, 2019

It seems to me that "after this Task or Pipeline completes, if some condition is true, do $SOME_THING" is mechanically distinct from "before executing this Task, check if some condition is true, and only execute this Task if it is true" - they could use similar mechanisms for checking conditions, for sure, but I've had a hard time trying to design both aspects in one fell swoop. Would it be reasonable to others for me to split out the "check before executing a Task" part (the equivalent of Jenkins' when configuration) into its own ticket, or should I stick to designing both aspects together?

@abayer
Copy link
Contributor

abayer commented Jan 23, 2019

I also just realized that the "only execute this Task if some condition is true" is basically a special case of what I'm talking about in #233 - that's for "wait until some condition is true before executing this Task", but we're going to want a timeout and what should happen if the condition is never true (i.e., skip the Task and continue executing other Tasks, or fail the whole run) on that anyway, so this scenario is just "wait until some conditions is true... except don't actually wait, just check once and then skip execution", i.e., a timeout of 0 and skipping rather than failing. Now I need to go revisit my initial design notes. =)

@abayer
Copy link
Contributor

abayer commented Jan 23, 2019

...and that means I owe @bobcatfish an apology for saying that #233 wasn't related to this. =)

@abayer
Copy link
Contributor

abayer commented Jan 23, 2019

Ok, thinking out loud a bit here as to how the design and implementation work for the whole scope of this could be split up. First, I'm gonna use shorthand here for the three different use cases for conditional behavior, and because I've been neck-deep in building/maintaining/etc the Jenkins Declarative Pipeline stuff, I'm going to reuse the terminology there for two of 'em:

  • wait until - this one doesn't have a direct analogue in Declarative, but shorthand is still useful. This is Design and implement paused Task execution until condition is met #233 - before a TaskRun starts, check if some condition is satisfied. If it is satisfied, run like normal. If it isn't satisfied, keep checking that condition periodically until either it is satisfied, or we've waited too long and we either skip the TaskRun or abort the whole PipelineRun depending on how things are configured.
  • post - After a TaskRun (or a PipelineRun) finishes, if some condition is satisfied, do something.
  • when - Before a TaskRun actually starts, check if some condition is satisfied. If the condition is satisfied, run the TaskRun as normal, otherwise skip this TaskRun. As I mentioned in a comment above, this could basically be looked at as a special case of wait until, without with the waiting/checking again, and always skipping the TaskRun rather than aborting the PipelineRun.

Now that I've rambled about terminology, on to the separate design/implementation threads...

  • The condition checking mechanism
    • We'd definitely want the same mechanism/interface to be used for all three use cases. While not all conditions may end up being relevant in all three use cases, there's certain to be enough overlap that it'd be goofy not to have condition checking be a common interface used by all three.
    • What kind of condition checks are we going to need?
      • We'd want to be able to check the results of earlier TaskRuns and the PipelineRun as a whole. In practice, this'll probably largely be used for post, and more specifically in a simple form of "if this TaskRun or PipelineRun failed/succeeded, do X". I feel like this might also be relevant in a more general form for when - not 100% sure if that'll really make sense, but better to do this as something that can be used by when as well as post than to paint ourselves into a corner down the road.
      • For post, we'll also want an unconditional condition. =) i.e., "do X after this TaskRun regardless of its result". Doesn't really apply to when or wait until, since that's the behavior if you're not specifying when or wait until anyway.
      • We'll definitely want to let Resources provide condition checks, for things like "am I on this branch", or "are we building a PR", or "does this file exist in this storage resource". My gut instinct is to put this on PipelineResourceInterface in some form, but that doesn't quite feel right to me in practice, since the actual condition check and arguments passed to it will differ from Resource to Resource, and a Resource may well have multiple condition checks available. This would be relevant in all three use cases.
      • I think we'll want to allow for checking some value on some other CRD as a condition as well - perhaps we define a new CRD type for other services to write to in order to record the result of some external process. This could allow things like checking whether a security scan has finished, or another PipelineRun has completed, and would be useful in all three use cases (particularly wait until, since that's literally the scenario I created Design and implement paused Task execution until condition is met #233 to address!).
      • The last condition check I can come up with right now is to basically evaluate a step - that would allow for running an arbitrary command on a container, so that you could do things like query an external API and use the result to decide what to do.
      • We also will want to be able to combine multiple condition checks - "after this TaskRun finishes, do X if the TaskRun succeeded and we're building a PR", e.g. I believe we'll want what basically amounts to and, or, and not.

...ok, this is getting a bit long, so I'm going to stop here, get more thoughts organized, and come back later today or tomorrow. =)

@abayer
Copy link
Contributor

abayer commented Jan 24, 2019

In re: the "checking some value on some other CRD" condition - I'm pretty sure that we want something like this, but I don't know exactly how we'd use it yet, so this could be tabled until later.

@abayer
Copy link
Contributor

abayer commented Jan 29, 2019

@bobcatfish - any thoughts on my comment above? I'm not at all sure yet how to actually implement it, mind you. =)

@bobcatfish
Copy link
Collaborator Author

I am so sorry this is taking so long to review @abayer 😓going to prioritize this for next week 🤞🤞🤞

@bobcatfish
Copy link
Collaborator Author

Okay sorry again @abayer finally got a chance to review this, thanks for the thorough write-up! Now that I've got my head in this space, I'll try to be way more prompt with responding so you won't be blocked anymore :D

Mostly I'd say I agree with everything you've said above!

We'd definitely want the same mechanism/interface to be used for all three use cases.

I totally agree!

We'll definitely want to let Resources provide condition checks

What I'm wondering is if we can define one way of expressing these conditions, and use that for all conditional cases? i.e. if:

  • The unit we are conditionally executing is always a Task
  • The unit we use to express that conditional execution is something else (maybe a "condition" CRD?)
  • There is some intuitive way to hook up these conditions (/me waves hands)

(I'm thinking that putting those 3 features together could satisfy all of post, wait_until and when? maybe I'm over simplifying tho!)

we define a new CRD type for other services to write to in order to record the result of some external process.

Totally agree - a few ideas that come to mind:

  • Some kind of "metadata" is provided to all downstream Tasks (maybe this is literally the "status" section of the PipelineRun?)
  • Tasks write json to a specific location, that is added as metadata for future Tasks to consume

The unit we use to express that conditional execution is something else

I think it would be very interesting if we could allow a lot of flexibility here: i.e. if someone wanted to, they could write a Python script (i.e. we don't force a new language on them, they can use the language of their choice), but we provide some kind of simple default (and maybe allow something super flexible like rego even 😇 )

@abayer
Copy link
Contributor

abayer commented Feb 5, 2019

Sounds right to me - post, wait_until, and when are really just different "if $CONDITION, then do $X" cases - different $X, different places in the flow, etc, but same condition checking mechanisms and capabilities, so yeah, some abstraction for defining conditions and their arguments is exactly what I think we need. =)

@bobcatfish
Copy link
Collaborator Author

So for next steps @abayer what do you think about making a doc where we can distill your list above
into some requirements (pretty close as is, those are great use cases anyway!) and brainstorm some alternative approaches? :D

Looks like we have a couple ideas already:

  1. Some kind of condition CRD which controls whether or not a Task executes, based on some metadata
  2. Explicitly adding post, when, wait_until type fields to Tasks
  3. Similar to (2), but on Resources
  4. Similar to (2) but via a kind of step on Tasks
  5. ?

@abayer
Copy link
Contributor

abayer commented Feb 6, 2019

Will do!

@hrishin
Copy link
Member

hrishin commented Mar 12, 2019

Hey @bobcatfish @abayer,

Thanks! for having this feature discussion. A kind request, May I know where we are up to in terms of pipeline spec design?

One of the simple use case we are trying to address is add to "manual approval" gate in pipeline execution.
So pipeline says

pipeline:
 task1: 
 task2: 
 task3:
 task4:
 - wait: manual-approval  

So pipeline execution waits to start task4 until its approved by user or something like that.

One of the way pipeline could relay on pipelineresource where it would evaluate all parameters to true before pipeline proceeds to execute further tasks.
e.g. pipeline spec would look something like this

apiVersion: tekton.dev/v1alpha1
kind: PipelineResource
metadata:
  name: manual-approval
spec:
  type: wait-until                # could provide better key name
  params:
  - name: approved
    value: false
---
apiVersion: tekton.dev/v1alpha1
kind: Pipeline
metadata:
  name: gated-pipeline
spec:
  tasks:
  - name: foo-task
    resources:
      inputs:
      - name: approval
        resource: manual-approval

Really not sure if its right to mutate pipelineresource or achieve it using pipelinerun, once user approves it, a pipeline would start to execute the task which is in a pending state.

@abayer
Copy link
Contributor

abayer commented Mar 20, 2019

/assign @abayer

@abayer
Copy link
Contributor

abayer commented Mar 20, 2019

@abayer
Copy link
Contributor

abayer commented Mar 25, 2019

Doc now has a bunch of stream-of-consciousness thoughts and some very rough mockups for CRD changes. As I mention in the stream of consciousness bit, I think that manual approval and "wait for condition" in general probably shouldn't be in the first push - we should start with "execute this taskrun if..." and "execute these steps after a taskrun or pipelinerun if...", making sure the condition checking is extensible so that we can add "wait until..." later. "wait until..." will necessitate some sort of pubsub/message bus/eventing system that would allow external services/apps to report their completion/result so that the condition checking in the TaskRun controller could pick that up, and I'm not nearly qualified to design/implement (or even describe, really) what that should look like. =)

@bobcatfish
Copy link
Collaborator Author

I think we should consider this issue completed once we have relative agreement re. the design, then we can open separate issues to tackle the implementation.

Note @abayer that @kimsterv and I are planning to try to flesh out some more requirements next week and collaborate on the doc :D

@bobcatfish bobcatfish self-assigned this Apr 25, 2019
@bobcatfish bobcatfish added this to the Pipelines 0.4 milestone Apr 25, 2019
chmouel pushed a commit to chmouel/tektoncd-pipeline that referenced this issue Apr 29, 2019
@bobcatfish bobcatfish assigned dibyom and unassigned bobcatfish Apr 30, 2019
@bobcatfish
Copy link
Collaborator Author

@kimsterv has put together a doc solely for requirements: https://docs.google.com/document/d/1C3crx89VXbeIqAem1648IxP3SCWsQvsgjM0r2vmiPIY (shared with tekton-dev)

@dibyom
Copy link
Member

dibyom commented May 28, 2019

Lots of great ideas above! I'm working on distilling them into a design doc (and a quick PoC) -- mostly based on @abayer 's ideas around the condition CRD + @vincent-pli 's suggestion of using containers to do the evaluation.

At a high level, here is what I think the initial design should look like:

  • Conditions are specified between tasks in a pipeline and are evaluated before the first step in a Task is run.
  • Condition evaluation is just a container step with the exit code determining if the condition is true or false.
  • The condition step has access to all of the task's resources plus additional metadata such as the pipelinerun status (e.g. run this task only if a previous task is successful).
  • We'll provide simple built in checks (e.g. strings equal) to start with though users can extend it by providing their own Condition CRD

@bobcatfish
Copy link
Collaborator Author

Looks like a great scoping to me - @dibyom do you want to review this in the working group meeting tomorrow?

@dibyom
Copy link
Member

dibyom commented May 28, 2019

Yes! I'll get something written up by then!
Edit: Doc is here: https://docs.google.com/document/d/1iTb0rDDlgQcPbiYIGJ-atkaML8aCyPCyhfucCPJmNLs/edit#

@dibyom
Copy link
Member

dibyom commented Jun 26, 2019

Closing since we have a design in place and issue to track (#1137) the implementation.
/close

@tekton-robot
Copy link
Collaborator

@dibyom: Closing this issue.

In response to this:

Closing since we have a design in place and issues to track the implementation.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

bobcatfish added a commit to bobcatfish/pipeline that referenced this issue Jul 30, 2019
@dibyom has been contributing significant work including variable path
resource substitution tektoncd#877 and
his continued work on one of our oldest and most sought after features:
conditional execution (tektoncd#27)!

He has also reviewed 44+ PRs in the Pipelines repo:
https://tekton.devstats.cd.foundation/d/46/pr-reviews-by-contributor?orgId=1&var-period=d&var-repo_name=tektoncd%2Fpipeline&var-reviewers="dibyom"
tekton-robot pushed a commit that referenced this issue Aug 5, 2019
@dibyom has been contributing significant work including variable path
resource substitution #877 and
his continued work on one of our oldest and most sought after features:
conditional execution (#27)!

He has also reviewed 44+ PRs in the Pipelines repo:
https://tekton.devstats.cd.foundation/d/46/pr-reviews-by-contributor?orgId=1&var-period=d&var-repo_name=tektoncd%2Fpipeline&var-reviewers="dibyom"
@oiricaud
Copy link

oiricaud commented Jul 7, 2020

Has this been implemented yet? And if so... what version of Tekton.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
design This task is about creating and discussing a design
Projects
None yet
Development

No branches or pull requests

7 participants