Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TEP-0040 implementation - specifying onError in a step #4106

Merged
merged 1 commit into from Aug 11, 2021

Conversation

pritidesai
Copy link
Member

@pritidesai pritidesai commented Jul 20, 2021

Changes

This PR implements TEP-0040 - ignore a step error 🛎.

When a step in a task results in a failure, the rest of the steps in the task are skipped and the taskRun is declared a failure. If you would like to ignore such step errors and continue executing the rest of the steps in the task, you can specify onError for such a step.

onError can be set to either continue or fail as part of the step definition. If onError is set to continue, the entrypoint sets the original failed exit code of the script in the container terminated state. A step with onError set to continue does not fail the taskRun and continues executing the rest of the steps in a task.

For example,

steps:
  - image: docker.io/library/golang:latest
    name: ignore-unit-test-failure
    onError: continue
    script: |
      go test .

The original failed exit code will be part of the run status:

 "steps": [
    {
      "container": "step-ignore-unit-test-failure",
      "imageID": "...",
      "name": "ignore-unit-test-failure",
      "terminated": {
        "containerID": "...",
        "exitCode": 1,
        "finishedAt": "2021-06-21T18:22:05Z",
        "reason": "Completed",
        "startedAt": "2021-06-21T18:22:05Z"
      }
    },
  ],

And will be available in a file /tekton/steps/0/exitCode.

This is an alpha feature. The enable-api-fields feature flag must be set to "alpha" to specify onError for a step.

This commit includes following changes:

  • Changing entrypoint to include three new flags on_error, step_metadata_dir, and step_metadata_dir_link.
  • Adding a new function as part of the runner CreateDirWithSymlink.
  • Creating a volume /tekton/steps/.
  • Supporting a path variable $(steps.step-.exitCode.path) and $(steps.step-unnamed-.exitCode.path).
  • API additions - onError - while defining a step.
  • Writing exitCode at /tekton/steps/step-/exitCode or /tekton/steps/step-unnamed-/exitCode.
  • Set the exitCode of a terminated state to a non-zero exit code.
  • Doc, unit test, and examples for this feature.

/kind feature

Submitter Checklist

As the author of this PR, please check off the items in this checklist:

  • Docs included if any changes are user facing
  • Tests included if any functionality added or changed
  • Follows the commit message standard
  • Meets the Tekton contributor standards (including
    functionality, content, code)
  • Release notes block below has been filled in or deleted (only if no user facing changes)

Release Notes

Use `onError` for a step if you would like to ignore a step error and continue executing the rest of the steps in the task. Set `onError` to `continue` to ignore a step error. When `onError` is set to `continue`,  the entrypoint sets the original failed exit code of the script in the container terminated state. A `step` with `onError` set to `continue` does not fail the `taskRun` and continues executing the rest of the steps in a task.

@tekton-robot tekton-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. labels Jul 20, 2021
@tekton-robot tekton-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Jul 20, 2021
@tekton-robot
Copy link
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
cmd/entrypoint/post_writer.go 0.0% 63.6% 63.6
internal/builder/v1beta1/step.go 42.1% 38.1% -4.0
pkg/apis/pipeline/v1beta1/task_validation.go 97.0% 97.0% 0.1
pkg/entrypoint/entrypointer.go 66.1% 71.0% 4.9
pkg/pod/entrypoint.go 85.9% 87.2% 1.3
pkg/pod/pod.go 87.6% 87.4% -0.2
pkg/pod/status.go 91.8% 90.8% -1.0
pkg/reconciler/taskrun/taskrun.go 76.9% 77.0% 0.1

@tekton-robot
Copy link
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
cmd/entrypoint/post_writer.go 0.0% 63.6% 63.6
internal/builder/v1beta1/step.go 42.1% 38.1% -4.0
pkg/apis/pipeline/v1beta1/task_validation.go 97.0% 97.0% 0.1
pkg/entrypoint/entrypointer.go 66.1% 71.0% 4.9
pkg/pod/entrypoint.go 85.9% 87.2% 1.3
pkg/pod/pod.go 87.6% 87.4% -0.2
pkg/pod/status.go 91.8% 90.8% -1.0
pkg/reconciler/taskrun/taskrun.go 76.9% 77.0% 0.1

@tekton-robot
Copy link
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
cmd/entrypoint/post_writer.go 0.0% 63.6% 63.6
internal/builder/v1beta1/step.go 42.1% 38.1% -4.0
pkg/apis/pipeline/v1beta1/task_validation.go 97.0% 97.0% 0.1
pkg/entrypoint/entrypointer.go 66.1% 71.0% 4.9
pkg/pod/entrypoint.go 85.9% 87.2% 1.3
pkg/pod/pod.go 87.6% 87.4% -0.2
pkg/pod/status.go 91.8% 90.8% -1.0
pkg/reconciler/taskrun/taskrun.go 76.9% 77.0% 0.1

@pritidesai pritidesai added this to the Pipelines v0.27 milestone Jul 20, 2021
@tekton-robot
Copy link
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
cmd/entrypoint/post_writer.go 0.0% 63.6% 63.6
internal/builder/v1beta1/step.go 42.1% 38.1% -4.0
pkg/apis/pipeline/v1beta1/task_validation.go 97.0% 97.0% 0.1
pkg/entrypoint/entrypointer.go 66.1% 71.0% 4.9
pkg/pod/entrypoint.go 85.9% 87.2% 1.3
pkg/pod/pod.go 87.6% 87.4% -0.2
pkg/pod/status.go 91.8% 90.8% -1.0
pkg/reconciler/taskrun/taskrun.go 76.9% 77.0% 0.1

@tekton-robot
Copy link
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
cmd/entrypoint/post_writer.go 0.0% 63.6% 63.6
internal/builder/v1beta1/step.go 42.1% 38.1% -4.0
pkg/apis/pipeline/v1beta1/task_validation.go 97.0% 97.0% 0.1
pkg/entrypoint/entrypointer.go 66.1% 71.0% 4.9
pkg/pod/entrypoint.go 85.9% 87.2% 1.3
pkg/pod/pod.go 87.6% 87.4% -0.2
pkg/pod/status.go 91.8% 90.8% -1.0
pkg/reconciler/taskrun/taskrun.go 76.9% 77.0% 0.1

@tekton-robot
Copy link
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
cmd/entrypoint/post_writer.go 0.0% 61.9% 61.9
internal/builder/v1beta1/step.go 42.1% 38.1% -4.0
pkg/apis/pipeline/v1beta1/task_validation.go 97.0% 97.0% 0.1
pkg/entrypoint/entrypointer.go 66.1% 71.0% 4.9
pkg/pod/entrypoint.go 85.9% 87.2% 1.3
pkg/pod/pod.go 87.6% 87.4% -0.2
pkg/pod/status.go 91.8% 90.8% -1.0
pkg/reconciler/taskrun/taskrun.go 76.9% 77.0% 0.1

@tekton-robot
Copy link
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
cmd/entrypoint/post_writer.go 0.0% 61.9% 61.9
internal/builder/v1beta1/step.go 42.1% 38.1% -4.0
pkg/apis/pipeline/v1beta1/task_validation.go 97.0% 97.0% 0.1
pkg/entrypoint/entrypointer.go 66.1% 71.0% 4.9
pkg/pod/entrypoint.go 85.9% 87.2% 1.3
pkg/pod/pod.go 87.6% 87.4% -0.2
pkg/pod/status.go 91.8% 90.8% -1.0
pkg/reconciler/taskrun/taskrun.go 76.9% 77.0% 0.1

@vdemeester vdemeester self-assigned this Jul 22, 2021
stepPath = flag.String("step_path", "", "Relative step path, creates the specified path under /tekton/steps/ which "+
"can be used to store the step metadata e.g. <step-name> in /tekton/steps/<step-name>/")
stepPathLink = flag.String("step_path_link", "", "Relative step path, creates a symbolic link to the "+
"specified step path e.g. <step-index> in /tekton/steps/<step-index>")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Three optional flags here introduced here:

  • onError - this flag is initialized to either continue or fail based on the step definition
  • stepPath - the relative path specified in this flag is created under /tekton/steps/
  • stepPathLink - the relative path specified in this flag is created as a symbolic link to stepPath

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reasoning about stepPath and stepPathLink ? (aka are they required for this feature, if yes, how/why?)

}
// log and exit only if a step error must cause run failure
if e.OnError != entrypoint.ContinueOnError {
log.Fatalf("Error executing command (ExitError): %v", err)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checking if onError is set to continue to allow a step error. By default, the entrypoint considers a container exiting with non-zero as error and hence not included any special logic for onError set to fail. We can change this logic if needed.

@@ -20,3 +20,41 @@ func (*realPostWriter) Write(file string) {
log.Fatalf("Creating %q: %v", file, err)
}
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two more functions are introduced (1) to create a file and write the specified content in the file (2) to create a directory and a symbolic link.

e.WritePostFile(e.PostFile, nil)
e.WriteExitCodeFile(e.StepPathLink, "0")
default:
// for a step without continue on error and any error, write a post file with .err
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

creating an InternalTektonResultType with a failed exit code similar to StartedAt time which is then read while updating the pod status.

@pritidesai
Copy link
Member Author

The changes are ready for review, I will add more unit tests if we agree with the overall implementation 🙏 Appreciate your comments 🥺

Copy link
Member

@vdemeester vdemeester left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few question, more on design than on code (things I kinda missed during the TEP, that's fine though 😛)

stepPath = flag.String("step_path", "", "Relative step path, creates the specified path under /tekton/steps/ which "+
"can be used to store the step metadata e.g. <step-name> in /tekton/steps/<step-name>/")
stepPathLink = flag.String("step_path_link", "", "Relative step path, creates a symbolic link to the "+
"specified step path e.g. <step-index> in /tekton/steps/<step-index>")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reasoning about stepPath and stepPathLink ? (aka are they required for this feature, if yes, how/why?)

steps:
- image: docker.io/library/golang:latest
name: ignore-unit-test-failure
onError: continue
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just realizing now that… we could use onError with the debug feature too, like onError: debug 😅

docs/tasks.md Outdated
Comment on lines 344 to 358
#### Accessing Step's `exitCode` in subsequent `Steps`

A step can access the exit code of any previous step using the `path` similar to a task result, for example:

```shell
$(steps.step-<step-name>.exitCode.path)
```

The `exitCode` of a step without any name can be referenced using:

```shell
$(steps.step-unnamed-<step-index>.exitCode.path)
```

If you would like to use the tekton internal path, you can access the exit code by reading the file
(which is not recommended though since the path might change in the future):

```shell
cat /tekton/steps/step-<step-name>/exitCode
```

And, access the step exit code without a step name:

```shell
cat /tekton/steps/step-unnamed-<step-index>/exitCode
```
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels a little bit weird to me, as for example, the index for unnamed might appear confusing : step-unnamed-2 is the 2nd unnamed step or the 2nd step (and thus the first unnamed) ?

Maybe we could have /tekton/steps/by-index/<index>/… and /tekton/steps/by-name/<name>/… (and use link or copy there) ?

Copy link
Member Author

@pritidesai pritidesai Aug 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels a little bit weird to me, as for example, the index for unnamed might appear confusing : step-unnamed-2 is the 2nd unnamed step or the 2nd step (and thus the first unnamed) ?

This is designed to match how the containers are named. step-unnamed-2 is the third step (and the first unnamed).

The containers are named step-<name of the step> if the step has a name else they are named step-unnamed-<step-index> where <step-index> is the index of the step in the entire list of the steps.

kubectl get tr test-taskrun-4m59w -o json | jq .status.steps | jq '.[] | .container'
"step-step0"
"step-unnamed-1"
"step-step2"

Maybe we could have /tekton/steps/by-index/<index>/… and /tekton/steps/by-name/<name>/… (and use link or copy there)?

Let me know if it still sounds confusing, we can make it better.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok this is less confusing I think (especially if it follows the same).
I still think having that /tekton/steps/by-index/… and …/by-name/… would make sense 😛 (I may be biased by linux, but I am a fan of this type of construct in the /dev virtual fs)

Comment on lines 121 to 126
// create a step path pipeline.StepPath/step-<step-name>/ if a step name is specified
// or pipeline.StepPath/step-unnamed-<step-index>/ if a step name is missing
// Create a symlink:
// ln -s pipeline.StepPath/<step-index> pipeline.StepPath/<step-name>
e.PostWriter.CreatePath(e.StepPath, e.StepPathLink)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the future, I think we should write in /tekton/steps/by-index/<index>/… and then link to elsewhere (either by name, or to keep backward compat, …)

@@ -25,6 +25,8 @@ import (
"path/filepath"
"strings"

"github.com/tektoncd/pipeline/pkg/apis/pipeline"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: extra empty line 😝

Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, I left a bunch of nitty code / style comments but it looks pretty close to me!

}
}

// CreatePath creates the specified path and a symbolic link to the path
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest CreateDirWithSymlink or similar as the func name. CreatePath is nice and concise but the expected behaviour is much more specific than the name implies I think?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @sbwsg for the suggestion, have renamed it to CreateDirWithSymlink 👍

return
}
if err := os.MkdirAll(source, 0770); err != nil {
log.Fatalf("Creating file path %q: %v", source, err)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tiny nit: suggest Creating directory %q: %v since it's MkdirAll

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done ✅

cmd/entrypoint/post_writer.go Show resolved Hide resolved
@@ -142,6 +142,8 @@ of how this directory is used:
* These folders are [part of the Tekton API](../api_compatibility_policy.md):
* `/tekton/results` is where [results](#results) are written to
(path available to `Task` authors via [`$(results.name.path)`](../variables.md))
* `/tekton/steps` is where the `step` exitCodes are written to
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great. Would also love to see a section in this doc talking a bit more in-depth about the implementation details of step exit codes / onerror if possible? e.g. purpose of both the directory & the symlink, how the exitcode is shared between steps, etc...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

docs/tasks.md Outdated
A step can access the exit code of any previous step using the `path` similar to a task result, for example:

```shell
$(steps.step-<step-name>.exitCode.path)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here the exit code is inside the file pointed to by path, is that right? So to read the exit code you need to cat $(steps.step-X.exitCode.path)? Could the line before this read something more like, A step can access the exit code of any previous step by reading the file pointed to by the exitCode path variable:

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, yup, reworded and makes more sense now 🤣

OnError string
// StepPath is the path to that step directory where any step related metadata can be stored
StepPath string
// StepPathLink is the path which needs to be linked to the StepPath
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be useful to explain the intended purpose of the symlink too I think. Without the context it's a bit tricky to understand why it would be useful.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// create a step path pipeline.StepPath/step-<step-name>/ if a step name is specified
// or pipeline.StepPath/step-unnamed-<step-index>/ if a step name is missing
// Create a symlink:
// ln -s pipeline.StepPath/<step-index> pipeline.StepPath/<step-name>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we rename the method from CreatePath to CreateDirAndSymlink (or similar) I expect this comment could explain the purpose of creation rather than the mechanics. E.g. could read something like, // Create the directory where we will store the exit codes (and eventually other metadata) of Steps.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup make sense, done ✅

// set it to "fail" to indicate the entrypoint to exit the taskRun if the container exits with non zero exit code
// set it to "continue" to indicate the entrypoint to continue executing the rest of the steps irrespective of the container exit code
OnError string
// StepPath is the path to that step directory where any step related metadata can be stored
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: suggest StepDirectory or StepDir just to be super explicit. Also suggest for StepDirLink or similar.

Copy link
Member Author

@pritidesai pritidesai Aug 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed it to StepMetadataDir to match the following suggestion 👍

"Set to \"fail\" to declare a failure with a step error and stop executing the rest of the steps.")
stepPath = flag.String("step_path", "", "Relative step path, creates the specified path under /tekton/steps/ which "+
"can be used to store the step metadata e.g. <step-name> in /tekton/steps/<step-name>/")
stepPathLink = flag.String("step_path_link", "", "Relative step path, creates a symbolic link to the "+
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: suggest step_dir and step_dir_link (or maybe even step_metadata_dir? too much?) just to be super explicit about usage. Mirrors the explicitness of wait_file / post_file (though not termination_path 😭 )

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup, no harm in being a little more verbose, renamed to step_metadata_dir and step_metadata_dir_link

stepPath = flag.String("step_path", "", "Relative step path, creates the specified path under /tekton/steps/ which "+
"can be used to store the step metadata e.g. <step-name> in /tekton/steps/<step-name>/")
stepPathLink = flag.String("step_path_link", "", "Relative step path, creates a symbolic link to the "+
"specified step path e.g. <step-index> in /tekton/steps/<step-index>")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a tad confused about <step-index> in /tekton/steps/<step-index>. Does the user pass in just <step-index> or the full /tekton/steps/0?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup the user pass in just the <step-index>, added an explanation in the developers doc - https://github.com/tektoncd/pipeline/pull/4106/files#diff-fb2c200965217efdd737c2d18433086a2a6cd879a5b0213fe8253cb14546be6fR590

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, we might have a slight mismatch to tackle here. In the new unit tests we're passing the absolute paths over to the entrypoint from the pod package.

I think passing the absolute paths is a fine way to go, I just wanna make sure the arg help text here lines up with that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @sbwsg I am so sorry to confuse you 😭 it is taking the absolute path similar to the others i.e.

				"-wait_file", filepath.Join(mountPoint, fmt.Sprintf("%d", i-1)),
				"-post_file", filepath.Join(mountPoint, fmt.Sprintf("%d", i)),
				"-termination_path", terminationPath,
				"-step_metadata_dir", filepath.Join(pipeline.StepsDir, name),
				"-step_metadata_dir_link", filepath.Join(pipeline.StepsDir, fmt.Sprintf("%d", i)),
			}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated the doc and this file as well. While implementing I had a relative path in mind but going with how the other flags are implemented, chose to go with absolute path and forgot all about it 😢

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, makes total sense, cheers @pritidesai !

@pritidesai
Copy link
Member Author

What's the reasoning about stepPath and stepPathLink ? (aka are they required for this feature, if yes, how/why?)

They are required for this feature since this feature implements creating a file exitCode under the stepPath. stepPath is the flag which notifies the entrypoint what directory to create under /tekton/steps/ where the step related metadata can be stored, starting with exitCode in this PR and anything more in the future.

@tekton-robot tekton-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 10, 2021
Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be great to introduce a test of some kind that checks how results are treated in the event of failure. I see that the pipelinerun example includes a result from a step but it would be super useful to have some test code ensuring it's returned as expected. This might be best suited as an alpha integration test?

There are also 2 things which I think might be worth documenting as part of this PR:

  1. Behavior of Task Results written by Steps that continue on error. (probably as part of tasks.md?)
  2. Interaction of this feature with the debug/breakpoint feature. It looks like specifying breakpoint behaviour overrides the onError behaviour? (again probably worth capturing in tasks.md?)

Otherwise this lgtm!

@@ -104,6 +120,10 @@ func (e Entrypointer) Go() error {
_ = logger.Sync()
}()

// Create the directory where we will store the exit codes (and eventually other metadata) of Steps.
// Create a symlink to the directory for easier access by the index instead of a step name.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 nice, thanks a lot for tackling that.

@bobcatfish
Copy link
Collaborator

Hey @pritidesai ! Just a drive by comment, I added a comment to the TEP PR for this (tektoncd/community#463 (comment)), I think there's some behavior worth fleshing out in the TEP - maybe you already dealt with this in the PR and it's just a matter of updating the TEP, or maybe we need to discuss further.

@tekton-robot tekton-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 10, 2021
@tekton-robot
Copy link
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
cmd/entrypoint/post_writer.go 0.0% 77.8% 77.8
internal/builder/v1beta1/step.go 42.1% 38.1% -4.0
pkg/apis/pipeline/v1beta1/task_validation.go 97.0% 97.0% 0.1
pkg/entrypoint/entrypointer.go 66.1% 71.0% 4.9
pkg/pod/entrypoint.go 85.9% 87.2% 1.3
pkg/pod/pod.go 87.1% 86.9% -0.2
pkg/pod/status.go 91.8% 90.8% -1.0
pkg/reconciler/taskrun/resources/apply.go 99.1% 99.2% 0.0
pkg/reconciler/taskrun/taskrun.go 77.5% 77.6% 0.1

@tekton-robot
Copy link
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
cmd/entrypoint/post_writer.go 0.0% 77.8% 77.8
internal/builder/v1beta1/step.go 42.1% 38.1% -4.0
pkg/apis/pipeline/v1beta1/task_validation.go 97.0% 97.0% 0.1
pkg/entrypoint/entrypointer.go 66.1% 71.0% 4.9
pkg/pod/entrypoint.go 85.9% 87.2% 1.3
pkg/pod/pod.go 87.1% 86.9% -0.2
pkg/pod/status.go 91.8% 90.8% -1.0
pkg/reconciler/taskrun/resources/apply.go 99.1% 99.2% 0.0
pkg/reconciler/taskrun/taskrun.go 77.5% 77.6% 0.1

@tekton-robot
Copy link
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
cmd/entrypoint/post_writer.go 0.0% 77.8% 77.8
internal/builder/v1beta1/step.go 42.1% 38.1% -4.0
pkg/apis/pipeline/v1beta1/task_validation.go 97.0% 97.0% 0.1
pkg/entrypoint/entrypointer.go 66.1% 71.0% 4.9
pkg/pod/entrypoint.go 85.9% 87.2% 1.3
pkg/pod/pod.go 87.1% 86.9% -0.2
pkg/pod/status.go 91.8% 90.8% -1.0
pkg/reconciler/taskrun/resources/apply.go 99.1% 99.2% 0.0
pkg/reconciler/taskrun/taskrun.go 77.5% 77.6% 0.1

@pritidesai
Copy link
Member Author

It would be great to introduce a test of some kind that checks how results are treated in the event of failure. I see that the pipelinerun example includes a result from a step but it would be super useful to have some test code ensuring it's returned as expected. This might be best suited as an alpha integration test?

I have added an additonal e2e test here, please let me know if you find any additional tests which we can include 🙏

There are also 2 things which I think might be worth documenting as part of this PR:

  1. Behavior of Task Results written by Steps that continue on error. (probably as part of tasks.md?)

Added a section in tasks.md

  1. Interaction of this feature with the debug/breakpoint feature. It looks like specifying breakpoint behaviour overrides the onError behaviour? (again probably worth capturing in tasks.md?)

Yes that is correct, breakpoint overrides the step error functionality and breakpoint feature provides different tools to mark a step as failure/success. Have captured it briefly here

Otherwise this lgtm!

thanks @sbwsg for reviewing the PR

this commit implements tep-0049 - ignore a step error

When a `step` in a `task` results in a failure, the rest of the steps in the
`task` are skipped and the `taskRun` is declared a failure. If you would like
to ignore such step errors and continue executing the rest of the steps in
the task, you can specify `onError` for such a `step`.

`onError` can be set to either `continue` or `fail` as part of the
step definition. If `onError` is set to `continue`, the entrypoint sets the
original failed exit code of the script in the container terminated state.
A `step` with `onError` set to `continue` does not fail the `taskRun` and
continues executing the rest of the steps in a task.

This is an alpha feature. The `enable-api-fields` feature flag must be set to
`"alpha"` to specify `onError` for a `step`.

This commit includes following changes:

* Changing entrypoint to include three new flags `onError`, `stepMetadataDir`, and
`stepMetadataDirLink`.
* Adding one new function as part of the runner CreateDirWithSymlink
* Creating a volume `/tekton/steps/`
* Supporting a path variable $(steps.step-<stepName>.exitCode.path) and
$(steps.step-unnamed-<stepIndex>.exitCode.path)
* API spec `onError` while defining a step
* Writing exitCode at /tekton/steps/step-<step-name>/exitCode or
/tekton/steps/step-unnamed-<step-index>/exitCode
* Set the exitCode of a terminated state to a non-zero exit code
* Doc, unit test, and examples for this feature
@pritidesai pritidesai changed the title TEP-0040 implementation - specifying onError for a step TEP-0040 implementation - specifying onError in a step Aug 10, 2021
@tekton-robot
Copy link
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
cmd/entrypoint/post_writer.go 0.0% 77.8% 77.8
internal/builder/v1beta1/step.go 42.1% 38.1% -4.0
pkg/apis/pipeline/v1beta1/task_validation.go 97.0% 97.0% 0.1
pkg/entrypoint/entrypointer.go 66.1% 71.0% 4.9
pkg/pod/entrypoint.go 85.9% 87.2% 1.3
pkg/pod/pod.go 87.1% 86.9% -0.2
pkg/pod/status.go 91.8% 90.8% -1.0
pkg/reconciler/taskrun/resources/apply.go 99.1% 99.2% 0.0
pkg/reconciler/taskrun/taskrun.go 77.5% 77.6% 0.1

to access the result and run with the resolved value.

Now, a step can fail before initializing a result and the `pipeline` can ignore such step failure. But, the `pipeline`
will fail with `InvalidTaskResultReference` if it has a task consuming that task result. For example, any task
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wow, I hadn't even considered this case. Good point!

@ghost
Copy link

ghost commented Aug 11, 2021

/lgtm

@tekton-robot tekton-robot assigned ghost Aug 11, 2021
@tekton-robot tekton-robot added the lgtm Indicates that a PR is ready to be merged. label Aug 11, 2021
@tekton-robot tekton-robot merged commit 92746a2 into tektoncd:main Aug 11, 2021
pritidesai added a commit to pritidesai/pipeline that referenced this pull request Sep 21, 2021
Pipeline 0.27 was released in August 2021 which introduced an alpha feature to
specify onError in a step. This feature was implemented in the PR tektoncd#4106:
tektoncd#4106

To enable this feature, the Tekton Pipeline deployment had to set
enable-api-fields to alpha.

After discussing this in the API WG:
https://docs.google.com/document/d/17PodAxG8hV351fBhSu7Y_OIPhGTVgj6OJ2lPphYYRpU/edit#heading=h.ig94pf1w10xc
We have sent an email out to collect the feedback from the users.  @skaegi is
looking forward to this promotion.

This commit is updating task validation to no longer require enable-api-fields
is set to alpha.
The documentation is updated to moved it out of the alpha section.
The example yamls moved out of alpha.
@pritidesai pritidesai mentioned this pull request Sep 21, 2021
5 tasks
pritidesai added a commit to pritidesai/pipeline that referenced this pull request Sep 21, 2021
Pipeline 0.27 was released in August 2021 which introduced an alpha feature to
specify onError in a step. This feature was implemented in the PR tektoncd#4106:
tektoncd#4106

To enable this feature, the Tekton Pipeline deployment had to set
enable-api-fields to alpha.

After discussing this in the API WG:
https://docs.google.com/document/d/17PodAxG8hV351fBhSu7Y_OIPhGTVgj6OJ2lPphYYRpU/edit#heading=h.ig94pf1w10xc
We have sent an email out to collect the feedback from the users.  @skaegi is
looking forward to this promotion.

This commit is updating task validation to no longer require enable-api-fields
is set to alpha.
The documentation is updated to moved it out of the alpha section.
The example yamls moved out of alpha.
Updated e2e to not include alpha flag.
pritidesai added a commit to pritidesai/pipeline that referenced this pull request Sep 28, 2021
Pipeline 0.27 was released in August 2021 which introduced an alpha feature to
specify onError in a step. This feature was implemented in the PR tektoncd#4106:
tektoncd#4106

To enable this feature, the Tekton Pipeline deployment had to set
enable-api-fields to alpha.

After discussing this in the API WG:
https://docs.google.com/document/d/17PodAxG8hV351fBhSu7Y_OIPhGTVgj6OJ2lPphYYRpU/edit#heading=h.ig94pf1w10xc
We have sent an email out to collect the feedback from the users.  @skaegi is
looking forward to this promotion.

This commit is updating task validation to no longer require enable-api-fields
is set to alpha.
The documentation is updated to moved it out of the alpha section.
The example yamls moved out of alpha.
Updated e2e to not include alpha flag.
github-actions bot pushed a commit to sspreitzer/tektoncd-pipeline that referenced this pull request Oct 14, 2021
Pipeline 0.27 was released in August 2021 which introduced an alpha feature to
specify onError in a step. This feature was implemented in the PR tektoncd#4106:
tektoncd#4106

To enable this feature, the Tekton Pipeline deployment had to set
enable-api-fields to alpha.

After discussing this in the API WG:
https://docs.google.com/document/d/17PodAxG8hV351fBhSu7Y_OIPhGTVgj6OJ2lPphYYRpU/edit#heading=h.ig94pf1w10xc
We have sent an email out to collect the feedback from the users.  @skaegi is
looking forward to this promotion.

This commit is updating task validation to no longer require enable-api-fields
is set to alpha.
The documentation is updated to moved it out of the alpha section.
The example yamls moved out of alpha.
Updated e2e to not include alpha flag.
chenbh pushed a commit to chenbh/pipeline that referenced this pull request Oct 27, 2021
Pipeline 0.27 was released in August 2021 which introduced an alpha feature to
specify onError in a step. This feature was implemented in the PR tektoncd#4106:
tektoncd#4106

To enable this feature, the Tekton Pipeline deployment had to set
enable-api-fields to alpha.

After discussing this in the API WG:
https://docs.google.com/document/d/17PodAxG8hV351fBhSu7Y_OIPhGTVgj6OJ2lPphYYRpU/edit#heading=h.ig94pf1w10xc
We have sent an email out to collect the feedback from the users.  @skaegi is
looking forward to this promotion.

This commit is updating task validation to no longer require enable-api-fields
is set to alpha.
The documentation is updated to moved it out of the alpha section.
The example yamls moved out of alpha.
Updated e2e to not include alpha flag.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. kind/feature Categorizes issue or PR as related to a new feature. lgtm Indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants