This repository has been archived by the owner on Jun 12, 2023. It is now read-only.

Ability to add a "cleaner" step that runs no-matter the build status #133

Closed
byroot opened this issue Dec 11, 2015 · 12 comments

Comments

@byroot

byroot commented Dec 11, 2015

Context: Since we use intensive parallelism (up to ~200), getting the full list of errors is quite annoying. You have to navigate between many job outputs etc.

I'm prototyping something where we store the failure reports in a datastore and print them out as a final step.

Unfortunately this final step isn't run if any of the previous ones failed. I could silence failures, but then I'd be afraid things could break and nobody would notice.

Idea:

What would be great would be to define jobs with soft_fail: true, which would mean that they don't impact the build state if they fail.

@keithpitt @toolmantim thoughts?

@Soleone

Soleone commented Dec 16, 2015

Thanks for looking at this!

This comes up a lot: if you have 100 test runs in parallel and 25 failures, it's quite tedious to look at each one individually to find the test files that contain failures.

Right now you have to click each test run, scroll down, find the stacktrace, parse it with your eyes to find the filename and test name, then scroll back up, click the next red test run, and so on.

The optimal workflow would be to have some sort of table that displays each failure (and optionally the number of failures) right at the top.

@byroot
Author

byroot commented Dec 16, 2015

Reading back my feature request, I think SoftFail is not quite what I want.

I'd like some kind of alternative waiter whose following steps are executed even if previous jobs failed. It wouldn't change the final build status though.

@toolmantim

Just thinking this through, I wonder if silencing them (capturing the exit status yourself, storing it along with your other stuff, and then exiting 0) is actually the better solution. Then you have another step that checks your data, passes or fails based on that, and outputs the unified error list?
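
A minimal sketch of that capture-and-defer approach, assuming a results directory that every job can write to (in a real parallel build this would have to be a shared datastore or uploaded artifacts). The helper names `run_and_record` and `check_results` are hypothetical, not Buildkite features:

```shell
#!/usr/bin/env bash
# Hypothetical helpers illustrating the idea; nothing here is a Buildkite API.

# Run a command, record its exit status under a job name, and always
# return 0 so the CI step itself is reported as passed.
run_and_record() {
  local results_dir="$1" job_name="$2"
  shift 2
  mkdir -p "$results_dir"
  local status=0
  "$@" || status=$?
  printf '%s\n' "$status" > "$results_dir/$job_name.status"
  return 0
}

# Final step: print every job that recorded a non-zero status and
# return 1 if there was at least one, 0 otherwise.
check_results() {
  local results_dir="$1" failed=0 file status
  for file in "$results_dir"/*.status; do
    [ -e "$file" ] || continue   # no recorded results at all
    status="$(cat "$file")"
    if [ "$status" -ne 0 ]; then
      echo "FAILED: $(basename "$file" .status) (exit $status)"
      failed=1
    fi
  done
  return "$failed"
}
```

Each test step would call something like `run_and_record "$RESULTS_DIR" job-1 ./test.sh`, and the last step would call `check_results "$RESULTS_DIR"`. The weak point is that the build only goes red if that last step actually runs and finds the recorded statuses.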

@byroot
Author

byroot commented Dec 17, 2015

Yes, it's what I was doing initially, but I stopped when I realized it had a bunch of bad downsides. I mean, it's not impossible to do it that way, but it's way more fragile than if we had the equivalent of an ensure block.

@keithpitt
Member

I've been thinking of a new step type over the last few months. It's a type of cleaner that will run at the end of the build, regardless of whether it passed or failed. Do you think that would work?

@byroot
Author

byroot commented Dec 17, 2015

That would be perfect

@SomeoneWeird

+1 for cleaner step

@toolmantim changed the title from "Feature Request: Soft failures" to "Ability to add a "cleaner" step that runs no-matter the build status" on Feb 3, 2016

@gazmania

This is essentially a 'finally' block, isn't it? It could be used for more than just cleanup, so I'd maybe suggest it is called finally or something similar, as that should be a familiar construct to most.

@keithpitt
Member

👋 We shipped an option on the wait step that does exactly this.

Usually you define a wait step like so:

steps:
  - command: "test.sh"
  - wait
  - command: "another.sh"

In the above example, if test.sh fails, so does the build. The pipeline stops and nothing continues.

We added a continue_on_failure option to the wait step:

steps:
  - command: "test.sh"
  - wait: ~
    continue_on_failure: true
  - command: "another.sh"

Now if test.sh fails, the build will continue and another.sh will run. I've just pushed an example showing how to use this option, including how to annotate builds with JUnit test failure information:

https://github.com/buildkite/rspec-junit-example

@KJTsanaktsidis

@keithpitt The wait step changes don't actually solve the issue in this ticket. If you have something like this...

steps:
  - command: 'echo "Failure!"; exit 1'
  - wait
  - command: 'echo "Failure 2!"; exit 2'
  - wait: ~
    continue_on_failure: true
  - command: 'echo "Do some cleanup"'

The cleanup step won't run: the first wait already stops the pipeline when the first command fails, so the second wait, the one with continue_on_failure, is never reached. To appropriately address cleanup scenarios, the cleanup step needs to run regardless of which steps ran.

@avaly

avaly commented Dec 1, 2017

I recently got hit by the same issue that @KJTsanaktsidis is describing. I had something similar to:

steps:
  - label: test
    command: make test
  - wait
  - label: deploy
    command: make deploy
    branches: 'production'
  - wait: ~
    continue_on_failure: true
  - label: cleanup
    command: make clean

For more complex projects, where you can't really upload artifacts from one step to the other (e.g. building multiple Docker images), such a final "clean" step, which would run regardless of the previous commands' results, would be a good addition.

For my own project, I had to modify my CI workflows a bit and managed to perform clean-up in a pre-exit local hook script. But for more complex projects, I think this would not be sufficient.
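
For reference, a pre-exit hook along those lines might look like the sketch below. It assumes a repository hook file at `.buildkite/hooks/pre-exit` and that `make clean` is the cleanup command, as in the pipeline above; `cleanup_job` is a hypothetical function name. The agent runs this hook at the end of each job, so it cleans up per job, not once per build:

```shell
#!/usr/bin/env bash
# Sketch of a repository hook, assumed to live at .buildkite/hooks/pre-exit.

# Run a cleanup command (default: make clean) whatever the job's command did.
cleanup_job() {
  # BUILDKITE_COMMAND_EXIT_STATUS is set by the agent once the command
  # has finished; fall back to "unknown" when running outside of it.
  local status="${BUILDKITE_COMMAND_EXIT_STATUS:-unknown}"
  echo "--- cleanup (command exit status: ${status})"
  if [ "$#" -eq 0 ]; then
    set -- make clean   # assumption: same cleanup command as the pipeline above
  fi
  # Design choice of this sketch: a failing cleanup is reported but swallowed.
  "$@" || echo "cleanup command failed; ignoring"
  return 0
}
```

The hook file would end with a plain `cleanup_job` call. Because it only sees its own job, it cannot tidy up resources shared between steps, which is the limitation mentioned above.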

@KJTsanaktsidis

I ended up having our clean-up step also clean up any failed builds that came before it. Not elegant, but it worked for our purposes (destroying CloudFormation stacks used as part of integration tests; the cleanup is just there to stop setting money on fire).


8 participants