Ability to re-trigger failed build with the same input versions #413

Open
davewalter opened this issue May 2, 2016 · 55 comments

@davewalter davewalter commented May 2, 2016

When using the new version: every configuration for a get task, it is possible to arrive at a state where you have multiple builds of the same job running at the same time. If an earlier build fails, there is no way to re-trigger it with the same set of inputs. We haven't been able to determine a useful workaround; setting serial: true doesn't really help in this scenario, because the next build will start as soon as the first one fails.

It would be helpful if there were a way to re-trigger the job with the same inputs as a particular build (failed or otherwise).

Let us know if you need more details on this scenario or our desired fix. Thanks!

@davewalter and @rmasand

Member

@vito vito commented Jul 15, 2016

We're thinking about splitting today's trigger build button (+). It is primarily used for three things today:

  1. Impatience: I just pushed something or know that someone just published something, and I want the build to run now.
  2. Retrying a build (this issue): I want to re-run the current build, either to see if it's flaky or to retry because something outside the build failed (e.g. github, a deployment, etc.).
  3. Triggering a job that only ever manually runs, e.g. shipping a product after you've written release notes.

The flaw with case 1 is that there's a race condition. In the time between you loading the page and clicking the +, Concourse may have already found your stuff and queued a build. Now you have two, which is annoying.

The flaw with case 2 is you can only do it with the latest build, and also if you triggered a bunch, new versions may come in, potentially invalidating your flakiness trial. You could set version to a particular version in your pipeline, but that's annoying.

Case 3 pretty much works, but you don't know what versions it'll use until you run it. See #269

So, I think we should split + into two buttons. One that lives on the job, "sync", which will make sure everything's up-to-date and then queue up a build if it should (i.e. one's not queued already; same semantics as auto-triggering). The other button would be associated with a particular build of the job, and would re-trigger it with the same inputs. This covers cases 1 and 2.
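To make the proposed split concrete, here is a minimal sketch of the two actions. It is a toy model, not Concourse's implementation: the `Job`, `Build`, `sync`, and `retrigger` names are hypothetical, and "one's not queued already" is assumed to mean "no pending build with the same inputs exists".

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Build:
    id: int
    inputs: Dict[str, str]      # resource name -> version used by this build
    status: str = "pending"     # pending | succeeded | failed


@dataclass
class Job:
    name: str
    builds: List[Build] = field(default_factory=list)

    def _new_build(self, inputs: Dict[str, str]) -> Build:
        build = Build(id=len(self.builds) + 1, inputs=dict(inputs))
        self.builds.append(build)
        return build

    def sync(self, latest_versions: Dict[str, str]) -> Optional[Build]:
        """Job-level action: queue a build only if auto-triggering would.

        Assumption: a build counts as "already queued" when a pending
        build with the same inputs exists.
        """
        for build in self.builds:
            if build.status == "pending" and build.inputs == latest_versions:
                return None  # avoids the double-build race from case 1
        return self._new_build(latest_versions)

    def retrigger(self, build_id: int) -> Build:
        """Build-level action: run again with the exact inputs of one build."""
        original = next(b for b in self.builds if b.id == build_id)
        return self._new_build(original.inputs)


job = Job("unit-tests")
first = job.sync({"repo": "abc"})       # queues build 1
duplicate = job.sync({"repo": "abc"})   # None: build 1 is still pending
first.status = "failed"
job.sync({"repo": "def"})               # a newer version arrives, build 2
rerun = job.retrigger(first.id)         # build 3 reuses {"repo": "abc"}
```

In this model a re-trigger is unaffected by newer versions arriving, which is the property case 2 lacks today.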

The third case needs some more thinking since a "sync" button alone doesn't intuitively seem like enough given that the build only manually triggers.

@vito vito changed the title from "Ability to ReTrigger Failed Job With Same Version of Resources" to "Ability to re-trigger failed build with the same input versions" Jul 15, 2016
@concourse-bot concourse-bot added unscheduled and removed scheduled labels Jul 22, 2016

@endzyme endzyme commented Aug 15, 2016

+1 this would be a great, and much needed, feature for Concourse CI

@vito vito added the help wanted label Oct 19, 2016
Contributor

@charlieoleary charlieoleary commented Nov 15, 2016

+1 As well, this is a pretty critical feature. We've sort of circumvented it with empty commits (since we're using the PR resource), but it would be ideal to simply retry a failed job with the same inputs.

Contributor

@ahelal ahelal commented Nov 15, 2016

Would love to see that. We are doing crazy stuff to try to trigger old commits. Is this open for external people to help? And if so, how?

@primalmotion primalmotion commented Dec 3, 2016

Same thing here. I feel that Concourse has everything it needs to do this fairly easily. It would work perfectly with the pull request resource.

@tracker-common tracker-common commented Dec 14, 2016

+1000000 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍👍 👍👍👍 👍👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍👍 👍👍👍 👍👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍👍 👍👍👍 👍👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍👍 👍👍👍 👍👍

Contributor

@gabro gabro commented Feb 8, 2017

Same here, if you guys are busy can we help somehow?

@miromode miromode commented Mar 8, 2017

+1

Contributor

@kei-yamazaki kei-yamazaki commented Apr 10, 2017

👍

Contributor

@timrchavez timrchavez commented Apr 12, 2017

👍 This would go a long way to making Concourse a more viable choice for us.

@ls-yann-david ls-yann-david commented Apr 27, 2017

Pleaseeeeee 👍

@VanAxe VanAxe commented Apr 27, 2017

Retriggering jobs would be a lot more elegant than empty commits! 😝 👍

@olhtbr olhtbr commented Apr 27, 2017

👍

@vito vito removed the unscheduled label May 8, 2017

@tsantero tsantero commented May 9, 2017

Is this issue currently being actively worked on internally? I need this ASAP, but also don't want to duplicate work. Similarly, are there any contributor guidelines I should read?

@vito vito removed the help wanted label May 9, 2017
@vito vito added this to the Staging milestone May 17, 2017
@vito vito added the workflows label May 17, 2017
Member

@vito vito commented Apr 2, 2019

This is top-of-the-backlog now as we'll need it for #3602. The days of pinning and re-triggering and forgetting to un-pin are numbered!

Note: we might go for a quick-and-dirty version of this which directly replaces the build being re-triggered. In the future we'll want to keep track of each run of the build, but for the sake of unblocking #3602 quickly I think we should just start with this minimum viable solution as it has little to no implications on the UI/build ordering/etc.

Member

@vito vito commented Apr 9, 2019

Notes from IPM:

  • Preserve inputs on re-triggered build (duh) - make sure the scheduler/build starter doesn't re-compute them or use next_build_inputs
  • We'll need a button in the UI for re-triggering (@Lindsayauchin), in addition to the existing trigger-build button
  • We'll need to clear out the build events for the old build and "re-set" the build back to pending state (and reset whatever other build data is appropriate, i.e. start/end time)
  • Also: clear out any outputs for the build, otherwise a re-trigger from green to red could result in "phantom outputs" satisfying passed constraints
  • Re-compute build plan based on current pipeline config
    • If anyone is worried about this let us know - it's way easier to implement this way, and we figure there may be cases where a pipeline config change was made to fix the errant build anyway, in which case we'd want to pick up the new config.

We'll also spike on creating a new build instead of replacing it. They may actually be roughly the same difficulty. If we do this instead, we don't have to reset anything or clear out any outputs/etc.
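The two options in these notes can be contrasted with a small sketch. This is a toy model under stated assumptions, not Concourse code: the `Build` fields and the `reset_in_place` and `rerun_as_new` helpers are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Build:
    id: int
    inputs: Dict[str, str]                      # resource -> version
    status: str = "pending"
    events: List[str] = field(default_factory=list)
    outputs: Dict[str, str] = field(default_factory=dict)
    start_time: Optional[float] = None
    end_time: Optional[float] = None


def reset_in_place(build: Build) -> Build:
    """Replace-the-build option: reuse the same build and wipe its run state."""
    build.status = "pending"
    build.events.clear()
    build.outputs.clear()    # otherwise stale outputs could satisfy `passed`
    build.start_time = None
    build.end_time = None
    return build             # inputs are deliberately left untouched


def rerun_as_new(build: Build, next_id: int) -> Build:
    """New-build option: copy only the inputs; the old build is not modified."""
    return Build(id=next_id, inputs=dict(build.inputs))


old = Build(id=7, inputs={"repo": "abc"}, status="succeeded",
            events=["compiled"], outputs={"image": "v1"},
            start_time=1.0, end_time=2.0)

fresh = rerun_as_new(old, next_id=8)   # old build keeps its history
reset_in_place(old)                    # old build loses events and outputs
```

The sketch shows why the new-build option needs no cleanup step: nothing from the earlier run is carried over except the inputs.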

@vito vito added this to the v6.0.0 milestone May 10, 2019

@StevenArmstrong StevenArmstrong commented May 12, 2019

Could the trigger buttons be moved out from under the job and put on the main page somehow? It could have a trigger icon with a prompt or something. I say this because one of the main pieces of feedback our product lead gets about our Concourse pipelines is that users find the interface awkward and complain that they have to click through the green/red square of a job to trigger it. To mitigate this, we have had to create a job called trigger for production deployments so users know how to trigger a production deployment. As a result of the clicking around, we have even had requests to build a UI on top of Concourse to make it easier and more dev-friendly for roll-forward and rollback triggers :(

Member

@vito vito commented May 13, 2019

@StevenArmstrong So having a specific 'retrigger' button on failed builds in the pipeline? Sounds reasonable, though I would argue people should probably be clicking into the build first and understanding the failure rather than blindly re-triggering. 🤔 In any case we may want to discuss that as a separate issue that we can address after we take our first crack at this. :)

@hstenzel hstenzel commented May 13, 2019

One question I have is how we can retrigger a build if we no longer have the log from the original?

Member

@vito vito commented May 13, 2019

@hstenzel Build logs are purely cosmetic, the actual information regarding which versions/etc. are used is kept in the database and so re-triggering will still work.
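A rough illustration of that separation, using an in-memory SQLite database. The table and column names (`build_inputs`, `build_events`) and the `inputs_for` helper are hypothetical and are not Concourse's actual schema; the point is only that deleting log events leaves the recorded input versions intact.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE build_inputs (build_id INTEGER, resource TEXT, version TEXT);
    CREATE TABLE build_events (build_id INTEGER, payload TEXT);
""")

# Record which versions build 42 ran with, plus one log event.
conn.executemany(
    "INSERT INTO build_inputs VALUES (?, ?, ?)",
    [(42, "repo", "abc123"), (42, "image", "1.2.0")],
)
conn.execute("INSERT INTO build_events VALUES (?, ?)", (42, "compiling..."))

# Simulate the build log being discarded.
conn.execute("DELETE FROM build_events WHERE build_id = ?", (42,))


def inputs_for(build_id: int) -> dict:
    """Return the resource -> version mapping recorded for a build."""
    rows = conn.execute(
        "SELECT resource, version FROM build_inputs "
        "WHERE build_id = ? ORDER BY resource",
        (build_id,),
    ).fetchall()
    return dict(rows)


remaining_events = conn.execute(
    "SELECT COUNT(*) FROM build_events"
).fetchone()[0]
recorded_inputs = inputs_for(42)   # still available for a re-trigger
```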

@hstenzel hstenzel commented May 13, 2019

Member

@vito vito commented May 13, 2019

@StevenArmstrong StevenArmstrong commented May 14, 2019

@vito it wasn't really just for retrigger. It was more that, if you are redesigning what the + button or other buttons do, it would be really good if they weren't nested under the job, as users have said they find it confusing to have to click through to manually initiate a new build when deploying to production. Instead, I was suggesting moving them a layer up, as an icon on the pipeline page beside each job. A confirmation pop-up could then guard against accidental clicks and offer the choice to build with latest, retrigger, or cancel. This way you wouldn't need multiple buttons, just one button with multiple sub-options, and you wouldn't have to click through to the log to trigger anything. This comes up frequently in UI feedback from our users.

Contributor

@Lindsayauchin Lindsayauchin commented May 14, 2019

@StevenArmstrong interesting idea. I think that, from pipelines we have observed (like the RabbitMQ team's at Pivotal, below), an action button to trigger a job on the pipeline page is just not scalable.

[Image: Screen Shot 2019-05-13 at 5 34 04 PM]

We are thinking about the user pains around triggering a build with the work being done on the resource version. You can follow a related issue #3403 to see our progress on the UX changes.

@hstenzel hstenzel commented May 14, 2019

If I understand correctly, to retrigger a specific job I'd scroll left/right on the build page to find the correct job?

Also, I'd potentially want to retrigger a successful job too, thinking about the case of rebuilding an artifact that was accidentally lost.

Perhaps these questions are really more about the UI related to the feature.

Member

@vito vito commented May 14, 2019

@hstenzel Yep - to re-trigger a build you would do so from the build's page. There's no such thing as re-triggering a job - you can trigger a new build of a job, but the re- part of re-trigger means you're running an already-existing build for a second time with the same inputs.

You will be able to re-trigger a build regardless of whether it succeeded or failed; they both have their use case: re-trigger a succeeded build to detect flakes, re-trigger failed build to allow artifacts to continue along the pipeline.

@StevenArmstrong StevenArmstrong commented May 14, 2019

@Lindsayauchin most people use pipeline groups on concourse to visualise big pipelines with many jobs. So I still think in combination with pipeline groups it could still be viable to have the trigger icons on the pipeline page.

@vito vito added enhancement and removed needs-validation labels Jul 29, 2019
@vito vito added this to To do in Algorithm v3 Jul 30, 2019
@vito vito moved this from To do to In progress in Algorithm v3 Jul 30, 2019
@vito vito moved this from In progress to To do in Algorithm v3 Jul 30, 2019
@vito vito added this to To do in Build re-running Jul 30, 2019
@vito vito removed this from To do in Algorithm v3 Jul 30, 2019
@vito vito removed this from Planned in Build Page Redesign Jul 30, 2019
@vito vito moved this from To do to End Goals in Build re-running Aug 6, 2019
Contributor

@Lindsayauchin Lindsayauchin commented Aug 6, 2019

This has evolved into the Build re-triggering track of work. Iterative designs have been moved to smaller sliced stories and can be found in the Build re-triggering project here: https://github.com/concourse/concourse/projects/24

Member

@vito vito commented Oct 27, 2019

For those following along: this has been implemented and will be in v6.0! I don't have an ETA yet since v6.0 includes very substantial internal changes that we're doing due diligence to test out. We're considering shipping a beta release first.

@clarafu clarafu referenced this issue Nov 6, 2019
0 of 10 tasks complete
Projects
Build re-running
  
End Goals