Ability to re-trigger failed build with the same input versions #413

Closed
davewalter opened this issue May 2, 2016 · 56 comments
Labels
core/build-plan enhancement release/documented Documentation and release notes have been updated. web-ui
Milestone

Comments

@davewalter

When using the new version: every configuration for a get task, it is possible to arrive at a state where you have multiple builds of the same job running at the same time. If an earlier build fails, there is no way to re-trigger it with the same set of inputs. We haven't been able to determine a useful workaround; setting serial: true doesn't really help in this scenario, because the next build will start as soon as the first one fails.

It would be helpful if there were a way to re-trigger the job with the same inputs as a particular build (failed or otherwise).

Let us know if you need more details on this scenario or our desired fix. Thanks!

@davewalter and @rmasand

@vito
Member

vito commented Jul 15, 2016

We're thinking about splitting today's trigger build button (+). It is primarily used for three things today:

  1. Impatience: I just pushed something or know that someone just published something, and I want the build to run now.
  2. Retrying a build (this issue): I want to re-run the current build, either to see if it's flaky or to retry because something outside the build failed (e.g. github, a deployment, etc.).
  3. Triggering a job that only ever manually runs, e.g. shipping a product after you've written release notes.

The flaw with case 1 is that there's a race condition. In the time between you loading the page and clicking the +, Concourse may have already found your stuff and queued a build. Now you have two, which is annoying.

The flaw with case 2 is that you can only do it with the latest build. Also, if you trigger a bunch of builds, new versions may come in, potentially invalidating your flakiness trial. You could set version to a particular version in your pipeline, but that's annoying.

Case 3 pretty much works, but you don't know what versions it'll use until you run it. See #269

So, I think we should split + into two buttons. One that lives on the job, "sync", which will make sure everything's up-to-date and then queue up a build if it should (i.e. one's not queued already; same semantics as auto-triggering). The other button would be associated with a particular build of the job, and would re-trigger it with the same inputs. This covers cases 1 and 2.
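To make the proposed split concrete, here is a minimal sketch of the two buttons' semantics. It is only an illustration: the `Job` and `Build` classes and their methods are hypothetical names, not Concourse's actual code, and "already queued" is simplified to "any build is still pending".

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Build:
    id: int
    inputs: Dict[str, str]   # input name -> resource version
    status: str = "pending"  # "pending", "succeeded", or "failed"


@dataclass
class Job:
    builds: List[Build] = field(default_factory=list)

    def _queue(self, inputs: Dict[str, str]) -> Build:
        build = Build(id=len(self.builds) + 1, inputs=dict(inputs))
        self.builds.append(build)
        return build

    def sync(self, latest: Dict[str, str]) -> Optional[Build]:
        """Job-level button: queue a build only if none is queued already."""
        if any(b.status == "pending" for b in self.builds):
            return None  # a build is already queued, so a second click is a no-op
        return self._queue(latest)

    def retrigger(self, build_id: int) -> Build:
        """Build-level button: queue a new build with that build's inputs."""
        original = next(b for b in self.builds if b.id == build_id)
        return self._queue(original.inputs)
```

Under these assumptions, clicking "sync" while a build is already queued does nothing, which avoids the duplicate build from case 1, while `retrigger` copies the inputs of the chosen build, so newer versions cannot leak into a flakiness trial (case 2).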

The third case needs some more thinking since a "sync" button alone doesn't intuitively seem like enough given that the build only manually triggers.

@vito vito changed the title Ability to ReTrigger Failed Job With Same Version of Resources Ability to re-trigger failed build with the same input versions Jul 15, 2016
@endzyme

endzyme commented Aug 15, 2016

+1 this would be a great, and much needed, feature for Concourse CI

@charlieoleary
Contributor

+1 As well, this is a pretty critical feature. We've sort of circumvented it with empty commits (since we're using the PR resource), but it would be ideal to simply retry a failed job with the same inputs.

@ahelal
Contributor

ahelal commented Nov 15, 2016

Would love to see that. We are doing crazy stuff to try to trigger old commits. Is this open for external people to help? And if so, how?

@primalmotion

Same thing here. I feel that Concourse has everything it needs to be able to do this fairly easily. It would work perfectly with the pull request resource.

@tracker-common

+1000000 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍👍 👍👍👍 👍👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍👍 👍👍👍 👍👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍👍 👍👍👍 👍👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍 👍👍 👍👍👍 👍👍

@gabro
Contributor

gabro commented Feb 8, 2017

Same here, if you guys are busy can we help somehow?

@miromode

miromode commented Mar 8, 2017

+1

@kei-yamazaki
Contributor

👍

@timrchavez
Contributor

👍 This would go a long way to making Concourse a more viable choice for us.

@ls-yann-david

Pleaseeeeee 👍

@VanAxe

VanAxe commented Apr 27, 2017

Retriggering jobs would be a lot more elegant than empty commits! 😝 👍

@olhtbr

olhtbr commented Apr 27, 2017

👍

@vito vito removed the unscheduled label May 8, 2017
@tsantero

tsantero commented May 9, 2017

Is this issue currently being actively worked on internally? I need this ASAP, but also don't want to duplicate work. Similarly, are there any contributor guidelines I should read?

@vito vito removed the help wanted label May 9, 2017
@vito vito added this to the Staging milestone May 17, 2017
@vito vito added the workflows label May 17, 2017
@vito
Member

vito commented Apr 9, 2019

Notes from IPM:

  • Preserve inputs on re-triggered build (duh) - make sure the scheduler/build starter doesn't re-compute them or use next_build_inputs
  • We'll need a button in the UI for re-triggering (@Lindsayauchin), in addition to the existing trigger-build button
  • We'll need to clear out the build events for the old build and "re-set" the build back to pending state (and reset whatever other build data is appropriate, i.e. start/end time)
  • Also: clear out any outputs for the build, otherwise a re-trigger from green to red could result in "phantom outputs" satisfying passed constraints
  • Re-compute build plan based on current pipeline config
    • If anyone is worried about this let us know - it's way easier to implement this way, and we figure there may be cases where a pipeline config change was made to fix the errant build anyway, in which case we'd want to pick up the new config.

We'll also spike on creating a new build instead of replacing it. They may actually be roughly the same difficulty. If we do this instead, we don't have to reset anything or clear out any outputs/etc.
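The "phantom outputs" concern with resetting a build in place can be illustrated with a toy model. This is only a sketch: `BuildStore` and its methods are hypothetical stand-ins for the database, not Concourse's real schema, and it assumes outputs are recorded only when a build succeeds.

```python
from typing import Dict, Set, Tuple

Output = Tuple[str, str]  # (resource name, version)


class BuildStore:
    """Toy stand-in for the database; not Concourse's real schema."""

    def __init__(self) -> None:
        self.status: Dict[str, str] = {}
        self.outputs: Dict[str, Set[Output]] = {}  # recorded when a build succeeds

    def finish(self, build: str, succeeded: bool, outputs: Set[Output]) -> None:
        self.status[build] = "succeeded" if succeeded else "failed"
        if succeeded:
            self.outputs[build] = set(outputs)

    def reset_in_place(self, build: str, clear_outputs: bool) -> None:
        """Put the same build back into the pending state."""
        self.status[build] = "pending"
        if clear_outputs:
            self.outputs.pop(build, None)

    def satisfies_passed(self) -> Set[Output]:
        """Versions a downstream `passed` constraint would accept."""
        return set().union(*self.outputs.values())
```

In this model, a green build that is reset without `clear_outputs=True` and then goes red still leaves its old outputs in `satisfies_passed()`. Creating a new build instead never touches the original's record, which is why that option needs no reset or clean-up step.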

@vito vito added this to the v6.0.0 milestone May 10, 2019
@StevenArmstrong

Could the trigger buttons be moved out from under the job and put on the main page somehow? It could be a trigger icon with a prompt or something. I say this because one of the main pieces of feedback our product lead gets about our Concourse pipelines is that users find the interface awkward and complain that they have to click through the green/red square of a job to trigger it. To mitigate this we have had to create a job called trigger for production deployments so users know how to trigger a production deployment. As a result of the clicking around we have even had requests to build a UI on top of Concourse to make it easier and more dev friendly for roll forward and rollback triggers :(

@vito
Member

vito commented May 13, 2019

@StevenArmstrong So having a specific 'retrigger' button on failed builds in the pipeline? Sounds reasonable, though I would argue people should probably be clicking into the build first and understanding the failure rather than blindly re-triggering. 🤔 In any case we may want to discuss that as a separate issue that we can address after we take our first crack at this. :)

@hstenzel

One question I have is how we can retrigger a build if we no longer have the log from the original?

@vito
Member

vito commented May 13, 2019

@hstenzel Build logs are purely cosmetic, the actual information regarding which versions/etc. are used is kept in the database and so re-triggering will still work.

@hstenzel

hstenzel commented May 13, 2019 via email

@vito
Member

vito commented May 13, 2019 via email

@StevenArmstrong

@vito it wasn't really just for retrigger. It was more that, if you are redesigning what the + button or other buttons do, it would be really good if they weren't nested under the job, as users have said they find it confusing to have to click through to manually initiate a new build when deploying to production. Instead I was suggesting having them a layer up, as an icon on the pipeline page beside each job. You could then have a confirmation pop-up, in case someone clicks it by mistake, to confirm whether to build with latest, retrigger, or cancel. This way you wouldn't need multiple buttons, simply one button with multiple sub-options, and nobody would have to click through to the log to trigger anything. It's something that is frequently fed back about the UI from our users.

@Lindsayauchin
Contributor

Lindsayauchin commented May 14, 2019

@StevenArmstrong interesting idea. From the pipelines we have observed, like the one from the RabbitMQ team at Pivotal below, I think an action button to trigger a job on the pipeline page is just not scalable.

[Screenshot: Screen Shot 2019-05-13 at 5 34 04 PM]

We are thinking about the user pains around triggering a build with the work being done on the resource version. You can follow a related issue #3403 to see our progress on the UX changes.

@hstenzel

If I understand correctly, to retrigger a specific job I'd scroll left/right on the build page to find the correct job?

Also, I'd potentially want to retrigger a successful job too, thinking about the case of rebuilding an artifact that was accidentally lost.

Perhaps these questions are really more about the UI related to the feature.

@vito
Member

vito commented May 14, 2019

@hstenzel Yep - to re-trigger a build you would do so from the build's page. There's no such thing as re-triggering a job - you can trigger a new build of a job, but the re- part of re-trigger means you're running an already-existing build for a second time with the same inputs.

You will be able to re-trigger a build regardless of whether it succeeded or failed; both have their use cases: re-trigger a succeeded build to detect flakes, or re-trigger a failed build to allow artifacts to continue along the pipeline.

@StevenArmstrong

@Lindsayauchin most people use pipeline groups on concourse to visualise big pipelines with many jobs. So I still think in combination with pipeline groups it could still be viable to have the trigger icons on the pipeline page.

@vito vito added this to To do in Algorithm v3 Jul 30, 2019
@vito vito moved this from To do to In progress in Algorithm v3 Jul 30, 2019
@vito vito moved this from In progress to To do in Algorithm v3 Jul 30, 2019
@vito vito added this to To do in Build re-running Jul 30, 2019
@vito vito removed this from To do in Algorithm v3 Jul 30, 2019
@vito vito moved this from To do to End Goals in Build re-running Aug 6, 2019
@Lindsayauchin
Contributor

Lindsayauchin commented Aug 6, 2019

This has evolved into the Build re-triggering track of work. Iterative designs have been moved to smaller sliced stories and can be found in the Build re-triggering project here: https://github.com/concourse/concourse/projects/24

@vito
Member

vito commented Oct 27, 2019

For those following along: this has been implemented and will be in v6.0! I don't have an ETA yet since v6.0 includes very substantial internal changes that we're doing due diligence to test out. We're considering shipping a beta release first.

@vito
Member

vito commented Jan 7, 2020

Closing this out!

As implemented in v6.0, re-running a build will create a new build named e.g. "123.1", which will appear adjacent to the original build in the build history. This placement in the history reflects the scheduler's iteration order for passed constraints - i.e. if you re-run a very old build it won't suddenly propagate the older versions downstream if there are "newer" successful builds after the re-run.
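The naming and placement described above can be sketched as follows. The helper names `sort_key` and `rerun_name` are hypothetical, and numbering a second re-run "123.2" is an assumption that goes beyond the "123.1" example given here.

```python
from typing import List, Tuple


def sort_key(name: str) -> Tuple[int, int]:
    """Order "123" before "123.1" before "124"."""
    major, _, rerun = name.partition(".")
    return int(major), int(rerun or 0)


def rerun_name(original: str, existing: List[str]) -> str:
    """Name the next re-run of `original`, e.g. "123" -> "123.1"."""
    major = sort_key(original)[0]
    # Assumption: re-runs of the same build are numbered 1, 2, 3, ...
    reruns = [sort_key(n)[1] for n in existing if sort_key(n)[0] == major]
    return f"{major}.{max(reruns, default=0) + 1}"
```

Sorting the history by `sort_key` keeps "123.1" right after "123" and before "124", so a re-run of an old build does not count as the most recent build in that order.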

The new build will run with a newly constructed build plan based on the pipeline's current configuration, using the versions of each input from the original build. This should work in the common case of re-triggering recent flakes, but it can fail if the configuration has changed such that new get steps have been added, or the old version is no longer available because the resource changed. Hope that's good enough for MVP!
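The two failure modes mentioned above can be sketched as a small input-resolution step. This is a hedged illustration only: `resolve_rerun_inputs`, `RerunError`, and the shape of the arguments are hypothetical, not Concourse's scheduler API.

```python
from typing import Dict, List, Set


class RerunError(Exception):
    """Raised when the original inputs no longer fit the current config."""


def resolve_rerun_inputs(
    current_inputs: List[str],         # get steps in the current pipeline config
    original_versions: Dict[str, str],  # input name -> version used by the original build
    available: Dict[str, Set[str]],     # input name -> versions still known for the resource
) -> Dict[str, str]:
    resolved = {}
    for name in current_inputs:
        if name not in original_versions:
            # A get step was added after the original build ran.
            raise RerunError(f"no original version for new input {name!r}")
        version = original_versions[name]
        if version not in available.get(name, set()):
            # The resource changed and the old version is gone.
            raise RerunError(f"version {version!r} of {name!r} is no longer available")
        resolved[name] = version
    return resolved
```

For a recent flake, where the configuration and resources are unchanged, this simply returns the original build's versions.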

In the future we plan to fix this by having re-runs run with the exact build plan that the original ran with, rather than constructing a new plan based on the current configuration. This is going to be tracked in a separate "epic", Build Lifecycle.

@vito vito closed this as completed Jan 7, 2020
@clarafu clarafu added the release/documented Documentation and release notes have been updated. label Feb 12, 2020