Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[JENKINS-27039] New step: milestone #1

Merged
merged 15 commits into from Sep 20, 2016

Conversation

@amuniz
Copy link
Member

commented Apr 11, 2016

(Moved from jenkinsci/pipeline-plugin#355)

Based on discussion on JENKINS-27039 (and also covers JENKINS-32829).

The implementation is based on the stage step code and @jglick's proposal in the linked JIRA. There is currently some duplicated code which will be there until concurrency is deprecated in stage.

This step has been designed to accomplish the following:

  1. Builds pass through the milestone in order (taking the build number as sorter field).
  2. Older builds reaching a milestone will not proceed (they are aborted) if a newer one already passed the milestone.
  3. Under no circumstances a build will be aborted by any event produced by a build which is behind (in the milestones terms). Note that this is not a feature itself, but just noting it as it behaves very different than stage concurrency: 1 did.
  4. Once a build passes a milestone, any older build that passed the previous milestone but not this one is aborted.

I used this Pipeline script for manual testing (thanks @stephenc):

milestone 1
node {
    def delay = new Random().nextInt(60)
    echo "Sleeping for ${delay}s..."
    sleep time: delay, unit: 'SECONDS'
}
milestone 2
node {
    def delay = new Random().nextInt(60)
    echo "Sleeping for ${delay}s..."
    sleep time: delay, unit: 'SECONDS'
}
  • Base implementation of milestone step
  • Automated tests
  • Clean up non existing milestones in config file ($JENKINS_HOME/org.jenkinsci.plugins.workflow.support.steps.MilestoneStep.xml)
  • Update the CD stock demo to use milestone and lock in tandem (it requires upstream jenkinsci/lockable-resources-plugin#25 to be merged and released, it will be done separately)
  • Allow to specify the ordinal as a parameter (or auto-generate it otherwise)

@reviewbybees

@amuniz amuniz referenced this pull request Apr 11, 2016
3 of 3 tasks complete
@reviewbybees

This comment has been minimized.

Copy link

commented Apr 11, 2016

This pull request originates from a CloudBees employee. At CloudBees, we require that all pull requests be reviewed by other CloudBees employees before we seek to have the change accepted. If you want to learn more about our process please see this explanation.

@jenkinsadmin

This comment has been minimized.

Copy link
Member

commented Apr 12, 2016

Thank you for this pull request! Please check this document for how the Jenkins project handles pull requests.

@michaelneale

This comment has been minimized.

Copy link
Member

commented Apr 12, 2016

I like the look of this one much better

pom.xml Outdated
<dependency>
<groupId>org.jenkins-ci.plugins.workflow</groupId>
<artifactId>workflow-job</artifactId>
<version>2.1-SNAPSHOT</version> <!-- Because of https://github.com/jenkinsci/workflow-job-plugin/pull/1 -->

This comment has been minimized.

Copy link
@jglick

jglick Apr 12, 2016

Member
Could not find artifact org.jenkins-ci.plugins.workflow:workflow-job:jar:2.1-SNAPSHOT in repo.jenkins-ci.org

I released 2.1, so update to it.

(For the future, if you want to use an unreleased dependency, just try: mvn -f workflow-job-plugin clean deploy)

This comment has been minimized.

Copy link
@jglick

jglick Apr 12, 2016

Member

2.1 has been released.

This comment has been minimized.

Copy link
@amuniz

amuniz Apr 13, 2016

Author Member

(For the future, if you want to use an unreleased dependency, just try: mvn -f workflow-job-plugin clean deploy)

Yeah. I didn't want to do it because it seems a bit risky to me. What would happen if some other PR is depending on a SNAPSHOT previously deployed? My deploy could interfere in that one.

This comment has been minimized.

Copy link
@jglick

jglick Apr 13, 2016

Member

True; if you are concerned about it, you can depend on a time-stamped snapshot.

LOGGER.log(Level.FINE, "save: {0}", getMilestonesByOrdinalByJob());
}

private static XmlFile getConfigFile() throws IOException {

This comment has been minimized.

Copy link
@jglick

jglick Apr 12, 2016

Member

🐜 If you moved this into MilestoneStep.DescriptorImpl, you would not need to reimplement this standard method. That would be the logical place to put the load and save methods.

@Extension
public static final class DescriptorImpl extends AbstractStepDescriptorImpl {

static Map<String, Map<Integer, Milestone>> milestonesByOrdinalByJob;

This comment has been minimized.

Copy link
@jglick

jglick Apr 12, 2016

Member

🐛 do not keep static state. Make this an instance field of DescriptorImpl. That ensures that state does not “leak” across tests (or, indeed, across Jenkins instances sharing a JVM, if and when we implement that).

}
}

static final class Milestone {

This comment has been minimized.

Copy link
@jglick

jglick Apr 12, 2016

Member

🐜 If you are going to make a dedicated plugin, you may as well make this a top-level type.

* Records that a build was canceled because it reached a milestone but a newer build already passed it, or
* a newer build {@link Milestone#wentAway(Run)} from the last milestone the build passed.
*/
public static final class CanceledCause extends CauseOfInterruption {

This comment has been minimized.

Copy link
@jglick

jglick Apr 12, 2016

Member

🐜 ditto, better as a top-level type

@Override
public Void run() throws Exception {
if (step.getLabel() != null) {
node.addAction(new LabelAction(step.getLabel()));

This comment has been minimized.

Copy link
@jglick

jglick Apr 12, 2016

Member

The label still feels superfluous to me, but if you want to keep it, I would suggest also adding StageAction. That will cause it to be picked up by jenkinsci/workflow-api-plugin#1 in what I think is the way you would expect.

This comment has been minimized.

Copy link
@jglick

jglick Apr 20, 2016

Member

(resolved)

@jglick

This comment has been minimized.

Copy link
Member

commented Apr 12, 2016

I guess this is still a werk-in-progress (I am not an administrator of the repository so cannot define labels in it), given that there is an open TODO item to allow the ordinal to be specified explicitly?

for (FlowNode n : walker) {

if (parallelDetectionEnabled <= 0 && n.getAction(ThreadNameAction.class) != null) {
listener.getLogger().println("Milestone step found inside parallel, it's not possible to grant ordering in this case.");

This comment has been minimized.

Copy link
@jglick

jglick Apr 12, 2016

Member

🐜 it is redundant to print to the log if you are about to throw an AbortException with essentially the same message.

if (n instanceof BlockEndNode) {
parallelDetectionEnabled++;
} else if (n instanceof BlockStartNode && !(n instanceof FlowStartNode)) {
parallelDetectionEnabled--;

This comment has been minimized.

Copy link
@jglick

jglick Apr 12, 2016

Member

This smells like something better handled by jenkinsci/workflow-api-plugin#1.

This comment has been minimized.

Copy link
@amuniz

amuniz Apr 13, 2016

Author Member

Probably, but as I said in jenkinsci/workflow-api-plugin#1 I'm not sure how that new iterator is traversing the graph. It can be changed later anyways as this is completely internal to this step.

This comment has been minimized.

Copy link
@jglick

jglick Apr 13, 2016

Member

I'm not sure how that new iterator is traversing the graph

Up and back. I think it is exactly what you want here. Something along the lines of

FlowNodeSerialWalker.Iterator it = new FlowNodeSerialWalker(node).iterator();
while (it.hasNext()) {
    FlowNode n = it.next();
    if (n.getAction(ThreadNameAction.class) != null) {
        assert it.isAncestor() : "if we found this, we must be inside a branch";
        throw new AbortException("");
    }
    OrdinalAction a = n.getAction(OrdinalAction.class);
    if (a != null) {
        previousOrdinal = a.ordinal;
        break;
    }
}

This comment has been minimized.

Copy link
@amuniz

amuniz Apr 13, 2016

Author Member

Ok. I will try it when it is released (added a TODO for that).

This comment has been minimized.

Copy link
@jglick

jglick Apr 21, 2016

Member

FlowNodeSerialWalker would not work here I think because of:

node {
  milestone()
}
node {
  milestone()
}

From the second milestone, we need to find the first, which is not “up and back”.

This comment has been minimized.

Copy link
@jglick

jglick Sep 19, 2016

Member

Could now use the new APIs from @svanoort.

}

private static class OrdinalAction extends InvisibleAction {
Integer ordinal;

This comment has been minimized.

Copy link
@jglick

jglick Apr 12, 2016

Member

🐜 why Integer not int?

This comment has been minimized.

Copy link
@jglick

jglick Apr 12, 2016

Member

And why not final?

@CheckForNull
private static Integer getLastOrdinalInBuild(Run<?, ?> r) {
int lastMilestoneOrdinal = 0;
FlowExecutionOwner owner = ((FlowExecutionOwner.Executable) r).asFlowExecutionOwner();

This comment has been minimized.

Copy link
@jglick

jglick Apr 12, 2016

Member

🐜 Better to take FlowExecutionOwner.Executable as the parameter type so the cast immediately follows the instanceof check.

List<FlowNode> heads = owner.get().getCurrentHeads();
if (heads.size() == 1) {
FlowGraphWalker walker = new FlowGraphWalker();
walker.addHead(heads.get(0));

This comment has been minimized.

Copy link
@jglick

jglick Apr 12, 2016

Member

Some reason not to use the FlowGraphWalker(FlowExecution) constructor?

This comment has been minimized.

Copy link
@jglick

jglick Apr 12, 2016

Member

I guess either works in this case.

This comment has been minimized.

Copy link
@amuniz

amuniz Apr 13, 2016

Author Member

No special reason, this was more intuitive for me at the time of use it.

return "Milestone[inSight=" + inSight + "]";
}

public void pass(StepContext context, Run<?, ?> build) {

This comment has been minimized.

Copy link
@jglick

jglick Apr 12, 2016

Member

🐜 unused parameter

"}\n" +
"milestone()\n"));
WorkflowRun b1 = p.scheduleBuild2(0).waitForStart();
story.j.assertBuildStatus(Result.FAILURE, story.j.waitForCompletion(b1));

This comment has been minimized.

Copy link
@jglick

jglick Apr 12, 2016

Member

Can be simplified to

WorkflowRun b1 = story.j.assertBuildStatus(Result.FAILURE, p.scheduleBuild2(0).get());
"}\n" +
"milestone()\n"));
WorkflowRun b3 = p.scheduleBuild2(0).waitForStart();
story.j.assertBuildStatusSuccess(story.j.waitForCompletion(b3));

This comment has been minimized.

Copy link
@jglick

jglick Apr 12, 2016

Member

Similarly, can be simplified to

story.j.assertBuildStatusSuccess(p.scheduleBuild2(0));
@jglick

This comment has been minimized.

Copy link
Member

commented Apr 12, 2016

Looking good overall; ready for integration testing in the demo image. IIRC @HRMPW was still concerned about the step name; what were the alternatives? juncture?

@amuniz

This comment has been minimized.

Copy link
Member Author

commented Apr 13, 2016

I guess this is still a w[o]rk-in-progress

Yes, the last TODO was added later and forgot to add the label. Done.

@lsglick

This comment has been minimized.

Copy link

commented Apr 20, 2016

No, that plugin is not compatible with Pipeline.

The GitHub Integration Plugin does work with Pipeline just fine. But I'll take a look at GitHub Organization Folder to see if it could also satisfy our needs. Thanks for the help!

@@ -37,29 +67,65 @@ public void setLabel(String label) {
this.label = label;
}

@DataBoundSetter
public void setOrdinal(Integer ordinal) {

This comment has been minimized.

Copy link
@jglick

jglick Apr 20, 2016

Member

🐜 I would put this in the @DataBoundConstructor so you can use

milestone 1

as a shortcut for

milestone ordinal: 1
@@ -58,6 +76,7 @@
public Void run() throws Exception {
if (step.getLabel() != null) {
node.addAction(new LabelAction(step.getLabel()));
node.addAction(new MilestoneAction(step.getLabel()));

This comment has been minimized.

Copy link
@jglick

jglick Apr 20, 2016

Member

Could be collapsed into one action extending LabelAction and implementsing StageAction.

}

@DataBoundSetter
public void setOrdinal(Integer ordinal) {

This comment has been minimized.

Copy link
@jglick

jglick Apr 20, 2016

Member

🐜 I would put this in the @DataBoundConstructor so you can use

milestone 1

as a shortcut for

milestone ordinal: 1

This comment has been minimized.

Copy link
@jglick

jglick May 17, 2016

Member

(You can accept a null ordinal constructor parameter, which should still allow milestone() to work.)

This comment has been minimized.

Copy link
@amuniz

amuniz May 17, 2016

Author Member

Ok, will do that. Thanks.

public Void run() throws Exception {
if (step.getLabel() != null) {
node.addAction(new LabelAction(step.getLabel()));
node.addAction(new MilestoneAction(step.getLabel()));

This comment has been minimized.

Copy link
@jglick

jglick Apr 20, 2016

Member

🐜 Could be collapsed into one action extending LabelAction and implementsing StageAction.

}

@Override public String getShortDescription() {
return "Superseded by " + getNewerBuild().getDisplayName();

This comment has been minimized.

Copy link
@jglick

jglick May 19, 2016

Member

🐜 This will throw a NullPointerException if that build gets deleted (before the older one—unlikely but possible).

This comment has been minimized.

Copy link
@amuniz

amuniz May 20, 2016

Author Member

Good catch, fixed.


private static final long serialVersionUID = 1;

private final String newerBuild;

This comment has been minimized.

Copy link
@jglick

jglick May 19, 2016

Member

Would be better to store an int build number, since we know it will be from the same job anyway. Unfortunately CauseOfInterruption has no method indicating when it was attached/loaded to a Run via InterruptedBuildAction; cf. how CauseAction is a RunAction2 so it can call API methods onLoad and onAddedTo.

b2.getOneOffExecutor().doStop();
story.j.waitForMessage("Finished: ABORTED", b2);

assertTrue(b1.isBuilding()); // #1 shuould not be cancelled

This comment has been minimized.

Copy link
@jglick

jglick Jul 1, 2016

Member

And why not, exactly?

This comment has been minimized.

Copy link
@amuniz

amuniz Jul 1, 2016

Author Member
[...]
milestone 1
input
milestone 2
[...]

Imagine builds #25 and #26 are waiting for input, perhaps you check that #26 (which is newer) is going to deploy a broken version of the app, but #25 is ok. So probably want to abort #26 but keep #25 alive to proceed it (even being older).

If #26 is ok and you approve it, then #25 is automatically killed (as soon as #26 passes the next milestone after input).

This comment has been minimized.

Copy link
@jglick

jglick Jul 1, 2016

Member

I suppose your thinking is that b1 had some changes eligible for “production” or whatever, and b2 had some newer ones, but when the user decided to kill off b2 for whatever reason, b1 is still valid. Perhaps; but you are only detecting that the status is ABORTED, not why it has that status. You are certainly not verifying that it was manually aborted.

What if b2 had a major rewrite of the code that renders b1 obsolete, but the Jenkinsfile happened to have a too-short timeout to deal with all the new test cases, leading to b2 becoming aborted on build timeout? Surely this is not conceptually different from b2 ending with FAILURE due to a broken test run by some test runner that just returns a nonzero exit code, or equivalently b2 being marked UNSTABLE by JUnitResultArchiver and the script then returning prior to a production stage based on that test result.

In any of these cases, b2 did not go all the way through, so someone will have to correct the problem and trigger b3. We do not know for sure whether it would make sense to continue with b1. If you want to err on the side of preserving b1, then I think you just have to abandon the virtual terminal milestone. Perhaps nothing is lost by that simplification: assuming you have an explicit milestone before any “go to production” steps, you are still ensuring that you do not clobber newer bits with older bits, right?

This comment has been minimized.

Copy link
@jglick

jglick Jul 1, 2016

Member

BTW MilestoneStep/help.html does not seem to have ever documented the virtual terminal milestone behavior. In any event this help file probably needs to be rewritten for clarity.

This comment has been minimized.

Copy link
@amuniz

amuniz Jul 1, 2016

Author Member

Perhaps nothing is lost by that simplification: assuming you have an explicit milestone before any “go to production” steps, you are still ensuring that you do not clobber newer bits with older bits, right?

Right, but at least Milestone#wentAway needs to be called to clean up the status.

This comment has been minimized.

Copy link
@jglick

jglick Jul 1, 2016

Member

at least Milestone#wentAway needs to be called to clean up the status

So can we do that regardless of the result of the exited build? If so, this feels like the right fix to make.

This comment has been minimized.

Copy link
@amuniz

amuniz Jul 1, 2016

Author Member

Not sure, I'd have to test it.

Currently the call to exit is making this test to pass, and we want that use case (it is the whole point of JENKINS-27039)

public static final class Listener extends RunListener<Run<?,?>> {
@Override public void onCompleted(Run<?,?> r, TaskListener listener) {
if (!(r instanceof FlowExecutionOwner.Executable) || ((FlowExecutionOwner.Executable) r).asFlowExecutionOwner() == null
|| Result.ABORTED.equals(r.getResult())) {

This comment has been minimized.

Copy link
@jglick

jglick Jul 1, 2016

Member

🐛 this seems very arbitrary. We let older builds keep running when a newer build ends with ABORTED but not some other status? Why? What about newer builds that end in FAILURE? Or NOT_BUILT?

This comment has been minimized.

Copy link
@jglick

jglick Jul 1, 2016

Member

Or that just returned at some point early but were STABLE?

This comment has been minimized.

Copy link
@jglick

jglick Jul 1, 2016

Member

I guess it comes down to whether you consider the virtual terminal milestone to act like

// explicit program
milestone ∞

or like

try {
  // explicit program
} finally {
  milestone ∞
}

This comment has been minimized.

Copy link
@jglick

jglick Jul 1, 2016

Member

…though even under the first assumption, you cannot reliably distinguish based solely on Result whether a build ran to the “end” of the script, you can only guess. A script can return at any point without affecting result; and it can set currentBuild.result directly to some value, rather than relying on it being set implicitly by Pipeline infrastructure based on a Throwable. For that matter, an uncaught FlowInterruptedException can encode any Result, not just ABORTED.

This comment has been minimized.

Copy link
@amuniz

amuniz Jul 1, 2016

Author Member

I considered ABORTED a special case as input step kills builds as ABORTED. See my explanation in a previous comment.

This comment has been minimized.

Copy link
@jglick

jglick Jul 1, 2016

Member

Well, a lot of things can produce a result of ABORTED.

@jglick

This comment has been minimized.

Copy link
Member

commented Jul 1, 2016

Since there is no credible estimate at this point for when jenkinsci/pipeline-stage-step-plugin#4 will be ready for release, yet this PR is essentially complete (modulo my latest comments) and seems to be working in jenkinsci/workflow-aggregator-plugin#3 if you ignore the stage view and only pay attention to the textual log (and Pipeline Steps), we should consider releasing milestone and that portion of the demo PR which still uses non-blocky stage.

@lsglick

This comment has been minimized.

Copy link

commented Aug 1, 2016

We're really excited to use this! Any timeframe when it will get released?

@jglick

This comment has been minimized.

Copy link
Member

commented Aug 18, 2016

Any timeframe when it will get released?

Not currently; got hung up on the virtual terminal milestone issue. Need to find someone to finish this up.

jglick added a commit to jglick/workflow-aggregator-plugin that referenced this pull request Aug 30, 2016
@witokondoria

This comment has been minimized.

Copy link
Member

commented Sep 6, 2016

Could "A final milestone has to be implicitly defined" be accepted as a workaround?

@jglick

This comment has been minimized.

Copy link
Member

commented Sep 7, 2016

@amuniz promised to circle back to this soon.

@amuniz

This comment has been minimized.

Copy link
Member Author

commented Sep 18, 2016

Promised, and done :)

@jglick
jglick approved these changes Sep 19, 2016
Copy link
Member

left a comment

🐝; does it work well in the context of the CD demo?

@@ -85,7 +85,7 @@ public Void run() throws Exception {
* Gets the next ordinal and throw {@link AbortException} the milestone lives inside a parallel step branch.
*/
private synchronized int processOrdinal() throws AbortException {
// TODO: use FlowNodeSerialWalker when released
// TODO: use FlowNodeSerialWalker when released (if possible)

This comment has been minimized.

Copy link
@jglick

jglick Sep 19, 2016

Member

Rather will need to be whatever is now in workflow-api.

assertTrue(b1.isBuilding()); // #1 shuould not be cancelled
SemaphoreStep.success("wait/1", null);
story.j.assertBuildStatus(Result.SUCCESS, story.j.waitForCompletion(b1));
story.j.assertBuildStatus(Result.NOT_BUILT, story.j.waitForCompletion(b1)); // #1 shuould be cancelled

This comment has been minimized.

Copy link
@jglick
@amuniz

This comment has been minimized.

Copy link
Member Author

commented Sep 20, 2016

does it work well in the context of the CD demo?

Yes, tested.

@amuniz amuniz merged commit 802b461 into jenkinsci:master Sep 20, 2016

1 check passed

Jenkins This pull request looks good
Details
@amuniz amuniz referenced this pull request Sep 20, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
9 participants
You can’t perform that action at this time.