Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FIXED JENKINS-35160] - Job deletion: Wait up to 15 seconds for interrupted builds to complete #2789

Merged
merged 6 commits into from Apr 14, 2017
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
68 changes: 53 additions & 15 deletions core/src/main/java/hudson/model/Job.java
Expand Up @@ -24,6 +24,7 @@
package hudson.model;

import com.infradna.tool.bridge_method_injector.WithBridgeMethods;
import hudson.AbortException;
import hudson.BulkChange;
import hudson.EnvVars;
import hudson.Extension;
Expand All @@ -36,6 +37,7 @@
import hudson.model.Fingerprint.RangeSet;
import hudson.model.PermalinkProjectAction.Permalink;
import hudson.model.listeners.ItemListener;
import hudson.model.queue.WorkUnit;
import hudson.search.QuickSilver;
import hudson.search.SearchIndex;
import hudson.search.SearchIndexBuilder;
Expand Down Expand Up @@ -70,10 +72,13 @@
import java.util.Collection;
import java.util.Collections;
import java.util.GregorianCalendar;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.concurrent.TimeUnit;
import java.util.logging.Level;
import java.util.logging.Logger;
import javax.annotation.CheckForNull;
Expand Down Expand Up @@ -262,20 +267,6 @@ public void onCopied(Item src, Item item) {
}
}

@Override
protected void performDelete() throws IOException, InterruptedException {
// if a build is in progress. Cancel it.
RunT lb = getLastBuild();
if (lb != null) {
Executor e = lb.getExecutor();
if (e != null) {
e.interrupt();
// should we block until the build is cancelled?
}
}
super.performDelete();
}

/*package*/ TextFile getNextBuildNumberFile() {
return new TextFile(new File(this.getRootDir(), "nextBuildNumber"));
}
Expand Down Expand Up @@ -679,7 +670,54 @@ public void movedTo(DirectlyModifiableTopLevelItemGroup destination, AbstractIte
}
}

@Override public void delete() throws IOException, InterruptedException {
@Override
public void delete() throws IOException, InterruptedException {
checkPermission(DELETE);
// if a build is in progress. Cancel it.
if (this instanceof Queue.Task) {
// clear any items in the queue so they do not get picket up
Queue.getInstance().cancel((Queue.Task) this);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It won't work for the Promoted Builds and other such... exotic implementations

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better consider that architecturally deprecated.

// interrupt any builds in progress
Map<Queue.Executable, Executor> building = new LinkedHashMap<>();
for (Computer c : Jenkins.getInstance().getComputers()) {
for (Executor e : c.getOneOffExecutors()) {
WorkUnit workUnit = e.getCurrentWorkUnit();
if (workUnit != null && (workUnit.work == this || workUnit.work.getOwnerTask() == this)) {
building.put(e.getCurrentExecutable(), e);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lacks logging

e.interrupt();
}
}
for (Executor e : c.getExecutors()) {
WorkUnit workUnit = e.getCurrentWorkUnit();
if (workUnit != null && (workUnit.work == this || workUnit.work.getOwnerTask() == this)) {
building.put(e.getCurrentExecutable(), e);
e.interrupt();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lacks logging

}
}
}
if (!building.isEmpty()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🐛 this method uses cached list of builds. If the new build gets started due to whatever reason, during the deletion cycle, these builds won't be noticed.

// give them 15 seconds or so to respond to the interrupt
long expiration = System.nanoTime() + TimeUnit.SECONDS.toNanos(15);
// comparison with executor.getCurrentExecutable() == computation currently should always be true
// as we no longer recycle Executors, but safer to future-proof in case we ever revisit recycling
while (!building.isEmpty() && expiration - System.nanoTime() > 0L) {
for (Iterator<Map.Entry<Queue.Executable, Executor>> iterator = building.entrySet().iterator();
iterator.hasNext(); ) {
Map.Entry<Queue.Executable, Executor> entry = iterator.next();
// comparison with executor.getCurrentExecutable() == executable currently should always be true
// as we no longer recycle Executors, but safer to future-proof in case we ever revisit recycling

if (!entry.getValue().isAlive() || entry.getKey() != entry.getValue().getCurrentExecutable()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be isActive(). E.g. if the executor is starting up/shutting down, we should not ignore the task check 🐛

iterator.remove();
}
}
Thread.sleep(50L);
}
if (!building.isEmpty()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Risk of false negatives since this is a value from the past

throw new AbortException(Messages.Job_FailureToStopBuilds(building.size(), getFullDisplayName()));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it also go to the system log?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not see why. The user has gotten a failure message (I hope—test it?); the job has not been deleted; there is nothing for the admin to do.

}
}
}
super.delete();
Util.deleteRecursive(getBuildDir());
}
Expand Down
1 change: 1 addition & 0 deletions core/src/main/resources/hudson/model/Messages.properties
Expand Up @@ -179,6 +179,7 @@ Job.Pronoun=Project
Job.minutes=mins

Job.you_must_use_the_save_button_if_you_wish=You must use the Save button if you wish to rename a job.
Job.FailureToStopBuilds=Failed to interrupt and stop {0,choice,1#{0,number,integer} build|1<{0,number,integer} builds} of {0}
Label.GroupOf=group of {0}
Label.InvalidLabel=invalid label
Label.ProvisionedFrom=Provisioned from {0}
Expand Down