Fix the deadlock in DependencyQueueTaskDispatcher #325

Merged
merged 1 commit into from Sep 13, 2017

@Jimilian
Contributor

commented Sep 4, 2017

While working on #319, we ran into the following deadlock:

Found one Java-level deadlock:
=============================
"Handling GET / from 10.124.9.47 : RequestHandlerThread[#1871] View/index.jelly View/sidepanel.jelly":
  waiting to lock monitor 0x00007fa5544b49a8 (object 0x000000015a71e718, a com.sonyericsson.hudson.plugins.gerrit.trigger.gerritnotifier.ToGerritRunListener),
  which is held by "Executor #3 for Flow control x1 : executing team-CI/test-download-python #12643"
"Executor #3 for Flow control x1 : executing test-python #12643":
  waiting to lock monitor 0x00007fa5641ee128 (object 0x000000016fb44578, a com.sonyericsson.hudson.plugins.gerrit.trigger.hudsontrigger.GerritTrigger$RunningJobs),
  which is held by "Gerrit Worker EventThread_2"
"Gerrit Worker EventThread_2":
  waiting for ownable synchronizer 0x000000014020ffc0, (a java.util.concurrent.locks.ReentrantLock$NonfairSync),
  which is held by "AtmostOneTaskExecutor[hudson.model.Queue$1@53f304a7] [#50195]"
"AtmostOneTaskExecutor[hudson.model.Queue$1@53f304a7] [#50195]":
  waiting to lock monitor 0x00007fa5544b49a8 (object 0x000000015a71e718, a com.sonyericsson.hudson.plugins.gerrit.trigger.gerritnotifier.ToGerritRunListener),
  which is held by "Executor #3 for Flow control x1 : executing team-CI/test-download-python #12643"

Java stack information for the threads listed above:
===================================================
"Handling GET / from 10.124.9.47 : RequestHandlerThread[#1871] View/index.jelly View/sidepanel.jelly":
    at com.sonyericsson.hudson.plugins.gerrit.trigger.gerritnotifier.ToGerritRunListener.isProjectTriggeredAndIncomplete(ToGerritRunListener.java:199)
    - waiting to lock <0x000000015a71e718> (a com.sonyericsson.hudson.plugins.gerrit.trigger.gerritnotifier.ToGerritRunListener)
    at com.sonyericsson.hudson.plugins.gerrit.trigger.dependency.DependencyQueueTaskDispatcher.getBlockingDependencyProjects(DependencyQueueTaskDispatcher.java:207)
    at com.sonyericsson.hudson.plugins.gerrit.trigger.dependency.DependencyQueueTaskDispatcher.canRun(DependencyQueueTaskDispatcher.java:185)
    at hudson.model.Queue$BlockedItem.getCauseOfBlockage(Queue.java:2418)
    at hudson.model.Queue$Item.getWhy(Queue.java:2105)
    at sun.reflect.GeneratedMethodAccessor706.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.commons.jexl.util.PropertyExecutor.execute(PropertyExecutor.java:125)
    at org.apache.commons.jexl.util.introspection.UberspectImpl$VelGetterImpl.invoke(UberspectImpl.java:314)
    at org.apache.commons.jexl.parser.ASTArrayAccess.evaluateExpr(ASTArrayAccess.java:185)
    at org.apache.commons.jexl.parser.ASTIdentifier.execute(ASTIdentifier.java:75)
    at org.apache.commons.jexl.parser.ASTReference.execute(ASTReference.java:83)
    at org.apache.commons.jexl.parser.ASTReference.value(ASTReference.java:57)
    at org.apache.commons.jexl.parser.ASTReferenceExpression.value(ASTReferenceExpression.java:51)
    at org.apache.commons.jexl.ExpressionImpl.evaluate(ExpressionImpl.java:80)
    at hudson.ExpressionFactory2$JexlExpression.evaluate(ExpressionFactory2.java:74)
    at org.apache.commons.jelly.expression.ExpressionSupport.evaluateRecurse(ExpressionSupport.java:61)
    at org.apache.commons.jelly.expression.ExpressionSupport.evaluateAsString(ExpressionSupport.java:46)
    at org.apache.commons.jelly.expression.CompositeExpression.evaluateAsString(CompositeExpression.java:256)
    at org.kohsuke.stapler.jelly.ReallyStaticTagLibrary$1.buildAttributes(ReallyStaticTagLibrary.java:111)
    at org.kohsuke.stapler.jelly.ReallyStaticTagLibrary$1.run(ReallyStaticTagLibrary.java:95)
    at org.apache.commons.jelly.impl.ScriptBlock.run(ScriptBlock.java:95)
    at org.apache.commons.jelly.TagSupport.invokeBody(TagSupport.java:161)
    at org.apache.commons.jelly.tags.core.WhenTag.doTag(WhenTag.java:46)
    at org.apache.commons.jelly.impl.TagScript.run(TagScript.java:269)
    at org.apache.commons.jelly.impl.ScriptBlock.run(ScriptBlock.java:95)
    ...
    at org.kohsuke.stapler.Stapler.invoke(Stapler.java:876)
    at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
    at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:949)
    at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1011)
    at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:644)
    at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
    at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
    at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:668)
    at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
    at winstone.BoundedExecutorService$1.run(BoundedExecutorService.java:77)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

"Executor #3 for Flow control x1 : executing test-python #12643":
    at com.sonyericsson.hudson.plugins.gerrit.trigger.hudsontrigger.GerritTrigger$RunningJobs.remove(GerritTrigger.java:2306)
    - waiting to lock <0x000000016fb44578> (a com.sonyericsson.hudson.plugins.gerrit.trigger.hudsontrigger.GerritTrigger$RunningJobs)
    at com.sonyericsson.hudson.plugins.gerrit.trigger.hudsontrigger.GerritTrigger.notifyBuildEnded(GerritTrigger.java:758)
    at com.sonyericsson.hudson.plugins.gerrit.trigger.gerritnotifier.ToGerritRunListener.onCompleted(ToGerritRunListener.java:132)
    - locked <0x000000015a71e718> (a com.sonyericsson.hudson.plugins.gerrit.trigger.gerritnotifier.ToGerritRunListener)
    at hudson.model.listeners.RunListener.fireCompleted(RunListener.java:201)
    at hudson.model.Run.execute(Run.java:1783)
    at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
    at hudson.model.ResourceController.execute(ResourceController.java:98)
    at hudson.model.Executor.run(Executor.java:410)
"Gerrit Worker EventThread_2":
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x000000014020ffc0> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
    at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
    at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
    at hudson.model.Queue.cancel(Queue.java:703)
    at com.sonyericsson.hudson.plugins.gerrit.trigger.hudsontrigger.GerritTrigger$RunningJobs.cancelJob(GerritTrigger.java:2246)
    at com.sonyericsson.hudson.plugins.gerrit.trigger.hudsontrigger.GerritTrigger$RunningJobs.scheduled(GerritTrigger.java:2208)
    - locked <0x000000016fb44578> (a com.sonyericsson.hudson.plugins.gerrit.trigger.hudsontrigger.GerritTrigger$RunningJobs)
    at com.sonyericsson.hudson.plugins.gerrit.trigger.hudsontrigger.EventListener.schedule(EventListener.java:203)
    at com.sonyericsson.hudson.plugins.gerrit.trigger.hudsontrigger.EventListener.schedule(EventListener.java:164)
    at com.sonyericsson.hudson.plugins.gerrit.trigger.hudsontrigger.EventListener.gerritEvent(EventListener.java:106)
    at com.sonymobile.tools.gerrit.gerritevents.GerritHandler.notifyListener(GerritHandler.java:328)
    at com.sonymobile.tools.gerrit.gerritevents.GerritHandler.notifyListeners(GerritHandler.java:296)
    at com.sonyericsson.hudson.plugins.gerrit.trigger.JenkinsAwareGerritHandler.notifyListeners(JenkinsAwareGerritHandler.java:77)
    at com.sonymobile.tools.gerrit.gerritevents.workers.AbstractGerritEventWork.perform(AbstractGerritEventWork.java:46)
    at com.sonymobile.tools.gerrit.gerritevents.workers.AbstractJsonObjectWork.perform(AbstractJsonObjectWork.java:77)
    at com.sonymobile.tools.gerrit.gerritevents.workers.StreamEventsStringWork.perform(StreamEventsStringWork.java:67)
    at com.sonymobile.tools.gerrit.gerritevents.workers.EventThread.run(EventThread.java:66)
    at com.sonyericsson.hudson.plugins.gerrit.trigger.SystemEventThread.run(SystemEventThread.java:66)
"AtmostOneTaskExecutor[hudson.model.Queue$1@53f304a7] [#50195]":
    at com.sonyericsson.hudson.plugins.gerrit.trigger.gerritnotifier.ToGerritRunListener.isProjectTriggeredAndIncomplete(ToGerritRunListener.java:199)
    - waiting to lock <0x000000015a71e718> (a com.sonyericsson.hudson.plugins.gerrit.trigger.gerritnotifier.ToGerritRunListener)
    at com.sonyericsson.hudson.plugins.gerrit.trigger.dependency.DependencyQueueTaskDispatcher.getBlockingDependencyProjects(DependencyQueueTaskDispatcher.java:207)
    at com.sonyericsson.hudson.plugins.gerrit.trigger.dependency.DependencyQueueTaskDispatcher.canRun(DependencyQueueTaskDispatcher.java:185)
    at hudson.model.Queue.isBuildBlocked(Queue.java:1156)
    at hudson.model.Queue.maintain(Queue.java:1441)
    at hudson.model.Queue$1.call(Queue.java:295)
    at hudson.model.Queue$1.call(Queue.java:292)
    at jenkins.util.AtmostOneTaskExecutor$1.call(AtmostOneTaskExecutor.java:101)
    at jenkins.util.AtmostOneTaskExecutor$1.call(AtmostOneTaskExecutor.java:91)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at hudson.remoting.AtmostOneThreadExecutor$Worker.run(AtmostOneThreadExecutor.java:110)
    at java.lang.Thread.run(Thread.java:745)

Found 1 deadlock.

This PR should fix the problem, but it is suspicious that the UI (Jelly) queries the Queue directly...
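
Reading the dump above, the cycle has three links: the executor holds the ToGerritRunListener monitor and waits for RunningJobs, the Gerrit worker holds RunningJobs and waits for the Queue lock, and the queue maintainer holds the Queue lock and waits for the ToGerritRunListener monitor. One general way to break a cycle like this is to make the "is this project still incomplete?" check a lock-free read, and to avoid holding the listener's monitor while calling into other synchronized code. The following is only a minimal sketch of that idea, not the change made in this PR; `BuildStateSketch`, its methods, and the `notifyBuildEnded` callback are hypothetical names chosen for illustration.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

/** Hypothetical stand-in for a listener that tracks triggered, unfinished builds. */
final class BuildStateSketch {
    // Concurrent set: readers never need this object's monitor.
    private final Set<String> incomplete = ConcurrentHashMap.newKeySet();

    /** Records that a build for the given project was triggered. */
    void onTriggered(String project) {
        incomplete.add(project);
    }

    /** Lock-free read, so a queue maintainer or UI thread cannot block here. */
    boolean isTriggeredAndIncomplete(String project) {
        return incomplete.contains(project);
    }

    /** Updates state first, then calls out to other code with no monitor held. */
    void onCompleted(String project, Consumer<String> notifyBuildEnded) {
        incomplete.remove(project);
        notifyBuildEnded.accept(project);
    }

    public static void main(String[] args) {
        BuildStateSketch state = new BuildStateSketch();
        state.onTriggered("test-python");
        System.out.println(state.isTriggeredAndIncomplete("test-python"));
        // The callback reports whether the caller still holds the sketch's monitor.
        state.onCompleted("test-python", p ->
                System.out.println("holds monitor: " + Thread.holdsLock(state)));
        System.out.println(state.isTriggeredAndIncomplete("test-python"));
    }
}
```

With this shape, a thread that already owns the Queue lock can ask about a project's state without waiting on a monitor that an executor thread may be holding while it waits for something else.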

@Jimilian Jimilian force-pushed the Jimilian:fix_deadlock branch from c52bf7c to 298f2eb Sep 4, 2017

@rsandell rsandell merged commit 68422f1 into jenkinsci:master Sep 13, 2017

1 check passed

continuous-integration/jenkins/pr-merge This commit looks good

@Jimilian Jimilian deleted the Jimilian:fix_deadlock branch Sep 22, 2017
