YARN-11191 Global Scheduler refreshQueue cause deadLock #4726

Open
wants to merge 2 commits into base: trunk

yb12138 commented Aug 10, 2022

Description of PR

How was this patch tested?

For code changes:

  • Does the title of this PR start with the corresponding JIRA issue ID (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

@luoyuan3471
Contributor

luoyuan3471 commented Aug 10, 2022

@yb12138 I have two questions about this method:

    @Override
    public List<CSQueue> getChildQueuesByTryLock() {
      try {
        while (!readLock.tryLock()) {
          LockSupport.parkNanos(10000);
        }
        return new ArrayList<CSQueue>(childQueues);
      } finally {
        readLock.unlock();
      }
    }

1. Although you use tryLock and park, which puts the refresh queue thread into a blocked (parked) state, that thread still holds the PreemptionManager lock, so the scheduler thread still can't allocate new containers. Is that right?

2. Is this issue related to the global scheduler, or just to the preemption function?

Looking forward to your reply, thanks!

@yb12138
Author

yb12138 commented Aug 10, 2022

@luoyuan3471
1. The key to the deadlock is that the refresh thread can't acquire the csqueue read lock: the read lock request is blocked by a waiting write lock request (see https://bugs.openjdk.org/browse/JDK-6893626). So I use tryLock to break that condition. The PreemptionManager lock will be released soon after the refresh thread gets the csqueue read lock.
2. Just preemption, but the global scheduler increases the chance.

@luoyuan3471
Contributor

CapacityScheduler.refreshQueue:
  holds: PreemptionManager.writeLock
  requires: csqueue.readLock

CapacityScheduler.schedule:
  holds: csqueue.readLock
  requires: PreemptionManager.readLock

Other threads (completeContainer, release resource, etc.):
  require: csqueue.writeLock

@yb12138
The schedule thread holds csqueue.readLock and is blocked on PreemptionManager.readLock, while PreemptionManager.writeLock is held by the refreshQueue thread. It seems refreshQueue has no chance to get csqueue.readLock.

Sorry, I'm still a little confused on this point. Could you explain it in more detail? Thank you!

@yb12138
Author

yb12138 commented Aug 10, 2022

@luoyuan3471
[Image: untitled]
Please see the image above.
This problem happens when the refresh thread is calling PreemptionManager.refreshQueue while the schedule thread is calling AbstractCSQueue.getTotalKillableResource. At that point the refresh thread requires csqueue.readLock, but that request is blocked by the schedule thread and the "other thread" (see https://bugs.openjdk.org/browse/JDK-6893626). Meanwhile the schedule thread requires PreemptionManager.readLock, which is blocked by the writeLock held by the refresh thread. So I use tryLock to let the refresh thread get csqueue.readLock. Once the refresh thread completes PreemptionManager.refreshQueue, the schedule thread gets PreemptionManager.readLock and can then allocate new containers.
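
To make the cycle described in this comment concrete, here is a minimal, self-contained Java sketch. It is not code from the patch: pmLock and queueLock are hypothetical stand-ins for the PreemptionManager lock and the csqueue lock, and timed tryLock calls replace the blocking lock() calls so that the program can report the stall instead of hanging. The class name, latches, and timeouts are likewise assumptions made for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.LockSupport;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class RefreshQueueLockSketch {

  /** Runs the three-thread scenario; returns {refreshGotQueueRead, schedulerGotPmRead}. */
  static boolean[] run(boolean spinWithTryLock) throws InterruptedException {
    ReentrantReadWriteLock pmLock = new ReentrantReadWriteLock();    // stand-in: PreemptionManager lock
    ReentrantReadWriteLock queueLock = new ReentrantReadWriteLock(); // stand-in: csqueue lock
    CountDownLatch schedulerHoldsQueueRead = new CountDownLatch(1);
    CountDownLatch refreshHoldsPmWrite = new CountDownLatch(1);
    CountDownLatch refreshTried = new CountDownLatch(1);
    CountDownLatch schedulerTried = new CountDownLatch(1);
    boolean[] result = new boolean[2];

    // "Other thread": requests the csqueue write lock and queues behind the scheduler's read lock.
    Thread writer = new Thread(() -> {
      awaitQuietly(schedulerHoldsQueueRead);
      queueLock.writeLock().lock();
      queueLock.writeLock().unlock();
    });

    // Schedule thread: holds the csqueue read lock, then requires the PreemptionManager read lock.
    Thread scheduler = new Thread(() -> {
      queueLock.readLock().lock();
      try {
        schedulerHoldsQueueRead.countDown();
        awaitQuietly(refreshHoldsPmWrite);
        result[1] = tryTimed(pmLock.readLock(), 1000);
        if (result[1]) {
          pmLock.readLock().unlock();
        }
        schedulerTried.countDown();
        awaitQuietly(refreshTried); // keep the csqueue read lock until refresh has made its attempt
      } finally {
        queueLock.readLock().unlock();
      }
    });

    // Refresh thread: holds the PreemptionManager write lock, then requires the csqueue read lock.
    Thread refresh = new Thread(() -> {
      awaitQuietly(schedulerHoldsQueueRead);
      pmLock.writeLock().lock();
      try {
        refreshHoldsPmWrite.countDown();
        // Wait until the writer is parked in queueLock's wait queue.
        while (!(queueLock.hasQueuedThread(writer)
            && writer.getState() == Thread.State.WAITING)) {
          LockSupport.parkNanos(1_000_000L);
        }
        if (spinWithTryLock) {
          while (!queueLock.readLock().tryLock()) { // the approach taken in this PR
            LockSupport.parkNanos(10_000L);
          }
          result[0] = true;
        } else {
          result[0] = tryTimed(queueLock.readLock(), 300); // stands in for a blocking lock()
        }
        if (result[0]) {
          queueLock.readLock().unlock();
        }
        refreshTried.countDown();
        if (!result[0]) {
          awaitQuietly(schedulerTried); // stuck: still holding the write lock
        }
      } finally {
        pmLock.writeLock().unlock();
      }
    });

    writer.start();
    scheduler.start();
    refresh.start();
    writer.join();
    scheduler.join();
    refresh.join();
    return result;
  }

  static boolean tryTimed(Lock lock, long millis) {
    try {
      return lock.tryLock(millis, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  static void awaitQuietly(CountDownLatch latch) {
    try {
      latch.await();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  public static void main(String[] args) throws InterruptedException {
    boolean[] stuck = run(false);
    System.out.println("queue-respecting acquire: refresh=" + stuck[0] + ", scheduler=" + stuck[1]);
    boolean[] fixed = run(true);
    System.out.println("tryLock spin:             refresh=" + fixed[0] + ", scheduler=" + fixed[1]);
  }
}
```

In this sketch, run(false) models the reported cycle, so neither the refresh nor the schedule thread should get its second lock before timing out. run(true) models the tryLock spin, where the refresh thread gets the read lock, releases the write lock, and the schedule thread can then proceed.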

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 50s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
-1 ❌ test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ trunk Compile Tests _
+1 💚 mvninstall 41m 33s trunk passed
+1 💚 compile 1m 16s trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 compile 1m 8s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 checkstyle 1m 5s trunk passed
+1 💚 mvnsite 1m 13s trunk passed
+1 💚 javadoc 1m 7s trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 javadoc 0m 55s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 2m 19s trunk passed
+1 💚 shadedclient 24m 43s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 56s the patch passed
+1 💚 compile 1m 6s the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 javac 1m 6s the patch passed
+1 💚 compile 0m 55s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 javac 0m 55s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 46s /results-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 2 new + 95 unchanged - 0 fixed = 97 total (was 95)
+1 💚 mvnsite 1m 0s the patch passed
+1 💚 javadoc 0m 49s the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 javadoc 0m 41s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 2m 11s the patch passed
+1 💚 shadedclient 24m 45s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 102m 29s hadoop-yarn-server-resourcemanager in the patch passed.
+1 💚 asflicense 0m 42s The patch does not generate ASF License warnings.
211m 50s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4726/1/artifact/out/Dockerfile
GITHUB PR #4726
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux f7c0894bff81 4.15.0-175-generic #184-Ubuntu SMP Thu Mar 24 17:48:36 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 65650a2
Default Java Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4726/1/testReport/
Max. process+thread count 891 (vs. ulimit of 5500)
modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4726/1/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@luoyuan3471
Contributor

@yb12138 Do you mean that readLock.tryLock() lets the read request go first, even though a write lock request is already at the head of the wait queue?

@yb12138
Author

yb12138 commented Aug 11, 2022

@luoyuan3471
Yes! ReadLock.lock() calls tryAcquireShared, which checks whether the first node in the wait queue is shared. If it is not (that is, a writer is waiting first), lock() blocks even if sharedCount(c) > 0. readLock.tryLock() does not perform that check, so it can take the csqueue readLock directly.
You can compare tryReadLock() with lock() to see the difference.
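
The difference described above can be observed in isolation with a small sketch. This is illustrative code, not part of the patch; the class name and timings are assumptions. It uses a timed tryLock as a stand-in for lock(), on the assumption (true of the OpenJDK sources I am aware of) that the timed variant goes through the same queue-respecting tryAcquireShared path, which lets the demo report the stall instead of hanging.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.LockSupport;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class TryLockBargingDemo {

  /** Returns {timedTryLockAcquired, untimedTryLockAcquired}, observed while a writer is queued. */
  static boolean[] run() throws InterruptedException {
    ReentrantReadWriteLock lock = new ReentrantReadWriteLock(); // nonfair, the default
    boolean[] result = new boolean[2];

    lock.readLock().lock(); // the calling thread plays an existing reader

    Thread writer = new Thread(() -> {
      lock.writeLock().lock(); // parks: a reader still holds the lock
      lock.writeLock().unlock();
    });
    writer.start();
    // Wait until the writer is parked in the lock's wait queue.
    while (!(lock.hasQueuedThread(writer) && writer.getState() == Thread.State.WAITING)) {
      LockSupport.parkNanos(1_000_000L);
    }

    Thread secondReader = new Thread(() -> {
      try {
        // Queue-respecting path: waits behind the queued writer until the timeout.
        result[0] = lock.readLock().tryLock(200, TimeUnit.MILLISECONDS);
        if (result[0]) {
          lock.readLock().unlock();
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      // Untimed tryLock() skips the queue check and only fails if the write lock is held.
      result[1] = lock.readLock().tryLock();
      if (result[1]) {
        lock.readLock().unlock();
      }
    });
    secondReader.start();
    secondReader.join();

    lock.readLock().unlock(); // let the writer proceed
    writer.join();
    return result;
  }

  public static void main(String[] args) throws InterruptedException {
    boolean[] r = run();
    System.out.println("timed tryLock acquired:   " + r[0]);
    System.out.println("untimed tryLock acquired: " + r[1]);
  }
}
```

If the explanation in this comment holds, the timed attempt should fail while the writer is queued and the untimed tryLock() should succeed, which matches the "barging" behavior that the ReadLock.tryLock() Javadoc describes.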

@luoyuan3471
Contributor

Thanks for your explanation. I checked the code. You're right!

@luoyuan3471
Contributor

Regarding point 2: why does the Global Scheduler increase the chance of this deadlock?

@shellwjl

The Global Scheduler normally uses multiple threads to assign containers.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 42s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 37m 54s trunk passed
+1 💚 compile 1m 10s trunk passed with JDK Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04
+1 💚 compile 1m 6s trunk passed with JDK Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07
+1 💚 checkstyle 2m 4s trunk passed
+1 💚 mvnsite 1m 18s trunk passed
+1 💚 javadoc 1m 13s trunk passed with JDK Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 59s trunk passed with JDK Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 2m 22s trunk passed
+1 💚 shadedclient 23m 46s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 56s the patch passed
+1 💚 compile 1m 7s the patch passed with JDK Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04
+1 💚 javac 1m 7s the patch passed
+1 💚 compile 0m 57s the patch passed with JDK Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07
+1 💚 javac 0m 57s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 44s /results-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 8 new + 138 unchanged - 0 fixed = 146 total (was 138)
+1 💚 mvnsite 1m 4s the patch passed
+1 💚 javadoc 0m 45s the patch passed with JDK Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 44s the patch passed with JDK Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 2m 2s the patch passed
+1 💚 shadedclient 21m 36s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 99m 12s hadoop-yarn-server-resourcemanager in the patch passed.
+1 💚 asflicense 0m 43s The patch does not generate ASF License warnings.
201m 55s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4726/3/artifact/out/Dockerfile
GITHUB PR #4726
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 4e2b917b27d3 4.15.0-191-generic #202-Ubuntu SMP Thu Aug 4 01:49:29 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 281c169
Default Java Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4726/3/testReport/
Max. process+thread count 964 (vs. ulimit of 5500)
modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4726/3/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 57s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 40m 31s trunk passed
+1 💚 compile 1m 24s trunk passed with JDK Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04
+1 💚 compile 1m 16s trunk passed with JDK Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07
+1 💚 checkstyle 1m 5s trunk passed
+1 💚 mvnsite 1m 16s trunk passed
+1 💚 javadoc 1m 10s trunk passed with JDK Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 55s trunk passed with JDK Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 2m 28s trunk passed
+1 💚 shadedclient 23m 21s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 57s the patch passed
+1 💚 compile 1m 1s the patch passed with JDK Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04
+1 💚 javac 1m 1s the patch passed
+1 💚 compile 0m 54s the patch passed with JDK Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07
+1 💚 javac 0m 54s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 49s /results-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 8 new + 138 unchanged - 0 fixed = 146 total (was 138)
+1 💚 mvnsite 0m 58s the patch passed
+1 💚 javadoc 0m 42s the patch passed with JDK Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 45s the patch passed with JDK Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 2m 4s the patch passed
+1 💚 shadedclient 21m 44s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 103m 9s hadoop-yarn-server-resourcemanager in the patch passed.
+1 💚 asflicense 0m 49s The patch does not generate ASF License warnings.
207m 54s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4726/2/artifact/out/Dockerfile
GITHUB PR #4726
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 11f01de99262 4.15.0-191-generic #202-Ubuntu SMP Thu Aug 4 01:49:29 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 0547b17
Default Java Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.16+8-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_342-8u342-b07-0ubuntu1~20.04-b07
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4726/2/testReport/
Max. process+thread count 991 (vs. ulimit of 5500)
modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4726/2/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@Override
public List<CSQueue> getChildQueuesByTryLock() {
try {
while (!readLock.tryLock()){
Member

Why not just a regular lock()?
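
For context on this question: per the discussion earlier in the thread, a plain lock() on the read lock queues behind a waiting writer, while the untimed tryLock() does not. Below is a minimal sketch of the spin pattern with the acquisition moved outside the try block, so that unlock() only runs once the lock is actually held. ChildQueueHolder and its String-based child list are hypothetical stand-ins, not classes from the patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.LockSupport;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a queue that guards its children with a read/write lock.
class ChildQueueHolder {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final List<String> childQueues = new ArrayList<>();

  void addChild(String name) {
    lock.writeLock().lock();
    try {
      childQueues.add(name);
    } finally {
      lock.writeLock().unlock();
    }
  }

  List<String> getChildQueuesByTryLock() {
    Lock readLock = lock.readLock();
    // Spin on the untimed tryLock(), which does not wait behind queued writers.
    while (!readLock.tryLock()) {
      LockSupport.parkNanos(10_000L);
    }
    // The lock is held from here on, so the finally block can safely unlock.
    try {
      return new ArrayList<>(childQueues);
    } finally {
      readLock.unlock();
    }
  }

  int readLockCount() {
    return lock.getReadLockCount();
  }
}
```

The copy is taken under the read lock and the lock is released before returning, so callers get a snapshot they can iterate without holding the lock.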

@@ -25,10 +25,7 @@
import org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueue;
import org.apache.hadoop.yarn.util.resource.Resources;

import java.util.Collections;
Member

Avoid.

@@ -3026,4 +3030,69 @@ public void testReservedContainerLeakWhenMoveApplication() throws Exception {
Assert.assertEquals(0, desQueue.getUsedResources().getMemorySize());
rm1.close();
}
@Test
public void testRefreshQueueWithOpenPreemption() throws Exception {
CapacitySchedulerConfiguration csConf
Member

The line limit is 100 chars so this should fit.

mgr.init(conf);
MockRM rm1 = new MockRM(csConf);
CapacityScheduler scheduler=(CapacityScheduler) rm1.getResourceScheduler();
PreemptionManager preemptionManager = scheduler.getPreemptionManager();;
Member

;;

YarnConfiguration conf=new YarnConfiguration(csConf);
conf.setClass(YarnConfiguration.RM_SCHEDULER, CapacityScheduler.class,
ResourceScheduler.class);
RMNodeLabelsManager mgr=new NullRMNodeLabelsManager();
Member

Spaces

csConf.setUserLimitFactor("root.a", 100);

YarnConfiguration conf=new YarnConfiguration(csConf);
conf.setClass(YarnConfiguration.RM_SCHEDULER, CapacityScheduler.class,
Member

1 line

} catch (InterruptedException e) {
e.printStackTrace();
}
preemptionManager.getKillableContainers("a",
Member

1 line

Thread refreshQueueThread = new Thread(()->{
preemptionManager.getWriteLock().lock();
try {
Thread.sleep(1000 * 10);
Member

Spaces

@@ -3026,4 +3030,69 @@ public void testReservedContainerLeakWhenMoveApplication() throws Exception {
Assert.assertEquals(0, desQueue.getUsedResources().getMemorySize());
rm1.close();
}
@Test
public void testRefreshQueueWithOpenPreemption() throws Exception {
Member

Add a description explaining the locking part.

@tomicooler
Contributor

Hi @yb12138 @goiri,

is there any plan to fix the remaining review comments and get this merged?
If there is no time, may I take over? (I'll create a new PR)
