Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HDFS-16534. Split FsDatasetImpl from block pool locks to volume grain locks. #4141

Closed

Conversation

MingXiangLi
Copy link
Contributor

https://issues.apache.org/jira/browse/HDFS-15382 have split lock to block pool grain and do some prepare.This pr is the last part of volume lock.

@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 38s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 4 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 40m 25s trunk passed
+1 💚 compile 1m 25s trunk passed with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04
+1 💚 compile 1m 18s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 checkstyle 1m 3s trunk passed
+1 💚 mvnsite 1m 28s trunk passed
+1 💚 javadoc 1m 6s trunk passed with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04
+1 💚 javadoc 1m 29s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 3m 43s trunk passed
+1 💚 shadedclient 24m 37s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 1m 18s the patch passed
+1 💚 compile 1m 19s the patch passed with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04
+1 💚 javac 1m 19s the patch passed
+1 💚 compile 1m 17s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 javac 1m 17s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 52s /results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 135 unchanged - 0 fixed = 136 total (was 135)
+1 💚 mvnsite 1m 20s the patch passed
+1 💚 javadoc 0m 57s the patch passed with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04
+1 💚 javadoc 1m 26s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 3m 40s the patch passed
+1 💚 shadedclient 26m 5s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 378m 31s /patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch passed.
+1 💚 asflicense 1m 11s The patch does not generate ASF License warnings.
491m 55s
Reason Tests
Failed junit tests hadoop.hdfs.server.namenode.TestHostsFiles
hadoop.hdfs.server.blockmanagement.TestNodeCount
hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure
hadoop.hdfs.TestFileAppend4
hadoop.hdfs.server.blockmanagement.TestReconstructStripedBlocksWithRackAwareness
hadoop.hdfs.server.namenode.TestUpgradeDomainBlockPlacementPolicy
hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting
hadoop.hdfs.server.namenode.sps.TestStoragePolicySatisfierWithStripedFile
hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes
hadoop.hdfs.server.datanode.TestBlockReplacement
hadoop.hdfs.server.datanode.TestBlockRecovery2
hadoop.hdfs.server.datanode.TestDataNodeMetrics
hadoop.hdfs.server.mover.TestStorageMover
hadoop.hdfs.TestSetrepIncreasing
hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks
hadoop.hdfs.TestPread
hadoop.hdfs.server.datanode.TestNNHandlesBlockReportPerStorage
hadoop.hdfs.server.datanode.TestNNHandlesCombinedBlockReport
hadoop.hdfs.TestClientProtocolForPipelineRecovery
hadoop.hdfs.server.blockmanagement.TestPendingReconstruction
hadoop.hdfs.TestEncryptedTransfer
hadoop.hdfs.TestCrcCorruption
hadoop.hdfs.server.datanode.TestDeleteBlockPool
hadoop.hdfs.server.datanode.TestTransferRbw
hadoop.hdfs.server.datanode.TestDataNodeTcpNoDelay
hadoop.hdfs.TestBlockStoragePolicy
hadoop.hdfs.server.balancer.TestBalancer
hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics
hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics
hadoop.hdfs.server.namenode.TestProcessCorruptBlocks
hadoop.hdfs.TestMissingBlocksAlert
hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication
hadoop.hdfs.TestFileAppendRestart
hadoop.hdfs.server.datanode.fsdataset.impl.TestWriteToReplica
hadoop.hdfs.server.namenode.TestReconstructStripedBlocks
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4141/1/artifact/out/Dockerfile
GITHUB PR #4141
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell
uname Linux 8edfa5f0330a 4.15.0-161-generic #169-Ubuntu SMP Fri Oct 15 13:41:54 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 8e2e66fd915bedb439c7d6ad14125e7762e5c0ac
Default Java Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4141/1/testReport/
Max. process+thread count 2717 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4141/1/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

@MingXiangLi MingXiangLi changed the title HDFS-15382. Split one FsDatasetImpl lock to volume grain locks. HDFS-16534. Split FsDatasetImpl from block pool locks to volume grain locks. Apr 9, 2022
@hadoop-yetus
Copy link

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 12m 24s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 4 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 38m 41s trunk passed
+1 💚 compile 1m 28s trunk passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04
+1 💚 compile 1m 22s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 checkstyle 1m 6s trunk passed
+1 💚 mvnsite 1m 32s trunk passed
+1 💚 javadoc 1m 6s trunk passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 1m 30s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 3m 29s trunk passed
+1 💚 shadedclient 22m 51s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 1m 17s the patch passed
+1 💚 compile 1m 20s the patch passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04
+1 💚 javac 1m 20s the patch passed
+1 💚 compile 1m 14s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 javac 1m 14s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 53s /results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 135 unchanged - 0 fixed = 136 total (was 135)
+1 💚 mvnsite 1m 23s the patch passed
+1 💚 javadoc 0m 53s the patch passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 1m 26s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 3m 18s the patch passed
+1 💚 shadedclient 22m 38s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 242m 34s hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 44s The patch does not generate ASF License warnings.
360m 25s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4141/2/artifact/out/Dockerfile
GITHUB PR #4141
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell
uname Linux 136d49f8e6b1 4.15.0-161-generic #169-Ubuntu SMP Fri Oct 15 13:41:54 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 315565e794d209e64bcd881fd110c646d605d533
Default Java Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4141/2/testReport/
Max. process+thread count 3471 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4141/2/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

Copy link
Contributor

@Hexiaoqiao Hexiaoqiao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that there are different method which is not improve the lock level from BlockPool level to Volume level, such as onCompleteLazyPersist, evictBlocks and others, any consideration?

@@ -629,6 +634,9 @@ public void removeVolumes(
synchronized (this) {
for (String storageUuid : storageToRemove) {
storageMap.remove(storageUuid);
for (String bp : volumeMap.getBlockPoolList()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to synchronized this segment? IMO, lock remove is thread safe here, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may add lot of locks. All method related add/remove locks is get synchronized first.

.getNumBytes());
FsVolumeImpl v = (FsVolumeImpl) ref.getVolume();
ReplicaInPipeline newReplicaInfo;
FsVolumeReference ref = volumes.getNextVolume(storageType, storageId, b
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this is thread safe without any ReadWriteLock.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

volumes. getNextVolume() is thread safe, and no need protected by dataset lock.

@Hexiaoqiao
Copy link
Contributor

Please fix checkstyle as Yetus suggest /results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt

@MingXiangLi
Copy link
Contributor Author

MingXiangLi commented Apr 11, 2022

1、some method is not good change to volume lock, and if split to volume lock it have to get locks and release locks sequence(like acquire lock1, lock2, lock3, release lock 3 lock2 lock1).So just acquire block pool lock is enough.
2、some method like contains() is no need to acquire volume lock.Now it acquire block pool lock read lock, so it no need to acquire block pool lock read lock and volume lock.

@hadoop-yetus
Copy link

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 35s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 4 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 38m 35s trunk passed
+1 💚 compile 1m 29s trunk passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04
+1 💚 compile 1m 22s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 checkstyle 1m 5s trunk passed
+1 💚 mvnsite 1m 33s trunk passed
+1 💚 javadoc 1m 7s trunk passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 1m 32s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 3m 32s trunk passed
+1 💚 shadedclient 23m 55s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 1m 31s the patch passed
+1 💚 compile 1m 30s the patch passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04
+1 💚 javac 1m 30s the patch passed
+1 💚 compile 1m 22s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 javac 1m 22s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 58s the patch passed
+1 💚 mvnsite 1m 25s the patch passed
+1 💚 javadoc 1m 1s the patch passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 1m 29s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 3m 21s the patch passed
+1 💚 shadedclient 23m 3s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 241m 57s hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 50s The patch does not generate ASF License warnings.
350m 24s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4141/3/artifact/out/Dockerfile
GITHUB PR #4141
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell
uname Linux ae792b971a1e 4.15.0-161-generic #169-Ubuntu SMP Fri Oct 15 13:41:54 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 6f4ffd6
Default Java Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4141/3/testReport/
Max. process+thread count 3225 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4141/3/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

@Hexiaoqiao
Copy link
Contributor

LGTM +1 from my side. I would like to check into trunk if no furthermore feedback while 5 work days later.

1、some method is not good change to volume lock, and if split to volume lock it have to get locks and release locks sequence(like acquire lock1, lock2, lock3, release lock 3 lock2 lock1).So just acquire block pool lock is enough.
2、some method like contains() is no need to acquire volume lock.Now it acquire block pool lock read lock, so it no need to acquire block pool lock read lock and volume lock.

I would like to clarify this explanation. This improvement only change part of methods from block pool level lock to volume level lock rather all. Because,
A. No Improvement. Such as for add/remove Volume method, Lock level 0 (block pool level lock) is enough and safe to execute logic. Some others is similar.
or
B. No Necessary. For some query request, acquire block pool level read lock has safe-guarded, so it is not necessary to acquire volume level read lock again.

Thanks @MingXiangLi for your works.

asfgit pushed a commit that referenced this pull request Apr 17, 2022
… locks. (#4141) Contributed by limingxiang.

Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
@Hexiaoqiao
Copy link
Contributor

Committed to trunk. Thanks @MingXiangLi for your contributions.

@Hexiaoqiao Hexiaoqiao closed this Apr 17, 2022
HarshitGupta11 pushed a commit to HarshitGupta11/hadoop that referenced this pull request Nov 28, 2022
… locks. (apache#4141) Contributed by limingxiang.

Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants