HDFS-15610 Reduced datanode upgrade/hardlink thread from 12 to 6 #2365

Merged
merged 1 commit into from
Oct 8, 2020

Contributor

@karthikhw karthikhw commented Oct 6, 2020

Apache jira:
https://issues.apache.org/jira/browse/HDFS-15610

Description:
Datanode upgrade incurs kernel overhead. On a datanode with millions of blocks and 10+ disks, the block-layout migration becomes very expensive during its hardlink operation. Slowness is observed when running with a large number of hardlink threads (dfs.datanode.block.id.layout.upgrade.threads, default is 12 threads per disk), and the upgrade runs for 2+ hours.

I.e., 10 * 12 = 120 threads (for 10 disks).
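
A minimal Java sketch of that arithmetic, for comparing the old and new defaults. The class and method names are hypothetical and not part of this patch; it only assumes that the total is the disk count multiplied by the per-disk setting, as the figure above implies.

```java
public class UpgradeThreadCount {

    /**
     * Total hardlink threads during a layout upgrade, assuming one pool of
     * threadsPerDisk threads is started for each disk (assumption taken from
     * the PR description, not from the DataNode source).
     */
    public static int totalThreads(int disks, int threadsPerDisk) {
        if (disks < 0 || threadsPerDisk < 0) {
            throw new IllegalArgumentException("counts must be non-negative");
        }
        return Math.multiplyExact(disks, threadsPerDisk);
    }

    public static void main(String[] args) {
        System.out.println("12 threads/disk, 10 disks: " + totalThreads(10, 12));
        System.out.println(" 6 threads/disk, 10 disks: " + totalThreads(10, 6));
    }
}
```

With the proposed default of 6, the same 10-disk node would run 60 hardlink threads instead of 120.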

Small test:

RHEL7, 32 cores, 20 GB RAM, 8 GB DN heap

| dfs.datanode.block.id.layout.upgrade.threads | Blocks | Disks | Time taken |
|---|---|---|---|
| 12 | 3.3 Million | 1 | 2 minutes and 59 seconds |
| 6 | 3.3 Million | 1 | 2 minutes and 35 seconds |
| 3 | 3.3 Million | 1 | 2 minutes and 51 seconds |

I ran the same test twice and the results were about 95% consistent (only a few seconds' difference between iterations). Using 6 threads is faster than using 12 threads because of the overhead of the extra threads.
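
For anyone who wants to repeat a comparison like this outside a datanode, here is a rough, self-contained Java sketch that hard-links a set of files with a fixed-size thread pool and reports the elapsed time. It is not the DataNode upgrade code: the class name, the `blk_` file names, and the file count are all hypothetical, and it assumes the temp directory sits on a filesystem that supports hard links. Timings from such a toy run will not match the numbers above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class HardlinkPoolSketch {

    /** Creates {@code count} empty files in srcDir and returns their paths. */
    public static List<Path> createSources(Path srcDir, int count) throws IOException {
        List<Path> sources = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            sources.add(Files.createFile(srcDir.resolve("blk_" + i)));
        }
        return sources;
    }

    /** Hard-links every source into dstDir with a fixed pool; returns elapsed nanoseconds. */
    public static long linkAll(List<Path> sources, Path dstDir, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();
        try {
            List<Future<?>> pending = new ArrayList<>(sources.size());
            for (Path src : sources) {
                pending.add(pool.submit(() -> {
                    try {
                        // createLink(link, existing): new directory entry for the same file
                        Files.createLink(dstDir.resolve(src.getFileName()), src);
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                }));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for completion and surface any link failure
            }
        } finally {
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.MINUTES);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws Exception {
        Path root = Files.createTempDirectory("hardlink-sketch");
        Path src = Files.createDirectory(root.resolve("src"));
        List<Path> sources = createSources(src, 10_000);
        // Same thread counts as in the test table; each run links into a fresh directory.
        for (int threads : new int[] {12, 6, 3}) {
            Path dst = Files.createDirectory(root.resolve("dst-" + threads));
            long nanos = linkAll(sources, dst, threads);
            System.out.printf("%d threads: %d ms%n", threads, TimeUnit.NANOSECONDS.toMillis(nanos));
        }
    }
}
```

The sketch leaves its temp files in place so the linked directories can be inspected afterwards.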

@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|---|---|---|---|---|
| +0 🆗 | reexec | 1m 8s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| -1 ❌ | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| | | | | _ trunk Compile Tests _ |
| +1 💚 | mvninstall | 32m 22s | | trunk passed |
| +1 💚 | compile | 1m 23s | | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 💚 | compile | 1m 8s | | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 💚 | checkstyle | 0m 54s | | trunk passed |
| +1 💚 | mvnsite | 1m 50s | | trunk passed |
| +1 💚 | shadedclient | 19m 17s | | branch has no errors when building and testing our client artifacts. |
| +1 💚 | javadoc | 0m 54s | | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 💚 | javadoc | 1m 26s | | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 🆗 | spotbugs | 3m 19s | | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 💚 | findbugs | 3m 17s | | trunk passed |
| | | | | _ Patch Compile Tests _ |
| +1 💚 | mvninstall | 1m 12s | | the patch passed |
| +1 💚 | compile | 1m 12s | | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 💚 | javac | 1m 12s | | the patch passed |
| +1 💚 | compile | 1m 4s | | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 💚 | javac | 1m 4s | | the patch passed |
| +1 💚 | checkstyle | 0m 48s | | the patch passed |
| +1 💚 | mvnsite | 1m 11s | | the patch passed |
| +1 💚 | whitespace | 0m 0s | | The patch has no whitespace issues. |
| +1 💚 | xml | 0m 1s | | The patch has no ill-formed XML file. |
| +1 💚 | shadedclient | 15m 36s | | patch has no errors when building and testing our client artifacts. |
| +1 💚 | javadoc | 0m 48s | | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 💚 | javadoc | 1m 23s | | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 💚 | findbugs | 3m 19s | | the patch passed |
| | | | | _ Other Tests _ |
| -1 ❌ | unit | 115m 28s | /patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | hadoop-hdfs in the patch passed. |
| +1 💚 | asflicense | 0m 37s | | The patch does not generate ASF License warnings. |
| | | 207m 26s | | |

| Reason | Tests |
|---|---|
| Failed junit tests | hadoop.hdfs.server.datanode.TestBPOfferService |
| | hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier |
| | hadoop.hdfs.TestDFSShell |
| | hadoop.hdfs.TestFileChecksum |
| | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped |
| | hadoop.hdfs.TestFileChecksumCompositeCrc |

| Subsystem | Report/Notes |
|---|---|
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2365/1/artifact/out/Dockerfile |
| GITHUB PR | #2365 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml |
| uname | Linux df8ebce07039 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 1cfe591 |
| Default Java | Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2365/1/testReport/ |
| Max. process+thread count | 2857 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2365/1/console |
| versions | git=2.17.1 maven=3.6.0 findbugs=4.0.6 |
| Powered by | Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org |

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|---|---|---|---|---|
| +0 🆗 | reexec | 1m 7s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| -1 ❌ | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| | | | | _ trunk Compile Tests _ |
| +1 💚 | mvninstall | 32m 1s | | trunk passed |
| +1 💚 | compile | 1m 16s | | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 💚 | compile | 1m 8s | | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 💚 | checkstyle | 0m 53s | | trunk passed |
| +1 💚 | mvnsite | 1m 18s | | trunk passed |
| +1 💚 | shadedclient | 17m 40s | | branch has no errors when building and testing our client artifacts. |
| +1 💚 | javadoc | 0m 50s | | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 💚 | javadoc | 1m 24s | | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 🆗 | spotbugs | 3m 8s | | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 💚 | findbugs | 3m 6s | | trunk passed |
| | | | | _ Patch Compile Tests _ |
| +1 💚 | mvninstall | 1m 8s | | the patch passed |
| +1 💚 | compile | 1m 11s | | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 💚 | javac | 1m 11s | | the patch passed |
| +1 💚 | compile | 1m 3s | | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 💚 | javac | 1m 3s | | the patch passed |
| +1 💚 | checkstyle | 0m 47s | | the patch passed |
| +1 💚 | mvnsite | 1m 10s | | the patch passed |
| +1 💚 | whitespace | 0m 0s | | The patch has no whitespace issues. |
| +1 💚 | xml | 0m 1s | | The patch has no ill-formed XML file. |
| +1 💚 | shadedclient | 15m 45s | | patch has no errors when building and testing our client artifacts. |
| +1 💚 | javadoc | 0m 45s | | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 💚 | javadoc | 1m 22s | | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 💚 | findbugs | 3m 12s | | the patch passed |
| | | | | _ Other Tests _ |
| -1 ❌ | unit | 109m 15s | /patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | hadoop-hdfs in the patch passed. |
| +1 💚 | asflicense | 0m 35s | | The patch does not generate ASF License warnings. |
| | | 198m 35s | | |

| Reason | Tests |
|---|---|
| Failed junit tests | hadoop.hdfs.web.TestWebHDFS |
| | hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier |
| | hadoop.hdfs.TestDFSShell |
| | hadoop.hdfs.TestFileChecksum |
| | hadoop.hdfs.TestFileChecksumCompositeCrc |
| | hadoop.hdfs.TestSafeModeWithStripedFile |

| Subsystem | Report/Notes |
|---|---|
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2365/3/artifact/out/Dockerfile |
| GITHUB PR | #2365 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml |
| uname | Linux 98fe6dbf219b 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 82522d6 |
| Default Java | Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2365/3/testReport/ |
| Max. process+thread count | 3199 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2365/3/console |
| versions | git=2.17.1 maven=3.6.0 findbugs=4.0.6 |
| Powered by | Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org |

This message was automatically generated.

Contributor

@lokeshj1703 lokeshj1703 left a comment

@karthikhw Thanks for working on this! The changes look good to me. +1.

@lokeshj1703
Contributor

The failed tests pass locally.

@lokeshj1703 lokeshj1703 merged commit 735e85a into apache:trunk Oct 8, 2020
asfgit pushed a commit that referenced this pull request Mar 24, 2021
jojochuang pushed a commit to jojochuang/hadoop that referenced this pull request May 23, 2023
…che#2365)

(cherry picked from commit 735e85a)
Change-Id: I666ebdcd3f2397a894fdac3690108e74cbb67035