HDFS-17743: Added support for random datanode ordering in getBlockLocations() #7447

Open
wants to merge 2 commits into
base: trunk

Conversation

weisong44
Member

@weisong44 weisong44 commented Mar 1, 2025

Description of PR

In our environment, we cannot rely on data locality for various reasons. The current implementation, which sorts datanodes by network distance, did not improve overall performance and caused unnecessary concentration of load on specific datanodes. We would like to request a random policy that orders datanodes randomly; we expect this policy to improve load distribution. The new policy should be configurable, and it is disabled by default. When enabled, it replaces the current network-topology-based datanode ordering.
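As a rough illustration of the behaviour described above, the following is a minimal, self-contained sketch and not the code in this patch. The class `ReplicaOrderingSketch`, the method `orderReplicas`, and the `randomOrderEnabled` flag are hypothetical names chosen for the example; plain strings stand in for datanodes, and a caller-supplied comparator stands in for the network-distance sort.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

/** Hypothetical sketch of a switchable replica-ordering policy. */
public class ReplicaOrderingSketch {

  /**
   * Returns a new list of replica locations for one block.
   *
   * @param replicas           datanode identifiers (plain strings in this sketch)
   * @param randomOrderEnabled hypothetical flag standing in for the new setting
   * @param byNetworkDistance  stand-in for the topology-based ordering
   * @param rng                source of randomness, injectable for tests
   */
  static List<String> orderReplicas(List<String> replicas,
                                    boolean randomOrderEnabled,
                                    Comparator<String> byNetworkDistance,
                                    Random rng) {
    // Copy first so the caller's list is never mutated.
    List<String> ordered = new ArrayList<>(replicas);
    if (randomOrderEnabled) {
      // Random policy: every permutation is equally likely, so no single
      // datanode is systematically listed first.
      Collections.shuffle(ordered, rng);
    } else {
      // Default policy: closest replica first.
      ordered.sort(byNetworkDistance);
    }
    return ordered;
  }

  public static void main(String[] args) {
    List<String> replicas = Arrays.asList("dn3", "dn1", "dn2");
    // Natural string order is only a placeholder for network distance.
    Comparator<String> distance = Comparator.naturalOrder();
    System.out.println("sorted:   "
        + orderReplicas(replicas, false, distance, new Random()));
    System.out.println("shuffled: "
        + orderReplicas(replicas, true, distance, new Random()));
  }
}
```

Keeping the flag off by default preserves the existing ordering for clusters that do benefit from locality, while clusters that enable it spread first-choice reads across all replicas.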

How was this patch tested?

Unit test & mvn clean install

For code changes:

  • Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')? See subject
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation? N/A
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0? N/A
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files? N/A

@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:--------|
| +0 🆗 | reexec | 0m 50s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 1s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 0s | | codespell was not available. |
| +0 🆗 | detsecrets | 0m 0s | | detect-secrets was not available. |
| +0 🆗 | xmllint | 0m 0s | | xmllint was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 💚 | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
| | | | | _ trunk Compile Tests _ |
| +0 🆗 | mvndep | 6m 22s | | Maven dependency ordering for branch |
| +1 💚 | mvninstall | 37m 37s | | trunk passed |
| +1 💚 | compile | 18m 32s | | trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 |
| +1 💚 | compile | 14m 58s | | trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06 |
| +1 💚 | checkstyle | 4m 39s | | trunk passed |
| +1 💚 | mvnsite | 5m 11s | | trunk passed |
| -1 ❌ | javadoc | 1m 22s | /branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04.txt | hadoop-hdfs in trunk failed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04. |
| +1 💚 | javadoc | 2m 41s | | trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06 |
| +1 💚 | spotbugs | 5m 52s | | trunk passed |
| +1 💚 | shadedclient | 43m 44s | | branch has no errors when building and testing our client artifacts. |
| | | | | _ Patch Compile Tests _ |
| +0 🆗 | mvndep | 0m 31s | | Maven dependency ordering for patch |
| +1 💚 | mvninstall | 2m 0s | | the patch passed |
| +1 💚 | compile | 16m 37s | | the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 |
| +1 💚 | javac | 16m 37s | | the patch passed |
| +1 💚 | compile | 15m 0s | | the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06 |
| +1 💚 | javac | 15m 0s | | the patch passed |
| -1 ❌ | blanks | 0m 0s | /blanks-eol.txt | The patch has 2 line(s) that end in blanks. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply |
| +1 💚 | checkstyle | 4m 29s | | the patch passed |
| +1 💚 | mvnsite | 3m 6s | | the patch passed |
| -1 ❌ | javadoc | 1m 22s | /patch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04.txt | hadoop-hdfs in the patch failed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04. |
| +1 💚 | javadoc | 2m 46s | | the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06 |
| +1 💚 | spotbugs | 6m 11s | | the patch passed |
| +1 💚 | shadedclient | 42m 23s | | patch has no errors when building and testing our client artifacts. |
| | | | | _ Other Tests _ |
| +1 💚 | unit | 14m 28s | | hadoop-common in the patch passed. |
| +1 💚 | unit | 1m 47s | | hadoop-hdfs in the patch passed. |
| +1 💚 | asflicense | 1m 1s | | The patch does not generate ASF License warnings. |
| | | 255m 57s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.48 ServerAPI=1.48 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7447/1/artifact/out/Dockerfile |
| GITHUB PR | #7447 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint |
| uname | Linux b5cf1a478597 5.15.0-131-generic #141-Ubuntu SMP Fri Jan 10 21:18:28 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / c7738f3 |
| Default Java | Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7447/1/testReport/ |
| Max. process+thread count | 1257 (vs. ulimit of 5500) |
| modules | C: hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs U: . |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7447/1/console |
| versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:--------|
| +0 🆗 | reexec | 1m 4s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 1s | | codespell was not available. |
| +0 🆗 | detsecrets | 0m 1s | | detect-secrets was not available. |
| +0 🆗 | xmllint | 0m 1s | | xmllint was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 💚 | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
| | | | | _ trunk Compile Tests _ |
| +0 🆗 | mvndep | 6m 46s | | Maven dependency ordering for branch |
| +1 💚 | mvninstall | 32m 53s | | trunk passed |
| +1 💚 | compile | 16m 2s | | trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 |
| +1 💚 | compile | 14m 0s | | trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06 |
| +1 💚 | checkstyle | 4m 34s | | trunk passed |
| +1 💚 | mvnsite | 3m 4s | | trunk passed |
| -1 ❌ | javadoc | 1m 16s | /branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04.txt | hadoop-hdfs in trunk failed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04. |
| +1 💚 | javadoc | 2m 39s | | trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06 |
| +1 💚 | spotbugs | 5m 48s | | trunk passed |
| +1 💚 | shadedclient | 37m 48s | | branch has no errors when building and testing our client artifacts. |
| | | | | _ Patch Compile Tests _ |
| +0 🆗 | mvndep | 0m 33s | | Maven dependency ordering for patch |
| +1 💚 | mvninstall | 2m 13s | | the patch passed |
| +1 💚 | compile | 17m 0s | | the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 |
| +1 💚 | javac | 17m 0s | | the patch passed |
| +1 💚 | compile | 15m 39s | | the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06 |
| +1 💚 | javac | 15m 39s | | the patch passed |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 💚 | checkstyle | 4m 48s | | the patch passed |
| +1 💚 | mvnsite | 3m 6s | | the patch passed |
| -1 ❌ | javadoc | 1m 16s | /patch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04.txt | hadoop-hdfs in the patch failed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04. |
| +1 💚 | javadoc | 2m 50s | | the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06 |
| +1 💚 | spotbugs | 6m 6s | | the patch passed |
| +1 💚 | shadedclient | 38m 8s | | patch has no errors when building and testing our client artifacts. |
| | | | | _ Other Tests _ |
| +1 💚 | unit | 14m 53s | | hadoop-common in the patch passed. |
| +1 💚 | unit | 1m 51s | | hadoop-hdfs in the patch passed. |
| +1 💚 | asflicense | 1m 1s | | The patch does not generate ASF License warnings. |
| | | 237m 41s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.48 ServerAPI=1.48 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7447/2/artifact/out/Dockerfile |
| GITHUB PR | #7447 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint |
| uname | Linux 2e774f40e987 5.15.0-130-generic #140-Ubuntu SMP Wed Dec 18 17:59:53 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 56e55cc |
| Default Java | Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7447/2/testReport/ |
| Max. process+thread count | 3152 (vs. ulimit of 5500) |
| modules | C: hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs U: . |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7447/2/console |
| versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |

This message was automatically generated.

@weisong44 weisong44 changed the title Added support random datanode ordering in getBlockLocations() HDFS-17743: Added support random datanode ordering in getBlockLocations() Mar 3, 2025
@weisong44 weisong44 changed the title HDFS-17743: Added support random datanode ordering in getBlockLocations() HDFS-17743: Added support for random datanode ordering in getBlockLocations() Mar 3, 2025
2 participants