HDFS-16540 Data locality is lost when DataNode pod restarts in kubern… #4170

Merged
merged 1 commit into from
Apr 28, 2022

Conversation

huaxiangsun
Contributor

@huaxiangsun huaxiangsun commented Apr 13, 2022

…etes

Description of PR

When a DN with the same UUID registers with a different IP, host2DatanodeMap needs to be updated accordingly.
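
The failure mode described above can be illustrated with a minimal sketch. It is not the Hadoop implementation: `LocalitySketch`, `Node`, `register`, and `lookupByIp` are hypothetical names, and the sketch assumes the lookup structure is keyed by IP address, with `byIp` standing in for host2DatanodeMap.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified, hypothetical model of DataNode registration; not the real Hadoop classes. */
class LocalitySketch {
  /** A registered DataNode: a stable UUID plus its current IP address. */
  static final class Node {
    final String uuid;
    final String ip;
    Node(String uuid, String ip) { this.uuid = uuid; this.ip = ip; }
  }

  // Stand-ins for a UUID-keyed registry and an IP-keyed map (assumed shape of host2DatanodeMap).
  private final Map<String, Node> byUuid = new HashMap<>();
  private final Map<String, Node> byIp = new HashMap<>();

  /** Registers a node; on re-registration with a new IP, the IP-keyed entry is replaced too. */
  void register(String uuid, String ip) {
    Node previous = byUuid.get(uuid);
    if (previous != null && !previous.ip.equals(ip)) {
      // Without this removal the IP-keyed map would keep pointing at the old address.
      byIp.remove(previous.ip);
    }
    Node current = new Node(uuid, ip);
    byUuid.put(uuid, current);
    byIp.put(ip, current);
  }

  /** Locality lookup: which DataNode, if any, is registered at this IP? */
  Node lookupByIp(String ip) { return byIp.get(ip); }

  public static void main(String[] args) {
    LocalitySketch nn = new LocalitySketch();
    nn.register("dn-uuid-1", "10.0.0.5");
    nn.register("dn-uuid-1", "10.0.0.9");   // pod restarted and came back with a new IP
    System.out.println(nn.lookupByIp("10.0.0.9") != null);  // true
    System.out.println(nn.lookupByIp("10.0.0.5") == null);  // true
  }
}
```

If the IP-keyed map were left untouched on re-registration, a client co-located with the restarted pod would look up the new IP, find no DataNode there, and be treated as non-local.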

How was this patch tested?

Tested 3.3.2 with the patch on an EKS cluster: restarted the pod hosting the DataNode and the HBase region server, then ran a major compaction of an HBase region and confirmed that locality was kept.

A new unit test case is also added.

For code changes:

  • Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 17m 3s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 41m 35s trunk passed
+1 💚 compile 1m 31s trunk passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04
+1 💚 compile 1m 20s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 checkstyle 1m 5s trunk passed
+1 💚 mvnsite 1m 31s trunk passed
+1 💚 javadoc 1m 7s trunk passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 1m 31s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 3m 40s trunk passed
+1 💚 shadedclient 26m 14s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 1m 18s the patch passed
+1 💚 compile 1m 25s the patch passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04
+1 💚 javac 1m 25s the patch passed
+1 💚 compile 1m 15s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 javac 1m 15s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 53s the patch passed
+1 💚 mvnsite 1m 20s the patch passed
+1 💚 javadoc 0m 54s the patch passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 1m 26s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 3m 32s the patch passed
+1 💚 shadedclient 26m 6s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 341m 40s hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 41s The patch does not generate ASF License warnings.
474m 5s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4170/1/artifact/out/Dockerfile
GITHUB PR #4170
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell
uname Linux 98cf76256475 4.15.0-162-generic #170-Ubuntu SMP Mon Oct 18 11:38:05 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 1316ff0eada1e29dec8ca56ab266c9bcbe60051c
Default Java Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4170/1/testReport/
Max. process+thread count 2175 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4170/1/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

Contributor

@saintstack saintstack left a comment

LGTM

Member

@ndimiduk ndimiduk left a comment

Some small comments, otherwise looks good to me.

  NameNode.stateChangeLog.info("BLOCK* registerDatanode: " + nodeS
      + " is replaced by " + nodeReg + " with the same storageID "
-     + nodeReg.getDatanodeUuid());
+     + nodeReg.getDatanodeUuid() + ", updateHost2DatanodeMap: " + updateHost2DatanodeMap);
Member

Is this extra information needed at the INFO level log? I understand that having the value printed is helpful during development, but I don't think it's meaningful to an operator.

Also, if you're here to change a log message, maybe also change it to use the format string version instead of string concatenation?
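
For illustration, the difference between the two styles can be sketched as follows. This assumes an SLF4J-style logger that accepts `{}` placeholders; to stay dependency-free, the hypothetical `format` helper below only mimics that substitution and is not part of any logging library. The variable values are placeholders.

```java
/** Dependency-free sketch of "{}" placeholder logging; the helper is hypothetical. */
class LogFormatSketch {
  /** Replaces each "{}" in the template with the next argument, left to right. */
  static String format(String template, Object... args) {
    StringBuilder out = new StringBuilder();
    int from = 0;
    for (Object arg : args) {
      int at = template.indexOf("{}", from);
      if (at < 0) {
        break;  // more arguments than placeholders: ignore the extras
      }
      out.append(template, from, at).append(arg);
      from = at + 2;
    }
    return out.append(template.substring(from)).toString();
  }

  public static void main(String[] args) {
    String nodeS = "old-node";      // placeholder values for illustration only
    String nodeReg = "new-node";
    String uuid = "dn-uuid-1";
    // Concatenation style, as in the log statement quoted above:
    String concatenated = "BLOCK* registerDatanode: " + nodeS
        + " is replaced by " + nodeReg + " with the same storageID " + uuid;
    // Format-string style: one template, arguments passed separately.
    String formatted = format(
        "BLOCK* registerDatanode: {} is replaced by {} with the same storageID {}",
        nodeS, nodeReg, uuid);
    System.out.println(formatted.equals(concatenated));  // true
  }
}
```

With a real placeholder-based logger, the template and arguments would be passed directly to the logging call, so the message string is only assembled when the level is enabled.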

Contributor Author

Yeah, agree. Let me undo this change.

Contributor Author

I am going to upload a patch that does not log updateHost2DatanodeMap.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 58s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 15m 45s Maven dependency ordering for branch
+1 💚 mvninstall 28m 6s trunk passed
+1 💚 compile 24m 43s trunk passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04
+1 💚 compile 20m 54s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 checkstyle 3m 55s trunk passed
+1 💚 mvnsite 25m 47s trunk passed
-1 ❌ javadoc 1m 30s /branch-javadoc-root-jdkUbuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04.txt root in trunk failed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04.
+1 💚 javadoc 8m 21s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 39m 0s trunk passed
+1 💚 shadedclient 57m 59s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 28s Maven dependency ordering for patch
+1 💚 mvninstall 25m 40s the patch passed
+1 💚 compile 24m 26s the patch passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04
+1 💚 javac 24m 26s the patch passed
+1 💚 compile 21m 29s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 javac 21m 29s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 4m 15s the patch passed
+1 💚 mvnsite 20m 28s the patch passed
-1 ❌ javadoc 1m 30s /patch-javadoc-root-jdkUbuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04.txt root in the patch failed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04.
+1 💚 javadoc 8m 27s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 39m 20s the patch passed
+1 💚 shadedclient 58m 8s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 1064m 10s /patch-unit-root.txt root in the patch passed.
+1 💚 asflicense 2m 12s The patch does not generate ASF License warnings.
1431m 51s
Reason Tests
Failed junit tests hadoop.yarn.server.resourcemanager.TestRMEmbeddedElector
hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4170/4/artifact/out/Dockerfile
GITHUB PR #4170
Optional Tests dupname asflicense codespell compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle
uname Linux c73e326f6bae 4.15.0-175-generic #184-Ubuntu SMP Thu Mar 24 17:48:36 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / ccce180d3ac535495ab8937cf6165542c3a89b46
Default Java Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4170/4/testReport/
Max. process+thread count 2214 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs . U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4170/4/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 56s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 16m 1s Maven dependency ordering for branch
+1 💚 mvninstall 28m 2s trunk passed
+1 💚 compile 24m 51s trunk passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04
+1 💚 compile 21m 32s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 checkstyle 4m 29s trunk passed
+1 💚 mvnsite 19m 53s trunk passed
-1 ❌ javadoc 1m 37s /branch-javadoc-root-jdkUbuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04.txt root in trunk failed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04.
+1 💚 javadoc 8m 28s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 38m 52s trunk passed
+1 💚 shadedclient 57m 54s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 27s Maven dependency ordering for patch
+1 💚 mvninstall 26m 0s the patch passed
+1 💚 compile 24m 27s the patch passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04
+1 💚 javac 24m 27s the patch passed
+1 💚 compile 21m 38s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 javac 21m 38s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 4m 20s the patch passed
+1 💚 mvnsite 19m 30s the patch passed
-1 ❌ javadoc 1m 26s /patch-javadoc-root-jdkUbuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04.txt root in the patch failed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04.
+1 💚 javadoc 8m 58s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 39m 23s the patch passed
+1 💚 shadedclient 57m 59s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 1054m 18s /patch-unit-root.txt root in the patch passed.
+1 💚 asflicense 2m 16s The patch does not generate ASF License warnings.
1417m 34s
Reason Tests
Failed junit tests hadoop.hdfs.TestReplaceDatanodeFailureReplication
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4170/5/artifact/out/Dockerfile
GITHUB PR #4170
Optional Tests dupname asflicense codespell compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle
uname Linux 95b69934c683 4.15.0-175-generic #184-Ubuntu SMP Thu Mar 24 17:48:36 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / f73d13a
Default Java Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4170/5/testReport/
Max. process+thread count 2519 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs . U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4170/5/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

@huaxiangsun
Contributor Author

I ran the failed test [.TestReplaceDatanodeFailureReplication.testWithOnlyLastDatanodeIsAlive] locally multiple times, and it passed each time.

@huaxiangsun
Contributor Author

Any more comments? Thanks.

Contributor

@tomscut tomscut left a comment

LGTM.

@saintstack
Contributor

I'll merge later today (unless someone else beats me to it).

@saintstack saintstack merged commit bda0881 into apache:trunk Apr 28, 2022
saintstack added a commit that referenced this pull request Apr 28, 2022
…n kubernetes (#4170)"

Revert to add the '.' after HDFS-16540 so commit message format matches
precedent

This reverts commit bda0881.
saintstack added a commit that referenced this pull request Apr 28, 2022
…netes (#4170)

This reverts the previous commit 4e47eb6
undone so I could reapply with the '.' after the HDFS-16540 as is done
in all other commits.
saintstack added a commit to saintstack/hadoop that referenced this pull request Apr 28, 2022
saintstack added a commit to saintstack/hadoop that referenced this pull request May 15, 2022
saintstack added a commit that referenced this pull request May 16, 2022
@Hexiaoqiao
Contributor

Sorry for the late response. I just found that this PR includes an unrelated file, '.BUILDING.txt.swp', under the root path of the project. If there are no other concerns, I would like to remove it shortly.

HarshitGupta11 pushed a commit to HarshitGupta11/hadoop that referenced this pull request Nov 28, 2022
…etes (apache#4170)

When DN with the same UUID is registered with a different IP, host2DatanodeMap needs to be updated accordingly.
HarshitGupta11 pushed a commit to HarshitGupta11/hadoop that referenced this pull request Nov 28, 2022
…n kubernetes (apache#4170)"

Revert to add the '.' after HDFS-16540 so commit message format matches
precedent

This reverts commit bda0881.
HarshitGupta11 pushed a commit to HarshitGupta11/hadoop that referenced this pull request Nov 28, 2022
…netes (apache#4170)

This reverts the previous commit 4e47eb6
undone so I could reapply with the '.' after the HDFS-16540 as is done
in all other commits.
6 participants