Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HDFS-16498. Fix NPE for checkBlockReportLease #4057

Closed
wants to merge 5 commits into from

Conversation

tomscut
Copy link
Contributor

@tomscut tomscut commented Mar 9, 2022

JIRA: HDFS-16498.

During the restart of Namenode, a Datanode is not registered, but this Datanode triggers FBR, which causes NPE.

image

@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 1m 1s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 39m 50s trunk passed
+1 💚 compile 1m 40s trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚 compile 1m 29s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 checkstyle 1m 9s trunk passed
+1 💚 mvnsite 1m 39s trunk passed
+1 💚 javadoc 1m 6s trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 1m 33s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 3m 35s trunk passed
+1 💚 shadedclient 26m 33s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 1m 23s the patch passed
+1 💚 compile 1m 29s the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚 javac 1m 29s the patch passed
+1 💚 compile 1m 18s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 javac 1m 18s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 53s the patch passed
+1 💚 mvnsite 1m 25s the patch passed
+1 💚 javadoc 0m 58s the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 1m 30s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
-1 ❌ spotbugs 3m 49s /new-spotbugs-hadoop-hdfs-project_hadoop-hdfs.html hadoop-hdfs-project/hadoop-hdfs generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0)
+1 💚 shadedclient 26m 28s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 373m 41s hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 38s The patch does not generate ASF License warnings.
489m 56s
Reason Tests
SpotBugs module:hadoop-hdfs-project/hadoop-hdfs
Load of known null value in org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.checkBlockReportLease(BlockReportContext, DatanodeID) At BlockManager.java:in org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.checkBlockReportLease(BlockReportContext, DatanodeID) At BlockManager.java:[line 2755]
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4057/1/artifact/out/Dockerfile
GITHUB PR #4057
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell
uname Linux 514bc3c25701 4.15.0-166-generic #174-Ubuntu SMP Wed Dec 8 19:07:44 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / a7373dd0f53a983d83c45c7842fefa05939af296
Default Java Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4057/1/testReport/
Max. process+thread count 2148 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4057/1/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 17m 16s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 1s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 36m 36s trunk passed
+1 💚 compile 1m 30s trunk passed with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04
+1 💚 compile 1m 20s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 checkstyle 1m 0s trunk passed
+1 💚 mvnsite 1m 28s trunk passed
+1 💚 javadoc 1m 3s trunk passed with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04
+1 💚 javadoc 1m 28s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 3m 29s trunk passed
+1 💚 shadedclient 25m 52s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 1m 22s the patch passed
+1 💚 compile 1m 25s the patch passed with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04
+1 💚 javac 1m 25s the patch passed
+1 💚 compile 1m 15s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 javac 1m 15s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 54s the patch passed
+1 💚 mvnsite 1m 23s the patch passed
+1 💚 javadoc 0m 55s the patch passed with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04
+1 💚 javadoc 1m 24s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
-1 ❌ spotbugs 3m 32s /new-spotbugs-hadoop-hdfs-project_hadoop-hdfs.html hadoop-hdfs-project/hadoop-hdfs generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0)
+1 💚 shadedclient 25m 53s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 353m 53s hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 39s The patch does not generate ASF License warnings.
480m 44s
Reason Tests
SpotBugs module:hadoop-hdfs-project/hadoop-hdfs
Load of known null value in org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.checkBlockReportLease(BlockReportContext, DatanodeID) At BlockManager.java:in org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.checkBlockReportLease(BlockReportContext, DatanodeID) At BlockManager.java:[line 2755]
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4057/2/artifact/out/Dockerfile
GITHUB PR #4057
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell
uname Linux ad116df5d767 4.15.0-162-generic #170-Ubuntu SMP Mon Oct 18 11:38:05 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 16bfb1ac3acb304e5c8a73741798172a84e8f26b
Default Java Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4057/2/testReport/
Max. process+thread count 2124 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4057/2/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus
Copy link

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 53s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 36m 12s trunk passed
+1 💚 compile 1m 30s trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚 compile 1m 23s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 checkstyle 1m 0s trunk passed
+1 💚 mvnsite 1m 30s trunk passed
+1 💚 javadoc 1m 3s trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 1m 32s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 3m 31s trunk passed
+1 💚 shadedclient 26m 8s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 1m 20s the patch passed
+1 💚 compile 1m 26s the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚 javac 1m 26s the patch passed
+1 💚 compile 1m 16s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 javac 1m 16s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 53s the patch passed
+1 💚 mvnsite 1m 19s the patch passed
+1 💚 javadoc 0m 56s the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 1m 29s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 3m 30s the patch passed
+1 💚 shadedclient 26m 4s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 377m 3s hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 49s The patch does not generate ASF License warnings.
487m 35s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4057/3/artifact/out/Dockerfile
GITHUB PR #4057
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell
uname Linux 729aa612cad7 4.15.0-166-generic #174-Ubuntu SMP Wed Dec 8 19:07:44 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / c2eac05
Default Java Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4057/3/testReport/
Max. process+thread count 2105 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4057/3/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus
Copy link

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 1m 26s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 39m 14s trunk passed
+1 💚 compile 1m 49s trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚 compile 1m 41s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 checkstyle 1m 12s trunk passed
+1 💚 mvnsite 1m 51s trunk passed
+1 💚 javadoc 1m 19s trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 1m 43s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 3m 45s trunk passed
+1 💚 shadedclient 29m 46s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 1m 40s the patch passed
+1 💚 compile 1m 44s the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚 javac 1m 44s the patch passed
+1 💚 compile 1m 25s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 javac 1m 25s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 59s the patch passed
+1 💚 mvnsite 1m 36s the patch passed
+1 💚 javadoc 1m 4s the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 1m 39s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 3m 47s the patch passed
+1 💚 shadedclient 29m 24s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 375m 36s hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 38s The patch does not generate ASF License warnings.
500m 21s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4057/4/artifact/out/Dockerfile
GITHUB PR #4057
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell
uname Linux d7a4f3557755 4.15.0-166-generic #174-Ubuntu SMP Wed Dec 8 19:07:44 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 5d84c9b
Default Java Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4057/4/testReport/
Max. process+thread count 2114 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4057/4/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

@tomscut
Copy link
Contributor Author

tomscut commented Mar 15, 2022

Hi @jojochuang @tasanuma @ayushtkn @Hexiaoqiao @ferhui , could you please review this PR. Thanks.

LOG.debug("Datanode {} is attempting to report but not register yet.",
LOG.warn("Datanode {} is attempting to report but not register yet.",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this some normal stuff, do we need to change this to warn?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ayushtkn for your review. This log output is generated when the NN is restarted, and it does seem normal. I'll change it back later.

@hadoop-yetus
Copy link

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 1m 12s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 36m 37s trunk passed
+1 💚 compile 1m 28s trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚 compile 1m 21s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 checkstyle 1m 1s trunk passed
+1 💚 mvnsite 1m 29s trunk passed
+1 💚 javadoc 1m 3s trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 1m 35s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 3m 27s trunk passed
+1 💚 shadedclient 25m 47s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 1m 19s the patch passed
+1 💚 compile 1m 27s the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚 javac 1m 27s the patch passed
+1 💚 compile 1m 20s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 javac 1m 20s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 54s the patch passed
+1 💚 mvnsite 1m 24s the patch passed
+1 💚 javadoc 0m 55s the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 1m 28s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 3m 32s the patch passed
+1 💚 shadedclient 25m 42s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 350m 17s hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 37s The patch does not generate ASF License warnings.
460m 45s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4057/5/artifact/out/Dockerfile
GITHUB PR #4057
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell
uname Linux bae9ed77552e 4.15.0-166-generic #174-Ubuntu SMP Wed Dec 8 19:07:44 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 6b12706
Default Java Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4057/5/testReport/
Max. process+thread count 2124 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4057/5/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

@tomscut
Copy link
Contributor Author

tomscut commented Mar 21, 2022

Hi @ayushtkn @Hexiaoqiao @ferhui , please take a look at this. Thanks.

@tomscut tomscut requested a review from ayushtkn March 21, 2022 07:57
Copy link
Member

@ayushtkn ayushtkn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had a quick look, prod change makes sense to me, the datanodeManager.getDatanode(nodeID) methods shows it can register null, if the node isn't found.

The test is little complex and isn't showing some actual scenario. If this issue is caused by some delay or race condition, will mocking something to create some delays and so help?

Try to get a test which shows the actual scenario, I found it really hard to follow this test.
If nothing helps, the worst would be add some comments explaining things in details in the test

Comment on lines 2755 to 2757
final UnregisteredNodeException e = new UnregisteredNodeException(nodeID, null);
NameNode.stateChangeLog.error("BLOCK* NameSystem.getDatanode: " + e.getLocalizedMessage());
throw e;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you share me the log message and the exception trace post this. We have passed null here in the exception. I feel like it can lead to something like:
Node null is expected to serve this storage

Which doesn't make sense to me. May be some more appropriate message should be there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you share me the log message and the exception trace post this. We have passed null here in the exception. I feel like it can lead to something like: Node null is expected to serve this storage

Which doesn't make sense to me. May be some more appropriate message should be there.

@ayushtkn Yes, you are right. The log will be: Node null is expected to serve this storage. I passed node in last commit, but there is a spotbug: link. So I passed null. This is just for adaptation UnregisteredNodeException constructor.

Do you have any suggestions for this. Thank you very much.

@tomscut
Copy link
Contributor Author

tomscut commented Mar 21, 2022

Had a quick look, prod change makes sense to me, the datanodeManager.getDatanode(nodeID) methods shows it can register null, if the node isn't found.

The test is little complex and isn't showing some actual scenario. If this issue is caused by some delay or race condition, will mocking something to create some delays and so help?

Try to get a test which shows the actual scenario, I found it really hard to follow this test. If nothing helps, the worst would be add some comments explaining things in details in the test

Thank you @ayushtkn very much for your review and detailed suggestions. I will update the code.

@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 57s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 35m 59s trunk passed
+1 💚 compile 1m 32s trunk passed with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04
+1 💚 compile 1m 22s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 checkstyle 1m 0s trunk passed
+1 💚 mvnsite 1m 31s trunk passed
+1 💚 javadoc 1m 4s trunk passed with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04
+1 💚 javadoc 1m 35s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 3m 24s trunk passed
+1 💚 shadedclient 25m 43s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 1m 19s the patch passed
+1 💚 compile 1m 25s the patch passed with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04
+1 💚 javac 1m 25s the patch passed
+1 💚 compile 1m 18s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 javac 1m 18s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 53s the patch passed
+1 💚 mvnsite 1m 22s the patch passed
+1 💚 javadoc 0m 56s the patch passed with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04
+1 💚 javadoc 1m 30s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 3m 26s the patch passed
+1 💚 shadedclient 25m 47s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 444m 6s /patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 52s The patch does not generate ASF License warnings.
553m 59s
Reason Tests
Failed junit tests hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
hadoop.hdfs.server.namenode.TestFileTruncate
hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes
hadoop.hdfs.server.namenode.TestFsck
hadoop.hdfs.server.namenode.TestAddStripedBlockInFBR
hadoop.hdfs.server.namenode.TestCheckpoint
hadoop.hdfs.server.namenode.ha.TestHASafeMode
hadoop.hdfs.server.namenode.TestAddBlock
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4057/6/artifact/out/Dockerfile
GITHUB PR #4057
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell
uname Linux bbfd86f52e7c 4.15.0-166-generic #174-Ubuntu SMP Wed Dec 8 19:07:44 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 1558c10
Default Java Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4057/6/testReport/
Max. process+thread count 2607 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4057/6/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

@tomscut
Copy link
Contributor Author

tomscut commented Mar 22, 2022

Hi @ayushtkn , I fixed the problem you mentioned, please have a look. Thanks.

@tomscut
Copy link
Contributor Author

tomscut commented Mar 24, 2022

Hi @Hexiaoqiao @tasanuma @ferhui , could you also please review this? Thanks.

Copy link
Contributor

@Hexiaoqiao Hexiaoqiao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @tomscut involve me here. It makes sense to me generally. Leave one nit comment inline. cc @ayushtkn Would you mind to take another review?

@@ -2751,6 +2751,13 @@ public boolean checkBlockReportLease(BlockReportContext context,
return true;
}
DatanodeDescriptor node = datanodeManager.getDatanode(nodeID);
if (node == null) {
final UnregisteredNodeException e = new UnregisteredNodeException(nodeID, null);
NameNode.stateChangeLog.error("BLOCK* NameSystem.getDatanode: " + "Data node " + nodeID +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am confused if we need log here which upper method also do that when meet UnregisteredNodeException. Although it is DEBU level now, we could improve that to WARN, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am confused if we need log here which upper method also do that when meet UnregisteredNodeException. Although it is DEBU level now, we could improve that to WARN, right?

Thanks @Hexiaoqiao for your review. I will wait for @ayushtkn reply and decide whether to update again.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tomscut you can remove this exception from here, considering it is being logged in the upper method.

Although it is DEBU level now, we could improve that to WARN, right?

I think debug is also fine, this doesn't denote any problematic stuff, this can happen in normal scenario as well and the datanode will register subsequently

Copy link
Contributor Author

@tomscut tomscut Mar 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tomscut you can remove this exception from here, considering it is being logged in the upper method.

Although it is DEBU level now, we could improve that to WARN, right?

I think debug is also fine, this doesn't denote any problematic stuff, this can happen in normal scenario as well and the datanode will register subsequently

Hi @ayushtkn, do you mean I just remove this log?

NameNode.stateChangeLog.error("BLOCK* NameSystem.getDatanode: " + "Data node " + nodeID +
          " is attempting to report storage ID " + nodeID.getDatanodeUuid() +
          ". But this node is not registered.");

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeps,
But if @Hexiaoqiao wants to change the log in the upper method to log. I am good with that as well

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeps, But if @Hexiaoqiao wants to change the log in the upper method to log. I am good with that as well

Thank you very much for your comments. I will update the code.

@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 58s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 39m 50s trunk passed
+1 💚 compile 1m 44s trunk passed with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04
+1 💚 compile 1m 35s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 checkstyle 0m 59s trunk passed
+1 💚 mvnsite 1m 32s trunk passed
+1 💚 javadoc 1m 4s trunk passed with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04
+1 💚 javadoc 1m 35s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 3m 33s trunk passed
+1 💚 shadedclient 27m 12s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 1m 35s the patch passed
+1 💚 compile 1m 39s the patch passed with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04
+1 💚 javac 1m 39s the patch passed
+1 💚 compile 1m 30s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 javac 1m 30s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 1m 0s the patch passed
+1 💚 mvnsite 1m 40s the patch passed
+1 💚 javadoc 1m 8s the patch passed with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04
+1 💚 javadoc 1m 42s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 4m 1s the patch passed
+1 💚 shadedclient 29m 54s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 408m 45s /patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 41s The patch does not generate ASF License warnings.
530m 29s
Reason Tests
Failed junit tests hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4057/7/artifact/out/Dockerfile
GITHUB PR #4057
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell
uname Linux 89d183d6eac2 4.15.0-166-generic #174-Ubuntu SMP Wed Dec 8 19:07:44 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 207024b
Default Java Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4057/7/testReport/
Max. process+thread count 2120 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4057/7/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

@Hexiaoqiao
Copy link
Contributor

@tomscut Please check if the failed unit test is related with this changes.

@tomscut
Copy link
Contributor Author

tomscut commented Mar 27, 2022

@tomscut Please check if the failed unit test is related with this changes.

Hi @Hexiaoqiao , the failed unit test is unrelated to the change, and has been run locally multiple times with good results.

Copy link
Contributor

@Hexiaoqiao Hexiaoqiao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1. Let's wait if @ayushtkn has any furthermore comments.

Copy link
Member

@ayushtkn ayushtkn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

asfgit pushed a commit that referenced this pull request Mar 30, 2022
…omscut.

Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
@Hexiaoqiao
Copy link
Contributor

Committed to trunk. Thanks @tomscut for your contribution! Thanks @ayushtkn for your reviews!

@Hexiaoqiao Hexiaoqiao closed this Mar 30, 2022
@tomscut
Copy link
Contributor Author

tomscut commented Mar 30, 2022

Thanks @Hexiaoqiao and @ayushtkn .

HarshitGupta11 pushed a commit to HarshitGupta11/hadoop that referenced this pull request Nov 28, 2022
…d by tomscut.

Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
jojochuang pushed a commit to jojochuang/hadoop that referenced this pull request May 23, 2023
… Contributed by tomscut.

Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
(cherry picked from commit 6eea28c)
Change-Id: I757513e09f242ff2efbb2ef81e294a3000fd4b39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants