Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HDFS-17462. RBF: Fix NPE in Router concat when trg is an empty file. #6722

Open
wants to merge 1 commit into
base: trunk
Choose a base branch
from

Conversation

fannaihao
Copy link

@fannaihao fannaihao commented Apr 11, 2024

Description of PR

When trg of Router concat is an empty file, it will trigger NPE in Router, and the concat will fail, example:
image

This is because when trg is an empty file, NameNode will return lastLocatedBlock as null in the response of getBlockLocations. And Router will not check null of lastLocatedBlock returned, instead Router will use it to get block pool id directly.
Trg of concat is an empty file should be allowed in router since this case is supported by concat of NameNode.
This PR fix this NPE exception.

How was this patch tested?

Test cases are:
image

For code changes:

If lastLocatedBlock returned from getBlockLocations is null in Router concat, it will not be used to get block pool id.
In this case, the block pool id check of trg will be delayed, i.e., concat continues to get and check block pool id of files in src, and only check them.
And the check of trg block pool id can be achieved in following steps, i.e., getLocationForPath and the request of concat forwarded to NameNode.
And exceptions will be thrown if block pool id of trg is not match with the block pool id of any file in src.

  • Does the title or this PR starts with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

Copy link
Member

@ayushtkn ayushtkn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to add a test case for the change. I am not sure why are you going with expectedBlockPoolid thing? the getBlockLocations() call succeeded that means the file actually exists on one of the cluster, so you have a way to get the exact block pool id.

Maybe a better approach would be to do a getFileInfo() and verify if they are in same namespace?

@fannaihao
Copy link
Author

@ayushtkn Thanks for the comment! I will add tests for this.
Yes, I also think it's better to have a direct way to get the block pool id of an empty file then verify, but I failed to find such methods..
I checked getFileInfo, seems it does not return info about block pool id? If I miss something, please let me know.
The expectedBlockPoolId thing is somehow indirect, but it works right..

@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
_ Prechecks _
+1 💚 dupname 0m 00s No case conflicting files found.
+0 🆗 spotbugs 0m 00s spotbugs executables are not available.
+0 🆗 codespell 0m 00s codespell was not available.
+0 🆗 detsecrets 0m 00s detect-secrets was not available.
+1 💚 @author 0m 00s The patch does not contain any @author tags.
-1 ❌ test4tests 0m 00s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ trunk Compile Tests _
+1 💚 mvninstall 87m 10s trunk passed
+1 💚 compile 4m 55s trunk passed
+1 💚 checkstyle 4m 33s trunk passed
+1 💚 mvnsite 4m 53s trunk passed
+1 💚 javadoc 4m 31s trunk passed
+1 💚 shadedclient 137m 05s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 3m 03s the patch passed
+1 💚 compile 2m 24s the patch passed
+1 💚 javac 2m 24s the patch passed
+1 💚 blanks 0m 00s The patch has no blanks issues.
+1 💚 checkstyle 2m 08s the patch passed
+1 💚 mvnsite 2m 27s the patch passed
+1 💚 javadoc 2m 16s the patch passed
+1 💚 shadedclient 151m 08s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 asflicense 5m 26s The patch does not generate ASF License warnings.
398m 52s
Subsystem Report/Notes
GITHUB PR #6722
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname MINGW64_NT-10.0-17763 81e6d3c33564 3.4.10-87d57229.x86_64 2024-02-14 20:17 UTC x86_64 Msys
Build tool maven
Personality /c/hadoop/dev-support/bin/hadoop.sh
git revision trunk / cff78d9
Default Java Azul Systems, Inc.-1.8.0_332-b09
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6722/1/testReport/
modules C: hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project/hadoop-hdfs-rbf
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6722/1/console
versions git=2.44.0.windows.1
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@Hexiaoqiao Hexiaoqiao changed the title HDFS-17462. Fix NPE in Router concat when trg is an empty ile HDFS-17462. RBF: Fix NPE in Router concat when trg is an empty file. Apr 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants