Fix HDFS copy logic #5218
I see that the copy method is used in HadoopSegmentGenerationJobRunner, where it copies the newly generated segments from the staging dir to the final output dir. But why are the new segments stored in different sub-dirs, such that this change is needed?
...t-file-system/pinot-hdfs/src/main/java/org/apache/pinot/plugin/filesystem/HadoopPinotFS.java
For all other PinotFS implementations, the behavior is the same: copying a directory retains the sub-dir structure. Otherwise, what happens if there are files with the same name? If a user wants to copy all the files recursively from a directory and flatten them into a new directory, the user should list all the files and then copy each file to the dest dir.
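To make the "list all the files, then copy each file" suggestion concrete, here is a minimal sketch. It assumes the local file system via java.nio as a stand-in for HDFS; the class and method names (FlattenCopySketch, flattenCopy) are hypothetical and not part of Pinot. It also shows why flattening is risky: two files with the same name in different sub-dirs collide.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper for illustration only; it is not the HadoopPinotFS code.
class FlattenCopySketch {

  /** Copies every regular file under srcDir directly into destDir, dropping sub-dirs. */
  static void flattenCopy(Path srcDir, Path destDir) throws IOException {
    // Step 1: list all files recursively.
    List<Path> files;
    try (Stream<Path> walk = Files.walk(srcDir)) {
      files = walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
    }

    // Step 2: refuse to flatten if two files share a name, before copying anything.
    Set<String> seen = new HashSet<>();
    for (Path file : files) {
      String name = file.getFileName().toString();
      if (!seen.add(name)) {
        throw new IOException("Duplicate file name when flattening: " + name);
      }
    }

    // Step 3: copy file by file into the flat destination.
    Files.createDirectories(destDir);
    for (Path file : files) {
      Files.copy(file, destDir.resolve(file.getFileName().toString()));
    }
  }
}
```

Keeping the flattening in caller code, as suggested above, leaves the choice of collision policy (fail, rename, overwrite) with the caller rather than with the file system plugin.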
Thanks for the explanation! Minor comments but LGTM.
...t-file-system/pinot-hdfs/src/main/java/org/apache/pinot/plugin/filesystem/HadoopPinotFS.java
cfd0b2c to 71584a9
71584a9 to a208fb0
Codecov Report
@@             Coverage Diff              @@
##           master    #5218      +/-   ##
============================================
+ Coverage   65.90%   66.16%   +0.25%
============================================
  Files        1052     1067      +15
  Lines       54170    54311     +141
  Branches     8078     8074       -4
============================================
+ Hits        35702    35935     +233
+ Misses      15819    15741      -78
+ Partials     2649     2635      -14
Currently, the HDFS copy puts all the files from the src dir into the dest dir in a flattened layout.
This fix ensures that the dest directory retains the same structure as the src directory.
Behavior of HadoopPinotFS.copy(src, dest):
Before:
=>
New logic:
=>
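As a rough illustration of the difference between the two layouts, the sketch below computes where a single source file would land under each behavior. It uses java.nio paths only, not the Hadoop API, and the class name, method names, and example paths (/staging, /output, tableA/segment_0.tar.gz) are assumptions made up for this example.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical path-mapping sketch; it is not the HadoopPinotFS implementation.
class CopyTargetSketch {

  /** Flattened layout (the old behavior as described): keep only the file name. */
  static Path flattenedTarget(Path dest, Path file) {
    return dest.resolve(file.getFileName());
  }

  /** Structure-preserving layout (the new behavior as described): keep the path relative to src. */
  static Path preservedTarget(Path src, Path dest, Path file) {
    return dest.resolve(src.relativize(file));
  }

  public static void main(String[] args) {
    // All paths below are made-up examples.
    Path src = Paths.get("/staging");
    Path dest = Paths.get("/output");
    Path file = Paths.get("/staging/tableA/segment_0.tar.gz");

    System.out.println("Before:    " + file + " => " + flattenedTarget(dest, file));
    System.out.println("New logic: " + file + " => " + preservedTarget(src, dest, file));
  }
}
```

With the structure-preserving mapping, two files that share a name but live in different sub-dirs map to different targets, which avoids the collision raised in the review.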