-
Notifications
You must be signed in to change notification settings - Fork 9.2k
HADOOP-17742. fix distcp fail when copying to ftp filesystem #3071
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
🎊 +1 overall
This message was automatically generated. |
|
@steveloughran hi, would you help me review the commit |
steveloughran
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've commented on the distcp side of the changes,.
Regarding FTP
- That Test suite MUST be restored. That's where the entire semantics of rename in the FS are validated.
- Looking at the history of the code you are cutting, it went in because FTP couldn't rename files into different directories (HADOOP-9361).
What has changed since then?
AFAIK,. the FTP API used by the client has to change to the directory and rename by short name only, not the full the path.
| Path root = target.equals(targetWorkPath) ? targetWorkPath.getParent() | ||
| : targetWorkPath; | ||
| Path tempFile = new Path(root, ".distcp.tmp." + | ||
| String tempFilePrefix = DistCpUtils.getTargetTempFilePrefix(fileSystem); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assuming the Path is resolved, the FS Scheme is that of Path.toURI.getScheme(); no need to pass the FS around.
| public static final String CLASS_INSTANTIATION_ERROR_MSG = | ||
| "Unable to instantiate "; | ||
|
|
||
| public static final String TARGET_TEMP_FILE_PREFIX_COMMA = ".distcp.tmp."; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- use DOT instead of COMMA here.
- add javadoc to explain what it does
|
|
||
| public static final String TARGET_TEMP_FILE_PREFIX_COMMA = ".distcp.tmp."; | ||
|
|
||
| public static final String TARGET_TEMP_FILE_PREFIX = "distcp.tmp."; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add _FTP and javadocs to explain why it is needed
| throws IOException { | ||
| LOG.info("Copying {} to {}", source.getPath(), target); | ||
|
|
||
| final Configuration configuration = context.getConfiguration(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you just get the schema in getTempFile() from the path, no need to pass in FileSystem, so no need to move this up.
| } | ||
|
|
||
| /** | ||
| * Return the target temp file prefix |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add . on javadoc sentence to keep javadoc happy
| String tempFilePrefix = DistCpUtils.getTargetTempFilePrefix(targetFS); | ||
| FileStatus[] tempFiles = targetFS.globStatus( | ||
| new Path(targetWorkPath, ".distcp.tmp." + jobId.replaceAll("job","attempt") + "*")); | ||
| new Path(targetWorkPath, tempFilePrefix + jobId.replaceAll("job","attempt") + "*")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably need to cut the line down to keep it under 80 chars width
Thanks, I will update the distCp side of the changes. Regarding FTP, the test suite could be need to modify. After my test using commons-net:3.6, the rename can be work using the full path, just only have the parent directory, and don't have the limit of source path and target path have common parent directory. So, I think the rename method of FTPFilesystem should be modified so that it has the same behavior as other file systems. |
…pache#3042) Contributed by kaifeiYi (yikf). Signed-off-by: Mingliang Liu <liuml07@apache.org> Reviewed-by: Steve Loughran <stevel@apache.org>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
…doop-mapreduce-project (apache#3074)
…op-common, hadoop-tools and cloud-storage projects (apache#3072)
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
more information:
https://issues.apache.org/jira/browse/HADOOP-17742#