HADOOP-16158. DistCp to support checksum validation when copy blocks in parallel #919
💔 -1 overall
This message was automatically generated. |
null, taskAttemptContext);
try {
  committer.commitJob(jobContext);

whitespace:end of line
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/util/DistCpUtils.java
Outdated
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/util/DistCpUtils.java
Outdated
💔 -1 overall
This message was automatically generated. |
@kai33 could you check if a rebase is needed?
I'll take a look
…On Thu, Aug 8, 2019, 11:30 AM Wei-Chiu Chuang ***@***.***> wrote:
@kai33 <https://github.com/kai33> could you check if a rebase is needed?
💔 -1 overall
This message was automatically generated. |
02ad72a to f26004c
Thanks. Retriggering the precommit to verify.
🎊 +1 overall
This message was automatically generated. |
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/util/DistCpUtils.java
🎊 +1 overall
This message was automatically generated. |
+1
…in parallel (#919) * DistCp to support checksum validation when copy blocks in parallel * address review comments * add checksums comparison test for combine mode (cherry picked from commit c765584) (cherry picked from commit b3c14d4) Conflicts: hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/util/TestDistCpUtils.java
…opy blocks in parallel (apache#919) * DistCp to support checksum validation when copy blocks in parallel * address review comments * add checksums comparison test for combine mode (cherry picked from commit c765584) (cherry picked from commit b3c14d4) Conflicts: hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/util/TestDistCpUtils.java (cherry picked from commit c1a2b29) Conflicts: hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/util/TestDistCpUtils.java Change-Id: I5cb8890af0744a89edafd034d8db0a5834308b38
…in parallel (apache#919) * DistCp to support checksum validation when copy blocks in parallel * address review comments * add checksums comparison test for combine mode
…in parallel (apache#919) * DistCp to support checksum validation when copy blocks in parallel * address review comments * add checksums comparison test for combine mode (cherry picked from commit c765584) (cherry picked from commit b3c14d4) Conflicts: hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/util/TestDistCpUtils.java
HADOOP-16158
Copying blocks in parallel (enabled when blocks per chunk > 0) is a great DistCp improvement that can significantly speed up copying large files.
However, checksum validation is skipped in this mode, e.g. in
RetriableFileCopyCommand.java
so a checksum or data mismatch can go unnoticed by developers and users (e.g. HADOOP-16049).
This patch adds the missing checksum validation.
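
To illustrate the idea, here is a minimal, self-contained sketch of validating a copy that was made in parallel chunks: once the chunks are reassembled, the length and a whole-file checksum of the target are compared against the source. It is not the code in this patch. The class and method names (`ChunkedCopyChecksumSketch`, `copyInChunks`, `verifyCopy`) are hypothetical, byte arrays stand in for files, and `java.util.zip.CRC32` stands in for whatever checksum the file systems involved expose.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.zip.CRC32;

// Hypothetical illustration only: in-memory arrays stand in for source and target files.
final class ChunkedCopyChecksumSketch {

    // Whole-"file" checksum; CRC32 is an assumption chosen to keep the sketch stdlib-only.
    static long crc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    // Copies the source in fixed-size chunks on a thread pool, mimicking a parallel block copy.
    static byte[] copyInChunks(byte[] source, int chunkSize) throws Exception {
        byte[] target = new byte[source.length];
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (int offset = 0; offset < source.length; offset += chunkSize) {
                final int start = offset;
                final int length = Math.min(chunkSize, source.length - start);
                pending.add(pool.submit(
                        () -> System.arraycopy(source, start, target, start, length)));
            }
            // Wait for every chunk before the reassembled target is validated.
            for (Future<?> chunk : pending) {
                chunk.get();
            }
        } finally {
            pool.shutdown();
        }
        return target;
    }

    // The validation step: fail loudly instead of silently accepting a bad copy.
    static void verifyCopy(byte[] source, byte[] target) throws IOException {
        if (source.length != target.length) {
            throw new IOException("Length mismatch: source=" + source.length
                    + " target=" + target.length);
        }
        if (crc32(source) != crc32(target)) {
            throw new IOException("Checksum mismatch between source and target");
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] source = new byte[10_000];
        new java.util.Random(42).nextBytes(source);

        byte[] target = copyInChunks(source, 1024);
        verifyCopy(source, target); // passes for an intact copy

        target[5_000] ^= 0x01;      // simulate corruption inside one chunk
        try {
            verifyCopy(source, target);
        } catch (IOException expected) {
            System.out.println("Detected: " + expected.getMessage());
        }
    }
}
```

The point of the sketch is the ordering: validation runs only after all chunks have been written and combined, so a corrupted chunk surfaces as an error rather than as a silently bad target file.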