@symious
Contributor

@symious symious commented Apr 27, 2023

Description of PR

We encountered checksum mismatch errors during DistCp jobs. It took us some time to figure out that the checksum types were different.

Adding error logs should help diagnose such problems faster.

How was this patch tested?

Added a unit test.

For code changes:

  • Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 50s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 42m 13s trunk passed
+1 💚 compile 0m 30s trunk passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1
+1 💚 compile 0m 27s trunk passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
+1 💚 checkstyle 0m 28s trunk passed
+1 💚 mvnsite 0m 31s trunk passed
+1 💚 javadoc 0m 31s trunk passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1
+1 💚 javadoc 0m 24s trunk passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
+1 💚 spotbugs 0m 59s trunk passed
+1 💚 shadedclient 23m 5s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 24s the patch passed
+1 💚 compile 0m 24s the patch passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1
+1 💚 javac 0m 24s the patch passed
+1 💚 compile 0m 21s the patch passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
+1 💚 javac 0m 21s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 14s /results-checkstyle-hadoop-tools_hadoop-distcp.txt hadoop-tools/hadoop-distcp: The patch generated 2 new + 20 unchanged - 0 fixed = 22 total (was 20)
+1 💚 mvnsite 0m 24s the patch passed
+1 💚 javadoc 0m 19s the patch passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1
+1 💚 javadoc 0m 17s the patch passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
+1 💚 spotbugs 0m 48s the patch passed
+1 💚 shadedclient 23m 10s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 14m 50s hadoop-distcp in the patch passed.
+1 💚 asflicense 0m 32s The patch does not generate ASF License warnings.
113m 56s
Subsystem Report/Notes
Docker ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5603/1/artifact/out/Dockerfile
GITHUB PR #5603
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 2addaf850a03 4.15.0-206-generic #217-Ubuntu SMP Fri Feb 3 19:10:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / b1060a8
Default Java Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5603/1/testReport/
Max. process+thread count 632 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-distcp U: hadoop-tools/hadoop-distcp
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5603/1/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

Member

@ayushtkn ayushtkn left a comment

Checkstyle has complaints.

@symious
Contributor Author

symious commented Apr 27, 2023

@ayushtkn Thank you for the review. Updated the PR, PTAL.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 46s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 42m 19s trunk passed
+1 💚 compile 0m 30s trunk passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1
+1 💚 compile 0m 27s trunk passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
+1 💚 checkstyle 0m 27s trunk passed
+1 💚 mvnsite 0m 31s trunk passed
+1 💚 javadoc 0m 32s trunk passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1
+1 💚 javadoc 0m 23s trunk passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
+1 💚 spotbugs 0m 58s trunk passed
+1 💚 shadedclient 23m 7s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 34s the patch passed
+1 💚 compile 0m 25s the patch passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1
+1 💚 javac 0m 25s the patch passed
+1 💚 compile 0m 21s the patch passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
+1 💚 javac 0m 21s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 16s the patch passed
+1 💚 mvnsite 0m 24s the patch passed
+1 💚 javadoc 0m 18s the patch passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1
+1 💚 javadoc 0m 17s the patch passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
+1 💚 spotbugs 0m 48s the patch passed
+1 💚 shadedclient 23m 19s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 14m 34s hadoop-distcp in the patch passed.
+1 💚 asflicense 0m 34s The patch does not generate ASF License warnings.
114m 6s
Subsystem Report/Notes
Docker ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5603/2/artifact/out/Dockerfile
GITHUB PR #5603
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux be8441455bf1 4.15.0-206-generic #217-Ubuntu SMP Fri Feb 3 19:10:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / a832a9b
Default Java Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5603/2/testReport/
Max. process+thread count 619 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-distcp U: hadoop-tools/hadoop-distcp
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5603/2/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

Comment on lines 601 to 602
LOG.error("Checksum not equal. Source checksum: {}, target checksum: {}",
sourceChecksum, targetChecksum);
Member

@ayushtkn ayushtkn Apr 29, 2023

ERROR seems too much for this case; INFO should be more than enough. I also don't think we need to put the checksum in the log; rather, log the sourcePath and targetPath, or at worst both the paths and the checksums.

Contributor

log checksums at debug maybe

anyway, doesn't a checksum mismatch mean "source file needs copying again". so really the thing to log is not the mismatch but why the copy is taking place

Contributor

@steveloughran steveloughran left a comment

commented. it is way too noisy for any cloud upload right now.

would be good here for you to run the s3 and abfs distcp contract tests and look at their output, or do a distcp from hdfs to one of these stores and see what the logs look like.

I am with @ayushtkn here: we must reserve ERROR log messages for where distcp actually fails.

// comparison that took place and return not compatible.
// else if matched, return compatible with the matched result.
if (sourceChecksum == null || targetChecksum == null) {
LOG.error("Checksum incompatible. Source checksum: {}, target checksum: {}",
Contributor

This is not unusual against object stores: all stores which don't have HDFS-compatible checksums disable them so that distcp to cloud stores doesn't blow up. If you logged at error then there'd be an entry for every single copy.

propose: use a LogExactlyOnce at info to say that the source or target fs doesn't support checksums and that they should use -skipCrc
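
To illustrate the "log exactly once" proposal, here is a minimal sketch using only the Java standard library. The class `OnceOnlyNotice`, its `info` method, and the list used as a log sink are hypothetical stand-ins written for this example; this is not Hadoop's `LogExactlyOnce`, and the message text simply repeats the option name as the reviewer wrote it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

/**
 * Hypothetical stand-in for a "log exactly once" helper.
 * It is not Hadoop's class, only an illustration of the idea.
 */
final class OnceOnlyNotice {
  private final AtomicBoolean logged = new AtomicBoolean(false);
  private final List<String> sink; // stands in for a real logger at INFO

  OnceOnlyNotice(List<String> sink) {
    this.sink = sink;
  }

  /** Emits the message on the first call only; returns true if emitted. */
  boolean info(String message) {
    if (logged.compareAndSet(false, true)) {
      sink.add("INFO " + message);
      return true;
    }
    return false;
  }

  /** Simulates the null-checksum branch being hit once per copied file. */
  static List<String> demo(int files) {
    List<String> out = new ArrayList<>();
    OnceOnlyNotice notice = new OnceOnlyNotice(out);
    for (int i = 0; i < files; i++) {
      // Assumed condition: one side returned no checksum for this file.
      notice.info("Source or target filesystem does not support checksums;"
          + " consider -skipCrc");
    }
    return out;
  }

  public static void main(String[] args) {
    demo(1000).forEach(System.out::println);
  }
}
```

With this shape, a job copying many files to a store without checksums would produce a single informational line instead of one entry per file.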

fs, new Path(sourceBase + srcFilename), null,
fs, new Path(targetBase + srcFilename),
sourceCurrStatus.getLen()));
assertTrue(log.getOutput().contains("Checksum not equal"));
Contributor

Asserts should use AssertJ's assertThat(log.getOutput()).contains(...) for a better message.

@symious
Contributor Author

symious commented May 3, 2023

The issue we met was not "source file needs copying again", but that the source file was uploaded with a different checksum type.

Because this is distcp, the temp file was removed after the exception. We had to first skip the CRC check, then run "hadoop fs -checksum" on the source file and the target file to find the root cause.

The initial idea of adding this log was to save others with the same issue from that redundant work, so that they can just check the log for the reason for the mismatch.

@steveloughran @ayushtkn

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 45s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 41m 34s trunk passed
+1 💚 compile 0m 30s trunk passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1
+1 💚 compile 0m 27s trunk passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
+1 💚 checkstyle 0m 27s trunk passed
+1 💚 mvnsite 0m 32s trunk passed
+1 💚 javadoc 0m 30s trunk passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1
+1 💚 javadoc 0m 24s trunk passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
+1 💚 spotbugs 1m 1s trunk passed
+1 💚 shadedclient 23m 19s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 24s the patch passed
+1 💚 compile 0m 24s the patch passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1
-1 ❌ javac 0m 24s /results-compile-javac-hadoop-tools_hadoop-distcp-jdkUbuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1.txt hadoop-tools_hadoop-distcp-jdkUbuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1 with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1 generated 1 new + 15 unchanged - 0 fixed = 16 total (was 15)
+1 💚 compile 0m 21s the patch passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
-1 ❌ javac 0m 21s /results-compile-javac-hadoop-tools_hadoop-distcp-jdkPrivateBuild-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09.txt hadoop-tools_hadoop-distcp-jdkPrivateBuild-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09 with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09 generated 1 new + 13 unchanged - 0 fixed = 14 total (was 13)
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 14s the patch passed
+1 💚 mvnsite 0m 24s the patch passed
+1 💚 javadoc 0m 18s the patch passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1
+1 💚 javadoc 0m 17s the patch passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
+1 💚 spotbugs 0m 48s the patch passed
+1 💚 shadedclient 23m 20s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 15m 7s hadoop-distcp in the patch passed.
+1 💚 asflicense 0m 41s The patch does not generate ASF License warnings.
113m 53s
Subsystem Report/Notes
Docker ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5603/3/artifact/out/Dockerfile
GITHUB PR #5603
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux cd9be08ad945 4.15.0-206-generic #217-Ubuntu SMP Fri Feb 3 19:10:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 462b340
Default Java Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5603/3/testReport/
Max. process+thread count 530 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-distcp U: hadoop-tools/hadoop-distcp
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5603/3/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 38s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 47m 2s trunk passed
+1 💚 compile 0m 32s trunk passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1
+1 💚 compile 0m 31s trunk passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
+1 💚 checkstyle 0m 52s trunk passed
+1 💚 mvnsite 0m 35s trunk passed
+1 💚 javadoc 0m 34s trunk passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1
+1 💚 javadoc 0m 24s trunk passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
+1 💚 spotbugs 1m 10s trunk passed
+1 💚 shadedclient 20m 23s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 25s the patch passed
+1 💚 compile 0m 27s the patch passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1
+1 💚 javac 0m 27s the patch passed
+1 💚 compile 0m 23s the patch passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
+1 💚 javac 0m 23s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 14s the patch passed
+1 💚 mvnsite 0m 27s the patch passed
+1 💚 javadoc 0m 18s the patch passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1
+1 💚 javadoc 0m 18s the patch passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
+1 💚 spotbugs 0m 51s the patch passed
+1 💚 shadedclient 20m 25s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 14m 43s hadoop-distcp in the patch passed.
+1 💚 asflicense 0m 37s The patch does not generate ASF License warnings.
114m 3s
Subsystem Report/Notes
Docker ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5603/4/artifact/out/Dockerfile
GITHUB PR #5603
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 2b0a37429a9a 4.15.0-206-generic #217-Ubuntu SMP Fri Feb 3 19:10:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / c99bcea
Default Java Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5603/4/testReport/
Max. process+thread count 556 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-distcp U: hadoop-tools/hadoop-distcp
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5603/4/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@steveloughran
Contributor

Well, I'm afraid your specific problem does not match the wider use cases of uploading to stores without checksums. Now, I would have been happier if distcp's -skipCrc option was required to copy data from an FS with checksums to one without, but it is not, and to add it now would break so many people's workflows.

So what do we do here?

maybe: create counters of why files were copied, specifically

  • not found at destination
  • file length different
  • modtime
  • checksum

Then after a job you can see why files were copied from the host where the job was launched. Then if you want to know why there were issues such as checksums and modtimes, you can log out to debug. Obviously, this will be something to add to the distcp documentation.
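
The counter proposal above can be sketched as follows. This is a standalone illustration using only the Java standard library: `CopyReasonCounters`, `CopyReason`, `FileInfo`, and `reasonToCopy` are hypothetical names, not DistCp APIs, and the order of the checks and the "source is newer" modtime rule are assumptions made for the example.

```java
import java.util.EnumMap;
import java.util.Map;

/** Illustrative only: these names are hypothetical, not DistCp APIs. */
final class CopyReasonCounters {

  /** The four reasons proposed in the comment above. */
  enum CopyReason {
    NOT_FOUND_AT_DESTINATION, LENGTH_DIFFERENT, MODTIME, CHECKSUM
  }

  /** Simplified stand-in for a file status; checksum may be null. */
  static final class FileInfo {
    final long length;
    final long modTime;
    final String checksum;

    FileInfo(long length, long modTime, String checksum) {
      this.length = length;
      this.modTime = modTime;
      this.checksum = checksum;
    }
  }

  private final Map<CopyReason, Long> counts = new EnumMap<>(CopyReason.class);

  /**
   * Returns why the source would be copied, or null if it would be skipped.
   * The order of checks and the modtime rule are assumptions of this sketch.
   */
  static CopyReason reasonToCopy(FileInfo source, FileInfo target) {
    if (target == null) {
      return CopyReason.NOT_FOUND_AT_DESTINATION;
    }
    if (source.length != target.length) {
      return CopyReason.LENGTH_DIFFERENT;
    }
    if (source.modTime > target.modTime) {
      return CopyReason.MODTIME;
    }
    // A missing checksum on either side is treated as "cannot compare".
    if (source.checksum != null && target.checksum != null
        && !source.checksum.equals(target.checksum)) {
      return CopyReason.CHECKSUM;
    }
    return null;
  }

  /** Increments the counter for a reason; null (no copy) is ignored. */
  void record(CopyReason reason) {
    if (reason != null) {
      counts.merge(reason, 1L, Long::sum);
    }
  }

  long get(CopyReason reason) {
    return counts.getOrDefault(reason, 0L);
  }

  public static void main(String[] args) {
    CopyReasonCounters counters = new CopyReasonCounters();
    FileInfo source = new FileInfo(10L, 100L, "abc");
    counters.record(reasonToCopy(source, null));
    counters.record(reasonToCopy(source, new FileInfo(10L, 100L, "xyz")));
    for (CopyReason reason : CopyReason.values()) {
      System.out.println(reason + "=" + counters.get(reason));
    }
  }
}
```

In a real job the per-reason totals would be reported through the job's own counters so they are visible from the launching host; the in-memory map here only shows the bookkeeping.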

Now: big warning. I am personally scared of distCp. It is a critical workflow tool, and is even used programmatically, yet it is surprisingly brittle. It is a running joke that the last person to add any code to the module gets to field all support calls until someone else comes along. Thank you for volunteering! This also explains why we will be very reluctant/strict about taking on changes. Don't take it personally, as everyone gets that same grilling here.

@symious
Contributor Author

symious commented May 4, 2023

Totally understand.

I have removed the error log for the case where the checksum of sourcePath or targetPath is null, so an FS without checksums won't be affected.

I have also changed the log to INFO level for the case where both checksums exist but are not equal; or would DEBUG level be preferred? Personally I think INFO level can save us more time, since we wouldn't need to reconfigure and restart the job.

Comment on lines 599 to 600
LOG.info("Checksum not equal. Source checksum: {}, target checksum: {}",
sourceChecksum, targetChecksum);
Member

@ayushtkn ayushtkn May 4, 2023

You aren't logging the paths for which the checksum mismatch happened. If thousands of files are being copied and a bunch of them log this, how would you figure out whose checksum didn't match?

Contributor Author

In our failed jobs, the mismatch source and target path was printed by https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/util/DistCpUtils.java#L633.

Thanks for pointing that out. The paths have also been added to DistCpUtils.
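
As a rough illustration of a mismatch message that carries both the paths and the checksum types, here is a small standalone sketch. `ChecksumMismatchMessage.format` is a hypothetical helper written for this example, not the code from this PR, and the paths, algorithm names, and checksum values in `main` are placeholder inputs.

```java
/** Illustrative only: a hypothetical formatter, not the code from this PR. */
final class ChecksumMismatchMessage {

  /**
   * Builds one log line naming both files, their checksum algorithms, and
   * their checksum values, and flags the case where the algorithms differ.
   */
  static String format(String sourcePath, String sourceAlgorithm,
      String sourceChecksum, String targetPath, String targetAlgorithm,
      String targetChecksum) {
    String hint = sourceAlgorithm.equals(targetAlgorithm)
        ? "" : " (checksum types differ)";
    return String.format(
        "Checksum not equal%s. Source: %s [%s] %s, target: %s [%s] %s",
        hint, sourcePath, sourceAlgorithm, sourceChecksum,
        targetPath, targetAlgorithm, targetChecksum);
  }

  public static void main(String[] args) {
    // Placeholder inputs chosen only to show the output shape.
    System.out.println(format(
        "/data/src/part-0", "MD5-of-0MD5-of-512CRC32C", "aaaa",
        "/data/dst/part-0", "COMPOSITE-CRC32C", "bbbb"));
  }
}
```

Putting the algorithm name next to each value makes a differing checksum type visible in the log line itself, which is the situation described earlier in this thread.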

fs, new Path(sourceBase + srcFilename), null,
fs, new Path(targetBase + srcFilename),
sourceCurrStatus.getLen()));
assertThat(log.getOutput(), containsString("Checksum not equal"));
Member

wrong import & assertThat.
This import

import static org.assertj.core.api.Assertions.assertThat;

and it works like this

        assertThat(log.getOutput()).contains("Checksum not equal");

Contributor Author

Copied the hamcrest version from "org.apache.hadoop.conf.TestReconfiguration".

But changed to the Assertions.assertThat one.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 35s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 259m 35s trunk passed
+1 💚 compile 0m 32s trunk passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1
+1 💚 compile 0m 27s trunk passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
+1 💚 checkstyle 0m 32s trunk passed
+1 💚 mvnsite 0m 33s trunk passed
+1 💚 javadoc 0m 37s trunk passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1
+1 💚 javadoc 0m 29s trunk passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
+1 💚 spotbugs 0m 59s trunk passed
+1 💚 shadedclient 20m 17s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 22s the patch passed
+1 💚 compile 0m 22s the patch passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1
+1 💚 javac 0m 22s the patch passed
+1 💚 compile 0m 20s the patch passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
+1 💚 javac 0m 20s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 17s the patch passed
+1 💚 mvnsite 0m 23s the patch passed
+1 💚 javadoc 0m 21s the patch passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1
+1 💚 javadoc 0m 19s the patch passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
+1 💚 spotbugs 0m 48s the patch passed
+1 💚 shadedclient 19m 57s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 14m 48s hadoop-distcp in the patch passed.
+1 💚 asflicense 0m 38s The patch does not generate ASF License warnings.
325m 41s
Subsystem Report/Notes
Docker ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5603/5/artifact/out/Dockerfile
GITHUB PR #5603
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 6a3f0a39b78f 4.15.0-206-generic #217-Ubuntu SMP Fri Feb 3 19:10:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / c73e1da
Default Java Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5603/5/testReport/
Max. process+thread count 700 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-distcp U: hadoop-tools/hadoop-distcp
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5603/5/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@github-actions
Contributor

We're closing this stale PR because it has been open for 100 days with no activity. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you feel like this was a mistake, or you would like to continue working on it, please feel free to re-open it and ask for a committer to remove the stale tag and review again.
Thanks all for your contribution.

@github-actions github-actions bot added the Stale label Oct 21, 2025
@github-actions github-actions bot closed this Oct 22, 2025