@atsaonerk

…STCP_TOTAL_BYTES_COPIED".

The constant indicates the number of bytes copied by the distcp operation. It is exposed through a configuration parameter from which the user can read the value.
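A minimal sketch of the pattern this description outlines: the tool writes its byte total into the configuration it was given, and the caller reads it back afterwards. A plain `Map` stands in for Hadoop's `Configuration`, and the class name `CopyToolSketch`, the helper `bytesCopied`, and the key string are all hypothetical, since the actual value behind the new constant is not shown on this page.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a copy tool that reports a result through its configuration.
class CopyToolSketch {
  // Hypothetical key; the real string behind the new DistCpConstants entry is not shown here.
  static final String TOTAL_BYTES_COPIED_KEY = "example.distcp.total.bytes.copied";

  private final Map<String, String> conf;

  CopyToolSketch(Map<String, String> conf) {
    this.conf = conf;
  }

  Map<String, String> getConf() {
    return conf;
  }

  // Pretends to copy files of the given lengths and records the total in the configuration.
  int run(long... fileLengths) {
    long total = 0;
    for (long len : fileLengths) {
      total += len;
    }
    conf.put(TOTAL_BYTES_COPIED_KEY, Long.toString(total));
    return 0;
  }

  // Reads the counter back, defaulting to 0 when the key is absent or malformed.
  static long bytesCopied(Map<String, String> conf) {
    String raw = conf.get(TOTAL_BYTES_COPIED_KEY);
    if (raw == null) {
      return 0L;
    }
    try {
      return Long.parseLong(raw);
    } catch (NumberFormatException e) {
      return 0L;
    }
  }

  public static void main(String[] args) {
    CopyToolSketch tool = new CopyToolSketch(new HashMap<>());
    tool.run(1024L, 2048L);
    System.out.println(bytesCopied(tool.getConf()));
  }
}
```

Note that the caller's configuration object is mutated as a side effect of running the tool, which is the part of the design the review below questions.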

Description of PR

How was this patch tested?

For code changes:

  • Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

@atsaonerk atsaonerk changed the title Expose distcp counters to user via new DistCpConstants "CONF_LABEL_DI… HADOOP-18635 : Expose distcp counters to user via new DistCpConstants "CONF_LABEL_DI… Feb 15, 2023
@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 58s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
-1 ❌ test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ trunk Compile Tests _
+1 💚 mvninstall 43m 8s trunk passed
+1 💚 compile 0m 34s trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04
+1 💚 compile 0m 31s trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08
+1 💚 checkstyle 0m 33s trunk passed
+1 💚 mvnsite 0m 36s trunk passed
+1 💚 javadoc 0m 36s trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04
+1 💚 javadoc 0m 29s trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08
+1 💚 spotbugs 0m 58s trunk passed
+1 💚 shadedclient 23m 50s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 31s the patch passed
+1 💚 compile 0m 25s the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04
+1 💚 javac 0m 25s the patch passed
+1 💚 compile 0m 23s the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08
+1 💚 javac 0m 23s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 17s /results-checkstyle-hadoop-tools_hadoop-distcp.txt hadoop-tools/hadoop-distcp: The patch generated 3 new + 18 unchanged - 0 fixed = 21 total (was 18)
+1 💚 mvnsite 0m 25s the patch passed
+1 💚 javadoc 0m 21s the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04
+1 💚 javadoc 0m 19s the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08
+1 💚 spotbugs 0m 50s the patch passed
+1 💚 shadedclient 23m 28s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 15m 58s /patch-unit-hadoop-tools_hadoop-distcp.txt hadoop-distcp in the patch passed.
+1 💚 asflicense 0m 36s The patch does not generate ASF License warnings.
117m 35s
Reason Tests
Failed junit tests hadoop.tools.TestExternalCall
Subsystem Report/Notes
Docker ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5402/1/artifact/out/Dockerfile
GITHUB PR #5402
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 1c80ac3f44b1 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / a3dcbd6
Default Java Private Build-1.8.0_352-8u352-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_352-8u352-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5402/1/testReport/
Max. process+thread count 704 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-distcp U: hadoop-tools/hadoop-distcp
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5402/1/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 56s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 43m 32s trunk passed
+1 💚 compile 0m 34s trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04
+1 💚 compile 0m 31s trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08
+1 💚 checkstyle 0m 32s trunk passed
+1 💚 mvnsite 0m 36s trunk passed
+1 💚 javadoc 0m 36s trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04
+1 💚 javadoc 0m 30s trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08
+1 💚 spotbugs 0m 58s trunk passed
+1 💚 shadedclient 25m 17s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 33s the patch passed
+1 💚 compile 0m 28s the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04
+1 💚 javac 0m 28s the patch passed
+1 💚 compile 0m 22s the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08
+1 💚 javac 0m 22s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 17s /results-checkstyle-hadoop-tools_hadoop-distcp.txt hadoop-tools/hadoop-distcp: The patch generated 5 new + 29 unchanged - 0 fixed = 34 total (was 29)
+1 💚 mvnsite 0m 26s the patch passed
+1 💚 javadoc 0m 21s the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04
+1 💚 javadoc 0m 20s the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08
+1 💚 spotbugs 0m 57s the patch passed
+1 💚 shadedclient 24m 43s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 15m 21s /patch-unit-hadoop-tools_hadoop-distcp.txt hadoop-distcp in the patch passed.
+1 💚 asflicense 0m 38s The patch does not generate ASF License warnings.
120m 24s
Reason Tests
Failed junit tests hadoop.tools.TestExternalCall
Subsystem Report/Notes
Docker ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5402/2/artifact/out/Dockerfile
GITHUB PR #5402
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 6e382f48f842 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 2bce2f6
Default Java Private Build-1.8.0_352-8u352-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_352-8u352-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5402/2/testReport/
Max. process+thread count 566 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-distcp U: hadoop-tools/hadoop-distcp
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5402/2/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 1m 5s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
-1 ❌ mvninstall 42m 15s /branch-mvninstall-root.txt root in trunk failed.
+1 💚 compile 0m 34s trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04
+1 💚 compile 0m 30s trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08
+1 💚 checkstyle 0m 32s trunk passed
+1 💚 mvnsite 0m 37s trunk passed
+1 💚 javadoc 0m 36s trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04
+1 💚 javadoc 0m 29s trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08
+1 💚 spotbugs 0m 58s trunk passed
+1 💚 shadedclient 24m 16s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 32s the patch passed
+1 💚 compile 0m 27s the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04
+1 💚 javac 0m 27s the patch passed
+1 💚 compile 0m 22s the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08
+1 💚 javac 0m 22s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 16s the patch passed
+1 💚 mvnsite 0m 26s the patch passed
+1 💚 javadoc 0m 20s the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04
+1 💚 javadoc 0m 19s the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08
+1 💚 spotbugs 0m 53s the patch passed
+1 💚 shadedclient 25m 34s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 16m 0s /patch-unit-hadoop-tools_hadoop-distcp.txt hadoop-distcp in the patch passed.
+1 💚 asflicense 0m 40s The patch does not generate ASF License warnings.
119m 28s
Reason Tests
Failed junit tests hadoop.tools.TestExternalCall
Subsystem Report/Notes
Docker ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5402/3/artifact/out/Dockerfile
GITHUB PR #5402
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux ac77e5e911bb 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 3b45742
Default Java Private Build-1.8.0_352-8u352-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_352-8u352-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5402/3/testReport/
Max. process+thread count 572 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-distcp U: hadoop-tools/hadoop-distcp
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5402/3/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

Contributor

@steveloughran steveloughran left a comment

it's a pretty inelegant way of passing results back, isn't it: updating the conf file passed in.

I would be a lot happier if an IOStatisticsSnapshot was built up and returned, but that is a much bigger bit of work. (I'd actually like all iostats to be collected from workers and handed back...)

if you are going this way and others are happy with the idea, all the various counters should be returned this way and the distcp docs updated to cover this.

ToolRunner.run(conf, distcpTool, args);
final long bytesCopied = NumberUtils.toLong(distcpTool.getConf().
get(CONF_LABEL_DISTCP_TOTAL_BYTES_COPIED), 0);
assertEquals("Bytes copied by distcp tool should match source file length",
Contributor

assert args are the wrong way round for the generated error messages.
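To illustrate the point about argument order, here is a small sketch. The `assertEquals` helper is a simplified stand-in that mimics the failure-message shape of JUnit 4's `assertEquals(String message, long expected, long actual)`; it is not JUnit itself, and the class name, `failureMessage` helper, and sample values are made up for illustration.

```java
class AssertOrderSketch {
  // Simplified stand-in for JUnit 4's assertEquals(message, expected, actual).
  static void assertEquals(String message, long expected, long actual) {
    if (expected != actual) {
      throw new AssertionError(
          message + " expected:<" + expected + "> but was:<" + actual + ">");
    }
  }

  // Returns the failure message, or an empty string when the values match.
  static String failureMessage(long expected, long actual) {
    try {
      assertEquals("Bytes copied by distcp tool should match source file length",
          expected, actual);
      return "";
    } catch (AssertionError e) {
      return e.getMessage();
    }
  }

  public static void main(String[] args) {
    long sourceFileLength = 100L; // the value the test expects
    long bytesCopied = 90L;       // the value the tool reported

    // Correct order: the message names the source length as the expectation.
    System.out.println(failureMessage(sourceFileLength, bytesCopied));
    // Swapped order: the message claims 90 was expected, which misleads the reader.
    System.out.println(failureMessage(bytesCopied, sourceFileLength));
  }
}
```

The assertion passes or fails identically either way; only the diagnostic text changes, which is why the swap is easy to miss until a test breaks.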

Corrected order of assert args in TestDistCpSystem.testDistcpLargeFile test.
@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 48s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 1s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 46m 5s trunk passed
+1 💚 compile 0m 30s trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04
+1 💚 compile 0m 27s trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08
+1 💚 checkstyle 0m 28s trunk passed
+1 💚 mvnsite 0m 32s trunk passed
+1 💚 javadoc 0m 33s trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04
+1 💚 javadoc 0m 24s trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08
+1 💚 spotbugs 0m 55s trunk passed
+1 💚 shadedclient 26m 17s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 30s the patch passed
+1 💚 compile 0m 24s the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04
+1 💚 javac 0m 24s the patch passed
+1 💚 compile 0m 21s the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08
+1 💚 javac 0m 21s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 15s the patch passed
+1 💚 mvnsite 0m 23s the patch passed
+1 💚 javadoc 0m 18s the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04
+1 💚 javadoc 0m 17s the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08
+1 💚 spotbugs 0m 49s the patch passed
+1 💚 shadedclient 26m 23s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 14m 52s hadoop-distcp in the patch passed.
+1 💚 asflicense 0m 32s The patch does not generate ASF License warnings.
123m 50s
Subsystem Report/Notes
Docker ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5402/4/artifact/out/Dockerfile
GITHUB PR #5402
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux a45725ac2cee 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 15669c1
Default Java Private Build-1.8.0_352-8u352-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_352-8u352-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5402/4/testReport/
Max. process+thread count 585 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-distcp U: hadoop-tools/hadoop-distcp
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5402/4/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@steveloughran
Contributor

I really don't like how the results come back.

I'm going to propose adding IOStatistics support to distcp, so that it is lined up for future work and the source config is not modified to suddenly become a two-way exchange of data.

  1. DistCp to implement IOStatisticsSource
  2. until the job finishes, getIOStatistics() to return null
  3. when the job has finished, build the statistics store:

// create the store through its builder, declaring the counter
IOStatisticsStore iostats = IOStatisticsBinding.iostatisticsStore()
  .withCounters(DISTCP_TOTAL_BYTES_COPIED)
  .build();

// then set the counter to the retrieved value
iostats.setCounter(DISTCP_TOTAL_BYTES_COPIED, <counter>);

This is extra work and you have to learn a new API, but

  • IOStatisticsAssertions has the asserts
  • IOStatisticsLogging has pretty printing
  • you can take an IOStatisticsSnapshot and send it over the wire as JSON or as a Java-serialized object
  • it lines things up perfectly for us to collect more detailed stats, not just from the workers (trickier...) but also the cost of directory scanning, cleanup etc.
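The three steps proposed above can be sketched without any Hadoop dependency. Everything in this block is a simplified stand-in: `StatsSketch` and `StatsSourceSketch` play the roles of `IOStatistics` and `IOStatisticsSource`, the counter name is hypothetical, and `execute` takes the byte count as a parameter where a real job would read it from its counters on completion.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Stand-in for a DistCp that exposes results as statistics instead of mutating its config.
class DistCpStatsSketch implements StatsSourceSketch {
  // Hypothetical counter name, chosen for this sketch only.
  static final String DISTCP_TOTAL_BYTES_COPIED = "distcp_total_bytes_copied";

  // Stays null until the job has finished; volatile so other threads see the update.
  private volatile StatsSketch stats;

  @Override
  public StatsSketch getIOStatistics() {
    return stats;
  }

  // Stand-in for running the job; publishes an immutable snapshot once it is done.
  void execute(long bytesCopiedByJob) {
    Map<String, Long> counters = new HashMap<>();
    counters.put(DISTCP_TOTAL_BYTES_COPIED, bytesCopiedByJob);
    Map<String, Long> snapshot = Collections.unmodifiableMap(counters);
    stats = () -> snapshot;
  }

  public static void main(String[] args) {
    DistCpStatsSketch distcp = new DistCpStatsSketch();
    System.out.println(distcp.getIOStatistics() == null); // no statistics before the job runs
    distcp.execute(4096L);
    System.out.println(distcp.getIOStatistics().counters().get(DISTCP_TOTAL_BYTES_COPIED));
  }
}

// Simplified stand-in for Hadoop's IOStatistics: only a counter map.
interface StatsSketch {
  Map<String, Long> counters();
}

// Simplified stand-in for Hadoop's IOStatisticsSource.
interface StatsSourceSketch {
  StatsSketch getIOStatistics();
}
```

Compared with writing into the configuration, this keeps data flowing one way: configuration goes in, statistics come out through a separate accessor, and callers must handle the null returned before completion.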

@github-actions
Contributor

We're closing this stale PR because it has been open for 100 days with no activity. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you feel like this was a mistake, or you would like to continue working on it, please feel free to re-open it and ask for a committer to remove the stale tag and review again.
Thanks all for your contribution.

@github-actions github-actions bot added the Stale label Oct 26, 2025
@github-actions github-actions bot closed this Oct 27, 2025