HADOOP-17531. DistCp: Reduce memory usage on copying huge directories. (#2732). #2808

Merged: 2 commits merged into apache:branch-3.3 on Mar 27, 2021

Conversation

ayushtkn
Member

No description provided.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 27m 1s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 1s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 9 new or modified test files.
_ branch-3.3 Compile Tests _
+0 🆗 mvndep 13m 31s Maven dependency ordering for branch
+1 💚 mvninstall 23m 24s branch-3.3 passed
+1 💚 compile 18m 23s branch-3.3 passed
+1 💚 checkstyle 2m 47s branch-3.3 passed
+1 💚 mvnsite 2m 49s branch-3.3 passed
+1 💚 javadoc 2m 38s branch-3.3 passed
+1 💚 spotbugs 4m 21s branch-3.3 passed
+1 💚 shadedclient 18m 13s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 24s Maven dependency ordering for patch
+1 💚 mvninstall 1m 52s the patch passed
+1 💚 compile 17m 36s the patch passed
+1 💚 javac 17m 36s root generated 0 new + 1871 unchanged - 1 fixed = 1871 total (was 1872)
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 2m 45s root: The patch generated 0 new + 93 unchanged - 5 fixed = 93 total (was 98)
+1 💚 mvnsite 2m 51s the patch passed
+1 💚 xml 0m 1s The patch has no ill-formed XML file.
+1 💚 javadoc 2m 40s the patch passed
+1 💚 spotbugs 4m 55s the patch passed
+1 💚 shadedclient 18m 22s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 16m 53s hadoop-common in the patch passed.
+1 💚 unit 14m 53s hadoop-distcp in the patch passed.
+1 💚 unit 1m 59s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 50s The patch does not generate ASF License warnings.
201m 15s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2808/1/artifact/out/Dockerfile
GITHUB PR #2808
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell xml
uname Linux adbcf2c417ee 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision branch-3.3 / c4a2868
Default Java Private Build-1.8.0_282-8u282-b08-0ubuntu1~18.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2808/1/testReport/
Max. process+thread count 3143 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-distcp hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2808/1/console
versions git=2.17.1 maven=3.6.0 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

@ayushtkn
Member Author

Ran the S3A and optional HDFS tests:
[INFO] -------------------------------------------------------
[INFO] T E S T S
[INFO] -------------------------------------------------------
[INFO] Running org.apache.hadoop.tools.contract.OptionalTestHDFSContractDistCp
[INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 234.777 s - in org.apache.hadoop.tools.contract.OptionalTestHDFSContractDistCp
[INFO]
[INFO] Results:
[INFO]
[INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0

[INFO] -------------------------------------------------------
[INFO] T E S T S
[INFO] -------------------------------------------------------
[INFO] Running org.apache.hadoop.fs.contract.s3a.ITestS3AContractDistCp
[INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 767.429 s - in org.apache.hadoop.fs.contract.s3a.ITestS3AContractDistCp
[INFO]
[INFO] Results:
[INFO]
[INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0

AWS Region: ap-south-1

return submit(EXECUTOR, () -> {
  try (DurationInfo ignore =
      new DurationInfo(LOG, false, "Creating %s", path)) {
    createFile(fs, path, true, text.getBytes(Charsets.UTF_8));
Member

java.nio.charset.StandardCharsets.UTF_8 can be used instead of shaded guava.
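
A minimal sketch of the suggested swap, for illustration only. The class and method names below (`Utf8BytesExample`, `toUtf8`) are hypothetical and not part of the patch; the only point is that `String.getBytes` accepts the JDK's own `StandardCharsets.UTF_8` constant, so the call site does not need the shaded Guava `Charsets` class.

```java
import java.nio.charset.StandardCharsets;

public class Utf8BytesExample {

  // Hypothetical helper: encodes text with the JDK's built-in UTF-8 constant
  // rather than a Guava (shaded or unshaded) Charsets constant.
  static byte[] toUtf8(String text) {
    return text.getBytes(StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    // "h\u00e9llo" has five characters; U+00E9 takes two bytes in UTF-8.
    byte[] data = toUtf8("h\u00e9llo");
    System.out.println(data.length); // prints 6
  }
}
```

In the snippet under review, this would presumably amount to replacing `Charsets.UTF_8` with `StandardCharsets.UTF_8` and adjusting the import; both constants denote the same UTF-8 charset, so the bytes written should be unchanged.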

Member Author

@ayushtkn ayushtkn Mar 24, 2021

Hey @aajisaka,
This is actually a backport of #2732.
Would it be OK if I merge this as is and raise an addendum PR for trunk as well?

I took this code from ITestPartialRenamesDeletes as is.

@steveloughran
Contributor

+1 on the patch from me, but if @aajisaka has some feedback then let's address that as a followup on trunk before the backport.

I think a follow-up patch with the same JIRA ID would be enough for something that minor, and in branch-3.3 we'd just combine the pair.

@ayushtkn
Member Author

@aajisaka does that make sense? Let me know your thoughts; I plan to conclude this by tomorrow EOD.

@steveloughran
Contributor

@ayushtkn have you got a PR for trunk for the change @aajisaka asked for?

…rectories. (apache#2820). Contributed by Ayush Saxena.

Signed-off-by: Steve Loughran <stevel@apache.org>
@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 24m 59s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 1s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 9 new or modified test files.
_ branch-3.3 Compile Tests _
+0 🆗 mvndep 13m 43s Maven dependency ordering for branch
+1 💚 mvninstall 23m 8s branch-3.3 passed
+1 💚 compile 18m 14s branch-3.3 passed
+1 💚 checkstyle 2m 49s branch-3.3 passed
+1 💚 mvnsite 2m 52s branch-3.3 passed
+1 💚 javadoc 2m 39s branch-3.3 passed
+1 💚 spotbugs 4m 21s branch-3.3 passed
+1 💚 shadedclient 18m 12s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 21s Maven dependency ordering for patch
+1 💚 mvninstall 1m 53s the patch passed
+1 💚 compile 17m 28s the patch passed
+1 💚 javac 17m 28s root generated 0 new + 1948 unchanged - 1 fixed = 1948 total (was 1949)
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 2m 45s root: The patch generated 0 new + 93 unchanged - 5 fixed = 93 total (was 98)
+1 💚 mvnsite 2m 51s the patch passed
+1 💚 xml 0m 1s The patch has no ill-formed XML file.
+1 💚 javadoc 2m 36s the patch passed
+1 💚 spotbugs 5m 12s the patch passed
+1 💚 shadedclient 18m 35s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 16m 58s hadoop-common in the patch passed.
+1 💚 unit 14m 11s hadoop-distcp in the patch passed.
+1 💚 unit 1m 56s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 49s The patch does not generate ASF License warnings.
198m 41s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2808/2/artifact/out/Dockerfile
GITHUB PR #2808
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell xml
uname Linux f54de7037819 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision branch-3.3 / 782ed0c
Default Java Private Build-1.8.0_282-8u282-b08-0ubuntu1~18.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2808/2/testReport/
Max. process+thread count 3143 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-distcp hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2808/2/console
versions git=2.17.1 maven=3.6.0 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

@ayushtkn ayushtkn merged commit 9c9b16c into apache:branch-3.3 Mar 27, 2021
@ayushtkn
Member Author

Merged both the main and the addendum commits as part of this PR. Thanks, everyone.

@steveloughran
Contributor

thanks!

jojochuang pushed a commit to jojochuang/hadoop that referenced this pull request May 23, 2023
… directories. (apache#2808). Contributed by Ayush Saxena.

* HADOOP-17531. DistCp: Reduce memory usage on copying huge directories. (apache#2732).

* HADOOP-17531.Addendum: DistCp: Reduce memory usage on copying huge directories. (apache#2820)

Signed-off-by: Steve Loughran <stevel@apache.org>
 Conflicts:
	hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/test/GenericTestUtils.java
	hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/contract/AbstractContractDistCpTest.java
(cherry picked from commit d86f94d18bd8b33cfc324b5638f12d9018c95d29)
Signed-off-by: Arpit Agarwal <aagarwal@cloudera.com>

Change-Id: Ieec8dbd96444dead3cd115f076a65444ca212a35