HADOOP-18129: Change URI to String in INodeLink to reduce memory footprint of ViewFileSystem #3996

Closed
wants to merge 2 commits into from

abhishekdas99
Contributor

https://issues.apache.org/jira/browse/HADOOP-18129

Description of PR

How was this patch tested?

For code changes:

  • Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
| --- | --- | --- | --- | --- |
| +0 🆗 | reexec | 1m 9s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 2s | | codespell was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 💚 | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
| | | | | _ trunk Compile Tests _ |
| +0 🆗 | mvndep | 12m 50s | | Maven dependency ordering for branch |
| +1 💚 | mvninstall | 26m 47s | | trunk passed |
| +1 💚 | compile | 25m 58s | | trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 |
| +1 💚 | compile | 23m 35s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 💚 | checkstyle | 4m 5s | | trunk passed |
| +1 💚 | mvnsite | 2m 13s | | trunk passed |
| +1 💚 | javadoc | 2m 3s | | trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 |
| +1 💚 | javadoc | 2m 22s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 💚 | spotbugs | 3m 22s | | trunk passed |
| +1 💚 | shadedclient | 26m 23s | | branch has no errors when building and testing our client artifacts. |
| | | | | _ Patch Compile Tests _ |
| +0 🆗 | mvndep | 0m 25s | | Maven dependency ordering for patch |
| +1 💚 | mvninstall | 1m 33s | | the patch passed |
| +1 💚 | compile | 28m 59s | | the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 |
| +1 💚 | javac | 28m 59s | | the patch passed |
| +1 💚 | compile | 26m 28s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 💚 | javac | 26m 28s | | the patch passed |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 ⚠️ | checkstyle | 4m 20s | /results-checkstyle-root.txt | root: The patch generated 1 new + 179 unchanged - 1 fixed = 180 total (was 180) |
| +1 💚 | mvnsite | 2m 14s | | the patch passed |
| +1 💚 | javadoc | 1m 44s | | the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 |
| +1 💚 | javadoc | 2m 19s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 💚 | spotbugs | 3m 59s | | the patch passed |
| +1 💚 | shadedclient | 24m 4s | | patch has no errors when building and testing our client artifacts. |
| | | | | _ Other Tests _ |
| +1 💚 | unit | 31m 4s | | hadoop-common in the patch passed. |
| -1 ❌ | unit | 47m 58s | /patch-unit-hadoop-hdfs-project_hadoop-hdfs-nfs.txt | hadoop-hdfs-nfs in the patch passed. |
| +1 💚 | asflicense | 0m 48s | | The patch does not generate ASF License warnings. |
| | | 310m 30s | | |

| Reason | Tests |
| --- | --- |
| Failed junit tests | hadoop.hdfs.nfs.nfs3.TestNfs3HttpServer |
| | hadoop.hdfs.nfs.nfs3.TestClientAccessPrivilege |
| | hadoop.hdfs.nfs.nfs3.TestViewfsWithNfs3 |
| | hadoop.hdfs.nfs.nfs3.TestRpcProgramNfs3 |
| | hadoop.hdfs.nfs.nfs3.TestReaddir |

| Subsystem | Report/Notes |
| --- | --- |
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3996/1/artifact/out/Dockerfile |
| GITHUB PR | #3996 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
| uname | Linux 29f0024893b6 4.15.0-163-generic #171-Ubuntu SMP Fri Nov 5 11:55:11 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / a75520f0ec1fff39c4b58402bf4f331fe4dc9737 |
| Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3996/1/testReport/ |
| Max. process+thread count | 3143 (vs. ulimit of 5500) |
| modules | C: hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs-nfs U: . |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3996/1/console |
| versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |

This message was automatically generated.

Contributor

@ibuenros ibuenros left a comment

Can you also comment on the performance implications of this change? Specifically, ViewFileSystem.getTargetFileSystemPaths may now require parsing strings into URIs on each usage. Where does this impose a performance penalty?

   }

   public Path getMountedOnPath() {
     return mountedOnPath;
   }

-  public URI[] getTargetFileSystemURIs() {
-    return targetFileSystemURIs;
+  public String[] getTargetFileSystemPaths() {
Contributor

This is a public method; can we keep the old signature? We can convert from String to URI within the method itself.

Contributor Author

Added getTargetFileSystemURIs, which returns URI[]. Also added a method, getTargetFileSystemPaths, that returns the target file system paths as String[].
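
For readers following the thread, here is a minimal sketch of how the two accessors could coexist when only Strings are stored. The class name `MountPointSketch` and its constructor are hypothetical stand-ins, not the Hadoop class, and the defensive copies are an illustrative choice rather than something the patch is known to do.

```java
import java.net.URI;

/** Hypothetical stand-in for a mount-point record; not the Hadoop class. */
final class MountPointSketch {
  // Targets are kept as Strings; they are assumed to have been validated
  // when the link was created.
  private final String[] targetFileSystemPaths;

  MountPointSketch(String[] targetFileSystemPaths) {
    this.targetFileSystemPaths = targetFileSystemPaths.clone();
  }

  /** New accessor: returns the stored Strings without any parsing. */
  String[] getTargetFileSystemPaths() {
    return targetFileSystemPaths.clone();
  }

  /** Old signature preserved: URIs are built on demand on every call. */
  URI[] getTargetFileSystemURIs() {
    URI[] uris = new URI[targetFileSystemPaths.length];
    for (int i = 0; i < uris.length; i++) {
      // URI.create wraps URISyntaxException in IllegalArgumentException.
      uris[i] = URI.create(targetFileSystemPaths[i]);
    }
    return uris;
  }

  public static void main(String[] args) {
    MountPointSketch mp = new MountPointSketch(
        new String[] {"hdfs://nn1:8020/a", "hdfs://nn2:8020/b"});
    for (URI uri : mp.getTargetFileSystemURIs()) {
      System.out.println(uri.getHost() + " " + uri.getPath());
    }
  }
}
```

Callers that only need the text can use the String accessor and skip parsing; callers of the old method pay the parsing cost on each call.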

-    targetDirLinkList = new URI[1];
-    targetDirLinkList[0] = aTargetDirLink;
+    targetDirLinkList = new String[1];
+    targetDirLinkList[0] = new URI(aTargetDirLink).toString();
Contributor

What is the purpose of creating a URI and converting back to string?

Contributor Author

In the previous implementation, we created the URI object from the target file system path. If the path was not a valid URI, file system initialization would fail. Since we no longer keep the URI object, constructing a new URI here validates the target file system path and ensures we are not storing an invalid URI.
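
The validate-then-discard pattern described above can be sketched in isolation. `LinkTargetValidator` and the example targets are hypothetical names chosen for illustration; only `java.net.URI` and `URISyntaxException` are real JDK APIs.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical helper, not part of Hadoop: validate once, keep only the String. */
final class LinkTargetValidator {
  private LinkTargetValidator() {}

  /**
   * Parses the target with java.net.URI so that a malformed value fails
   * immediately, then returns the String form so no URI object is retained.
   */
  static String validate(String target) throws URISyntaxException {
    return new URI(target).toString();
  }

  public static void main(String[] args) throws URISyntaxException {
    System.out.println(validate("hdfs://nn1:8020/user/data"));
    try {
      validate("hdfs://nn1:8020/bad path"); // an unescaped space is illegal in a URI
    } catch (URISyntaxException e) {
      System.out.println("rejected: " + e.getInput());
    }
  }
}
```

The temporary URI becomes garbage as soon as `validate` returns, so the long-lived structure holds only the String.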

@abhishekdas99
Contributor Author

> Can you also comment on the performance implications of this change? Specifically, ViewFileSystem.getTargetFileSystemPaths may now require parsing strings into URIs on each usage. Where does this impose a performance penalty?

The performance impact of this change comes from creating the URI object when it is needed, plus creating a throwaway URI object to check the validity of the path.
I created 100,000 URI objects in less than a second (it took 700 ms, to be precise).
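
A rough way to reproduce a measurement like this is sketched below. The class name `UriParseTiming`, the host names, and the path layout are made up for illustration, and a plain loop with `System.nanoTime()` is not a rigorous benchmark (there is no warm-up or JIT control, for which a harness such as JMH would be more appropriate), so the numbers it prints will vary by machine and JVM.

```java
import java.net.URI;

/** Hypothetical timing sketch; not the code used for the figure quoted above. */
final class UriParseTiming {
  private UriParseTiming() {}

  /** Parses {@code count} distinct URI strings and returns elapsed nanoseconds. */
  static long timeParsing(int count) {
    long portSum = 0; // consumed below so the parsing work is not optimized away
    long start = System.nanoTime();
    for (int i = 0; i < count; i++) {
      URI uri = URI.create("hdfs://nn" + (i % 16) + ":8020/mount/point/" + i);
      portSum += uri.getPort();
    }
    long elapsed = System.nanoTime() - start;
    if (portSum != 8020L * count) {
      throw new IllegalStateException("unexpected port sum: " + portSum);
    }
    return elapsed;
  }

  public static void main(String[] args) {
    int count = 100_000;
    long nanos = timeParsing(count);
    System.out.println(count + " URIs parsed in " + (nanos / 1_000_000) + " ms");
  }
}
```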

@hadoop-yetus

🎊 +1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
| --- | --- | --- | --- | --- |
| +0 🆗 | reexec | 0m 50s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 0s | | codespell was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 💚 | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
| | | | | _ trunk Compile Tests _ |
| +0 🆗 | mvndep | 12m 31s | | Maven dependency ordering for branch |
| +1 💚 | mvninstall | 25m 54s | | trunk passed |
| +1 💚 | compile | 24m 27s | | trunk passed with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04 |
| +1 💚 | compile | 21m 0s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 💚 | checkstyle | 3m 57s | | trunk passed |
| +1 💚 | mvnsite | 2m 14s | | trunk passed |
| +1 💚 | javadoc | 1m 43s | | trunk passed with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04 |
| +1 💚 | javadoc | 2m 6s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 💚 | spotbugs | 3m 22s | | trunk passed |
| +1 💚 | shadedclient | 23m 43s | | branch has no errors when building and testing our client artifacts. |
| | | | | _ Patch Compile Tests _ |
| +0 🆗 | mvndep | 0m 24s | | Maven dependency ordering for patch |
| +1 💚 | mvninstall | 1m 19s | | the patch passed |
| +1 💚 | compile | 26m 25s | | the patch passed with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04 |
| +1 💚 | javac | 26m 25s | | the patch passed |
| +1 💚 | compile | 24m 42s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 💚 | javac | 24m 42s | | the patch passed |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 ⚠️ | checkstyle | 4m 1s | /results-checkstyle-root.txt | root: The patch generated 1 new + 179 unchanged - 1 fixed = 180 total (was 180) |
| +1 💚 | mvnsite | 2m 29s | | the patch passed |
| +1 💚 | javadoc | 1m 43s | | the patch passed with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04 |
| +1 💚 | javadoc | 2m 12s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 💚 | spotbugs | 4m 51s | | the patch passed |
| +1 💚 | shadedclient | 25m 33s | | patch has no errors when building and testing our client artifacts. |
| | | | | _ Other Tests _ |
| +1 💚 | unit | 18m 51s | | hadoop-common in the patch passed. |
| +1 💚 | unit | 3m 24s | | hadoop-hdfs-nfs in the patch passed. |
| +1 💚 | asflicense | 0m 51s | | The patch does not generate ASF License warnings. |
| | | 241m 51s | | |

| Subsystem | Report/Notes |
| --- | --- |
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3996/2/artifact/out/Dockerfile |
| GITHUB PR | #3996 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
| uname | Linux c1e481352f5d 4.15.0-163-generic #171-Ubuntu SMP Fri Nov 5 11:55:11 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / d69c720 |
| Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3996/2/testReport/ |
| Max. process+thread count | 1286 (vs. ulimit of 5500) |
| modules | C: hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs-nfs U: . |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3996/2/console |
| versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |

This message was automatically generated.

@omalley
Contributor

omalley commented Mar 16, 2022

What do the numbers look like for the memory usage before and after this change?

@abhishekdas99
Contributor Author

> What do the numbers look like for the memory usage before and after this change?

For 40k mount points, the size of ViewFileSystem was reduced from 62 MB to 28 MB.

Contributor

@omalley omalley left a comment

LGTM +1

@omalley omalley closed this in da9970d Mar 18, 2022
omalley pushed a commit that referenced this pull request Mar 18, 2022
abhishekdas99 added a commit to abhishekdas99/hadoop that referenced this pull request Mar 18, 2022
…print of ViewFileSystem

Fixes apache#3996

(cherry picked from commit da9970d)
omalley pushed a commit that referenced this pull request Mar 22, 2022
…print of ViewFileSystem

Fixes #3996
Fixes #4083

(cherry picked from commit da9970d)
Signed-off-by: Owen O'Malley <oomalley@linkedin.com>
HarshitGupta11 pushed a commit to HarshitGupta11/hadoop that referenced this pull request Nov 28, 2022
4 participants