Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8314569: (fs) Improve normalization of UnixPath for input with trailing slashes #15342

Closed
wants to merge 6 commits into from

Conversation

stsypanov
Copy link
Contributor

@stsypanov stsypanov commented Aug 18, 2023

Avoiding String.substring() call in UnixPath.normalize() can significantly reduce normalization costs for directories on Linux:

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Warmup(time = 2, iterations = 5)
@Measurement(time = 2, iterations = 5)
@Fork(value = 4, jvmArgs = "-Xmx1g")
public class FileToPathBenchmark {

    @Benchmark
    public Path toDirectoryPath(Data data) {
        return FileSystems.getDefault().getPath(data.directoryPath);
    }

    @State(Scope.Thread)
    public static class Data {
        private String directoryPath = "/tmp/tmp/tmp/";
    }
}

Results:

baseline

Benchmark                                                             Mode  Cnt     Score     Error   Units
FileToPathBenchmark.toDirectoryPath                                   avgt   50    40.524 ±   0.940   ns/op
FileToPathBenchmark.toDirectoryPath:·gc.alloc.rate                    avgt   50  3269.167 ± 101.018  MB/sec
FileToPathBenchmark.toDirectoryPath:·gc.alloc.rate.norm               avgt   50   175.213 ±   1.200    B/op
FileToPathBenchmark.toDirectoryPath:·gc.churn.G1_Eden_Space           avgt   50  3279.401 ± 104.252  MB/sec
FileToPathBenchmark.toDirectoryPath:·gc.churn.G1_Eden_Space.norm      avgt   50   175.756 ±   1.681    B/op
FileToPathBenchmark.toDirectoryPath:·gc.churn.G1_Survivor_Space       avgt   50     0.006 ±   0.001  MB/sec
FileToPathBenchmark.toDirectoryPath:·gc.churn.G1_Survivor_Space.norm  avgt   50    ≈ 10⁻³              B/op
FileToPathBenchmark.toDirectoryPath:·gc.count                         avgt   50  1412.000            counts
FileToPathBenchmark.toDirectoryPath:·gc.time                          avgt   50  4144.000                ms

patched

Benchmark                                                             Mode  Cnt     Score     Error   Units
FileToPathBenchmark.toDirectoryPath                                   avgt   50    33.134 ±   2.313   ns/op
FileToPathBenchmark.toDirectoryPath:·gc.alloc.rate                    avgt   50  2652.997 ± 152.758  MB/sec
FileToPathBenchmark.toDirectoryPath:·gc.alloc.rate.norm               avgt   50   115.210 ±   1.960    B/op
FileToPathBenchmark.toDirectoryPath:·gc.churn.G1_Eden_Space           avgt   50  2654.258 ± 157.862  MB/sec
FileToPathBenchmark.toDirectoryPath:·gc.churn.G1_Eden_Space.norm      avgt   50   115.200 ±   1.960    B/op
FileToPathBenchmark.toDirectoryPath:·gc.churn.G1_Survivor_Space       avgt   50     0.007 ±   0.001  MB/sec
FileToPathBenchmark.toDirectoryPath:·gc.churn.G1_Survivor_Space.norm  avgt   50    ≈ 10⁻⁴              B/op
FileToPathBenchmark.toDirectoryPath:·gc.count                         avgt   50  1320.000            counts
FileToPathBenchmark.toDirectoryPath:·gc.time                          avgt   50  3939.000                ms

Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8314569: (fs) Improve normalization of UnixPath for input with trailing slashes (Enhancement - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/15342/head:pull/15342
$ git checkout pull/15342

Update a local copy of the PR:
$ git checkout pull/15342
$ git pull https://git.openjdk.org/jdk.git pull/15342/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 15342

View PR using the GUI difftool:
$ git pr show -t 15342

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/15342.diff

Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Aug 18, 2023

👋 Welcome back stsypanov! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Aug 18, 2023
@openjdk
Copy link

openjdk bot commented Aug 18, 2023

@stsypanov The following label will be automatically applied to this pull request:

  • nio

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the nio nio-dev@openjdk.org label Aug 18, 2023
@mlbridge
Copy link

mlbridge bot commented Aug 18, 2023

Webrevs

@RogerRiggs
Copy link
Contributor

How often is the only normalization the removal of a trailing "/"? It seems quite likely.
If that's the only normalization, then can the normalize call at line 94 could be replaced with a simple substring(0, n-1).
It would save the string buffer creation and copying.
This is pretty carefully written code so it needs a thorough review.

@AlanBateman
Copy link
Contributor

How often is the only normalization the removal of a trailing "/"? It seems quite likely.
If that's the only normalization, then can the normalize call at line 94 could be replaced with a simple substring(0, n-1).
It would save the string buffer creation and copying.
This is pretty carefully written code so it needs a thorough review.

The most common case is probably no duplicate or trailing slash, in which case this code won't be executed. The next most common is probably the trailing slash case, in which I think you have a good point, just need to be careful not to break the "/" case.

@stsypanov
Copy link
Contributor Author

@AlanBateman isn't "/" case handled in lines 108-109?

@AlanBateman
Copy link
Contributor

@AlanBateman isn't "/" case handled in lines 108-109?

I think Roger's suggestion means that the trailing case would be handled in normalizeAndCheck. There are many ways to do this, we just need to keep it maintainable.

@stsypanov
Copy link
Contributor Author

Updated the PR. Checked cases for normal path (requires no normalization) and root:

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Warmup(time = 2, iterations = 5)
@Measurement(time = 2, iterations = 5)
@Fork(value = 4, jvmArgs = "-Xmx1g")
public class FileToPathBenchmark {

    @Benchmark
    public Path toDirectoryPath(Data data) {
        return FileSystems.getDefault().getPath(data.directoryPath);
    }


    @Benchmark
    public Path toDirectoryPathDoubleSlash(Data data) {
        return FileSystems.getDefault().getPath(data.directoryPathWithDoubleSlash);
    }

    @Benchmark
    public Path toNormalPath(Data data) {
        return FileSystems.getDefault().getPath(data.normalPath);
    }

    @Benchmark
    public Path toRootPath(Data data) {
        return FileSystems.getDefault().getPath(data.rootPath);
    }

    @State(Scope.Thread)
    public static class Data {
        private final String rootPath = "/";
        private final String normalPath = "/tmp/tmp/tmp";
        private final String directoryPath = "/tmp/tmp/tmp/";
        private final String directoryPathWithDoubleSlash = "/tmp//tmp/tmp/";
    }
}

Results:

baseline

Benchmark                                                           Mode  Cnt     Score     Error   Units
FileToPathBenchmark.toDirectoryPath                                 avgt   20    44.993 ±   1.676   ns/op
FileToPathBenchmark.toDirectoryPath:·gc.alloc.rate                  avgt   20  3732.628 ± 137.446  MB/sec
FileToPathBenchmark.toDirectoryPath:·gc.alloc.rate.norm             avgt   20   176.000 ±   0.001    B/op
FileToPathBenchmark.toDirectoryPath:·gc.count                       avgt   20   234.000            counts
FileToPathBenchmark.toDirectoryPath:·gc.time                        avgt   20  1129.000                ms
FileToPathBenchmark.toDirectoryPathDoubleSlash                      avgt   20    58.963 ±   1.737   ns/op
FileToPathBenchmark.toDirectoryPathDoubleSlash:·gc.alloc.rate       avgt   20  2717.169 ±  78.664  MB/sec
FileToPathBenchmark.toDirectoryPathDoubleSlash:·gc.alloc.rate.norm  avgt   20   168.000 ±   0.001    B/op
FileToPathBenchmark.toDirectoryPathDoubleSlash:·gc.count            avgt   20   190.000            counts
FileToPathBenchmark.toDirectoryPathDoubleSlash:·gc.time             avgt   20   755.000                ms
FileToPathBenchmark.toNormalPath                                    avgt   20    14.017 ±   0.367   ns/op
FileToPathBenchmark.toNormalPath:·gc.alloc.rate                     avgt   20  2170.714 ±  52.098  MB/sec
FileToPathBenchmark.toNormalPath:·gc.alloc.rate.norm                avgt   20    32.000 ±   0.001    B/op
FileToPathBenchmark.toNormalPath:·gc.count                          avgt   20   163.000            counts
FileToPathBenchmark.toNormalPath:·gc.time                           avgt   20   695.000                ms
FileToPathBenchmark.toRootPath                                      avgt   20     5.679 ±   0.149   ns/op
FileToPathBenchmark.toRootPath:·gc.alloc.rate                       avgt   20  5372.970 ± 136.323  MB/sec
FileToPathBenchmark.toRootPath:·gc.alloc.rate.norm                  avgt   20    32.000 ±   0.001    B/op
FileToPathBenchmark.toRootPath:·gc.count                            avgt   20   259.000            counts
FileToPathBenchmark.toRootPath:·gc.time                             avgt   20  1105.000                ms

patched

Benchmark                                                           Mode  Cnt     Score     Error   Units
FileToPathBenchmark.toDirectoryPath                                 avgt   20    19.713 ±   0.338   ns/op
FileToPathBenchmark.toDirectoryPath:·gc.alloc.rate                  avgt   20  3093.231 ±  52.488  MB/sec
FileToPathBenchmark.toDirectoryPath:·gc.alloc.rate.norm             avgt   20    64.000 ±   0.001    B/op
FileToPathBenchmark.toDirectoryPath:·gc.count                       avgt   20   211.000            counts
FileToPathBenchmark.toDirectoryPath:·gc.time                        avgt   20   914.000                ms
FileToPathBenchmark.toDirectoryPathDoubleSlash                      avgt   20    51.747 ±   3.485   ns/op
FileToPathBenchmark.toDirectoryPathDoubleSlash:·gc.alloc.rate       avgt   20  2220.224 ± 141.430  MB/sec
FileToPathBenchmark.toDirectoryPathDoubleSlash:·gc.alloc.rate.norm  avgt   20   120.000 ±   0.001    B/op
FileToPathBenchmark.toDirectoryPathDoubleSlash:·gc.count            avgt   20   185.000            counts
FileToPathBenchmark.toDirectoryPathDoubleSlash:·gc.time             avgt   20   924.000                ms
FileToPathBenchmark.toNormalPath                                    avgt   20    14.052 ±   0.359   ns/op
FileToPathBenchmark.toNormalPath:·gc.alloc.rate                     avgt   20  2171.433 ±  54.694  MB/sec
FileToPathBenchmark.toNormalPath:·gc.alloc.rate.norm                avgt   20    32.000 ±   0.001    B/op
FileToPathBenchmark.toNormalPath:·gc.count                          avgt   20   164.000            counts
FileToPathBenchmark.toNormalPath:·gc.time                           avgt   20   852.000                ms
FileToPathBenchmark.toRootPath                                      avgt   20     5.922 ±   0.354   ns/op
FileToPathBenchmark.toRootPath:·gc.alloc.rate                       avgt   20  5169.978 ± 287.004  MB/sec
FileToPathBenchmark.toRootPath:·gc.alloc.rate.norm                  avgt   20    32.000 ±   0.001    B/op
FileToPathBenchmark.toRootPath:·gc.count                            avgt   20   245.000            counts
FileToPathBenchmark.toRootPath:·gc.time                             avgt   20  1244.000                ms

Co-authored-by: ExE Boss <3889017+ExE-Boss@users.noreply.github.com>
Copy link
Member

@bplb bplb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks all right.

Copy link
Contributor

@RogerRiggs RogerRiggs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, Thanks

@openjdk openjdk bot changed the title 8314569: (fs) Improve normalization of UnixPath for directories 8314569: (fs) Improve normalization of UnixPath for directories(fs) Improve normalization of UnixPath for input with trailing slashes Aug 28, 2023
@openjdk
Copy link

openjdk bot commented Aug 28, 2023

@stsypanov This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8314569: (fs) Improve normalization of UnixPath for input with trailing slashes

Reviewed-by: alanb, bpb, rriggs

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 89 new commits pushed to the master branch:

  • 3dc266c: 8315070: RISC-V: Clean up platform dependent inline headers
  • 25f5df2: 8315073: Zero build on macOS fails after JDK-8303852
  • a03954e: 8309697: [TESTBUG] Remove "@requires vm.flagless" from jtreg vectorization tests
  • e5ea9aa: 8312570: [TESTBUG] Jtreg compiler/loopopts/superword/TestDependencyOffsets.java fails on 512-bit SVE
  • 1cb2cc6: 8308464: Shared array class should not always be loaded in boot loader
  • 69d1feb: 8315060: Out of tree incremental build fails with ccache
  • 8e2a533: 8315137: Add explicit override RecordComponentElement.asType()
  • b4b2fec: 8311081: KeytoolReaderP12Test.java fail on localized Windows platform
  • 31e2681: 8315071: Modify TrayIconScalingTest.java, PrintLatinCJKTest.java to use new PassFailJFrame's builder pattern usage
  • 21916f3: 8139208: [macosx] Issue with setExtendedState of JFrame
  • ... and 79 more: https://git.openjdk.org/jdk/compare/8939d15d92982300f090bc1c51f59550529eaaf3...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@AlanBateman, @bplb, @RogerRiggs) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Aug 28, 2023
@stsypanov
Copy link
Contributor Author

/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label Aug 29, 2023
@openjdk
Copy link

openjdk bot commented Aug 29, 2023

@stsypanov
Your change (at version 2b0ba76) is now ready to be sponsored by a Committer.

@openjdk openjdk bot removed sponsor Pull request is ready to be sponsored ready Pull request is ready to be integrated labels Aug 29, 2023
@AlanBateman
Copy link
Contributor

@stsypanov The title on the PR is currently:

"(fs) Improve normalization of UnixPath for directories(fs) Improve normalization of UnixPath for input with trailing slashes #15342"

Something got fat fingered somewhere. Can you fix it to match the JBS issue "(fs) Improve normalization of UnixPath for input with trailing slashes" and then we can sponsor?

@stsypanov stsypanov changed the title 8314569: (fs) Improve normalization of UnixPath for directories(fs) Improve normalization of UnixPath for input with trailing slashes 8314569: (fs) Improve normalization of UnixPath for input with trailing slashes Aug 29, 2023
@stsypanov
Copy link
Contributor Author

@AlanBateman done

@openjdk openjdk bot added sponsor Pull request is ready to be sponsored ready Pull request is ready to be integrated labels Aug 29, 2023
@AlanBateman
Copy link
Contributor

/sponsor

@openjdk
Copy link

openjdk bot commented Aug 29, 2023

Going to push as commit 93188bd.
Since your change was applied there have been 90 commits pushed to the master branch:

  • 8b8182d: 8315087: G1: Use uint for G1 flags indicating percentage
  • 3dc266c: 8315070: RISC-V: Clean up platform dependent inline headers
  • 25f5df2: 8315073: Zero build on macOS fails after JDK-8303852
  • a03954e: 8309697: [TESTBUG] Remove "@requires vm.flagless" from jtreg vectorization tests
  • e5ea9aa: 8312570: [TESTBUG] Jtreg compiler/loopopts/superword/TestDependencyOffsets.java fails on 512-bit SVE
  • 1cb2cc6: 8308464: Shared array class should not always be loaded in boot loader
  • 69d1feb: 8315060: Out of tree incremental build fails with ccache
  • 8e2a533: 8315137: Add explicit override RecordComponentElement.asType()
  • b4b2fec: 8311081: KeytoolReaderP12Test.java fail on localized Windows platform
  • 31e2681: 8315071: Modify TrayIconScalingTest.java, PrintLatinCJKTest.java to use new PassFailJFrame's builder pattern usage
  • ... and 80 more: https://git.openjdk.org/jdk/compare/8939d15d92982300f090bc1c51f59550529eaaf3...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Aug 29, 2023
@openjdk openjdk bot closed this Aug 29, 2023
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review sponsor Pull request is ready to be sponsored labels Aug 29, 2023
@openjdk
Copy link

openjdk bot commented Aug 29, 2023

@AlanBateman @stsypanov Pushed as commit 93188bd.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@stsypanov stsypanov deleted the unixpath-check-nn branch August 29, 2023 18:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
integrated Pull request has been integrated nio nio-dev@openjdk.org
5 participants